Figure 8.

Size distribution of clusters of the HIV data set. Clustering the active and moderately-active compounds from the NCI HIV screen using the QT (Quality Threshold) clustering algorithm [28], using the Tanimoto distance and 0.5 as the cluster diameter parameter, produces a few large clusters and many smaller clusters. The size of the one hundred largest clusters found by this method are plotted in decreasing order. These sizes follow a power-law distribution.

Nasr et al. Journal of Cheminformatics 2009 1:7   doi:10.1186/1758-2946-1-7
Download authors' original image