Identification of diverse database subsets using property-based and fragment-based molecular descriptions

By M. Ashton, J. Barnard, F. Casset, M. Charlton, G. Downs, D. Gorse, J.D. Holliday, R. Lahana and P. Willett


This paper reports a comparison of calculated molecular properties and of 2D fragment bit-strings when used for the selection of structurally diverse subsets of a file of 44295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggests that the MaxMin subsets are noticeably superior to the k-means subsets; that the property-based descriptors are marginally superior to the fragment-based descriptors; and that both approaches are noticeably superior to random selection.
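
For readers unfamiliar with MaxMin dissimilarity-based selection, the following is a minimal Python sketch of the general technique, not the authors' implementation. It assumes fragment bit-strings are represented as sets of "on" bit positions and that dissimilarity is measured as 1 minus the Tanimoto coefficient; the function names, the choice of the first compound as the seed, and the toy fingerprints are all illustrative assumptions, since the abstract does not specify them.

```python
def tanimoto_dissimilarity(a, b):
    """Return 1 - Tanimoto similarity for two fingerprints.

    Each fingerprint is a set of 'on' bit positions (an assumed encoding).
    """
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fingerprints, n_select, seed_index=0):
    """Greedy MaxMin selection (hypothetical helper).

    Starting from a seed compound, repeatedly add the compound whose
    dissimilarity to its nearest already-selected compound is largest.
    Returns the selected indices in the order they were picked.
    """
    n = len(fingerprints)
    if not 0 < n_select <= n:
        raise ValueError("n_select must be between 1 and len(fingerprints)")

    selected = [seed_index]
    is_selected = [False] * n
    is_selected[seed_index] = True

    # min_dist[i]: dissimilarity from compound i to its nearest selected compound
    seed = fingerprints[seed_index]
    min_dist = [tanimoto_dissimilarity(fp, seed) for fp in fingerprints]

    while len(selected) < n_select:
        # Pick the unselected compound that is farthest from the current subset.
        candidates = [i for i in range(n) if not is_selected[i]]
        next_index = max(candidates, key=lambda i: min_dist[i])
        selected.append(next_index)
        is_selected[next_index] = True

        # Update nearest-selected distances against the newly added compound.
        new_fp = fingerprints[next_index]
        for i in candidates:
            d = tanimoto_dissimilarity(fingerprints[i], new_fp)
            if d < min_dist[i]:
                min_dist[i] = d

    return selected


# Toy data: five made-up fingerprints, not taken from the paper.
toy_fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {7, 8}]
subset = maxmin_select(toy_fingerprints, 3)
print(subset)
```

Keeping a running nearest-selected distance for every compound means each new pick costs one pass over the file rather than a comparison against the whole subset, which matters when selecting up to 20% of a file of this size. The same loop would work with property-based descriptors if the dissimilarity function were swapped for a distance on property vectors.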

Publisher: Wiley
Year: 2003
