Identification of diverse database subsets using property-based and fragment-based molecular descriptions

By M. Ashton, J. Barnard, F. Casset, M. Charlton, G. Downs, D. Gorse, J.D. Holliday, R. Lahana and P. Willett

Abstract

This paper compares calculated molecular properties with 2D fragment bit-strings as descriptors for selecting structurally diverse subsets from a file of 44295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggests that the MaxMin subsets are noticeably superior to the k-means subsets, that the property-based descriptors are marginally superior to the fragment-based descriptors, and that both approaches are noticeably superior to random selection.
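
As a rough illustration of the MaxMin dissimilarity-based selection named in the abstract, the sketch below greedily adds the compound whose nearest already-selected neighbour is farthest away. The abstract does not state the dissimilarity measure or how the first compound is chosen, so the Tanimoto-based dissimilarity, the representation of bit-strings as sets of on-bit positions, the `seed_index` parameter, and the function names are all assumptions made for this example, not the authors' implementation.

```python
def tanimoto_dissimilarity(a, b):
    """Return 1 - Tanimoto coefficient for two fingerprints.

    Each fingerprint is a set of on-bit positions (an assumed
    representation of a 2D fragment bit-string).
    """
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fingerprints, n_select, seed_index=0):
    """Greedy MaxMin selection; returns indices in the order chosen.

    seed_index is a hypothetical parameter: the starting compound is
    an assumption, since the abstract does not say how it is picked.
    """
    n = len(fingerprints)
    if not 0 < n_select <= n:
        raise ValueError("n_select must be between 1 and len(fingerprints)")

    selected = [seed_index]
    # min_dist[i] holds the dissimilarity from compound i to its
    # nearest selected compound; selected compounds are marked -1.
    seed_fp = fingerprints[seed_index]
    min_dist = [tanimoto_dissimilarity(fp, seed_fp) for fp in fingerprints]
    min_dist[seed_index] = -1.0

    while len(selected) < n_select:
        # Pick the unselected compound farthest from the selected set.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        nxt_fp = fingerprints[nxt]
        for i, fp in enumerate(fingerprints):
            if min_dist[i] >= 0.0:
                d = tanimoto_dissimilarity(fp, nxt_fp)
                if d < min_dist[i]:
                    min_dist[i] = d
        min_dist[nxt] = -1.0
    return selected


# Toy data, invented for illustration only.
example = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}]
print(maxmin_select(example, 3))  # [0, 2, 1]
```

Each pass costs one scan of the file, so selecting k compounds from n takes on the order of k × n dissimilarity calculations, which is what makes this style of selection workable on files of tens of thousands of compounds.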

Publisher: Wiley
Year: 2003
OAI identifier: oai:eprints.whiterose.ac.uk:3570
