Similarity searching in files of three-dimensional chemical structures: Comparison of fragment-based measures of shape similarity
This paper compares several fragment-based measures that can be used to quantify the degree of similarity
between pairs of three-dimensional (3-D) chemical structures. The fragments that are considered contain two,
three, or four atoms and encode distance information, angular information or both but do not involve chemical
information such as atomic type. The effectiveness of the various measures is compared using eight literature
datasets for which biological-activity data and calculated 3-D structures are available, and a set of carbohydrate
structures from the Cambridge Structural Database that have been classified into eight distinct groups.
Similarity searches on these datasets suggest that the four-atom fragments are the most effective
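As a rough illustration of the kind of fragment-based measure described in this abstract, the following Python sketch counts two-atom and three-atom distance fragments and compares the resulting fragment counts. It is only a sketch under stated assumptions, not the authors' procedure: the bin width, the function names, and the use of the Tanimoto coefficient on fragment counts are all hypothetical choices made for illustration, and angular and four-atom fragments are omitted for brevity.

```python
from collections import Counter
from itertools import combinations
from math import dist


def distance_fragments(coords, bin_width=1.0):
    """Two-atom fragments: each atom pair is keyed by its binned distance.

    bin_width is an assumed parameter, not a value taken from the paper.
    """
    return Counter(int(dist(a, b) // bin_width)
                   for a, b in combinations(coords, 2))


def triangle_fragments(coords, bin_width=1.0):
    """Three-atom fragments keyed by the sorted, binned side lengths."""
    frags = Counter()
    for a, b, c in combinations(coords, 3):
        sides = sorted((dist(a, b), dist(a, c), dist(b, c)))
        frags[tuple(int(s // bin_width) for s in sides)] += 1
    return frags


def tanimoto(f1, f2):
    """Tanimoto coefficient on fragment multisets (assumed coefficient)."""
    common = sum((f1 & f2).values())          # multiset intersection
    total = sum(f1.values()) + sum(f2.values()) - common
    return common / total if total else 1.0


# Usage: no atom types are used, only geometry, as in the abstract.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
line = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (9, 0, 0)]
print(tanimoto(distance_fragments(square), distance_fragments(line)))
```

Because the fragments are built from interatomic distances only, the score is unchanged when a structure is translated or rotated, which is what makes such measures usable for shape similarity searching.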
Similarity searching in databases of three-dimensional molecules and macromolecules
This paper discusses algorithmic techniques for measuring the degree of similarity between pairs of three-dimensional
(3-D) chemical molecules represented by interatomic distance matrices. A comparison of four
methods for the calculation of 3-D structural similarity suggests that the most effective one is a procedure
that identifies pairs of atoms, one from each of the molecules that are being compared, that lie at the center
of geometrically-related volumes of 3-D space. This atom mapping method enables the calculation of a wide
range of types of intermolecular similarity coefficient, including measures that are based on physicochemical
data. Massively-parallel implementations of the method are discussed, using the AMT Distributed Array
Processor, that achieve a substantial increase in performance when compared with a sequential implementation
on a UNIX workstation. Current work involves the use of angular information and the extension of the method
to field-based similarity searching. Similarity searching in 3-D macromolecules is effected by the use of a
maximal common subgraph (MCS) isomorphism algorithm with a novel, graph-based representation of the
tertiary structures of proteins. This algorithm is being used to identify similarities between the 3-D structures
of proteins in the Brookhaven Protein Data Bank; its use is exemplified by searches involving the NAD-binding
fold motif.
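The atom mapping idea in this abstract can be sketched loosely in Python. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: each atom is described by its sorted distances to the other atoms (one row of the interatomic distance matrix), two atoms are scored by how many of those distances agree within a hypothetical tolerance `tol`, atoms are paired greedily by score, and the overall similarity is the mean pair score over the larger molecule.

```python
from math import dist


def environments(coords):
    """For each atom, the sorted distances to every other atom
    (a row of the interatomic distance matrix without the diagonal)."""
    return [sorted(dist(a, b) for j, b in enumerate(coords) if j != i)
            for i, a in enumerate(coords)]


def env_similarity(env_a, env_b, tol=0.5):
    """Tanimoto-style overlap of two sorted distance lists.

    Distances are matched one-to-one when they differ by at most tol
    (an assumed tolerance, not a value from the paper).
    """
    i = j = matched = 0
    while i < len(env_a) and j < len(env_b):
        if abs(env_a[i] - env_b[j]) <= tol:
            matched += 1
            i += 1
            j += 1
        elif env_a[i] < env_b[j]:
            i += 1
        else:
            j += 1
    denom = len(env_a) + len(env_b) - matched
    return matched / denom if denom else 1.0


def atom_mapping_similarity(coords_a, coords_b, tol=0.5):
    """Greedily pair atoms with the most similar environments.

    Returns (overall similarity, list of (atom_a, atom_b, score)).
    """
    env_a, env_b = environments(coords_a), environments(coords_b)
    scores = sorted(((env_similarity(ea, eb, tol), i, j)
                     for i, ea in enumerate(env_a)
                     for j, eb in enumerate(env_b)), reverse=True)
    used_a, used_b, mapping = set(), set(), []
    for score, i, j in scores:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            mapping.append((i, j, score))
    size = max(len(coords_a), len(coords_b))
    overall = sum(s for _, _, s in mapping) / size if size else 1.0
    return overall, mapping
```

Once atoms are mapped onto one another, other per-atom quantities (for example physicochemical properties) could in principle be compared across the mapped pairs, which is the flexibility the abstract attributes to the method. Every pair of atoms is scored independently of the others, which may be why the approach lends itself to the parallel implementations the abstract mentions.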
Searching for angular patterns in files of three-dimensional chemical structures
SIGLE. Available from the British Library Document Supply Centre (BLDSC), DSC:2113.56F (BLRDR--6065). GB, United Kingdom.