15 research outputs found
Similarity searching in databases of three-dimensional molecules and macromolecules
This paper discusses algorithmic techniques for measuring the degree of similarity between pairs of three-dimensional
(3-D) chemical molecules represented by interatomic distance matrices. A comparison of four
methods for the calculation of 3-D structural similarity suggests that the most effective one is a procedure
that identifies pairs of atoms, one from each of the molecules that are being compared, that lie at the center
of geometrically-related volumes of 3-D space. This atom mapping method enables the calculation of a wide
range of types of intermolecular similarity coefficient, including measures that are based on physicochemical
data. Massively-parallel implementations of the method are discussed, using the AMT Distributed Array
Processor, that achieve a substantial increase in performance when compared with a sequential implementation
on a UNIX workstation. Current work involves the use of angular information and the extension of the method
to field-based similarity searching. Similarity searching in 3-D macromolecules is effected by the use of a
maximal common subgraph (MCS) isomorphism algorithm with a novel, graph-based representation of the
tertiary structures of proteins. This algorithm is being used to identify similarities between the 3-D structures
of proteins in the Brookhaven Protein Data Bank; its use is exemplified by searches involving the NAD-binding
fold motif.
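As a rough illustration of similarity scoring on interatomic distance matrices, the sketch below maps atoms between two molecules by comparing each atom's sorted distances to the other atoms, then combines the mapped scores into a Tanimoto-style coefficient. It is a simplified stand-in, not the atom mapping procedure evaluated in the paper: the function names, the greedy pairing, and the 0.5 distance tolerance are all assumptions made for the example.

```python
import math


def distance_matrix(coords):
    """Symmetric matrix of Euclidean distances between atom coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def environment(dm, i):
    """Sorted distances from atom i to every other atom in its molecule."""
    return sorted(d for k, d in enumerate(dm[i]) if k != i)


def atom_score(env_a, env_b, tol):
    """Tanimoto-style overlap of two sorted distance lists (assumed scoring)."""
    i = j = common = 0
    while i < len(env_a) and j < len(env_b):
        diff = env_a[i] - env_b[j]
        if abs(diff) <= tol:      # distances agree within tolerance
            common += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1
        else:
            j += 1
    union = len(env_a) + len(env_b) - common
    return common / union if union else 1.0


def molecule_similarity(coords_a, coords_b, tol=0.5):
    """Greedy one-to-one atom mapping followed by a Tanimoto-style coefficient."""
    dm_a, dm_b = distance_matrix(coords_a), distance_matrix(coords_b)
    envs_a = [environment(dm_a, i) for i in range(len(coords_a))]
    envs_b = [environment(dm_b, j) for j in range(len(coords_b))]
    scored = sorted(
        ((atom_score(ea, eb, tol), i, j)
         for i, ea in enumerate(envs_a)
         for j, eb in enumerate(envs_b)),
        reverse=True,
    )
    used_a, used_b, total = set(), set(), 0.0
    for score, i, j in scored:    # best-scoring unused pair first
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            total += score
    union = len(coords_a) + len(coords_b) - total
    return total / union if union else 1.0
```

Because only interatomic distances enter the score, a translated or rotated copy of a molecule scores 1.0 against the original. The all-pairs scoring step is the expensive part, and each pair is scored independently of the others.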
Lessons learned from exploring the backtracking paradigm on the GPU
Abstract. We explore the backtracking paradigm with properties seen as sub-optimal for GPU architectures, using the maximal clique enumeration problem as a case study, and find that the presence of these properties limits GPU performance to approximately 1.4–2.25 times that of a single CPU core. The GPU performance “lessons” we find critical to providing this performance include a coarse-and-fine-grain parallelization of the search space, a low-overhead load-balanced distribution of work, global memory latency hiding through coalescence, saturation, and shared memory utilization, and the use of GPU output buffering as a solution to irregular workloads and a large solution domain. We also find a strong reliance on an efficient global problem structure representation that bounds any efficiencies gained from these lessons, and discuss the implications of these results for backtracking problems in general.
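For readers unfamiliar with the case study, maximal clique enumeration by backtracking can be sketched on the CPU as follows. This is the well-known Bron-Kerbosch algorithm with pivoting, shown sequentially; the abstract does not say which backtracking variant the authors ported to the GPU, so treat the choice of algorithm and the `maximal_cliques` name as assumptions.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph as a sorted list.

    `adj` maps each vertex to the set of its neighbours (no self-loops).
    """
    def expand(r, p, x):
        # r: current clique, p: candidates that extend it, x: already explored
        if not p and not x:
            yield sorted(r)       # nothing can extend r, so it is maximal
            return
        # Pivot on the vertex with the most candidate neighbours to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}           # backtrack: v is now fully explored
            x = x | {v}

    yield from expand(set(), set(adj), set())
```

Each recursive call explores an independent subtree of the search space, and the subtrees can differ greatly in size. That irregularity is the kind of workload the abstract describes as difficult for GPU architectures.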
Representation of protein secondary structure using bond-orientational order parameters
Structural studies of proteins for motif mining and other pattern recognition techniques require the abstraction of the structure into simpler elements for robust matching. In this study, we propose the use of bond-orientational order parameters, a well-established metric usually employed to compare atom packing in crystals and liquids. Creating a vector of orientational order parameters of residue centers in a sliding-window fashion provides us with a descriptor of local structure and connectivity around each residue that is easy to calculate and compare. To test whether this representation is feasible and applicable to protein structures, we tried to predict the secondary structure of protein segments from those descriptors, resulting in an AUC (area under the ROC curve) of 0.99. Clustering those descriptors into 6 clusters also yields an AUC of 0.93, showing that these descriptors can be used to capture and distinguish local structural information.
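A minimal sketch of such a descriptor, assuming the Steinhardt-style rotationally invariant parameter Q_l, is given below. It uses the spherical-harmonic addition theorem, under which Q_l squared equals the average of the Legendre polynomial P_l(cos γ) over all pairs of bond directions, so no spherical-harmonic library is needed. The abstract does not specify how bonds or windows are defined; here a "bond" is assumed to be the vector from a residue center to each other center in its window, and `window_descriptors`, `half_width`, and `orders` are hypothetical names and defaults.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) via the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def q_l(bonds, l):
    """Order parameter Q_l for a list of non-zero 3-D bond vectors."""
    units = []
    for bond in bonds:
        norm = math.sqrt(sum(c * c for c in bond))
        units.append(tuple(c / norm for c in bond))
    if not units:
        return 0.0
    total = 0.0
    for u in units:
        for v in units:
            cos_gamma = sum(a * b for a, b in zip(u, v))
            cos_gamma = max(-1.0, min(1.0, cos_gamma))  # guard rounding error
            total += legendre(l, cos_gamma)
    return math.sqrt(max(total, 0.0)) / len(units)


def window_descriptors(centers, half_width=2, orders=(2, 4, 6)):
    """One vector of Q_l values per residue, from its sliding window (assumed)."""
    descriptors = []
    for i, center in enumerate(centers):
        lo = max(0, i - half_width)
        hi = min(len(centers), i + half_width + 1)
        bonds = [
            tuple(p - q for p, q in zip(centers[j], center))
            for j in range(lo, hi)
            if j != i
        ]
        descriptors.append([q_l(bonds, l) for l in orders])
    return descriptors
```

As a sanity check, six bonds pointing along the positive and negative coordinate axes give Q_4 ≈ 0.764 and Q_6 ≈ 0.354, the textbook values for a simple cubic arrangement. The resulting per-residue vectors could then be passed to a classifier or clustering routine of the reader's choice.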
On the great karst massifs of Andalusia: Manuel C. Pezzi
Nicod Jean. Sur les grands massifs karstiques d'Andalousie : Manuel C. Pezzi. In: Annales de Géographie, t. 88, n°490, 1979. pp. 753-755