Euclidean distance geometry and applications
Euclidean distance geometry is the study of Euclidean geometry based on the
concept of distance. This is useful in several applications where the input
data consists of an incomplete set of distances, and the output is a set of
points in Euclidean space that realizes the given distances. We survey some of
the theory of Euclidean distance geometry and some of the most important
applications: molecular conformation, localization of sensor networks and
statics. Comment: 64 pages, 21 figures
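To make the problem statement above concrete, the following minimal sketch checks whether a candidate set of points realizes an incomplete set of distances. The function name `realizes`, the dictionary layout, and the tolerance are illustrative assumptions, not taken from the survey.

```python
import math

def realizes(points, distances, tol=1e-9):
    """Return True if the points match every given pairwise distance.

    points:    dict mapping a vertex id to its coordinates (assumed layout)
    distances: dict mapping a pair (i, j) to the required distance;
               pairs that are missing are simply unconstrained
    """
    for (i, j), d in distances.items():
        if abs(math.dist(points[i], points[j]) - d) > tol:
            return False
    return True

# Three points in the plane; only two of the three pairwise distances are given.
points = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (3.0, 4.0)}
partial = {(0, 1): 3.0, (1, 2): 4.0}
ok = realizes(points, partial)
```

Verifying a realization is the easy direction; the distance geometry problem itself asks for the points given only the distances.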
GreMuTRRR: A Novel Genetic Algorithm to Solve Distance Geometry Problem for Protein Structures
Nuclear Magnetic Resonance (NMR) Spectroscopy is a widely used technique to
predict the native structure of proteins. However, NMR machines are only able
to report approximate and partial distances between pairs of atoms. To build the
protein structure one has to solve the Euclidean distance geometry problem
given the incomplete interval distance data produced by NMR machines. In this
paper, we propose a new genetic algorithm for solving the Euclidean distance
geometry problem for protein structure prediction given sparse NMR data. Our
genetic algorithm uses a greedy mutation operator to intensify the search, a
twin removal technique for diversification in the population and a random
restart method to recover from stagnation. On a standard set of benchmark datasets,
our algorithm significantly outperforms standard genetic algorithms. Comment: Accepted for publication in the 8th International Conference on
Electrical and Computer Engineering (ICECE 2014)
Protein structure determination via an efficient geometric build-up algorithm
Abstract
Background: A protein structure can be determined by solving a so-called distance geometry problem whenever a set of inter-atomic distances is available and sufficient. However, the problem is intractable in general and has been proved to be NP-hard. An updated geometric build-up algorithm (UGB) has been developed recently that controls numerical errors and is efficient in protein structure determination for cases where only sparse exact distance data is available. In this paper, the UGB method has been improved and revised with the aim of solving distance geometry problems more efficiently and effectively.
Methods: An efficient algorithm, called the revised updated geometric build-up algorithm (RUGB), is presented for building up a protein structure from atomic distance data. It provides an effective way of determining a protein structure with sparse exact distance data. In the algorithm, the condition for iteratively determining an unpositioned atom is relaxed (compared with the UGB algorithm), and data structure techniques are used to make the algorithm more efficient and effective. The algorithm is tested on a set of proteins selected randomly from the Protein Structure Database (PDB).
Results: We test a set of proteins selected randomly from the Protein Structure Database (PDB). We show that the numerical errors produced by the new RUGB algorithm are smaller than those of the UGB algorithm and that the RUGB algorithm has a significantly smaller runtime than the UGB algorithm.
Conclusions: The RUGB algorithm relaxes the condition for updating and incorporates a data structure for accessing the neighbours of an atom. The revisions result in an improvement over the UGB algorithm in two important areas: a reduction in the overall runtime and a decrease in the numerical error.
Peer Reviewed
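A build-up method positions one atom at a time from atoms that are already placed. The sketch below shows only the generic textbook step of that idea, assuming exact distances to four non-coplanar positioned atoms; it is not the UGB or RUGB algorithm, and the names `place_atom` and `det3` are hypothetical. Subtracting the first sphere equation from the other three gives a 3x3 linear system, solved here by Cramer's rule.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def place_atom(anchors, dists):
    """Position an atom from exact distances to four placed atoms.

    For each anchor p_i (i = 1..3), subtracting |x - p_0|^2 = d_0^2 from
    |x - p_i|^2 = d_i^2 yields the linear equation
    2 (p_i - p_0) . x = d_0^2 - d_i^2 + |p_i|^2 - |p_0|^2.
    """
    p0, d0 = anchors[0], dists[0]
    n0 = sum(c * c for c in p0)
    A, b = [], []
    for p, d in zip(anchors[1:], dists[1:]):
        A.append([2.0 * (p[k] - p0[k]) for k in range(3)])
        b.append(d0 * d0 - d * d + sum(c * c for c in p) - n0)
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("anchors are (nearly) coplanar")
    coords = []
    for k in range(3):
        # Cramer's rule: replace column k of A with the right-hand side b.
        Ak = [row[:k] + [b[i]] + row[k + 1:] for i, row in enumerate(A)]
        coords.append(det3(Ak) / det)
    return tuple(coords)

# Usage: recover a known position from its distances to four anchors.
anchors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target = (0.3, 0.4, 0.5)
dists = [math.dist(a, target) for a in anchors]
placed = place_atom(anchors, dists)
```

With noisy or sparse data the four required distances may be unavailable or inconsistent, which is the situation that error control and relaxed positioning conditions are meant to address.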
New error measures and methods for realizing protein graphs from distance data
The interval Distance Geometry Problem (iDGP) consists in finding a
realization in $\mathbb{R}^K$ of a simple undirected graph $G=(V,E)$ with
nonnegative intervals assigned to the edges in such a way that, for each edge,
the Euclidean distance between the realization of the adjacent vertices is
within the edge interval bounds. In this paper, we focus on the application to
the conformation of proteins in space, which is a basic step in determining
protein function: given interval estimations of some of the inter-atomic
distances, find their shape. Among different families of methods for
accomplishing this task, we look at mathematical programming based methods,
which are well suited for dealing with intervals. The basic question we want to
answer is: what is the best such method for the problem? The most meaningful
error measure for evaluating solution quality is the coordinate root mean
square deviation. We first introduce a new error measure which addresses a
particular feature of protein backbones, i.e. many partial reflections also
yield acceptable backbones. We then present a set of new and existing quadratic
and semidefinite programming formulations of this problem, and a set of new and
existing methods for solving these formulations. Finally, we perform a
computational evaluation of all the feasible solver-formulation combinations
according to new and existing error measures, finding that the best methodology
is a new heuristic method based on multiplicative weights updates.
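The iDGP feasibility condition defined in this abstract can be illustrated with a small check that reports, for each edge, how far the realized distance falls outside its interval. This is a sketch under assumptions: the function name `interval_violations` and the data layout are hypothetical, and this is not one of the error measures proposed in the paper.

```python
import math

def interval_violations(x, intervals):
    """Return, per edge, the amount by which the distance misses its interval.

    x:         dict mapping a vertex to its coordinates (assumed layout)
    intervals: dict mapping an edge (i, j) to its bounds (lo, hi)
    A value of 0.0 means the edge constraint lo <= |x_i - x_j| <= hi holds.
    """
    out = {}
    for (i, j), (lo, hi) in intervals.items():
        d = math.dist(x[i], x[j])
        out[(i, j)] = max(lo - d, 0.0, d - hi)
    return out

# Usage: the first edge is satisfied, the second is too short by 0.5.
x = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (1.0, 2.0, 0.0)}
intervals = {(0, 1): (0.9, 1.1), (1, 2): (2.5, 3.0)}
violations = interval_violations(x, intervals)
```

A realization solves the iDGP exactly when every reported violation is zero.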
Machine Learning, Quantum Mechanics, and Chemical Compound Space
We review recent studies dealing with the generation of machine learning
models of molecular and solid properties. The models are trained and validated
using standard quantum chemistry results obtained for organic molecules and
materials selected from chemical space at random.