New error measures and methods for realizing protein graphs from distance data
The interval Distance Geometry Problem (iDGP) consists in finding a
realization in Euclidean space of a simple undirected graph with
nonnegative intervals assigned to the edges in such a way that, for each edge,
the Euclidean distance between the realization of the adjacent vertices is
within the edge interval bounds. In this paper, we focus on the application to
the conformation of proteins in space, which is a basic step in determining
protein function: given interval estimations of some of the inter-atomic
distances, find their shape. Among different families of methods for
accomplishing this task, we look at mathematical programming based methods,
which are well suited for dealing with intervals. The basic question we want to
answer is: what is the best such method for the problem? The most meaningful
error measure for evaluating solution quality is the coordinate root mean
square deviation. We first introduce a new error measure which addresses a
particular feature of protein backbones, i.e. many partial reflections also
yield acceptable backbones. We then present a set of new and existing quadratic
and semidefinite programming formulations of this problem, and a set of new and
existing methods for solving these formulations. Finally, we perform a
computational evaluation of all the feasible solver-formulation combinations
according to new and existing error measures, finding that the best methodology
is a new heuristic method based on multiplicative weights updates.
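The iDGP feasibility condition stated in the abstract above can be illustrated with a minimal sketch. It only checks whether a candidate realization keeps every edge distance inside its interval; it is not one of the paper's formulations or error measures. The function name `interval_violations` and the dictionary-based input layout are assumptions made for this illustration.

```python
import math


def interval_violations(x, bounds):
    """Return {edge: excess} for edges whose realized distance leaves its interval.

    x      : dict mapping vertex -> coordinate tuple (hypothetical input layout)
    bounds : dict mapping (u, v) -> (lower, upper), with 0 <= lower <= upper
    """
    violations = {}
    for (u, v), (lower, upper) in bounds.items():
        d = math.dist(x[u], x[v])  # Euclidean distance between realized vertices
        excess = max(lower - d, d - upper, 0.0)  # distance outside the interval
        if excess > 0.0:
            violations[(u, v)] = excess
    return violations


# Toy instance: three points in 3D with interval bounds on all three pairs.
x = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (1.0, 1.0, 0.0)}
bounds = {(0, 1): (0.9, 1.1), (1, 2): (0.9, 1.1), (0, 2): (1.5, 2.0)}
violations = interval_violations(x, bounds)
```

In this toy instance only the edge (0, 2) is violated, because its realized length is the square root of 2, which lies below the lower bound 1.5. A realization is feasible for the iDGP exactly when the returned dictionary is empty.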
Cycle-based formulations in Distance Geometry
The distance geometry problem asks to find a realization of a given simple
edge-weighted graph in a Euclidean space of given dimension K, where the edges
are realized as straight segments of lengths equal (or as close as possible) to
the edge weights. The problem is often modelled as a mathematical programming
formulation involving decision variables that determine the position of the
vertices in the given Euclidean space. Solution algorithms are generally
constructed using local or global nonlinear optimization techniques. We present
a new modelling technique for this problem where, instead of deciding vertex
positions, formulations decide the length of the segments representing the
edges in each cycle in the graph, projected in every dimension. We propose an
exact formulation and a relaxation based on an Eulerian cycle. We then compare
computational results from protein conformation instances obtained with
stochastic global optimization techniques on the new cycle-based formulation
and on the existing edge-based formulation. While edge-based formulations take
less time to reach termination, cycle-based formulations are generally better
on solution quality measures.
Comment: 16 pages
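The contrast between deciding vertex positions and deciding projected edge segments can be sketched as follows. The first function is one commonly used edge-based penalty (squared error on squared distances), which may differ from the exact formulation in the paper. The second checks the elementary fact that signed edge displacements around any cycle sum to zero in every coordinate, which is the kind of condition a cycle-based model can impose. Both function names and the input layout are assumptions for illustration.

```python
def edge_penalty(x, weights):
    """Sum over edges of (||x_u - x_v||^2 - d_uv^2)^2; zero iff all lengths match."""
    total = 0.0
    for (u, v), d in weights.items():
        sq_dist = sum((a - b) ** 2 for a, b in zip(x[u], x[v]))
        total += (sq_dist - d * d) ** 2
    return total


def cycle_projection_sums(x, cycle):
    """Sum the signed per-coordinate displacements along a closed cycle of vertices."""
    dim = len(x[cycle[0]])
    sums = [0.0] * dim
    # Pair each vertex with its successor, wrapping around to close the cycle.
    for u, v in zip(cycle, cycle[1:] + cycle[:1]):
        for k in range(dim):
            sums[k] += x[v][k] - x[u][k]
    return sums


# Toy instance: a unit square realized in the plane (K = 2).
x = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
weights = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
penalty = edge_penalty(x, weights)
sums = cycle_projection_sums(x, [0, 1, 2, 3])
```

For the unit square, the penalty is zero because every edge has the prescribed length, and the projected displacements around the cycle cancel in both coordinates.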