21 research outputs found

    Computational Protein Design Using AND/OR Branch-and-Bound Search

    The computation of the global minimum energy conformation (GMEC) is an important and challenging topic in structure-based computational protein design. In this paper, we propose a new protein design algorithm based on the AND/OR branch-and-bound (AOBB) search, which is a variant of the traditional branch-and-bound search algorithm, to solve this combinatorial optimization problem. By integrating with a powerful heuristic function, AOBB is able to fully exploit the graph structure of the underlying residue interaction network of a backbone template to significantly accelerate the design process. Tests on real protein data show that our new protein design algorithm is able to solve many problems that were previously unsolvable by the traditional exact search algorithms, and for the problems that can be solved with traditional provable algorithms, our new method can provide a large speedup by several orders of magnitude while still guaranteeing to find the global minimum energy conformation (GMEC) solution. Comment: RECOMB 201
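
As a rough illustration of the search problem described in this abstract, the sketch below runs a plain depth-first branch-and-bound over rotamer assignments. It is not the AOBB algorithm of the paper and does not exploit the residue interaction graph. The energy model (a list of self energies `self_e[i][r]` plus a dictionary of pairwise energies `pair_e[(i, j)][r][s]` for `i < j`), all function names, and the simple admissible lower bound are assumptions made for this example.

```python
import math


def total_energy(assign, self_e, pair_e):
    """Energy of the first len(assign) residues: self terms plus pair terms."""
    n = len(assign)
    energy = sum(self_e[i][assign[i]] for i in range(n))
    energy += sum(pair_e[(i, j)][assign[i]][assign[j]]
                  for i in range(n) for j in range(i + 1, n))
    return energy


def lower_bound(partial, self_e, pair_e):
    """Admissible bound: exact energy of the assigned prefix plus, for each
    unassigned residue, its cheapest rotamer under optimistic pair terms."""
    n, k = len(self_e), len(partial)
    bound = total_energy(partial, self_e, pair_e)
    for i in range(k, n):
        best = math.inf
        for r in range(len(self_e[i])):
            e = self_e[i][r]
            # Exact interaction with residues that are already assigned.
            e += sum(pair_e[(j, i)][partial[j]][r] for j in range(k))
            # Optimistic interaction with later, still unassigned residues.
            e += sum(min(pair_e[(i, j)][r]) for j in range(i + 1, n))
            best = min(best, e)
        bound += best
    return bound


def branch_and_bound(self_e, pair_e):
    """Return (minimum energy, rotamer assignment) for the toy model."""
    n = len(self_e)
    best_energy, best_assign = math.inf, None

    def visit(partial):
        nonlocal best_energy, best_assign
        bound = lower_bound(partial, self_e, pair_e)
        if bound >= best_energy:
            return  # prune: this subtree cannot beat the incumbent
        if len(partial) == n:
            # For a complete assignment the bound equals the exact energy.
            best_energy, best_assign = bound, tuple(partial)
            return
        for r in range(len(self_e[len(partial)])):
            visit(partial + [r])

    visit([])
    return best_energy, best_assign
```

Because the bound never overestimates the energy of any completion, pruning cannot discard the optimum, so the result matches exhaustive enumeration on small instances.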

    BALL - biochemical algorithms library 1.3

    Background: The Biochemical Algorithms Library (BALL) is a comprehensive rapid application development framework for structural bioinformatics. It provides an extensive C++ class library of data structures and algorithms for molecular modeling and structural bioinformatics. Using BALL as a programming toolbox not only greatly reduces application development times but also helps ensure stability and correctness by avoiding the error-prone reimplementation of complex algorithms and replacing them with calls into a library that has been well tested by a large number of developers. In the ten years since its original publication, BALL has seen a substantial increase in functionality and numerous other improvements.
    Results: Here, we discuss BALL's current functionality and highlight the key additions and improvements: support for additional file formats, molecular edit functionality, new molecular mechanics force fields, novel energy minimization techniques, docking algorithms, and support for cheminformatics.
    Conclusions: BALL is available for all major operating systems, including Linux, Windows, and MacOS X. It is available free of charge under the Lesser GNU Public License (LGPL). Parts of the code are distributed under the GNU Public License (GPL). BALL is available as source code and binary packages from the project web site at http://www.ball-project.org. Recently, it has been accepted into the Debian project; integration into further distributions is currently being pursued.

    A Peaceman-Rachford Splitting Method for the Protein Side-Chain Positioning Problem

    We formulate a doubly nonnegative (DNN) relaxation of the protein side-chain positioning (SCP) problem. We take advantage of the natural splitting of variables that stems from the facial reduction technique in the semidefinite relaxation, and we solve the relaxation using a variation of the Peaceman-Rachford splitting method. Our numerical experiments show that we solve all our instances of the SCP problem to optimality.

    Optimization of van der Waals Energy for Protein Side-Chain Placement and Design

    Computational determination of optimal side-chain conformations in protein structures has been a long-standing and challenging problem. Solving this problem is important for many applications including homology modeling, protein docking, and placing small-molecule ligands on protein binding sites. Programs available as of this writing are very fast and reasonably accurate, as measured by deviations of side-chain dihedral angles; however, often due to multiple atomic clashes, they produce structures with high positive energies. This is problematic in applications where the energy values are important, for example when placing small molecules in docking applications: the relatively small binding energy of the small molecule is drowned out by the large energy due to atomic clashes, which hampers finding the lowest-energy state of the docked ligand. To address this, we have developed an algorithm for generating a set of side-chain conformations that is dense enough that at least one of its members has a root mean-square deviation of no more than R Å from any possible side-chain conformation of the amino acid. We call such a set a side-chain cover set of order R for the amino acid. The size of the set is constrained by the energy of the interaction of the side chain with the backbone atoms. Side-chain cover sets are then used to optimize the conformation of the side chains given the coordinates of the backbone of a protein. The method we use is based on a variety of dead-end elimination methods and the recently discovered dynamic programming algorithm for this problem. This was implemented in a computer program called Octopus, where we use side-chain cover sets with very small values of R, such as 0.1 Å, which ensures that for each amino-acid side chain the set contains a conformation with a root mean-square deviation of at most R from the optimal conformation.
    The side-chain dihedral-angle accuracy of the program is comparable to other implementations; however, it has the important advantage that the structures produced by the program have negative energies that are very close to the energies of the crystal structure for all tested proteins.
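
The abstract mentions dead-end elimination (DEE) without saying which variants Octopus uses. As one well-known instance, the sketch below applies the Goldstein criterion: rotamer r at residue i can be discarded if some other rotamer t at the same residue is strictly better no matter how the remaining residues are set. The energy layout (`self_e[i][r]`, `pair_e[(i, j)][r][s]` for `i < j`) and the function names are assumptions for this example, not the program's actual data structures.

```python
def pair(pair_e, i, r, j, s):
    """Pairwise energy between rotamer r of residue i and rotamer s of j."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def goldstein_dee(self_e, pair_e):
    """Repeatedly prune rotamers with the Goldstein criterion.

    Returns one set of surviving rotamer indices per residue.
    """
    n = len(self_e)
    alive = [set(range(len(e))) for e in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                for t in alive[i] - {r}:
                    # Smallest possible advantage of t over r, taken over
                    # every surviving rotamer choice at the other residues.
                    gap = self_e[i][r] - self_e[i][t]
                    gap += sum(
                        min(pair(pair_e, i, r, j, s) - pair(pair_e, i, t, j, s)
                            for s in alive[j])
                        for j in range(n) if j != i
                    )
                    if gap > 0:
                        # r is always strictly worse than t: eliminate it.
                        alive[i].discard(r)
                        changed = True
                        break
    return alive
```

The strict inequality means a rotamer is removed only when swapping it for t always lowers the energy, so every minimum-energy assignment survives the pruning and a later exact search can run on the reduced sets.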

    Bidimensionality and Geometric Graphs

    In this paper we use several of the key ideas from Bidimensionality to give a new generic approach to design EPTASs and subexponential time parameterized algorithms for problems on classes of graphs which are not minor closed, but instead exhibit a geometric structure. In particular we present EPTASs and subexponential time parameterized algorithms for Feedback Vertex Set, Vertex Cover, Connected Vertex Cover, Diamond Hitting Set, on map graphs and unit disk graphs, and for Cycle Packing and Minimum-Vertex Feedback Edge Set on unit disk graphs. Our results are based on the recent decomposition theorems proved by Fomin et al. [SODA 2011], and our algorithms work directly on the input graph. Thus it is not necessary to compute the geometric representations of the input graph. To the best of our knowledge, these results were previously unknown, with the exception of the EPTAS and a subexponential time parameterized algorithm on unit disk graphs for Vertex Cover, which were obtained by Marx [ESA 2005] and Alber and Fiala [J. Algorithms 2004], respectively. We proceed to show that our approach cannot be extended in its full generality to more general classes of geometric graphs, such as intersection graphs of unit balls in R^d, d >= 3. Specifically we prove that Feedback Vertex Set on unit-ball graphs in R^3 neither admits PTASs unless P=NP, nor subexponential time algorithms unless the Exponential Time Hypothesis fails. Additionally, we show that the decomposition theorems which our approach is based on fail for disk graphs and that therefore any extension of our results to disk graphs would require new algorithmic ideas. On the other hand, we prove that our EPTASs and subexponential time algorithms for Vertex Cover and Connected Vertex Cover carry over both to disk graphs and to unit-ball graphs in R^d for every fixed d.

    Recognizing Geometric Intersection Graphs Stabbed by a Line

    In this paper, we determine the computational complexity of recognizing two graph classes, grounded L-graphs and stabbable grid intersection graphs. An L-shape is made by joining the bottom end-point of a vertical (|) segment to the left end-point of a horizontal (-) segment. The top end-point of the vertical segment is known as the anchor of the L-shape. Grounded L-graphs are the intersection graphs of L-shapes such that all the L-shapes' anchors lie on the same horizontal line. We show that recognizing grounded L-graphs is NP-complete. This answers an open question asked by Jelínek & Töpfer (Electron. J. Comb., 2019). Grid intersection graphs are the intersection graphs of axis-parallel line segments in which two vertical (similarly, two horizontal) segments cannot intersect. We say that a (not necessarily axis-parallel) straight line ℓ stabs a segment s if s intersects ℓ. A graph G is a stabbable grid intersection graph (StabGIG) if there is a grid intersection representation of G in which the same line stabs all its segments. We show that recognizing StabGIG graphs is NP-complete, even on a restricted class of graphs. This answers an open question asked by Chaplick et al. (Order, 2018). Comment: 18 pages, 11 figures
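
To make the definition of grounded L-graphs concrete, the sketch below goes in the easy direction: given a representation, it computes the intersection graph. Recognition, the reverse direction, is the problem shown to be NP-complete. The `LShape` type and its field names are assumptions for this example, as is the convention that the ground line is y = 0 with every L-shape hanging below it, and that all anchors and all depths are distinct (general position).

```python
from itertools import combinations
from typing import NamedTuple


class LShape(NamedTuple):
    x: float      # anchor position on the ground line y = 0
    depth: float  # length of the vertical segment, going down from the anchor
    width: float  # length of the horizontal segment, going right from the bottom


def intersects(a, b):
    """True if two grounded L-shapes in general position share a point."""
    left, right = (a, b) if a.x < b.x else (b, a)
    # Only the horizontal arm of the left shape can meet the vertical arm of
    # the right shape: the arm must reach far enough to the right, and the
    # right shape's vertical segment must descend at least to the arm's depth.
    return left.x + left.width >= right.x and right.depth >= left.depth


def intersection_graph(shapes):
    """Edge set {(i, j) with i < j} of the grounded L-graph of the shapes."""
    return {(i, j)
            for (i, a), (j, b) in combinations(enumerate(shapes), 2)
            if intersects(a, b)}
```

For example, with `LShape(0, 1, 5)`, `LShape(2, 2, 1)` and `LShape(4, 0.5, 1)`, only the first two shapes intersect: the third shape's vertical segment stops above the first shape's horizontal arm, and the second shape's arm is too short to reach the third.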

    A protein-dependent side-chain rotamer library

    Background: The protein side-chain packing problem has remained one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries summarize the existing knowledge of the experimentally determined structures quantitatively. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since the backbone structure is given in the side-chain prediction problem, spatially local information should ideally be encoded into the rotamer libraries.
    Methods: In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call these protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities.
    Results: Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of side-chain prediction accuracy and rotamer ranking ability. Furthermore, without global optimization or search, the side-chain prediction power of the protein-dependent library is still comparable to that of global-search-based side-chain prediction methods.
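
The re-ranking step in the Methods can be sketched on a toy model. The abstract does not name the inference algorithms used, so the example below substitutes exact enumeration, which is only feasible for a handful of residues. It also assumes that the Markov random field is a pairwise model whose potentials are Boltzmann factors of energies `self_e[i][r]` and `pair_e[(i, j)][r][s]` (for `i < j`); the function names and the `kT` parameter are likewise hypothetical.

```python
import itertools
import math


def marginals_by_enumeration(self_e, pair_e, kT=1.0):
    """Exact per-residue rotamer marginals of p(x) proportional to exp(-E(x)/kT)."""
    n = len(self_e)
    sizes = [len(e) for e in self_e]
    marg = [[0.0] * k for k in sizes]
    z = 0.0  # partition function
    for assign in itertools.product(*(range(k) for k in sizes)):
        e = sum(self_e[i][assign[i]] for i in range(n))
        e += sum(pair_e[(i, j)][assign[i]][assign[j]]
                 for i in range(n) for j in range(i + 1, n))
        w = math.exp(-e / kT)
        z += w
        for i, r in enumerate(assign):
            marg[i][r] += w
    return [[m / z for m in row] for row in marg]


def rerank(marg):
    """Rotamer indices of each residue, most probable first."""
    return [sorted(range(len(row)), key=lambda r: -row[r]) for row in marg]
```

In a two-residue example where the library slightly prefers rotamer 1 at the first residue (`self_e = [[0.5, 0.0], [0.0, 0.0]]`) but that rotamer clashes with every rotamer of the neighbor (`pair_e = {(0, 1): [[0.0, 0.0], [3.0, 3.0]]}`), the marginals move rotamer 0 to the top of the ranking for the first residue.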

    DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing

    Proteins play a critical role in carrying out biological functions, and their 3D structures are essential in determining their functions. Accurately predicting the conformation of protein side-chains given their backbones is important for applications in protein structure prediction, design and protein-protein interactions. Traditional methods are computationally intensive and have limited accuracy, while existing machine learning methods treat the problem as a regression task and overlook the restrictions imposed by the constant covalent bond lengths and angles. In this work, we present DiffPack, a torsional diffusion model that learns the joint distribution of side-chain torsional angles, the only degrees of freedom in side-chain packing, by diffusing and denoising on the torsional space. To avoid issues arising from simultaneous perturbation of all four torsional angles, we propose autoregressively generating the four torsional angles from χ1 to χ4 and training diffusion models for each torsional angle. We evaluate the method on several benchmarks for protein side-chain packing and show that our method achieves improvements of 11.9% and 13.5% in angle accuracy on CASP13 and CASP14, respectively, with a significantly smaller model size (60x fewer parameters). Additionally, we show the effectiveness of our method in enhancing side-chain predictions in the AlphaFold2 model. Code will be available upon acceptance. Comment: Under review

    Finding Geometric Representations of Apex Graphs is NP-Hard

    Planar graphs can be represented as intersection graphs of different types of geometric objects in the plane, e.g., circles (Koebe, 1936), line segments (Chalopin & Gonçalves, 2009), L-shapes (Gonçalves et al., 2018). For general graphs, however, even deciding whether such representations exist is often NP-hard. We consider apex graphs, i.e., graphs that can be made planar by removing one vertex from them. We show, somewhat surprisingly, that deciding whether geometric representations exist for apex graphs is NP-hard. More precisely, we show that for every positive integer k, recognizing every graph class G which satisfies PURE-2-DIR ⊆ G ⊆ 1-STRING is NP-hard, even when the input graphs are apex graphs of girth at least k. Here, PURE-2-DIR is the class of intersection graphs of axis-parallel line segments (where intersections are allowed only between horizontal and vertical segments) and 1-STRING is the class of intersection graphs of simple curves (where two curves share at most one point) in the plane. This partially answers an open question raised by Kratochvíl & Pergel (2007). Most known NP-hardness reductions for these problems are from variants of 3-SAT. We reduce from the PLANAR HAMILTONIAN PATH COMPLETION problem, which uses the more intuitive notion of planarity. As a result, our proof is much simpler and encapsulates several classes of geometric graphs.