
    Protein Design Using Continuous Rotamers

    Optimizing amino acid conformation and identity is a central problem in computational protein design. Protein design algorithms must allow realistic protein flexibility during this optimization, or they may fail to find the lowest-energy sequence. Most design algorithms implement side-chain flexibility by allowing the side chains to move between a small set of discrete, low-energy states, which we call rigid rotamers. In this work we show that allowing continuous side-chain flexibility (which we call continuous rotamers) greatly improves protein flexibility modeling. We present a large-scale study that compares the sequences and best energy conformations in 69 protein-core redesigns using a rigid-rotamer model versus a continuous-rotamer model. We show that in nearly all of our redesigns the sequence found by the continuous-rotamer model is different from, and has a lower energy than, the one found by the rigid-rotamer model. Moreover, the sequences found by the continuous-rotamer model are more similar to the native sequences. We then show that the seemingly easy solution of sampling more rigid rotamers within the continuous region is not a practical alternative to a continuous-rotamer model: at computationally feasible resolutions, using more rigid rotamers was never better than a continuous-rotamer model and almost always resulted in higher energies. Finally, we present a new protein design algorithm based on the dead-end elimination (DEE) algorithm, which we call iMinDEE, that makes the use of continuous rotamers feasible in larger systems. iMinDEE guarantees finding the optimal answer while pruning the search space with nearly the same efficiency as DEE. Availability: Software is available under the GNU Lesser General Public License v3. Contact the authors for source code.
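For readers unfamiliar with dead-end elimination, the following is a minimal sketch of the classic rigid-rotamer pruning test (the Goldstein criterion), not of iMinDEE itself, whose bounds for continuous rotamers are not given in the abstract. The function name, the dictionary layout, and the toy energies are all assumptions made for illustration. A rotamer r at position i is pruned when some competitor t at the same position gives a strictly lower energy no matter which rotamers the other positions adopt.

```python
from typing import Dict, List, Set, Tuple

Rot = Tuple[int, int]  # (position, rotamer index); hypothetical encoding


def goldstein_prune(
    self_e: Dict[Rot, float],
    pair_e: Dict[Tuple[Rot, Rot], float],
    rotamers: Dict[int, List[int]],
) -> Set[Rot]:
    """One pass of Goldstein-style DEE over rigid rotamers (illustrative)."""

    def pair(a: Rot, b: Rot) -> float:
        # Pair energies are symmetric; accept either key order, default 0.
        return pair_e.get((a, b), pair_e.get((b, a), 0.0))

    pruned: Set[Rot] = set()
    positions = sorted(rotamers)
    for i in positions:
        for r in rotamers[i]:
            for t in rotamers[i]:
                if t == r:
                    continue
                # Self-energy difference between candidate r and competitor t.
                gap = self_e[(i, r)] - self_e[(i, t)]
                # Worst case for r (best case for t) at every other position.
                for j in positions:
                    if j == i:
                        continue
                    gap += min(
                        pair((i, r), (j, s)) - pair((i, t), (j, s))
                        for s in rotamers[j]
                    )
                if gap > 0:  # r can never beat t, so r is a dead end
                    pruned.add((i, r))
                    break
    return pruned


# Toy problem: two positions with two rotamers each (made-up energies).
rotamers = {0: [0, 1], 1: [0, 1]}
self_e = {(0, 0): 0.0, (0, 1): 5.0, (1, 0): 0.0, (1, 1): 1.0}
pair_e = {((0, 1), (1, 0)): -1.0, ((0, 1), (1, 1)): -2.0}
pruned = goldstein_prune(self_e, pair_e, rotamers)
print(pruned)
```

In this toy case, rotamer 1 at position 0 carries a self-energy penalty of 5 that its best pairwise interaction (a gain of 2) cannot recover, so it is eliminated. Rotamer 1 at position 1 survives because the test requires a strictly positive gap.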

    Maximum Persistency in Energy Minimization

    We consider the discrete pairwise energy minimization problem (weighted constraint satisfaction, max-sum labeling) and methods that identify a globally optimal partial assignment of variables. When finding a complete optimal assignment is intractable, determining optimal values for a subset of the variables is an interesting possibility. Existing methods are based on different sufficient conditions. We propose a new sufficient condition for partial optimality which is: (1) verifiable in polynomial time, (2) invariant to reparametrization of the problem and permutation of labels, and (3) includes many existing sufficient conditions as special cases. We pose the problem of finding the maximum optimal partial assignment identifiable by the new sufficient condition. A polynomial method is proposed which is guaranteed to assign the same or a larger part of the variables than several existing approaches. The core of the method is a specially constructed linear program that identifies persistent assignments in an arbitrary multi-label setting. Comment: Extended technical report for the CVPR 2014 paper. Update: correction to the proof of characterization theore

    Systematic conformational search with constraint satisfaction

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 170-177). Determining the conformations of biological molecules is a high scientific priority for biochemists and for the pharmaceutical industry. This thesis describes a systematic method for conformational search, an application of the method to determining the structure of the formyl-Met-Leu-Phe-OH (fMLF) peptide by solid-state NMR spectroscopy, and a separate project to determine the structure of a protein-DNA complex by X-ray crystallography. The purpose of the systematic search method is to enumerate all conformations of a molecule (at a given level of torsion angle resolution) that satisfy a set of local geometric constraints. Constraints would typically come from NMR experiments, but applications such as docking or homology modelling could also give rise to similar constraints. The molecule to be searched is partitioned into small subchains so that the set of possible conformations for the whole molecule may be constructed by merging the feasible conformations for the parts. However, instead of using a binary tree for straightforward divide-and-conquer, four innovations are introduced: (1) OMNIMERGE searches a subproblem for every possible subchain of the molecule. Searching every subchain provides the advantage that every possible merge is available; by choosing the most favorable merge for each subchain, the bottleneck subchain(s) and therefore the whole search may be completed more efficiently. (2) A cost function evaluates alternative divide-and-conquer trees, provided that a preliminary OMNIMERGE search of the molecule has been completed. Then dynamic programming determines the optimal partitioning or "merge-tree" for the molecule; this merge-tree can be used to improve the efficiency of future searches. (3) PROPAGATION shares information by enforcing arc consistency between the solution sets of overlapping subchains. By filtering the solution set of each subchain, infeasible conformations are discarded rapidly. (4) An A* function prioritizes each subchain based on estimated future costs. Subchains with sufficiently low priority can be skipped, which improves efficiency. A common theme of these four ideas is to make good choices about how to break the large search problem into lower-dimensional subproblems. These novel algorithms were implemented and the effectiveness of each is demonstrated on a well-constrained peptide with 40 degrees of freedom. by Lisa Tucker-Kellogg. Ph.D.
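The arc-consistency idea behind PROPAGATION can be illustrated with a small sketch. This is not the thesis implementation: the representation of a conformation as a dictionary from torsion names to discretized angles, the function names, and the example values are all assumptions. Each subchain keeps only those conformations that agree, on every shared torsion, with at least one conformation of every other subchain, and the filtering repeats until nothing changes.

```python
from typing import Dict, List

Conformation = Dict[str, int]  # torsion name -> discretized angle (assumed)


def compatible(a: Conformation, b: Conformation) -> bool:
    """True if a and b agree on every torsion they share."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def propagate(
    solution_sets: Dict[str, List[Conformation]]
) -> Dict[str, List[Conformation]]:
    """Filter subchain solution sets to a mutually arc-consistent fixpoint."""
    sets = {name: list(confs) for name, confs in solution_sets.items()}
    changed = True
    while changed:
        changed = False
        for x in sets:
            for y in sets:
                if x == y:
                    continue
                # Keep a conformation of x only if some conformation of y
                # supports it (matches on all shared torsions).
                kept = [a for a in sets[x]
                        if any(compatible(a, b) for b in sets[y])]
                if len(kept) != len(sets[x]):
                    sets[x] = kept
                    changed = True
    return sets


# Three overlapping subchains of a hypothetical four-torsion molecule.
subchains = {
    "A": [{"t1": 0, "t2": 60}, {"t1": 0, "t2": 180}],
    "B": [{"t2": 60, "t3": -60}, {"t2": 300, "t3": 60}],
    "C": [{"t3": -60, "t4": 0}, {"t3": 60, "t4": 0}],
}
filtered = propagate(subchains)
print(filtered)
```

Note how the filtering of B by A removes the only support for the second conformation of C, even though A and C share no torsions. This indirect pruning is the reason the loop runs to a fixpoint rather than making a single pass.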

    Protein structure prediction: improving and automating knowledge-based approaches

    This work presents a computational approach to improve the automatic prediction of protein structures from sequence. Its main focus was twofold. First, an automated method for guiding the modeling process was developed. This was tested and found to be state of the art in the CASP4 structure prediction contest in 2000. The second focus was the development of a novel divide-and-conquer algorithm for modeling flexible loops in proteins. Implementation of the search procedure and subsequent ranking is presented. The results are again compared with state-of-the-art methods.

    Exact rotamer optimization for computational protein design

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (leaves 235-244). The search for the global minimum energy conformation (GMEC) of protein side chains is an important computational challenge in protein structure prediction and design. Using rotamer models, the problem is formulated as an NP-hard optimization problem. Dead-end elimination (DEE) methods combined with systematic A* search (DEE/A*) have proven useful, but may not be strong enough as we attempt to solve protein design problems where a large number of similar rotamers is eligible and the network of interactions between residues is dense. In this thesis, we present an exact solution method, named BroMAP (branch-and-bound rotamer optimization using MAP estimation), for such protein design problems. The design goal of BroMAP is to be able to expand smaller search trees than conventional branch-and-bound methods while performing only a moderate amount of computation in each node, thereby reducing the total running time. To achieve that, BroMAP attempts reduction of the problem size within each node through DEE and elimination by energy lower bounds from approximate maximum-a-posteriori (MAP) estimation. The lower bounds are also exploited in branching and subproblem selection for fast discovery of strong upper bounds. Our computational results show that BroMAP tends to be faster than DEE/A* for large protein design cases. BroMAP also solved cases that were not solvable by DEE/A* within the maximum allowed time, and did not incur significant disadvantage for cases where DEE/A* performed well. In the second part of the thesis, we explore several ways of improving the energy lower bounds by using Lagrangian relaxation. Through computational experiments, solving the dual problem derived from cyclic subgraphs, such as triplets, is shown to produce stronger lower bounds than using the tree-reweighted max-product algorithm. In the second approach, the Lagrangian relaxation is tightened through addition of violated valid inequalities. Finally, we suggest a way of computing individual lower bounds using the dual method. The preliminary results from evaluating BroMAP employing the dual bounds suggest that the use of the strengthened bounds does not in general improve the running time of BroMAP due to the longer running time of the dual method. by Eun-Jong Hong. Ph.D.

    Grid-enabling Non-computer Resources


    Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021

    The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing

    Geometrical and probabilistic methods for determining association models and structures of protein complexes

    Protein complexes play vital roles in cellular processes within living organisms. They are formed by interactions between either different proteins (hetero-oligomers) or identical proteins (homo-oligomers). In order to understand the functions of the complexes, it is important to know the manner in which they are assembled from the component subunits and their three-dimensional structure. This thesis addresses both of these questions by developing geometrical and probabilistic methods for analyzing data from two complementary experiment types: Small Angle Scattering (SAS) and Nuclear Magnetic Resonance (NMR) spectroscopy. Data from an SAS experiment is a set of scattering intensities that can give the interatomic probability distributions. The NMR experimental data used in this thesis is a set of atom pairs and the maximum distance between them. From SAS data, this thesis determines the association model of the complex and intensities through an approach that is robust to noise and contaminants in solution. Using NMR data, this thesis computes the complex structure by using probabilistic inference and geometry of convex shapes. The structure determination methods are complete, that is, they identify all consistent conformations, and are data driven, in that the structures are evaluated separately for consistency with the data and for biophysical energy.