
    Optimization of van der Waals Energy for Protein Side-Chain Placement and Design

    Computational determination of optimal side-chain conformations in protein structures has been a long-standing and challenging problem. Solving this problem is important for many applications including homology modeling, protein docking, and for placing small molecule ligands on protein-binding sites. Programs available as of this writing are very fast and reasonably accurate, as measured by deviations of side-chain dihedral angles; however, often due to multiple atomic clashes, they produce structures with high positive energies. This is problematic in applications where the energy values are important, for example when placing small molecules in docking applications; the relatively small binding energy of the small molecule is drowned out by the large energy due to atomic clashes, which hampers finding the lowest energy state of the docked ligand. To address this, we have developed an algorithm for generating a set of side-chain conformations that is dense enough that at least one of its members would have a root mean-square deviation of no more than R Å from any possible side-chain conformation of the amino acid. We call such a set a side-chain cover set of order R for the amino acid. The size of the set is constrained by the energy of the interaction of the side chain to the backbone atoms. Then, side-chain cover sets are used to optimize the conformation of the side chains given the coordinates of the backbone of a protein. The method we use is based on a variety of dead-end elimination methods and the recently discovered dynamic programming algorithm for this problem. This was implemented in a computer program called Octopus where we use side-chain cover sets with very small values for R, such as 0.1 Å, which ensures that for each amino-acid side chain the set contains a conformation with a root mean-square deviation of, at most, R from the optimal conformation. The side-chain dihedral-angle accuracy of the program is comparable to other implementations; however, it has the important advantage that the structures produced by the program have negative energies that are very close to the energies of the crystal structure for all tested proteins.
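The cover-set idea in this abstract can be illustrated with a minimal sketch. It assumes a finite sample of candidate conformations (each a list of atom coordinates already expressed in a common frame) and uses a simple greedy pass; the abstract does not say how Octopus builds its sets, and the backbone-energy filter it mentions is left out. The names `rmsd` and `greedy_cover` are hypothetical.

```python
import math

def rmsd(a, b):
    """Root mean-square deviation between two equal-length lists of (x, y, z) atoms."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def greedy_cover(conformations, r):
    """Pick representatives so every sampled conformation lies within r of one of them.

    Assumption: this covers only the finite sample passed in, not the
    continuous conformation space described in the abstract.
    """
    cover = []
    for conf in conformations:
        # Keep a conformation only if no existing representative covers it.
        if all(rmsd(conf, c) > r for c in cover):
            cover.append(conf)
    return cover

# Toy one-atom "side chains" placed along the x axis (illustrative data only).
samples = [[(0.0, 0.0, 0.0)], [(0.05, 0.0, 0.0)], [(0.25, 0.0, 0.0)], [(0.3, 0.0, 0.0)]]
cover = greedy_cover(samples, r=0.1)
```

A smaller R gives a denser set and therefore a larger search problem, which is the trade-off the abstract addresses by bounding the set size with the side-chain-to-backbone energy.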

    A Generic Program for Multistate Protein Design

    Some protein design tasks cannot be modeled by the traditional single state design strategy of finding a sequence that is optimal for a single fixed backbone. Such cases require multistate design, where a single sequence is threaded onto multiple backbones (states) and evaluated for its strengths and weaknesses on each backbone. For example, to design a protein that can switch between two specific conformations, it is necessary to find a sequence that is compatible with both backbone conformations. We present in this paper a generic implementation of multistate design that is suited for a wide range of protein design tasks and demonstrate in silico its capabilities on two design tasks: one of redesigning an obligate homodimer into an obligate heterodimer such that the new monomers would not homodimerize, and one of redesigning a promiscuous interface to bind to only a single partner and to no longer bind the rest of its partners. Both tasks contained negative design in that multistate design was asked to find sequences that would produce high energies for several of the states being modeled. Success at negative design was assessed by computationally redocking the undesired protein-pair interactions; we found that multistate design's accuracy improved as the diversity of conformations for the undesired protein-pair interactions increased. The paper concludes with a discussion of the pitfalls of negative design, which has proven considerably more challenging than positive design.
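A toy sketch of how one sequence might be scored across several states with both positive and negative design. Everything here is an assumption for illustration: the per-state energy tables, the two-letter alphabet, and the plain "positive minus negative" fitness are hypothetical and are not taken from the paper's implementation.

```python
from itertools import product

def state_energy(table):
    """Build a toy energy function for one state from a {(position, residue): energy} table."""
    return lambda seq: sum(table.get((i, aa), 0.0) for i, aa in enumerate(seq))

def multistate_fitness(seq, positive_states, negative_states):
    """Lower is better: reward low energy on desired states and high energy on undesired ones."""
    pos = sum(e(seq) for e in positive_states)
    neg = sum(e(seq) for e in negative_states)
    return pos - neg

# Hypothetical tables: 'target' is the state to stabilize, 'competitor' the one to disfavor.
target = state_energy({(0, "A"): -2.0, (1, "K"): -1.0})
competitor = state_energy({(0, "A"): -1.0, (1, "A"): -1.0, (1, "K"): 2.0})

# Exhaustively score every length-2 sequence over the toy alphabet.
candidates = ["".join(p) for p in product("AK", repeat=2)]
best = min(candidates, key=lambda s: multistate_fitness(s, [target], [competitor]))
```

In this toy case the sequence `AK` wins because it is favorable on the target and unfavorable on the competitor. A real fitness function would need to bound the negative-state term, since an unbounded reward for high energy can be satisfied by sequences that simply clash.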

    DISCRETIZED GEOMETRIC APPROACHES TO THE ANALYSIS OF PROTEIN STRUCTURES

    Proteins play crucial roles in a variety of biological processes. While we know that their amino acid sequence determines their structure, which in turn determines their function, we do not know why particular sequences fold into particular structures. My work focuses on discretized geometric descriptions of protein structure—conceptualizing native structure space as composed of mostly discrete, geometrically defined fragments—to better understand the patterns underlying why particular sequence elements correspond to particular structure elements. This discretized geometric approach is applied to multiple levels of protein structure, from conceptualizing contacts between residues as interactions between discrete structural elements to treating protein structures as an assembly of discrete fragments. My earlier work focused on better understanding inter-residue contacts and estimating their energies statistically. By scoring structures with energies derived from a stricter notion of contact, I show that native protein structures can be identified out of a set of decoy structures more often than when using energies derived from traditional definitions of contact and how this has implications for the evaluation of predictions that rely on structurally defined contacts for validation. Demonstrating how useful simple geometric descriptors of structure can be, I then show that these energies identify native structures on par with well-validated, detailed, atomistic energy functions. Moving to a higher level of structure, in my later work I demonstrate that discretized, geometrically defined structural fragments make good objects for the interactive assembly of protein backbones and present a software application which lets users do so. 
Finally, I use these fragments to generate structure-conditioned statistical energies, generalizing the classic idea of contact energies by incorporating specific structural context, enabling these energies to reflect the interaction geometries they come from. These structure-conditioned energies contain more information about native sequence preferences, correlate more highly with experimentally determined energies, and show that pairwise sequence preferences are tightly coupled to their structural context. Considered jointly, these projects highlight the degree to which protein structures and the interactions they comprise can be understood as geometric elements coming together in finely tuned ways.
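For readers unfamiliar with statistical contact energies, the classic idea this abstract generalizes can be sketched as an inverse-Boltzmann estimate: a residue pair seen in contact more often than random mixing predicts gets a negative (favorable) energy. This is the generic textbook form with a random-mixing reference state, assumed here for illustration; it is not the stricter contact definition or the structure-conditioned energies developed in the dissertation, and the contact list is made up.

```python
import math
from collections import Counter

def contact_energies(contacts):
    """Estimate pair energies as -ln(observed / expected) from a list of residue-pair contacts."""
    pair_counts = Counter(tuple(sorted(c)) for c in contacts)
    residue_counts = Counter(res for c in contacts for res in c)
    total_pairs = len(contacts)
    total_ends = 2 * total_pairs  # each contact contributes two residue endpoints
    energies = {}
    for (a, b), observed in pair_counts.items():
        fa = residue_counts[a] / total_ends
        fb = residue_counts[b] / total_ends
        # Random-mixing expectation; unordered mixed pairs can arise two ways.
        expected = total_pairs * fa * fb * (1 if a == b else 2)
        energies[(a, b)] = -math.log(observed / expected)
    return energies

# Made-up contacts: leucine pairs with itself more often than with lysine.
toy_contacts = [("L", "L"), ("L", "L"), ("L", "K"), ("K", "K")]
energies = contact_energies(toy_contacts)
```

Here the L-L pair comes out favorable and the K-L pair unfavorable, because the mixed contact is observed less often than the residue frequencies alone would predict.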

    Structure-activity relationships of constrained phenylethylamine ligands for the serotonin 5-HT2 receptors

    Serotonergic ligands have proven effective drugs in the treatment of migraine, pain, obesity, and a wide range of psychiatric and neurological disorders. There is a clinical need for more highly 5-HT(2) receptor subtype-selective ligands and the most attention has been given to the phenethylamine class. Conformationally constrained phenethylamine analogs have demonstrated that for optimal activity the free lone pair electrons of the 2-oxygen must be oriented syn and the 5-oxygen lone pairs anti relative to the ethylamine moiety. The ethyl linker has also been constrained, providing information about the bioactive conformation of the amine functionality. However, combined 1,2-constriction by cyclization has only been tested with one compound. Here, we present three new 1,2-cyclized phenylethylamines, 9–11, and describe their synthetic routes. Ligand docking in the 5-HT(2B) crystal structure showed that the 1,2-heterocyclized compounds can be accommodated in the binding site. Conformational analysis showed that 11 can only bind in a higher-energy conformation, which would explain its absent or low affinity. The amine and 2-oxygen interactions with D3.32 and S3.36, respectively, can form but shift the placement of the core scaffold. The constraints in 9–11 resulted in docking poses with the 4-bromine in closer vicinity to 5.46, which is polar only in the human 5-HT(2A) subtype, for which 9–11 have the lowest affinity. The new ligands, conformational analysis and docking expand the structure-activity relationships of constrained phenethylamines and contribute towards the development of 5-HT(2) receptor subtype-selective ligands.

    Capturing Atomic Interactions with a Graphical Framework in Computational Protein Design

    A protein's amino acid sequence determines both its chemical and its physical structures, and together these two structures determine its function. Protein designers seek new amino acid sequences with chemical and physical structures capable of performing some function. The vast size of sequence space frustrates efforts to find useful sequences. Protein designers model proteins on computers and search through amino acid sequence space computationally. They represent the three-dimensional structures for the sequences they examine, specifying the location of each atom, and evaluate the stability of these structures. Good structures are tightly packed but are free of collisions. Designers seek a sequence with a stable structure that meets the geometric and chemical requirements to function as desired; they frame their search as an optimization problem. In this dissertation, I present a graphical model of the central optimization problem in protein design, the side-chain-placement problem. This model allows the formulation of a dynamic programming solution, thus connecting side-chain placement with the class of NP-complete problems for which certain instances admit polynomial time solutions. Moreover, the graphical model suggests a natural data structure for storing the energies used in design. With this data structure, I have created an extensible framework for the representation of energies during side-chain-placement optimization and have incorporated this framework into the Rosetta molecular modeling program. I present one extension that incorporates a new degree of structural variability into the optimization process. I present another extension that includes a non-pairwise decomposable energy function, the first of its kind in protein design, laying the groundwork to capture aspects of protein stability that could not previously be incorporated into the optimization of side-chain placement.
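To make the link between the graphical model and dynamic programming concrete, here is a minimal sketch for the easiest case: an interaction graph that is a simple chain, where each residue interacts only with its neighbor. This is an assumed special case chosen for illustration, not the dissertation's general algorithm or Rosetta's data structures; `chain_dp` and the energy tables are hypothetical.

```python
def chain_dp(self_e, pair_e):
    """Exact minimum-energy rotamer assignment on a chain-shaped interaction graph.

    self_e[i][r]    : energy of rotamer r at residue i
    pair_e[i][r][s] : energy between rotamer r at residue i and rotamer s at residue i + 1
    Returns (minimum energy, list of chosen rotamer indices).
    """
    best = list(self_e[0])  # best[r]: lowest energy of the prefix ending in rotamer r
    back = []               # back-pointers for recovering the assignment
    for i in range(1, len(self_e)):
        new_best, ptr = [], []
        for s, e_s in enumerate(self_e[i]):
            # Best predecessor rotamer for rotamer s at residue i.
            r = min(range(len(best)), key=lambda r: best[r] + pair_e[i - 1][r][s])
            new_best.append(best[r] + pair_e[i - 1][r][s] + e_s)
            ptr.append(r)
        best = new_best
        back.append(ptr)
    last = min(range(len(best)), key=best.__getitem__)
    energy = best[last]
    assignment = [last]
    for ptr in reversed(back):
        assignment.append(ptr[assignment[-1]])
    assignment.reverse()
    return energy, assignment

# Toy problem: three residues with two rotamers each (made-up energies).
self_e = [[0.0, 1.0], [0.5, 0.0], [0.0, 2.0]]
pair_e = [[[0.0, 3.0], [1.0, 0.0]], [[0.0, 1.0], [2.0, 0.0]]]
energy, assignment = chain_dp(self_e, pair_e)
```

The cost grows linearly with the number of residues and quadratically with the rotamers per residue. On denser interaction graphs the same idea applies, but the cost grows exponentially with the graph's treewidth, which is why only certain instances are solvable in polynomial time.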

    Molecular Modeling in Enzyme Design, Toward In Silico Guided Directed Evolution

    Directed evolution (DE) creates diversity in subsequent rounds of mutagenesis in the quest for increased protein stability, substrate binding, and catalysis. Although this technique does not require any structural/mechanistic knowledge of the system, the frequency of improved mutations is usually low. For this reason, computational tools are increasingly used to focus the search in sequence space, enhancing the efficiency of laboratory evolution. In particular, molecular modeling methods provide a unique tool to grasp the sequence/structure/function relationship of the protein to evolve, with the only condition that a structural model is provided. With this book chapter, we tried to guide the reader through the state of the art of molecular modeling, discussing the methods' strengths, limitations, and directions. In addition, we suggest a possible future template for in silico directed evolution where we underline two main points: a hierarchical computational protocol combining several different techniques and a synergic effort between simulations and experimental validation. Peer reviewed. Postprint (author's final draft).

    Enhancing Human Spermine Synthase Activity by Engineered Mutations

    Spermine synthase (SMS) is an enzyme whose function is to convert spermidine into spermine. It was shown that gene defects resulting in amino acid changes of the wild type SMS cause Snyder-Robinson syndrome, which is a mild-to-moderate mental disability associated with osteoporosis, facial asymmetry, thin habitus, hypotonia, and a nonspecific movement disorder. These disease-causing missense mutations were demonstrated, both in silico and in vitro, to affect the wild type function of SMS by either destabilizing the SMS dimer/monomer or directly affecting the hydrogen bond network of the active site of SMS. In contrast to these studies, here we report an artificial engineering of a more efficient SMS variant by transferring sequence information from another organism. It is confirmed experimentally that the variant, bearing four amino acid substitutions, is catalytically more active than the wild type. The increased functionality is attributed to enhanced monomer stability, lowering the pKa of proton donor catalytic residue, optimized spatial distribution of the electrostatic potential around the SMS with respect to substrates, and increase of the frequency of mechanical vibration of the clefts presumed to be the gates toward the active sites. The study demonstrates that wild type SMS is not particularly evolutionarily optimized with respect to the reaction spermidine → spermine. Having in mind that currently there are no variations (non-synonymous single nucleotide polymorphism, nsSNP) detected in healthy individuals, it can be speculated that the human SMS function is precisely tuned toward its wild type and any deviation is unwanted and disease-causing.

    Exact rotamer optimization for computational protein design

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. By Eun-Jong Hong. Includes bibliographical references (leaves 235-244).
    The search for the global minimum energy conformation (GMEC) of protein side chains is an important computational challenge in protein structure prediction and design. Using rotamer models, the problem is formulated as an NP-hard optimization problem. Dead-end elimination (DEE) methods combined with systematic A* search (DEE/A*) have proven useful, but may not be strong enough as we attempt to solve protein design problems where a large number of similar rotamers is eligible and the network of interactions between residues is dense. In this thesis, we present an exact solution method, named BroMAP (branch-and-bound rotamer optimization using MAP estimation), for such protein design problems. The design goal of BroMAP is to be able to expand smaller search trees than conventional branch-and-bound methods while performing only a moderate amount of computation in each node, thereby reducing the total running time. To achieve that, BroMAP attempts reduction of the problem size within each node through DEE and elimination by energy lower bounds from approximate maximum-a-posteriori (MAP) estimation. The lower bounds are also exploited in branching and subproblem selection for fast discovery of strong upper bounds. Our computational results show that BroMAP tends to be faster than DEE/A* for large protein design cases. BroMAP also solved cases that were not solvable by DEE/A* within the maximum allowed time, and did not incur significant disadvantage for cases where DEE/A* performed well. In the second part of the thesis, we explore several ways of improving the energy lower bounds by using Lagrangian relaxation. Through computational experiments, solving the dual problem derived from cyclic subgraphs, such as triplets, is shown to produce stronger lower bounds than using the tree-reweighted max-product algorithm. In the second approach, the Lagrangian relaxation is tightened through addition of violated valid inequalities. Finally, we suggest a way of computing individual lower bounds using the dual method. The preliminary results from evaluating BroMAP employing the dual bounds suggest that the use of the strengthened bounds does not in general improve the running time of BroMAP due to the longer running time of the dual method.
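Dead-end elimination, which this abstract names as one of the pruning steps applied inside each branch-and-bound node, can be sketched with the standard Goldstein criterion: rotamer r at position i is removed if some other rotamer t at the same position is strictly better no matter what the other positions choose. This is a generic textbook formulation assumed for illustration, not code from the thesis; `goldstein_dee` and the toy energies are hypothetical.

```python
def goldstein_dee(self_e, pair_e):
    """Apply Goldstein dead-end elimination until no more rotamers can be removed.

    self_e[i][r]         : self energy of rotamer r at position i
    pair_e[(i, j)][r][s] : pair energy for rotamer r at i and s at j, stored for i < j
    Returns a list of sets of surviving rotamer indices, one set per position.
    """
    n = len(self_e)

    def pair(i, r, j, s):
        # Look up the pair energy regardless of argument order.
        return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]

    alive = [set(range(len(e))) for e in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                for t in alive[i] - {r}:
                    # Smallest possible energy gap between choosing r and choosing t.
                    gap = self_e[i][r] - self_e[i][t]
                    gap += sum(
                        min(pair(i, r, j, s) - pair(i, t, j, s) for s in alive[j])
                        for j in range(n) if j != i
                    )
                    if gap > 0:  # t always beats r, so r cannot be in the optimum
                        alive[i].discard(r)
                        changed = True
                        break
    return alive

# Toy problem: two positions with two rotamers each (made-up energies).
self_e = [[0.0, 5.0], [0.0, 0.0]]
pair_e = {(0, 1): [[0.0, 1.0], [0.0, 1.0]]}
survivors = goldstein_dee(self_e, pair_e)
```

In this toy case elimination alone leaves one rotamer per position. On dense design problems it usually leaves many, which is the situation where the abstract pairs it with search and lower bounds.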