18 research outputs found

    Serverification of Molecular Modeling Applications: the Rosetta Online Server that Includes Everyone (ROSIE)

    The Rosetta molecular modeling software package provides experimentally tested and rapidly evolving tools for the 3D structure prediction and high-resolution design of proteins, nucleic acids, and a growing number of non-natural polymers. Despite its free availability to academic users and improving documentation, use of Rosetta has largely remained confined to developers and their immediate collaborators due to the code's difficulty of use, the requirement for large computational resources, and the unavailability of servers for most of the Rosetta applications. Here, we present a unified web framework for Rosetta applications called ROSIE (Rosetta Online Server that Includes Everyone). ROSIE provides (a) a common user interface for Rosetta protocols, (b) a stable application programming interface for developers to add additional protocols, (c) a flexible back-end to allow leveraging of computer cluster resources shared by RosettaCommons member institutions, and (d) centralized administration by the RosettaCommons to ensure continuous maintenance. This paper describes the ROSIE server infrastructure, a step-by-step 'serverification' protocol for use by Rosetta developers, and the deployment of the first nine ROSIE applications by six separate developer teams: Docking, RNA de novo, ERRASER, Antibody, Sequence Tolerance, Supercharge, Beta peptide design, NCBB design, and VIP redesign. As illustrated by the number and diversity of these applications, ROSIE offers a general and speedy paradigm for serverification of Rosetta applications that incurs negligible cost to developers and lowers barriers to Rosetta use for the broader biological community. ROSIE is available at http://rosie.rosettacommons.org

    Prediction of Mutational Tolerance in HIV-1 Protease and Reverse Transcriptase Using Flexible Backbone Protein Design

    Predicting which mutations proteins tolerate while maintaining their structure and function has important applications for modeling fundamental properties of proteins and their evolution; it also drives progress in protein design. Here we develop a computational model to predict the tolerated sequence space of HIV-1 protease reachable by single mutations. We assess the model by comparison to the observed variability in more than 50,000 HIV-1 protease sequences, one of the most comprehensive datasets on tolerated sequence space. We then extend the model to a second protein, reverse transcriptase. The model integrates multiple structural and functional constraints acting on a protein and uses ensembles of protein conformations. We find the model correctly captures a considerable fraction of protease and reverse-transcriptase mutational tolerance and shows comparable accuracy using either experimentally determined or computationally generated structural ensembles. Predictions of tolerated sequence space afforded by the model provide insights into stability-function tradeoffs in the emergence of resistance mutations and into strengths and limitations of the computational model.

    Performance of specific model features and DRMs for reverse transcriptase.

    Data representation in panels A, B and C is the same as in Figure 3C, D and E. (D) Recapitulation of reverse transcriptase DRMs by the HIV database and the selective model: A set of 31 literature-documented DRMs of reverse transcriptase [63], [64], [65], [66], [67] was categorized according to whether each is present in the Stanford HIV database (post-drug treatment data) and whether each is recapitulated by the selective model. Subscript and superscript numbers list mutation frequencies according to the HIV database and the selective model, respectively.

    Predicted energetic contributions of HIV-1 protease DRMs.

    (A) DRMs within 4 Å of the substrate-binding site [23]. Predicted changes in ERES_Fold, ERES_Dimer, and ERES_Peptide scores are relative to the ERES scores of the native residue type. ERES_Peptide scores are represented by the change in the sum of ERES scores for all 10 peptides before and after introducing the mutation. ERES scores are given in color codes, from −1.7 to 4 (blue to red), and >4 (framed red boxes) in Rosetta energy units (approximating kcal/mol), and columns are sorted in ascending order of the ERES_Fold scores. Mutations denoted as “Predicted” and “Not Predicted” were predicted by the selective model to have >0.01% and ≤0.01% frequencies, respectively. Mutations that required more than one nucleotide substitution are denoted as “disfavored”. Boxes with “X” indicate clashes in the wild-type structure. (B) As (A), but showing DRMs outside of the substrate-binding site. Major and minor DRMs are as defined in the text.

    Recapitulation of reverse transcriptase mutational tolerance by the neutral and selective models.

    Panels A and B have the same representation as in Figures 2A and B.

    Model performance for protease and importance of specific model features: multiple constraints and backbone flexibility.

    (A) ROC curves for predictions using three functional constraints and a crystallographic ensemble of protease structures are shown for the neutral (top) and selective (bottom) models. For reference, these curves are also duplicated in panels B–D. (B) ROC curves for predictions using fold stability as a single constraint and a crystallographic ensemble of protease structures are shown for the neutral (cyan, top) and selective (cyan, bottom) models. (C) ROC curves for predictions using three constraints and a single crystallographic protease structure are shown for the neutral (grey, top) and selective (grey, bottom) models. Curves are shown for 11 single protease structures. (D) ROC curves for predictions using three constraints and a computationally generated ensemble of protease structures are shown for the neutral (orange, top) and selective (orange, bottom) models. Curves are shown for 11 computational ensembles each generated from one of the 11 single protease structures used in (C). (E) AUC values are shown for each of the ROC curves depicted in (A–D). For ROC curves in (A–D), true positive tolerated mutations are defined as those observed with a frequency above 1% in the Stanford database (57 and 93 mutations for the neutral and selective models, respectively). The subset of all amino acids reachable by one nucleotide change and the subset of all amino acids that are chemically similar to the native are denoted by a red triangle and blue square, respectively (see text). Dashed lines connect the last ROC value (lowest frequency threshold) and the (100%, 100%) point.
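
    The ROC and AUC evaluation described in this caption can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names `roc_points` and `auc` and the toy inputs are hypothetical, and the only detail taken from the caption is that a mutation counts as a true positive when its observed frequency is above 1%. The sketch sweeps a threshold over predicted frequencies and integrates the resulting curve with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points for a threshold sweep.

    scores: predicted frequencies, one per candidate mutation (hypothetical input).
    labels: True where the mutation counts as tolerated, e.g. observed above 1%.
    """
    n_pos = sum(1 for is_pos in labels if is_pos)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Rank predictions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Predictions with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data: observed frequencies (in percent) decide the labels.
observed_percent = [5.0, 0.2, 3.0, 0.0]
predicted = [0.9, 0.8, 0.7, 0.6]
tolerated = [freq > 1.0 for freq in observed_percent]
curve = roc_points(predicted, tolerated)
area = auc(curve)
```

    An area near 1 means the predicted frequencies rank tolerated mutations above non-tolerated ones; an area near 0.5 is what a random ranking would give.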

    Computational model for predicting mutational tolerance.

    (A) Flowchart illustrating key steps. (B) Example calculations for position 50 in HIV-1 protease. For each position in the protein of interest, all amino acid residue types (except cysteine) are computationally modeled onto each structure in an ensemble of backbone structures. For each mutation, the Rosetta per-residue energy contribution (ERES) is recorded for each structure. These values are depicted as boxplots showing the variation in the ERES scores calculated over the ensemble (ERES_Fold and ERES_Dimer scores are shown for 263 experimentally determined protease structures; ERES_Peptide scores are shown for 19 structures with a substrate peptide bound). Next, the minimum (i.e. most favorable) ERES score observed among all structures in the ensemble is determined with respect to fold stability (left boxplot, blue circles), dimer stability (middle boxplot, green circles) and binding to 10 substrate peptides (right boxplot, red circles). These minimum scores are then weighted and summed for each point mutation to yield W_Sum for each position j and amino acid i (Equation 1). Sums are performed using either neutral or selective weights (see Table S4). The resulting scores are combined using Equation (2) to give predicted frequencies for each residue type (superscript). For comparison, the mutational frequencies for position 50 observed in the Stanford HIV-1 database before and after inhibitor treatment are shown below the predicted frequencies (superscripts for each observed residue type).
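
    The minimum-then-weighted-sum step in this caption can be sketched as follows. This is only an illustration under stated assumptions: the function name `weighted_sum_score`, the constraint keys, the example scores, and the unit weights are all hypothetical (the actual neutral and selective weights are those of the paper's Table S4, which is not reproduced here), and the conversion to predicted frequencies via Equation (2) is deliberately left out because that equation is not given in this listing.

```python
def weighted_sum_score(eres_by_constraint, weights):
    """Combine per-constraint ensemble scores into a single W_Sum-like value.

    eres_by_constraint: maps a constraint name to the ERES scores computed for
        one mutation on each structure of the ensemble (hypothetical values).
    weights: maps the same constraint names to weights (placeholders, not the
        published neutral or selective weights).
    """
    total = 0.0
    for name, scores in eres_by_constraint.items():
        if not scores:
            raise ValueError(f"no ensemble scores for constraint {name!r}")
        # The most favorable score over the ensemble is the minimum.
        best = min(scores)
        total += weights[name] * best
    return total


# Hypothetical ERES scores for one point mutation, in Rosetta energy units.
example_scores = {
    "fold": [1.2, -0.5, 0.3],
    "dimer": [0.0, 0.4],
    "peptide": [2.0, 1.0],
}
placeholder_weights = {"fold": 1.0, "dimer": 1.0, "peptide": 1.0}
w_sum = weighted_sum_score(example_scores, placeholder_weights)
```

    Taking the minimum over the ensemble lets a mutation count as tolerated if at least one backbone conformation accommodates it, which is why the ensemble matters more than any single structure in this scheme.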

    Accurate positioning of functional residues with robotics-inspired computational protein design.

    Significance: Computational protein design promises to advance applications in medicine and biotechnology by creating proteins with many new and useful functions. However, new functions require the design of specific and often irregular atom-level geometries, which remains a major challenge. Here, we develop computational methods that design and predict local protein geometries with greater accuracy than existing methods. Then, as a proof of concept, we leverage these methods to design new protein conformations in the enzyme ketosteroid isomerase that change the protein's preference for a key functional residue. Our computational methods are openly accessible and can be applied to the design of other intricate geometries customized for new user-defined protein functions.

    A Web Resource for Standardized Benchmark Datasets, Metrics, and Rosetta Protocols for Macromolecular Modeling and Design

    The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a “best practice” set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.