10 research outputs found

    VarMod: modelling the functional effects of non-synonymous variants.

    Unravelling the genotype–phenotype relationship in humans remains a challenging task in genomics studies. Recent advances in sequencing technologies mean there are now thousands of sequenced human genomes, revealing millions of single nucleotide variants (SNVs). For non-synonymous SNVs (nsSNVs) in proteins, the difficulty lies first in identifying those nsSNVs that result in a functional change in the protein among the many non-functional variants, and then in linking this functional change to phenotype. Here we present VarMod (Variant Modeller), a method that utilises both protein sequence and structural features to predict nsSNVs that alter protein function. VarMod builds on recent observations that functional nsSNVs are enriched at protein–protein interfaces and protein–ligand binding sites and uses these characteristics to make predictions. In benchmarking on a set of nearly 3000 nsSNVs, VarMod performance is comparable to an existing state-of-the-art method. The VarMod web server provides extensive resources to investigate the sequence and structural features associated with the predictions, including visualisation of protein models and complexes via an interactive JSmol molecular viewer. VarMod is available for use at http://www.wasslab.org/varmod.

    PDBe-KB: collaboratively defining the biological context of structural data

    The Protein Data Bank in Europe – Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive.

    PDBe-KB: collaboratively defining the biological context of structural data

    The Protein Data Bank in Europe – Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive.

    Funding: ELIXIR [IDP implementation study]; Biotechnology and Biological Sciences Research Council via the 3D-Gateway [BB/T01959X/1]; FunPDBe [BB/P024351/1]; European Molecular Biology Laboratory-European Bioinformatics Institute who supported this work; J.D. acknowledges support from the Ministry of Education, Youth and Sport of the Czech Republic [INBIO CZ.02.1.01/0.0/0.0/16_026/0008451]; R.S., K.B. and J.D. also acknowledge support from the Ministry of Education, Youth and Sport of the Czech Republic [ELIXIR-CZ LM2018131]; L.M. acknowledges support from the European Union's Horizon 2020 Programme (H2020-INFRAIA-2018-1) [823839]; Research Foundation Flanders (FWO) [G032816N, G042518N, G028821N]; W.V. acknowledges support from the Research Foundation Flanders (FWO) [G032816N, G028821N]; A.R. acknowledges support from the Fondazione Cassa Di Risparmio di Firenze [24316]; European Commission [101017567]; M.H.C. acknowledges the AIRC project to MHC [IG 23539]; J.F.-R. acknowledges support from the Spanish Ministry of Science and Innovation [PID2019-110167RB-I00]; N.R. acknowledges support from the Norwegian Research Council (Norges Forskningsråd) [288008]; E.D.L. acknowledges support from the European Union's Horizon 2020 research and innovation programme [819318]; M.J.E.S. acknowledges support from the Wellcome Trust [104955/Z/14/Z, 218242/Z/19/Z]. Funding for open access charge: Biotechnology and Biological Sciences Research Council grant [BB/T01959X/1]; Wellcome Trust [104955/Z/14/Z and 218242/Z/19/Z].

    Peer Reviewed

    Current PDBe-KB Consortium Members with Affiliations. (BSC author: Gonzalo Parra) Mihaly Varadi, Stephen Anyango, David Armstrong, John Berrisford, Preeti Choudhary, Mandar Deshpande, Nurul Nadzirin, Sreenath S. Nair, Lukas Pravda, Ahsan Tanweer, Bissan Al-Lazikani, Claudia Andreini, Geoffrey J. Barton, David Bednar, Karel Berka, Tom Blundell, Kelly P Brock, Jose Maria Carazo, Jiri Damborsky, Alessia David, Sucharita Dey, Roland Dunbrack, Juan Fernandez Recio, Franca Fraternali, Toby Gibson, Manuela Helmer-Citterich, David Hoksza, Thomas Hopf, David Jakubec, Natarajan Kannan, Radoslav Krivak, Manjeet Kumar, Emmanuel D Levy, Nir London, Jose Ramon Macias, Madhusudhan M. Srivatsan, Debora S Marks, Lennart Martens, Stuart A McGowan, Jake E McGreig, Vivek Modi, R. Gonzalo Parra, Gerardo Pepe, Damiano Piovesan, Jaime Prilusky, Valeria Putignano, Leandro G. Radusky, Pathmanaban Ramasamy, Atilio O. Rausch, Nathalie Reuter, Luis A. Rodriguez, Nathan J Rollins, Antonio Rosato, Paweł Rubach, Luis Serrano, Gulzar Singh, Petr Skoda, Carlos Oscar S. Sorzano, Jan Stourac, Joanna I Sulkowska, Radka Svobodova, Natalia Tichshenko, Silvio C.E. Tosatto, Wim Vranken, Mark N Wass, Dandan Xue, Daniel Zaidman, Janet Thornton, Michael Sternberg, Christine Orengo, Sameer Velankar

    Postprint (published version)

    3DLigandSite: Structure-based prediction of protein-ligand binding sites

    3DLigandSite is a web tool for the prediction of ligand-binding sites in proteins. Here, we report a significant update since the first release of 3DLigandSite in 2010. The overall methodology remains the same, with candidate binding sites in proteins inferred using known binding sites in related protein structures as templates. However, the initial structural modelling step now uses the newly available structures from the AlphaFold database or alternatively Phyre2 when AlphaFold structures are not available. Further, a sequence-based search using HHSearch has been introduced to identify template structures with bound ligands that are used to infer the ligand-binding residues in the query protein. Finally, we introduced a machine learning element as the final prediction step, which improves the accuracy of predictions and provides a confidence score for each residue predicted to be part of a binding site. Validation of 3DLigandSite on a set of 6416 binding sites obtained 92% recall at 75% precision for non-metal binding sites and 52% recall at 75% precision for metal binding sites. 3DLigandSite is available at https://www.wass-michaelislab.org/3dligandsite. Users submit either a protein sequence or structure. Results are displayed in multiple formats including an interactive Mol* molecular visualization of the protein and the predicted binding sites.
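
    The "recall at 75% precision" figures quoted for 3DLigandSite can be read as follows: per-residue confidence scores are thresholded, and recall is reported at a threshold where precision reaches the target. The sketch below is only an illustration of that metric on toy data. The function names, the scores and the labels are hypothetical and are not taken from 3DLigandSite or its validation set.

```python
from typing import Optional, Sequence, Tuple


def precision_recall(scores: Sequence[float], labels: Sequence[bool],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when residues scoring >= threshold are predicted as binding."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # convention: no predictions -> 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def recall_at_precision(scores: Sequence[float], labels: Sequence[bool],
                        target_precision: float) -> Optional[float]:
    """Best recall over all thresholds whose precision meets the target (None if none do)."""
    best: Optional[float] = None
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall(scores, labels, threshold)
        if precision >= target_precision and (best is None or recall > best):
            best = recall
    return best


# Toy example (hypothetical values): six residues with confidence scores and true labels.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [True, True, False, True, False, False]
print(recall_at_precision(toy_scores, toy_labels, 0.75))
```

    Scanning every observed score as a candidate threshold keeps the sketch simple; it is quadratic in the number of residues, which is acceptable for illustration but not for large benchmarks.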

    PrankWeb: a web server for ligand binding site prediction and visualization.

    PrankWeb is an online resource providing an interface to P2Rank, a state-of-the-art method for ligand binding site prediction. P2Rank is a template-free machine learning method based on the prediction of local chemical neighborhood ligandability centered on points placed on a solvent-accessible protein surface. Points with a high ligandability score are then clustered to form the resulting ligand binding sites. In addition, PrankWeb provides a web interface enabling users to easily carry out the prediction and visually inspect the predicted binding sites via an integrated sequence-structure view. Moreover, PrankWeb can determine sequence conservation for the input molecule and use this in both the prediction and result visualization steps. Alongside its online visualization options, PrankWeb also offers the possibility of exporting the results as a PyMOL script for offline visualization. The web frontend communicates with the server side via a REST API. In high-throughput scenarios, therefore, users can utilize the server API directly, bypassing the need for a web-based frontend or installation of the P2Rank application. PrankWeb is available at http://prankweb.cz/, while the web application source code and the P2Rank method can be accessed at https://github.com/jendelel/PrankWebApp and https://github.com/rdk/p2rank, respectively.
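
    The step in which points with a high ligandability score are clustered into binding sites can be sketched as below. This is a minimal illustration, not P2Rank's implementation: the abstract does not state the score threshold, the clustering rule or how sites are ranked, so the single-linkage rule, the `score_threshold` and `link_distance` defaults, and ranking by summed score are all assumptions made for the example.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def cluster_ligandable_points(points: Sequence[Point], scores: Sequence[float],
                              score_threshold: float = 0.5,   # assumed cut-off
                              link_distance: float = 3.0      # assumed linkage distance
                              ) -> List[List[int]]:
    """Group high-scoring surface points into candidate sites by single linkage.

    Returns lists of point indices, ordered by decreasing summed score.
    """
    # Keep only points whose predicted ligandability passes the threshold.
    unvisited = {i for i, s in enumerate(scores) if s >= score_threshold}
    clusters: List[List[int]] = []
    while unvisited:
        seed = unvisited.pop()
        cluster = [seed]
        frontier = [seed]
        # Grow the cluster: any kept point within link_distance of a member joins it.
        while frontier:
            current = frontier.pop()
            neighbours = [j for j in unvisited
                          if dist(points[current], points[j]) <= link_distance]
            for j in neighbours:
                unvisited.remove(j)
            cluster.extend(neighbours)
            frontier.extend(neighbours)
        clusters.append(sorted(cluster))
    # Rank candidate sites by the sum of their point scores (an assumed ranking rule).
    clusters.sort(key=lambda c: sum(scores[i] for i in c), reverse=True)
    return clusters


# Toy example (hypothetical coordinates and scores).
toy_points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (5, 5, 5)]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.6, 0.1]
print(cluster_ligandable_points(toy_points, toy_scores))
```

    The neighbour search here is brute force; a spatial index would be the natural replacement for realistic numbers of surface points.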

    PDBe-KB: a community-driven resource for structural and functional annotations.

    The Protein Data Bank in Europe – Knowledge Base (PDBe-KB, https://pdbe-kb.org) is a community-driven, collaborative resource for literature-derived, manually curated and computationally predicted structural and functional annotations of macromolecular structure data, contained in the Protein Data Bank (PDB). The goal of PDBe-KB is two-fold: (i) to increase the visibility and reduce the fragmentation of annotations contributed by specialist data resources, and to make these data more findable, accessible, interoperable and reusable (FAIR) and (ii) to place macromolecular structure data in their biological context, thus facilitating their use by the broader scientific community in fundamental and applied research. Here, we describe the guidelines of this collaborative effort, the current status of contributed data, and the PDBe-KB infrastructure, which includes the data exchange format, the deposition system for added value annotations, the distributable database containing the assembled data, and programmatic access endpoints. We also describe a series of novel web pages, the PDBe-KB aggregated views of structure data, which combine information on macromolecular structures from many PDB entries. We have recently released the first set of pages in this series, which provide an overview of available structural and functional information for a protein of interest, referenced by a UniProtKB accession.

    One origin for metallo-β-lactamase activity, or two? An investigation assessing a diverse set of reconstructed ancestral sequences based on a sample of phylogenetic trees

    This work was supported by BBSRC (grant BB/F016778/1). Bacteria use metallo-β-lactamase enzymes to hydrolyse lactam rings found in many antibiotics, rendering them ineffective. Metallo-β-lactamase activity is thought to be polyphyletic, having arisen on more than one occasion within a single functionally diverse homologous superfamily. Since discovery of multiple origins of enzymatic activity conferring antibiotic resistance has broad implications for the continued clinical use of antibiotics, we test the hypothesis of polyphyly further; if lactamase function has arisen twice independently, the most recent common ancestor (MRCA) is not expected to possess lactam-hydrolysing activity. Two major problems present themselves. Firstly, even with a perfectly known phylogeny, ancestral sequence reconstruction is error prone. Secondly, the phylogeny is not known, and in fact reconstructing a single, unambiguous phylogeny for the superfamily has proven impossible. To obtain a more statistical view of the strength of evidence for or against MRCA lactamase function, we reconstructed a sample of 98 MRCAs of the metallo-β-lactamases, each based on a different tree in a bootstrap sample of reconstructed phylogenies. InterPro sequence signatures and homology modelling were then used to assess our sample of MRCAs for lactamase functionality. Only 5% of these models conform to our criteria for metallo-β-lactamase functionality, suggesting that the ancestor was unlikely to have been a metallo-β-lactamase. On the other hand, given that ancestral proteins may have had metallo-β-lactamase functionality with variation in sequence and structural properties compared with extant enzymes, our criteria are conservative, estimating a lower bound of evidence for metallo-β-lactamase functionality but not an upper bound.

    Publisher PDF. Peer reviewed.

    Tracking the evolution of function in diverse enzyme superfamilies

    Tracking the evolution of function in enzyme superfamilies is key to understanding how important biological functions and mechanisms have evolved. New genes are being sequenced at a rate that far surpasses the capacity of wet-lab techniques to characterise them. Moreover, bioinformatics allows for the use of methods not amenable to wet-lab experimentation. We now face a situation in which we are aware of the existence of many gene families but are ignorant of what they do and how they function. Even for families with many structurally and functionally characterised members, the prediction of function of ancestral sequences can be used to elucidate past patterns of evolution and highlight likely future trajectories. In this thesis, we apply in silico structure and function methods to predict the functions of protein sequences from two diverse superfamily case studies. In the first, the metallo-β-lactamase superfamily, many members have been structurally and functionally characterised. In this work, we asked how many times the same function has independently evolved in the same superfamily using ancestral sequence reconstruction, homology modelling and alignment to catalytic templates. We found evidence of a lactam-hydrolysing ancestor in only 5% of the evolutionary scenarios assessed. This could be taken as strong evidence that metallo-β-lactamase function has evolved independently on multiple occasions. This finding has important implications for predicting the evolution of antibiotic resistance in this protein fold. However, as discussed, the interpretation of this statistic is not clear-cut. In the second case study, we analysed protein sequences of the DUF-62 superfamily. In contrast to the metallo-β-lactamase superfamily, very few members of this superfamily have been structurally and functionally characterised. We used the analysis of alignment, gene context, species tree reconciliation and comparison of the rates of evolution to ask whether functions or cellular roles other than those already established might exist in this family. We find that multiple lines of evidence present a compelling case for the evolution of different functions within the Archaea, and propose possible cellular interactions and roles for members of this enzyme family.