13 research outputs found

Recommended from our members

Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation.

Author: Andreeva Antonina
Blundell Tom L
Buchan Daniel WA
Finn Robert D
Gough Julian
Jones David
Kelley Lawrence A
Lam Su Datt
Murzin Alexey G
Orengo Christine
Pandurangan Arun Prasad
Paysan-Lafosse Typhaine
Salazar Gustavo A
Sillitoe Ian
Skwark Marcin J
Sternberg Michael JE
Velankar Sameer
Publication venue: Nucleic Acids Res
Publication date: 08/01/2020

Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including providing instant validation of data and avoiding the requirement to synchronise releases between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements facilitate Genome3D being opened up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.

Apollo (Cambridge)

PDBe: towards reusable data delivery infrastructure at protein data bank in Europe

Author: Alhroub Younes
Anyango Stephen
Armstrong David R
Berrisford John M
Clark Alice R
Conroy Matthew J
Dana Jose M
Deshpande Mandar
Gupta Deepti
Gutmanas Aleksandras
Haslam Pauline
Kleywegt Gerard J
Mak Lora
Mir Saqib
Mukhopadhyay Abhik
Nadzirin Nurul
Paysan-Lafosse Typhaine
Sehnal David
Sen Sanchayita
Smart Oliver S
Varadi Mihaly
Velankar Sameer
Publication venue: Oxford University Press (OUP)
Publication date: 26/10/2017

© 2017 The Authors. Published by OUP. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1093/nar/gkx1070

The Protein Data Bank in Europe (PDBe, pdbe.org) is actively engaged in the deposition, annotation, remediation, enrichment and dissemination of macromolecular structure data. This paper describes new developments and improvements at PDBe addressing three challenging areas: data enrichment, data dissemination and functional reusability. New features of the PDBe Web site are discussed, including a context-dependent menu providing links to raw experimental data and improved presentation of structures solved by hybrid methods. The paper also summarizes the features of the LiteMol suite, which is a set of services enabling fast and interactive 3D visualization of structures, with associated experimental maps, annotations and quality assessment information. We introduce a library of Web components which can be easily reused to port data and functionality available at PDBe to other services. We also introduce updates to the SIFTS resource, which maps PDB data to other bioinformatics resources, and the PDBe REST API.

Funding: Wellcome Trust [104948]; UK Biotechnology and Biological Sciences Research Council [BB/M011674/1, BB/N019172/1, BB/M020347/1]; European Union [284209]; European Molecular Biology Laboratory (EMBL). Funding for open access charge: EMBL.

Crossref

Wolverhampton Intellectual Repository and E-theses

PDBe-KB: a community-driven resource for structural and functional annotations.

Author: Al-Lazikani Bissan
Anyango Stephen
Armstrong David
Barton Geoffrey J
Berka Karel
Berrisford John
Blundell Tom
Borkakoti Neera
Dana Jose
Das Sayoni
Deshpande Mandar
Dey Sucharita
Fernandez Eloy Villasclaras
Fraternali Franca
Gibson Toby
Gutmanas Aleksandras
Helmer Citterich Manuela
Hoksza David
Huang Liang-Chin
Jain Rishabh
Jubb Harry
Kannan Natarajan
Kannas Christos
Koca Jaroslav
Krivak Radoslav
Kumar Manjeet
Levy Emmanuel D
MacGowan Stuart
Madeira F
Madhusudhan M S
Martell Henry J
McGreig Jake E
Micco Patrizio Di
Mir Saqib
Mukhopadhyay Abhik
Nair Sreenath S
Orengo Christine
Parca Luca
Paysan-Lafosse Typhaine
Pravda Lukas
Radusky Leandro
Ribeiro Antonio
Serrano Luis
Sillitoe Ian
Singh Gulzar
Skoda Petr
Sternberg Michael
Svobodova Radka
Thornton Janet
Tyzack Jonathan
Valencia Alfonso
Varadi Mihaly
Velankar Sameer
Vranken Wim
Wass Mark
Publication venue: Nucleic Acids Res
Publication date: 01/10/2019

The Protein Data Bank in Europe-Knowledge Base (PDBe-KB, https://pdbe-kb.org) is a community-driven, collaborative resource for literature-derived, manually curated and computationally predicted structural and functional annotations of macromolecular structure data contained in the Protein Data Bank (PDB). The goal of PDBe-KB is two-fold: (i) to increase the visibility and reduce the fragmentation of annotations contributed by specialist data resources, and to make these data more findable, accessible, interoperable and reusable (FAIR), and (ii) to place macromolecular structure data in their biological context, thus facilitating their use by the broader scientific community in fundamental and applied research. Here, we describe the guidelines of this collaborative effort, the current status of contributed data, and the PDBe-KB infrastructure, which includes the data exchange format, the deposition system for added value annotations, the distributable database containing the assembled data, and programmatic access endpoints. We also describe a series of novel web pages, the PDBe-KB aggregated views of structure data, which combine information on macromolecular structures from many PDB entries. We have recently released the first set of pages in this series, which provide an overview of available structural and functional information for a protein of interest, referenced by a UniProtKB accession.

UCL Discovery

Spiral - Imperial College Digital Repository

Kent Academic Repository

UPF Digital Repository

Apollo (Cambridge)

King's Research Portal

ART

University of Dundee Online Publications

Institute of Cancer Research Repository

PDBe: improved findability of macromolecular structure data in the PDB

Author: Abbott
Abhik Mukhopadhyay
Agarwala
Aleksandras Gutmanas
Alice R Clark
Altschul
Bateman
Berman
Burley
Callaway
Chambers
Cook
Dana
David R Armstrong
David Sehnal
Dawson
Deepti Gupta
El-Gebali
Fabregat
Favuzza
Finn
Gaulton
Gerard J Kleywegt
Grabowski
Groom
Hastings
Hossam Zaki
Hunt
Iudin
James Tolchard
Jaroslav Koča
John M Berrisford
Jose M Dana
Kalvari
Kinjo
Krissinel
Lipman
Lora Mak
Lo Conte
Lukas Pravda
Mandar Deshpande
Matthew J Conroy
McCoy
Meldal
Mihaly Varadi
Mir
Mirdita
Mitchell
Morin
Mukhopadhyay
Niggli
Nurul Nadzirin
Oliver Smart
Osman Salih
Paul Gane
Pauline Haslam
PDBe-KB consortium
Preeti Choudhary
Radka Svobodova-Vařeková
Roisin Dunlop
Romana Gáborová
Sameer Velankar
Saqib Mir
Sehnal
Sreenath Nair
Stephen Anyango
Sterling
Thieker
Typhaine Paysan-Lafosse
Ulrich
Velankar
Watkins
Westbrook
Wilkinson
Wishart
wwPDB consortium
Yamada
Young
Publication venue: Oxford University Press (OUP)
Publication date: 25/10/2019

© 2019 The Authors. Published by OUP. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1093/nar/gkz990

The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has developed an advanced search facility with ∼100 data categories and sequence searches. New features have been included in the LiteMol viewer at PDBe, with updated visualization of carbohydrates and nucleic acids. Small molecules are now mapped more extensively to external databases and their visual representation has been enhanced. These advances help users to more easily find and interpret macromolecular structure data in order to solve scientific problems.

The Protein Data Bank in Europe is supported by European Molecular Biology Laboratory-European Bioinformatics Institute; Wellcome Trust [104948]; Biotechnology and Biological Sciences Research Council [BB/N019172/1, BB/G022577/1, BB/J007471/1, BB/K016970/1, BB/K020013/1, BB/M013146/1, BB/M011674/1, BB/M020347/1, BB/M020428/1, BB/P024351/1]; European Union [284209]; ELIXIR and Open Targets. Funding for open access charge: EMB

Crossref

Wolverhampton Intellectual Repository and E-theses

Reciprocal Best Structure Hits

Author: Bateman Alex
Monzon Vivian
Paysan-Lafosse Typhaine
Wood Valerie
Publication venue: European Bioinformatics Institute
Publication date: 14/06/2022

In this work, we use AlphaFold structure models to find the closest homologous proteins between Homo sapiens and D. melanogaster, C. elegans, S. cerevisiae and S. pombe, as well as between S. cerevisiae and S. pombe. We use the structure aligner Foldseek to run all-against-all searches and look for the best-scoring hit in both directions to detect the Reciprocal Best Structure Hits (RBSH). We compare the results to protein pairs detected by their sequence similarity as Reciprocal Best Hits (RBH) and verify the results using the PANTHER family classification files.
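
The "best-scoring hit in both directions" idea can be illustrated with a minimal sketch. This is not the authors' implementation and it does not parse Foldseek output: the hit tuples, the identifiers and the assumption that a higher score means a better hit are all hypothetical, chosen only to show the reciprocal-best-hit logic.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit record: (query_id, target_id, score).
# Assumption: a higher score means a better match.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """Keep the single highest-scoring target for each query.

    On ties, the first hit seen is kept.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, target, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return best


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = [
        (a, b)
        for a, (b, _score) in best_ab.items()
        if b in best_ba and best_ba[b][0] == a
    ]
    return sorted(pairs)


# Made-up example data: proteome A searched against B, and B against A.
example_a_vs_b = [
    ("A1", "B1", 900.0),
    ("A1", "B2", 300.0),
    ("A2", "B1", 500.0),
    ("A3", "B3", 100.0),
]
example_b_vs_a = [
    ("B1", "A1", 880.0),
    ("B2", "A1", 310.0),
    ("B3", "A2", 50.0),
]

example_pairs = reciprocal_best_hits(example_a_vs_b, example_b_vs_a)
```

In this toy data only A1 and B1 pick each other as best hit; A2 and A3 each have a best hit that prefers a different partner, so they are not reported. A real analysis would also need to decide how to handle ties and which alignment score to rank by.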

Note: This dataset is an earlier version of a more up-to-date dataset at https://doi.org/10.17863/CAM.8787

Apollo (Cambridge)

Reciprocal Best Structure Hits (RBSH)

Author: Bateman Alex
Monzon Vivian
Paysan-Lafosse Typhaine
Wood Valerie
Publication venue: European Bioinformatics Institute
Publication date: 25/08/2022
Note: This dataset is an updated version of the dataset at https://doi.org/10.17863/CAM.85487

Apollo (Cambridge)

Reciprocal best structure hits: using AlphaFold models to discover distant homologues.

Author: Bateman Alex
Monzon Vivian
Paysan-Lafosse Typhaine
Wood Valerie
Publication venue: Bioinform Adv
Publication date: 28/09/2022

MOTIVATION: The conventional methods to detect homologous protein pairs use the comparison of protein sequences. But the sequences of two homologous proteins may diverge significantly and consequently may be undetectable by standard approaches. The release of the AlphaFold 2.0 software enables the prediction of highly accurate protein structures and opens many opportunities to advance our understanding of protein functions, including the detection of homologous protein structure pairs. RESULTS: In this proof-of-concept work, we search for the closest homologous protein pairs using the structure models of five model organisms from the AlphaFold database. We compare the results with homologous protein pairs detected by their sequence similarity and show that the structural matching approach finds a similar set of results. In addition, we detect potential novel homologs solely with the structural matching approach, which can help to understand the function of uncharacterized proteins and make previously overlooked connections between well-characterized proteins. We also observe limitations of our implementation of the structure-based approach, particularly when handling highly disordered proteins or short protein structures. Our work shows that high-accuracy protein structure models can be used to discover homologous protein pairs, and we expose areas for improvement of this structural matching approach. AVAILABILITY AND IMPLEMENTATION: Information on the discovered homologous protein pairs can be found at the following URL: https://doi.org/10.17863/CAM.87873. The code can be accessed here: https://github.com/VivianMonzon/Reciprocal_Best_Structure_Hits. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.

PubMed Central

Apollo (Cambridge)

InterPro in 2022

Author: Bateman Alex
Bileschi Maxwell L
Blum Matthias
Bork Peer
Bridge Alan
Chuguransky Sara
Colwell Lucy
Gough Julian
Grego Tiago
Haft Daniel H
Letunić Ivica
Marchler-Bauer Aron
Mi Huaiyu
Natale Darren A
Orengo Christine A
Pandurangan Arun P
Paysan-Lafosse Typhaine
Pinto Beatriz Lázaro
Rivoire Catherine
Salazar Gustavo A
Sigrist Christian J A
Sillitoe Ian
Thanki Narmada
Thomas Paul D
Tosatto Silvio C E
Wu Cathy H
Publication venue: OXFORD UNIV PRESS
Publication date: 01/01/2023

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.

Archivio istituzionale della ricerca - Università di Padova

The InterPro protein families and domains database: 20 years on

Author: Bateman Alex
Blum Matthias
Bork Peer
Bridge Alan
Chang Hsin-Yu
Chuguransky Sara
Finn Robert D
Gough Julian
Grego Tiago
Haft Daniel H
Kandasaamy Swaathi
Letunic Ivica
Marchler-Bauer Aron
Mi Huaiyu
Mitchell Alex
Natale Darren A
Necci Marco
Nuka Gift
Orengo Christine A
Pandurangan Arun P
Paysan-Lafosse Typhaine
Qureshi Matloob
Raj Shriya
Richardson Lorna
Rivoire Catherine
Salazar Gustavo A
Sigrist Christian J A
Sillitoe Ian
Thanki Narmada
Thomas Paul D
Tosatto Silvio C E
Williams Lowri
Wu Cathy H
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2021

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.

Apollo (Cambridge)

Archivio istituzionale della ricerca - Università di Padova