Evidence attribution in the UniProt Knowledgebase

Author: Michele Magrane
UniProt Consortium
Publication date: 22/04/2009

UniProtKB provides the scientific community with a comprehensive collection of protein sequence records containing extensive curated information including functional and sequence annotation. This information is derived from a variety of sources such as scientific literature and sequence analysis programs as well as data imported from automatic annotation systems and external databases. To allow users to ascertain the origin of each data item in a UniProtKB record, an evidence attribution system is being introduced that links each piece of information to its original source. This system allows users to trace the origin of all information, to differentiate easily between experimental and computational data, and to assess data reliability. The current system and plans for its future development and enhancement will be presented.

Nature Precedings

UniProt Knowledgebase: a hub of integrated data

Author: Michele Magrane
UniProt Consortium
Publication date: 22/10/2010

Data integration plays an increasingly important role in bringing together the large amounts of diverse information spread across disparate resources and presenting a comprehensive overview of these data to the scientific community. The UniProt Knowledgebase (UniProtKB) acts as a central hub of protein knowledge by providing a unified view of protein sequence and functional information. Manual and automatic annotation procedures are used to add data directly to the database while extensive cross-referencing to more than 120 external databases provides access to additional relevant information in more specialised data collections. UniProtKB also integrates data such as protein sequences, protein-protein interactions, Gene Ontology terms and official gene nomenclature from a range of resources. All information in UniProtKB is attributed to its original source, allowing users to trace the provenance of all data. In addition, UniProtKB data is made freely available in a range of formats to facilitate integration with other databases and the UniProt Consortium is committed to using and promoting common data exchange formats and technologies. This approach ensures that information is captured in the most appropriate resource for subsequent integration with other databases and also ensures maximum curation efficiency by preventing duplication of efforts across multiple resources. How UniProt achieves this data capture and integration will be presented. The UniProt resource is available at http://www.uniprot.org

Crossref

Nature Precedings

The DNA polymerases of Drosophila melanogaster.

Author: Attrill Helen
Berloco Maria
Cotterill Sue
Magrane Michele
Marygold Steven J
McVey Mitch
Rong Yikang
Speretta Elena
Warner Kate
Yamaguchi Masamitsu
Publication venue: Fly (Austin)
Publication date: 01/01/2020

DNA synthesis during replication or repair is a fundamental cellular process that is catalyzed by a set of evolutionarily conserved polymerases. Despite a large body of research, the DNA polymerases of Drosophila melanogaster have not yet been systematically reviewed, leading to inconsistencies in their nomenclature, shortcomings in their functional (Gene Ontology, GO) annotations and an under-appreciation of the extent of their characterization. Here, we describe the complete set of DNA polymerases in D. melanogaster, applying nomenclature already in widespread use in other species, and improving their functional annotation. A total of 19 genes encode the proteins comprising three replicative polymerases (alpha-primase, delta, epsilon), five translesion/repair polymerases (zeta, eta, iota, Rev1, theta) and the mitochondrial polymerase (gamma). We also provide an overview of the biochemical and genetic characterization of these factors in D. melanogaster. This work, together with the incorporation of the improved nomenclature and GO annotation into key biological databases, including FlyBase and UniProtKB, will greatly facilitate access to information about these important proteins.

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Bari

Apollo (Cambridge)

St George's Online Research Archive

An evaluation of GO annotation retrieval for BioCreAtIvE and GOA

Author: Daniel G Barrell
David Binns
Emily C Dimmer
Evelyn B Camon
John Maslen
Michele Magrane
Rolf Apweiler
Vivian Lee
Publication venue: BioMed Central
Publication date: 01/01/2005

From: A critical assessment of text mining methods in molecular biology

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Universal Protein Resource (UniProt)

Author: Apweiler Rolf
Bairoch Amos
Barker Winona C.
Boeckmann Brigitte
Ferro Serenella
Gasteiger Elisabeth
Huang Hongzhan
Lopez Rodrigo
Magrane Michele
Martin Maria J.
Natale Darren A.
O'Donovan Claire
Redaschi Nicole
Wu Cathy H.
Yeh Lai-Su L.
Publication date: 02/08/2017

The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references. This centrepiece consists of two sections: UniProt/Swiss-Prot, with fully, manually curated entries; and UniProt/TrEMBL, enriched with automated classification and annotation. During 2004, tens of thousands of Knowledgebase records were manually annotated or updated; we introduced a new comment line topic, TOXIC DOSE, to store information on the acute toxicity of a toxin; the UniProt keyword list was augmented with additional keywords; we improved the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications. Furthermore, we introduced a new documentation file of the strains and their synonyms. Many new database cross-references were introduced and we started to make use of Digital Object Identifiers. In collaboration with the Macromolecular Structure Database group at EBI, we also achieved an improved integration with structural databases by residue level mapping of sequences from the Protein Data Bank entries onto corresponding UniProt entries. For convenient sequence searches we provide the UniRef non-redundant sequence databases. The comprehensive UniParc database stores the complete body of publicly available protein sequence data. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every two weeks.

RERO DOC Digital Library

The Universal Protein Resource (UniProt): an expanding universe of protein information

Author: Apweiler Rolf
Bairoch Amos
Barker Winona C.
Boeckmann Brigitte
Ferro Serenella
Gasteiger Elisabeth
Huang Hongzhan
Lopez Rodrigo
Magrane Michele
Martin Maria J.
Mazumder Raja
Natale Darren A.
O'Donovan Claire
Redaschi Nicole
Suzek Baris
Wu Cathy H.
Publication date: 02/08/2017

The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/database
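
The idea of compressing sequence space by merging sequences at fixed identity thresholds can be illustrated with a small greedy clustering sketch. This is only a toy under stated assumptions: the `identity` function is a crude position-by-position comparison rather than an alignment-based measure, the function names and sequences are made up for illustration, and nothing here is claimed to be the procedure UniRef actually uses.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions: a crude stand-in for alignment-based identity.

    Assumption for this sketch: sequences of different length never match.
    """
    if len(a) != len(b):
        return 0.0
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: dict[str, str], threshold: float) -> dict[str, list[str]]:
    """Assign each sequence to the first representative it matches at >= threshold."""
    clusters: dict[str, list[str]] = {}
    # Visit longest sequences first so representatives are the longest members.
    for sid in sorted(seqs, key=lambda s: (-len(seqs[s]), s)):
        for rep in clusters:
            if identity(seqs[rep], seqs[sid]) >= threshold:
                clusters[rep].append(sid)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[sid] = [sid]
    return clusters


# Made-up toy sequences, not real UniProt entries.
toy = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYIAKQR",  # identical to seqA
    "seqC": "MKTAYIAKQL",  # 9 of 10 positions match seqA
    "seqD": "MKTAYGGGGG",  # 5 of 10 positions match seqA
    "seqE": "WWWWWWWWWW",  # unrelated
}

for t in (1.0, 0.9, 0.5):
    print(t, greedy_cluster(toy, t))
```

Lowering the threshold merges more sequences under one representative, so a search against the representatives touches fewer entries at the cost of resolution.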

RERO DOC Digital Library

UniProt: the Universal Protein knowledgebase

Author: Apweiler Rolf
Bairoch Amos
Barker Winona C.
Boeckmann Brigitte
Ferro Serenella
Gasteiger Elisabeth
Huang Hongzhan
Lopez Rodrigo
Magrane Michele
Martin Maria J.
Natale Darren A.
O'Donovan Claire
Redaschi Nicole
Wu Cathy H.
Yeh Lai‐Su L.
Publication date: 02/08/2017

To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss‐Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross‐references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss‐Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross‐references). For convenient sequence searches, UniProt also provides several non‐redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). The scientific community is encouraged to submit data for inclusion in UniProt.

RERO DOC Digital Library

The Universal Protein Resource (UniProt)

Author: Apweiler Rolf
Bairoch Amos
Barker Winona C.
Boeckmann Brigitte
Ferro Serenella
Gasteiger Elisabeth
Huang Hongzhan
Lopez Rodrigo
Magrane Michele
Martin Maria J.
Natale Darren A.
O'Donovan Claire
Redaschi Nicole
Wu Cathy H.
Yeh Lai-Su L.
Publication venue: Oxford University Press
Publication date: 17/12/2004

Crossref

PubMed Central

Archive ouverte UNIGE

The Gene Ontology Annotation (GOA) Project—Application of GO in SWISS-PROT, TrEMBL and InterPro

Author: Catherine Brooksbank
Daniel Barrell
Evelyn Camon
Michele Magrane
Rolf Apweiler
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2003

Crossref

Directory of Open Access Journals

PubMed Central

The UniProt-GO Annotation database in 2011

Author: Alam-Faruque Yasmin
Apweiler Rolf
Argoud-Puy Ghislaine
Auchincloss Andrea
Axelsen Kristian
Bely Benoit
Blatter Marie-Claude
Bougueleret Lydie
Boutet Emmanuel
Braconi-Quintaje Silvia
Breuza Lionel
Bridge Alan
Browne Paul
Coudert Elizabeth
Cusin Isabelle
Dimmer Emily C.
Duek-Roggli Paula
Eberhardt Ruth
Estreicher Anne
Famiglietti Livia
Ferro-Rojas Serenella
Feuermann Marc
Gardner Michael
Gos Arnaud
Gruaz-Gumowski Nadine
Hinz Ursula
Hulo Chantal
Huntley Rachael P.
James Janet
Jimenez Silvia
Jungo Florence
Keller Guillaume
Laiho Kati
Legge Duncan
Lemercier Phillippe
Lieberherr Damien
Magrane Michele
Martin Maria J.
Masson Patrick
Moinat Madelaine
Mun Chan Wei
O'Donovan Claire
Pedruzzi Ivo
Pichler Klemens
Poggioli Diego
Poux Sylvain
Rivoire Catherine
Roechert Bernd
Sawford Tony
Schneider Michael
Sehra Harminder
Stutz Andre
Sundaram Shyamala
Tognolli Michael
Xenarios Ioannis
Publication date: 02/08/2017

The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360 000 taxa, this resource has increased 2-fold over the last 2 years and has benefited from a wealth of checks to improve annotation correctness and consistency as well as now supplying a greater information content enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.
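
As a rough illustration of how such annotation files might be handled, the sketch below reads tab-separated lines laid out like the GO Annotation File (GAF) 2.x format and tallies manual versus electronic annotations. The column positions are an assumption taken from the GAF layout, not from the abstract, and may not match the normalized format it mentions. The function names are hypothetical, and the sample records (accessions, gene symbols, PMIDs) are placeholders, not real annotations.

```python
from collections import Counter

# 0-based column positions, assumed from the GAF 2.x layout.
COL_OBJECT_ID = 1  # e.g. a UniProtKB accession
COL_GO_ID = 4      # GO term identifier
COL_EVIDENCE = 6   # evidence code such as IDA, IMP or IEA


def parse_gaf(lines):
    """Yield (object_id, go_id, evidence_code) from GAF-style lines."""
    for line in lines:
        # Skip blank lines and "!" header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= COL_EVIDENCE:
            continue  # too few columns to hold an evidence code
        yield cols[COL_OBJECT_ID], cols[COL_GO_ID], cols[COL_EVIDENCE]


def count_by_origin(lines):
    """Count annotations as 'electronic' (IEA) or 'manual' (any other code)."""
    counts = Counter()
    for _object_id, _go_id, evidence in parse_gaf(lines):
        counts["electronic" if evidence == "IEA" else "manual"] += 1
    return counts


def make_line(accession, symbol, go_id, reference, evidence):
    """Build a placeholder GAF-style line; trailing columns are left empty."""
    return "\t".join(
        ["UniProtKB", accession, symbol, "", go_id, reference, evidence, "", "F"]
    )


# Placeholder records for illustration only.
sample = [
    "!gaf-version: 2.0",
    make_line("X00001", "GENE1", "GO:0003677", "PMID:0000000", "IDA"),
    make_line("X00002", "GENE2", "GO:0005515", "PMID:0000000", "IMP"),
    make_line("X00003", "GENE3", "GO:0003677", "GO_REF:0000000", "IEA"),
]

print(count_by_origin(sample))
```

Treating every non-IEA code as manual is a simplification chosen for this sketch; a real pipeline would likely distinguish experimental codes from curator-reviewed computational ones.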

RERO DOC Digital Library