1,143 research outputs found
UniProtKB amid the turmoil of plant proteomics research
The UniProt KnowledgeBase (UniProtKB) provides a single, centralized, authoritative resource for protein sequences and functional information. The majority of its records are based on automatic translation of coding sequences (CDS) provided by submitters at the time of initial deposition to the nucleotide sequence databases (INSDC). This article will give a general overview of the current situation, with some specific illustrations extracted from our annotation of Arabidopsis and rice proteomes. More and more frequently, only the raw sequence of a complete genome is deposited to the nucleotide sequence databases and the gene model predictions and annotations are kept in separate, specialized model organism databases (MODs). In order to be able to provide the complete proteome of model organisms, UniProtKB had to implement pipelines for import of protein sequences from Ensembl and EnsemblGenomes. A single genome can be the target of several unrelated sequencing projects and the final assembly and gene model predictions may diverge quite significantly. In addition, several cultivars of the same species are often sequenced (the sequencing of 1001 Arabidopsis cultivars is currently under way) and the resulting proteomes are far from being identical. Therefore, one challenge for UniProtKB is to store and organize these data in a convenient way and to clearly define reference proteomes that should be made available to users. Manual annotation is one of the hallmarks of the Swiss-Prot section of UniProtKB. Besides adding functional annotation, curators check, and often correct, gene model predictions. For plants, this task is limited to Arabidopsis thaliana and Oryza sativa subsp. japonica. Proteomics data providing experimental evidence confirming the existence of proteins or identifying sequence features such as post-translational modifications are also imported into UniProtKB records and the knowledgebase is cross-referenced to numerous proteomics resources
Manual Curation of Vertebrate Proteins in the UniProt Knowledgebase.
The UniProt Knowledgebase (UniProtKB) aims to provide the scientific community with a comprehensive, consistent and authoritative resource for protein sequence and functional information. Given the importance of human and vertebrate model data in biomedical research, a major focus is the high-quality manual curation of human proteins and their vertebrate orthologues. Manual curation involves (1) the extraction of experimental results from scientific literature to enrich protein records with a wide range of information including function, structure, interactions and subcellular location, (2) the manual verification of each sequence and clarification of discrepancies between sequence reports, and (3) the assessment of the output of a range of analysis programmes to ensure that sequence features are correctly reported. Manual curation also facilitates the standardization of experimental data – a step necessary for development of methods that enable the semi-automated transfer of manual annotation to uncharacterised or related proteins. Consequently, manual curation of vertebrate proteins plays a vital role in providing users with a complete overview of available data while ensuring its accuracy, reliability and accessibility. UniProtKB/Swiss-Prot currently contains the complete manually reviewed human proteome, comprising approximately 20’300 proteins, and an additional 61’000 reviewed entries from model vertebrates such as mouse, rat, apes, cow, chicken, zebrafish and Xenopus. Ongoing efforts continue to improve the quality of vertebrate sequences in collaboration with HAVANA, Ensembl, HGNC and RefSeq, to include new functional information as it becomes available, and to extend the coverage of curated proteins in vertebrate species. All data are freely available from http://www.uniprot.org
The Encyclopedia of Proteome Dynamics – A big data ecosystem for (prote)omics
Driven by improvements in speed and resolution of mass spectrometers (MS), the field of proteomics, which involves the large-scale detection and analysis of proteins in cells, tissues and organisms, continues to expand in scale and complexity. There is a resulting growth in datasets of both raw MS files and processed peptide and protein identifications. MS-based proteomics technology is also used increasingly to measure additional protein properties affecting cellular function and disease mechanisms, including post-translational modifications, protein-protein interactions, and subcellular and tissue distributions. Consequently, biologists and clinicians need innovative tools to conveniently analyse, visualise and explore such large, complex proteomics data and to integrate them with genomics and other related large-scale datasets. We have created the Encyclopedia of Proteome Dynamics (EPD) to meet this need (https://peptracker.com/epd/). The EPD combines a polyglot persistent database and web application that provides open access to integrated proteomics data for >30,000 proteins from published studies on human cells and model organisms. It is designed to provide a user-friendly interface, featuring graphical navigation with interactive visualisations that facilitate powerful data exploration in an intuitive manner. The EPD offers a flexible and scalable ecosystem to integrate proteomics data with genomics information, RNA expression and other related, large-scale datasets
Experimental data from flesh quality assessment and shelf life monitoring of high pressure processed European sea bass (Dicentrarchus labrax) fillets
Fresh fish are highly perishable food products and their short shelf-life limits their commercial exploitation and leads to waste, which has a negative impact on aquaculture sustainability. New non-thermal food processing methods, such as high pressure (HP) processing, prolong shelf-life while assuring high food quality. The effect of HP processing (600 MPa, 25 °C, 5 min) on European sea bass (Dicentrarchus labrax) fillet quality and shelf life was investigated. The data presented comprise microbiome and proteome profiles of control and HP-processed sea bass fillets from 1 to 67 days of isothermal storage at 2 °C. Bacterial diversity was analysed by Illumina high-throughput sequencing of the 16S rRNA gene in pooled DNAs from control or HP-processed fillets after 1, 11 or 67 days and the raw reads were deposited in the NCBI-SRA database with accession number PRJNA517618. Yeast and fungal diversity was analysed by high-throughput sequencing of the internal transcribed spacer (ITS) region for control and HP-processed fillets at the end of storage (11 or 67 days, respectively) and the data have the SRA accession number PRJNA517779. Quantitative label-free proteomics profiles were analysed by SWATH-MS (Sequential Windowed data independent Acquisition of the Total High-resolution-Mass Spectra) in myofibrillar or sarcoplasmic enriched protein extracts pooled for control or HP-processed fillets after 1, 11 and 67 days of storage. Proteome data were deposited in the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD012737. These data support the findings reported in the associated manuscript "High pressure processing of European sea bass (Dicentrarchus labrax) fillets and tools for flesh quality and shelf life monitoring", Tsironi et al., 2019, JFE 262:83-91, doi.org/10.1016/j.jfoodeng.2019.05.010.
FCT (Foundation of Science and Technology)
COFASP/0002/2015;
Portuguese Foundation for Science and Technology
UID/Multi/04326/2019
POCI-01-0145-FEDER007440
UID/NEU/04539/2019
Mechanism and Catalytic Site Atlas (M-CSA): a database of enzyme reaction mechanisms and active sites.
M-CSA (Mechanism and Catalytic Site Atlas) is a database of enzyme active sites and reaction mechanisms that can be accessed at www.ebi.ac.uk/thornton-srv/m-csa. Our objectives with M-CSA are to provide an open data resource for the community to browse known enzyme reaction mechanisms and catalytic sites, and to use the dataset to understand enzyme function and evolution. M-CSA results from the merging of two existing databases, MACiE (Mechanism, Annotation and Classification in Enzymes), a database of enzyme mechanisms, and CSA (Catalytic Site Atlas), a database of catalytic sites of enzymes. We are releasing M-CSA as a new website and underlying database architecture. At the moment, M-CSA contains 961 entries, 423 of these with detailed mechanism information, and 538 with information on the catalytic site residues only. In total, these cover 81% (195/241) of third level EC numbers with a PDB structure, and 30% (840/2793) of fourth level EC numbers with a PDB structure, out of 6028 in total. By searching for close homologues, we are able to extend M-CSA coverage of PDB and UniProtKB to 51 993 structures and to over five million sequences, respectively, of which about 40% and 30% have a conserved active site
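The coverage figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below only recomputes the two ratios given in the text; the helper name `coverage_percent` is hypothetical and is not part of M-CSA.

```python
def coverage_percent(covered: int, total: int) -> float:
    """Return the share of `total` that is `covered`, as a percentage."""
    return 100.0 * covered / total


# Counts taken from the abstract: EC numbers with a PDB structure
# that are covered by M-CSA entries.
third_level = coverage_percent(195, 241)    # third-level EC numbers
fourth_level = coverage_percent(840, 2793)  # fourth-level EC numbers

print(f"third level:  {third_level:.1f}%")
print(f"fourth level: {fourth_level:.1f}%")
```

Rounded to whole percentages, both values agree with the 81% and 30% reported in the abstract.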
ProteomeScout: A repository and analysis resource for post-translational modifications and proteins
ProteomeScout (https://proteomescout.wustl.edu) is a resource for the study of proteins and their post-translational modifications (PTMs) consisting of a database of PTMs, a repository for experimental data, an analysis suite for PTM experiments, and a tool for visualizing the relationships between complex protein annotations. The PTM database is a compendium of public PTM data, coupled with user-uploaded experimental data. ProteomeScout provides analysis tools for experimental datasets, including summary views and subset selection, which can identify relationships within subsets of data by testing for statistically significant enrichment of protein annotations. Protein annotations are incorporated in the ProteomeScout database from external resources and include terms such as Gene Ontology annotations, domains, secondary structure and non-synonymous polymorphisms. These annotations are available in the database download, in the analysis tools and in the protein viewer. The protein viewer allows for the simultaneous visualization of annotations in an interactive web graphic, which can be exported in Scalable Vector Graphics (SVG) format. Finally, quantitative data measurements associated with public experiments are also easily viewable within protein records, allowing researchers to see how PTMs change across different contexts. ProteomeScout should prove useful for protein researchers and should benefit the proteomics community by providing a stable repository for PTM experiments
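The abstract above does not say which statistical test ProteomeScout uses for annotation enrichment. As an illustration only, a one-sided hypergeometric test is a common choice for this kind of question; the function name, parameters and example counts below are all assumptions made for the sketch, not ProteomeScout's implementation.

```python
from math import comb


def hypergeom_enrichment_pvalue(population: int, annotated: int,
                                subset: int, subset_annotated: int) -> float:
    """One-sided hypergeometric p-value.

    Probability of seeing at least `subset_annotated` annotated proteins
    when drawing `subset` proteins without replacement from a population
    of `population` proteins, `annotated` of which carry the annotation.
    """
    upper = min(annotated, subset)
    total = comb(population, subset)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, subset - k)
        for k in range(subset_annotated, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 proteins in the dataset, 50 carry a given
# annotation; a selected subset of 40 proteins contains 8 of them.
p_value = hypergeom_enrichment_pvalue(1000, 50, 40, 8)
print(f"enrichment p-value: {p_value:.3g}")
```

With several annotation terms tested at once, a multiple-testing correction would normally be applied to the resulting p-values; that step is omitted here.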
The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces.
The Orthologous Matrix (OMA) is a leading resource to relate genes across many species from all of life. In this update paper, we review the recent algorithmic improvements in the OMA pipeline, describe increases in species coverage (particularly in plants and early-branching eukaryotes) and introduce several new features in the OMA web browser. Notable improvements include: (i) a scalable, interactive viewer for hierarchical orthologous groups; (ii) protein domain annotations and domain-based links between orthologous groups; (iii) functionality to retrieve phylogenetic marker genes for a subset of species of interest; (iv) a new synteny dot plot viewer; and (v) an overhaul of the programmatic access (REST API and semantic web), which will facilitate incorporation of OMA analyses in computational pipelines and integration with other bioinformatic resources. OMA can be freely accessed at https://omabrowser.org
Rock geochemistry induces stress and starvation responses in the bacterial proteome
Interactions between microorganisms and rocks play an important role in Earth system processes. However, little is known about the molecular capabilities microorganisms require to live in rocky environments. Using a quantitative label-free proteomics approach, we show that a model bacterium (Cupriavidus metallidurans CH34) can use volcanic rock to satisfy some elemental requirements, resulting in increased rates of cell division in both magnesium- and iron-limited media. However, the rocks also introduced multiple new stresses via chemical changes associated with pH, elemental leaching and surface adsorption of nutrients that were reflected in the proteome. For example, the loss of bioavailable phosphorus was observed and resulted in the upregulation of diverse phosphate limitation proteins, which facilitate increased phosphate uptake and scavenging within the cell. Our results revealed that despite the provision of essential elements, rock chemistry drives complex metabolic reorganization within rock-dwelling organisms, requiring tight regulation of cellular processes at the protein level. This study advances our ability to identify key microbial responses that enable life to persist in rock environments
GONUTS: the Gene Ontology Normal Usage Tracking System
The Gene Ontology Normal Usage Tracking System (GONUTS) is a community-based browser and usage guide for Gene Ontology (GO) terms and a community system for general GO annotation of proteins. GONUTS uses wiki technology to allow registered users to share and edit notes on the use of each term in GO, and to contribute annotations for specific genes of interest. By providing a site for generation of third-party documentation at the granularity of individual terms, GONUTS complements the official documentation of the Gene Ontology Consortium. To provide examples for community users, GONUTS displays the complete GO annotations from seven model organisms: Saccharomyces cerevisiae, Dictyostelium discoideum, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus and Arabidopsis thaliana. To support community annotation, GONUTS allows automated creation of gene pages for gene products in UniProt. GONUTS will improve the consistency of annotation efforts across genome projects, and should be useful in training new annotators and consumers in the production of GO annotations and the use of GO terms. GONUTS can be accessed at http://gowiki.tamu.edu. The source code for generating the content of GONUTS is available upon request
