17 research outputs found

The PRIDE database and related tools and resources in 2019: improving support for quantification data

Author: Audain E.
Bai J.
Bernal-Llinares M.
Brazma A.
Cox J.
Csordas A.
Eisenacher M.
Griss J.
Hewapathirana S.
Inuganti A.
Jarnuczak A.
Kundu D.
Mayer G.
Perez E.
Perez-Riverol Y.
Pfeuffer J.
Sachsenberg T.
Ternent T.
Tiwary S.
Uszkoreit J.
Vizcaino J.
Walzer M.
Yilmaz S.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2019

The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has definitely become the norm in the field. In parallel, data re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault tolerant storage backend, Application Programming Interface and web interface have been implemented, as a part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. At last, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.

MPG.PuRe

Expression Atlas: gene and protein expression across multiple studies and organisms

Author: Alfonso Munoz-Pomer Fuentes
Alvis Brazma
Andrew F. Jarnuczak
Anja Fullgrabe
Elisabet Barrera
Juan Antonio Vizcaino
Justin Preece
Laura Huerta
Maria Keays
Matthew Geniza
Melissa Burke
Nancy George
Nuno A. Fonseca
Oliver Stegle
Pankaj Jaiswal
Irene Papatheodorou
Robert Petryszak
Satu Koskinen
Suhaib Mohammed
Wojciech Bazant
Wolfgang Huber
Y. Amy Tang
Publication venue: 'Oxford University Press (OUP)'
Publication date: 28/10/2022

Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions.

UTUPub

Expression Atlas update: from tissues to single cells.

Author: Brazma Alvis
Fexova Silvie
Fonseca Nuno A
Fuentes Alfonso Muñoz-Pomer
Füllgrabe Anja
George Nancy
Green Matthew
Huang Ni
Huerta Laura
Iqbal Haider
Jarnuczak Andrew F
Jianu Monica
Jupp Simon
Manning Jonathan
Marioni John
Meyer Kerstin
Mohammed Suhaib
Moreno Pablo
Papatheodorou Irene
Petryszak Robert
Prada Medina Cesar Augusto
Talavera-López Carlos
Teichmann Sarah
Vizcaino Juan Antonio
Zhao Lingyun
Publication venue: Nucleic Acids Res
Publication date: 08/01/2020

Expression Atlas is EMBL-EBI's resource for gene and protein expression. It sources and compiles data on the abundance and localisation of RNA and proteins in various biological systems and contexts and provides open access to this data for the research community. With the increased availability of single cell RNA-Seq datasets in the public archives, we have now extended Expression Atlas with a new added-value service to display gene expression in single cells. Single Cell Expression Atlas was launched in 2018 and currently includes 123 single cell RNA-Seq studies from 12 species. The website can be searched by genes within or across species to reveal experiments, tissues and cell types where this gene is expressed or under which conditions it is a marker gene. Within each study, cells can be visualized using a pre-calculated t-SNE plot and can be coloured by different features or by cell clusters based on gene expression. Within each experiment, there are links to downloadable files, such as RNA quantification matrices, clustering results, reports on protocols and associated metadata, such as assigned cell types.

Apollo (Cambridge)

The ProteomeXchange consortium in 2020: enabling 'big data' approaches in proteomics

Author: Bandeira Nuno
Carver Jeremy J
Deutsch Eric W
García-Seisdedos David
Hermjakob Henning
Hewapathirana Suresh
Ishihama Yasushi
Jarnuczak Andrew F
Kawano Shin
Kundu Deepti J
MacCoss Michael J
MacLean Brendan
Okuda Shujiro
Perez-Riverol Yasset
Pullman Benjamin S
Sharma Vagisha
Sun Zhi
Vizcaíno Juan A
Watanabe Yu
Wertz Julie
Zhu Yunping
Publication venue: 'Oxford University Press (OUP)'
Publication date: 05/11/2019

The ProteomeXchange (PX) consortium of proteomics resources (http://www.proteomexchange.org) has standardized data submission and dissemination of mass spectrometry proteomics data worldwide since 2012. In this paper, we describe the main developments since the previous update manuscript was published in Nucleic Acids Research in 2017. Since then, in addition to the four PX existing members at the time (PRIDE, PeptideAtlas including the PASSEL resource, MassIVE and jPOST), two new resources have joined PX: iProX (China) and Panorama Public (USA). We first describe the updated submission guidelines, now expanded to include six members. Next, with current data submission statistics, we demonstrate that the proteomics field is now actively embracing public open data policies. At the end of June 2019, more than 14 100 datasets had been submitted to PX resources since 2012, and from those, more than 9 500 in just the last three years. In parallel, an unprecedented increase of data re-use activities in the field, including ‘big data’ approaches, is enabling novel research and new data resources. At last, we also outline some of our future plans for the coming years.

Kyoto University Research Information Repository

Using Deep Learning to Extrapolate Protein Expression Measurements

Author: Barzine MP
Brazma A
Celms E
Choudhary JS
Freivalds K
Ghavidel FZ
Jarnuczak AF
Jonassen I
Lace L
Opmanis M
Rituma D
Viksna J
Vizcaíno JA
Wright JC
Čerāns K
Publication venue: 'Wiley'
Publication date: 01/11/2020

Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, including human cell lines and human and mouse tissues. This method predicts the protein expression values with average R² scores between 0.46 and 0.54, which is significantly better than predictions based on correlations using the RNA expression data alone. Moreover, it is demonstrated that the derived models can be "transferred" across experiments and species. For instance, the model derived from human tissues gave an R² = 0.51 when applied to mouse tissue data. It is concluded that protein abundances generated in label-free MS experiments can be computationally predicted using functional annotated attributes and can be used to highlight aberrant protein abundance values.

Crossref

Institute of Cancer Research Repository

Using Deep Learning to Extrapolate Protein Expression Measurements.

Author: Antonio Vizcaíno J
Barzine MP
Brazma A
Celms E
Choudhary JS
Freivalds K
Ghavidel FZ
Jarnuczak AF
Jonassen I
Lace L
Opmanis M
Rituma D
Viksna J
Wright JC
Čerāns K
Publication venue: 'Wiley'
Publication date: 01/11/2020

Institute of Cancer Research Repository