An integrated landscape of protein expression in human cancer
Using 11 proteomics datasets, mostly available through the PRIDE database, we assembled a reference expression map for 191 cancer cell lines and 246 clinical tumour samples, across 13 lineages. We found unique peptides identified only in tumour samples despite a much higher coverage in cell lines. These were mainly mapped to proteins related to regulation of signalling receptor activity. Correlations between baseline expression in cell lines and tumours were calculated. We found these to be highly similar across all samples with most similarity found within a given sample type. Integration of proteomics and transcriptomics data showed median correlation across cell lines to be 0.58 (range between 0.43 and 0.66). Additionally, in agreement with previous studies, variation in mRNA levels was often a poor predictor of changes in protein abundance. To our knowledge, this work constitutes the first meta-analysis focusing on cancer-related public proteomics datasets. We therefore also highlight shortcomings and limitations of such studies. All data is available through PRIDE dataset identifier PXD013455 and in Expression Atlas.
Expression Atlas: gene and protein expression across multiple studies and organisms
Expression Atlas (http://www.ebi.ac.uk/gxa) is an added-value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions.
Expression Atlas update: from tissues to single cells.
Expression Atlas is EMBL-EBI's resource for gene and protein expression. It sources and compiles data on the abundance and localisation of RNA and proteins in various biological systems and contexts and provides open access to this data for the research community. With the increased availability of single cell RNA-Seq datasets in the public archives, we have now extended Expression Atlas with a new added-value service to display gene expression in single cells. Single Cell Expression Atlas was launched in 2018 and currently includes 123 single cell RNA-Seq studies from 12 species. The website can be searched by genes within or across species to reveal experiments, tissues and cell types where this gene is expressed or under which conditions it is a marker gene. Within each study, cells can be visualized using a pre-calculated t-SNE plot and can be coloured by different features or by cell clusters based on gene expression. Within each experiment, there are links to downloadable files, such as RNA quantification matrices, clustering results, reports on protocols and associated metadata, such as assigned cell types.
A quantitative and temporal map of proteostasis during heat shock in Saccharomyces cerevisiae
Temporal changes in the yeast proteome under heat stress are mapped and integrated to protein networks to reveal cognate groups of chaperones (orange and blue circles) acting on coherent groups of substrate proteins (red and green).
Analysis of Intrinsic Peptide Detectability via Integrated Label-Free and SRM-Based Absolute Quantitative Proteomics
Quantitative mass spectrometry-based proteomics of complex biological samples remains challenging in part due to the variability and charge competition arising during electrospray ionization (ESI) of peptides and the subsequent transfer and detection of ions. These issues preclude direct quantification from signal intensity alone in the absence of a standard. A deeper understanding of the governing principles of peptide ionization and exploitation of the inherent ionization and detection parameters of individual peptides is thus of great value. Here, using the yeast proteome as a model system, we establish the concept of peptide F-factor as a measure of detectability, closely related to ionization efficiency. F-factor is calculated by normalizing peptide precursor ion intensity by absolute abundance of the parent protein. We investigated F-factor characteristics in different shotgun proteomics experiments, including across multiple ESI-based LC–MS platforms. We show that F-factors mirror previously observed physicochemical predictors of peptide detectability but demonstrate a nonlinear relationship between hydrophobicity and peptide detectability. Similarly, we use F-factors to show how peptide ion coelution adversely affects detectability and ionization. We suggest that F-factors have great utility for understanding peptide detectability and gas-phase ion chemistry in complex peptide mixtures, selection of surrogate peptides in targeted MS studies, and for calibration of peptide ion signal in label-free workflows. Data are available via ProteomeXchange with identifier PXD003472.
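
The F-factor definition in the abstract above (precursor ion intensity normalized by the absolute abundance of the parent protein) can be illustrated with a minimal sketch. The function name, peptide sequences, protein identifiers, intensities, abundances and units are all hypothetical, chosen only for illustration; the abstract does not specify units or any further normalization, so this is not the authors' implementation.

```python
def f_factor(precursor_intensity: float, protein_abundance: float) -> float:
    """Return intensity per unit of parent-protein abundance.

    Assumption: plain division, as the abstract's wording suggests.
    Units (e.g. arbitrary intensity units per copy per cell) are hypothetical.
    """
    if protein_abundance <= 0:
        raise ValueError("protein abundance must be positive")
    return precursor_intensity / protein_abundance


# Hypothetical absolute abundances for two parent proteins.
protein_abundance = {"PROT1": 1000.0, "PROT2": 500.0}

# Hypothetical peptides: (sequence, parent protein, precursor ion intensity).
peptides = [
    ("PEPTIDEA", "PROT1", 2.0e6),
    ("PEPTIDEB", "PROT1", 5.0e5),
    ("PEPTIDEC", "PROT2", 2.0e6),
]

# One F-factor per peptide, keyed by sequence.
factors = {
    seq: f_factor(intensity, protein_abundance[protein])
    for seq, protein, intensity in peptides
}

# List peptides from most to least detectable under this measure.
for seq, value in sorted(factors.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{seq}\t{value:.1f}")
```

In this toy data, two peptides from the same protein (PEPTIDEA and PEPTIDEB) get different F-factors because their intensities differ at equal protein abundance, which is the kind of peptide-specific detectability difference the abstract describes.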