Comparative Analysis of Different Label-Free Mass Spectrometry Based Protein Abundance Estimates and Their Correlation with RNA-Seq Gene Expression Data
Abstract
An increasing number of studies involve integrative analysis of
gene and protein expression data, taking advantage of new technologies
such as next-generation transcriptome sequencing (RNA-Seq) and highly
sensitive mass spectrometry (MS) instrumentation. Thus, it becomes
interesting to revisit the correlative analysis of gene and protein
expression data using more recently generated data sets. Furthermore,
within the proteomics community there is a substantial interest in
comparing the performance of different label-free quantitative proteomic
strategies. Gene expression data can be used as an indirect benchmark
for such protein-level comparisons. In this work we use publicly available
mouse data to perform a joint analysis of genomic and proteomic data
obtained on the same organism. First, we perform a comparative analysis
of different label-free protein quantification methods (intensity-based
and spectral-count-based, each with various associated data normalization
steps) using several software tools on the proteomic side. Similarly,
we perform correlative analysis of gene expression data derived using
microarray and RNA-Seq methods on the genomic side. We also investigate
the correlation between gene and protein expression data and the various
factors affecting the accuracy of quantitation at both levels. We
observe that spectral-count-based protein abundance metrics, which
are easy to extract from any published data, are comparable to intensity-based
measures with respect to correlation with gene expression data.
The results of this work should be useful for designing robust computational
pipelines for extraction and joint analysis of gene and protein expression
data in the context of integrative studies.
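
The kind of comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the length normalization shown here is only one common way to turn spectral counts into an abundance estimate, and the abstract does not say which metrics or correlation measure were used. The function names, the choice of Spearman rank correlation, and all numbers are assumptions made up for illustration.

```python
import math

def normalized_spectral_counts(counts, lengths):
    """Divide each protein's spectral count by its length, then scale to sum to 1.

    Assumption: this length normalization is one illustrative spectral count
    metric, not necessarily one of those compared in the study.
    """
    ratios = {p: counts[p] / lengths[p] for p in counts}
    total = sum(ratios.values())
    return {p: r / total for p, r in ratios.items()}

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy, made-up values keyed by hypothetical gene identifiers.
spectral_counts = {"A": 120, "B": 30, "C": 8, "D": 60}
protein_lengths = {"A": 400, "B": 300, "C": 200, "D": 150}
rna_seq_expression = {"A": 55.0, "B": 12.0, "C": 20.0, "D": 80.0}

abundance = normalized_spectral_counts(spectral_counts, protein_lengths)
genes = sorted(abundance)  # match protein and gene measurements by identifier
rho = spearman([abundance[g] for g in genes],
               [rna_seq_expression[g] for g in genes])
print(f"Spearman correlation: {rho:.2f}")
```

A rank-based measure is used in this sketch because protein and transcript abundances are on different scales; a study could equally use Pearson correlation on log-transformed values.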