Utility of RNA-seq and GPMDB Protein Observation Frequency for Improving the Sensitivity of Protein Identification by Tandem MS
Abstract
Tandem mass spectrometry (MS/MS) followed by database search is the
method of choice for protein identification in proteomic studies.
Database searching methods employ spectral matching algorithms and
statistical models to identify and quantify proteins in a sample.
In general, these methods do not utilize any information other than
spectral data for protein identification. However, considering the
wealth of external data available for many biological systems, analysis
methods can incorporate such information to improve the sensitivity
of protein identification. In this study, we present a method to utilize
Global Proteome Machine Database identification frequencies and RNA-seq
transcript abundances to adjust the confidence scores of protein identifications.
The method described is particularly useful for samples with low-to-moderate
proteome coverage (i.e., <2000–3000 proteins), where we
observe up to an 8% improvement in the number of proteins identified
at a 1% false discovery rate.
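
The general idea of adjusting a confidence score with external evidence can be illustrated with a minimal sketch. This is not the procedure used in the study, which the abstract does not specify. It assumes each protein already has a search-derived probability computed under a neutral baseline prior, and that a per-protein prior has already been derived from external data such as GPMDB identification frequency or RNA-seq abundance. The function names, the baseline prior of 0.5, and the example values are all hypothetical.

```python
def adjust_probability(p, prior, baseline_prior=0.5):
    """Re-weight a probability computed under `baseline_prior` so that it
    reflects `prior` instead (a plain Bayes re-weighting, assumed here
    purely for illustration)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not (0.0 < prior < 1.0 and 0.0 < baseline_prior < 1.0):
        raise ValueError("priors must lie strictly between 0 and 1")
    # Relative weight given to the "present" and "absent" hypotheses.
    w_present = prior / baseline_prior
    w_absent = (1.0 - prior) / (1.0 - baseline_prior)
    numerator = p * w_present
    return numerator / (numerator + (1.0 - p) * w_absent)


def count_at_fdr(probabilities, fdr=0.01):
    """Count how many top-ranked proteins can be accepted while the
    expected fraction of false identifications stays at or below `fdr`.
    The expected number of false identifications is the sum of (1 - p)."""
    ranked = sorted(probabilities, reverse=True)
    expected_false = 0.0
    accepted = 0
    for rank, p in enumerate(ranked, start=1):
        expected_false += 1.0 - p
        if expected_false / rank <= fdr:
            accepted = rank
    return accepted


# Hypothetical example: three proteins with search-derived probabilities
# and external priors (made-up numbers, not data from the study).
search_probs = [0.999, 0.98, 0.60]
external_priors = [0.5, 0.9, 0.1]
adjusted = [adjust_probability(p, q) for p, q in zip(search_probs, external_priors)]
before = count_at_fdr(search_probs)
after = count_at_fdr(adjusted)
```

In this sketch a protein with a high external prior gains confidence and one with a low prior loses it, so the number of proteins passing a fixed FDR threshold can change after adjustment.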