Special Enrichment Strategies Greatly Increase the Efficiency of Missing Proteins Identification from Regular Proteome Samples
As part of the Chromosome-Centric
Human Proteome Project (C-HPP)
mission, laboratories all over the world have been working since 2012 to map
all missing proteins (MPs). On the basis of the first and second
Chinese Chromosome Proteome Database (CCPD 1.0 and 2.0) studies, we
developed systematic enrichment strategies to identify MPs that fell
into four classes: (1) low molecular weight (LMW) proteins, (2) membrane
proteins, (3) proteins that contained various post-translational modifications
(PTMs), and (4) nucleic acid-associated proteins. Of 8845 proteins
identified in 7 data sets, 79 proteins were classified as MPs. Among
data sets derived from different enrichment strategies, data sets
for LMW and PTM yielded the most novel MPs. In addition, we found
that some MPs were identified in multiple data sets, which implied
that tandem enrichment methods might improve the ability to identify
MPs. Moreover, low expression at the transcription level was the major
cause of these MPs being “missing”; however, MPs with
higher expression levels also evaded identification, most likely due
to other characteristics such as LMW, high hydrophobicity and PTM.
By combining a stringent manual check of the MS2 spectra
with peptide synthesis verification, we confirmed 30 MPs (neXtProt
PE2 to PE4) and 6 potential MPs (neXtProt PE5) with authentic
MS evidence. By integrating our large-scale data sets of CCPD 2.0,
the number of identified proteins has increased considerably beyond
simulation saturation. Here, we show that special enrichment strategies
can break through the data saturation bottleneck, which could increase
the efficiency of MP identification in future C-HPP studies. All 7
data sets have been uploaded to ProteomeXchange with the identifier
PXD002255.