17 research outputs found

    Improved prediction of peptide detectability for targeted proteomics using a rank-based algorithm and organism-specific data.

    The in silico prediction of the best-observable "proteotypic" peptides in mass spectrometry-based workflows is a challenging problem. Accurate prediction of such peptides would enable the informed selection of proteotypic peptides for targeted quantification of previously observed and non-observed proteins for any organism, with a significant impact on clinical proteomics and systems biology studies. Current prediction algorithms rely on physicochemical parameters in combination with positive and negative training sets to identify the peptide properties that most strongly affect general detectability. Here we present PeptideRank, an approach that uses a learning-to-rank algorithm for peptide detectability prediction from shotgun proteomics data and that eliminates the need to select a negative dataset for the training step. A large number of different peptide properties are used to train ranking models that predict a ranking of the best-observable peptides within a protein. Empirical evaluation with rank accuracy metrics showed that PeptideRank complements existing prediction algorithms. Our results indicate that the best performance is achieved when it is trained on organism-specific shotgun proteomics data, and that PeptideRank is most accurate for short to medium-sized and abundant proteins, without any loss in prediction accuracy for the important class of membrane proteins.
    BIOLOGICAL SIGNIFICANCE: Targeted proteomics approaches have been gaining momentum and hold immense potential for systems biology studies and clinical proteomics. However, since only very few complete proteomes have been reported to date, a considerable fraction of a proteome lacks the experimental proteomics evidence that would guide the selection of the best-suited proteotypic peptides (PTPs), i.e. peptides that are specific to a given proteoform and that are repeatedly observed in a mass spectrometer. We describe a novel, rank-based approach for the prediction of the best-suited PTPs for targeted proteomics applications. By building on methods developed in the field of information retrieval (e.g. web search engines like Google's PageRank), we circumvent the delicate step of selecting positive and negative training sets and at the same time more closely reflect the experimentalist's need to select, e.g., the 5 most promising peptides for targeting a protein of interest. This approach allows PTPs to be predicted for not yet observed proteins or for organisms without prior experimental proteomics data, such as many non-model organisms.
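
The ranking idea in this abstract can be illustrated with a small pairwise sketch. It is not PeptideRank's actual model: the abstract does not name the learning-to-rank algorithm or the peptide properties used, so the perceptron-style update, the two numeric features, the peptide labels and the spectral counts below are all hypothetical. The sketch shows only the key point, namely that training pairs come from peptides of the same protein ordered by observation, so no negative set is needed.

```python
from typing import Dict, List, Tuple

# (label, feature vector, observed spectral count); all values are hypothetical
Peptide = Tuple[str, List[float], int]

def dot(w: List[float], x: List[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(proteins: Dict[str, List[Peptide]],
                   epochs: int = 50, lr: float = 0.1) -> List[float]:
    """Learn linear weights so that, within each protein, a peptide with
    more observations scores higher than one with fewer."""
    n_features = len(next(iter(proteins.values()))[0][1])
    w = [0.0] * n_features
    for _ in range(epochs):
        for peptides in proteins.values():
            for _, feat_a, count_a in peptides:
                for _, feat_b, count_b in peptides:
                    if count_a <= count_b:
                        continue  # only pairs where a was observed more often than b
                    diff = [a - b for a, b in zip(feat_a, feat_b)]
                    if dot(w, diff) <= 0:  # pair is mis-ordered: nudge the weights
                        w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

def top_k(w: List[float], candidates: List[Tuple[str, List[float]]],
          k: int = 5) -> List[str]:
    """Rank the candidate peptides of one protein and return the k best labels."""
    ranked = sorted(candidates, key=lambda c: dot(w, c[1]), reverse=True)
    return [label for label, _ in ranked[:k]]

# Toy training data: two proteins with observed peptides (hypothetical values).
TRAIN = {
    "protein1": [("pepA", [0.9, 0.2], 30), ("pepB", [0.5, 0.4], 10), ("pepC", [0.1, 0.9], 1)],
    "protein2": [("pepD", [0.8, 0.1], 12), ("pepE", [0.3, 0.5], 4)],
}
# Peptides of a protein that was never observed.
CANDIDATES = [("pepX", [0.2, 0.8]), ("pepY", [0.95, 0.1]), ("pepZ", [0.6, 0.3])]

weights = train_pairwise(TRAIN)
best = top_k(weights, CANDIDATES, k=2)
```

Returning the top k labels mirrors the experimentalist's question of which few peptides to target for a protein of interest; a real model would use many more features and a tested ranking library.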

    Characterization of the larval hemolymph proteome.

    (A) Workflow of the analyses. Hemolymph samples from fed and starved larvae were digested in solution. Tryptic peptides were separated by isoelectric focusing for complexity reduction. Peptides were analyzed using microcapillary liquid chromatography–electrospray ionization–tandem MS (µLC-ESI-MS/MS). SEQUEST spectral search was performed for peptide spectrum matching. (B) Venn diagram illustrating the number of gene models detected in hemolymph from fed and starved larvae, respectively.

    Effects of starvation on hemolymph proteome.

    The magnitude versus amplitude (MA) plot shows the log2 fold change of the expression of the identified D. melanogaster proteins in the starved versus fed condition against the mean normalized spectral count. The top 10% differentially expressed proteins are highlighted, including 50 up-regulated proteins (red dots) and 22 down-regulated proteins (green dots). Protein identifiers are shown for selected proteins discussed in the text. Unambiguous protein identifications by class 1a, 1b, and 3a peptides are shown as full circles. Protein groups identified by class 2a or 2b peptides (which unambiguously imply a gene model) are shown as open circles; ambiguous identifications by 3b peptides are shown as open diamonds (the respective identifiers are listed in Table S2).
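
As a rough sketch of the two axes described in this caption, the coordinates of one protein could be computed as below. The caption does not say how spectral counts were normalized or how zero counts were handled, so the inputs are assumed to be already normalized and the pseudocount is an assumption, as are the function name and the example numbers.

```python
import math
from typing import Tuple

def ma_point(fed_count: float, starved_count: float,
             pseudocount: float = 1.0) -> Tuple[float, float]:
    """Return (A, M) for one protein.

    M is the log2 fold change, starved versus fed.
    A is the mean of the two normalized spectral counts.
    The pseudocount (an assumption) avoids division by zero for proteins
    seen in only one condition.
    """
    m = math.log2((starved_count + pseudocount) / (fed_count + pseudocount))
    a = (fed_count + starved_count) / 2.0
    return a, m

# Hypothetical protein with 3 normalized counts when fed and 15 when starved.
a_value, m_value = ma_point(3, 15)
```

A positive M then marks a protein that is more abundant in starved larvae, a negative M one that is less abundant.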

    Abundance of larval serum proteins.

    Hemolymph was isolated from fed (f) and starved (s) larvae (see Fig. 1). Proteins in samples of 10, 3.3, 1.7 or 1 µl hemolymph were resolved by SDS-PAGE and stained with Coomassie Blue. The position of the major larval serum proteins (LSPs) is indicated by an arrowhead. Position and size (kDa) of molecular weight markers (m) are indicated on the right side.

    Starvation-associated protein abundance changes in larval hemolymph.

    a) Change in transcript levels during development in rich medium was estimated based on expression profiling data from [77]. For transcript levels around the time when starvation was started (early), the values observed at L2 and L3/12 hours were averaged. For transcript levels around the time of hemolymph collection (late), the values at L3/puff stage 1–2 were used. The given values correspond to log2(early/late).

    Summary of identified spectra, peptides, proteins and estimated FDR levels.

    a) According to our peptide classification scheme [38], [46], class 1a peptides unambiguously identify a single unique protein sequence encoded by a unique transcript. Class 1b peptides also unambiguously identify a unique protein sequence encoded by several transcripts of the same gene model with identical coding region and differences in the 5′ and/or 3′ untranslated regions. Class 2a peptides identify a subset and class 2b peptides all protein sequences encoded by a gene model. Class 3a peptides unambiguously identify one protein sequence, but this sequence could be encoded by several gene models from distinct loci (e.g. histones). Finally, class 3b peptides can be derived from different protein sequences encoded by several gene models from distinct loci and have the least information content.
    b) For protein groups identified by class 2a or 2b peptides (a gene model identification) all possible protein accessions are listed in Table S1.
    c) The minimal number of additional protein identifications by 3b peptides is shown.
    d) Based on the total hits in target and decoy databases (DB), the FDR was estimated at the spectra, peptide and protein level.
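
Footnote d) refers to a target-decoy FDR estimate. The footnote does not give the formula, so the sketch below assumes one common convention (decoy hits divided by target hits); the counts are invented for illustration and are not taken from the table.

```python
from typing import Dict, Tuple

def estimate_fdr(target_hits: int, decoy_hits: int) -> float:
    """Estimate the false discovery rate as decoy hits / target hits.

    This ratio is an assumed convention; other definitions exist,
    e.g. 2 * decoy / (target + decoy).
    """
    if target_hits <= 0:
        raise ValueError("target_hits must be positive")
    return decoy_hits / target_hits

# Hypothetical (target, decoy) hit counts at each of the three levels.
example_hits: Dict[str, Tuple[int, int]] = {
    "spectra": (50000, 50),
    "peptides": (5000, 10),
    "proteins": (500, 2),
}

fdr_by_level = {level: estimate_fdr(t, d) for level, (t, d) in example_hits.items()}
```

With counts like these the estimated FDR rises from the spectrum level to the protein level, because each decoy hit weighs more as matches are collapsed into fewer identifications.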

    Starvation protocol and developmental effects.

    (A) At 65 hours after egg deposition (AED), half of the larvae were transferred to starvation medium (20% sucrose). Twenty-four hours later, hemolymph from fed and starved larvae was collected for deep shotgun proteomics. Developmental timing of ecdysone titer, larval stages L2 and L3, acquisition of critical weight, wandering behavior and pupation under optimal conditions is indicated as well. Numbers indicate time in hours AED. (B) Size of fed and starved larvae at time of hemolymph collection. (C) At 65 hours AED, larvae were either shifted to starvation medium or further maintained on rich medium, followed by analysis of the fraction of pupae over time (n = 278 fed and 141 starved). (D) Size of pupae formed by either fed or starved larvae. Bars = 0.5 mm.