    Accurate Peptide Fragment Mass Analysis: Multiplexed Peptide Identification and Quantification

    Fourier transform-all reaction monitoring (FT-ARM) is a novel approach for the identification and quantification of peptides that relies upon the selectivity of high mass accuracy data and the specificity of peptide fragmentation patterns. An FT-ARM experiment involves continuous, data-independent, high mass accuracy MS/MS acquisition spanning a defined m/z range. Custom software was developed to search peptides against the multiplexed fragmentation spectra by comparing theoretical or empirical fragment ions against every fragmentation spectrum across the entire acquisition. A dot product score is calculated against each spectrum to generate a score chromatogram used for both identification and quantification. Chromatographic elution profile characteristics are not used to cluster precursor peptide signals to their respective fragment ions. FT-ARM identifications are demonstrated to be complementary to conventional data-dependent shotgun analysis, especially in cases where the data-dependent method fails because of fragmenting multiple overlapping precursors. The sensitivity, robustness, and specificity of FT-ARM quantification are shown to be analogous to selected reaction monitoring-based peptide quantification with the added benefit of minimal assay development. Thus, FT-ARM is demonstrated to be a novel and complementary data acquisition, identification, and quantification method for the large scale analysis of peptides.
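
    The score-chromatogram idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring function, so the cosine-normalized dot product, the unit-intensity theoretical fragments, the 10 ppm tolerance, and the names `dot_product_score` and `score_chromatogram` are all assumptions made for illustration, not the authors' software.

```python
import math


def dot_product_score(fragment_mzs, spectrum, tol_ppm=10.0):
    """Cosine-normalized dot product of one spectrum against theoretical fragments.

    fragment_mzs: theoretical fragment m/z values (each given unit intensity; an assumption).
    spectrum: list of (mz, intensity) pairs from one MS/MS scan.
    """
    matched = []
    for frag in fragment_mzs:
        tol = frag * tol_ppm * 1e-6  # ppm tolerance converted to an m/z window
        # Take the most intense observed peak inside the window, or 0 if none.
        best = max((inten for mz, inten in spectrum if abs(mz - frag) <= tol),
                   default=0.0)
        matched.append(best)
    norm_theo = math.sqrt(len(fragment_mzs))
    norm_obs = math.sqrt(sum(inten * inten for _, inten in spectrum))
    if norm_theo == 0 or norm_obs == 0:
        return 0.0
    return sum(matched) / (norm_theo * norm_obs)


def score_chromatogram(fragment_mzs, spectra, tol_ppm=10.0):
    """Score every spectrum in the acquisition.

    spectra: list of (retention_time, spectrum) pairs.
    Returns a list of (retention_time, score) pairs.
    """
    return [(rt, dot_product_score(fragment_mzs, spec, tol_ppm))
            for rt, spec in spectra]
```

    In such a sketch, the apex of the returned chromatogram would suggest where the peptide elutes, and the area under the peak could serve as a rough quantitative readout.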

    Decreased Gap Width in a Cylindrical High-Field Asymmetric Waveform Ion Mobility Spectrometry Device Improves Protein Discovery

    High-field asymmetric waveform ion mobility spectrometry (FAIMS) is an atmospheric pressure ion mobility technique that separates gas phase ions according to their characteristic dependence of ion mobility on electric field strength. FAIMS can be implemented as a means of automated gas-phase fractionation in liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments. We modified a commercially available cylindrical FAIMS device by enlarging the inner electrode, thereby narrowing the gap and increasing the effective field strength. This modification provided a nearly 4-fold increase in FAIMS peak capacity over the optimally configured unmodified device. We employed the modified FAIMS device for on-line fractionation in a proteomic analysis of a complex sample and observed major increases in protein discovery. NanoLC-FAIMS-MS/MS of an unfractionated yeast tryptic digest using the modified FAIMS device identified 53% more proteins than were identified using an unmodified FAIMS device and 98% more proteins than were identified with unaided nanoLC-MS/MS. We describe here the development of a nanoLC-FAIMS-MS/MS protocol that provides automated gas-phase fractionation for proteomic analysis of complex protein digests. We compare this protocol against prefractionation of peptides with isoelectric focusing and demonstrate that FAIMS fractionation yields comparable protein recovery while significantly reducing the amount of sample required and eliminating the need for additional sample handling

    Kojak: Efficient Analysis of Chemically Cross-Linked Protein Complexes

    Protein chemical cross-linking and mass spectrometry enable the analysis of protein–protein interactions and protein topologies; however, complicated cross-linked peptide spectra require specialized algorithms to identify interacting sites. The Kojak cross-linking software application is a new, efficient approach to identify cross-linked peptides, enabling large-scale analysis of protein–protein interactions by chemical cross-linking techniques. The algorithm integrates spectral processing and scoring schemes adopted from traditional database search algorithms and can identify cross-linked peptides using many different chemical cross-linkers with or without heavy isotope labels. Kojak was used to analyze both novel and existing data sets and was compared to existing cross-linking algorithms. The algorithm provided increased cross-link identifications over existing algorithms and, equally importantly, delivered the results in a fraction of the computational time. The Kojak algorithm is open-source, cross-platform, and freely available. This software provides both existing and new cross-linking researchers an effective way to derive additional cross-link identifications from new or existing data sets. For new users, it provides a simple analytical resource resulting in more cross-link identifications than other methods.

    The State of the Human Proteome in 2012 as Viewed through PeptideAtlas

    The Human Proteome Project was launched in September 2010 with the goal of characterizing at least one protein product from each protein-coding gene. Here we assess how much of the proteome has been detected to date via tandem mass spectrometry by analyzing PeptideAtlas, a compendium of human derived LC–MS/MS proteomics data from many laboratories around the world. All data sets are processed with a consistent set of parameters using the Trans-Proteomic Pipeline and subjected to a 1% protein FDR filter before inclusion in PeptideAtlas. Therefore, PeptideAtlas contains only high confidence protein identifications. To increase proteome coverage, we explored new comprehensive public data sources for data likely to add new proteins to the Human PeptideAtlas. We then folded these data into a Human PeptideAtlas 2012 build and mapped it to Swiss-Prot, a protein sequence database curated to contain one entry per human protein coding gene. We find that this latest PeptideAtlas build includes at least one peptide for each of ∼12500 Swiss-Prot entries, leaving ∼7500 gene products yet to be confidently cataloged. We characterize these “PA-unseen” proteins in terms of tissue localization, transcript abundance, and Gene Ontology enrichment, and propose reasons for their absence from PeptideAtlas and strategies for detecting them in the future.

    In Vivo Application of Photocleavable Protein Interaction Reporter Technology

    In vivo protein structures and protein–protein interactions are critical to the function of proteins in biological systems. As a complementary approach to traditional protein interaction identification methods, cross-linking strategies are beginning to provide additional data on protein and protein complex topological features. Previously, photocleavable protein interaction reporter (pcPIR) technology was demonstrated by cross-linking pure proteins and protein complexes and the use of ultraviolet light to cleave or release cross-linked peptides to enable identification. In the present report, the pcPIR strategy is applied to Escherichia coli cells, and in vivo protein interactions and topologies are measured. More than 1600 labeled peptides from E. coli were identified, indicating that many protein sites react with pcPIR in vivo. From those labeled sites, 53 in vivo intercross-linked peptide pairs were identified and manually validated. Approximately half of the interactions have been reported using other techniques, although detailed structures exist for very few. Three proteins or protein complexes with detailed crystallography structures are compared to the cross-linking results obtained from in vivo application of pcPIR technology.

    A Protein Standard That Emulates Homology for the Characterization of Protein Inference Algorithms

    A natural way to benchmark the performance of an analytical experimental setup is to use samples of known composition and see to what degree one can correctly infer the content of such a sample from the data. For shotgun proteomics, one of the inherent problems of interpreting data is that the measured analytes are peptides and not the actual proteins themselves. As some proteins share proteolytic peptides, there might be more than one possible causative set of proteins resulting in a given set of peptides, and there is a need for mechanisms that infer proteins from lists of detected peptides. A weakness of commercially available samples of known content is that they consist of proteins that are deliberately selected for producing tryptic peptides that are unique to a single protein. Unfortunately, such samples do not expose any complications in protein inference. Hence, for a realistic benchmark of protein inference procedures, there is a need for samples of known content where the present proteins share peptides with known absent proteins. Here, we present such a standard, based on E. coli expressed human protein fragments. To illustrate the application of this standard, we benchmark a set of different protein inference procedures on the data. We observe that inference procedures excluding shared peptides provide more accurate estimates of errors compared to methods that include information from shared peptides, while still giving a reasonable performance in terms of the number of identified proteins. We also demonstrate that using a sample of known protein content without proteins with shared tryptic peptides can give a false sense of accuracy for many protein inference methods.
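
    The contrast drawn above, between inference that excludes shared peptides and inference that uses them, can be made concrete with a small sketch. The abstract does not name the benchmarked procedures, so the function `infer_proteins_unique_only` and the peptide-to-protein mapping format below are hypothetical illustrations of the "exclude shared peptides" idea only.

```python
from collections import defaultdict


def infer_proteins_unique_only(detected_peptides, peptide_to_proteins):
    """Infer proteins using only peptides that map to exactly one protein.

    detected_peptides: iterable of peptide sequences observed in the sample.
    peptide_to_proteins: dict mapping peptide -> set of protein IDs that
        could have produced it (hypothetical input format).
    Returns a dict mapping protein ID -> sorted list of its unique peptides.
    """
    evidence = defaultdict(list)
    for peptide in detected_peptides:
        proteins = peptide_to_proteins.get(peptide, set())
        # Shared peptides (two or more candidate proteins) are ambiguous
        # evidence, so this sketch simply discards them.
        if len(proteins) == 1:
            (protein,) = proteins
            evidence[protein].append(peptide)
    return {protein: sorted(peps) for protein, peps in evidence.items()}
```

    A protein supported only by shared peptides is never reported under this rule, which trades some sensitivity for less ambiguous evidence.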