20 research outputs found
Accurate Peptide Fragment Mass Analysis: Multiplexed Peptide Identification and Quantification
Fourier transform-all reaction monitoring (FT-ARM) is
a novel approach
for the identification and quantification of peptides that relies
upon the selectivity of high mass accuracy data and the specificity
of peptide fragmentation patterns. An FT-ARM experiment involves continuous,
data-independent, high mass accuracy MS/MS acquisition spanning a
defined <i>m</i>/<i>z</i> range. Custom software
was developed to search peptides against the multiplexed fragmentation
spectra by comparing theoretical or empirical fragment ions against
every fragmentation spectrum across the entire acquisition. A dot
product score is calculated against each spectrum to generate a score
chromatogram used for both identification and quantification. Chromatographic
elution profile characteristics are not used to cluster precursor
peptide signals to their respective fragment ions. FT-ARM identifications
are demonstrated to be complementary to conventional data-dependent
shotgun analysis, especially in cases where the data-dependent method
fails because of fragmenting multiple overlapping precursors. The
sensitivity, robustness, and specificity of FT-ARM quantification
are shown to be analogous to selected reaction monitoring-based peptide
quantification with the added benefit of minimal assay development.
Thus, FT-ARM is demonstrated to be a novel and complementary data
acquisition, identification, and quantification method for the large
scale analysis of peptides.
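The score chromatogram described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' software: the abstract states only that a dot product score is computed against every spectrum, so the cosine-style normalization, the unit theoretical intensities, the 10 ppm tolerance, and the function names `dot_product_score` and `score_chromatogram` are all assumptions made for the example.

```python
import math


def dot_product_score(fragment_mz, spectrum, tol_ppm=10.0):
    """Normalized dot product between a peptide's fragment m/z list and one spectrum.

    fragment_mz: theoretical (or empirical) fragment m/z values, each given
                 unit intensity (an assumption of this sketch).
    spectrum:    list of (mz, intensity) peaks from one multiplexed MS/MS scan.
    """
    matched = []
    for target in fragment_mz:
        tol = target * tol_ppm * 1e-6  # ppm tolerance converted to m/z units
        # Take the most intense observed peak inside the tolerance window.
        best = max(
            (inten for mz, inten in spectrum if abs(mz - target) <= tol),
            default=0.0,
        )
        matched.append(best)

    norm_obs = math.sqrt(sum(inten * inten for _, inten in spectrum))
    norm_theo = math.sqrt(len(fragment_mz))
    if norm_obs == 0.0 or norm_theo == 0.0:
        return 0.0
    return sum(matched) / (norm_obs * norm_theo)


def score_chromatogram(fragment_mz, run):
    """Score one peptide against every spectrum in the acquisition.

    run: list of (retention_time, spectrum) pairs in acquisition order.
    Returns a list of (retention_time, score) pairs.
    """
    return [(rt, dot_product_score(fragment_mz, spec)) for rt, spec in run]
```

In this sketch the apex of the returned trace would serve as the identification evidence, and the area under it as a quantitative readout; how the published method derives either value is not specified in the abstract.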
Decreased Gap Width in a Cylindrical High-Field Asymmetric Waveform Ion Mobility Spectrometry Device Improves Protein Discovery
High-field asymmetric waveform ion
mobility spectrometry (FAIMS)
is an atmospheric pressure ion mobility technique that separates gas
phase ions according to their characteristic dependence of ion mobility
on electric field strength. FAIMS can be implemented as a means of
automated gas-phase fractionation in liquid chromatography-tandem
mass spectrometry (LC-MS/MS) experiments. We modified a commercially
available cylindrical FAIMS device by enlarging the inner electrode,
thereby narrowing the gap and increasing the effective field strength.
This modification provided a nearly 4-fold increase in FAIMS peak
capacity over the optimally configured unmodified device. We employed
the modified FAIMS device for on-line fractionation in a proteomic
analysis of a complex sample and observed major increases in protein
discovery. NanoLC-FAIMS-MS/MS of an unfractionated yeast tryptic digest
using the modified FAIMS device identified 53% more proteins than
were identified using an unmodified FAIMS device and 98% more proteins
than were identified with unaided nanoLC-MS/MS. We describe here the
development of a nanoLC-FAIMS-MS/MS protocol that provides automated
gas-phase fractionation for proteomic analysis of complex protein
digests. We compare this protocol against prefractionation of peptides
with isoelectric focusing and demonstrate that FAIMS fractionation
yields comparable protein recovery while significantly reducing the
amount of sample required and eliminating the need for additional
sample handling.
Kojak: Efficient Analysis of Chemically Cross-Linked Protein Complexes
Protein chemical cross-linking and
mass spectrometry enable the
analysis of protein–protein interactions and protein topologies;
however, complicated cross-linked peptide spectra require specialized
algorithms to identify interacting sites. The Kojak cross-linking
software application is a new, efficient approach to identify cross-linked
peptides, enabling large-scale analysis of protein–protein
interactions by chemical cross-linking techniques. The algorithm integrates
spectral processing and scoring schemes adopted from traditional database
search algorithms and can identify cross-linked peptides using many
different chemical cross-linkers with or without heavy isotope labels.
Kojak was used to analyze both novel and existing data sets and was
compared to existing cross-linking algorithms. The algorithm provided
increased cross-link identifications over existing algorithms and,
equally importantly, produced the results in a fraction of the computational time.
The Kojak algorithm is open-source, cross-platform, and freely available.
This software provides both existing and new cross-linking researchers
alike an effective way to derive additional cross-link identifications
from new or existing data sets. For new users, it provides a simple
analytical resource resulting in more cross-link identifications than
other methods.
The State of the Human Proteome in 2012 as Viewed through PeptideAtlas
The Human Proteome Project was launched in September
2010 with
the goal of characterizing at least one protein product from each
protein-coding gene. Here we assess how much of the proteome has been
detected to date via tandem mass spectrometry by analyzing PeptideAtlas,
a compendium of human-derived LC–MS/MS proteomics data from
many laboratories around the world. All data sets are processed with
a consistent set of parameters using the Trans-Proteomic Pipeline
and subjected to a 1% protein FDR filter before inclusion in PeptideAtlas.
Therefore, PeptideAtlas contains only high confidence protein identifications.
To increase proteome coverage, we explored new comprehensive public
data sources for data likely to add new proteins to the Human PeptideAtlas.
We then folded these data into a Human PeptideAtlas 2012 build and
mapped it to Swiss-Prot, a protein sequence database curated to contain
one entry per human protein coding gene. We find that this latest
PeptideAtlas build includes at least one peptide for each of ∼12500
Swiss-Prot entries, leaving ∼7500 gene products yet to be confidently
cataloged. We characterize these “PA-unseen” proteins
in terms of tissue localization, transcript abundance, and Gene Ontology
enrichment, and propose reasons for their absence from PeptideAtlas
and strategies for detecting them in the future.
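The coverage count in this abstract (entries with at least one mapped peptide versus entries still unseen) amounts to a set partition. The sketch below illustrates that bookkeeping only; the function name `split_seen_unseen`, the input shapes, and the accession strings in the usage note are hypothetical and do not reflect the PeptideAtlas build process.

```python
def split_seen_unseen(swissprot_entries, peptide_hits):
    """Partition reference entries by whether any peptide maps to them.

    swissprot_entries: iterable of accession strings (one per protein-coding gene).
    peptide_hits:      iterable of (peptide_sequence, accession) pairs that
                       already passed the confidence filter.
    Returns (seen, unseen) as two sets of accessions.
    """
    entries = set(swissprot_entries)
    # An entry counts as seen if at least one peptide maps to it.
    seen = {acc for _, acc in peptide_hits if acc in entries}
    unseen = entries - seen
    return seen, unseen
```

For example, with three hypothetical entries `P1`, `P2`, `P3` and hits mapping only to `P1`, the function would report `P1` as seen and the other two as unseen.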
<i>In Vivo</i> Application of Photocleavable Protein Interaction Reporter Technology
<i>In vivo</i> protein structures and protein–protein
interactions are critical to the function of proteins in biological
systems. As a complementary approach to traditional protein interaction
identification methods, cross-linking strategies are beginning to
provide additional data on protein and protein complex topological
features. Previously, photocleavable protein interaction reporter
(pcPIR) technology was demonstrated by cross-linking pure proteins
and protein complexes and the use of ultraviolet light to cleave or
release cross-linked peptides to enable identification. In the present
report, the pcPIR strategy is applied to <i>Escherichia coli</i> cells, and <i>in vivo</i> protein interactions and topologies
are measured. More than 1600 labeled peptides from <i>E. coli</i> were identified, indicating that many protein sites react with pcPIR <i>in vivo</i>. From those labeled sites, 53 <i>in vivo</i> intercross-linked peptide pairs were identified and manually validated.
Approximately half of the interactions have been reported using other
techniques, although detailed structures exist for very few. Three
proteins or protein complexes with detailed crystallography structures
are compared to the cross-linking results obtained from <i>in
vivo</i> application of pcPIR technology.
A Protein Standard That Emulates Homology for the Characterization of Protein Inference Algorithms
A natural way to
benchmark the performance of an analytical experimental
setup is to use samples of known composition and see to what degree
one can correctly infer the content of such a sample from the data.
For shotgun proteomics, one of the inherent problems of interpreting
data is that the measured analytes are peptides and not the actual
proteins themselves. As some proteins share proteolytic peptides,
there might be more than one possible causative set of proteins resulting
in a given set of peptides and there is a need for mechanisms that
infer proteins from lists of detected peptides. A weakness of commercially
available samples of known content is that they consist of proteins
that are deliberately selected for producing tryptic peptides that
are unique to a single protein. Unfortunately, such samples do not
expose any complications in protein inference. Hence, for a realistic
benchmark of protein inference procedures, there is a need for samples
of known content where the present proteins share peptides with known
absent proteins. Here, we present such a standard, based on <i>E. coli</i>-expressed human protein fragments. To illustrate
the application of this standard, we benchmark a set of different
protein inference procedures on the data. We observe that inference
procedures excluding shared peptides provide more accurate estimates
of errors compared to methods that include information from shared
peptides, while still giving a reasonable performance in terms of
the number of identified proteins. We also demonstrate that using
a sample of known protein content without proteins with shared tryptic
peptides can give a false sense of accuracy for many protein inference
methods.
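One of the simplest inference procedures of the kind this abstract refers to, excluding shared peptides, can be sketched as below. It is an illustration under stated assumptions rather than any of the benchmarked tools: the function name `infer_from_unique_peptides` and the peptide-to-protein map are hypothetical, and no scoring or error estimation is attempted.

```python
from collections import defaultdict


def infer_from_unique_peptides(detected_peptides, peptide_to_proteins):
    """Infer proteins using only peptides that map to exactly one protein.

    detected_peptides:   iterable of peptide sequences observed in the sample.
    peptide_to_proteins: dict mapping each peptide to the set of proteins
                         in the database that contain it.
    Returns a dict of protein -> set of supporting unique peptides.
    """
    evidence = defaultdict(set)
    for peptide in detected_peptides:
        proteins = peptide_to_proteins.get(peptide, set())
        # Shared peptides (two or more parent proteins) are ambiguous
        # evidence and are skipped entirely in this procedure.
        if len(proteins) == 1:
            (protein,) = proteins
            evidence[protein].add(peptide)
    return dict(evidence)
```

With a standard of the type described, where present proteins share peptides with known absent ones, a procedure like this would never report an absent protein on the strength of a shared peptide alone, at the cost of ignoring some real evidence.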