Value of Using Multiple Proteases for Large-Scale Mass Spectrometry-Based Proteomics
Large-scale protein sequencing methods rely on enzymatic digestion of complex protein mixtures to generate a collection of peptides for mass spectrometric analysis. Here we examine the use of multiple proteases (trypsin, LysC, ArgC, AspN, and GluC) to improve both protein identification and characterization in the model organism Saccharomyces cerevisiae. Using a data-dependent, decision tree-based algorithm to tailor MS2 fragmentation method to peptide precursor, we identified 92 095 unique peptides (609 665 total) mapping to 3908 proteins at a 1% false discovery rate (FDR). These results were a significant improvement upon data from a single protease digest (trypsin): 27 822 unique peptides corresponding to 3313 proteins. The additional 595 protein identifications were mainly from those at low abundances (i.e., < 1000 copies/cell); sequence coverage for these proteins was likewise improved nearly 3-fold. We demonstrate that large portions of the proteome are simply inaccessible following digestion with a single protease and that multiple proteases, rather than technical replicates, provide a direct route to increase both protein identifications and proteome sequence coverage.
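To illustrate why combining proteases can raise sequence coverage, the following is a minimal in silico sketch, not the authors' pipeline. The cleavage rules are simplified assumptions (for example, no proline exception for trypsin and no missed cleavages), the function names `digest` and `coverage` are hypothetical, and the 6 to 30 residue length window standing in for "detectable" peptides is an arbitrary illustrative choice.

```python
# Simplified cleavage rules (assumption): residues at which each protease cuts,
# and whether the cut falls on the C-terminal or N-terminal side of the residue.
RULES = {
    "trypsin": ("KR", "C"),
    "LysC": ("K", "C"),
    "ArgC": ("R", "C"),
    "AspN": ("D", "N"),
    "GluC": ("E", "C"),
}


def digest(seq, protease):
    """Return half-open (start, end) spans of fully cleaved peptides."""
    residues, side = RULES[protease]
    cuts = {0, len(seq)}
    for i, aa in enumerate(seq):
        if aa in residues:
            pos = i + 1 if side == "C" else i
            if 0 < pos < len(seq):
                cuts.add(pos)
    ordered = sorted(cuts)
    return list(zip(ordered[:-1], ordered[1:]))


def coverage(seq, proteases, min_len=6, max_len=30):
    """Fraction of residues covered by peptides within the length window.

    The length window is a crude, assumed proxy for detectability.
    """
    covered = set()
    for protease in proteases:
        for start, end in digest(seq, protease):
            if min_len <= end - start <= max_len:
                covered.update(range(start, end))
    return len(covered) / len(seq)
```

Because the covered positions are pooled across digests, `coverage(seq, ["trypsin", "GluC"])` can never be lower than `coverage(seq, ["trypsin"])` for the same sequence; regions that yield peptides that are too short or too long with one protease may fall inside a suitable peptide from another.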
A New Probabilistic Database Search Algorithm for ETD Spectra
Peptide characterization using electron transfer dissociation (ETD) is an important analytical tool for protein identification. The fragmentation observed in ETD spectra is complementary to that seen when using the traditional dissociation method, collision activated dissociation (CAD). Applications of ETD enhance the scope and complexity of the peptides that can be studied by mass spectrometry-based methods. For example, ETD is shown to be particularly useful for the study of post-translationally modified peptides. To take advantage of the power provided by ETD, it is important to have an ETD-specific database search engine, an integral tool of mass spectrometry-based analytical proteomics. In this paper, we report on our development of a database search engine using ETD spectra and protein sequence databases to identify peptides. The search engine is based on the probabilistic modeling of shared peaks count and shared peaks intensity between the spectra and the peptide sequences. The shared peaks count accounts for the cumulative variations from amino acid sequences, while shared peaks intensity models the variations between the candidate sequence and product ion intensities. To demonstrate the utility of this algorithm for searching real-world data, we present the results of applications of this model to two high-throughput data sets. Both data sets were obtained from yeast whole cell lysates. The first data set was obtained from a sample digested by Lys-C, and the second data set was obtained by a digestion using trypsin. We searched the data sets against a combined forward and reversed yeast protein database to estimate false discovery rates. We compare the search results from the new methods with the results from a search engine often employed for ETD spectra, OMSSA. Our findings show that overall the new model performs comparably to OMSSA for low false discovery rates. At the same time, we demonstrate that there are substantial differences with OMSSA for results on subsets of data. Therefore, we conclude the new model can be considered as being complementary to previously developed models.
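The forward/reversed database strategy mentioned in this abstract can be sketched as follows. This is a generic target-decoy illustration rather than the paper's implementation: the function name `score_threshold_at_fdr` and the input layout are hypothetical, and the FDR estimate used here (decoy matches divided by target matches above a score cutoff) is one common convention that is assumed, not taken from the abstract.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Find the most permissive score cutoff that keeps estimated FDR <= max_fdr.

    psms: iterable of (score, is_decoy) pairs, higher score = better match.
    Returns (cutoff_score, accepted_target_count), or (None, 0) if no cutoff
    satisfies the limit. FDR is estimated as decoys / targets (assumption).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best = (None, 0)
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1  # match to a reversed (decoy) sequence
        else:
            targets += 1  # match to a forward (target) sequence
        if targets and decoys / targets <= max_fdr:
            best = (score, targets)
    return best
```

Walking down the ranked list and remembering the last position at which the estimate is still within the limit gives the cutoff that accepts the most target matches. Tied scores are not treated specially in this sketch.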
Data-Independent Acquisition Protease-Multiplexing Enables Increased Proteome Sequence Coverage Across Multiple Fragmentation Modes
The use of multiple proteases has been shown to increase protein sequence coverage in proteomics experiments; however, due to the additional analysis time required, it has not been widely adopted in routine data-dependent acquisition (DDA) proteomic workflows. Alternatively, data-independent acquisition (DIA) has the potential to analyze multiplexed samples from different protease digests, but has been primarily optimized for fragmenting tryptic peptides. Here we evaluate a DIA multiplexing approach that combines three proteolytic digests (Trypsin, AspN, and GluC) into a single sample. We first optimize data acquisition conditions for each protease individually with both the canonical DIA fragmentation mode (beam type CID), as well as resonance excitation CID, to determine optimal consensus conditions across proteases. Next, we demonstrate that application of these conditions to a protease-multiplexed sample of human peptides results in similar protein identifications and quantitative performance as compared to trypsin alone, but enables up to a 63% increase in peptide detections, and a 45% increase in nonredundant amino acid detections. Nontryptic peptides enabled noncanonical protein isoform determination and resulted in 100% sequence coverage for numerous proteins, suggesting the utility of this approach in applications where sequence coverage is critical, such as protein isoform analysis.
TCO, a putative transcriptional regulator in Arabidopsis, is a target of the protein kinase CK2
As multicellular organisms grow, spatial and temporal patterns of gene expression are strictly regulated to ensure that developmental programs are invoked at appropriate stages. In this work, we describe a putative transcriptional regulator in Arabidopsis, TACO LEAF (TCO), whose overexpression results in the ectopic activation of reproductive genes during vegetative growth. Isolated as an activation-tagged allele, tco-1D displays gene misexpression and phenotypic abnormalities, such as curled leaves and early flowering, characteristic of chromatin regulatory mutants. A role for TCO in this mode of transcriptional regulation is further supported by the subnuclear accumulation patterns of TCO protein and genetic interactions between tco-1D and chromatin modifier mutants. The endogenous expression pattern of TCO and gene misregulation in tco loss-of-function mutants indicate that this factor is involved in seed development. We also demonstrate that specific serine residues of TCO protein are targeted by the ubiquitous kinase CK2. Collectively, these results identify TCO as a novel regulator of gene expression whose activity is likely influenced by phosphorylation, as is the case with many chromatin regulators.
A Proteomics Grade Electron Transfer Dissociation-Enabled Hybrid Linear Ion Trap-Orbitrap Mass Spectrometer
Here we detail the modification of a quadrupole linear ion trap-orbitrap hybrid (QLT-orbitrap) mass spectrometer to accommodate a negative chemical ionization (NCI) source. The NCI source is used to produce fluoranthene radical anions for imparting electron transfer dissociation (ETD). The anion beam is stable, robust, and intense so that a sufficient amount of reagents can be injected into the QLT in only 4−8 ms. Following ion/ion reaction in the QLT, ETD product ions are mass-to-charge (m/z) analyzed in either the QLT (for speed and sensitivity) or the orbitrap (for mass resolution and accuracy). Here we describe the physical layout of this device, parametric optimization of anion transport, an evaluation of relevant ETD figures of merit, and the application of this instrument to protein sequence analysis. Described proteomic applications include complex peptide mixture analysis, post-translational modification (PTM) site identification, isotope-encoded quantitation, large peptide characterization, and intact protein analysis. From these experiments, we conclude the ETD-enabled orbitrap will provide the proteomic field with several new opportunities and represents an advance in protein sequence analysis technologies.
