22 research outputs found
Autopilot: An Online Data Acquisition Control System for the Enhanced High-Throughput Characterization of Intact Proteins
The ability to study organisms by direct analysis of their proteomes
without digestion via mass spectrometry has benefited greatly from
recent advances in separation techniques, instrumentation, and bioinformatics.
However, improvements to data acquisition logic have lagged in comparison.
Past workflows for Top Down Proteomics (TDP) have focused on high
throughput at the expense of maximal protein coverage and characterization.
This mode of data acquisition has led to enormous overlap in the identification
of highly abundant proteins in subsequent LC-MS injections. Furthermore,
a wealth of data is left underutilized by analyzing each newly targeted
species as unique, rather than as part of a collection of fragmentation
events on a distinct proteoform. Here, we present a major advance
in software for acquisition of TDP data that incorporates a fully
automated workflow able to detect intact masses, guide fragmentation
to achieve maximal identification and characterization of intact protein
species, and perform database search online to yield real-time protein
identifications. On <i>Pseudomonas aeruginosa</i>, the software
combines fragmentation events of the same precursor with previously
obtained fragments to achieve improved characterization of the target
form by an average of 42 orders of magnitude in confidence. When HCD
fragmentation optimization was applied to intact protein ions, there
was a gain of 18.5 orders of magnitude in confidence. These improved
metrics set the stage for increased proteome coverage and characterization
of higher order organisms in the future, and for sharply improved control
over MS instruments in a project- and lab-wide context.
Advancing Intact Protein Quantitation with Updated Deconvolution Routines
Analysis of intact proteins by mass spectrometry enables direct
quantitation of the specific proteoforms present in a sample and is
an increasingly important tool for biopharmaceutical and academic
research. Interpreting and quantifying intact protein species from
mass spectra typically involves many challenges including mass deconvolution
and peak processing as well as determining optimal spectral averaging
parameters and matching masses to theoretical proteoforms. Each of
these steps can present informatic hurdles, as parameters often need
to be tailored specifically to the data sets. To reduce intact mass
deconvolution data analysis burdens, we built upon the widely used
“sliding window” mass deconvolution technique with several
additional concepts. First, we found that how spectra are averaged
and the overlap in spectral windows can be tuned to favor either sensitivity
or speed. A multiple window averaging approach was found to be the
most effective way to increase mass detection and yielded a >2-fold
increase in the number of masses detected. We also developed a targeted
feature-finding routine that boosted sensitivity by >2-fold, decreased
coefficient of variation across replicates by 50%, and increased the
quality of mass elution profiles through 3-fold more detected time
points. Lastly, we furthered existing approaches for annotating detected
masses with potential proteoforms through spectral fitting for possible
proteoform family modifications and network viewing. These proteoform
annotation approaches ultimately produced a more accurate way of finding
related, but previously unknown proteoforms from intact mass-only
data. Together, these quantitation workflow improvements advance the
information obtainable from intact protein mass spectrometry analyses.
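The trade-off between window overlap and speed described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sliding_window_average`, the `window` and `step` parameters, and the assumption that every scan is already binned onto a shared m/z grid are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Spectrum = Sequence[float]  # intensities on a shared m/z grid (assumption)


def sliding_window_average(
    spectra: Sequence[Spectrum], window: int, step: int
) -> List[Tuple[int, List[float]]]:
    """Average `window` consecutive scans, advancing `step` scans each time.

    Hypothetical helper: step < window gives overlapping windows,
    step == window gives non-overlapping windows.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    averaged = []
    for start in range(0, len(spectra) - window + 1, step):
        block = spectra[start:start + window]
        n_bins = len(block[0])
        # Bin-wise mean intensity across the scans in this window.
        mean = [sum(scan[i] for scan in block) / window for i in range(n_bins)]
        averaged.append((start, mean))
    return averaged


# Toy data: four scans, two m/z bins each.
scans = [[1.0, 0.0], [3.0, 0.0], [5.0, 2.0], [7.0, 2.0]]
overlapping = sliding_window_average(scans, window=2, step=1)
disjoint = sliding_window_average(scans, window=2, step=2)
```

With overlap (`step=1`) the toy input yields three averaged spectra to pass on to deconvolution, whereas disjoint windows (`step=2`) yield two. More overlap means more averaged spectra and more processing time, which is one plausible reading of the sensitivity-versus-speed tuning the abstract mentions.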
Quantitation and Identification of Thousands of Human Proteoforms below 30 kDa
Top-down proteomics is capable of identifying and quantitating
unique proteoforms through the analysis of intact proteins. We extended
the coverage of the label-free technique, achieving differential analysis
of whole proteins <30 kDa from the proteomes of growing and senescent
human fibroblasts. By integrating improved control software with more
instrument time allocated for quantitation of intact ions, we were
able to collect protein data between the two cell states, confidently
comparing 1577 proteoform levels. To then identify and characterize
proteoforms, our advanced acquisition software, named Autopilot, identified 1180
unique Swiss-Prot accession numbers at a 1% false-discovery rate with improved identification efficiency. This
coverage of the low mass proteome is equivalent to the largest previously
reported but was accomplished in 23% of the total acquisition time.
By maximizing both the number of quantified proteoforms and their
identification rate in an integrated software environment, this work
significantly advances proteoform-resolved analyses of complex systems.
Legislative Documents
Also, variously referred to as: House bills; House documents; House legislative documents; legislative documents; General Court documents
Capillary HILIC-MS: A New Tool for Sensitive Top-Down Proteomics
Recent progress in top-down proteomics has driven the demand for
chromatographic methods compatible with mass spectrometry (MS) that
can separate intact proteins. Hydrophilic interaction liquid chromatography
(HILIC) has recently shown good potential for the characterization
of glycoforms of intact proteins. In the present study, we demonstrate
that HILIC can separate a wide range of proteins exhibiting orthogonal
selectivity with respect to reversed-phase LC (RPLC). However, the
application of HILIC to the analysis of low abundance proteins (e.g.,
in proteomics analysis) is hampered by low volume loadability, hindering
down-scaling of the method to column diameters below 2.1 mm. Moreover,
HILIC-MS sensitivity is decreased due to ion suppression from the
trifluoroacetic acid (TFA) often used as the ion-pair agent to improve
the selectivity and efficiency in the analysis of glycoproteins. Here,
we introduce a capillary-based HILIC-MS method that overcomes these
problems. Our method uses RPLC trap-columns to load and inject the
sample, circumventing issues of protein solubility and volume loadability
in capillary columns (200 μm ID). The low flow rates and use
of a dopant gas in the electrospray interface improve protein-ionization
efficiencies and reduce suppression by TFA. Overall, this allows the
separation and detection of small protein quantities (down to 5 ng
injected on column) as indicated by the analysis of a mixture of model
proteins. The potential of the new capillary HILIC-MS is demonstrated
by the analysis of a complex cell lysate.
The C-Score: A Bayesian Framework to Sharply Improve Proteoform Scoring in High-Throughput Top Down Proteomics
The automated processing of data generated by top down proteomics
would benefit from improved scoring for protein identification and
characterization of highly related protein forms (proteoforms). Here
we propose the “C-score” (short for Characterization
Score), a Bayesian approach to the proteoform identification and characterization
problem, implemented within a framework to allow the infusion of expert
knowledge into generative models that take advantage of known properties
of proteins and top down analytical systems (e.g., fragmentation propensities,
“off-by-1 Da” discontinuous errors, and intelligent
weighting for site-specific modifications). The performance of the
scoring system based on the initial generative models was compared
to the current probability-based scoring system used within both ProSightPC
and ProSightPTM on a manually curated set of 295 human proteoforms.
The current implementation of the C-score framework generated a marked
improvement over the existing scoring system as measured by the area
under the curve on the resulting ROC chart (AUC of 0.99 versus 0.78).
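The Bayesian idea behind scoring competing proteoforms can be sketched as a posterior over candidates. This is only an illustration of Bayes' rule, not the published C-score formula or its generative models: the function `posterior_over_candidates`, the candidate labels, and the toy likelihoods and priors are all hypothetical.

```python
import math
from typing import Dict


def posterior_over_candidates(
    log_likelihoods: Dict[str, float], priors: Dict[str, float]
) -> Dict[str, float]:
    """Return P(candidate | data) for each candidate proteoform.

    Hypothetical helper: log_likelihoods[c] stands in for log P(data | c)
    from some generative model, and priors[c] for P(c) encoding expert
    knowledge. Neither is taken from the paper.
    """
    log_joint = {
        name: ll + math.log(priors[name]) for name, ll in log_likelihoods.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_joint.values())
    weights = {name: math.exp(value - peak) for name, value in log_joint.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Toy example: two candidate proteoforms with equal priors.
toy_log_likelihoods = {"form_A": math.log(0.8), "form_B": math.log(0.2)}
toy_priors = {"form_A": 0.5, "form_B": 0.5}
posterior = posterior_over_candidates(toy_log_likelihoods, toy_priors)
```

Because the posterior is normalized across all candidates, closely related proteoforms compete directly for probability mass. That is the property that makes a Bayesian formulation natural for characterization, as opposed to scoring each candidate in isolation.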