22 research outputs found

    Autopilot: An Online Data Acquisition Control System for the Enhanced High-Throughput Characterization of Intact Proteins

    No full text
    The ability to study organisms by direct analysis of their proteomes without digestion via mass spectrometry has benefited greatly from recent advances in separation techniques, instrumentation, and bioinformatics. However, improvements to data acquisition logic have lagged in comparison. Past workflows for Top Down Proteomics (TDP) have focused on high throughput at the expense of maximal protein coverage and characterization. This mode of data acquisition has led to enormous overlap in the identification of highly abundant proteins in subsequent LC-MS injections. Furthermore, a wealth of data is left underutilized by analyzing each newly targeted species as unique, rather than as part of a collection of fragmentation events on a distinct proteoform. Here, we present a major advance in software for acquisition of TDP data that incorporates a fully automated workflow able to detect intact masses, guide fragmentation to achieve maximal identification and characterization of intact protein species, and perform database search online to yield real-time protein identifications. On Pseudomonas aeruginosa, the software combines fragmentation events of the same precursor with previously obtained fragments to achieve improved characterization of the target form by an average of 42 orders of magnitude in confidence. When HCD fragmentation optimization was applied to intact protein ions, there was an 18.5-order-of-magnitude gain in confidence. These improved metrics set the stage for increased proteome coverage and characterization of higher order organisms in the future for sharply improved control over MS instruments in a project- and lab-wide context.

    Advancing Intact Protein Quantitation with Updated Deconvolution Routines

    No full text
    Analysis of intact proteins by mass spectrometry enables direct quantitation of the specific proteoforms present in a sample and is an increasingly important tool for biopharmaceutical and academic research. Interpreting and quantifying intact protein species from mass spectra typically involves many challenges including mass deconvolution and peak processing as well as determining optimal spectral averaging parameters and matching masses to theoretical proteoforms. Each of these steps can present informatic hurdles, as parameters often need to be tailored specifically to the data sets. To reduce intact mass deconvolution data analysis burdens, we built upon the widely used “sliding window” mass deconvolution technique with several additional concepts. First, we found that how spectra are averaged and the overlap in spectral windows can be tuned to favor either sensitivity or speed. A multiple window averaging approach was found to be the most effective way to increase mass detection and yielded a >2-fold increase in the number of masses detected. We also developed a targeted feature-finding routine that boosted sensitivity by >2-fold, decreased coefficient of variation across replicates by 50%, and increased the quality of mass elution profiles through 3-fold more detected time points. Lastly, we furthered existing approaches for annotating detected masses with potential proteoforms through spectral fitting for possible proteoform family modifications and network viewing. These proteoform annotation approaches ultimately produced a more accurate way of finding related, but previously unknown proteoforms from intact mass-only data. Together, these quantitation workflow improvements advance the information obtainable from intact protein mass spectrometry analyses

    Quantitation and Identification of Thousands of Human Proteoforms below 30 kDa

    No full text
    Top-down proteomics is capable of identifying and quantitating unique proteoforms through the analysis of intact proteins. We extended the coverage of the label-free technique, achieving differential analysis of whole proteins <30 kDa from the proteomes of growing and senescent human fibroblasts. By integrating improved control software with more instrument time allocated for quantitation of intact ions, we were able to collect protein data between the two cell states, confidently comparing 1577 proteoform levels. To then identify and characterize proteoforms, our advanced acquisition software, named Autopilot, identified 1180 unique Swiss-Prot accession numbers at a 1% false-discovery rate with enhanced identification efficiency. This coverage of the low mass proteome is equivalent to the largest previously reported but was accomplished in 23% of the total acquisition time. By maximizing both the number of quantified proteoforms and their identification rate in an integrated software environment, this work significantly advances proteoform-resolved analyses of complex systems.

    Legislative Documents

    Also, variously referred to as: House bills; House documents; House legislative documents; legislative documents; General Court documents

    Capillary HILIC-MS: A New Tool for Sensitive Top-Down Proteomics

    No full text
    Recent progress in top-down proteomics has driven the demand for chromatographic methods compatible with mass spectrometry (MS) that can separate intact proteins. Hydrophilic interaction liquid chromatography (HILIC) has recently shown good potential for the characterization of glycoforms of intact proteins. In the present study, we demonstrate that HILIC can separate a wide range of proteins exhibiting orthogonal selectivity with respect to reversed-phase LC (RPLC). However, the application of HILIC to the analysis of low abundance proteins (e.g., in proteomics analysis) is hampered by low volume loadability, hindering down-scaling of the method to column diameters below 2.1 mm. Moreover, HILIC-MS sensitivity is decreased due to ion suppression from the trifluoroacetic acid (TFA) often used as the ion-pair agent to improve the selectivity and efficiency in the analysis of glycoproteins. Here, we introduce a capillary-based HILIC-MS method that overcomes these problems. Our method uses RPLC trap-columns to load and inject the sample, circumventing issues of protein solubility and volume loadability in capillary columns (200 μm ID). The low flow rates and use of a dopant gas in the electrospray interface improve protein-ionization efficiencies and reduce suppression by TFA. Overall, this allows the separation and detection of small protein quantities (down to 5 ng injected on column) as indicated by the analysis of a mixture of model proteins. The potential of the new capillary HILIC-MS is demonstrated by the analysis of a complex cell lysate.

    The C‑Score: A Bayesian Framework to Sharply Improve Proteoform Scoring in High-Throughput Top Down Proteomics

    No full text
    The automated processing of data generated by top down proteomics would benefit from improved scoring for protein identification and characterization of highly related protein forms (proteoforms). Here we propose the “C-score” (short for Characterization Score), a Bayesian approach to the proteoform identification and characterization problem, implemented within a framework to allow the infusion of expert knowledge into generative models that take advantage of known properties of proteins and top down analytical systems (e.g., fragmentation propensities, “off-by-1 Da” discontinuous errors, and intelligent weighting for site-specific modifications). The performance of the scoring system based on the initial generative models was compared to the current probability-based scoring system used within both ProSightPC and ProSightPTM on a manually curated set of 295 human proteoforms. The current implementation of the C-score framework generated a marked improvement over the existing scoring system as measured by the area under the curve on the resulting ROC chart (AUC of 0.99 versus 0.78)
