11 research outputs found

    Improving Collision Induced Dissociation (CID), High Energy Collision Dissociation (HCD), and Electron Transfer Dissociation (ETD) Fourier Transform MS/MS Degradome–Peptidome Identifications Using High Accuracy Mass Information

    No full text
    MS dissociation methods, including collision induced dissociation (CID), high energy collision dissociation (HCD), and electron transfer dissociation (ETD), can each contribute distinct peptidome identifications using conventional peptide identification methods (Shen et al. <i>J. Proteome Res</i>. <b>2011</b>), but such samples still pose significant informatics challenges. In this work, we explored utilization of high accuracy fragment ion mass measurements, in this case provided by Fourier transform MS/MS, to improve peptidome peptide data set size and consistency relative to conventional descriptive and probabilistic scoring methods. For example, we identified 20–40% more peptides than SEQUEST, Mascot, and MS_GF scoring methods using high accuracy fragment ion information and the same false discovery rate (FDR) from CID, HCD, and ETD spectra. Identified species covered >90% of the collective identifications obtained using various conventional peptide identification methods, which significantly addresses the common issue of different data analysis methods generating different peptide data sets. Choice of peptide dissociation and high-precision measurement-based identification methods presently available for degradomic–peptidomic analyses needs to be based on the coverage and confidence (or specificity) afforded by the method, as well as practical issues (e.g., throughput). By using accurate fragment information, >1000 peptidome components can be identified from a single human blood plasma analysis with low peptide-level FDRs (e.g., 0.6%), providing an improved basis for investigating potential disease-related peptidome components

    Identification of Ultramodified Proteins Using Top-Down Tandem Mass Spectra

    No full text
    Post-translational modifications (PTMs) play an important role in various biological processes through changing protein structure and function. Some ultramodified proteins (like histones) have multiple PTMs forming PTM patterns that define the functionality of a protein. While bottom-up mass spectrometry (MS) has been successful in identifying individual PTMs within short peptides, it is unable to identify PTM patterns spreading along entire proteins in a coordinated fashion. In contrast, top-down MS analyzes intact proteins and reveals PTM patterns along the entire proteins. However, while recent advances in instrumentation have made top-down MS accessible to many laboratories, most computational tools for top-down MS focus on proteins with few PTMs and are unable to identify complex PTM patterns. We propose a new algorithm, MS-Align-E, that identifies both expected and unexpected PTMs in ultramodified proteins. We demonstrate that MS-Align-E identifies many proteoforms of histone H4 and benchmark it against the currently accepted software tools

    Moving beyond the van Krevelen Diagram: A New Stoichiometric Approach for Compound Classification in Organisms

    No full text
    van Krevelen diagrams (O/C vs H/C ratios of elemental formulas) have been widely used in studies to obtain an estimation of the main compound categories present in environmental samples. However, the limits defining a specific compound category based solely on O/C and H/C ratios of elemental formulas have never been accurately listed or proposed to classify metabolites in biological samples. Furthermore, while O/C vs H/C ratios of elemental formulas can provide an overview of the compound categories, such classification is inefficient because of the large overlap among different compound categories along both axes. We propose a more accurate compound classification for biological samples analyzed by high-resolution mass spectrometry based on an assessment of the C/H/O/N/P stoichiometric ratios of over 130 000 elemental formulas of compounds classified in 6 main categories: lipids, peptides, amino sugars, carbohydrates, nucleotides, and phytochemical compounds (oxy-aromatic compounds). Our multidimensional stoichiometric compound classification (MSCC) constraints showed a highly accurate categorization of elemental formulas to the main compound categories in biological samples, with over 98% accuracy, representing a substantial improvement over any classification based on the classic van Krevelen diagram. This method represents a significant step forward in environmental research, especially ecological stoichiometry and eco-metabolomics studies, by providing a novel and robust tool to improve our understanding of the ecosystem structure and function through the chemical characterization of biological samples.
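
The ratios that a van Krevelen diagram plots can be computed directly from an elemental formula. The sketch below is only an illustration of that arithmetic, not the MSCC method from the abstract: the function names `parse_formula` and `stoichiometric_ratios` are hypothetical, the parser assumes simple formulas without parentheses or charges, and no category limits are applied because the abstract does not list them.

```python
import re


def parse_formula(formula: str) -> dict[str, int]:
    """Count atoms in a simple elemental formula such as 'C6H12O6'.

    Assumption: no parentheses, hydrates, isotopes, or charges.
    """
    counts: dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def stoichiometric_ratios(formula: str) -> dict[str, float]:
    """Return O/C, H/C, N/C, and P/C ratios for one elemental formula."""
    counts = parse_formula(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        raise ValueError(f"formula {formula!r} contains no carbon")
    # O/C and H/C are the two van Krevelen axes; N/C and P/C are added
    # only because the abstract mentions C/H/O/N/P stoichiometry.
    return {f"{el}/C": counts.get(el, 0) / carbon for el in ("O", "H", "N", "P")}


if __name__ == "__main__":
    for name, formula in [("glucose", "C6H12O6"), ("palmitic acid", "C16H32O2")]:
        print(name, stoichiometric_ratios(formula))
```

A carbohydrate such as glucose and a lipid such as palmitic acid share the same H/C ratio and differ only in O/C, which illustrates why two axes alone can leave categories overlapping.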

    Advanced Solvent Based Methods for Molecular Characterization of Soil Organic Matter by High-Resolution Mass Spectrometry

    No full text
    Soil organic matter (SOM), a complex, heterogeneous mixture of above and belowground plant litter and animal and microbial residues at various degrees of decomposition, is a key reservoir for carbon (C) and nutrient biogeochemical cycling in soil based ecosystems. A limited understanding of the molecular composition of SOM limits the ability to routinely decipher chemical processes within soil and accurately predict how terrestrial carbon fluxes will respond to changing climatic conditions and land use. To elucidate the molecular-level structure of SOM, we selectively extracted a broad range of intact SOM compounds by a combination of different organic solvents from soils with a wide range of C content. Our use of electrospray ionization (ESI) coupled with Fourier transform ion cyclotron resonance mass spectrometry (FTICR MS) and a suite of solvents with varying polarity significantly expands the inventory of the types of organic molecules present in soils. Specifically, we found that hexane is selective for lipid-like compounds with very low O/C ratios (<0.1); water (H<sub>2</sub>O) was selective for carbohydrates with high O/C ratios; acetonitrile (ACN) preferentially extracts lignin, condensed structures, and tannin polyphenolic compounds with O/C > 0.5; methanol (MeOH) has higher selectivity toward compounds characterized with low O/C < 0.5; and hexane, MeOH, ACN, and H<sub>2</sub>O solvents increase the number and types of organic molecules extracted from soil for a broader range of chemically diverse soil types. Our study of SOM molecules by ESI FTICR MS revealed new insight into the molecular-level complexity of organics contained in soils. We present the first comparative study of the molecular composition of SOM from different ecosystems using ultra high-resolution mass spectrometry

    Formularity: Software for Automated Formula Assignment of Natural and Other Organic Matter from Ultrahigh-Resolution Mass Spectra

    No full text
    Ultrahigh resolution mass spectrometry, such as Fourier transform ion cyclotron resonance mass spectrometry (FT ICR MS), can resolve thousands of molecular ions in complex organic matrices. A Compound Identification Algorithm (CIA) was previously developed for automated elemental formula assignment for natural organic matter (NOM). In this work, we describe the software Formularity, which provides a user-friendly interface for the CIA function and a newly developed search function, the Isotopic Pattern Algorithm (IPA). While CIA assigns elemental formulas for compounds containing C, H, O, N, S, and P, IPA is capable of assigning formulas for compounds containing other elements. We used halogenated organic compounds (HOC), a chemical class that is ubiquitous in nature as well as anthropogenic systems, as an example to demonstrate the capability of Formularity with IPA. A HOC standard mix was used to evaluate the identification confidence of IPA. Tap water and HOC spike in Suwannee River NOM were used to assess HOC identification in complex environmental samples. Strategies for reconciliation of CIA and IPA assignments are discussed. Software and sample databases with documentation are freely available.

    De Novo Sequencing of Peptides from Top-Down Tandem Mass Spectra

    No full text
    De novo sequencing of proteins and peptides is one of the most important problems in mass spectrometry-driven proteomics. A variety of methods have been developed to accomplish this task from a set of bottom-up tandem (MS/MS) mass spectra. However, a more recently emerged top-down technology, now gaining more and more popularity, opens new perspectives for protein analysis and characterization, implying a need for efficient algorithms to process this kind of MS/MS data. Here, we describe a method that allows for the retrieval, from a set of top-down MS/MS spectra, of long and accurate sequence fragments of the proteins contained in the sample. To this end, we outline a strategy for generating high-quality sequence tags from top-down spectra, and introduce the concept of a <i>T</i>-Bruijn graph by adapting to the case of tags the notion of an <i>A</i>-Bruijn graph widely used in genomics. The output of the proposed approach represents the set of amino acid strings spelled out by optimal paths in the connected components of a <i>T</i>-Bruijn graph. We illustrate its performance on top-down data sets acquired from carbonic anhydrase 2 (CAH2) and the Fab region of alemtuzumab

    <i>De Novo</i> Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra

    No full text
    There are two approaches for <i>de novo</i> protein sequencing: Edman degradation and mass spectrometry (MS). Existing MS-based methods characterize a novel protein by assembling tandem mass spectra of overlapping peptides generated from multiple proteolytic digestions of the protein. Because each tandem mass spectrum covers only a short peptide of the target protein, the key to high coverage protein sequencing is to find spectral pairs from overlapping peptides in order to assemble tandem mass spectra to long ones. However, overlapping regions of peptides may be too short to be confidently identified. High-resolution mass spectrometers have become accessible to many laboratories. These mass spectrometers are capable of analyzing molecules of large mass values, boosting the development of top-down MS. Top-down tandem mass spectra cover whole proteins. However, top-down tandem mass spectra, even combined, rarely provide full ion fragmentation coverage of a protein. We propose an algorithm, TBNovo, for <i>de novo</i> protein sequencing by combining top-down and bottom-up MS. In TBNovo, a top-down tandem mass spectrum is utilized as a scaffold, and bottom-up tandem mass spectra are aligned to the scaffold to increase sequence coverage. Experiments on data sets of two proteins showed that TBNovo achieved high sequence coverage and high sequence accuracy
