46 research outputs found

    Accurate peak list extraction from proteomic mass spectra for identification and profiling studies

    Background: Mass spectrometry is an essential technique in proteomics, both to identify the proteins of a biological sample and to compare the proteomic profiles of different samples. In both cases, the main phase of the data analysis is the procedure that extracts the significant features from a mass spectrum. Its final output is the so-called peak list, which contains the mass, the charge and the intensity of every detected biomolecule. The main steps of the peak list extraction procedure are usually preprocessing, peak detection, peak selection, charge determination and monoisotoping. Results: This paper describes an original algorithm for peak list extraction from low and high resolution mass spectra. It has been developed principally to improve the precision of peak extraction in comparison to other reference algorithms. It includes several innovative features, among them a sophisticated method for handling overlapping isotopic distributions. Conclusions: The performance of the basic version of the algorithm and of its optional functionalities was evaluated on SELDI-TOF, MALDI-TOF and ESI-FTICR ECD mass spectra. Executable files of MassSpec, a MATLAB implementation of the peak list extraction procedure for Windows and Linux systems, can be downloaded free of charge by nonprofit institutions from the following web site: http://aimed11.unipv.it/MassSpec
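As a rough illustration of the peak detection and peak selection steps named in this abstract, the sketch below picks local maxima above an intensity threshold. It is not the MassSpec algorithm: the function name `detect_peaks`, the `min_intensity` parameter and the sample values are assumptions made for this example, and preprocessing, charge determination and monoisotoping are left out.

```python
from typing import List, Sequence, Tuple


def detect_peaks(
    mz: Sequence[float],
    intensity: Sequence[float],
    min_intensity: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (m/z, intensity) pairs for local maxima at or above min_intensity.

    Hypothetical helper: a point counts as a peak when it is strictly higher
    than its left neighbour and at least as high as its right neighbour.
    The first and last points are never reported.
    """
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    peaks = []
    for i in range(1, len(intensity) - 1):
        is_local_max = intensity[i] > intensity[i - 1] and intensity[i] >= intensity[i + 1]
        if is_local_max and intensity[i] >= min_intensity:
            peaks.append((mz[i], intensity[i]))
    return peaks


# Made-up spectrum, for illustration only.
example_mz = [100.0, 100.1, 100.2, 100.3, 100.4, 100.5, 100.6]
example_intensity = [1.0, 5.0, 2.0, 1.0, 8.0, 3.0, 0.5]
print(detect_peaks(example_mz, example_intensity, min_intensity=4.0))
```

Raising `min_intensity` acts as a crude peak selection step; a real pipeline would typically estimate the noise level from the spectrum rather than use a fixed cutoff.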

    How many human proteoforms are there?

    Despite decades of accumulated knowledge about proteins and their post-translational modifications (PTMs), numerous questions remain regarding their molecular composition and biological function. One of the most fundamental queries is the extent to which the combinations of DNA-, RNA- and PTM-level variations explode the complexity of the human proteome. Here, we outline what we know from current databases and measurement strategies including mass spectrometry-based proteomics. In doing so, we examine prevailing notions about the number of modifications displayed on human proteins and how they combine to generate the protein diversity underlying health and disease. We frame central issues regarding determination of protein-level variation and PTMs, including some paradoxes present in the field today. We use this framework to assess existing data and to ask the question, "How many distinct primary structures of proteins (proteoforms) are created from the 20,300 human genes?" We also explore prospects for improving measurements to better regularize protein-level biology and efficiently associate PTMs to function and phenotype

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead

    Improved Dynamic Range, Quantitation, and Characterization of Histone H4 Post-Translational Modifications: A Top Down Mass Spectrometric Approach

    168 p. Thesis (Ph.D.), University of Illinois at Urbana-Champaign, 2006. Intimately associated with DNA, histone proteins serve as both a structural scaffold for DNA packaging into the nucleus and an epigenetic means for the regulation of gene expression. One such histone-based mechanism for transcriptional regulation is post-translational modification (PTM) of histones H2A, H2B, H3 and H4. Combinations of modifications such as acetylation, methylation, and phosphorylation have been hypothesized to create a "histone code" that influences gene transcription, gene silencing, and chromatin formation. Essential for complete understanding of this code is an efficient methodology for detection, exact localization and quantitation of combinations of modifications at specific sites. We combine here gas-phase concentration and purification of human histone H4 inside a Quadrupole-Fourier Transform Mass Spectrometer hybrid (Q-FTMS) with Top Down fragmentation using Electron Capture Dissociation (ECD). We extend the use of Top Down MS to assess how the abundance of each modified histone H4 form changes in synchronized cells progressing through the cell cycle in order to identify PTMs and combinations of PTMs that are associated with cell cycle specific events such as replication and mitosis. The many observed combinations of modifications on H4 led us to develop a novel database searching strategy to simplify data analysis. Histone H4 was "shotgun annotated", resulting in the population of a database with masses of hypothetical modified H4 forms. Querying this database with ECD spectra rich in fragment ions, we found that this approach quickly finds the correct modified H4 form. We also developed an additional chromatographic approach that increased our dynamic range from 10^2 to >10^4, allowing the characterization and quantitation of >35 chemically distinct forms of H4 in HeLa cells, many of which have not been described previously. During the quantitation of these 39 distinct forms, we developed methods that dealt with challenges associated with intact proteins (i.e., partial oxidation of histones) and dissected isomeric mixtures of H4 PTMs. The prevalence of multiply modified H4 revealed here suggests that current views of histone PTM function are biased by the limited ability of other approaches to account for combinatorial modification.
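The "shotgun annotation" idea in this abstract, populating a database with the masses of hypothetical modified forms and then querying it, can be sketched as a combinatorial enumeration. The code below is a simplified illustration, not the thesis's method: it matches intact masses only, whereas the thesis queries with ECD fragment-ion spectra. The function names, the site list, the base mass and the tolerance are all assumptions made for the example; the modification mass shifts are the standard monoisotopic values.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

# Standard monoisotopic mass shifts in Da.
MOD_MASS = {"ac": 42.010565, "me": 14.015650, "ph": 79.966331}

Form = Tuple[float, Tuple[Tuple[str, str], ...]]


def enumerate_forms(base_mass: float, site_options: Dict[str, Sequence[str]]) -> List[Form]:
    """Build every combination of 'unmodified or one allowed modification' per site.

    Hypothetical helper: returns (mass, ((site, modification), ...)) tuples.
    """
    sites = sorted(site_options)
    choices = [(None,) + tuple(site_options[s]) for s in sites]
    forms = []
    for combo in product(*choices):
        mods = tuple((s, m) for s, m in zip(sites, combo) if m is not None)
        mass = base_mass + sum(MOD_MASS[m] for _, m in mods)
        forms.append((mass, mods))
    return forms


def query(forms: Sequence[Form], observed_mass: float, tol: float) -> List[Form]:
    """Return all candidate forms whose mass lies within tol of observed_mass."""
    return [f for f in forms if abs(f[0] - observed_mass) <= tol]


# Illustrative inputs only: a placeholder base mass and two assumed sites.
example_forms = enumerate_forms(1000.0, {"K16": ["ac"], "K20": ["me"]})
print(len(example_forms))
print(query(example_forms, 1056.026215, tol=0.01))
```

Because isomeric forms (the same modifications on different sites) share one intact mass, a mass-only query like this one cannot tell them apart, which is why fragment-ion data is needed to localize modifications.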