    Feature extraction for proteomics imaging mass spectrometry data

    Imaging mass spectrometry (IMS) has transformed proteomics by providing an avenue for collecting spatially distributed molecular data. Mass spectrometry data acquired with matrix-assisted laser desorption ionization (MALDI) IMS consist of tens of thousands of spectra, measured at regular grid points across the surface of a tissue section. Unlike the more standard liquid chromatography mass spectrometry, MALDI-IMS preserves the spatial information inherent in the tissue. Motivated by the need to differentiate cell populations and tissue types in MALDI-IMS data accurately and efficiently, we propose an integrated cluster and feature extraction approach for such data. We work with the derived binary data representing presence/absence of ions, as this is the essential information in the data. Our approach takes advantage of the spatial structure of the data in a noise removal and initial dimension reduction step and applies k-means clustering with the cosine distance to the high-dimensional binary data. The combined smoothing-clustering yields spatially localized clusters that clearly show the correspondence with cancer and various noncancerous tissue types. Feature extraction of the high-dimensional binary data is accomplished with our difference in proportions of occurrence (DIPPS) approach, which ranks the variables and selects a set of variables in a data-driven manner. We summarize the best variables in a single image that has a natural interpretation. Application of our method to data from patients with ovarian cancer shows good separation of tissue types and close agreement of our results with tissue types identified by pathologists.
    Lyron J. Winderbaum, Inge Koch, Ove J. R. Gustafsson, Stephan Meding and Peter Hoffman
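The two quantities named in this abstract, the cosine distance between binary spectra and a difference in proportions of occurrence per variable, can be illustrated with a minimal Python sketch. The function names, the toy data, and the plain "sort by score" ranking are assumptions made here for illustration; the abstract does not give the authors' smoothing step or their data-driven cutoff, so neither is reproduced.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Return 1 - cosine similarity of two presence/absence vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumed convention: an empty spectrum is maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dipps_scores(spectra: Sequence[Sequence[int]],
                 in_cluster: Sequence[bool]) -> List[float]:
    """Per variable: proportion of occurrence inside the cluster
    minus proportion of occurrence outside it."""
    n_in = sum(1 for flag in in_cluster if flag)
    n_out = len(spectra) - n_in
    if n_in == 0 or n_out == 0:
        raise ValueError("both groups must contain at least one spectrum")
    scores = []
    for j in range(len(spectra[0])):
        count_in = sum(row[j] for row, flag in zip(spectra, in_cluster) if flag)
        count_out = sum(row[j] for row, flag in zip(spectra, in_cluster) if not flag)
        scores.append(count_in / n_in - count_out / n_out)
    return scores


def rank_variables(scores: Sequence[float]) -> List[int]:
    """Variable indices ordered from most to least cluster-specific."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: 4 spectra, 3 ion variables (1 = ion present).
toy_spectra = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 1, 0],
]
toy_labels = [True, True, False, False]  # membership in the cluster of interest
toy_scores = dipps_scores(toy_spectra, toy_labels)
toy_ranking = rank_variables(toy_scores)
toy_distance = cosine_distance(toy_spectra[0], toy_spectra[1])
```

In this toy case the first variable occurs only inside the cluster and the second only outside it, so they sit at the two ends of the ranking.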

    Identification of Histoplasma-specific peptides in human urine

    Pre-print
    Histoplasmosis is a severe infection caused by a dimorphic fungus, which is often difficult to diagnose due to similarity in symptoms to other diseases and lack of specific diagnostic tests. Urine samples from Histoplasma-antigen-positive patients and appropriate controls were prepared using various sample preparation strategies including immunoenrichment, ultrafiltration, high-abundance protein depletion, deglycosylation, reverse-phase fractionation, and digestion with various enzymes. Samples were then analyzed by nanospray tandem mass spectrometry. Accurate mass TOF scans underwent molecular feature extraction and statistical analysis for unique disease markers, and acquired MS/MS data were searched against known human and Histoplasma proteins. In human urine, some 52 peptides from 37 Histoplasma proteins were identified with high confidence. This is the first report of identification of a large number of Histoplasma-specific peptides from immunoassay-positive patient samples using tandem mass spectrometry and bioinformatics techniques. These findings may lead to novel diagnostic markers for histoplasmosis in human urine.

    NITPICK: peak identification for mass spectrometry data

    Background. The reliable extraction of features from mass spectra is a fundamental step in the automated analysis of proteomic mass spectrometry (MS) experiments. Results. This contribution proposes a sparse template regression approach to peak picking called NITPICK. NITPICK is a Non-greedy, Iterative Template-based peak PICKer that deconvolves complex overlapping isotope distributions in multicomponent mass spectra. NITPICK is based on fractional averagine, a novel extension to Senko's well-known averagine model, and on a modified version of sparse, non-negative least angle regression, for which a suitable, statistically motivated early stopping criterion has been derived. The strength of NITPICK is the deconvolution of overlapping mixture mass spectra. Conclusion. Extensive comparative evaluation has been carried out and results are provided for simulated and real-world data sets. NITPICK outperforms pepex, to date the only alternative, publicly available, non-greedy feature extraction routine. NITPICK is available as a software package for the R programming language and can be downloaded from http://hci.iwr.uni-heidelberg.de/mip/proteomics/.
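The idea of template regression with non-negative amplitudes can be shown on a toy problem. The sketch below is not NITPICK: it swaps in plain non-negative least squares solved by projected gradient descent, whereas the abstract describes a modified sparse, non-negative least angle regression with an early stopping criterion. The templates are made-up shapes, not fractional averagine isotope patterns, and all names are hypothetical.

```python
from typing import List, Sequence


def nnls_projected_gradient(templates: Sequence[Sequence[float]],
                            observed: Sequence[float],
                            n_iter: int = 500) -> List[float]:
    """Minimise 0.5 * ||sum_k a_k * template_k - observed||^2 subject to a_k >= 0."""
    n_points = len(observed)
    # Step size 1/L: the sum of squared template entries is the trace of A^T A,
    # which bounds its largest eigenvalue and so keeps the iteration stable.
    lipschitz = sum(v * v for t in templates for v in t)
    amplitudes = [0.0] * len(templates)
    for _ in range(n_iter):
        fitted = [sum(a * t[i] for a, t in zip(amplitudes, templates))
                  for i in range(n_points)]
        residual = [f - y for f, y in zip(fitted, observed)]
        gradient = [sum(t[i] * residual[i] for i in range(n_points))
                    for t in templates]
        # Gradient step followed by projection onto the non-negative orthant.
        amplitudes = [max(0.0, a - g / lipschitz)
                      for a, g in zip(amplitudes, gradient)]
    return amplitudes


# Two hypothetical overlapping peak templates on a 5-point m/z grid.
template_a = [1.0, 0.6, 0.2, 0.0, 0.0]
template_b = [0.0, 1.0, 0.6, 0.2, 0.0]
# Noise-free mixture: 3 parts of template_a plus 2 parts of template_b.
mixture = [3.0 * a + 2.0 * b for a, b in zip(template_a, template_b)]
estimated = nnls_projected_gradient([template_a, template_b], mixture)
```

Because the mixture is noise-free and the templates are linearly independent, the estimated amplitudes approach the mixing weights used to build it. Real spectra would need a sparsity mechanism and a stopping rule, which is where the approach described in the abstract differs.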

    Trapped ion mobility spectrometry and PASEF enable in-depth lipidomics from minimal sample amounts

    A comprehensive characterization of the lipidome from limited starting material remains very challenging. Here we report a high-sensitivity lipidomics workflow based on nanoflow liquid chromatography and trapped ion mobility spectrometry (TIMS). Taking advantage of parallel accumulation-serial fragmentation (PASEF), we fragment on average 15 precursors in each 100 ms TIMS scan, while maintaining the full mobility resolution of co-eluting isomers. The acquisition speed of over 100 Hz allows us to obtain MS/MS spectra of the vast majority of isotope patterns. Analyzing 1 µL of human plasma, PASEF increases the number of identified lipids more than three times over standard TIMS-MS/MS, achieving attomole sensitivity. Building on high intra- and inter-laboratory precision and accuracy of TIMS collisional cross sections (CCS), we compile 1856 lipid CCS values from plasma, liver and cancer cells. Our study establishes PASEF in lipid analysis and paves the way for sensitive, ion mobility-enhanced lipidomics in four dimensions.

    Multiplierz: An Extensible API Based Desktop Environment for Proteomics Data Analysis

    BACKGROUND. Efficient analysis of results from mass spectrometry-based proteomics experiments requires access to disparate data types, including native mass spectrometry files, output from algorithms that assign peptide sequence to MS/MS spectra, and annotation for proteins and pathways from various database sources. Moreover, proteomics technologies and experimental methods are not yet standardized; hence a high degree of flexibility is necessary for efficient support of high- and low-throughput data analytic tasks. Development of a desktop environment that is sufficiently robust for deployment in data analytic pipelines, and simultaneously supports customization for programmers and non-programmers alike, has proven to be a significant challenge. RESULTS. We describe multiplierz, a flexible and open-source desktop environment for comprehensive proteomics data analysis. We use this framework to expose a prototype version of our recently proposed common API (mzAPI) designed for direct access to proprietary mass spectrometry files. In addition to routine data analytic tasks, multiplierz supports generation of information-rich, portable spreadsheet-based reports. Moreover, multiplierz is designed around a "zero infrastructure" philosophy, meaning that it can be deployed by end users with little or no system administration support. Finally, access to multiplierz functionality is provided via high-level Python scripts, resulting in a fully extensible data analytic environment for rapid development of custom algorithms and deployment of high-throughput data pipelines. CONCLUSION. Collectively, mzAPI and multiplierz facilitate a wide range of data analysis tasks, spanning technology development to biological annotation, for mass spectrometry-based proteomics research.
    Dana-Farber Cancer Institute; National Human Genome Research Institute (P50HG004233); National Science Foundation Integrative Graduate Education and Research Traineeship grant (DGE-0654108)

    IR ion spectroscopy in a combined approach with MS/MS and IM-MS to discriminate epimeric anthocyanin glycosides (cyanidin 3-O-glucoside and -galactoside)

    Anthocyanins are widespread in plants and flowers, being responsible for their different colouring. Two representative members of this family have been selected, cyanidin 3-O-β-glucopyranoside and 3-O-β-galactopyranoside, and probed by mass spectrometry-based methods, testing their performance in discriminating between the two epimers. The native anthocyanins, delivered into the gas phase by electrospray ionization, display a comparable drift time in ion mobility mass spectrometry (IM-MS) and a common fragment, corresponding to loss of the sugar moiety, in their collision induced dissociation (CID) pattern. However, the IR multiple photon dissociation (IRMPD) spectra in the fingerprint range show a feature particularly evident in the case of the glucoside. This signature is used to identify the presence of cyanidin 3-O-β-glucopyranoside in a natural extract of pomegranate. In an effort to increase any differentiation between the two epimers, aluminum complexes were prepared and sampled for elemental composition by FT-ICR-MS. CID experiments now display an extensive fragmentation pattern, showing a few product ions peculiar to each species. More noteworthy is the IRMPD behavior in the OH stretching range, which shows significant differences in the spectra of the two epimers. DFT calculations allow the observed distinct bands to be interpreted in terms of a varied network of hydrogen bonding and relative conformer stability.

    Mining whole sample mass spectrometry proteomics data for biomarkers: an overview

    In this paper we aim to provide a concise overview of designing and conducting an MS proteomics experiment in such a way as to allow statistical analysis that may lead to the discovery of novel biomarkers. We provide a summary of the various stages that make up such an experiment, highlighting the need for experimental goals to be decided upon in advance. We discuss issues in experimental design at the sample collection stage, and good practice for standardising protocols within the proteomics laboratory. We then describe approaches to the data mining stage of the experiment, including the processing steps that transform a raw mass spectrum into a useable form. We propose a permutation-based procedure for determining the significance of reported error rates. Finally, because of its general advantages in speed and cost, we suggest that MS proteomics may be a good candidate for an early primary screening approach to disease diagnosis, identifying areas of risk and making referrals for more specific tests without necessarily making a diagnosis in its own right. Our discussion is illustrated with examples drawn from experiments on bovine blood serum conducted in the Centre for Proteomic Research (CPR) at Southampton University.
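A generic label-permutation test gives a feel for how the significance of a reported error rate can be assessed. The abstract does not spell out the authors' procedure, so the sketch below is an assumption throughout: it uses a leave-one-out nearest-class-mean classifier on a single hypothetical feature, made-up intensity values, and the common (1 + hits) / (1 + permutations) p-value estimate.

```python
import random
from typing import List, Sequence, Tuple


def loo_error_rate(values: Sequence[float], labels: Sequence[int]) -> float:
    """Leave-one-out error of a nearest-class-mean classifier (classes 0 and 1)."""
    errors = 0
    for i, (x, y) in enumerate(zip(values, labels)):
        means = {}
        for cls in (0, 1):
            members = [v for j, (v, l) in enumerate(zip(values, labels))
                       if l == cls and j != i]
            means[cls] = sum(members) / len(members)
        predicted = 0 if abs(x - means[0]) <= abs(x - means[1]) else 1
        if predicted != y:
            errors += 1
    return errors / len(values)


def permutation_p_value(values: Sequence[float], labels: Sequence[int],
                        n_permutations: int = 200,
                        seed: int = 0) -> Tuple[float, float]:
    """Return (observed error, p-value): how often shuffled labels
    give an error rate at least as low as the observed one."""
    rng = random.Random(seed)
    observed = loo_error_rate(values, labels)
    shuffled: List[int] = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between value and label
        if loo_error_rate(values, shuffled) <= observed:
            at_least_as_good += 1
    return observed, (1 + at_least_as_good) / (1 + n_permutations)


# Hypothetical peak intensities for two well-separated groups of 8 samples each.
group_a = [1.0, 1.2, 0.9, 1.1, 0.8, 1.3, 1.05, 0.7]
group_b = [3.0, 3.2, 2.9, 3.1, 2.8, 3.3, 3.05, 2.7]
sample_values = group_a + group_b
sample_labels = [0] * len(group_a) + [1] * len(group_b)
observed_error, p_value = permutation_p_value(sample_values, sample_labels)
```

A small p-value here says that an error rate this low is rarely reached when the labels carry no information, which is the kind of check that guards against over-optimistic reported error rates.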