16 research outputs found

    Bayesian nonparametric models for peak identification in MALDI-TOF mass spectroscopy

    We present a novel nonparametric Bayesian approach based on Lévy Adaptive Regression Kernels (LARK) to model spectral data arising from MALDI-TOF (Matrix Assisted Laser Desorption Ionization Time-of-Flight) mass spectrometry. This model-based approach provides identification and quantification of proteins through model parameters that are directly interpretable as the number of proteins, mass and abundance of proteins and peak resolution, while having the ability to adapt to unknown smoothness as in wavelet-based methods. Informative prior distributions on resolution are key to distinguishing true peaks from background noise and resolving broad peaks into individual peaks for multiple protein species. Posterior distributions are obtained using a reversible jump Markov chain Monte Carlo algorithm and provide inference about the number of peaks (proteins), their masses and abundance. We show through simulation studies that the procedure has desirable true-positive and false-discovery rates. Finally, we illustrate the method on five example spectra: a blank spectrum, a spectrum with only the matrix of a low-molecular-weight substance used to embed target proteins, a spectrum with known proteins, and a single spectrum and average of ten spectra from an individual lung cancer patient. Comment: Published at http://dx.doi.org/10.1214/10-AOAS450 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Nonparametric Models for Peak Identification and Quantification in MALDI-TOF Mass Spectroscopy

    We present a novel nonparametric Bayesian model using Lévy random field priors for identifying the presence and abundance of proteins from mass spectrometry data. Informed prior distributions, based on expert opinion and on preliminary laboratory experiments, help distinguish true peaks from background noise and help resolve uncertainty about peak multiplicity.

    TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2

    We present a novel nonparametric Bayesian approach based on Lévy Adaptive Regression Kernels (LARK) to model spectral data arising from MALDI-TOF (Matrix Assisted Laser Desorption Ionization Time-of-Flight) mass spectrometry. This model-based approach provides identification and quantification of proteins through model parameters that are directly interpretable as the number of proteins, mass and abundance of proteins and peak resolution. Informed prior distributions, based on expert opinion and on preliminary laboratory experiments, help to distinguish true peaks from background noise and help resolve uncertainty about the peak multiplicity. Posterior distributions are obtained using a reversible jump Markov chain Monte Carlo algorithm and provide inference about the number of peaks (proteins), their masses and abundance. We show through simulation studies that the procedure has desirable true- and false-discovery rates. Finally, we illustrate the method on four example spectra: a blank spectrum, a spectrum with only the matrix of a low-molecular-weight substance used to embed target proteins, and a single spectrum and average of ten spectra from an individual lung cancer patient.

    Nonparametric models for proteomic peak identification and quantification. Bayesian Inference for Gene Expression and Proteomics

    We present model-based inference for proteomic peak identification and quantification from mass spectroscopy data, focusing on nonparametric Bayesian models. Using experimental data generated from MALDI-TOF (Matrix Assisted Laser Desorption Ionization Time-of-Flight) mass spectroscopy, we model observed intensities in spectra with a hierarchical nonparametric model for expected intensity as a function of time-of-flight. We express the unknown intensity function as a sum of kernel functions, a natural choice of basis functions for modelling spectral peaks. We discuss how to place prior distributions on the unknown functions using Lévy random fields and describe posterior inference via a reversible jump Markov chain Monte Carlo algorithm.
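
As a rough illustration of the sum-of-kernels idea in the abstract above, the sketch below evaluates an expected intensity curve as a baseline plus a weighted sum of peak-shaped kernels. The Gaussian kernel, the fixed list of peaks, and the names `gaussian_kernel`, `expected_intensity`, and `peaks` are assumptions made for this example only. In the abstracts the number of peaks is unknown and inferred by reversible jump MCMC, and the kernel form is not specified in the text shown here.

```python
import math

def gaussian_kernel(t, center, width):
    """Unnormalized Gaussian bump: equals 1 at t == center (assumed kernel shape)."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)

def expected_intensity(t, peaks, baseline=0.0):
    """Mean intensity at time-of-flight t as a baseline plus a sum of kernels.

    peaks is a list of (center, abundance, width) tuples; this
    parametrization is hypothetical and chosen for readability.
    """
    return baseline + sum(
        abundance * gaussian_kernel(t, center, width)
        for center, abundance, width in peaks
    )

# Two illustrative peaks; the values are made up, not taken from any spectrum.
peaks = [(20.0, 5.0, 1.5), (35.0, 2.0, 2.0)]
grid = [float(t) for t in range(0, 51)]
spectrum = [expected_intensity(t, peaks) for t in grid]
```

Each tuple plays the role of one interpretable peak (location, abundance, resolution-like width), which mirrors how the abstracts describe reading protein mass and abundance directly from model parameters.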