
    Bayesian nonparametric models for peak identification in MALDI-TOF mass spectroscopy

    We present a novel nonparametric Bayesian approach based on Lévy Adaptive Regression Kernels (LARK) to model spectral data arising from MALDI-TOF (Matrix Assisted Laser Desorption Ionization Time-of-Flight) mass spectrometry. This model-based approach provides identification and quantification of proteins through model parameters that are directly interpretable as the number of proteins, the mass and abundance of each protein, and peak resolution, while retaining the ability to adapt to unknown smoothness as in wavelet-based methods. Informative prior distributions on resolution are key to distinguishing true peaks from background noise and to resolving broad peaks into individual peaks for multiple protein species. Posterior distributions are obtained using a reversible jump Markov chain Monte Carlo algorithm and provide inference about the number of peaks (proteins), their masses and their abundance. We show through simulation studies that the procedure has desirable true-positive and false-discovery rates. Finally, we illustrate the method on five example spectra: a blank spectrum; a spectrum with only the matrix of a low-molecular-weight substance used to embed target proteins; a spectrum with known proteins; and a single spectrum and an average of ten spectra from an individual lung cancer patient. Comment: Published at http://dx.doi.org/10.1214/10-AOAS450 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
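As an illustration of the kind of forward model the abstract describes, here is a minimal sketch that generates a spectrum as a superposition of Gaussian peaks whose widths are governed by a resolution parameter. This is not the article's actual kernel, prior, or inference scheme; all function names, parameter names, and numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical forward model: a spectrum is a sum of J Gaussian peaks, each
# with a mass location, an abundance, and a resolution-driven width, plus
# baseline noise.  (The article infers J, the masses, the abundances, and
# the resolution from data; here they are fixed for illustration.)
def spectrum(masses, peak_mass, abundance, resolution, noise_sd=0.02):
    f = np.zeros_like(masses)
    for m_j, b_j in zip(peak_mass, abundance):
        width = m_j / resolution          # peak width grows with mass
        f += b_j * np.exp(-0.5 * ((masses - m_j) / width) ** 2)
    return f + rng.normal(0.0, noise_sd, size=masses.shape)

m = np.linspace(2000.0, 20000.0, 4000)    # m/z grid
y = spectrum(m,
             peak_mass=[4000.0, 9000.0, 15000.0],
             abundance=[1.0, 0.6, 0.3],
             resolution=200.0)
print(y.shape)
```

Peak identification then amounts to recovering the number, locations, and heights of such bumps from the noisy trace `y`, which the article does via a reversible jump MCMC posterior.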

    Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels

    This article describes a new class of prior distributions for nonparametric function estimation. The unknown function is modeled as a limit of weighted sums of kernels, or generator functions, indexed by continuous parameters that control local and global features such as their translation, dilation, modulation and shape. Lévy random fields and their stochastic integrals are employed to induce prior distributions for the unknown functions or, equivalently, for the number of kernels and for the parameters governing their features. Scaling, shape and other features of the generating functions are location-specific, allowing quite different function properties in different parts of the space, as with wavelet bases and other methods employing overcomplete dictionaries. We provide conditions under which the stochastic expansions converge in specified Besov or Sobolev norms. Under a Gaussian error model, this may be viewed as a sparse regression problem, with regularization induced via the Lévy random field prior distribution. Posterior inference for the unknown functions is based on a reversible jump Markov chain Monte Carlo algorithm. We compare the Lévy Adaptive Regression Kernel (LARK) method to wavelet-based methods using some of the standard test functions, and illustrate its flexibility and adaptability in nonstationary applications. Comment: Published at http://dx.doi.org/10.1214/11-AOS889 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
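A finite (compound-Poisson) approximation to such a stochastic expansion can be sketched as follows: draw a Poisson number of kernels, give each a random location, dilation, and signed weight, and sum. The Poisson rate, the Laplace weight law, and the Gaussian generator below are illustrative stand-ins, not the specific Lévy measure or kernel family used in the article.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch of a finite approximation to a LARK-style expansion:
#   f(x) = sum_j beta_j * g((x - chi_j) / lambda_j)
# where the number of kernels J, their locations chi_j, dilations lambda_j,
# and weights beta_j are all random.  Distributional choices are illustrative.
def lark_draw(x, rate=10.0, scale_range=(0.02, 0.2)):
    J = rng.poisson(rate)                    # random number of kernels
    chi = rng.uniform(x.min(), x.max(), J)   # kernel locations (translation)
    lam = rng.uniform(*scale_range, J)       # per-kernel dilations
    beta = rng.laplace(0.0, 1.0, J)          # signed weights
    f = np.zeros_like(x)
    for j in range(J):
        f += beta[j] * np.exp(-0.5 * ((x - chi[j]) / lam[j]) ** 2)
    return f

x = np.linspace(0.0, 1.0, 500)
f = lark_draw(x)
print(f.shape)
```

Because each kernel carries its own dilation, a single draw can be smooth in one region and sharply varying in another, which is the adaptivity the abstract contrasts with fixed wavelet bases.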

    A Unified Conditional Frequentist and Bayesian Test for Fixed and Sequential Simple Hypothesis Testing

    Preexperimental frequentist error probabilities are arguably inadequate, as summaries of evidence from data, in many hypothesis-testing settings. The conditional frequentist may respond to this by identifying certain subsets of the outcome space and reporting a conditional error probability, given the subset of the outcome space in which the observed data lie. Statistical methods consistent with the likelihood principle, including Bayesian methods, avoid the problem by a more extreme form of conditioning. In this paper we prove that the conditional frequentist's method can be made exactly equivalent to the Bayesian's in simple versus simple hypothesis testing: specifically, we find a conditioning strategy for which the conditional frequentist's reported conditional error probabilities are the same as the Bayesian's posterior probabilities of error. A conditional frequentist who uses such a strategy can exploit other features of the Bayesian approach: for example, the validity of sequential hypothesis tests (including versions of the sequential probability ratio test, or SPRT) even if the stopping rule is incompletely specified.
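The quantity being matched is the Bayesian posterior probability of error, which in a simple-versus-simple test is a monotone function of the likelihood ratio. A minimal numeric sketch for testing H0: N(0,1) versus H1: N(1,1) with equal prior odds follows; the conditioning partition the paper actually constructs is not reproduced here, and the SPRT boundary values are illustrative.

```python
import math

# Bayes factor B = L0/L1 for iid N(mu, sigma^2) data under the two simple
# hypotheses; with equal prior odds, the posterior probability of H0 is
# B / (1 + B).  Reporting this upon rejection is the Bayesian-matching
# conditional error probability discussed in the abstract.
def posterior_prob_H0(xs, mu0=0.0, mu1=1.0, sigma=1.0):
    loglr = sum(-0.5 * ((x - mu0) / sigma) ** 2
                + 0.5 * ((x - mu1) / sigma) ** 2 for x in xs)
    B = math.exp(loglr)
    return B / (1.0 + B)

# SPRT-style stopping on the same likelihood ratio L0/L1: accept H0 when the
# ratio grows large, reject H0 when it shrinks small.  Boundaries are
# illustrative choices, not calibrated error rates.
def sprt(stream, lower=0.1, upper=10.0, mu0=0.0, mu1=1.0, sigma=1.0):
    loglr = 0.0
    for n, x in enumerate(stream, start=1):
        loglr += (-0.5 * ((x - mu0) / sigma) ** 2
                  + 0.5 * ((x - mu1) / sigma) ** 2)
        if math.exp(loglr) >= upper:
            return "accept H0", n
        if math.exp(loglr) <= lower:
            return "reject H0", n
    return "undecided", n

print(round(posterior_prob_H0([0.0, 0.0, 0.0]), 3))
print(sprt([1.0] * 20))
```

Because the reported error probability depends on the data only through the likelihood ratio, it is unaffected by when the sampling stopped, which is why the equivalence extends to sequential tests with incompletely specified stopping rules.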