Bayesian nonparametric models for peak identification in MALDI-TOF mass spectroscopy
We present a novel nonparametric Bayesian approach based on Lévy Adaptive
Regression Kernels (LARK) to model spectral data arising from MALDI-TOF (Matrix
Assisted Laser Desorption Ionization Time-of-Flight) mass spectrometry. This
model-based approach provides identification and quantification of proteins
through model parameters that are directly interpretable as the number of
proteins, mass and abundance of proteins and peak resolution, while having the
ability to adapt to unknown smoothness as in wavelet based methods. Informative
prior distributions on resolution are key to distinguishing true peaks from
background noise and resolving broad peaks into individual peaks for multiple
protein species. Posterior distributions are obtained using a reversible jump
Markov chain Monte Carlo algorithm and provide inference about the number of
peaks (proteins), their masses and abundance. We show through simulation
studies that the procedure has desirable true-positive and false-discovery
rates. Finally, we illustrate the method on five example spectra: a blank
spectrum, a spectrum with only the matrix of a low-molecular-weight substance
used to embed target proteins, a spectrum with known proteins, and a single
spectrum and average of ten spectra from an individual lung cancer patient.
Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/10-AOAS450 by the Institute of Mathematical Statistics (http://www.imstat.org).
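The abstract above describes spectra as model-based sums of peaks whose parameters map directly to protein mass, abundance, and resolution. A minimal sketch of such a forward model is below; the Gaussian kernel shape, the function name `lark_spectrum`, and all numeric values are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

# Sketch of a LARK-style forward model for a MALDI-TOF spectrum: each peak
# is one kernel with interpretable parameters (mass location m, abundance a,
# and a resolution-linked width s). The Gaussian kernel choice and all
# numbers here are illustrative assumptions.

def lark_spectrum(mz, masses, abundances, widths):
    """Sum of Gaussian kernels, one per protein peak."""
    spectrum = np.zeros_like(mz, dtype=float)
    for m, a, s in zip(masses, abundances, widths):
        spectrum += a * np.exp(-0.5 * ((mz - m) / s) ** 2)
    return spectrum

mz = np.linspace(1000.0, 5000.0, 2000)   # mass-to-charge grid
masses = [1500.0, 3200.0, 3260.0]        # the last two peaks overlap
abundances = [4.0, 2.5, 1.8]
widths = [15.0, 20.0, 20.0]              # resolution controls peak width

clean = lark_spectrum(mz, masses, abundances, widths)
noisy = clean + np.random.default_rng(0).normal(0.0, 0.05, mz.size)
```

In the paper's fully Bayesian treatment the number of kernels is itself a model parameter, so a reversible jump MCMC sampler would add, delete, and adjust kernels against data like `noisy`, with an informative prior on the widths separating true peaks from background noise.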
Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels
This article describes a new class of prior distributions for nonparametric
function estimation. The unknown function is modeled as a limit of weighted
sums of kernels or generator functions indexed by continuous parameters that
control local and global features such as their translation, dilation,
modulation and shape. Lévy random fields and their stochastic integrals are
employed to induce prior distributions for the unknown functions or,
equivalently, for the number of kernels and for the parameters governing their
features. Scaling, shape, and other features of the generating functions are
location-specific to allow quite different function properties in different
parts of the space, as with wavelet bases and other methods employing
overcomplete dictionaries. We provide conditions under which the stochastic
expansions converge in specified Besov or Sobolev norms. Under a Gaussian error
model, this may be viewed as a sparse regression problem, with regularization
induced via the Lévy random field prior distribution. Posterior inference
for the unknown functions is based on a reversible jump Markov chain Monte
Carlo algorithm. We compare the Lévy Adaptive Regression Kernel (LARK)
method to wavelet-based methods using some of the standard test functions, and
illustrate its flexibility and adaptability in nonstationary applications.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/11-AOS889 by the Institute of Mathematical Statistics (http://www.imstat.org).
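The prior described above puts a random number of kernels, each with its own translation and dilation, on the function. One draw from a finite-activity (compound Poisson) version of such a prior can be sketched as follows; the Gaussian generator, the intensity `nu`, and the weight/width distributions are illustrative assumptions rather than the paper's specific Lévy measure.

```python
import numpy as np

# One draw from a compound-Poisson (finite-activity Levy) prior over kernel
# expansions f(x) = sum_j beta_j * g((x - chi_j) / lambda_j). The generator
# g (Gaussian), intensity nu, and the weight and width distributions below
# are illustrative assumptions.

rng = np.random.default_rng(1)

def draw_expansion(x, nu=10.0):
    J = rng.poisson(nu)                       # random number of kernels
    chi = rng.uniform(x.min(), x.max(), J)    # translations
    lam = rng.uniform(0.02, 0.2, J)           # dilations: local smoothness
    beta = rng.normal(0.0, 1.0, J)            # signed weights
    f = np.zeros_like(x)
    for c, l, b in zip(chi, lam, beta):
        f += b * np.exp(-0.5 * ((x - c) / l) ** 2)   # generator g
    return J, f

x = np.linspace(0.0, 1.0, 500)
J, f = draw_expansion(x)
```

Because each kernel carries its own dilation `lam`, different regions of `x` can exhibit different smoothness in a single draw, which is the location-specific adaptivity the abstract contrasts with global bases.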
A Unified Conditional Frequentist and Bayesian Test for Fixed and Sequential Simple Hypothesis Testing
Preexperimental frequentist error probabilities are arguably inadequate, as summaries of evidence from data, in many hypothesis-testing settings. The conditional frequentist may respond to this by identifying certain subsets of the outcome space and reporting a conditional error probability, given the subset of the outcome space in which the observed data lie. Statistical methods consistent with the likelihood principle, including Bayesian methods, avoid the problem by a more extreme form of conditioning.
In this paper we prove that the conditional frequentist's method can be made exactly equivalent to the Bayesian's in simple versus simple hypothesis testing: specifically, we find a conditioning strategy for which the conditional frequentist's reported conditional error probabilities are the same as the Bayesian's posterior probabilities of error. A conditional frequentist who uses such a strategy can exploit other features of the Bayesian approach, for example, the validity of sequential hypothesis tests (including versions of the sequential probability ratio test, or SPRT) even if the stopping rule is incompletely specified.
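The quantity the Bayesian reports after rejecting H0 is the posterior probability of H0 given the data, which under equal prior odds reduces to 1 / (1 + LR) with LR the likelihood ratio. The sketch below computes this for an illustrative simple-vs-simple Gaussian pair; the specific likelihoods and the helper name `posterior_error_after_rejection` are assumptions for illustration, not the paper's setup, whose contribution is that a suitable conditioning strategy makes the frequentist's conditional error probability equal this value.

```python
import math

# Posterior probability of error after rejecting H0 in an illustrative
# simple-vs-simple Gaussian test, assuming equal prior odds. The means and
# variance are illustrative assumptions.

def posterior_error_after_rejection(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """P(H0 | x) under equal priors for H0: N(mu0, sigma) vs H1: N(mu1, sigma)."""
    log_lr = ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)
    lr = math.exp(log_lr)        # likelihood ratio p1(x) / p0(x)
    return 1.0 / (1.0 + lr)

# Strong evidence for H1 means a small probability that rejecting H0
# was an error.
posterior_error_after_rejection(2.0)   # ~0.182
```

Unlike a preexperimental error rate, this report depends on the observed `x`, which is exactly the data-dependent summary of evidence the abstract argues for.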