A Spectral Algorithm for Learning Hidden Markov Models
Hidden Markov Models (HMMs) are one of the most fundamental and widely used
statistical tools for modeling discrete time series. In general, learning HMMs
from data is computationally hard (under cryptographic assumptions), and
practitioners typically resort to search heuristics which suffer from the usual
local optima issues. We prove that under a natural separation condition (bounds
on the smallest singular value of the HMM parameters), there is an efficient
and provably correct algorithm for learning HMMs. The sample complexity of the
algorithm does not explicitly depend on the number of distinct (discrete)
observations---it implicitly depends on this quantity through spectral
properties of the underlying HMM. This makes the algorithm particularly
applicable to settings with a large number of observations, such as those in
natural language processing where the space of observations is sometimes the
words in a language. The algorithm is also simple, employing only a singular
value decomposition and matrix multiplications.
Comment: Published in JCSS Special Issue "Learning Theory 2009".
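The abstract's only numerical primitives are a singular value decomposition and matrix multiplications. As a hedged sketch of the SVD step (a synthetic HMM with illustrative sizes and parameters; not the paper's full method, which also builds per-observation operators from triple statistics):

```python
import numpy as np

# Hedged sketch: estimate the observation-pair matrix of a small
# synthetic HMM and take its top-m singular vectors, where m is the
# number of hidden states. All sizes and parameters are illustrative.

rng = np.random.default_rng(0)
m, n = 2, 5                                # hidden states, observations

T = np.array([[0.9, 0.1],                  # T[s] = next-state distribution
              [0.2, 0.8]])                 # given current state s
O = rng.dirichlet(np.ones(n), size=m)      # O[s] = observation distribution
pi = np.array([0.5, 0.5])                  # initial state distribution

def sample_pair():
    """Draw the first two observations (x1, x2) of one HMM run."""
    s1 = rng.choice(m, p=pi)
    x1 = rng.choice(n, p=O[s1])
    s2 = rng.choice(m, p=T[s1])
    x2 = rng.choice(n, p=O[s2])
    return x1, x2

# Empirical pair-occurrence matrix P21[i, j] ~ Pr(x2 = i, x1 = j).
num_pairs = 20000
P21 = np.zeros((n, n))
for _ in range(num_pairs):
    x1, x2 = sample_pair()
    P21[x2, x1] += 1.0
P21 /= num_pairs

# P21 has rank at most m, so its top-m left singular vectors capture
# the subspace the algorithm works in; the remaining singular values
# reflect only sampling noise.
U, sv, _ = np.linalg.svd(P21)
U = U[:, :m]
print("top singular values:", np.round(sv[:3], 4))
```

The gap between the m-th and (m+1)-th singular values is exactly the kind of spectral separation the abstract's condition is about.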
Posterior concentration rates for empirical Bayes procedures, with applications to Dirichlet Process mixtures
In this paper we provide general conditions to check on the model and the
prior to derive posterior concentration rates for data-dependent priors (or
empirical Bayes approaches). We aim to provide conditions close to those in
the seminal paper by Ghosal and van der Vaart (2007a). We then apply the
general theorem to two different settings: the
estimation of a density using Dirichlet process mixtures of Gaussian random
variables with base measure depending on some empirical quantities and the
estimation of the intensity of a counting process under the Aalen model. A
simulation study for inhomogeneous Poisson processes also illustrates our
results. In the former case we also derive some results on the estimation of
the mixing density and on the deconvolution problem. In the latter, we provide
a general theorem on posterior concentration rates for counting processes with
Aalen multiplicative intensity with priors not depending on the data.
Comment: With supplementary material.
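The simulation study mentioned in the abstract concerns inhomogeneous Poisson processes. A standard way to simulate one is Lewis-Shedler thinning (an illustrative choice with a made-up intensity, not necessarily the authors' setup):

```python
import numpy as np

def thinning(intensity, lam_max, t_end, rng):
    """Simulate an inhomogeneous Poisson process on [0, t_end] by
    Lewis-Shedler thinning: propose arrivals from a homogeneous
    process with rate lam_max >= intensity(t) everywhere, and keep
    a proposal at time t with probability intensity(t) / lam_max."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)   # next homogeneous arrival
        if t > t_end:
            return np.array(events)
        if rng.uniform() < intensity(t) / lam_max:
            events.append(t)

rng = np.random.default_rng(1)
lam = lambda t: 2.0 + np.sin(t)   # illustrative intensity, bounded by 3
events = thinning(lam, 3.0, 100.0, rng)
# expected count = integral of lam over [0, 100], roughly 201
print(len(events))
```

Posterior concentration statements for such models are about how fast a posterior over the intensity function contracts around the true `lam` as the observation window grows.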
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
A central problem in machine learning involves modeling complex data-sets
using highly flexible families of probability distributions in which learning,
sampling, inference, and evaluation are still analytically or computationally
tractable. Here, we develop an approach that simultaneously achieves both
flexibility and tractability. The essential idea, inspired by non-equilibrium
statistical physics, is to systematically and slowly destroy structure in a
data distribution through an iterative forward diffusion process. We then learn
a reverse diffusion process that restores structure in data, yielding a highly
flexible and tractable generative model of the data. This approach allows us to
rapidly learn, sample from, and evaluate probabilities in deep generative
models with thousands of layers or time steps, as well as to compute
conditional and posterior probabilities under the learned model. We
additionally release an open source reference implementation of the algorithm
- …
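The forward (structure-destroying) half of the diffusion described above can be sketched for the Gaussian case; the linear noise schedule below is illustrative, not the paper's parameterization:

```python
import numpy as np

# Forward (noising) half of a Gaussian diffusion: repeatedly mix the
# data with small amounts of Gaussian noise until all structure is
# destroyed. For Gaussian noise, x_t given x_0 has the closed form
# used in diffuse(), so any step can be sampled directly.

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)         # per-step noise variances (illustrative)
alphas_bar = np.cumprod(1.0 - betas)       # cumulative signal retention

def diffuse(x0, t):
    """Sample x_t given x_0:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise

x0 = rng.uniform(-1.0, 1.0, size=5000)     # structured "data"
x_end = diffuse(x0, T - 1)                 # after all steps: close to N(0, 1)
print("std after full diffusion:", round(float(np.std(x_end)), 3))
```

The generative model is the learned reversal of this chain: starting from pure noise, a trained model undoes one small noising step at a time.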