Convergence in distribution for filtering processes associated to Hidden Markov Models with densities
Consider a filtering process associated to a hidden Markov model with
densities for which both the state space and the observation space are
complete separable metric spaces. If the underlying hidden Markov chain is
strongly ergodic and the filtering process fulfills a certain coupling
condition, we prove that, in the limit, the distribution of the filtering
process is independent of the initial distribution of the hidden Markov chain.
If, furthermore, the hidden Markov chain is uniformly ergodic, then we prove
that the filtering process converges in distribution.
Comment: 54 pages, revision. Rewritten introduction. Theorem 12.1 sharper than
Theorem 16.1 (v1). Proofs and results reorganised. Example 18.3 (v1) excluded
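For context, the filtering process above is the sequence of conditional laws of
the hidden state given the observations so far. In an HMM with transition
kernel Q and observation density g, a standard form of the filter recursion
(our notation, not necessarily the paper's) is:

    \pi_t(\cdot) = \mathbb{P}\bigl(X_t \in \cdot \mid Y_1, \dots, Y_t\bigr),
    \qquad
    \pi_{t+1}(A) =
      \frac{\int_A \int g(y_{t+1} \mid x)\, Q(x', \mathrm{d}x)\, \pi_t(\mathrm{d}x')}
           {\int \int g(y_{t+1} \mid x)\, Q(x', \mathrm{d}x)\, \pi_t(\mathrm{d}x')}.

The sequence (\pi_t) is itself a measure-valued Markov process, and the results
above concern how its distribution depends on, and eventually forgets, the
initial distribution of the hidden chain.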
Decoding Hidden Markov Models Faster Than Viterbi Via Online Matrix-Vector (max, +)-Multiplication
In this paper, we present a novel algorithm for the maximum a posteriori
decoding (MAPD) of time-homogeneous Hidden Markov Models (HMM), improving the
worst-case running time of the classical Viterbi algorithm by a logarithmic
factor. In our approach, we interpret the Viterbi algorithm as a repeated
computation of matrix-vector $(\max, +)$-multiplications. On time-homogeneous
HMMs, this computation is online: a matrix, known in advance, has to be
multiplied with several vectors revealed one at a time. Our main contribution
is an algorithm solving this version of matrix-vector $(\max, +)$-multiplication
in subquadratic time, by performing a polynomial preprocessing of the matrix.
Employing this fast multiplication algorithm, we solve the MAPD problem in
$O(mn^2/\log n)$ time for any time-homogeneous HMM of size $n$ and observation
sequence of length $m$, with an extra polynomial preprocessing cost negligible
for $m > n$. To the best of our knowledge, this is the first algorithm for the
MAPD problem requiring subquadratic time per observation, under the only
assumption -- usually verified in practice -- that the transition probability
matrix does not change with time.
Comment: AAAI 2016, to appear
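To make the (max, +) interpretation concrete, here is a minimal sketch (ours,
for illustration; it is the naive quadratic-per-step version, not the paper's
subquadratic algorithm) of one Viterbi update written as a (max, +)
matrix-vector product over log-probabilities:

    import numpy as np

    def maxplus_matvec(M, v):
        # (max, +) product: out[i] = max_j (M[i, j] + v[j]).
        # Naive cost is O(n^2); the paper preprocesses the fixed matrix M
        # so that a sequence of such products runs in subquadratic time each.
        return np.max(M + v[None, :], axis=1)

    def viterbi_step(log_trans, log_emit, log_delta):
        # log_trans[i, j]: log-probability of moving from state j to state i
        #                  (fixed over time, since the HMM is time-homogeneous).
        # log_emit[i]:     log-probability of the current observation in state i.
        # log_delta[j]:    best log-score of any state path ending in state j.
        return log_emit + maxplus_matvec(log_trans, log_delta)

Because the transition matrix is the same at every step, decoding reduces to an
online sequence of (max, +) products against one fixed matrix, which is exactly
the structure the polynomial preprocessing exploits.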
Smoothed Analysis in Unsupervised Learning via Decoupling
Smoothed analysis is a powerful paradigm in overcoming worst-case
intractability in unsupervised learning and high-dimensional data analysis.
While polynomial time smoothed analysis guarantees have been obtained for
worst-case intractable problems like tensor decompositions and learning
mixtures of Gaussians, such guarantees have been hard to obtain for several
other important problems in unsupervised learning. A core technical challenge
in analyzing algorithms is obtaining lower bounds on the least singular value
for random matrix ensembles with dependent entries that are given by
low-degree polynomials of a few base underlying random variables.
In this work, we address this challenge by obtaining high-confidence lower
bounds on the least singular value of new classes of structured random matrix
ensembles of the above kind. We then use these bounds to design algorithms with
polynomial time smoothed analysis guarantees for the following three important
problems in unsupervised learning:
1. Robust subspace recovery, when the fraction $\alpha$ of inliers in the
d-dimensional subspace $T \subset \mathbb{R}^n$ is at least $(d/n)^{\ell}$ for
any constant integer $\ell > 0$. This contrasts with the known worst-case
intractability when $\alpha < d/n$, and the previous smoothed analysis result,
which needed $\alpha > d/n$ (Hardt and Moitra, 2013).
2. Learning overcomplete hidden Markov models, where the size of the state
space is any polynomial in the dimension of the observations. This gives the
first polynomial time guarantees for learning overcomplete HMMs in a smoothed
analysis model.
3. Higher order tensor decompositions, where we generalize the so-called
FOOBI algorithm of Cardoso to find order-$\ell$ rank-one tensors in a subspace.
This allows us to obtain polynomially robust decomposition algorithms for
$2\ell$'th order tensors with rank up to $n^{\ell}$.
Comment: 44 pages
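As a toy illustration of the quantity driving the analysis (our example, not
the paper's construction), one can build a matrix whose columns are low-degree
polynomials, here flattened outer products, of slightly perturbed base vectors
and inspect its least singular value:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, rho = 10, 40, 0.01  # ambient dimension, number of vectors, perturbation

    base = rng.standard_normal((n, d))                   # arbitrary base vectors
    smoothed = base + rho * rng.standard_normal((n, d))  # smoothed perturbation

    # Each column is the degree-2 monomial map x -> vec(x x^T) of one perturbed
    # vector: a random matrix ensemble with dependent entries, as in the abstract.
    cols = np.stack([np.outer(x, x).ravel() for x in smoothed.T], axis=1)  # (n*n, d)

    sigma_min = np.linalg.svd(cols, compute_uv=False).min()
    print(f"least singular value: {sigma_min:.3e}")

The paper's technical results are high-confidence lower bounds on this least
singular value that hold for worst-case base vectors, which is what the three
applications then consume.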
Identifiability of parameters in latent structure models with many observed variables
While hidden class models of various types arise in many statistical
applications, it is often difficult to establish the identifiability of their
parameters. Focusing on models in which there is some structure of conditional
independence among the observed variables given the hidden ones, we demonstrate a
general approach for establishing identifiability utilizing algebraic
arguments. A theorem of J. Kruskal for a simple latent-class model with finite
state space lies at the core of our results, though we apply it to a diverse
set of models. These include mixtures of both finite and nonparametric product
distributions, hidden Markov models and random graph mixture models, and lead
to a number of new results and improvements to old ones. In the parametric
setting, this approach indicates that for such models, the classical definition
of identifiability is typically too strong. Instead, generic identifiability
holds, which implies that the set of nonidentifiable parameters has measure
zero, so that parameter inference is still meaningful. In particular, this
sheds light on the properties of finite mixtures of Bernoulli products, which
have been used for decades despite being known to have nonidentifiable
parameters. In the nonparametric setting, we again obtain identifiability only
when certain restrictions are placed on the distributions that are mixed, but
we explicitly describe the conditions.
Comment: Published at http://dx.doi.org/10.1214/09-AOS689 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
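For reference, the theorem of Kruskal invoked here gives uniqueness of
three-way decompositions under a condition on Kruskal ranks, where the Kruskal
rank k_A of a matrix A is the largest k such that every k columns of A are
linearly independent:

    \mathrm{k}_A + \mathrm{k}_B + \mathrm{k}_C \;\ge\; 2r + 2
    \quad\Longrightarrow\quad
    \sum_{i=1}^{r} a_i \otimes b_i \otimes c_i
    \ \text{determines}\ \{(a_i, b_i, c_i)\}_{i=1}^{r}
    \ \text{up to permutation and rescaling.}

The generic-identifiability results above come from verifying this condition
for generic parameter values of each model after embedding it in a suitable
three-way array.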