Convergence in distribution for filtering processes associated to Hidden Markov Models with densities
Consider a filtering process associated to a hidden Markov model with
densities for which both the state space and the observation space are
complete separable metric spaces. If the underlying hidden Markov chain is
strongly ergodic and the filtering process fulfills a certain coupling
condition, we prove that, in the limit, the distribution of the filtering
process is independent of the initial distribution of the hidden Markov chain.
If, furthermore, the hidden Markov chain is uniformly ergodic, then we prove
that the filtering process converges in distribution.
Comment: 54 pages, revision. Rewritten introduction. Theorem 12.1 sharper than
Theorem 16.1 (v1). Proofs and results reorganised. Example 18.3 (v1) excluded
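For context, the filtering process above is the sequence of conditional laws of
the hidden state given the observations so far. In an HMM with transition
kernel Q and observation density g, a standard form of the filter recursion
(our notation, not necessarily the paper's) is:

    \pi_t(\cdot) = \mathbb{P}\bigl(X_t \in \cdot \mid Y_1, \dots, Y_t\bigr),
    \qquad
    \pi_{t+1}(A) =
      \frac{\int_A \int g(y_{t+1} \mid x)\, Q(x', \mathrm{d}x)\, \pi_t(\mathrm{d}x')}
           {\int \int g(y_{t+1} \mid x)\, Q(x', \mathrm{d}x)\, \pi_t(\mathrm{d}x')}.

The sequence (\pi_t) is itself a measure-valued Markov process, and the results
above concern how its distribution depends on, and eventually forgets, the
initial distribution of the hidden chain.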
Decoding Hidden Markov Models Faster Than Viterbi Via Online Matrix-Vector (max, +)-Multiplication
In this paper, we present a novel algorithm for the maximum a posteriori
decoding (MAPD) of time-homogeneous Hidden Markov Models (HMM), improving the
worst-case running time of the classical Viterbi algorithm by a logarithmic
factor. In our approach, we interpret the Viterbi algorithm as a repeated
computation of matrix-vector $(\max, +)$-multiplications. On time-homogeneous
HMMs, this computation is online: a matrix, known in advance, has to be
multiplied with several vectors revealed one at a time. Our main contribution
is an algorithm solving this version of matrix-vector $(\max, +)$-multiplication
in subquadratic time, by performing a polynomial preprocessing of the matrix.
Employing this fast multiplication algorithm, we solve the MAPD problem in
$O(mn^2/\log n)$ time for any time-homogeneous HMM of size $n$ and observation
sequence of length $m$, with an extra polynomial preprocessing cost negligible
for $m > n$. To the best of our knowledge, this is the first algorithm for the
MAPD problem requiring subquadratic time per observation, under the only
assumption -- usually verified in practice -- that the transition probability
matrix does not change with time.
Comment: AAAI 2016, to appear
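To make the (max, +) interpretation concrete, here is a minimal sketch (ours,
for illustration; it is the naive quadratic-per-step version, not the paper's
subquadratic algorithm) of one Viterbi update written as a (max, +)
matrix-vector product over log-probabilities:

    import numpy as np

    def maxplus_matvec(M, v):
        # (max, +) product: out[i] = max_j (M[i, j] + v[j]).
        # Naive cost is O(n^2); the paper preprocesses the fixed matrix M
        # so that a sequence of such products runs in subquadratic time each.
        return np.max(M + v[None, :], axis=1)

    def viterbi_step(log_trans, log_emit, log_delta):
        # log_trans[i, j]: log-probability of moving from state j to state i
        #                  (fixed over time, since the HMM is time-homogeneous).
        # log_emit[i]:     log-probability of the current observation in state i.
        # log_delta[j]:    best log-score of any state path ending in state j.
        return log_emit + maxplus_matvec(log_trans, log_delta)

Because the transition matrix is the same at every step, decoding reduces to an
online sequence of (max, +) products against one fixed matrix, which is exactly
the structure the polynomial preprocessing exploits.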
Smoothed Analysis in Unsupervised Learning via Decoupling
Smoothed analysis is a powerful paradigm in overcoming worst-case
intractability in unsupervised learning and high-dimensional data analysis.
While polynomial time smoothed analysis guarantees have been obtained for
worst-case intractable problems like tensor decompositions and learning
mixtures of Gaussians, such guarantees have been hard to obtain for several
other important problems in unsupervised learning. A core technical challenge
in analyzing algorithms is obtaining lower bounds on the least singular value
for random matrix ensembles with dependent entries that are given by
low-degree polynomials of a few base underlying random variables.
In this work, we address this challenge by obtaining high-confidence lower
bounds on the least singular value of new classes of structured random matrix
ensembles of the above kind. We then use these bounds to design algorithms with
polynomial time smoothed analysis guarantees for the following three important
problems in unsupervised learning:
1. Robust subspace recovery, when the fraction $\alpha$ of inliers in the
d-dimensional subspace $T \subset \mathbb{R}^n$ is at least $(d/n)^{\ell}$ for
any constant integer $\ell > 0$. This contrasts with the known worst-case
intractability when $\alpha < d/n$, and the previous smoothed analysis result,
which needed $\alpha > d/n$ (Hardt and Moitra, 2013).
2. Learning overcomplete hidden Markov models, where the size of the state
space is any polynomial in the dimension of the observations. This gives the
first polynomial time guarantees for learning overcomplete HMMs in a smoothed
analysis model.
3. Higher order tensor decompositions, where we generalize the so-called
FOOBI algorithm of Cardoso to find order-$\ell$ rank-one tensors in a subspace.
This allows us to obtain polynomially robust decomposition algorithms for
$2\ell$'th order tensors with rank up to $n^{\ell}$.
Comment: 44 pages
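As a toy illustration of the quantity driving the analysis (our example, not
the paper's construction), one can build a matrix whose columns are low-degree
polynomials, here flattened outer products, of slightly perturbed base vectors
and inspect its least singular value:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, rho = 10, 40, 0.01  # ambient dimension, number of vectors, perturbation

    base = rng.standard_normal((n, d))                   # arbitrary base vectors
    smoothed = base + rho * rng.standard_normal((n, d))  # smoothed perturbation

    # Each column is the degree-2 monomial map x -> vec(x x^T) of one perturbed
    # vector: a random matrix ensemble with dependent entries, as in the abstract.
    cols = np.stack([np.outer(x, x).ravel() for x in smoothed.T], axis=1)  # (n*n, d)

    sigma_min = np.linalg.svd(cols, compute_uv=False).min()
    print(f"least singular value: {sigma_min:.3e}")

The paper's technical results are high-confidence lower bounds on this least
singular value that hold for worst-case base vectors, which is what the three
applications then consume.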
Identifiability of parameters in latent structure models with many observed variables
While hidden class models of various types arise in many statistical
applications, it is often difficult to establish the identifiability of their
parameters. Focusing on models in which there is some structure of conditional
independence among the observed variables given the hidden ones, we demonstrate a
general approach for establishing identifiability utilizing algebraic
arguments. A theorem of J. Kruskal for a simple latent-class model with finite
state space lies at the core of our results, though we apply it to a diverse
set of models. These include mixtures of both finite and nonparametric product
distributions, hidden Markov models and random graph mixture models, and lead
to a number of new results and improvements to old ones. In the parametric
setting, this approach indicates that for such models, the classical definition
of identifiability is typically too strong. Instead, generic identifiability
holds, which implies that the set of nonidentifiable parameters has measure
zero, so that parameter inference is still meaningful. In particular, this
sheds light on the properties of finite mixtures of Bernoulli products, which
have been used for decades despite being known to have nonidentifiable
parameters. In the nonparametric setting, we again obtain identifiability only
when certain restrictions are placed on the distributions that are mixed, but
we explicitly describe the conditions.
Comment: Published at http://dx.doi.org/10.1214/09-AOS689 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
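For reference, the theorem of Kruskal invoked here gives uniqueness of
three-way decompositions under a condition on Kruskal ranks, where the Kruskal
rank k_A of a matrix A is the largest k such that every k columns of A are
linearly independent:

    \mathrm{k}_A + \mathrm{k}_B + \mathrm{k}_C \;\ge\; 2r + 2
    \quad\Longrightarrow\quad
    \sum_{i=1}^{r} a_i \otimes b_i \otimes c_i
    \ \text{determines}\ \{(a_i, b_i, c_i)\}_{i=1}^{r}
    \ \text{up to permutation and rescaling.}

The generic-identifiability results above come from verifying this condition
for generic parameter values of each model after embedding it in a suitable
three-way array.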