Spectral dimensionality reduction for HMMs
Hidden Markov Models (HMMs) can be accurately approximated using
co-occurrence frequencies of pairs and triples of observations by using a fast
spectral method in contrast to the usual slow methods like EM or Gibbs
sampling. We provide a new spectral method which significantly reduces the
number of model parameters that need to be estimated, and yields a sample
complexity that does not depend on the size of the observation vocabulary. We
present an elementary proof giving bounds on the relative accuracy of
probability estimates from our model. (Corollaries show our bounds can be
weakened to provide either L1 bounds or KL bounds which provide easier direct
comparisons to previous work.) Our theorem uses conditions that are checkable
from the data, instead of putting conditions on the unobservable Markov
transition matrix.
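The co-occurrence statistics that the abstract refers to can be illustrated with a minimal sketch. It only gathers empirical frequencies of single observations, consecutive pairs, and consecutive triples; the spectral step itself (the decomposition and parameter recovery) is not shown, and the function name `cooccurrence_frequencies` is a hypothetical choice for illustration, not taken from the paper.

```python
from collections import Counter


def cooccurrence_frequencies(sequences):
    """Estimate unigram, pair, and triple frequencies from observation sequences.

    Pairs are consecutive (x_t, x_{t+1}); triples are consecutive
    (x_t, x_{t+1}, x_{t+2}). Each table is normalized by its own total.
    Assumption: every sequence is a list or string that supports slicing.
    """
    unigrams, pairs, triples = Counter(), Counter(), Counter()
    for seq in sequences:
        unigrams.update(seq)
        pairs.update(zip(seq, seq[1:]))
        triples.update(zip(seq, seq[1:], seq[2:]))

    def normalize(counts):
        total = sum(counts.values())
        return {key: value / total for key, value in counts.items()} if total else {}

    return normalize(unigrams), normalize(pairs), normalize(triples)


# Toy data over a two-symbol vocabulary (illustrative only).
p1, p2, p3 = cooccurrence_frequencies([["a", "b", "a", "b"], ["b", "a", "b"]])
```

In a spectral method these tables would typically be arranged as matrices and tensors indexed by the vocabulary before any decomposition is applied.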
A statistical multiresolution approach for face recognition using structural hidden Markov models
This paper introduces a novel methodology that combines the multiresolution feature of the discrete wavelet transform (DWT) with the local interactions of the facial structures expressed through the structural hidden Markov model (SHMM). A range of wavelet filters such as Haar, biorthogonal 9/7, and Coiflet, as well as Gabor, have been implemented in order to search for the best performance. SHMMs perform a thorough probabilistic analysis of any sequential pattern by revealing both its inner and outer structures simultaneously. Unlike traditional HMMs, SHMMs do not assume state-conditional independence of the visible observation sequence. This is achieved via the concept of local structures introduced by the SHMMs. Therefore, the long-range dependency problem inherent to traditional HMMs is drastically reduced. SHMMs have not previously been applied to the problem of face identification. The results reported in this application show that the SHMM outperforms the traditional hidden Markov model with a 73% increase in accuracy.
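The multiresolution idea behind the DWT can be sketched with the Haar filter, the simplest of the filters the abstract lists. This is a one-dimensional illustration under stated assumptions (orthonormal Haar, signal length divisible by 2 at every level); the paper works on face images, and the function names `haar_step` and `haar_multires` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    # Pairwise sums give the coarse approximation, pairwise differences the detail.
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_multires(signal, levels):
    """Apply haar_step repeatedly; return the coarsest approximation and
    the detail coefficients from finest to coarsest level."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


coarse, details = haar_multires([4.0, 4.0, 2.0, 2.0], levels=2)
```

Each level halves the resolution, so the detail lists describe the signal at progressively coarser scales.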
On the Inability of Markov Models to Capture Criticality in Human Mobility
We examine the non-Markovian nature of human mobility by exposing the
inability of Markov models to capture criticality in human mobility. In
particular, the assumed Markovian nature of mobility was used to establish a
theoretical upper bound on the predictability of human mobility (expressed as a
minimum error probability limit), based on temporally correlated entropy. Since
its inception, this bound has been widely used and empirically validated using
Markov chains. We show that recurrent-neural architectures can achieve
significantly higher predictability, surpassing this widely used upper bound.
In order to explain this anomaly, we shed light on several underlying
assumptions in previous research works that have resulted in this bias. By
evaluating the mobility predictability on real-world datasets, we show that
human mobility exhibits scale-invariant long-range correlations, bearing
similarity to a power-law decay. This is in contrast to the initial assumption
that human mobility follows an exponential decay. This assumption of
exponential decay coupled with Lempel-Ziv compression in computing Fano's
inequality has led to an inaccurate estimation of the predictability upper
bound. We show that this approach inflates the entropy, consequently lowering
the upper bound on human mobility predictability. We finally highlight that
this approach tends to overlook long-range correlations in human mobility. This
explains why recurrent-neural architectures that are designed to handle
long-range structural correlations surpass the previously computed upper bound
on mobility predictability.
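The pipeline the abstract criticizes, a Lempel-Ziv entropy estimate fed into Fano's inequality, can be sketched as follows. The estimator form used here, n log2(n) divided by the sum of shortest-unseen-substring lengths, is an assumption based on the variant commonly used in the mobility literature, not something the abstract specifies; the function names are hypothetical.

```python
import math


def _occurs_before(seq, start, length):
    """True if seq[start:start+length] appears entirely within seq[:start]."""
    target = seq[start:start + length]
    return any(seq[j:j + length] == target for j in range(start - length + 1))


def lz_entropy_rate(seq):
    """Lempel-Ziv style entropy-rate estimate in bits per symbol (assumed form)."""
    n = len(seq)
    if n < 2:
        return 0.0
    total = 0
    for i in range(n):
        # Shortest length k such that seq[i:i+k] has not been seen before position i.
        k = 1
        while i + k <= n and _occurs_before(seq, i, k):
            k += 1
        total += k
    return n * math.log2(n) / total


def max_predictability(entropy, n_states, tol=1e-10):
    """Solve entropy = H(p) + (1 - p) * log2(n_states - 1) for p by bisection."""
    if n_states < 2 or entropy <= 0.0:
        return 1.0

    def fano(p):
        h = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
        return h + (1.0 - p) * math.log2(n_states - 1)

    lo, hi = 1.0 / n_states, 1.0
    if entropy >= fano(lo):
        return lo  # entropy at or above the uniform maximum
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if fano(mid) > entropy:  # fano is decreasing on [1/n_states, 1]
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Because the bound decreases as the entropy estimate grows, any inflation of the entropy translates directly into a lower predictability limit, which is the mechanism the abstract describes.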
Hyperspectral image unmixing using a multiresolution sticky HDP
This paper is concerned with joint Bayesian endmember extraction and linear unmixing of hyperspectral images using a spatial prior on the abundance vectors. We propose a generative model for hyperspectral images in which the abundances are sampled from a Dirichlet distribution (DD) mixture model, whose parameters depend on a latent label process. The label process is then used to enforce a spatial prior which encourages adjacent pixels to have the same label. A Gibbs sampling framework is used to generate samples from the posterior distributions of the abundances and the parameters of the DD mixture model. The spatial prior that is used is a tree-structured sticky hierarchical Dirichlet process (SHDP) and, when used to determine the posterior endmember and abundance distributions, results in a new unmixing algorithm called spatially constrained unmixing (SCU). The directed Markov model facilitates the use of scale-recursive estimation algorithms, and is therefore more computationally efficient than standard Markov random field (MRF) models. Furthermore, the proposed SCU algorithm estimates the number of regions in the image in an unsupervised fashion. The effectiveness of the proposed SCU algorithm is illustrated using synthetic and real data.
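The generative side of linear unmixing can be sketched for a single pixel: abundances drawn from a Dirichlet distribution, then mixed linearly with endmember spectra. This is a minimal illustration under stated assumptions (one Dirichlet component, optional Gaussian noise); the label process, the sticky HDP prior, and the Gibbs sampler from the abstract are not shown, and all names and values are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def mix_pixel(endmembers, abundances, noise_std=0.0, rng=None):
    """Linear mixing: each band is the abundance-weighted sum of endmember
    values, plus optional zero-mean Gaussian noise (an assumed noise model)."""
    rng = rng or random.Random()
    n_bands = len(endmembers[0])
    return [
        sum(a * e[band] for a, e in zip(abundances, endmembers))
        + rng.gauss(0.0, noise_std)
        for band in range(n_bands)
    ]


rng = random.Random(0)
endmembers = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]  # two toy spectra, three bands
abundances = sample_dirichlet([2.0, 2.0], rng)   # non-negative, sums to one
pixel = mix_pixel(endmembers, abundances, noise_std=0.01, rng=rng)
```

The Dirichlet draw guarantees the non-negativity and sum-to-one constraints that abundance vectors must satisfy.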
Learning loopy graphical models with latent variables: Efficient methods and guarantees
The problem of structure estimation in graphical models with latent variables
is considered. We characterize conditions for tractable graph estimation and
develop efficient methods with provable guarantees. We consider models where
the underlying Markov graph is locally tree-like, and the model is in the
regime of correlation decay. For the special case of the Ising model, the
number of samples required for structural consistency of our method scales
as , where p is the
number of variables, is the minimum edge potential, is
the depth (i.e., distance from a hidden node to the nearest observed nodes),
and is a parameter which depends on the bounds on node and edge
potentials in the Ising model. Necessary conditions for structural consistency
under any algorithm are derived and our method nearly matches the lower bound
on sample requirements. Further, the proposed method is practical to implement
and provides flexibility to control the number of latent variables and the
cycle lengths in the output graph. Comment: Published in the Annals of
Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/12-AOS1070
by the Institute of Mathematical Statistics (http://www.imstat.org)
A Multiscale Approach for Statistical Characterization of Functional Images
Increasingly, scientific studies yield functional image data, in which the observed data consist of sets of curves recorded on the pixels of the image. Examples include temporal brain response intensities measured by fMRI and NMR frequency spectra measured at each pixel. This article presents a new methodology for improving the characterization of pixels in functional imaging, formulated as a spatial curve clustering problem. Our method operates on curves as a unit. It is nonparametric and involves multiple stages: (i) wavelet thresholding, aggregation, and Neyman truncation to effectively reduce dimensionality; (ii) clustering based on an extended EM algorithm; and (iii) multiscale penalized dyadic partitioning to create a spatial segmentation. We motivate the different stages with theoretical considerations and arguments, and illustrate the overall procedure on simulated and real datasets. Our method appears to offer substantial improvements over monoscale pixel-wise methods. An appendix giving some theoretical justification of the methodology, along with computer code, documentation, and the dataset, is available in the online supplements.
Multiscale Discriminant Saliency for Visual Attention
Bottom-up saliency, an early stage of human visual attention, can be
considered as a binary classification problem between center and surround
classes. The discriminant power of features for this classification is measured
as the mutual information between the features and the distribution of the two
classes. The estimated discrepancy between the two feature classes depends
strongly on the scale levels considered; multi-scale structure and discriminant
power are therefore integrated by employing discrete wavelet features and a
hidden Markov tree (HMT). With wavelet coefficients and hidden Markov tree
parameters, quad-tree-like label structures are constructed and used in maximum
a posteriori (MAP) estimation of the hidden class variables at the
corresponding dyadic sub-squares. Then, the saliency value for each dyadic
square at each scale level is computed with the discriminant power principle
and the MAP estimate. Finally, the final saliency map is integrated across
multiple scales by an information maximization rule. Both standard quantitative
tools such as NSS, LCC, and AUC and qualitative assessments are used to evaluate
the proposed multiscale discriminant saliency method (MDIS) against the
well-known information-based saliency method AIM on its Bruce Database with
eye-tracking data. Simulation results are presented and analyzed to verify the
validity of MDIS as well as to point out its disadvantages for further research.
Comment: 16 pages, ICCSA 2013 - BIOCA session
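The discriminant-power measure described in the abstract, mutual information between a feature and the center/surround class, can be sketched for discrete feature values. This is a plug-in estimate from empirical counts and assumes the features have already been quantized; the wavelet features, the HMT, and the MAP step are not shown, and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Plug-in estimate of I(feature; class) in bits from paired samples."""
    n = len(features)
    if n == 0 or n != len(labels):
        raise ValueError("features and labels must be non-empty and equal length")
    joint = Counter(zip(features, labels))
    feature_counts = Counter(features)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, c), count in joint.items():
        # p(f, c) / (p(f) * p(c)) simplifies to count * n / (count_f * count_c).
        ratio = count * n / (feature_counts[f] * label_counts[c])
        mi += (count / n) * math.log2(ratio)
    return mi


# A feature that separates the two classes perfectly carries one full bit.
labels = ["center", "center", "surround", "surround"]
informative = mutual_information([0, 0, 1, 1], labels)
uninformative = mutual_information([0, 1, 0, 1], labels)
```

A feature whose distribution is identical in both classes scores zero, so it contributes nothing to the saliency value under this measure.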