2,003 research outputs found
A Unified Framework for Sparse Non-Negative Least Squares using Multiplicative Updates and the Non-Negative Matrix Factorization Problem
We study the sparse non-negative least squares (S-NNLS) problem. S-NNLS
occurs naturally in a wide variety of applications where an unknown,
non-negative quantity must be recovered from linear measurements. We present a
unified framework for S-NNLS based on a rectified power exponential scale
mixture prior on the sparse codes. We show that the proposed framework
encompasses a large class of S-NNLS algorithms and provide a computationally
efficient inference procedure based on multiplicative update rules. Such update
rules are convenient for solving large sets of S-NNLS problems simultaneously,
which is required in contexts like sparse non-negative matrix factorization
(S-NMF). We provide theoretical justification for the proposed approach by
showing that the local minima of the objective function being optimized are
sparse and the S-NNLS algorithms presented are guaranteed to converge to a set
of stationary points of the objective function. We then extend our framework to
S-NMF, showing that our framework leads to many well known S-NMF algorithms
under specific choices of prior and providing a guarantee that a popular
subclass of the proposed algorithms converges to a set of stationary points of
the objective function. Finally, we study the performance of the proposed
approaches on synthetic and real-world data.Comment: To appear in Signal Processin
A sticky HDP-HMM with application to speaker diarization
We consider the problem of speaker diarization, the problem of segmenting an
audio recording of a meeting into temporal segments corresponding to individual
speakers. The problem is rendered particularly difficult by the fact that we
are not allowed to assume knowledge of the number of people participating in
the meeting. To address this problem, we take a Bayesian nonparametric approach
to speaker diarization that builds on the hierarchical Dirichlet process hidden
Markov model (HDP-HMM) of Teh et al. [J. Amer. Statist. Assoc. 101 (2006)
1566--1581]. Although the basic HDP-HMM tends to over-segment the audio
data---creating redundant states and rapidly switching among them---we describe
an augmented HDP-HMM that provides effective control over the switching rate.
We also show that this augmentation makes it possible to treat emission
distributions nonparametrically. To scale the resulting architecture to
realistic diarization problems, we develop a sampling algorithm that employs a
truncated approximation of the Dirichlet process to jointly resample the full
state sequence, greatly improving mixing rates. Working with a benchmark NIST
data set, we show that our Bayesian nonparametric architecture yields
state-of-the-art speaker diarization results.Comment: Published in at http://dx.doi.org/10.1214/10-AOAS395 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Model-based clustering of large networks
We describe a network clustering framework, based on finite mixture models, that can be applied to discrete-valued networks with hundreds of thousands of nodes and billions of edge variables. Relative to other recent model-based clustering work for networks, we introduce a more flexible modeling framework, improve the variational-approximation estimation algorithm, discuss and implement standard error estimation via a parametric bootstrap approach, and apply these methods to much larger data sets than those seen elsewhere in the literature. The more flexible framework is achieved through introducing novel parameterizations of the model, giving varying degrees of parsimony, using exponential family models whose structure may be exploited in various theoretical and algorithmic ways. The algorithms are based on variational generalized EM algorithms, where the E-steps are augmented by a minorization-maximization (MM) idea. The bootstrapped standard error estimates are based on an efficient Monte Carlo network simulation idea. Last, we demonstrate the usefulness of the model-based clustering framework by applying it to a discrete-valued network with more than 131,000 nodes and 17 billion edge variables
A nonparametric HMM for genetic imputation and coalescent inference
Genetic sequence data are well described by hidden Markov models (HMMs) in
which latent states correspond to clusters of similar mutation patterns. Theory
from statistical genetics suggests that these HMMs are nonhomogeneous (their
transition probabilities vary along the chromosome) and have large support for
self transitions. We develop a new nonparametric model of genetic sequence
data, based on the hierarchical Dirichlet process, which supports these self
transitions and nonhomogeneity. Our model provides a parameterization of the
genetic process that is more parsimonious than other more general nonparametric
models which have previously been applied to population genetics. We provide
truncation-free MCMC inference for our model using a new auxiliary sampling
scheme for Bayesian nonparametric HMMs. In a series of experiments on male X
chromosome data from the Thousand Genomes Project and also on data simulated
from a population bottleneck we show the benefits of our model over the popular
finite model fastPHASE, which can itself be seen as a parametric truncation of
our model. We find that the number of HMM states found by our model is
correlated with the time to the most recent common ancestor in population
bottlenecks. This work demonstrates the flexibility of Bayesian nonparametrics
applied to large and complex genetic data
Audio-Visual Automatic Speech Recognition Towards Education for Disabilities
Education is a fundamental right that enriches everyone’s life. However, physically challenged people often debar from the general and advanced education system. Audio-Visual Automatic Speech Recognition (AV-ASR) based system is useful to improve the education of physically challenged people by providing hands-free computing. They can communicate to the learning system through AV-ASR. However, it is challenging to trace the lip correctly for visual modality. Thus, this paper addresses the appearance-based visual feature along with the co-occurrence statistical measure for visual speech recognition. Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) and Grey-Level Co-occurrence Matrix (GLCM) is proposed for visual speech information. The experimental results show that the proposed system achieves 76.60 % accuracy for visual speech and 96.00 % accuracy for audio speech recognition
- …