8,989 research outputs found
Sparse gaussian graphical models for speech recognition.
We address the problem of learning the structure of Gaussian graphical models for use in automatic speech recognition, a means of controlling the form of the inverse covariance matrices of such systems. With particular focus on data sparsity issues, we implement a method for imposing graphical model structure on a Gaussian mixture system, using a convex optimisation technique to maximise a penalised likelihood expression. The results of initial experiments on a phone recognition task show a performance improvement over an equivalent full-covariance system
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Substructure and Boundary Modeling for Continuous Action Recognition
This paper introduces a probabilistic graphical model for continuous action
recognition with two novel components: substructure transition model and
discriminative boundary model. The first component encodes the sparse and
global temporal transition prior between action primitives in state-space model
to handle the large spatial-temporal variations within an action class. The
second component enforces the action duration constraint in a discriminative
way to locate the transition boundaries between actions more accurately. The
two components are integrated into a unified graphical structure to enable
effective training and inference. Our comprehensive experimental results on
both public and in-house datasets show that, with the capability to incorporate
additional information that had not been explicitly or efficiently modeled by
previous methods, our proposed algorithm achieved significantly improved
performance for continuous action recognition.Comment: Detailed version of the CVPR 2012 paper. 15 pages, 6 figure
- …