Learning Model Structure from Data: an Application to On-Line Handwriting
We present a learning strategy for Hidden Markov Models that may be used to cluster handwriting sequences or to learn a character model by identifying its main writing styles. Our approach aims at learning both the structure and the parameters of a Hidden Markov Model (HMM) from the data. A byproduct of this learning strategy is the ability to cluster signals and identify allographs. We provide experimental results on artificial data demonstrating that both HMM parameters and topology can be learned from data. For a given topology, our approach outperforms the standard Maximum Likelihood learning scheme in cases that we identify. We also apply our unsupervised learning scheme to on-line handwritten signals, both for allograph clustering and for learning HMM models for handwritten digit recognition.
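For reference, the standard Maximum Likelihood scheme that the paper compares against is, for a fixed topology, typically implemented with the Baum-Welch (EM) algorithm. A minimal from-scratch sketch for a discrete-observation HMM (the toy model and all names are illustrative, not the authors' code):

```python
import numpy as np

def forward_backward(A, B, pi, obs):
    """Scaled forward-backward pass for a discrete-observation HMM."""
    T, N = len(obs), len(pi)
    alpha, beta, c = np.zeros((T, N)), np.ones((T, N)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    return alpha, beta, c

def baum_welch_step(A, B, pi, obs):
    """One EM (Baum-Welch) update of (A, B, pi) on a single sequence."""
    T, N = len(obs), len(pi)
    alpha, beta, c = forward_backward(A, B, pi, obs)
    gamma = alpha * beta                      # gamma[t, i] = P(s_t = i | obs)
    xi = np.zeros((N, N))                     # expected transition counts
    for t in range(T - 1):
        xi += np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A / c[t + 1]
    A_new = xi / xi.sum(axis=1, keepdims=True)
    B_new = np.zeros_like(B)
    for t in range(T):
        B_new[:, obs[t]] += gamma[t]
    B_new /= B_new.sum(axis=1, keepdims=True)
    return A_new, B_new, gamma[0]

# Toy usage on a random 2-state, 3-symbol model:
rng = np.random.default_rng(0)
A = rng.dirichlet(np.ones(2), 2); B = rng.dirichlet(np.ones(3), 2)
pi = np.array([0.5, 0.5]); obs = rng.integers(0, 3, size=50)
for _ in range(20):
    A, B, pi = baum_welch_step(A, B, pi, obs)
```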
Learning to automatically detect features for mobile robots using second-order Hidden Markov Models
In this paper, we propose a new method based on Hidden Markov Models to interpret temporal sequences of sensor data from mobile robots and automatically detect features. Hidden Markov Models have long been used in pattern recognition, especially in speech recognition. Their main advantage over other methods (such as neural networks) is their ability to model noisy temporal signals of variable length. We show in this paper that this approach is well suited to interpreting temporal sequences of mobile-robot sensor data. We present two distinct experiments and their results: the first in an indoor environment, where a mobile robot learns to detect features like open doors or T-intersections, and the second in an outdoor environment, where a different mobile robot has to identify situations like climbing a hill or crossing a rock.
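The second-order HMMs of the title condition each transition on the two previous states; a standard way to handle them is to lift the model to a first-order HMM over composite state pairs, after which ordinary first-order algorithms apply. A minimal sketch of that reduction (the tensor layout is an assumption for illustration, not the authors' implementation):

```python
import numpy as np

def lift_second_order(A2, N):
    """Lift a second-order transition tensor
    A2[i, j, k] = P(s_t = k | s_{t-2} = i, s_{t-1} = j)
    to a first-order transition matrix over composite states (i, j)."""
    A1 = np.zeros((N * N, N * N))
    for i in range(N):
        for j in range(N):
            for k in range(N):
                # composite state (i, j) can only move to (j, k)
                A1[i * N + j, j * N + k] = A2[i, j, k]
    return A1

# Toy check: rows of the lifted matrix remain stochastic.
N = 3
rng = np.random.default_rng(1)
A2 = rng.dirichlet(np.ones(N), size=(N, N))   # shape (N, N, N)
A1 = lift_second_order(A2, N)
assert np.allclose(A1.sum(axis=1), 1.0)
```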
Identifiability and Unmixing of Latent Parse Trees
This paper explores unsupervised learning of parsing models along two
directions. First, which models are identifiable from infinite data? We use a
general technique for numerically checking identifiability based on the rank of
a Jacobian matrix, and apply it to several standard constituency and dependency
parsing models. Second, for identifiable models, how do we estimate the
parameters efficiently? EM suffers from local optima, while recent work using
spectral methods cannot be directly applied since the topology of the parse
tree varies across sentences. We develop a strategy, unmixing, which deals with
this additional complexity for restricted classes of parsing models.
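The numerical identifiability check described above can be reproduced in miniature: map the parameters to a vector of observable moments, build the Jacobian of that map by finite differences, and compare its rank to the parameter count. A toy sketch on a two-component Bernoulli mixture rather than the paper's parsing models (all names are illustrative):

```python
import numpy as np

def moments(theta, L=3):
    """Moments of a two-component Bernoulli mixture where m i.i.d. flips
    share one latent component: E[x_1 * ... * x_m] = w*p1**m + (1-w)*p2**m."""
    w, p1, p2 = theta
    return np.array([w * p1**m + (1 - w) * p2**m for m in range(1, L + 1)])

def numerical_jacobian(f, theta, eps=1e-6):
    """Central-difference Jacobian of f at theta."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(len(theta)):
        d = np.zeros_like(theta); d[i] = eps
        cols.append((f(theta + d) - f(theta - d)) / (2 * eps))
    return np.stack(cols, axis=1)

theta = np.array([0.3, 0.2, 0.7])         # (weight, p1, p2)
J = numerical_jacobian(moments, theta)
# Full column rank (3 of 3) suggests local identifiability (up to the usual
# label swap); a rank deficit would flag a non-identifiable direction.
print(np.linalg.matrix_rank(J), "of", len(theta))
```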
Diffusion of Context and Credit Information in Markovian Models
This paper studies the ergodicity of transition probability matrices in Markovian models, such as hidden Markov models (HMMs), and shows how it makes learning to represent long-term context for sequential data very difficult. This phenomenon hurts the forward propagation of long-term context information, as well as the learning of a hidden state representation of long-term context, which depends on propagating credit information backwards in time. Using results from Markov chain theory, we show that this problem of diffusion of context and credit is reduced when the transition probabilities approach 0 or 1, i.e., when the transition probability matrices are sparse and the model essentially deterministic. The results in this paper apply to learning approaches based on continuous optimization, such as gradient descent and the Baum-Welch algorithm.
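The diffusion effect is easy to observe numerically: for an ergodic transition matrix, the powers A^t collapse toward a rank-one matrix whose identical rows are the stationary distribution, so the initial state is forgotten, whereas a near-deterministic matrix keeps its rows distinct far longer. A small illustration (the matrices are arbitrary examples, not from the paper):

```python
import numpy as np

def row_spread(M):
    """Max total-variation distance between rows of M: how much of the
    initial state's identity survives after t steps."""
    return max(0.5 * np.abs(M[i] - M[j]).sum()
               for i in range(len(M)) for j in range(len(M)))

smooth = np.array([[0.5, 0.3, 0.2],
                   [0.2, 0.5, 0.3],
                   [0.3, 0.2, 0.5]])
# Near-deterministic: transitions close to a cyclic permutation.
e = 0.01
sharp = np.array([[e, 1 - 2 * e, e],
                  [e, e, 1 - 2 * e],
                  [1 - 2 * e, e, e]])

# The spread of smooth**t decays to ~0 quickly; sharp**t stays spread out.
for t in (1, 10, 50):
    print(t, row_spread(np.linalg.matrix_power(smooth, t)),
             row_spread(np.linalg.matrix_power(sharp, t)))
```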
The posterior-Viterbi: a new decoding algorithm for hidden Markov models
Background: Hidden Markov models (HMMs) are powerful machine learning tools successfully applied to problems in computational molecular biology. In a predictive task, the HMM is endowed with a decoding algorithm in order to assign the most probable state path, and in turn the class labeling, to an unknown sequence. The Viterbi and posterior decoding algorithms are the most common. The former is very efficient when one path dominates, while the latter, even though it does not guarantee to preserve the automaton grammar, is more effective when several competing paths have similar probabilities. A third good alternative is 1-best, which was shown to perform as well as or better than Viterbi.

Results: In this paper we introduce the posterior-Viterbi (PV), a new decoding algorithm which combines the posterior and Viterbi algorithms. PV is a two-step process: first the posterior probability of each state is computed, and then the best posterior-allowed path through the model is found with a Viterbi algorithm.
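A from-scratch sketch of the two-step scheme as described (not the authors' code): posteriors from scaled forward-backward, then a Viterbi-style maximization over those posteriors restricted to transitions the model allows, so the returned path respects the automaton grammar:

```python
import numpy as np

def posterior_viterbi(A, B, pi, obs):
    """Posterior-Viterbi decoding: Viterbi over per-state posteriors,
    restricted to transitions the model allows (A[i, j] > 0)."""
    T, N = len(obs), len(pi)
    # Step 1: gamma[t, i] = P(s_t = i | obs) via scaled forward-backward.
    alpha, beta, c = np.zeros((T, N)), np.ones((T, N)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta

    # Step 2: best allowed path through the posteriors (log space).
    allowed = np.where(A > 0, 0.0, -np.inf)       # grammar mask
    logg = np.log(gamma + 1e-300)
    delta = np.full((T, N), -np.inf)
    back = np.zeros((T, N), dtype=int)
    delta[0] = logg[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + allowed  # scores[i, j]: i -> j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logg[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], gamma
```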
Conclusions: We show that PV decoding performs better than the other algorithms, first on toy models and then on the computational biology problem of predicting the topology of beta-barrel membrane proteins.