Generalization of Extended Baum-Welch Parameter Estimation for Discriminative Training and Decoding
We demonstrate the generalizability of the Extended Baum-Welch (EBW) algorithm not only for HMM parameter estimation but for decoding as well. We show that there can exist a general function associated with the objective function under EBW that reduces to the well-known auxiliary function used in the Baum-Welch algorithm for maximum likelihood estimation. We generalize the representation of the model parameter updates by applying a differentiable function (such as an arithmetic or geometric mean) to the updated and current model parameters, and describe its effect on the learning rate during HMM parameter estimation. Improvements on speech recognition tasks are also presented.
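As a rough illustration of the smoothing idea described in this abstract, the sketch below blends the current and re-estimated values of a parameter row with an arithmetic or geometric mean; the function name, the weight parameter and the renormalisation step are assumptions made for the sketch, not details taken from the paper.

```python
import numpy as np

def smoothed_update(current, updated, weight=0.5, mode="arithmetic"):
    """Blend the current and re-estimated values of one HMM parameter row
    using a differentiable mean; `weight` acts as a learning-rate-like
    control on how far the update moves (illustrative sketch only)."""
    current = np.asarray(current, dtype=float)
    updated = np.asarray(updated, dtype=float)
    if mode == "arithmetic":
        blended = (1.0 - weight) * current + weight * updated
    elif mode == "geometric":
        blended = current ** (1.0 - weight) * updated ** weight
    else:
        raise ValueError("mode must be 'arithmetic' or 'geometric'")
    return blended / blended.sum()   # keep the row a valid distribution

# Example: blending one row of a transition matrix
row_current = [0.7, 0.2, 0.1]
row_updated = [0.5, 0.4, 0.1]
print(smoothed_update(row_current, row_updated, weight=0.3, mode="geometric"))
```

With a small weight the update stays close to the current parameters (slow learning); with weight near one it jumps almost all the way to the re-estimated values.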
A linear memory algorithm for Baum-Welch training
Background: Baum-Welch training is an expectation-maximisation algorithm for training the emission and transition probabilities of hidden Markov models in a fully automated way.

Methods and results: We introduce a linear space algorithm for Baum-Welch training. For a hidden Markov model with M states, T free transition and E free emission parameters, and an input sequence of length L, our new algorithm requires O(M) memory and O(L M T_max (T + E)) time for one Baum-Welch iteration, where T_max is the maximum number of states that any state is connected to. The most memory-efficient algorithm until now was the checkpointing algorithm, with O(log(L) M) memory and O(log(L) L M T_max) time requirements. Our novel algorithm thus renders the memory requirement completely independent of the length of the training sequences. More generally, for an n-hidden Markov model and n input sequences of length L, the memory requirement of O(log(L) L^(n-1) M) is reduced to O(L^(n-1) M) while the running time is changed from O(log(L) L^n M T_max + L^n (T + E)) to O(L^n M T_max (T + E)).

Conclusions: For the large class of hidden Markov models used, for example, in gene prediction, whose number of states does not scale with the length of the input sequence, our novel algorithm can thus be both faster and more memory-efficient than any of the existing algorithms.
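To make the flavour of the linear-memory idea concrete, here is a small Python sketch (a simplification written for illustration, not the authors' implementation): expected transition counts are carried forward together with the forward variables, so nothing proportional to the sequence length is ever stored. This naive version keeps one accumulator per transition and end state, so it uses more memory than the paper's O(M) bound, but memory is still independent of L; emission counts and numerical scaling are omitted.

```python
import numpy as np

def expected_transition_counts_linear_memory(A, B, pi, obs):
    """Accumulate expected transition counts in one forward sweep.

    sigma[j, k, l] holds the expected number of k->l transitions over all
    state paths that end in state j, weighted by the joint probability of
    the path and the observed prefix.  No backward pass or full trellis is
    stored, so memory does not grow with the sequence length.
    (Illustrative sketch; numerical scaling is omitted.)
    """
    M = A.shape[0]
    alpha = pi * B[:, obs[0]]                  # forward vector, length M
    sigma = np.zeros((M, M, M))                # per-end-state count accumulators
    for t in range(1, len(obs)):
        new_alpha = (alpha @ A) * B[:, obs[t]]
        new_sigma = np.empty_like(sigma)
        for j in range(M):
            # carry forward counts already accumulated on paths entering state j
            new_sigma[j] = B[j, obs[t]] * np.tensordot(A[:, j], sigma, axes=(0, 0))
        # add one count for the transition k->l actually taken at this step
        for k in range(M):
            for l in range(M):
                new_sigma[l, k, l] += alpha[k] * A[k, l] * B[l, obs[t]]
        alpha, sigma = new_alpha, new_sigma
    return sigma.sum(axis=0) / alpha.sum()     # E[count of k->l | observations]

# Tiny example with a 2-state HMM and a binary observation alphabet
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
print(expected_transition_counts_linear_memory(A, B, pi, [0, 1, 1, 0, 1]))
```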
Incremental HMM with an improved Baum-Welch Algorithm
There is an increasing demand for systems which handle higher-density, additional loads, as seen in storage workload modelling, where workloads can be characterized on-line. This paper aims to find a workload model which processes incoming data and then updates its parameters "on-the-fly." Essentially, this will be an incremental hidden Markov model (IncHMM) with an improved Baum-Welch algorithm. The benefit is a parsimonious model which updates its encoded information whenever more real-time workload data becomes available. To achieve this model, two new approximations of the Baum-Welch algorithm are defined, followed by training our model using a discrete time series. This time series is transformed from a large network trace made up of I/O commands into a partitioned, binned trace, and then filtered through a K-means clustering algorithm to obtain an observation trace. The IncHMM, together with the observation trace, produces the required parameters to form a discrete Markov arrival process (MAP). Finally, we generate our own data trace (using the IncHMM parameters and a random distribution) and statistically compare it to the raw I/O trace, thus validating our model.
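As a rough sketch of how Baum-Welch can be made incremental, the class below keeps running expected-count accumulators across batches of observation symbols and re-normalises them into fresh parameters after each batch. This is one common approximation, written here only for illustration; it is not necessarily either of the two approximations the paper defines, and scaling for long traces is omitted.

```python
import numpy as np

class IncrementalHMM:
    """Minimal sketch of an incrementally updated discrete HMM
    (illustrative approximation, not the paper's IncHMM algorithm)."""

    def __init__(self, A, B, pi):
        self.A, self.B, self.pi = map(np.asarray, (A, B, pi))
        M, K = self.B.shape
        self.trans_counts = np.zeros((M, M))   # running expected transition counts
        self.emit_counts = np.zeros((M, K))    # running expected emission counts

    def update(self, obs):
        """Absorb one new batch of observation symbols and refresh the parameters."""
        A, B, pi = self.A, self.B, self.pi
        T, M = len(obs), A.shape[0]
        alpha = np.zeros((T, M)); beta = np.zeros((T, M))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):                  # forward pass
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):         # backward pass
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        evidence = alpha[-1].sum()
        gamma = alpha * beta / evidence        # state posteriors
        for t in range(T - 1):                 # expected transition counts
            xi = np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A / evidence
            self.trans_counts += xi
        for t in range(T):                     # expected emission counts
            self.emit_counts[:, obs[t]] += gamma[t]
        # re-normalise the running counts into fresh parameters
        self.A = self.trans_counts / self.trans_counts.sum(axis=1, keepdims=True)
        self.B = self.emit_counts / self.emit_counts.sum(axis=1, keepdims=True)

# Usage: feed binned, clustered observation symbols batch by batch
hmm = IncrementalHMM(A=[[0.9, 0.1], [0.2, 0.8]],
                     B=[[0.6, 0.4], [0.3, 0.7]],
                     pi=[0.5, 0.5])
for batch in ([0, 1, 1, 0], [1, 1, 0, 1, 1]):
    hmm.update(batch)
print(hmm.A, hmm.B, sep="\n")
```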
Effect of Initial HMM Choices in Multiple Sequence Training for Gesture Recognition
We present several ways to initialize and train Hidden Markov Models (HMMs) for gesture recognition. These include using a single initial model for training (re-estimation), multiple random initial models, and initial models directly computed from physical considerations. Each of the initial models is trained on multiple observation sequences using both Baum-Welch and the Viterbi Path Counting algorithm on three different model structures: Fully Connected (or ergodic), Left-Right, and Left-Right Banded. After performing many recognition trials on our video database of 780 letter gestures, results show that a) the simpler the structure, the smaller the effect of the initial model, b) the direct computation method for designing the initial model is effective and provides insight into HMM learning, and c) Viterbi Path Counting performs best overall and depends much less on the initial model than does Baum-Welch training.
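For reference, here is a minimal sketch of the three transition structures mentioned in this abstract, initialised with uniform probabilities over the allowed transitions. The uniform initial values and the function name are assumptions made for the sketch; the paper's physically motivated initial models are not reproduced here.

```python
import numpy as np

def initial_transition_matrix(n_states, structure="left-right-banded"):
    """Uniform initial transition matrix for one of the three structures
    compared above: ergodic, left-right, or left-right banded."""
    A = np.zeros((n_states, n_states))
    for i in range(n_states):
        if structure == "ergodic":               # fully connected
            allowed = list(range(n_states))
        elif structure == "left-right":          # no transitions to earlier states
            allowed = list(range(i, n_states))
        elif structure == "left-right-banded":   # self-loop or one step to the right
            allowed = list(range(i, min(i + 2, n_states)))
        else:
            raise ValueError("unknown structure")
        A[i, allowed] = 1.0 / len(allowed)
    return A

print(initial_transition_matrix(4, "left-right-banded"))
```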
