Recent studies have shown that high-order hidden Markov models (HMMs) are feasible and useful for spoken language processing. This paper extends fixed-order HMMs to ergodic mixed-order HMMs, which model variable-length contexts with significantly fewer parameters. A novel training procedure automatically infers the number of states and the topology of the HMM from the training set, based on information-theoretic criteria, by incorporating only those high-order contexts with sufficient support in the data. The mixed-order training algorithm is faster than fixed-order methods and achieves similar classification performance on language identification tasks.
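To make the context-selection idea concrete, the sketch below illustrates one possible way to keep only high-order contexts with sufficient support, in the spirit of mixed-order modelling. It is not the paper's algorithm: the function name `select_contexts`, the `min_support` count threshold, and the KL-divergence gain test against the parent context are all illustrative assumptions, and the paper's actual information-theoretic criteria and HMM topology inference are not reproduced here.

```python
# A minimal sketch (not the paper's algorithm) of selecting variable-length
# contexts by support and information gain. Names and thresholds are
# illustrative assumptions, not the paper's actual criteria.
from collections import Counter, defaultdict
import math

def select_contexts(sequences, max_order=3, min_support=5, min_gain=0.01):
    """Keep a high-order context only if it is well supported and its
    next-symbol distribution differs measurably from its parent context."""
    # Count next-symbol occurrences for every context up to max_order.
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, sym in enumerate(seq):
            for order in range(0, max_order + 1):
                if i - order < 0:
                    break
                ctx = tuple(seq[i - order:i])
                counts[ctx][sym] += 1

    def dist(ctx):
        c = counts[ctx]
        total = sum(c.values())
        return {s: n / total for s, n in c.items()}

    selected = {(): dist(())}  # the empty (order-0) context is always kept
    # Grow contexts from short to long; a longer context is kept only if its
    # parent was kept, it has enough support, and it adds information.
    for order in range(1, max_order + 1):
        for ctx in (c for c in counts if len(c) == order):
            parent = ctx[1:]
            if parent not in selected:
                continue
            if sum(counts[ctx].values()) < min_support:
                continue
            p, q = dist(ctx), selected[parent]
            # KL divergence of the child distribution from its parent's;
            # every symbol seen after ctx is also seen after parent.
            kl = sum(pr * math.log(pr / q[s]) for s, pr in p.items() if s in q)
            if kl >= min_gain:
                selected[ctx] = p
    return selected

if __name__ == "__main__":
    data = ["abracadabra" * 4, "banana" * 6]
    model = select_contexts(data, max_order=2, min_support=4)
    for ctx, d in sorted(model.items()):
        print("".join(ctx) or "<empty>", {s: round(p, 2) for s, p in d.items()})
```

Under these assumptions, rare high-order contexts back off to their shorter parents rather than receiving their own parameters, which is how a mixed-order model can cover long contexts where the data supports them while keeping the total parameter count low.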