4 research outputs found

    Comparative Study of Boosting and Non-Boosting Training for Constructing Ensembles of Acoustic Models

    This paper compares the performance of Boosting and non-Boosting training algorithms in large vocabulary continuous speech recognition (LVCSR) using ensembles of acoustic models. Both algorithms yielded significant word error rate reductions on the CMU Communicator corpus, and the improvements were comparable, even though one might expect the Boosting algorithm, which has a solid theoretical foundation, to work much better than the non-Boosting algorithm. Several voting schemes for hypothesis combination were evaluated, including weighted voting, unweighted voting, and ROVER.
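The voting schemes the paper evaluates can be illustrated with a minimal sketch of word-level majority voting over hypotheses that are assumed to be already aligned (a full ROVER implementation would first align the hypotheses with dynamic programming); the function name and interface here are illustrative, not from the paper:

```python
from collections import defaultdict

def combine_hypotheses(aligned_words, weights=None):
    """Pick one word per aligned slot by (weighted) majority vote.

    aligned_words: list of per-system word sequences, already aligned
                   so each has one word per slot.
    weights: optional per-system weights; None gives unweighted voting.
    """
    n_systems = len(aligned_words)
    if weights is None:
        weights = [1.0] * n_systems  # unweighted voting
    result = []
    for slot in range(len(aligned_words[0])):
        scores = defaultdict(float)
        for sys_idx in range(n_systems):
            scores[aligned_words[sys_idx][slot]] += weights[sys_idx]
        result.append(max(scores, key=scores.get))
    return result
```

With uniform weights every system counts equally; weighting lets a more reliable system override the majority, which is the essential difference between the unweighted and weighted schemes compared in the paper.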

    Training Data Selection for Discriminative Training of Acoustic Models

    This thesis investigates various training data selection approaches for improving the minimum phone error (MPE) based discriminative training of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). First, inspired by the AdaBoost algorithm, which places more emphasis on the training samples misclassified by the already-trained classifier, the accumulated statistics of training utterances prone to incorrect recognition are adjusted accordingly during MPE training. Meanwhile, multiple speech recognition systems, with acoustic models trained under different data selection criteria, are combined at different recognition stages to improve recognition accuracy. On the other hand, a novel data selection approach conducted on the expected phone accuracy domain of the word lattices of training utterances is explored as well. It selects more discriminative training instances, in terms of either utterances or phone arcs, for better model discrimination. Moreover, this approach is further integrated with a previously proposed frame-level data selection approach, namely normalized entropy based frame-level data selection, and a frame-level phone accuracy function for improving MPE training. All experiments were performed on the Mandarin broadcast news corpus (MATBN), and the associated results initially demonstrated the feasibility of the proposed training data selection approaches.
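The AdaBoost-inspired idea of emphasizing utterances the current model gets wrong can be sketched as a per-utterance reweighting step; the function, its exponential form, and the `beta` parameter are illustrative assumptions, not the thesis's exact adjustment of MPE statistics:

```python
import math

def reweight_utterances(weights, errors, beta=1.0):
    """AdaBoost-style reweighting: utterances the current model
    recognized poorly (errors[i] close to 1) receive larger weight
    in the next round of statistics accumulation.

    weights: current per-utterance weights (assumed to sum to 1)
    errors:  per-utterance error indicator in [0, 1]
    beta:    emphasis strength (illustrative hyperparameter)
    """
    new_w = [w * math.exp(beta * e) for w, e in zip(weights, errors)]
    total = sum(new_w)  # renormalize so weights again sum to 1
    return [w / total for w in new_w]
```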

    Ensembles for sequence learning

    This thesis explores the application of ensemble methods to sequential learning tasks. The focus is on the development and critical examination of new methods, or novel applications of existing methods, with emphasis on supervised and reinforcement learning problems. In both types of problems, even after having observed a certain amount of data, we are often faced with uncertainty as to which hypothesis is correct among all the possible ones. However, many methods for both supervised and reinforcement learning problems ignore this uncertainty, in the sense that a single solution is selected out of the whole hypothesis space. Apart from the classical solution of analytical Bayesian formulations, ensemble methods offer an alternative approach to representing this uncertainty, simply by maintaining a set of alternative hypotheses. The sequential supervised problem considered is that of automatic speech recognition using hidden Markov models. Applying ensemble methods to this problem is a challenge in itself, since most such methods cannot be readily adapted to sequential learning tasks. This thesis proposes a number of different approaches for applying ensemble methods to speech recognition and develops methods for effective training of phonetic mixtures with or without access to phonetic alignment data. Furthermore, the notion of expected loss is introduced for integrating probabilistic models with the boosting approach. In some cases substantial improvements over the baseline system are obtained. In reinforcement learning problems the goal is to act in such a way as to maximise future reward in a given environment. In such problems uncertainty becomes important since neither the environment nor the distribution of rewards that result from each action is known. This thesis presents novel algorithms, grounded in theoretical considerations, for acting nearly optimally under uncertainty. Some ensemble-based representations of uncertainty (including a fully Bayesian model) are developed and tested on a few simple tasks, with performance comparable to the state of the art. The thesis also draws parallels between a proposed representation of uncertainty based on gradient estimates and "prioritised sweeping", and between the application of reinforcement learning to controlling an ensemble of classifiers and classical supervised ensemble learning methods.
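One way to read the "expected loss" notion for combining probabilistic models with boosting is to replace the hard 0/1 misclassification indicator with the expected 0/1 loss under the model's own posterior; this reading and the function below are illustrative assumptions, not the thesis's exact definition:

```python
def expected_loss(posteriors, true_label):
    """Expected 0/1 loss of a probabilistic classifier on one sample:
    1 - P(correct label). A confident correct prediction incurs near
    zero loss; a confident wrong one incurs near one.

    posteriors: dict mapping labels to posterior probabilities
    true_label: the reference label for this sample
    """
    return 1.0 - posteriors.get(true_label, 0.0)
```

Using this soft loss in a boosting-style reweighting scheme lets the ensemble account for the model's uncertainty rather than only for hard errors.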