
    Speaker recognition using adaptively boosted decision tree classifier

    In this paper, a novel approach for speaker recognition is proposed. The approach uses adaptive boosting (AdaBoost) over C4.5 decision trees for closed-set, text-dependent speaker recognition. A subset of 20 speakers, 10 male and 10 female, drawn from the YOHO speaker verification corpus is used to assess the performance of the system. Results reveal that a speaker identification accuracy of 99.5% can be achieved.
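    The core technique named in the abstract, AdaBoost over decision trees for multi-class speaker identification, can be sketched with scikit-learn. This is a minimal illustration, not the paper's implementation: scikit-learn grows CART trees rather than C4.5, and the synthetic Gaussian clusters below merely stand in for real cepstral features extracted from YOHO utterances.

    ```python
    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Synthetic stand-in for per-utterance feature vectors:
    # 20 speakers, each modeled as a Gaussian cluster in 12-dim space.
    n_speakers, dim, per_speaker = 20, 12, 50
    centers = rng.normal(scale=3.0, size=(n_speakers, dim))
    X = np.vstack([centers[s] + rng.normal(size=(per_speaker, dim))
                   for s in range(n_speakers)])
    y = np.repeat(np.arange(n_speakers), per_speaker)

    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0)

    # AdaBoost over shallow trees; each boosting round re-weights the
    # training samples the previous trees misclassified.
    clf = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=8),
        n_estimators=50, random_state=0)
    clf.fit(X_tr, y_tr)
    acc = clf.score(X_te, y_te)
    print(f"closed-set identification accuracy: {acc:.3f}")
    ```

    With real speech, the feature matrix `X` would come from a front end such as cepstral analysis; the classifier interface stays the same.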

    Training Data Selection for Discriminative Training of Acoustic Models

    This thesis investigates training data selection approaches for improving the minimum phone error (MPE) based discriminative training of acoustic models for Mandarin large vocabulary continuous speech recognition (LVCSR). First, inspired by the AdaBoost algorithm, which places more emphasis on training samples misclassified by the already-trained classifier, the accumulated statistics of training utterances prone to incorrect recognition are adjusted during MPE training. In addition, multiple speech recognition systems, whose acoustic models are trained with different data selection criteria, are combined at different recognition stages to improve recognition accuracy. A novel data selection approach operating in the expected phone accuracy domain of the word lattices of training utterances is also explored; it selects more discriminative training instances, at the level of either utterances or phone arcs, for better model discrimination. This approach is further integrated with a previously proposed frame-level data selection approach, normalized entropy based frame-level data selection, and a frame-level phone accuracy function for improving MPE training. All experiments were performed on the Mandarin broadcast news corpus (MATBN), and the results initially demonstrate the feasibility of the proposed training data selection approaches.
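    The AdaBoost-inspired idea described above, giving more weight to utterances the current models recognize poorly when accumulating MPE statistics, can be sketched as follows. The utterance names, accuracy values, and the exponential weighting with parameter `beta` are all illustrative assumptions, not the thesis's actual formulation.

    ```python
    import math

    # Hypothetical expected phone accuracies from a decoding pass over
    # the training set (1.0 = perfectly recognized); values are made up.
    phone_acc = {"utt1": 0.95, "utt2": 0.60, "utt3": 0.80, "utt4": 0.40}

    def boost_weights(accs, beta=2.0):
        """AdaBoost-style re-weighting: utterances with low expected
        phone accuracy receive exponentially larger weight when their
        statistics are accumulated for discriminative training."""
        raw = {u: math.exp(beta * (1.0 - a)) for u, a in accs.items()}
        total = sum(raw.values())
        return {u: w / total for u, w in raw.items()}

    weights = boost_weights(phone_acc)
    # The worst-recognized utterance gets the largest weight.
    print(max(weights, key=weights.get))
    ```

    The same re-weighting idea extends to finer granularities, e.g. weighting individual phone arcs in the word lattice rather than whole utterances.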