1,879 research outputs found

The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016

Authors: Ajili M.; Alegre F.; Ambikairajah E.; Aronowitz H.; Bahmaninezhad F.; Bonastre J. F.; Bousquet P. M.; Busch C.; Chng E. S.; Delgado H.; Evans N.; Fauve B.; Halonen M.; Hansen J. H.L.; Hautamäki V.; Isadskiy S.; Jin R.; Kanervisto A.; Kheder W. B.; Kinnunen T.; Larcher A.; Le Lan G.; Lee K. A.; Li H.; Li Haizhou; Lim Z. H.; Lin W. W.; Liu Gang; Ma B.; Ma J.; Mak M. W.; Matrouf D.; Nautsch A.; Nguyen T. H.; Qian Q.; Rao W.; Rathgeb C.; Rouvier M.; Saeidi R.; Sahidullah M.; Sarkar A. K.; Sethu V.; Sizov A.; Sriskandaraja K.; Stafylakis T.; Sun H.; Tan Z. H.; Thomsen D. A.L.; Todisco M.; Tzimiropoulos G.; Vestman V.; Wang G.; Wang Tianzhou; Wang Z.; Xiao X.; Xu C.; Xu H.; Xue J.; Zhang C.; Zhao Q.; Zhao T.; Zhu S.
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2017

The 2016 speaker recognition evaluation (SRE'16) is the latest edition in the series of benchmarking events conducted by the National Institute of Standards and Technology (NIST). I4U is a joint entry to SRE'16 resulting from collaboration and active exchange of information among researchers from sixteen Institutes and Universities across four continents. The joint submission and several of its 32 sub-systems were among the top-performing systems. Considerable effort was devoted to two major challenges: unlabeled training data and dataset shift from Switchboard-Mixer to the new Call My Net dataset. This paper summarizes the lessons learned and presents the shared view of the sixteen research groups on recent advances, the major paradigm shift, and the common tool chain used in speaker recognition, as witnessed in SRE'16. More importantly, we examine the question of fusing a large ensemble of sub-systems and the potential benefit of large-scale collaboration. Peer reviewed

Aaltodoc Publication Archive

VBN

Cognitive Component Analysis

Author: Feng Ling
Publication date: 01/11/2008

Online Research Database In Technology

Semi-Supervised Acoustic Model Training by Discriminative Data Selection from Multiple ASR Systems' Hypotheses

Authors: Akita Yuya; Kawahara Tatsuya; Li Sheng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/05/2016

While the performance of ASR systems depends on the size of the training data, it is very costly to prepare accurate and faithful transcripts. In this paper, we investigate a semi-supervised training scheme that takes advantage of large quantities of unlabeled video lecture archives, particularly for the deep neural network (DNN) acoustic model. In the proposed method, we obtain ASR hypotheses from complementary GMM- and DNN-based ASR systems. Then, a set of CRF-based classifiers is trained to select the correct hypotheses and verify the selected data. The proposed hypothesis combination shows higher quality than the conventional system combination method (ROVER). Moreover, compared with conventional data selection based on confidence measure scores, our method is shown to be more effective at filtering usable data. A significant improvement in ASR accuracy is achieved over the baseline system and over models trained with the conventional system combination and data selection methods.

Kyoto University Research Information Repository