Transfer Learning for Context-Aware Spoken Language Understanding
Spoken language understanding (SLU) is a key component of task-oriented
dialogue systems. SLU parses natural language user utterances into semantic
frames. Previous work has shown that incorporating context information
significantly improves SLU performance for multi-turn dialogues. However,
collecting a large-scale human-labeled multi-turn dialogue corpus for the
target domains is complex and costly. To reduce dependency on data collection
and annotation, we propose a Context Encoding Language Transformer (CELT)
model that facilitates the use of various kinds of context information for
SLU, and we explore different transfer learning approaches to the same end.
In addition to unsupervised pre-training using
large-scale general purpose unlabeled corpora, such as Wikipedia, we explore
unsupervised and supervised adaptive training approaches for transfer learning
to benefit from other in-domain and out-of-domain dialogue corpora.
Experimental results demonstrate that the proposed model, combined with these
transfer learning approaches, significantly improves SLU performance over
state-of-the-art models on two large-scale single-turn dialogue benchmarks
and one large-scale multi-turn dialogue benchmark.
Comment: 6 pages, 3 figures, ASRU201
Semi-Supervised Acoustic Model Training by Discriminative Data Selection from Multiple ASR Systems' Hypotheses
While the performance of ASR systems depends on the size of the training data, preparing accurate and faithful transcripts is very costly. In this paper, we investigate a semi-supervised training scheme that takes advantage of a large archive of unlabeled video lectures, particularly for the deep neural network (DNN) acoustic model. In the proposed method, we obtain ASR hypotheses from complementary GMM- and DNN-based ASR systems. Then, a set of CRF-based classifiers is trained to select the correct hypotheses and verify the selected data. The proposed hypothesis combination yields higher quality than the conventional system combination method (ROVER). Moreover, our method proves more effective at filtering usable data than conventional data selection based on confidence measure scores. ASR accuracy improves significantly over the baseline system and over models trained with the conventional system combination and data selection methods.
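For orientation, the conventional baseline mentioned above, data selection by confidence score, can be sketched in a few lines. This is only that baseline under assumed inputs, not the CRF-based selection the abstract proposes; the `Hypothesis` type, the `select_by_confidence` function, the threshold value, and the sample data are hypothetical.

```python
from typing import List, NamedTuple


class Hypothesis(NamedTuple):
    """Hypothetical record for one automatically transcribed utterance."""
    utterance_id: str
    text: str
    confidence: float  # assumed to lie in [0, 1]


def select_by_confidence(hyps: List[Hypothesis], threshold: float = 0.9) -> List[Hypothesis]:
    """Keep hypotheses whose confidence meets the threshold (baseline selection)."""
    return [h for h in hyps if h.confidence >= threshold]


# Made-up hypotheses for illustration only.
hyps = [
    Hypothesis("utt1", "today we discuss acoustic models", 0.95),
    Hypothesis("utt2", "the the uh model is", 0.40),
    Hypothesis("utt3", "deep neural networks are trained", 0.92),
]
selected = select_by_confidence(hyps, threshold=0.9)
print([h.utterance_id for h in selected])
```

The selected utterances and their automatic transcripts would then be added to the acoustic model training data; the choice of threshold trades off the amount of data against transcript quality.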