Deep Cascade Multi-task Learning for Slot Filling in Online Shopping Assistant
Slot filling is a critical task in natural language understanding (NLU) for
dialog systems. State-of-the-art approaches treat it as a sequence labeling
problem and adopt models such as BiLSTM-CRF. While these models work relatively
well on standard benchmark datasets, they face challenges in the context of
E-commerce where the slot labels are more informative and carry richer
expressions. In this work, inspired by the unique structure of the E-commerce
knowledge base, we propose a novel multi-task model with cascade and residual
connections, which jointly learns segment tagging, named entity tagging and
slot filling. Experiments show the effectiveness of the proposed cascade and
residual structures. Our model has a 14.6% advantage in F1 score over the
strong baseline methods on a new Chinese E-commerce shopping assistant dataset,
while achieving competitive accuracies on a standard dataset. Furthermore, an
online test deployed on a dominant E-commerce platform shows a 130%
improvement in the accuracy of understanding user utterances. Our model has already
gone into production on the E-commerce platform. Comment: AAAI 201
TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies
This paper presents extended work on the trilingual spoken language translation corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC), namely TriECCC. TriECCC is a simultaneous spoken language translation corpus with parallel speech and text resources in three languages: Khmer, English, and French. The corpus has approximately [Formula: see text] thousand utterances, approximately [Formula: see text], [Formula: see text], and [Formula: see text] h of speech, and [Formula: see text], [Formula: see text] and [Formula: see text] million words of text, in Khmer, English, and French, respectively. We first report baseline results for machine translation (MT) and speech translation (ST) systems, which show reasonable performance. We then investigate using the ROVER method to combine multiple MT outputs, and fine-tuning pre-trained English–French MT models to enhance the Khmer MT systems. Experimental results show that ROVER is effective for combining English-to-Khmer and French-to-Khmer systems. Fine-tuning from both single and multiple parents yields effective improvements in BLEU scores for Khmer-to-English/French and English/French-to-Khmer MT systems.