Bootstrapping Multilingual Intent Models via Machine Translation for Dialog Automation
With the resurgence of chat-based dialog systems in consumer and enterprise
applications, there has been much success in developing data-driven and
rule-based natural language models to understand human intent. Since these
models require large amounts of data and in-domain knowledge, expanding an
equivalent service into new markets is hindered by language barriers that
inhibit dialog automation.
This paper presents a user study to evaluate the utility of out-of-the-box
machine translation technology to (1) rapidly bootstrap multilingual spoken
dialog systems and (2) enable existing human analysts to understand foreign
language utterances. We additionally evaluate the utility of machine
translation in human assisted environments, where a portion of the traffic is
processed by analysts. In English->Spanish experiments, we observe a high
potential for dialog automation, as well as the potential for human analysts to
process foreign language utterances with high accuracy.

Comment: 6 pages, 3 figures, accepted for publication at the 2018 European Association for Machine Translation Conference (EAMT 2018)
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop
We summarize the accomplishments of a multi-disciplinary workshop exploring
the computational and scientific issues surrounding the discovery of linguistic
units (subwords and words) in a language without orthography. We study the
replacement of orthographic transcriptions by images and/or translated text in
a well-resourced language to help unsupervised discovery from raw speech.

Comment: Accepted to ICASSP 2018
Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018
This paper describes FBK's submission to the end-to-end English-German speech
translation task at IWSLT 2018. Our system relies on a state-of-the-art model
based on LSTMs and CNNs, where the CNNs are used to reduce the temporal
dimension of the audio input, which is in general much higher than machine
translation input. Our model was trained only on the audio-to-text parallel
data released for the task, and fine-tuned on cleaned subsets of the original
training corpus. The addition of weight normalization and label smoothing
improved the baseline system by 1.0 BLEU point on our validation set. The final
submission also featured checkpoint averaging within a training run and
ensemble decoding of models trained during multiple runs. On test data, our
best single model obtained a BLEU score of 9.7, while the ensemble obtained a
BLEU score of 10.24.

Comment: 6 pages, 2 figures, system description at the 15th International Workshop on Spoken Language Translation (IWSLT) 2018
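The checkpoint averaging mentioned in the FBK submission is a simple technique: the parameters saved at the last few checkpoints of one training run are averaged element-wise, which often smooths out noise from the final optimization steps. A minimal sketch, assuming checkpoints are plain dicts mapping parameter names to lists of floats (a real system would average framework tensors of identical shapes in the same way):

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of parameter values across checkpoints.

    Each checkpoint is a dict: parameter name -> list of float values.
    All checkpoints must share the same parameter names and shapes.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged = {}
    for name in checkpoints[0]:
        # Collect this parameter from every checkpoint, then average
        # position by position.
        vectors = [ckpt[name] for ckpt in checkpoints]
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return averaged

# Toy example: three checkpoints of a two-parameter model.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [3.0]},
    {"w": [5.0, 6.0], "b": [6.0]},
]
print(average_checkpoints(ckpts))  # {'w': [3.0, 4.0], 'b': [3.0]}
```

This is distinct from the ensemble decoding also mentioned in the abstract, where several independently trained models are kept separate and their output distributions are combined at inference time rather than their weights.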