Neural System Combination for Machine Translation
Neural machine translation (NMT) has emerged as a new approach to machine
translation and generates much more fluent results than statistical machine
translation (SMT).
However, SMT is usually better than NMT in translation adequacy. It is
therefore a promising direction to combine the advantages of both NMT and SMT.
In this paper, we propose a neural system combination framework leveraging
multi-source NMT, which takes as input the outputs of NMT and SMT systems and
produces the final translation.
Extensive experiments on the Chinese-to-English translation task show that
our model achieves a significant improvement of 5.3 BLEU points over the best
single system output and 3.4 BLEU points over the state-of-the-art traditional
system combination methods.
Comment: Accepted as a short paper by ACL 2017
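The multi-source setup described above can be pictured with a small sketch. Below is a minimal PyTorch illustration, assuming the NMT and SMT hypotheses are each encoded separately and a single decoder attends to both encodings; all module names, sizes, and the fusion-by-concatenation choice are hypothetical, and the paper's exact architecture may differ.

    import torch
    import torch.nn as nn

    class MultiSourceCombiner(nn.Module):
        """Hypothetical multi-source combiner: one encoder per system
        output, one decoder attending to both (a sketch, not the
        paper's exact architecture)."""
        def __init__(self, vocab_size, d_model=256, n_heads=4):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.enc_nmt = nn.GRU(d_model, d_model, batch_first=True)
            self.enc_smt = nn.GRU(d_model, d_model, batch_first=True)
            self.dec = nn.GRU(d_model, d_model, batch_first=True)
            self.attn_nmt = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.attn_smt = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.out = nn.Linear(2 * d_model, vocab_size)

        def forward(self, nmt_hyp, smt_hyp, prev_tokens):
            h_nmt, _ = self.enc_nmt(self.embed(nmt_hyp))  # encode NMT output
            h_smt, _ = self.enc_smt(self.embed(smt_hyp))  # encode SMT output
            d, _ = self.dec(self.embed(prev_tokens))      # decoder states
            # Attend to each system's encoding, then fuse by concatenation.
            c_nmt, _ = self.attn_nmt(d, h_nmt, h_nmt)
            c_smt, _ = self.attn_smt(d, h_smt, h_smt)
            return self.out(torch.cat([c_nmt, c_smt], dim=-1))  # token logits

Given batched token-id tensors, logits = MultiSourceCombiner(32000)(nmt_ids, smt_ids, prev_ids) would yield per-step vocabulary scores for the combined translation.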
Consecutive Decoding for Speech-to-text Translation
Speech-to-text translation (ST), which directly translates the source
language speech to the target language text, has attracted intensive attention
recently. However, combining speech recognition and machine translation in a
single model places a heavy burden on the direct cross-modal, cross-lingual
mapping. To reduce the learning difficulty, we propose
COnSecutive Transcription and Translation (COSTT), an integral approach for
speech-to-text translation. The key idea is to generate the source transcript
and the target translation with a single decoder. This benefits model
training, as large additional parallel text corpora can be fully exploited to
enhance speech translation training. Our method is verified on three mainstream
datasets, including Augmented LibriSpeech English-French dataset, TED
English-German dataset, and TED English-Chinese dataset. Experiments show that
our proposed COSTT outperforms the previous state-of-the-art methods. The code
is available at https://github.com/dqqcasia/st.
Comment: Accepted by AAAI 2021. arXiv admin note: text overlap with arXiv:2009.0970
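The consecutive-decoding idea can be made concrete with a small sketch: the single decoder is trained on a target sequence that is the source transcript followed by the translation. A minimal Python sketch follows, assuming hypothetical IDs for the separator and end-of-sequence tokens; COSTT's exact target construction may differ.

    import torch
    import torch.nn.functional as F

    SEP_ID, EOS_ID = 3, 2  # hypothetical special-token ids

    def make_consecutive_target(transcript_ids, translation_ids):
        """Build the single-decoder target: transcript, then a separator,
        then the translation, so one decoder learns both tasks in order."""
        return torch.cat([
            transcript_ids,          # source transcript first
            torch.tensor([SEP_ID]),  # boundary between the two tasks
            translation_ids,         # target translation second
            torch.tensor([EOS_ID]),
        ])

    # Training then uses ordinary cross-entropy over the whole sequence,
    # e.g. F.cross_entropy(logits.view(-1, vocab_size), target.view(-1)).

Because the target is plain text on both sides of the separator, extra parallel text corpora (transcript/translation pairs without audio) can supplement the decoder's training, which is the benefit the abstract points to.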
"Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation
An end-to-end speech-to-text translation (ST) model takes audio in a source
language and outputs text in a target language. Existing methods are
limited by the amount of parallel data. Can we build a system that fully
utilizes the signals in a parallel ST corpus? We are inspired by the human
understanding system, which is composed of auditory perception and cognitive
processing. In this paper, we propose Listen-Understand-Translate (LUT), a unified framework
with triple supervision signals to decouple the end-to-end speech-to-text
translation task. LUT guides the acoustic encoder to extract as much
information as possible from the auditory input. In addition, LUT utilizes a pre-trained
BERT model to enforce the upper encoder to produce as much semantic information
as possible, without extra data. We perform experiments on a diverse set of
speech translation benchmarks, including LibriSpeech English-French, IWSLT
English-German, and TED English-Chinese. Our results demonstrate that LUT
achieves state-of-the-art performance, outperforming previous methods. The
code is available at https://github.com/dqqcasia/st.
Comment: Accepted by AAAI 2021
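The triple supervision can be sketched as three losses trained jointly. Below is a minimal PyTorch sketch, assuming a CTC loss ties the acoustic encoder to the transcript, an MSE loss pulls the semantic encoder's states toward frozen BERT embeddings of the transcript, and cross-entropy trains the translation decoder; the loss weights, the MSE distance, and the BERT checkpoint name are hypothetical choices, not necessarily the paper's.

    import torch
    import torch.nn.functional as F
    from transformers import BertModel

    # Frozen BERT acts only as a teacher for the semantic encoder.
    bert = BertModel.from_pretrained("bert-base-uncased").eval()
    for p in bert.parameters():
        p.requires_grad = False

    def triple_loss(ctc_log_probs, transcript_ids, input_lens, transcript_lens,
                    semantic_states, dec_logits, translation_ids,
                    w_ctc=1.0, w_sem=1.0, w_mt=1.0):
        # 1) "Listen": CTC aligns the acoustic encoder's output
        #    (ctc_log_probs, shape (T, N, C)) with the transcript.
        l_ctc = F.ctc_loss(ctc_log_probs, transcript_ids,
                           input_lens, transcript_lens)
        # 2) "Understand": match BERT's contextual embeddings of the
        #    transcript (semantic_states must share BERT's hidden size
        #    and sequence length in this sketch).
        with torch.no_grad():
            teacher = bert(transcript_ids).last_hidden_state
        l_sem = F.mse_loss(semantic_states, teacher)
        # 3) "Translate": standard cross-entropy on the decoder's logits,
        #    with dec_logits of shape (N, L, vocab).
        l_mt = F.cross_entropy(dec_logits.transpose(1, 2), translation_ids)
        return w_ctc * l_ctc + w_sem * l_sem + w_mt * l_mt

Since BERT is used only as a frozen teacher on the transcripts already present in the ST corpus, this style of supervision adds semantic guidance without requiring any extra data, which is the abstract's central claim.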