Search CORE

38 research outputs found

MetaMT, a MetaLearning Method Leveraging Multiple Domain Data for Low Resource Machine Translation

Authors: Li Rumeng, Wang Xun, Yu Hong
Publication date: 11/12/2019

Manipulating training data leads to robust neural models for MT

arXiv.org e-Print Archive

PubMed Central

Association for the Advancement of Artificial Intelligence: AAAI Publications

Sequence to Sequence Mixture Model for Diverse Machine Translation

Authors: Haffari Gholamreza, He Xuanli, Norouzi Mohammad
Publication date: 01/01/2018

Sequence to sequence (SEQ2SEQ) models often lack diversity in their generated translations. This can be attributed to the limitation of SEQ2SEQ models in capturing lexical and syntactic variations in a parallel corpus resulting from different styles, genres, topics, or ambiguity of the translation process. In this paper, we develop a novel sequence to sequence mixture (S2SMIX) model that improves both translation diversity and quality by adopting a committee of specialized translation models rather than a single translation model. Each mixture component selects its own training dataset via optimization of the marginal log-likelihood, which leads to a soft clustering of the parallel corpus. Experiments on four language pairs demonstrate the superiority of our mixture model compared to a SEQ2SEQ baseline with standard or diversity-boosted beam search. Our mixture model uses negligible additional parameters and incurs no extra computation cost during decoding.
Comment: 11 pages, 5 figures, accepted to CoNLL201
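
The soft clustering mentioned in this abstract can be illustrated with a generic mixture-model calculation. The sketch below is not the S2SMIX implementation. It assumes each component k supplies a log-probability log p_k(y | x) for a sentence pair, and it assumes a uniform prior over components unless one is given. The names `log_sum_exp` and `mixture_posterior` are hypothetical.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def mixture_posterior(component_log_probs, log_priors=None):
    """Return (marginal log-likelihood, responsibilities) for one sentence pair.

    component_log_probs[k] stands for log p_k(y | x) under component k.
    Assumption: a uniform prior over components when log_priors is None.
    """
    k = len(component_log_probs)
    if log_priors is None:
        log_priors = [-math.log(k)] * k
    # Joint log-probability of (component k, translation y) given x.
    joint = [lp + ll for lp, ll in zip(log_priors, component_log_probs)]
    marginal = log_sum_exp(joint)
    # Responsibilities: how strongly each component "claims" this pair.
    responsibilities = [math.exp(j - marginal) for j in joint]
    return marginal, responsibilities

# Two components scoring the same sentence pair with probabilities 0.2 and 0.6.
scores = [math.log(0.2), math.log(0.6)]
marginal, resp = mixture_posterior(scores)
# resp is approximately [0.25, 0.75]: the pair leans toward the second component.
```

Maximizing the marginal term over a corpus weights each component's update by its responsibility, which is one common way a soft assignment of sentence pairs to components arises.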

arXiv.org e-Print Archive

Crossref

Monash University Research Portal

Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation

Authors: Sumita Eiichiro, Utiyama Masao, Wang Rui
Publication date: 01/01/2018

Traditional neural machine translation (NMT) involves a fixed training procedure where each sentence is sampled once during each epoch. In reality, some sentences are well learned during the initial few epochs; however, with this approach, the well-learned sentences continue to be trained along with the sentences that were not well learned for 10-30 epochs, which wastes time. Here, we propose an efficient method to dynamically sample the sentences in order to accelerate NMT training. In this approach, a weight is assigned to each sentence based on the measured difference between the training costs of two iterations. Further, in each epoch, a certain percentage of sentences are dynamically sampled according to their weights. Empirical results on the NIST Chinese-to-English and the WMT English-to-German tasks show that the proposed method can significantly accelerate NMT training and improve NMT performance.
Comment: Revised version of ACL-201
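
The weighting-and-sampling loop described in this abstract can be sketched as follows. The abstract does not give the exact formula, so this is an illustration under assumptions: the weight is taken to be the relative drop in a sentence's cost between two iterations (a larger drop means a larger weight), a small floor keeps every sentence selectable, and sampling is done without replacement. The names `cost_change_weights` and `sample_sentences` are hypothetical.

```python
import math
import random

def cost_change_weights(prev_costs, curr_costs, floor=1e-6):
    """Assumed weighting: relative cost drop between two iterations, normalized."""
    raw = []
    for prev, curr in zip(prev_costs, curr_costs):
        change = (prev - curr) / prev if prev > 0 else 0.0
        # Sentences whose cost did not drop get only the small floor weight.
        raw.append(max(change, floor))
    total = sum(raw)
    return [r / total for r in raw]

def sample_sentences(weights, fraction, rng):
    """Pick a fraction of sentence indices without replacement, biased by weight."""
    k = max(1, int(fraction * len(weights)))
    # Weighted sampling without replacement (Efraimidis-Spirakis keys, in log form):
    # a higher weight makes a key closer to zero, and the k largest keys win.
    keys = [(math.log(1.0 - rng.random()) / w, i) for i, w in enumerate(weights)]
    keys.sort(reverse=True)
    return sorted(i for _, i in keys[:k])

prev_costs = [4.0, 4.0, 4.0, 4.0]
curr_costs = [2.0, 4.0, 3.0, 5.0]
weights = cost_change_weights(prev_costs, curr_costs)
chosen = sample_sentences(weights, fraction=0.5, rng=random.Random(0))
```

Here sentences 0 and 2 are still improving, so they receive almost all of the weight and are the most likely to be selected for the next epoch.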

arXiv.org e-Print Archive

Crossref