104,898 research outputs found

Corpus Augmentation by Sentence Segmentation for Low-Resource Neural Machine Translation

Author: Matsumoto Tadahiro
Zhang Jinyi
Publication venue: 'MDPI AG'
Publication date: 01/05/2019

Neural Machine Translation (NMT) has been proven to achieve impressive results. The translation results of an NMT system depend strongly on the size and quality of its parallel corpora. Nevertheless, for many language pairs, no resource-rich parallel corpora exist. In this paper, we propose a corpus augmentation method that segments long sentences in a corpus using back-translation and generates pseudo-parallel sentence pairs. Experimental results for Japanese-Chinese and Chinese-Japanese translation with the Japanese-Chinese scientific paper excerpt corpus (ASPEC-JC) show that the method improves translation performance. Comment: 4 pages. The version before Applied. Science

arXiv.org e-Print Archive

Directory of Open Access Journals

Chinese–Spanish neural machine translation enhanced with character and word bitmap fonts

Author: A Lavie
D Chiang
David Aldón
JL Fleiss
José A. R. Fonollosa
Marta R. Costa-jussà
MR Costa-jussà
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Recently, machine translation systems based on neural networks have reached state-of-the-art results for some pairs of languages (e.g., German–English). In this paper, we investigate the performance of neural machine translation in Chinese–Spanish, which is a challenging language pair. Given that the meaning of a Chinese word can be related to its graphical representation, this work aims to enhance neural machine translation by using as input a combination of words or characters and their corresponding bitmap fonts. Interpreting every word or character as a bitmap font yields more informed vector representations. The best results are obtained when using words plus their bitmap fonts, which gives an improvement (over a competitive neural MT baseline system) of almost six BLEU points and five METEOR points, and is consistently ranked better in the human evaluation. Peer Reviewed. Postprint (published version)

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Neural machine translation using bitmap fonts

Author: Aldón Mínguez David
Rodríguez Fonollosa José Adrián
Ruiz Costa-Jussà Marta
Publication date: 01/01/2016

Recently, translation systems based on neural networks have started to compete with phrase-based systems. Systems based on neural networks use vector representations of words. However, one of the biggest challenges that machine translation still faces is dealing with large vocabularies and morphologically rich languages. This work aims to adapt a neural machine translation system to translate from Chinese to Spanish, using as input different types of granularity: words, characters, and bitmap fonts of Chinese characters or words. Interpreting every character or word as a bitmap font allows for obtaining more informed vector representations. The best results are obtained when using the information of the word bitmap font. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Neural System Combination for Machine Translation

Author: Hu Wenpeng
Zhang Jiajun
Zhou Long
Zong Chengqing
Publication date: 01/01/2017

Neural machine translation (NMT) has become a new approach to machine translation and generates much more fluent results than statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. Extensive experiments on the Chinese-to-English translation task show that our model achieves a significant improvement of 5.3 BLEU points over the best single system output and 3.4 BLEU points over state-of-the-art traditional system combination methods. Comment: Accepted as a short paper by ACL-201

arXiv.org e-Print Archive

Crossref

A Client mobile application for Chinese-Spanish statistical machine translation

Author: Banchs Martínez Rafael Enrique
Centelles Jordi
Ruiz Costa-Jussà Marta
Publication date: 01/01/2014

This show-and-tell paper describes a client mobile application for Chinese-Spanish machine translation. The system combines a standard server-based statistical machine translation (SMT) system, which requires online operation, with different input modalities including text, optical character recognition (OCR) and automatic speech recognition (ASR). It also includes an index-based search engine for supporting off-line translation. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Supervised Attentions for Neural Machine Translation

Author: Ittycheriah Abe
Mi Haitao
Wang Zhiguo
Publication date: 01/01/2016

In this paper, we improve the attention or alignment accuracy of neural machine translation by utilizing the alignments of training sentence pairs. We simply compute the distance between the machine attentions and the "true" alignments, and minimize this cost in the training procedure. Our experiments on a large-scale Chinese-to-English task show that our model significantly improves both translation and alignment quality over the large-vocabulary neural machine translation system, and even beats a state-of-the-art traditional syntax-based system. Comment: 6 pages. In Proceedings of EMNLP 2016. arXiv admin note: text overlap with arXiv:1605.0314
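The cost described in this abstract can be illustrated with a minimal sketch. The abstract does not say which distance is used, so the squared Euclidean distance below is an assumption, as are the function names (`normalize_rows`, `attention_alignment_cost`, `supervised_loss`) and the weighting parameter `weight`; none of them come from the paper.

```python
def normalize_rows(links):
    """Turn 0/1 alignment links (target rows, source columns) into
    per-target-word distributions. Rows without links stay all zero."""
    normalized = []
    for row in links:
        total = sum(row)
        if total:
            normalized.append([value / total for value in row])
        else:
            normalized.append([0.0 for _ in row])
    return normalized


def attention_alignment_cost(attention, alignment):
    """Half the summed squared difference between the model's attention
    matrix and the reference alignment matrix (assumed distance)."""
    if len(attention) != len(alignment):
        raise ValueError("matrices must have the same number of rows")
    cost = 0.0
    for att_row, ref_row in zip(attention, alignment):
        if len(att_row) != len(ref_row):
            raise ValueError("rows must have the same length")
        cost += sum((a - r) ** 2 for a, r in zip(att_row, ref_row))
    return 0.5 * cost


def supervised_loss(translation_loss, alignment_cost, weight=1.0):
    """Joint objective: translation loss plus the weighted alignment cost.
    The additive form and `weight` are illustrative assumptions."""
    return translation_loss + weight * alignment_cost


# Toy example: one target word attending over two source words.
reference = normalize_rows([[1, 0]])
cost = attention_alignment_cost([[0.5, 0.5]], reference)
total = supervised_loss(2.0, cost, weight=0.5)
```

In a real system the attention matrix would come from the NMT model and the cost would be minimized by gradient descent together with the translation loss; this sketch only shows how such a cost could be computed for one sentence pair.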

arXiv.org e-Print Archive

Crossref