    Corpus Augmentation by Sentence Segmentation for Low-Resource Neural Machine Translation

    Neural Machine Translation (NMT) has been proven to achieve impressive results. The translation quality of an NMT system depends strongly on the size and quality of its parallel corpora. Nevertheless, for many language pairs, no resource-rich parallel corpora exist. In this paper, we propose a corpus augmentation method that segments long sentences in a corpus using back-translation and generates pseudo-parallel sentence pairs. Experimental results for Japanese-Chinese and Chinese-Japanese translation on the Japanese-Chinese scientific paper excerpt corpus (ASPEC-JC) show that the method improves translation performance. Comment: 4 pages. The version before Applied. Science

    Chinese–Spanish neural machine translation enhanced with character and word bitmap fonts

    Recently, machine translation systems based on neural networks have reached state-of-the-art results for some language pairs (e.g., German–English). In this paper, we investigate the performance of neural machine translation for Chinese–Spanish, which is a challenging language pair. Given that the meaning of a Chinese word can be related to its graphical representation, this work aims to enhance neural machine translation by using as input a combination of words or characters and their corresponding bitmap fonts. Interpreting every word or character as a bitmap font yields more informed vector representations. The best results are obtained when using words plus their bitmap fonts, improving over a competitive neural MT baseline system by almost six BLEU points and five METEOR points, and ranking coherently better in the human evaluation. Peer Reviewed. Postprint (published version)

    Neural machine translation using bitmap fonts

    Recently, translation systems based on neural networks have started to compete with phrase-based systems. Systems based on neural networks use vector representations of words. However, one of the biggest challenges that machine translation still faces is dealing with large vocabularies and morphologically rich languages. This work aims to adapt a neural machine translation system to translate from Chinese to Spanish, using inputs of different granularity: words, characters, and bitmap fonts of Chinese characters or words. Interpreting every character or word as a bitmap font yields more informed vector representations. The best results are obtained when using the information from the word bitmap font. Postprint (published version)

    Neural System Combination for Machine Translation

    Neural machine translation (NMT) has become a new approach to machine translation and generates much more fluent results than statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy. It is therefore a promising direction to combine the advantages of both NMT and SMT. In this paper, we propose a neural system combination framework leveraging multi-source NMT, which takes as input the outputs of NMT and SMT systems and produces the final translation. Extensive experiments on the Chinese-to-English translation task show that our model achieves a significant improvement of 5.3 BLEU points over the best single system output and 3.4 BLEU points over the state-of-the-art traditional system combination methods. Comment: Accepted as a short paper by ACL-201

    A Client mobile application for Chinese-Spanish statistical machine translation

    This show-and-tell paper describes a client mobile application for Chinese-Spanish machine translation. The system combines a standard server-based statistical machine translation (SMT) system, which requires online operation, with different input modalities including text, optical character recognition (OCR) and automatic speech recognition (ASR). It also includes an index-based search engine to support off-line translation. Postprint (published version)

    Supervised Attentions for Neural Machine Translation

    In this paper, we improve the attention or alignment accuracy of neural machine translation by utilizing the alignments of training sentence pairs. We simply compute the distance between the machine attentions and the "true" alignments, and minimize this cost in the training procedure. Our experiments on a large-scale Chinese-to-English task show that our model significantly improves both translation and alignment quality over the large-vocabulary neural machine translation system, and even beats a state-of-the-art traditional syntax-based system. Comment: 6 pages. In Proceedings of EMNLP 2016. arXiv admin note: text overlap with arXiv:1605.0314
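The cost described in the last abstract above (a distance between the model's attention weights and reference alignments) might look roughly like the sketch below. The abstract does not say which distance is used, so the mean squared difference here is an assumption, and the function name `attention_alignment_cost` and the example matrices are hypothetical.

```python
def attention_alignment_cost(attention, alignment):
    """Mean squared difference between attention weights and reference alignments.

    Both arguments are matrices given as lists of rows, one row per target
    word and one column per source word. The choice of squared distance is
    an assumption, not taken from the paper.
    """
    total = 0.0
    count = 0
    for att_row, ref_row in zip(attention, alignment):
        for att, ref in zip(att_row, ref_row):
            total += (att - ref) ** 2
            count += 1
    return total / count


# Hypothetical two-word example: rows are target words, columns are source words.
attention = [[0.7, 0.3], [0.2, 0.8]]   # soft attention produced by the model
alignment = [[1.0, 0.0], [0.0, 1.0]]   # reference ("true") word alignment
cost = attention_alignment_cost(attention, alignment)
```

In training, a cost of this kind would typically be added to the usual translation loss with some weighting, so that minimizing the total also pulls the attention toward the reference alignments.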