4 research outputs found

    Neural pre-translation for hybrid machine translation

    Hybrid machine translation (HMT) takes advantage of different types of machine translation (MT) systems to improve translation performance. Neural machine translation (NMT) can produce more fluent translations while phrase-based statistical machine translation (PB-SMT) can produce adequate results primarily due to the contribution of the translation model. In this paper, we propose a cascaded hybrid framework to combine NMT and PB-SMT to improve translation quality. Specifically, we first use the trained NMT system to pre-translate the training data, and then employ the pre-translated training data to build an SMT system and tune parameters using the pre-translated development set. Finally, the SMT system is utilised as a post-processing step to re-decode the pre-translated test set and produce the final result. Experiments conducted on Japanese→English and Chinese→English show that the proposed cascaded hybrid framework can significantly improve performance by 2.38 BLEU points and 4.22 BLEU points, respectively, compared to the baseline NMT system.
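The cascade described in this abstract (pre-translate with NMT, train a second system on the pre-translated data, then re-decode the pre-translated test set) can be illustrated with a minimal sketch. Everything here is a toy assumption and not the paper's implementation: `toy_nmt` is a hypothetical word-by-word lookup standing in for a trained NMT system, `train_word_corrector` is a hypothetical word-level substitution table standing in for PB-SMT training, and the tuning step on the development set is omitted.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List

Translator = Callable[[str], str]


def pretranslate(nmt: Translator, sources: List[str]) -> List[str]:
    """Step 1: run the (already trained) NMT system over source sentences."""
    return [nmt(s) for s in sources]


def train_word_corrector(pretranslated: List[str], references: List[str]) -> Dict[str, str]:
    """Step 2 (toy stand-in for SMT training): learn, for each word in the
    NMT output, the reference word it most often lines up with. Only sentence
    pairs of equal length are used, aligned position by position."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for hyp, ref in zip(pretranslated, references):
        hyp_words, ref_words = hyp.split(), ref.split()
        if len(hyp_words) != len(ref_words):
            continue  # a real aligner would handle this; the toy one skips it
        for hyp_word, ref_word in zip(hyp_words, ref_words):
            counts[hyp_word][ref_word] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def redecode(table: Dict[str, str], sentence: str) -> str:
    """Step 3: post-process one pre-translated sentence with the second system."""
    return " ".join(table.get(word, word) for word in sentence.split())


def cascade(nmt: Translator, train_src: List[str], train_ref: List[str],
            test_src: List[str]) -> List[str]:
    """Full cascade: NMT pre-translation, then re-decoding by the second system."""
    table = train_word_corrector(pretranslate(nmt, train_src), train_ref)
    return [redecode(table, s) for s in pretranslate(nmt, test_src)]


# Hypothetical NMT stand-in that makes one systematic lexical error ("hound").
_TOY_LEXICON = {"inu": "hound", "neko": "cat"}


def toy_nmt(sentence: str) -> str:
    return " ".join(_TOY_LEXICON.get(w, w) for w in sentence.split())


if __name__ == "__main__":
    print(cascade(toy_nmt, ["inu neko"], ["dog cat"], ["inu"]))
```

The point of the sketch is the data flow: the second system never sees the original source language, only NMT output paired with references, so it learns to correct systematic NMT errors.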

    Korean Sentence Complexity Reduction for Machine Translation

    Master's thesis, Seoul National University Graduate School, Department of Linguistics, February 2017. Shin Hyopil (신효필). Text simplification used as a preprocessing task for the improved functionality of natural language processing systems has a long history of research based on European languages, yet there is no research that has utilized Korean as the object of study. There is great demand for comprehensible Korean-to-English machine translation, but due to the disparate nature of these two languages, machine translation often fails to achieve fluent results. In order to improve the translation quality of Korean text as the source language, the first-ever rule-based Korean complexity reduction system was designed, constructed, and implemented in this study. This system was achieved by a unique technique termed "phrase-grouping and generalization of nuance structures" in Korean as a disambiguation tool. This technique has potential applications in all languages and additional natural language processing tasks. On top of this, in order to set a foundation for which complexity reduction operations and combinations generate fluent Korean and improved machine translation output, a unique factorial approach to simplification generation was also implemented. In order to assess the output of the system proposed in the current research, simplified Korean text was evaluated by Korean native speakers and, in parallel, translations were evaluated by English native speakers. The translation systems used in this study were Google Translate and Moses, both statistical machine translation systems, and Naver Translate, a neural machine translation system. This is the first research to conduct experiments on the interaction of text simplification and neural networks. Additionally, no known research has analyzed output from three machine translation systems simultaneously.
    Generally, the proposed system generated relatively fluent Korean, though due to the factorial nature by which simplifications were generated, sentence quality usually began to deteriorate after more than one simplification operation. On the other hand, the proposed system as a preprocessing task for machine translation consistently improved translation quality for all three systems utilized in this study when up to two simplifications were performed. In the case of the statistical machine translation systems used in this study, more than two simplifications deteriorated not only Korean sentence quality, but also translation quality. However, in the case of Naver Translate, the neural machine translation system used in this study, even three simplifications resulted in translation improvement according to the evaluators. This study, then, emphasizes the need for more research conducted on text simplification as the field of natural language processing transitions to neural network-based approaches and applications.

    Contents:
    1. Introduction: Text Simplification
        1.1 Korean Text Simplification
            1.1.1 Korean Sentence Complexity Reduction
        1.2 Research Objectives
        1.2 Research Outline
    2. Literature Review
        2.1 Text Simplification
        2.2 Machine Translation
            2.2.1 Rule-based Machine Translation
            2.2.2 Statistical Machine Translation
            2.2.3 Neural Machine Translation
            2.2.4 Hybrid Machine Translation
        2.3 Text Simplification for Machine Translation
        2.4 Automated Asian Text Simplification
            2.4.1 Automated Japanese Simplification
            2.4.2 Automated Korean Simplification
    3. Samsung Machine Translation Corpus
        3.1 Corpus Description
        3.2 Corpus Issues
            3.2.1 Korean Issues
            3.2.1 English Issues
    4. Korean Sentence Complexity Reduction System
        4.1 Rule Creation and Description
            4.1.1 Korean Coordination Simplification
            4.1.2 Contrastive Coordination Simplification
            4.1.3 Indirect Sentence Reduction
            4.1.4 Gerund Reduction
            4.1.5 Cause and Effect Reduction
        4.2 Factorial Reduction
        4.3 Phrase-grouping and Generalization
        4.4 System Coverage
        4.5 System Architecture
        4.6 System Evaluation
    5. Pilot Study
        5.1 Methodology
        5.2 Simplification Results
        5.3 Simplification + Machine Translation Results
        5.4 Pilot Study Discussion
    6. Experiment
        6.1 Methodology
        6.2 Simplification Results
        6.3 Simplification + Machine Translation Results
        6.4 Naver Translate
        6.5 Experiment Discussion
    7. Conclusion
    References
    Abstract in Korean
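The abstract does not spell out how the "factorial approach to simplification generation" works. One plausible reading, sketched below purely as an assumption, is that every non-empty combination of applicable simplification rules is applied to a sentence, so that variants can be compared by how many operations were performed. The rule names echo section titles from the thesis outline, but the rule bodies are hypothetical English string rewrites, not the thesis's Korean rules.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Rule = Callable[[str], str]

# Hypothetical toy rules on English text (assumption: the real system uses
# rule-based rewrites of Korean sentence structure).
TOY_RULES: Dict[str, Rule] = {
    "coordination": lambda s: s.replace(", and ", ". "),
    "cause_effect": lambda s: s.replace(" because ", ". This is because "),
}


def generate_variants(sentence: str, rules: Dict[str, Rule]) -> List[Tuple[Tuple[str, ...], str]]:
    """Apply every non-empty combination of the rules that change the sentence.

    Returns (rule_names, simplified_sentence) pairs. Rules within one
    combination are applied in the order they appear in `rules`.
    """
    applicable = [name for name, rule in rules.items() if rule(sentence) != sentence]
    variants: List[Tuple[Tuple[str, ...], str]] = []
    for size in range(1, len(applicable) + 1):
        for combo in combinations(applicable, size):
            output = sentence
            for name in combo:
                output = rules[name](output)
            variants.append((combo, output))
    return variants


def group_by_operation_count(variants: List[Tuple[Tuple[str, ...], str]]) -> Dict[int, List[str]]:
    """Bucket variants by the number of simplification operations performed."""
    grouped: Dict[int, List[str]] = {}
    for combo, output in variants:
        grouped.setdefault(len(combo), []).append(output)
    return grouped


if __name__ == "__main__":
    example = "a, and b because c"
    for names, text in generate_variants(example, TOY_RULES):
        print(len(names), names, "->", text)
```

Grouping by operation count mirrors how the abstract reports results (quality after one, two, or three simplifications); with n applicable rules this reading yields 2^n - 1 variants per sentence.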

    Neural Machine Translation Based on Syntactic Structure (構文構造に基づくニューラル機械翻訳)

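    Degree type: Doctorate by coursework. Examination committee: (Chair) Associate Professor Yoshimasa Tsuruoka (鶴岡 慶雅), University of Tokyo; Professor Nobuaki Minematsu (峯松 信明), University of Tokyo; Associate Professor Toshihiko Yamasaki (山崎 俊彦), University of Tokyo; Lecturer Daisuke Saito (齋藤 大輔), University of Tokyo; Associate Professor Naoki Yoshinaga (吉永 直樹), University of Tokyo. University of Tokyo (東京大学)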