1,502 research outputs found

    ๋ฌธ๋งฅ ์ธ์‹๊ธฐ๋ฐ˜์˜ ๋ฌธ์„œ ๋‹จ์œ„ ์‹ ๊ฒฝ๋ง ๊ธฐ๊ณ„ ๋ฒˆ์—ญ ์—ฐ๊ตฌ

    Get PDF
    Ph.D. dissertation -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2022. Advisor: ์ •๊ต๋ฏผ.
    Neural machine translation (NMT) has attracted great attention in recent years, as it has yielded state-of-the-art translation quality. Despite their promising results, many current NMT systems are sentence-level, translating each sentence independently. This ignores the textual context and thus produces inadequate and inconsistent translations at the document level. To overcome these shortcomings, context-aware NMT (CNMT), which takes contextual sentences as input, has been proposed. This dissertation proposes novel methods for improving CNMT systems and an application of CNMT. We first tackle the efficient modeling of multiple contextual sentences in the CNMT encoder. For this purpose, we propose a hierarchical context encoder that encodes contextual sentences from the token level to the sentence level. This novel architecture enables the model to achieve state-of-the-art translation quality while requiring less computation time for training and translation than existing methods. Second, we investigate the training of CNMT models, most of which rely on a negative log-likelihood (NLL) objective that does not fully exploit contextual dependencies. To overcome this insufficiency, we introduce coreference-based contrastive learning for CNMT, which generates contrastive examples from coreference chains between the source and target sentences. The proposed method improves the pronoun resolution accuracy of CNMT models as well as overall translation quality. Finally, we investigate an application of CNMT to Korean honorifics, whose adequate translation depends on contextual information. For the English-Korean translation task, we propose using CNMT models that capture crucial contextual information from the English source document and adopting a context-aware post-editing system that exploits context on the Korean target side, resulting in more consistent Korean honorific translations.
    Contents: Abstract; Contents; List of Tables; List of Figures
    1 Introduction
    2 Background: Neural Machine Translation (2.1 A Brief History; 2.2 Problem Setup; 2.3 Encoder-Decoder architectures; 2.3.1 RNN-based Architecture; 2.3.2 SAN-based Architecture; 2.4 Training; 2.5 Decoding; 2.6 Evaluation)
    3 Efficient Hierarchical Architecture for Modeling Contextual Sentences (3.1 Related works; 3.1.1 Modeling Context in NMT; 3.1.2 Hierarchical Context Modeling; 3.1.3 Evaluation of Context-aware NMT; 3.2 Model description; 3.2.1 Context-aware NMT encoders; 3.2.2 Hierarchical context encoder; 3.3 Data; 3.3.1 English-German IWSLT 2017 corpus; 3.3.2 OpenSubtitles corpus; 3.3.3 English-Korean subtitle corpus; 3.4 Experiments; 3.4.1 Hyperparameters and Training details; 3.4.2 Overall BLEU evaluation; 3.4.3 Model complexity analysis; 3.4.4 BLEU evaluation on helpful/unhelpful context; 3.4.5 EnKo pronoun resolution test suite; 3.4.6 Qualitative Analysis; 3.5 Summary)
    4 Contrastive Learning for Context-aware Neural Machine Translation (4.1 Related Works; 4.1.1 Context-aware NMT Architectures; 4.1.2 Coreference and NMT; 4.1.3 Data augmentation for NMT; 4.1.4 Contrastive Learning; 4.2 Context-aware NMT models; 4.3 Our Method: CorefCL; 4.3.1 Data Augmentation Using Coreference; 4.3.2 Contrastive Learning for Context-aware NMT; 4.4 Experiments; 4.4.1 Datasets; 4.4.2 Settings; 4.4.3 Overall BLEU Evaluation; 4.4.4 Results on English-German Contrastive Evaluation Set; 4.4.5 Analysis; 4.5 Summary)
    5 Improving English-Korean Honorific Translation Using Contextual Information (5.1 Related Works; 5.1.1 Neural Machine Translation dealing with Korean; 5.1.2 Controlling the Styles in NMT; 5.1.3 Context-Aware NMT Framework and Application; 5.2 Addressing Korean Honorifics in Context; 5.2.1 Overview of Korean Honorifics System; 5.2.2 The Role of Context on Choosing Honorifics; 5.3 Context-Aware NMT Frameworks; 5.3.1 NMT Model with Contextual Encoders; 5.3.2 Context-Aware Post Editing (CAPE); 5.4 Our Proposed Method - Context-Aware NMT for Korean Honorifics; 5.4.1 Using CNMT methods for Honorific-Aware Translation; 5.4.2 Scope of Honorific Expressions; 5.4.3 Automatic Honorific Labeling; 5.5 Experiments; 5.5.1 Dataset and Preprocessing; 5.5.2 Model Implementation and Training Details; 5.5.3 Metrics; 5.5.4 Results; 5.5.5 Translation Examples and Analysis; 5.6 Summary)
    6 Future Directions (6.1 Document-level Datasets; 6.2 Document-level Evaluation; 6.3 Bias and Fairness of Document-level NMT; 6.4 Towards Practical Applications)
    7 Conclusions; Abstract (In Korean); Acknowledgment
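    The hierarchical context encoder above is described only at a high level. A minimal sketch of the token-level-then-sentence-level idea, assuming a PyTorch implementation with illustrative module and parameter names (this is not the dissertation's code), might look like the following; pooling each context sentence into one vector before the sentence-level pass is one way such a hierarchy can keep training and translation cheaper than attending over all context tokens at once.

```python
# Hypothetical sketch of a hierarchical context encoder: token-level encoding per
# context sentence, mean-pooling into sentence vectors, then sentence-level encoding.
import torch
import torch.nn as nn

class HierarchicalContextEncoder(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 512, nhead: int = 8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        # Token-level encoder, shared across all context sentences.
        self.token_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers=2)
        # Sentence-level encoder over the pooled sentence representations.
        self.sent_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers=1)

    def forward(self, ctx_tokens: torch.Tensor) -> torch.Tensor:
        # ctx_tokens: (batch, n_ctx_sents, max_len) token ids, 0 = padding
        b, n, t = ctx_tokens.shape
        flat = ctx_tokens.view(b * n, t)
        pad = flat.eq(0)
        tok = self.token_encoder(self.embed(flat), src_key_padding_mask=pad)
        keep = (~pad).unsqueeze(-1).float()
        sent = (tok * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)  # mean-pool tokens
        sent = sent.view(b, n, -1)
        return self.sent_encoder(sent)  # (batch, n_ctx_sents, d_model) document context

ctx = torch.randint(1, 1000, (2, 3, 20))             # 2 documents, 3 context sentences each
context_states = HierarchicalContextEncoder(1000)(ctx)
```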

    Context-Aware Neural Machine Translation Learns Anaphora Resolution

    Get PDF
    Standard machine translation systems process sentences in isolation and hence ignore extra-sentential information, even though extended context can both prevent mistakes in ambiguous cases and improve translation coherence. We introduce a context-aware neural machine translation model designed in such a way that the flow of information from the extended context to the translation model can be controlled and analyzed. We experiment with an English-Russian subtitles dataset and observe that much of what is captured by our model deals with improving pronoun translation. We measure correspondences between induced attention distributions and coreference relations and observe that the model implicitly captures anaphora. This is consistent with gains for sentences where pronouns need to be gendered in translation. Besides improvements in anaphoric cases, the model also improves overall BLEU, both over its context-agnostic version (+0.7) and over simple concatenation of the context and source sentences (+0.6). Comment: ACL 2018.
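    As a rough, hypothetical sketch of the controllable context flow described above: a learned sigmoid gate decides how much of a context-attended summary enters the source representation, and the attention weights are what one would inspect against coreference relations. The modules and names below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical gated context integration (illustrative names, not the paper's code).
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    def __init__(self, d_model: int = 512, nhead: int = 8):
        super().__init__()
        self.ctx_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, src_h, ctx_h):
        # src_h: (batch, src_len, d) source encoding; ctx_h: (batch, ctx_len, d) context encoding
        ctx_summary, attn_weights = self.ctx_attn(query=src_h, key=ctx_h, value=ctx_h)
        # A sigmoid gate controls how much context information flows into each source position.
        g = torch.sigmoid(self.gate(torch.cat([src_h, ctx_summary], dim=-1)))
        fused = g * src_h + (1.0 - g) * ctx_summary
        # attn_weights is the distribution one could compare against coreference links.
        return fused, attn_weights
```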

    Dynamic Entity Representations in Neural Language Models

    Full text link
    Understanding a long document requires tracking how entities are introduced and evolve over time. We present a new type of language model, EntityNLM, that can explicitly model entities, dynamically update their representations, and contextually generate their mentions. Our model is generative and flexible; it can model an arbitrary number of entities in context while generating each entity mention at an arbitrary length. In addition, it can be used for several different tasks such as language modeling, coreference resolution, and entity prediction. Experimental results on all these tasks demonstrate that our model consistently outperforms strong baselines and prior work. Comment: EMNLP 2017 camera-ready version.
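    As a rough illustration of what dynamically updated entity representations can mean in code, the hypothetical registry below keeps one vector per entity and gates it toward each new mention's hidden state, so later mentions are conditioned on earlier ones. It is a simplification for illustration, not the EntityNLM implementation.

```python
# Hypothetical sketch of dynamically updated entity representations (not EntityNLM itself).
import torch
import torch.nn as nn

class EntityRegistry(nn.Module):
    def __init__(self, d_model: int = 256):
        super().__init__()
        self.init_vec = nn.Parameter(torch.randn(d_model) * 0.01)  # prototype for a new entity
        self.update_gate = nn.Linear(2 * d_model, 1)
        self.entities = []  # one vector per entity mentioned so far

    def new_entity(self):
        self.entities.append(self.init_vec.clone())
        return len(self.entities) - 1  # entity id

    def update(self, idx, mention_h):
        # Gate the stored entity vector toward the hidden state of the current mention,
        # so the representation evolves each time the entity is mentioned again.
        e = self.entities[idx]
        g = torch.sigmoid(self.update_gate(torch.cat([e, mention_h], dim=-1)))
        self.entities[idx] = g * e + (1 - g) * mention_h
        return self.entities[idx]

reg = EntityRegistry()
eid = reg.new_entity()
updated = reg.update(eid, torch.randn(256))  # after observing a new mention
```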

    Selective Attention for Context-aware Neural Machine Translation

    Full text link
    Despite the progress made in sentence-level NMT, current systems still fall short of achieving fluent, good-quality translation for a full document. Recent works in context-aware NMT consider only a few previous sentences as context and may not scale to entire documents. To address this, we propose a novel and scalable top-down approach to hierarchical attention for context-aware NMT, which uses sparse attention to selectively focus on relevant sentences in the document context and then attends to key words in those sentences. We also propose single-level attention approaches based on sentence- or word-level information in the context. The document-level context representation produced by these attention modules is integrated into the encoder or decoder of the Transformer model, depending on whether we use monolingual or bilingual context. Our experiments and evaluation on English-German datasets in different document MT settings show that our selective attention approach not only significantly outperforms context-agnostic baselines but also surpasses context-aware baselines in most cases. Comment: Accepted at NAACL-HLT 2019.
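    A compact, hypothetical sketch of the top-down selection described above, substituting a simple top-k cutoff for the paper's sparse attention and using invented names: context sentences are scored against the source representation, only the best-scoring ones are kept, and word-level attention is then computed inside those sentences.

```python
# Hypothetical top-down selective context attention; top-k stands in for sparse attention.
import torch
import torch.nn.functional as F

def selective_context(query, sent_vecs, word_vecs, k=2):
    # query:     (d,)              pooled source-sentence representation
    # sent_vecs: (n_sents, d)      one vector per context sentence
    # word_vecs: (n_sents, len, d) word-level states of each context sentence
    sent_scores = sent_vecs @ query
    top_scores, top_idx = sent_scores.topk(min(k, sent_vecs.size(0)))  # keep relevant sentences only
    sent_weights = F.softmax(top_scores, dim=0)                        # sentence-level attention
    ctx = torch.zeros_like(query)
    for w, i in zip(sent_weights, top_idx):
        words = word_vecs[i]                                           # (len, d)
        word_weights = F.softmax(words @ query, dim=0)                 # word-level attention
        ctx = ctx + w * (word_weights @ words)
    return ctx  # document-context vector to be integrated into the encoder or decoder

ctx_vec = selective_context(torch.randn(512), torch.randn(6, 512), torch.randn(6, 30, 512))
```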

    Implicit Argument Prediction as Reading Comprehension

    Full text link
    Implicit arguments, which cannot be detected solely through syntactic cues, make it harder to extract predicate-argument tuples. We present a new model for implicit argument prediction that draws on reading comprehension, casting the predicate-argument tuple with the missing argument as a query. We also draw on pointer networks and multi-hop computation. Our model shows good performance on an argument cloze task as well as on a nominal implicit argument prediction task. Comment: Accepted at AAAI 2019.
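    A minimal, single-hop sketch of the pointer-style reading-comprehension formulation (the paper itself uses pointer networks with multi-hop computation; the names here are illustrative): the tuple with its missing argument slot is encoded as a query, and the model points to the highest-scoring candidate in the document.

```python
# Hypothetical single-hop pointer scoring for implicit argument prediction (illustration only).
import torch
import torch.nn.functional as F

def point_to_argument(query, candidates):
    # query:      (d,)    encoding of the predicate-argument tuple with a missing slot
    # candidates: (n, d)  encodings of candidate arguments found in the document
    scores = candidates @ query        # one pointer score per candidate
    probs = F.softmax(scores, dim=0)   # distribution over candidates
    return int(probs.argmax()), probs  # predicted argument index and its distribution

best, dist = point_to_argument(torch.randn(256), torch.randn(10, 256))
```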
    • โ€ฆ