A Novel Approach to Dropped Pronoun Translation
Dropped pronouns (DPs), where pronouns are frequently dropped in the source language but should be retained in the target language, are a challenge in machine translation. In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation, in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns, both by extracting rules from the DP-labelled training data and by translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance, with a 66% F-score for DP generation accuracy.
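As an illustration of the two-phase generator described above, the following is a minimal sketch in PyTorch: a bidirectional recurrent labeller for DP position detection and an MLP for DP prediction. The class names, dimensions, and the BiGRU choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two-phase DP generator: (1) a recurrent
# sequence labeller that tags each source token with whether a DP
# should be inserted there, and (2) an MLP over rich features of a
# detected position that predicts the concrete pronoun. Dimensions
# and the BiGRU choice are illustrative assumptions.
import torch
import torch.nn as nn

class DPPositionDetector(nn.Module):
    """Phase 1: sequential labelling with a bidirectional GRU."""
    def __init__(self, vocab_size, emb_dim=128, hidden=256, n_tags=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)   # tag: DP-here vs. no-DP

    def forward(self, tokens):                     # tokens: (batch, seq_len)
        states, _ = self.rnn(self.embed(tokens))   # (batch, seq_len, 2*hidden)
        return self.out(states)                    # per-token tag logits

class DPPredictor(nn.Module):
    """Phase 2: a multilayer perceptron predicting which pronoun to insert."""
    def __init__(self, feat_dim, n_pronouns, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_pronouns))

    def forward(self, features):                   # features: (batch, feat_dim)
        return self.mlp(features)                  # pronoun logits
```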
A Novel and Robust Approach for Pro-Drop Language Translation
A significant challenge for machine translation (MT) is the phenomenon of dropped pronouns (DPs), where certain classes of pronouns are frequently dropped in the source language but should be retained in the target language. In response to this common problem, we propose a semi-supervised approach with a universal framework to recall missing pronouns in translation. Firstly, we build training data for DP generation, in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation has two phases: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our statistical MT (SMT) system to recall missing pronouns, both by extracting rules from the DP-labelled training data and by translating the DP-generated input sentences. To validate the robustness of our approach, we evaluate it on both Chinese–English and Japanese–English corpora extracted from movie subtitles. Compared with an SMT baseline system, experimental results show that our approach achieves a significant improvement of +1.58 BLEU points in translation performance, with a 66% F-score for DP generation accuracy, for Chinese–English, and of nearly +1 BLEU point, with a 58% F-score, for Japanese–English. We believe that this work can help both MT researchers and industry to boost the performance of MT systems between pro-drop and non-pro-drop languages.
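The alignment-based labelling step that both of the above papers rely on can be sketched roughly as follows: a target-side pronoun that no source token aligns to is treated as dropped, and a placeholder is projected back into the source sentence. The function name, the pronoun list, and the (src, tgt) alignment-pair format are hypothetical.

```python
# Sketch of alignment-based DP labelling: a target-side pronoun with
# no aligned source token is treated as dropped, and a DP placeholder
# is inserted into the source sentence at a projected position. The
# pronoun list and (src_idx, tgt_idx) pair format are assumptions.
TARGET_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}

def label_dropped_pronouns(src_tokens, tgt_tokens, alignment_pairs):
    """alignment_pairs: iterable of (src_idx, tgt_idx) word alignments."""
    aligned_tgt = {j for _, j in alignment_pairs}
    # Map each aligned target index to its source index, so a DP can be
    # projected to a plausible source position.
    tgt_to_src = {j: i for i, j in sorted(alignment_pairs, key=lambda p: p[1])}
    labelled = list(src_tokens)
    inserted = 0
    for j, word in enumerate(tgt_tokens):
        if word.lower() in TARGET_PRONOUNS and j not in aligned_tgt:
            # Insert after the source word aligned to the previous target word.
            prev = [tgt_to_src[k] for k in range(j) if k in tgt_to_src]
            pos = (prev[-1] + 1 if prev else 0) + inserted
            labelled.insert(pos, f"<DP:{word.lower()}>")
            inserted += 1
    return labelled

# Example: Chinese pro-drop source with an English reference.
src = ["喜欢", "这部", "电影"]            # "(I) like this movie"
tgt = ["i", "like", "this", "movie"]
align = [(0, 1), (1, 2), (2, 3)]          # "i" is unaligned -> dropped
print(label_dropped_pronouns(src, tgt, align))
# ['<DP:i>', '喜欢', '这部', '电影']
```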
Translating pro-drop languages with reconstruction models
Pronouns are frequently omitted in pro-drop languages, such as Chinese, generally leading to significant challenges with respect to the production of complete translations. To date, very little attention has been paid to the dropped pronoun (DP) problem within neural machine translation (NMT). In this work, we propose a novel reconstruction-based approach to alleviating DP translation problems for NMT models. Firstly, DPs within all source sentences are automatically annotated with parallel information extracted from the bilingual training corpus. Next, the annotated source sentence is reconstructed from hidden representations in the NMT model. With auxiliary training objectives, in terms of reconstruction scores, the parameters of the NMT model are guided to produce enhanced hidden representations that embed the annotated DP information as much as possible. Experimental results on both Chinese–English and Japanese–English dialogue translation tasks show that the proposed approach significantly and consistently improves translation performance over a strong NMT baseline built directly on the training data annotated with DPs.
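A rough sketch of such a reconstruction objective follows, assuming a GRU-based reconstructor and a simple mean-pooled summary of the hidden states; both are illustrative simplifications, not the paper's architecture.

```python
# Sketch of the reconstruction objective: hidden states from the NMT
# model feed an auxiliary decoder that must regenerate the DP-annotated
# source sentence, and its negative log-likelihood is added to the
# translation loss. The GRU reconstructor, mean-pooled context, and
# weighting factor are illustrative assumptions.
import torch
import torch.nn as nn

class Reconstructor(nn.Module):
    def __init__(self, src_vocab, hidden=256, emb_dim=128):
        super().__init__()
        self.embed = nn.Embedding(src_vocab, emb_dim)
        self.rnn = nn.GRU(emb_dim + hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, src_vocab)
        self.nll = nn.CrossEntropyLoss()

    def forward(self, hidden_states, annotated_src):
        # hidden_states: (batch, T, hidden) from the NMT encoder/decoder;
        # annotated_src: (batch, S) DP-annotated source token ids.
        context = hidden_states.mean(dim=1, keepdim=True)   # crude summary
        emb = self.embed(annotated_src[:, :-1])             # teacher forcing
        ctx = context.expand(-1, emb.size(1), -1)
        states, _ = self.rnn(torch.cat([emb, ctx], dim=-1))
        logits = self.out(states)
        return self.nll(logits.reshape(-1, logits.size(-1)),
                        annotated_src[:, 1:].reshape(-1))

# Joint objective: loss = translation_loss + lambda_rec * reconstruction_loss
```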
Memory Networks
We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA), where the long-term memory effectively acts as a (dynamic) knowledge base and the output is a textual response. We evaluate them on a large-scale QA task, and a smaller, but more complex, toy task generated from a simulated world. In the latter, we show the reasoning power of such models by chaining multiple supporting sentences to answer questions that require understanding the intension of verbs.
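A toy illustration of the four components the paper names (I: input feature map, G: generalization/memory writing, O: output/supporting-memory selection, R: response) might look as follows; the bag-of-words scoring and single-hop retrieval are deliberate simplifications of the learned, multi-hop models evaluated in the paper.

```python
# Toy single-hop memory network with the I/G/O/R structure:
# I maps text to a bag-of-words vector, G writes it to memory, O picks
# the best-matching supporting memory by dot product, R returns it.
import numpy as np

class MemoryNetwork:
    def __init__(self, vocab):
        self.vocab = {w: i for i, w in enumerate(vocab)}
        self.memory = []                          # long-term memory slots

    def _I(self, text):                           # input feature map (BoW)
        v = np.zeros(len(self.vocab))
        for w in text.lower().split():
            if w in self.vocab:
                v[self.vocab[w]] += 1.0
        return v

    def _G(self, text):                           # generalization: write memory
        self.memory.append((text, self._I(text)))

    def _O(self, question):                       # output: best supporting memory
        q = self._I(question)
        scores = [v @ q for _, v in self.memory]
        return self.memory[int(np.argmax(scores))][0]

    def _R(self, support):                        # response: here, the raw fact
        return support

    def answer(self, question):
        return self._R(self._O(question))

vocab = ["joe", "went", "to", "the", "kitchen", "got", "milk", "where", "is"]
mn = MemoryNetwork(vocab)
mn._G("Joe went to the kitchen")
mn._G("Joe got the milk")
print(mn.answer("where is the milk"))   # -> "Joe got the milk"
```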
Discourse-aware neural machine translation
Machine translation (MT) models usually translate a text by considering isolated sentences, based on a strict assumption that the sentences in a text are independent of one another. However, it is a truism that texts have properties of connectedness that go beyond those of their individual sentences. Disregarding dependencies across sentences will harm translation quality, especially in terms of coherence, cohesion, and consistency. Previously, some discourse-aware approaches were investigated for conventional statistical machine translation (SMT). However, modelling discourse remains a serious obstacle for state-of-the-art neural machine translation (NMT), which has recently surpassed the performance of SMT.
In this thesis, we try to incorporate useful discourse information to enhance NMT models. More specifically, we conduct research on two main parts: 1) exploiting a novel document-level NMT architecture; and 2) dealing with a specific discourse phenomenon in translation models.
Firstly, we investigate the influence of historical contextual information on the performance of NMT models. A cross-sentence context-aware NMT model is proposed to consider the influence of previous sentences in the same document. Specifically, this history is summarized using an additional hierarchical encoder, and the historical representations are then integrated into the standard NMT model under different strategies. Experimental results on a Chinese–English document-level translation task show that the approach significantly improves upon a strong attention-based NMT system by up to +2.1 BLEU points. In addition, our analysis and comparisons offer insightful discussion and conclusions for this research direction.
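A minimal sketch of such a hierarchical history encoder, assuming GRUs at both levels and a single summary vector handed to the NMT model (the integration strategy itself is left out, and all dimensions are assumptions):

```python
# Sketch of the cross-sentence context encoder: a word-level RNN encodes
# each previous sentence, and a sentence-level RNN summarizes those
# sentence vectors into one history representation that can be fed into
# the NMT model (e.g., to initialize the decoder). GRUs and dimensions
# are illustrative assumptions.
import torch
import torch.nn as nn

class HierarchicalContextEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.sent_rnn = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, history):
        # history: list of (1, seq_len) token tensors, oldest sentence first.
        sent_vecs = []
        for sent in history:
            _, h = self.word_rnn(self.embed(sent))  # h: (1, 1, hidden)
            sent_vecs.append(h[-1])                 # (1, hidden)
        sents = torch.stack(sent_vecs, dim=1)       # (1, n_sents, hidden)
        _, summary = self.sent_rnn(sents)
        return summary[-1]                          # (1, hidden) history vector

enc = HierarchicalContextEncoder(vocab_size=1000)
history = [torch.randint(0, 1000, (1, 7)), torch.randint(0, 1000, (1, 5))]
print(enc(history).shape)   # torch.Size([1, 256])
```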
Secondly, we explore the impact of discourse phenomena on the performance of MT. In this thesis, we focus on the phenomenon of pronoun-dropping (pro-drop), where, in pro-drop languages, pronouns can be omitted when it is possible to infer the referent from the context. As the data for training a dropped pronoun (DP) generator is scarce, we propose to automatically annotate DPs using alignment information from a large parallel corpus. We then introduce a hybrid approach: building a neural-based DP generator and integrating it into the SMT model. Experimental results on both Chinese–English and Japanese–English translation tasks demonstrate that our approach achieves a significant improvement of up to +1.58 BLEU points, with a 66% F-score for DP generation accuracy.
Motivated by this promising result, we further exploit the DP translation approach for advanced NMT models. A novel reconstruction-based model is proposed to reconstruct the DP-annotated source sentence from the hidden states of either the encoder or the decoder, or both components. Experimental results on the same translation tasks show that the proposed approach significantly and consistently improves translation performance over a strong NMT baseline trained on DP-annotated parallel data.
To avoid the errors propagated from an external DP prediction model, we finally investigate an end-to-end DP translation model. Specifically, we improve the reconstruction-based model from three perspectives. We first employ a shared reconstructor to better exploit encoder and decoder representations. Secondly, we propose to jointly learn to translate and predict DPs. Finally, in order to capture discourse information for DP prediction, we combine the hierarchical encoder with the DP translation model. Experimental results on the same translation tasks show that our approach significantly improves both translation performance and DP prediction accuracy.
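A rough sketch of the joint learn-to-translate-and-predict-DPs objective, assuming a linear DP tagger over shared encoder states and a fixed interpolation weight (both hypothetical choices):

```python
# Sketch of the end-to-end joint objective: the translation loss and a
# DP prediction loss (tagging source positions from shared encoder
# states) are optimized together, so no external DP model is needed at
# test time. The linear tagger and loss weight are assumptions.
import torch
import torch.nn as nn

class JointDPLoss(nn.Module):
    def __init__(self, hidden, n_dp_labels, weight=0.5):
        super().__init__()
        self.dp_tagger = nn.Linear(hidden, n_dp_labels)
        self.dp_loss = nn.CrossEntropyLoss(ignore_index=-100)
        self.weight = weight

    def forward(self, translation_loss, encoder_states, dp_labels):
        # encoder_states: (batch, src_len, hidden) shared with the NMT model;
        # dp_labels: (batch, src_len) gold DP tags (-100 where unlabelled).
        logits = self.dp_tagger(encoder_states)
        dp = self.dp_loss(logits.reshape(-1, logits.size(-1)),
                          dp_labels.reshape(-1))
        return translation_loss + self.weight * dp
```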