Latent Anaphora Resolution for Cross-Lingual Pronoun Prediction
This paper addresses the task of predicting the correct French translations of third-person subject pronouns in English discourse, a problem that is relevant as a prerequisite for machine translation and that requires anaphora resolution. We present an approach based on neural networks that models anaphoric links as latent variables and show that its performance is competitive with that of a system with separate anaphora resolution while not requiring any coreference-annotated training data. This demonstrates that the information contained in parallel bitexts can successfully be used to acquire knowledge about pronominal anaphora in an unsupervised way.
Parallel Data Helps Neural Entity Coreference Resolution
Coreference resolution is the task of finding expressions that refer to the
same entity in a text. Coreference models are generally trained on monolingual
annotated data but annotating coreference is expensive and challenging.
Hardmeier et al. (2013) have shown that parallel data contains latent anaphoric
knowledge, but it has not been explored in end-to-end neural models yet. In
this paper, we propose a simple yet effective model to exploit coreference
knowledge from parallel data. In addition to the conventional modules learning
coreference from annotations, we introduce an unsupervised module to capture
cross-lingual coreference knowledge. Our proposed cross-lingual model achieves
consistent improvements, up to 1.74 percentage points, on the OntoNotes 5.0
English dataset using 9 different synthetic parallel datasets. These
experimental results confirm that parallel data can provide additional
coreference knowledge which is beneficial to coreference resolution tasks. Comment: camera-ready version; to appear in the Findings of ACL 202
A Document-Level SMT System with Integrated Pronoun Prediction
This paper describes one of Uppsala University’s submissions to the pronoun-focused machine translation (MT) shared task at DiscoMT 2015. The system is based on phrase-based statistical MT implemented with the document-level decoder Docent. It includes a neural network for pronoun prediction trained with latent anaphora resolution. At translation time, coreference information is obtained from the Stanford CoreNLP system.
Anaphora Models and Reordering for Phrase-Based SMT
We describe the Uppsala University systems for WMT14. We look at the integration of a model for translating pronominal anaphora and a syntactic dependency projection model for English–French. Furthermore, we investigate post-ordering and tunable POS distortion models for English–German.
Context-Aware Neural Machine Translation Learns Anaphora Resolution
Standard machine translation systems process sentences in isolation and hence
ignore extra-sentential information, even though extended context can both
prevent mistakes in ambiguous cases and improve translation coherence. We
introduce a context-aware neural machine translation model designed in such a way
that the flow of information from the extended context to the translation model
can be controlled and analyzed. We experiment with an English-Russian subtitles
dataset, and observe that much of what is captured by our model deals with
improving pronoun translation. We measure correspondences between induced
attention distributions and coreference relations and observe that the model
implicitly captures anaphora. This is consistent with gains for sentences where
pronouns need to be gendered in translation. Besides improvements in anaphoric
cases, the model also improves in overall BLEU, both over its context-agnostic
version (+0.7) and over simple concatenation of the context and source
sentences (+0.6). Comment: ACL 201
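One way to picture the comparison between attention distributions and coreference relations described in this abstract is an agreement rate: how often the context token with the highest attention weight is the annotated antecedent. The following is a minimal sketch under that assumption, not the authors' evaluation code; the function name `attention_agreement` and the input layout are hypothetical.

```python
def attention_agreement(examples):
    """Fraction of examples whose most-attended context token is the antecedent.

    examples: list of (attention_weights, antecedent_index) pairs, where
    attention_weights is a list of floats over the context tokens
    (hypothetical layout, chosen for illustration only).
    """
    hits = 0
    for weights, antecedent in examples:
        # Index of the context token receiving the largest attention weight.
        most_attended = max(range(len(weights)), key=weights.__getitem__)
        hits += most_attended == antecedent
    return hits / len(examples)


if __name__ == "__main__":
    toy = [([0.1, 0.7, 0.2], 1), ([0.5, 0.3, 0.2], 2)]
    print(attention_agreement(toy))
```

A random or positional baseline computed the same way would be needed before reading anything into such a number.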
Analysing concatenation approaches to document-level NMT in two different domains
In this paper, we investigate how different aspects of discourse context affect the performance of recent neural MT systems. We describe two popular datasets covering news and movie subtitles and we provide a thorough analysis of the distribution of various document-level features in their domains. Furthermore, we train a set of context-aware MT models on both datasets and propose a comparative evaluation scheme that contrasts coherent context with artificially scrambled documents and absent context, arguing that the impact of discourse-aware MT models will become visible in this way. Our results show that the models are indeed affected by the manipulation of the test data, providing a different view on document-level translation quality than absolute sentence-level scores. Peer reviewed
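The three test conditions contrasted in this abstract (coherent, scrambled, and absent context) can be illustrated with a small sketch. It assumes that the context is a single preceding sentence and that scrambling means drawing contexts from a shuffled copy of the document; both are illustrative assumptions rather than the paper's actual setup, and `build_context_conditions` is a hypothetical name.

```python
import random


def build_context_conditions(document, seed=0):
    """Pair every sentence with a context under three test conditions.

    document: list of sentences in their original order.
    Returns a dict mapping a condition name to (context, sentence) pairs.
    """
    rng = random.Random(seed)
    # Coherent: the true preceding sentence (empty for the first sentence).
    coherent = [
        (document[i - 1] if i > 0 else "", sentence)
        for i, sentence in enumerate(document)
    ]
    # Scrambled: contexts come from a shuffled copy of the same document,
    # so a sentence may occasionally be paired with itself or its true context.
    shuffled = list(document)
    rng.shuffle(shuffled)
    scrambled = list(zip(shuffled, document))
    # Absent: no context at all.
    absent = [("", sentence) for sentence in document]
    return {"coherent": coherent, "scrambled": scrambled, "absent": absent}


if __name__ == "__main__":
    doc = ["She bought a lamp.", "It was broken.", "She returned it."]
    for name, pairs in build_context_conditions(doc).items():
        print(name, pairs)
```

Translating the same test set under each condition and comparing the scores would then show how much a model relies on the context it is given.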
Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation
We describe the design, the evaluation setup, and the results of the DiscoMT 2015 shared task, which included two subtasks, relevant to both the machine translation (MT) and the discourse communities: (i) pronoun-focused translation, a practical MT task, and (ii) cross-lingual pronoun prediction, a classification task that requires no specific MT expertise and is interesting as a machine learning task in its own right. We focused on the English–French language pair, for which MT output is generally of high quality, but has visible issues with pronoun translation due to differences in the pronoun systems of the two languages. Six groups participated in the pronoun-focused translation task and eight groups in the cross-lingual pronoun prediction task.
Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to predict which pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English–French and English–German language pairs, in both directions. Eleven teams participated in the shared task: nine for the English–French subtask, five for French–English, nine for English–German, and six for German–English. Most of the submissions outperformed two strong language-model-based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs.
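The shape of the classification task described in this abstract can be sketched as follows: a lemmatised target sentence contains placeholders, and a system outputs one class label per placeholder. The sketch uses a trivial majority-class predictor, which is not one of the shared task's language-model baselines; the placeholder token `REPLACE`, the label set, and all function names are illustrative assumptions.

```python
from collections import Counter

PLACEHOLDER = "REPLACE"  # hypothetical placeholder token, for illustration only


def train_majority(examples):
    """Return the most frequent gold label in the training examples.

    examples: list of (target_tokens, gold_labels) with one gold label
    per placeholder in target_tokens.
    """
    counts = Counter(label for _, labels in examples for label in labels)
    return counts.most_common(1)[0][0]


def predict(target_tokens, majority_label):
    """Predict one label per placeholder, always the majority label."""
    return [majority_label for token in target_tokens if token == PLACEHOLDER]


def fill(target_tokens, labels):
    """Substitute the predicted labels for the placeholders, in order."""
    remaining = iter(labels)
    return [next(remaining) if t == PLACEHOLDER else t for t in target_tokens]


if __name__ == "__main__":
    train = [
        (["REPLACE", "être", "grand"], ["il"]),
        (["REPLACE", "être", "petit"], ["elle"]),
        (["REPLACE", "arriver"], ["il"]),
    ]
    label = train_majority(train)
    sentence = ["REPLACE", "voir", "que", "REPLACE", "partir"]
    print(fill(sentence, predict(sentence, label)))
```

A real system would condition each prediction on the source sentence and the surrounding target context instead of ignoring them.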