2,967 research outputs found
Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation
This paper demonstrates that word sense disambiguation (WSD) can improve
neural machine translation (NMT) by widening the source context considered when
modeling the senses of potentially ambiguous words. We first introduce three
adaptive clustering algorithms for WSD, based on k-means, Chinese restaurant
processes, and random walks, which are then applied to large word contexts
represented in a low-rank space and evaluated on SemEval shared-task data. We
then learn word vectors jointly with sense vectors defined by our best WSD
method, within a state-of-the-art NMT system. We show that the concatenation of
these vectors, and the use of a sense selection mechanism based on the weighted
average of sense vectors, outperforms several baselines including sense-aware
ones. This is demonstrated by translation on five language pairs. The
improvements are above one BLEU point over strong NMT baselines, +4% accuracy
over all ambiguous nouns and verbs, or +20% when scored manually over several
challenging words.Comment: To appear in TAC
Multilingual Unsupervised Sentence Simplification
Progress in Sentence Simplification has been hindered by the lack of
supervised data, particularly in languages other than English. Previous work
has aligned sentences from original and simplified corpora such as English
Wikipedia and Simple English Wikipedia, but this limits corpus size, domain,
and language. In this work, we propose using unsupervised mining techniques to
automatically create training corpora for simplification in multiple languages
from raw Common Crawl web data. When coupled with a controllable generation
mechanism that can flexibly adjust attributes such as length and lexical
complexity, these mined paraphrase corpora can be used to train simplification
systems in any language. We further incorporate multilingual unsupervised
pretraining methods to create even stronger models and show that by training on
mined data rather than supervised corpora, we outperform the previous best
results. We evaluate our approach on English, French, and Spanish
simplification benchmarks and reach state-of-the-art performance with a totally
unsupervised approach. We will release our models and code to mine the data in
any language included in Common Crawl
- …