7,885 research outputs found
Automatic evaluation of generation and parsing for machine translation with automatically acquired transfer rules
This paper presents a new method of evaluation for generation and parsing components of transfer-based MT systems where the transfer rules have been automatically
acquired from parsed sentence-aligned bitext corpora. The method provides a means of quantifying the upper bound imposed on the MT system by the quality of the parsing
and generation technologies for the target language. We include experiments to calculate this upper bound for both handcrafted and automatically induced parsing and generation technologies currently in use by transfer-based MT systems.
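The abstract leaves the scoring procedure implicit; as a rough, hypothetical sketch of how such an upper bound can be quantified, one can round-trip gold target-language sentences through the parser and generator (with no transfer step involved) and score the regenerated text against the originals with an automatic metric. The `parse` and `generate` callables and the use of sacreBLEU below are assumptions for illustration, not the paper's actual pipeline.

```python
# Hypothetical round-trip estimate of the parsing/generation upper bound:
# parse each gold target sentence, regenerate from the parse, and score
# the result against the original. No transfer step is involved, so the
# score bounds what the full transfer-based MT system could achieve.
import sacrebleu

def parsing_generation_upper_bound(gold_sentences, parse, generate):
    """`parse` and `generate` stand in for the MT system's target-language
    parsing and generation components (assumed interfaces)."""
    regenerated = [generate(parse(s)) for s in gold_sentences]
    return sacrebleu.corpus_bleu(regenerated, [gold_sentences]).score
```

Any loss in this round trip is attributable to the parsing and generation components alone, which is what makes the score an upper bound for the end-to-end system.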
Results of the WMT19 metrics shared task: segment-level and strong MT systems pose big challenges
This paper presents the results of the WMT19 Metrics Shared Task. Participants were asked to score the outputs of the translation systems competing in the WMT19 News Translation Task with automatic metrics. 13 research groups submitted 24 metrics, 10 of which are reference-less "metrics" and constitute submissions to the joint task with the WMT19 Quality Estimation Task, "QE as a Metric". In addition, we computed 11 baseline metrics: 8 commonly applied baselines (BLEU, SentBLEU, NIST, WER, PER, TER, CDER, and chrF) and 3 reimplementations (chrF+, sacreBLEU-BLEU, and sacreBLEU-chrF). Metrics were evaluated at the system level (how well a given metric correlates with the WMT19 official manual ranking) and at the segment level (how well the metric correlates with human judgements of segment quality). This year, we use direct assessment (DA) as our only form of manual evaluation.
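For concreteness, system-level evaluation in this setup reduces to correlating one automatic score per MT system with the human ranking. A minimal sketch, assuming BLEU via sacreBLEU as the metric and per-system averaged DA scores as the human signal (data loading is left out):

```python
# System-level metric evaluation: one automatic score per MT system,
# correlated against averaged human direct-assessment (DA) scores.
import sacrebleu
from scipy.stats import pearsonr

def system_level_correlation(system_outputs, references, human_da):
    """system_outputs: {system: [hypothesis sentences]}
    references: [reference sentences]
    human_da: {system: averaged DA score}"""
    systems = sorted(system_outputs)
    metric = [sacrebleu.corpus_bleu(system_outputs[s], [references]).score
              for s in systems]
    human = [human_da[s] for s in systems]
    r, _ = pearsonr(metric, human)  # Pearson r, as in WMT system-level scoring
    return r
```

Segment-level evaluation works analogously but correlates per-sentence metric scores with human judgements of individual segments, which is why it is the harder setting.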
Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
We propose a method to transfer knowledge across neural machine translation (NMT) models by means of a shared dynamic vocabulary. Our approach allows us to extend an initial model for a given language pair to cover new languages by adapting its vocabulary as new data become available (i.e., introducing new vocabulary items if they are not included in the initial model). The parameter transfer mechanism is evaluated in two scenarios: i) adapting a trained single-language-pair NMT system to work with a new language pair and ii) continuously adding new language pairs to grow into a multilingual NMT system. In both scenarios our goal is to improve translation performance while minimizing the training convergence time. Preliminary experiments spanning five languages with different training data sizes (i.e., 5k and 50k parallel sentences) show a significant performance gain, ranging from +3.85 to +13.63 BLEU across language directions. Moreover, compared with training an NMT model from scratch, our transfer-learning approach allows us to reach higher performance after training for at most 4% of the total training steps.
Comment: Published at the International Workshop on Spoken Language Translation (IWSLT), 201
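The paper describes the transfer mechanism at the level of a shared dynamic vocabulary; as a hedged sketch of the general idea, one can copy embedding rows for tokens shared between the initial and extended vocabularies and freshly initialize rows for new items. The PyTorch code below is an illustration under that assumption, not the authors' implementation.

```python
# Sketch of vocabulary-aware parameter transfer: reuse trained embedding
# rows for tokens the initial model already covers; new vocabulary items
# keep their random initialization and are learned from the new data.
import torch
import torch.nn as nn

def transfer_embeddings(old_emb: nn.Embedding, old_vocab: dict,
                        new_vocab: dict) -> nn.Embedding:
    """old_vocab / new_vocab map token -> row index."""
    new_emb = nn.Embedding(len(new_vocab), old_emb.embedding_dim)
    with torch.no_grad():
        for token, idx in new_vocab.items():
            if token in old_vocab:  # shared item: transfer trained weights
                new_emb.weight[idx] = old_emb.weight[old_vocab[token]]
    return new_emb
```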
Language Model Bootstrapping Using Neural Machine Translation For Conversational Speech Recognition
Building conversational speech recognition systems for new languages is
constrained by the availability of utterances that capture user-device
interactions. Data collection is both expensive and limited by the speed of
manual transcription. In order to address this, we advocate the use of neural
machine translation as a data augmentation technique for bootstrapping language
models. Machine translation (MT) offers a systematic way of incorporating
collections from mature, resource-rich conversational systems that may be
available for a different language. However, ingesting raw translations from a general-purpose MT system may not be effective owing to the presence of named entities, intra-sentential code-switching, and the domain mismatch between the conversational data being translated and the parallel text used for MT training. To circumvent this, we explore the following domain adaptation techniques: (a) sentence-embedding-based data selection for MT training, (b) model fine-tuning, and (c) rescoring and filtering of translated hypotheses. Using
Hindi as the experimental testbed, we translate US English utterances to
supplement the transcribed collections. We observe a relative word error rate
reduction of 7.8-15.6%, depending on the bootstrapping phase. Fine-grained analysis reveals that translation particularly aids the interaction scenarios that are underrepresented in the transcribed data.
Comment: Accepted by the IEEE ASRU workshop, 201
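Of the three adaptation techniques, (a) lends itself to a compact illustration: score each candidate parallel sentence by the cosine similarity of its embedding to the centroid of in-domain conversational text and keep the closest candidates. The sentence-transformers model named below is a stand-in assumption; the abstract does not specify an embedding model.

```python
# Sketch of sentence-embedding-based data selection (technique (a)):
# keep candidate MT training sentences whose embeddings lie closest to
# the centroid of in-domain conversational text.
import numpy as np
from sentence_transformers import SentenceTransformer

def select_in_domain(candidates, in_domain, top_k,
                     model_name="all-MiniLM-L6-v2"):  # stand-in model
    model = SentenceTransformer(model_name)
    centroid = model.encode(in_domain, normalize_embeddings=True).mean(axis=0)
    centroid /= np.linalg.norm(centroid)  # renormalize the mean vector
    cand = model.encode(candidates, normalize_embeddings=True)
    sims = cand @ centroid  # cosine similarity to the domain centroid
    return [candidates[i] for i in np.argsort(-sims)[:top_k]]
```

Techniques (b) and (c) operate on the MT model and its outputs, respectively.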
- …