Towards a Multi-Objective Corpus for Vietnamese Language

Authors: Hoang Kiem; Huynh Bao Toan; Le Hoai Bac; Nguyen Duc Hoang Ha; Pham Nam Trung; Vu Hai Quan
Publication venue: COLIPS PUBLICATIONS
Publication date: 01/01/2003

Sequence to Sequence Mixture Model for Diverse Machine Translation

Authors: Haffari Gholamreza; He Xuanli; Norouzi Mohammad
Publication date: 01/01/2018

Sequence to sequence (SEQ2SEQ) models often lack diversity in their generated translations. This can be attributed to the limitation of SEQ2SEQ models in capturing lexical and syntactic variations in a parallel corpus resulting from different styles, genres, topics, or ambiguity of the translation process. In this paper, we develop a novel sequence to sequence mixture (S2SMIX) model that improves both translation diversity and quality by adopting a committee of specialized translation models rather than a single translation model. Each mixture component selects its own training dataset via optimization of the marginal loglikelihood, which leads to a soft clustering of the parallel corpus. Experiments on four language pairs demonstrate the superiority of our mixture model compared to a SEQ2SEQ baseline with standard or diversity-boosted beam search. Our mixture model uses negligible additional parameters and incurs no extra computation cost during decoding.

Comment: 11 pages, 5 figures, accepted to CoNLL201
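
The soft clustering mentioned in the abstract can be illustrated with a small sketch. It assumes each of K mixture components has already produced a log-likelihood for one sentence pair, and that the mixing prior is uniform unless given; both the function name and the uniform default are assumptions made for illustration, not details taken from the paper.

```python
import math
from typing import List, Optional, Sequence, Tuple


def marginal_and_responsibilities(
    component_log_likes: Sequence[float],
    log_priors: Optional[Sequence[float]] = None,
) -> Tuple[float, List[float]]:
    """Return (marginal log-likelihood, per-component responsibilities).

    component_log_likes[k] stands for log p(y | x, component k).
    log_priors[k] stands for log p(component k); uniform if omitted
    (an assumption of this sketch).
    """
    k = len(component_log_likes)
    if k == 0:
        raise ValueError("at least one mixture component is required")
    if log_priors is None:
        log_priors = [-math.log(k)] * k

    # Joint log-probability of (component, translation) for each component.
    joint = [lp + ll for lp, ll in zip(log_priors, component_log_likes)]

    # Numerically stable log-sum-exp gives the marginal log-likelihood.
    peak = max(joint)
    marginal = peak + math.log(sum(math.exp(j - peak) for j in joint))

    # Responsibilities are the posterior over components; they act as
    # soft assignments of this sentence pair to each component.
    responsibilities = [math.exp(j - marginal) for j in joint]
    return marginal, responsibilities


if __name__ == "__main__":
    marginal, resp = marginal_and_responsibilities([-1.0, -3.0])
    print(marginal, resp)
```

In this reading, a sentence pair that one component explains much better than the others receives most of its responsibility from that component, which is one way a mixture objective can partition a corpus softly rather than with hard labels.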

arXiv.org e-Print Archive

Crossref

Monash University Research Portal

Masked Language Model Scoring

Authors: Kirchhoff Katrin; Liang Davis; Nguyen Toan Q.; Salazar Julian
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2020

Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are computed by masking tokens one by one. We show that PLLs outperform scores from autoregressive language models like GPT-2 in a variety of tasks. By rescoring ASR and NMT hypotheses, RoBERTa reduces an end-to-end LibriSpeech model's WER by 30% relative and adds up to +1.7 BLEU on state-of-the-art baselines for low-resource translation pairs, with further gains from domain adaptation. We attribute this success to PLL's unsupervised expression of linguistic acceptability without a left-to-right bias, greatly improving on scores from GPT-2 (+10 points on island effects, NPI licensing in BLiMP). One can finetune MLMs to give scores without masking, enabling computation in a single inference pass. In all, PLLs and their associated pseudo-perplexities (PPPLs) enable plug-and-play use of the growing number of pretrained MLMs; e.g., we use a single cross-lingual model to rescore translations in multiple languages. We release our library for language model scoring at https://github.com/awslabs/mlm-scoring.

Comment: ACL 2020 camera-ready (presented July 2020
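
The "masking tokens one by one" procedure can be sketched as follows. This is a minimal illustration and does not use the API of the released mlm-scoring library: the `Scorer` callback, the `MASK` symbol, and `toy_scorer` are hypothetical stand-ins for a real masked language model, and the pseudo-perplexity is taken here to be the exponential of the negative average PLL per token, which is an assumption of the sketch.

```python
import math
from typing import Callable, Iterable, List, Sequence

MASK = "[MASK]"  # hypothetical mask symbol; real models define their own

# Hypothetical scorer signature: given the masked token list, the masked
# position, and the original token, return log P(token | other tokens).
Scorer = Callable[[Sequence[str], int, str], float]


def pseudo_log_likelihood(tokens: Sequence[str], scorer: Scorer) -> float:
    """Sum the log-probability of each token with only that token masked."""
    total = 0.0
    for i, target in enumerate(tokens):
        masked: List[str] = list(tokens)
        masked[i] = MASK  # hide exactly one position per pass
        total += scorer(masked, i, target)
    return total


def pseudo_perplexity(sentences: Iterable[Sequence[str]], scorer: Scorer) -> float:
    """exp(-total PLL / total token count) over a corpus (assumed definition)."""
    sentences = list(sentences)
    n_tokens = sum(len(s) for s in sentences)
    if n_tokens == 0:
        raise ValueError("corpus contains no tokens")
    total_pll = sum(pseudo_log_likelihood(s, scorer) for s in sentences)
    return math.exp(-total_pll / n_tokens)


def toy_scorer(masked: Sequence[str], position: int, target: str) -> float:
    """Stand-in for a masked LM: favors 'cat' right after 'the'."""
    left = masked[position - 1] if position > 0 else None
    return math.log(0.5 if (left, target) == ("the", "cat") else 0.25)


if __name__ == "__main__":
    sentence = ["the", "cat", "sat"]
    print(pseudo_log_likelihood(sentence, toy_scorer))
    print(pseudo_perplexity([sentence], toy_scorer))
```

To rescore ASR or NMT hypotheses in this style, one would compute a PLL per candidate and combine it with the original system score; the weighting between the two is a tuning choice not shown here.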

arXiv.org e-Print Archive

Crossref