Search CORE

108 research outputs found

Confidence score estimation for English to Greek machine translation

Author: Orfanoudakis Dimitrios
Publication venue
Publication date: 04/06/2020
Field of study

International Hellenic University: IHU Open Access Repository

Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages

Author: Azadi Fatemeh
Dousti Mohammad Javad
Faili Heshaam
Publication venue
Publication date: 12/08/2023
Field of study

Translation Quality Estimation (QE) is the task of predicting the quality of machine translation (MT) output without any reference. This task has gained increasing attention as an important component in the practical applications of MT. In this paper, we first propose XLMRScore, which is a cross-lingual counterpart of BERTScore computed via the XLM-RoBERTa (XLMR) model. This metric can be used as a simple unsupervised QE method, while employing it results in two issues: firstly, the untranslated tokens leading to unexpectedly high translation scores, and secondly, the issue of mismatching errors between source and hypothesis tokens when applying the greedy matching in XLMRScore. To mitigate these issues, we suggest replacing untranslated words with the unknown token and the cross-lingual alignment of the pre-trained model to represent aligned words closer to each other, respectively. We evaluate the proposed method on four low-resource language pairs of WMT21 QE shared task, as well as a new English-Farsi test dataset introduced in this paper. Experiments show that our method could get comparable results with the supervised baseline for two zero-shot scenarios, i.e., with less than 0.01 difference in Pearson correlation, while outperforming unsupervised rivals in all the low-resource language pairs for above 8%, on average.Comment: Submitted to Language Resources and Evaluatio

arXiv.org e-Print Archive

Findings of the 2017 Conference on Machine Translation

Author: Bojar Ondřej
Chatterjee Rajen
Federmann Christian
Graham Yvette
Haddow Barry
Huang Shujian
Huck Matthias
Koehn Philipp
Liu Qun
Logacheva Varvara
Monz Christof
Negri Matteo
Post Matt
Rubino Raphael
Specia Lucia
Turchi Marco
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017
Field of study

This paper presents the results of the WMT17 shared tasks, which included three machine translation (MT) tasks (news, biomedical, and multimodal), two evaluation tasks (metrics and run-time estimation of MT quality), an automatic post-editing task, a neural MT training task, and a bandit learning task

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Irish Universities

Edinburgh Research Explorer

Biblio at Institute of Formal and Applied Linguistics

DCU Online Research Access Service

Low-Resource Unsupervised NMT:Diagnosing the Problem and Providing a Linguistically Motivated Solution

Author: Edman Lukas
Noord van, Gertjan
Toral Ruiz Antonio
Publication venue
Publication date: 01/01/2020
Field of study

ARTS repository - University of Groningen

Low-Resource Unsupervised NMT:Diagnosing the Problem and Providing a Linguistically Motivated Solution

Author: Edman Lukas
Noord van, Gertjan
Toral Ruiz Antonio
Publication venue
Publication date: 01/01/2020
Field of study

Dissertations of the University of Groningen

Low-Resource Unsupervised NMT:Diagnosing the Problem and Providing a Linguistically Motivated Solution

Author: Edman Lukas
Noord van, Gertjan
Toral Ruiz Antonio
Publication venue
Publication date: 01/01/2020
Field of study

Unsupervised Machine Translation hasbeen advancing our ability to translatewithout parallel data, but state-of-the-artmethods assume an abundance of mono-lingual data. This paper investigates thescenario where monolingual data is lim-ited as well, finding that current unsuper-vised methods suffer in performance un-der this stricter setting. We find that theperformance loss originates from the poorquality of the pretrained monolingual em-beddings, and we propose using linguis-tic information in the embedding train-ing scheme. To support this, we look attwo linguistic features that may help im-prove alignment quality: dependency in-formation and sub-word information. Us-ing dependency-based embeddings resultsin a complementary word representationwhich offers a boost in performance ofaround 1.5 BLEU points compared to stan-dardWORD2VECwhen monolingual datais limited to 1 million sentences per lan-guage. We also find that the inclusion ofsub-word information is crucial to improv-ing the quality of the embedding

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Tailoring Domain Adaptation for Machine Translation Quality Estimation

Author: Blain Frédéric
Emmery Chris
Sharami Javad Pourmostafa Roshan
Shterionov Dimitar
Sisto Mirella De
Spronck Pieter
Vanmassenhove Eva
Publication venue
Publication date: 18/04/2023
Field of study

While quality estimation (QE) can play an important role in the translation process, its effectiveness relies on the availability and quality of training data. For QE in particular, high-quality labeled data is often lacking due to the high cost and effort associated with labeling such data. Aside from the data scarcity challenge, QE models should also be generalizable, i.e., they should be able to handle data from different domains, both generic and specific. To alleviate these two main issues -- data scarcity and domain mismatch -- this paper combines domain adaptation and data augmentation within a robust QE system. Our method first trains a generic QE model and then fine-tunes it on a specific domain while retaining generic knowledge. Our results show a significant improvement for all the language pairs investigated, better cross-lingual inference, and a superior performance in zero-shot learning scenarios as compared to state-of-the-art baselines

Tilburg University Repository

Ti plasmids

Author: Van Montagu Marc
Publication venue: 'Elsevier BV'
Publication date: 01/01/2001
Field of study

Ghent University Academic Bibliography