Sentence-level quality estimation by predicting HTER as a multi-component metric
This submission investigates alternative machine learning models for
predicting the HTER score at the sentence level. Instead of predicting the
HTER score directly, we suggest a model that jointly predicts the counts of
the four distinct post-editing operations, which are then used to calculate
the HTER score. This also makes it possible to correct invalid (e.g. negative)
predicted values before the HTER score is calculated. Without any
feature exploration, a multi-layer perceptron with four outputs yields small but
significant improvements over the baseline.

Comment: Preview for the Quality Estimation Shared Task Description Paper for
the 2nd Conference on Machine Translation
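As a minimal sketch of the recomposition step described above, assuming (since the abstract does not spell this out) that the four outputs correspond to the standard TER edit operations (insertions, deletions, substitutions, shifts) and that HTER is the total edit count divided by the reference length:

```python
def hter_from_components(insertions: float, deletions: float,
                         substitutions: float, shifts: float,
                         reference_length: int) -> float:
    """Recompose HTER from predicted counts of the four post-editing
    operations, clamping invalid (negative) predictions to zero first.

    The mapping of the model's four outputs to these edit types is an
    assumption for illustration.
    """
    edits = sum(max(0.0, x) for x in (insertions, deletions, substitutions, shifts))
    return edits / reference_length

# Example: 20 reference words, with a slightly negative predicted
# shift count that is corrected to 0 before the division.
print(hter_from_components(1.2, 0.5, 2.1, -0.3, 20))  # 0.19
```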
Fine-grained evaluation of Quality Estimation for Machine Translation based on a linguistically-motivated Test Suite
We present an alternative method for evaluating Quality Estimation systems,
based on a linguistically-motivated Test Suite. We create a test set
covering 14 linguistic error categories and gather for each of them a set of
samples with both correct and erroneous translations. Then, we measure the
performance of 5 Quality Estimation systems by checking their ability to
distinguish between the correct and the erroneous translations. The detailed
results are much more informative about the abilities of each system than a
single overall score. The fact that different Quality Estimation systems
perform differently on the various phenomena confirms the usefulness of the
Test Suite.
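A minimal sketch of this pairwise evaluation, assuming (the abstract does not specify the interface) that each Quality Estimation system exposes a sentence-level scoring function where a higher score means better predicted quality:

```python
from collections import defaultdict
from typing import Callable, Iterable, Tuple

# One test item pairs a correct and an erroneous translation for a given
# linguistic error category; this data layout and the scoring convention
# (higher = better predicted quality) are assumptions for illustration.
TestItem = Tuple[str, str, str]  # (category, correct_mt, erroneous_mt)

def per_category_accuracy(score: Callable[[str], float],
                          items: Iterable[TestItem]) -> dict:
    """Fraction of pairs per category in which the QE system ranks the
    correct translation above the erroneous one."""
    hits, totals = defaultdict(int), defaultdict(int)
    for category, good, bad in items:
        totals[category] += 1
        if score(good) > score(bad):
            hits[category] += 1
    return {c: hits[c] / totals[c] for c in totals}
```

Reporting one accuracy per error category, rather than a single corpus-level correlation, is what makes the per-phenomenon differences between systems visible.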