Taking MT evaluation metrics to extremes: beyond correlation with human judgments
Automatic Machine Translation (MT) evaluation is an active field of research, with a handful of new metrics devised every year. Evaluation metrics are generally benchmarked against manual assessment of translation quality, with performance measured in terms of overall correlation with human scores. Much work has been dedicated to the improvement of evaluation metrics to achieve a higher correlation with human judgments. However, little insight has been provided regarding the weaknesses and strengths of existing approaches and their behavior in different settings. In this work we conduct a broad meta-evaluation study of the performance of a wide range of evaluation metrics focusing on three major aspects. First, we analyze the performance of the metrics when faced with different levels of translation quality, proposing a local dependency measure as an alternative to the standard, global correlation coefficient. We show that metric performance varies significantly across different levels of MT quality: metrics perform poorly when faced with low-quality translations and are not able to capture nuanced quality distinctions. Interestingly, we show that evaluating low-quality translations is also more challenging for humans. Second, we show that metrics are more reliable when evaluating neural MT than when evaluating traditional statistical MT systems. Finally, we show that the difference in the evaluation accuracy for different metrics is maintained even if the gold standard scores are based on different criteria.
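The contrast the abstract draws between a single global correlation and behavior at different quality levels can be illustrated with a small sketch. This is not the local dependency measure proposed in the paper, which the abstract does not define. It is a simpler stand-in that computes Pearson's r separately within bands of human scores. The function names (`pearson`, `correlation_by_quality_band`) and the toy scores are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / sqrt(var_x * var_y)


def correlation_by_quality_band(human, metric, n_bands=2):
    """Sort items by human score, split them into n_bands contiguous
    bands (lowest quality first), and return Pearson's r for each band."""
    pairs = sorted(zip(human, metric))
    size = len(pairs) // n_bands
    results = []
    for b in range(n_bands):
        start = b * size
        # the last band absorbs any remainder
        end = (b + 1) * size if b < n_bands - 1 else len(pairs)
        band = pairs[start:end]
        results.append(pearson([h for h, _ in band], [m for _, m in band]))
    return results


# Invented toy data: the metric tracks human scores well for good
# translations but is erratic for poor ones.
human_scores = [10, 20, 30, 40, 60, 70, 80, 90]
metric_scores = [0.30, 0.10, 0.25, 0.15, 0.50, 0.60, 0.70, 0.80]

global_r = pearson(human_scores, metric_scores)
band_r = correlation_by_quality_band(human_scores, metric_scores)

print("global r:", round(global_r, 3))
print("per-band r (low to high quality):", [round(r, 3) for r in band_r])
```

On this toy data the global coefficient is high even though the metric is negatively correlated with human scores in the low-quality band, which is the kind of effect a global figure alone would hide.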
Consistent improvement with eculizumab across muscle groups in myasthenia gravis
Objective: To assess whether eculizumab, a terminal complement inhibitor, improves patient- and physician-reported outcomes (evaluated using the myasthenia gravis activities of daily living profile and the quantitative myasthenia gravis scale, respectively) in patients with refractory anti-acetylcholine receptor antibody-positive generalized myasthenia gravis across four domains, representing ocular, bulbar, respiratory, and limb/gross motor muscle groups. Methods: Patients with refractory anti-acetylcholine receptor antibody-positive generalized myasthenia gravis were randomized 1:1 to receive either placebo or eculizumab during the REGAIN study (NCT01997229). Patients who completed REGAIN were eligible to continue into the open-label extension trial (NCT02301624) for up to 4 years. The four domain scores of each of the myasthenia gravis activities of daily living profile and the quantitative myasthenia gravis scale recorded throughout REGAIN and through 130 weeks of the open-label extension were analyzed. Results: Of the 125 patients who participated in REGAIN, 117 enrolled in the open-label extension; 61 had received placebo and 56 had received eculizumab during REGAIN. Patients experienced rapid improvements in total scores and all four domain scores of both the myasthenia gravis activities of daily living profile and the quantitative myasthenia gravis scale with eculizumab treatment. These improvements were sustained through 130 weeks of the open-label extension. Interpretation: Eculizumab treatment elicits rapid and sustained improvements in muscle strength across ocular, bulbar, respiratory, and limb/gross motor muscle groups and in associated daily activities in patients with refractory anti-acetylcholine receptor antibody-positive generalized myasthenia gravis.