Trained MT Metrics Learn to Cope with Machine-translated References
Neural metrics trained on human evaluations of MT tend to correlate well with
human judgments, but their behavior is not fully understood. In this paper, we
perform a controlled experiment and compare a baseline metric that has not been
trained on human evaluations (Prism) to a trained version of the same metric
(Prism+FT). Surprisingly, we find that Prism+FT becomes more robust to
machine-translated references, which are a notorious problem in MT evaluation.
This suggests that the effects of metric training go beyond the intended effect
of improving overall correlation with human judgments. Comment: WMT 202
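As a rough sketch of the kind of controlled comparison this abstract describes, the snippet below measures how much a metric's correlation with human judgments drops when human references are swapped for machine-translated ones. The `score_baseline` and `score_finetuned` callables stand in for Prism and Prism+FT and are purely illustrative assumptions, not the paper's code or data.

```python
# Sketch of the controlled comparison described above (not the authors' code).
# The metric callables and toy inputs are assumptions; a real study would call
# the actual Prism / Prism+FT implementations on WMT data.
from scipy.stats import kendalltau

def correlation_with_humans(metric, hypotheses, references, human_scores):
    """Kendall's tau between a metric's scores and human judgments."""
    metric_scores = [metric(hyp, ref) for hyp, ref in zip(hypotheses, references)]
    tau, _ = kendalltau(metric_scores, human_scores)
    return tau

def reference_robustness(metric, hypotheses, human_refs, mt_refs, human_scores):
    """Drop in correlation when human references are replaced by MT references."""
    with_human = correlation_with_humans(metric, hypotheses, human_refs, human_scores)
    with_mt = correlation_with_humans(metric, hypotheses, mt_refs, human_scores)
    return with_human - with_mt  # smaller drop = more robust to MT references
```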
A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism
We show that content on the web is often translated into many languages, and
the low quality of these multi-way translations indicates they were likely
created using Machine Translation (MT). Multi-way parallel, machine-generated
content not only dominates the translations in lower-resource languages; it
also constitutes a large fraction of the total web content in those languages.
We also find evidence of a selection bias in the type of content that is
translated into many languages, consistent with low-quality English content
being translated en masse into many lower-resource languages via MT. Our work
raises serious concerns about training models such as multilingual large
language models on both monolingual and bilingual data scraped from the web. Comment: Accepted at ACL Findings 202
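A rough illustration, under assumed data structures, of how multi-way parallelism of the kind this abstract discusses might be surfaced: sentences that align to the same English pivot across many languages are flagged as likely translated in bulk via MT. The field names and threshold below are illustrative assumptions, not the paper's pipeline.

```python
# Illustrative sketch only: group aligned web sentences by a shared English
# pivot and flag pivots that appear in many languages as multi-way parallel
# (and therefore likely machine translated). Threshold/fields are assumptions.
from collections import defaultdict

def find_multiway_parallel(aligned_pairs, min_languages=8):
    """aligned_pairs: iterable of (english_sentence, language_code) tuples."""
    languages_per_pivot = defaultdict(set)
    for english_sentence, language_code in aligned_pairs:
        languages_per_pivot[english_sentence].add(language_code)
    # Keep only pivots translated into at least `min_languages` languages.
    return {
        pivot: langs
        for pivot, langs in languages_per_pivot.items()
        if len(langs) >= min_languages
    }
```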
Searching Toward Pareto-Optimal Device-Aware Neural Architectures
Recent breakthroughs in Neural Architecture Search (NAS) have achieved
state-of-the-art performance in many tasks such as image classification and
language understanding. However, most existing works optimize only for model
accuracy and largely ignore other important factors imposed by the underlying
hardware and devices, such as latency and energy, at inference time. In this
paper, we first introduce the problem of NAS and provide a survey of recent
work. Then we dive deeply into two recent advances that extend NAS into
multi-objective frameworks: MONAS and DPP-Net. Both MONAS and DPP-Net are
capable of optimizing accuracy along with other objectives imposed by devices,
searching for neural architectures that can be best deployed on a wide spectrum
of devices: from embedded systems and mobile devices to workstations.
Experimental results show that the architectures found by MONAS and DPP-Net
achieve Pareto optimality with respect to the given objectives for various
devices. Comment: ICCAD'18 Invited Paper
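As a concrete illustration of the Pareto-optimality criterion this abstract refers to, applied to accuracy and a device cost such as latency, the sketch below keeps only non-dominated architectures. It is a generic filter under assumed field names, not the MONAS or DPP-Net search procedures themselves.

```python
# Generic Pareto-front filter over (accuracy, latency) pairs; higher accuracy
# and lower latency are better. Illustrates the optimality criterion only,
# not the MONAS/DPP-Net search algorithms.
def pareto_front(candidates):
    """candidates: list of dicts with 'name', 'accuracy', 'latency_ms'."""
    front = []
    for cand in candidates:
        dominated = any(
            other["accuracy"] >= cand["accuracy"]
            and other["latency_ms"] <= cand["latency_ms"]
            and (other["accuracy"] > cand["accuracy"]
                 or other["latency_ms"] < cand["latency_ms"])
            for other in candidates
        )
        if not dominated:
            front.append(cand)
    return front

# Example with made-up numbers: 'b' is dominated (slower and less accurate than 'a').
archs = [
    {"name": "a", "accuracy": 0.74, "latency_ms": 12.0},
    {"name": "b", "accuracy": 0.72, "latency_ms": 15.0},
    {"name": "c", "accuracy": 0.76, "latency_ms": 30.0},
]
print([a["name"] for a in pareto_front(archs)])  # ['a', 'c']
```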
