Neural machine translation represents an exciting leap forward in translation
quality. But what longstanding weaknesses does it resolve, and which remain? We
address these questions with a challenge set approach to translation evaluation
and error analysis. A challenge set consists of a small set of sentences, each
hand-designed to probe a system's capacity to bridge a particular structural
divergence between languages. To exemplify this approach, we present an
English-French challenge set, and use it to analyze phrase-based and neural
systems. The resulting analysis provides not only a more fine-grained picture
of the strengths of neural systems, but also insight into which linguistic
phenomena remain out of reach.

Comment: EMNLP 2017. 28 pages, including appendix. Machine readable data
included in a separate file. This version corrects typos in the challenge se