Building a resource for studying translation shifts
This paper describes an interdisciplinary approach which brings together the
fields of corpus linguistics and translation studies. It presents ongoing work
on the creation of a corpus resource in which translation shifts are explicitly
annotated. Translation shifts denote departures from formal correspondence
between source and target text, i.e. deviations that have occurred during the
translation process. A resource in which such shifts are annotated in a
systematic way will make it possible to study those phenomena that need to be
addressed if machine translation output is to resemble human translation. The
resource described in this paper contains English source texts (parliamentary
proceedings) and their German translations. The shift annotation is based on
predicate-argument structures and proceeds in two steps: first, predicates and
their arguments are annotated monolingually in a straightforward manner. Then,
the corresponding English and German predicates and arguments are aligned with
each other. Whenever a shift (mainly grammatical or semantic) has occurred,
the alignment is tagged accordingly.
Comment: 6 pages, 1 figure
Identifying Semantic Divergences in Parallel Text without Annotations
Recognizing that even correct translations are not always semantically
equivalent, we automatically detect meaning divergences in parallel sentence
pairs with a deep neural model of bilingual semantic similarity which can be
trained for any parallel corpus without any manual annotation. We show that our
semantic model detects divergences more accurately than models based on surface
features derived from word alignments, and that these divergences matter for
neural machine translation.
Comment: Accepted as a full paper to NAACL 201
LinES: An English-Swedish Parallel Treebank
Proceedings of the 16th Nordic Conference
of Computational Linguistics NODALIDA-2007.
Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit.
University of Tartu, Tartu, 2007.
ISBN 978-9985-4-0513-0 (online)
ISBN 978-9985-4-0514-7 (CD-ROM)
pp. 270-273