Translation of "It" in a Deep Syntax Framework
We present a novel approach to the translation of the English personal pronoun it to Czech. We conduct a linguistic analysis of how the distinct categories of it are usually mapped to their Czech counterparts. Armed with these observations, we design a discriminative translation model of it, which is then integrated into the TectoMT deep syntax MT framework. Features in the model take advantage of the rich syntactic annotation that TectoMT is based on, external tools for anaphoricity resolution, lexical co-occurrence frequencies measured on a large parallel corpus, and gold coreference annotation. Even though the new model for it exhibits no improvement in terms of BLEU, manual evaluation shows that it outperforms the original solution in 8.5% of sentences containing it.
Two Case Studies on Translating Pronouns in a Deep Syntax Framework
We focus on improving the translation of the English pronoun it and English reflexive pronouns in an English-Czech syntax-based machine translation framework. Our evaluation, from both an intrinsic and an extrinsic perspective, shows that adding specialized syntactic and coreference-related features leads to an improvement in translation quality.
Discourse Structure in Machine Translation Evaluation
In this article, we explore the potential of using sentence-level discourse
structure for machine translation evaluation. We first design discourse-aware
similarity measures, which use all-subtree kernels to compare discourse parse
trees in accordance with the Rhetorical Structure Theory (RST). Then, we show
that a simple linear combination with these measures can help improve various
existing machine translation evaluation metrics regarding correlation with
human judgments both at the segment- and at the system-level. This suggests
that discourse information is complementary to the information used by many of
the existing evaluation metrics, and thus it could be taken into account when
developing richer evaluation metrics, such as the WMT-14 winning combined
metric DiscoTKparty. We also provide a detailed analysis of the relevance of
various discourse elements and relations from the RST parse trees for machine
translation evaluation. In particular we show that: (i) all aspects of the RST
tree are relevant, (ii) nuclearity is more useful than relation type, and (iii)
the similarity of the translation RST tree to the reference tree is positively
correlated with translation quality. Comment: machine translation, machine translation evaluation, discourse
analysis. Computational Linguistics, 201
Meaningfulness, the unsaid and translatability. Instead of an introduction
The present paper opens this topical issue on translation techniques by laying a theoretical basis for the discussion of translational issues from a linguistic perspective. In order to put forward an audience-oriented definition of translation, I will describe different forms of linguistic variability, highlighting how they present different difficulties to translators, with an emphasis on the semantic and communicative complexity that a source text can exhibit. The problem is then further discussed through a comparison between Quine's radically holistic position and the translatability principle supported by such semanticists as Katz. General translatability, at the expense of additional complexity, is eventually proposed as a possible synthesis of this debate. In describing the meaningfulness levels of source texts through Hjelmslevian semiotics, and his semiotic hierarchy in particular, the paper attempts to go beyond denotative semiotic and to reframe some translational issues in a connotative semiotic and metasemiotic perspective.
Non-linear Learning for Statistical Machine Translation
Modern statistical machine translation (SMT) systems usually use a linear
combination of features to model the quality of each translation hypothesis.
The linear combination assumes that all the features are in a linear
relationship and constrains each feature to interact with the remaining
features in a linear manner, which might limit the expressive power of the
model and lead to an under-fit model on the current data. In this paper, we
propose non-linear modeling of the quality of translation hypotheses based on
neural networks, which allows more complex interactions between features. A
learning framework is presented for training the non-linear models. We also
discuss possible heuristics for designing the network structure which may
improve the non-linear learning performance. Experimental results show that,
with the basic features of a hierarchical phrase-based machine translation
system, our method produces translations that are better than those of a
linear model. Comment: submitted to a conferenc
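The contrast drawn in this abstract between linear and non-linear hypothesis scoring can be illustrated with a minimal sketch. It is not the authors' model: the single tanh hidden layer, the function names (`linear_score`, `nonlinear_score`, `best_hypothesis`) and all weights are assumptions chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence


def linear_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Conventional SMT-style score: a weighted sum of feature values."""
    return sum(w * f for w, f in zip(weights, features))


def nonlinear_score(
    features: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],
    hidden_bias: Sequence[float],
    output_weights: Sequence[float],
) -> float:
    """Hypothetical one-hidden-layer network score (an assumed architecture).

    Each hidden unit mixes all features before the tanh non-linearity,
    so features can interact in ways a weighted sum cannot express.
    """
    hidden = [
        math.tanh(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return sum(v * h for v, h in zip(output_weights, hidden))


def best_hypothesis(
    hypotheses: List[Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
) -> int:
    """Return the index of the highest-scoring hypothesis feature vector."""
    return max(range(len(hypotheses)), key=lambda i: score_fn(hypotheses[i]))
```

Either scorer can be passed to `best_hypothesis` through a small wrapper, for example `best_hypothesis(feature_vectors, lambda f: linear_score(f, weights))`, so the two models can be compared on the same list of candidate translations.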