
    Translation of "It" in a Deep Syntax Framework

    We present a novel approach to the translation of the English personal pronoun "it" to Czech. We conduct a linguistic analysis of how the distinct categories of "it" are usually mapped to their Czech counterparts. Armed with these observations, we design a discriminative translation model of "it", which is then integrated into the TectoMT deep syntax MT framework. Features in the model take advantage of the rich syntactic annotation TectoMT is based on, external tools for anaphoricity resolution, lexical co-occurrence frequencies measured on a large parallel corpus, and gold coreference annotation. Even though the new model for "it" exhibits no improvement in terms of BLEU, manual evaluation shows that the model outperforms the original solution in 8.5% of sentences containing "it".

    Two Case Studies on Translating Pronouns in a Deep Syntax Framework

    We focus on improving the translation of the English pronoun "it" and English reflexive pronouns in an English-Czech syntax-based machine translation framework. Our evaluation, from both intrinsic and extrinsic perspectives, shows that adding specialized syntactic and coreference-related features leads to an improvement in translation quality.

    Discourse Structure in Machine Translation Evaluation

    In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics in terms of correlation with human judgments, at both the segment level and the system level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTKparty. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that: (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference tree is positively correlated with translation quality.
    Comment: machine translation, machine translation evaluation, discourse analysis. Computational Linguistics, 201
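
The idea in the abstract above can be illustrated with a minimal sketch. It assumes a Collins-Duffy-style fragment-counting tree kernel, which may differ from the exact kernel used in the article. The nested-tuple tree encoding, the function names, and the weights are all hypothetical and chosen only for illustration.

```python
import math

def production(node):
    """Node label plus its children's labels; leaves (str) and subtrees are kept distinct."""
    label, *children = node
    return (label, tuple(c if isinstance(c, str) else ("node", c[0]) for c in children))

def nodes(tree):
    """Yield every internal node of a tree written as nested tuples."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def common_fragments(n1, n2):
    """Count tree fragments rooted at both n1 and n2 (Collins-Duffy recursion)."""
    if production(n1) != production(n2):
        return 0
    count = 1
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # matching productions guarantee c2 is a subtree too
            count *= 1 + common_fragments(c1, c2)
    return count

def tree_kernel(t1, t2):
    """Sum shared fragment counts over all pairs of nodes."""
    return sum(common_fragments(a, b) for a in nodes(t1) for b in nodes(t2))

def discourse_similarity(t1, t2):
    """Kernel value normalized to [0, 1]."""
    return tree_kernel(t1, t2) / math.sqrt(tree_kernel(t1, t1) * tree_kernel(t2, t2))

def combined_score(metric_score, discourse_score, w_metric, w_discourse):
    """Linear combination of an existing metric with the discourse similarity.
    The weights are assumptions; in practice they would be tuned on human judgments."""
    return w_metric * metric_score + w_discourse * discourse_score
```

For example, a reference tree `("Elaboration", ("Nucleus", "a"), ("Satellite", "b"))` and a translation tree that differs only in its root relation share just the two nuclearity fragments, so their normalized similarity is lower than that of two identical trees.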

    Meaningfulness, the unsaid and translatability. Instead of an introduction

    The present paper opens this topical issue on translation techniques by laying a theoretical basis for the discussion of translational issues from a linguistic perspective. In order to put forward an audience-oriented definition of translation, I will describe different forms of linguistic variability, highlighting how they present different difficulties to translators, with an emphasis on the semantic and communicative complexity that a source text can exhibit. The problem is then further discussed through a comparison between Quine's radically holistic position and the translatability principle supported by such semanticists as Katz. General translatability, at the expense of additional complexity, is eventually proposed as a possible synthesis of this debate. In describing the meaningfulness levels of source texts through Hjelmslevian semiotics, and his semiotic hierarchy in particular, the paper attempts to go beyond denotative semiotic and to reframe some translational issues in a connotative semiotic and metasemiotic perspective.

    Non-linear Learning for Statistical Machine Translation

    Modern statistical machine translation (SMT) systems usually use a linear combination of features to model the quality of each translation hypothesis. The linear combination assumes that all the features are in a linear relationship and constrains each feature to interact with the remaining features in a linear manner, which might limit the expressive power of the model and lead to an under-fit model on the current data. In this paper, we propose non-linear modeling of the quality of translation hypotheses based on neural networks, which allows more complex interaction between features. A learning framework is presented for training the non-linear models. We also discuss possible heuristics for designing the network structure that may improve the non-linear learning performance. Experimental results show that, with the basic features of a hierarchical phrase-based machine translation system, our method produces translations that are better than those of a linear model.
    Comment: submitted to a conference
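
The contrast drawn in the abstract above can be sketched as follows. This is a minimal illustration, assuming a single hidden layer with tanh units; the network structure, function names, and weights are hypothetical and are not taken from the paper.

```python
import math

def linear_score(features, weights):
    """Conventional SMT model score: a weighted sum of feature values."""
    return sum(w * f for w, f in zip(weights, features))

def nonlinear_score(features, hidden_weights, hidden_bias, output_weights):
    """One-hidden-layer network score (an assumed structure).
    Each tanh unit mixes all features, so features can interact non-linearly."""
    hidden = [
        math.tanh(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden))

def best_hypothesis(hypotheses, score_fn):
    """Pick the hypothesis text whose feature vector scores highest.
    `hypotheses` is a list of (text, feature_vector) pairs."""
    return max(hypotheses, key=lambda h: score_fn(h[1]))[0]
```

Either scoring function can be passed to `best_hypothesis`, so a decoder or reranker built on the linear score could in principle swap in the non-linear one without changing how hypotheses are compared.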