19 research outputs found

    Capturing lexical variation in MT evaluation using automatically built sense-cluster inventories

    Most existing Machine Translation (MT) evaluation metrics are too strict to capture lexical variation in translation. However, a central issue in MT evaluation is the high correlation that the metrics should have with human judgments of translation quality. To achieve a higher correlation, identifying sense correspondences between the compared translations becomes essential. Given that most metrics look for exact correspondences, the evaluation results are often misleading about translation quality. Moreover, existing metrics do not allow a conclusive estimate of the impact of integrating Word Sense Disambiguation techniques into MT systems. In this paper, we show how information acquired by an unsupervised semantic analysis method can be used to make MT evaluation more sensitive to lexical semantics. The sense inventories built by this data-driven method are incorporated into METEOR: they replace WordNet for evaluation in English and make METEOR’s synonymy module operable in French. The evaluation results demonstrate that using these inventories increases both the number of matches and the correlation with human judgments of translation quality, compared to precision-based metrics.

    Amazigh Representation in the UNL Framework: Resource Implementation

    This paper discusses the first steps taken to create the linguistic resources needed to incorporate the Amazigh language into the Universal Networking Language (UNL) framework for machine translation. This universal interlanguage allows any source text to be translated into other languages related through UNL by converting the meaning of the source text into a semantic graph. This encoding serves as a pivot interlanguage in translation systems. In this work, we focus on presenting the morphological, syntactic and lexical mapping stages needed to build an “Amazigh dictionary” according to the UNL framework and the “UNL-Amazigh Dictionary”, both of which take part in the enconversion and deconversion processes.

    Normalizing English for Interlingua: Multi-channel Approach to Global Machine Translation

    The paper tries to demonstrate that when English is used as an interlingua in translating between two languages, it can be normalized to reduce unnecessary ambiguity. Current English usage often omits critical features such as the relative pronoun and the conjunction that mark the beginning of a subordinate clause. In addition to causing ambiguity, this practice makes it difficult to produce correct structures in the target language. If the source language makes such structures explicit, this information can be carried through the whole translation chain into the target language. If we treat English as an interlingua in a multilingual translation environment, we should make the intermediate stage as unambiguous as possible. There are also other ways to reduce ambiguity, such as selecting less ambiguous translation equivalents. In addition, long noun compounds, which are often ambiguous, can be presented in unambiguous form when linguistic knowledge of the source language is included. Non peer reviewed.

    Myanmar Phrases Translation Model with Morphological Analysis for Statistical Myanmar to English Translation System

    Clause restructuring for statistical machine translation

    O avtomatski evalvaciji strojnega prevajanja (On the Automatic Evaluation of Machine Translation)

    Evaluating translations is a constant part of machine translation development, and it relies mainly on automatic procedures. These are always based on a reference translation. In this paper, we show how widely reference translations can differ in the subtitling domain and how this can affect the score: the same metric can rate the same translation system as unusable or as very successful solely because the reference translations used were obtained by different procedures, even though they are always fully adequate linguistically and semantically.