11 research outputs found

    Integrating meaning into quality evaluation of machine translation

    Machine translation (MT) quality is evaluated by comparing MT outputs with human translations (HT). Traditionally, this evaluation relies on form-related features (e.g. lexicon and syntax) and ignores the transfer of meaning reflected in HT outputs. Instead, we evaluate the quality of MT outputs through meaning-related features (e.g. polarity, subjectivity) in two experiments. In the first experiment, the meaning-related features are compared to human rankings individually. In the second, combinations of meaning-related features and other quality metrics are used to predict the same human rankings. The results of our experiments confirm the benefit of these features for predicting human evaluations of translation quality, in addition to traditional metrics that focus mainly on form.
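
    A minimal sketch of the first experiment's idea, assuming TextBlob for the polarity feature and SciPy for rank correlation; the data and the polarity_shift helper below are hypothetical illustrations, not the authors' setup.

    from textblob import TextBlob
    from scipy.stats import spearmanr

    def polarity_shift(mt, ht):
        # Absolute difference in polarity between an MT output and its
        # human translation; a large shift suggests distorted meaning.
        return abs(TextBlob(mt).sentiment.polarity
                   - TextBlob(ht).sentiment.polarity)

    # Hypothetical parallel data: MT outputs, HT references, human ranks (1 = best).
    mt_outputs = ["the film was great", "the film was not great",
                  "the film was terrible", "the film was good"]
    ht_refs = ["the movie was excellent"] * 4
    human_ranks = [1, 3, 4, 2]

    shifts = [polarity_shift(mt, ht) for mt, ht in zip(mt_outputs, ht_refs)]
    rho, _ = spearmanr(shifts, human_ranks)
    print(f"Spearman correlation with human ranks: {rho:.2f}")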

    Quality estimation system for a monolingual natural language parser

    Psycholinguistically motivated natural language parsing is a new language technology method that models human sentence processing. The model is a real-time parser in which several threads analyse, in parallel, the words, phrases or sentences arriving sequentially on the input. One of these parallel threads is the quality estimation module, which manages and filters erroneous and noisy input and informs the other threads about the current quality of the input. To build the quality estimation module, we adopted the quality estimation method used in machine translation evaluation. To make our quality estimation model one of the parser's parallel threads, we combined the original quality estimation system with a task-oriented architecture. In the course of our research we built a task-oriented quality estimation system capable of estimating the quality of monolingual text in real time. With this system we can estimate the quality of the input text with an accuracy of ~70%. The system was built for the AnaGramma Hungarian parser, but it can be applied to other languages as well.
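
    A minimal sketch of the idea behind such a quality estimation thread, not the AnaGramma module itself: shallow surface features over incoming text feed a binary quality classifier whose verdict the other threads could consult. The features, data and scikit-learn model below are illustrative assumptions.

    import re
    from sklearn.linear_model import LogisticRegression

    def qe_features(text):
        # Shallow, language-independent quality cues (illustrative only).
        tokens = text.split()
        return [len(tokens),
                sum(len(t) for t in tokens) / max(len(tokens), 1),      # avg token length
                len(re.findall(r"[^\w\s]", text)) / max(len(text), 1),  # punctuation ratio
                sum(t.isdigit() for t in tokens) / max(len(tokens), 1)] # digit ratio

    # Tiny hypothetical training set: 1 = clean input, 0 = noisy/erroneous.
    train_texts = ["A quick brown fox jumps over the lazy dog .",
                   "asdf ;;;; 999 q--q !!",
                   "The parser reads the sentence word by word .",
                   "zzzz 1 2 3 4 ???? //"]
    train_labels = [1, 0, 1, 0]

    clf = LogisticRegression().fit([qe_features(t) for t in train_texts], train_labels)
    print(clf.predict([qe_features("New words arrive on the input stream .")]))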

    Document-Level Machine Translation Quality Estimation

    Assessing Machine Translation (MT) quality at the document level is challenging, as metrics need to account for linguistic phenomena on many different levels. Large units of text encompass different linguistic phenomena and, as a consequence, a machine-translated document can exhibit different kinds of problems. It is hard for humans to evaluate documents with respect to document-wide phenomena (e.g. coherence), as they are easily distracted by problems at other levels (e.g. grammar). Although standard automatic evaluation metrics (e.g. BLEU) are often used for this purpose, they focus on n-gram matches and largely disregard document-wide information. Therefore, although such metrics are useful for comparing different MT systems, they may not reflect nuances of quality in individual documents. Machine-translated documents can also be evaluated according to the task they will be used for. Methods based on measuring the distance between machine translations and their post-edited versions are widely used for task-based purposes. Another task-based method is to use reading comprehension questions about the machine-translated document as a proxy for document quality. Quality Estimation (QE) is an evaluation approach that attempts to predict the quality of MT outputs using trained Machine Learning (ML) models. This method is flexible because any type of quality assessment can be used to build the QE models. Thus far, BLEU-style metrics have been used as quality labels for document-level QE, leading to unreliable predictions because document-wide information is neglected. Challenges of document-level QE include the choice of adequate labels, the use of appropriate features and the study of suitable ML models. In this thesis we focus on feature engineering, the design of quality labels and the use of ML methods for document-level QE. Our new features can be classified as document-wide (using shallow document information), discourse-aware (using information about discourse structures) and consensus-based (using other machine translations as pseudo-references). New labels are proposed in order to overcome the lack of reliable labels for document-level QE, following two different approaches: one aimed at MT for assimilation, with a low quality requirement, and another aimed at MT for dissemination, with a high quality requirement. The assimilation labels use reading comprehension questions as a proxy for document quality. The dissemination approach uses a two-stage post-editing method to derive the quality labels. Different ML techniques are also explored for the document-level QE task, including the appropriate use of regression or classification and the study of kernel combination to deal with features of a different nature (e.g. hand-crafted features versus consensus features). We show that, in general, QE models predicting our new labels and using our discourse-aware features are more successful than models predicting automatic evaluation metrics. Regarding ML techniques, no firm conclusions could be drawn, as the different models performed similarly across the experiments.
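
    A minimal sketch of one consensus-based feature of the kind described above, assuming the sacrebleu package: the evaluated system's document is scored against other systems' translations used as pseudo-references. The data and function name are hypothetical, not the thesis implementation.

    import sacrebleu

    def consensus_feature(doc_sents, other_system_docs):
        # Document-level BLEU of one MT output against pseudo-references;
        # each element of other_system_docs is one system's sentence list,
        # aligned with doc_sents.
        return sacrebleu.corpus_bleu(doc_sents, other_system_docs).score

    # Hypothetical two-sentence document translated by three systems.
    evaluated = ["the cat sat on the mat .", "it was raining hard ."]
    system_b = ["the cat sat on a mat .", "it rained heavily ."]
    system_c = ["a cat sat on the mat .", "it was raining heavily ."]

    print(consensus_feature(evaluated, [system_b, system_c]))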

    Human Feedback in Statistical Machine Translation

    This thesis addresses the challenge of improving Statistical Machine Translation (SMT) systems via feedback given by humans on translation quality. The amount of human feedback available to such systems is inherently low due to cost and time limitations. One of our goals is to simulate this information by automatically generating pseudo-human feedback. This is done using Quality Estimation (QE) models. QE is a technique for predicting the quality of automatic translations without comparing them to oracle (human) translations, traditionally at the sentence or word level. QE models are trained on a small collection of automatic translations manually labelled for quality, and can then predict the quality of any number of unseen translations. We propose a number of improvements to QE models in order to increase the reliability of pseudo-human feedback. These include strategies to artificially generate training instances for settings where QE training data is scarce. We also introduce a new level of granularity for QE: the phrase level. It aims to improve QE predictions by better modelling inter-dependencies among word-level errors, in ways tailored to phrase-based SMT, where the basic unit of translation is a phrase. This, in turn, facilitates incorporating human feedback during the translation process. Finally, we introduce approaches to incorporate pseudo-human feedback, in the form of QE predictions, into SMT systems. More specifically, we use quality predictions to select the best translation from a number of alternative suggestions produced by SMT systems, and we integrate QE predictions into an SMT decoder in order to guide the translation generation process.
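
    A minimal sketch of the n-best selection idea, not the thesis system: a trained QE regressor scores each alternative translation and the highest-scoring candidate is kept. The SVR model, toy features and data below are hypothetical stand-ins for a real sentence-level QE setup.

    from sklearn.svm import SVR

    def qe_features(src, hyp):
        # Toy sentence-level QE features; real systems use far richer sets.
        return [len(hyp.split()),
                len(hyp.split()) / max(len(src.split()), 1)]

    # Hypothetical QE training data: (source, translation, quality in [0, 1]).
    train = [("das haus ist gross", "the house is big", 0.9),
             ("das haus ist gross", "house big the is", 0.2),
             ("er liest ein buch", "he reads a book", 0.95),
             ("er liest ein buch", "he book a reads reads", 0.1)]
    qe_model = SVR().fit([qe_features(s, h) for s, h, _ in train],
                         [q for _, _, q in train])

    def select_best(src, nbest):
        # Keep the candidate with the highest predicted quality.
        return max(nbest,
                   key=lambda h: qe_model.predict([qe_features(src, h)])[0])

    print(select_best("das haus ist gross",
                      ["the house is big", "big house the is the"]))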

    Ti plasmids

    eπQue: A software package for machine translation quality estimation

    XIII. Magyar Számítógépes Nyelvészeti Konferencia (13th Hungarian Conference on Computational Linguistics)
