8 research outputs found

    Comparative Quality Estimation for Machine Translation. An Application of Artificial Intelligence on Language Technology using Machine Learning of Human Preferences

    In this thesis we focus on Comparative Quality Estimation, the automatic process of analysing two or more translations produced by a Machine Translation (MT) system and expressing a judgment about their comparison. We approach the problem from a supervised machine learning perspective, with the aim to learn from human preferences. As a result, we create the ranking mechanism, a pipeline that includes the necessary tasks for ordering several MT outputs of a given source sentence in terms of relative quality. Quality Estimation models are trained to statistically associate the judgments with some qualitative features. For this purpose, we design a broad set of features with a particular focus on the ones with a grammatical background. Through an iterative feature engineering process, we investigate several feature sets, we identify the ones that achieve the best performance and we proceed to linguistically intuitive observations about the contribution of individual features. Additionally, we employ several feature selection and machine learning methods to take advantage of these features. We suggest the usage of binary classifiers after decomposing the ranking into pairwise decisions. In order to reduce the amount of uncertain decisions (ties) we weight the pairwise decisions with their classification probability. Through a set of experiments, we show that the ranking mechanism can learn and reproduce rankings that correlate with the ones given by humans. Most importantly, it can be successfully compared with state-of-the-art reference-aware metrics and other known ranking methods for several language pairs. We also apply this method for a hybrid MT system combination and we show that it is able to improve the overall translation performance. Finally, we examine the correlation between common MT errors and decoding events of the phrase-based statistical MT systems. Through evidence from the decoding process, we identify some cases where long-distance grammatical phenomena cannot be captured properly. An additional outcome of this thesis is the open source software Qualitative, which implements the full pipeline of the ranking mechanism and the system combination task. It integrates a multitude of state-of-the-art natural language processing tools and can support the development of new models. Apart from the usage in experiment pipelines, it can serve as an application back-end for web applications in real-use scenarios.
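
The abstract above describes decomposing a ranking into pairwise decisions and weighting each decision by its classification probability to reduce ties. The following is only a minimal sketch of that general idea, not the thesis's or Qualitative's actual implementation: the function names `rank_by_pairwise` and `to_ranks`, the callback `prob_better`, and the toy probabilities are all hypothetical, and a real system would obtain the probabilities from a trained binary classifier.

```python
from itertools import combinations


def rank_by_pairwise(n, prob_better, weighted=True):
    """Score n candidate translations from pairwise decisions.

    prob_better(i, j) is a hypothetical stand-in for a binary classifier:
    it returns the estimated probability that candidate i is better than j.
    """
    scores = [0.0] * n
    for i, j in combinations(range(n), 2):
        p = prob_better(i, j)
        if weighted:
            # Soft vote: each side is credited with its probability mass.
            scores[i] += p
            scores[j] += 1.0 - p
        else:
            # Hard vote: the more probable side gets one full win.
            winner = i if p >= 0.5 else j
            scores[winner] += 1.0
    return scores


def to_ranks(scores):
    """Convert scores to ranks (1 = best); equal scores share a rank."""
    distinct = sorted(set(scores), reverse=True)
    return [distinct.index(s) + 1 for s in scores]


# Toy probabilities (invented for illustration): a preference cycle
# in which the three decisions are made with different confidence.
TOY = {(0, 1): 0.9, (0, 2): 0.4, (1, 2): 0.6}

hard = to_ranks(rank_by_pairwise(3, lambda i, j: TOY[(i, j)], weighted=False))
soft = to_ranks(rank_by_pairwise(3, lambda i, j: TOY[(i, j)], weighted=True))
print("hard votes:", hard)
print("probability-weighted:", soft)
```

In this toy case each candidate wins exactly one hard comparison, so hard voting leaves all three tied, while the probability-weighted scores differ and yield a full ordering.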

    Argumentative zoning information extraction from scientific text

    Let me tell you, writing a thesis is not always a barrel of laughs, and strange things can happen, too. For example, at the height of my thesis paranoia, I had a recurrent dream in which my cat Amy gave me detailed advice on how to restructure the thesis chapters, which was awfully nice of her. But I also had a lot of human help throughout this time, whether things were going fine or berserk. Most of all, I want to thank Marc Moens: I could not have had a better or more knowledgeable supervisor. He always took time for me, however busy he might have been, reading chapters thoroughly in two days. He had both the calmness of mind to give me lots of freedom in research and the right judgement to guide me away, tactfully but determinedly, from the occasional catastrophe or other waiting along the way. He was great fun to work with and also became a good friend. My work has profited from the interdisciplinary, interactive and enlightened atmosphere at the Human Communication Centre and the Centre for Cognitive Science (which is now called something else). The Language Technology Group was a great place to work in, as my research was grounded in practical applications develope

    The procedure to construct a word predictor in a speech understanding system from a task-specific grammar defined in a CFG or a DCG


    Simplifying, reading, and machine translating health content: an empirical investigation of usability

    Text simplification, through plain language (PL) or controlled language (CL), is adopted to increase readability, comprehension and machine translatability of (health) content. Cochrane is a non-profit organisation where volunteer authors summarise and simplify health-related English texts on the impact of treatments and interventions into plain language summaries (PLS), which are then disseminated online to the lay audience and translated. Cochrane’s simplification approach is non-automated, and involves the manual checking and implementation of different sets of PL guidelines, which can be an unsatisfactory, challenging and time-consuming task. This thesis examined whether using the Acrolinx CL checker to automatically and consistently check PLS for readability and translatability issues would increase the usability of Cochrane’s simplification approach and, more precisely: (i) authors’ satisfaction; and (ii) authors’ effectiveness in terms of readability, comprehensibility, and machine translatability into Spanish. Data on satisfaction were collected from twelve Cochrane authors by means of the System Usability Scale and follow-up preference questions. Readability was analysed through the computational tool Coh-Metrix. Evidence on comprehensibility was gathered through ratings and recall protocols produced by lay readers, both native and non-native speakers of English. Machine translatability was assessed in terms of adequacy and fluency with forty-one Cochrane contributors, all native speakers of Spanish. Authors seemed to welcome the introduction of Acrolinx, and the adoption of this CL checker reduced word length, sentence length, and syntactic complexity. No significant impact on comprehensibility and machine translatability was identified. We observed that reading skills and characteristics other than simplified language (e.g. formatting) might influence comprehension. Machine translation quality was relatively high, with mainly style issues. This thesis presented an environment that could boost volunteer authors’ satisfaction and foster their adoption of simple language. We also discussed strategies to increase the accessibility of online health content among lay readers with different skills and language backgrounds.

    Individual Differences in the Variable of Presented Personality

    The thesis is concerned with research in the field of human decision-making, concentrating on techniques of gaming for the pursuit of this research. Following an introduction to the work and a statement of the research programme as it was initially conceived, some current ideas in gaming are investigated. The Superior Commander system of game control is introduced. The content of research games is discussed, and the Organisational Control Game, a board war game designed for research, is described. It is shown that the Organisational Control Game and Superior Commander system successfully meet the requirements for a useful research game and gaming methodology. A detailed literature survey of the psychological secondary task technique for assessing mental processing load is presented. It is noted that the technique might be extended to the study of tasks which have a large problem-solving component. A secondary task experiment on such a task, a chess problem task, is described. It is demonstrated that the secondary task approach can provide techniques for the investigation of complex problem-solving and decision-making tasks. A series of plays of the Organisational Control Game, in which the players had had previous military experience, is described. These games are compared with an earlier series of games, in which the players were students. Certain differences in playing style are identified. The research programme is re-examined, and modifications to it are described. The need for a technique for elucidation and examination of an individual decision-maker's perceptions of his decision-making environment is identified. The technique of cognitive mapping is shown to be suitable for this purpose. A cognitive map analysis of a series of games in which the players were serving army officers is presented.