Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation

Author: Byrne WJ
Iglesias Gonzalo
Saunders Danielle
Stahlberg F
Publication venue: AMTA 2018
Publication date: 17/03/2018

SGNMT is a decoding platform for machine translation which allows pairing various modern neural models of translation with different kinds of constraints and symbolic models. In this paper, we describe three use cases in which SGNMT is currently playing an active role: (1) teaching, as SGNMT is being used for course work and student theses in the MPhil in Machine Learning, Speech and Language Technology at the University of Cambridge, (2) research, as most of the research work of the Cambridge MT group is based on SGNMT, and (3) technology transfer, as we show how SGNMT is helping to transfer research findings from the laboratory to the industry, e.g. into a product of SDL plc.

Apollo (Cambridge)

Inquiries into words, constraints and contexts : Festschrift in the honour of Kimmo Koskenniemi on his 60th birthday

Author: Arppe Antti
Carlson Lauri
Linden Krister
Piitulainen Jussi Olavi
Suominen Mickael
Vainio Martti
Westerlund Hanna
Yli-Jyrä Anssi Mikael
Publication venue: CSLI publications
Publication date: 01/01/2005

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

CLAIRE makes machine translation BLEU no more

Author: Mohammad Ali (Ali H.)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012

Thesis (Sc. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 133-139). We introduce CLAIRE, a mathematically principled model for inferring ranks and scores for arbitrary items based on forced-choice binary comparisons, and show how to apply this technique to statistical models to take advantage of problem-specific assistance from non-experts. We apply this technique to two language processing problems: parsing and machine translation. This leads to an analysis which casts doubt on modern evaluation methods for machine translation systems, and an application of CLAIRE as a new technique for evaluating machine translation systems which is inexpensive, has theoretical guarantees, and correlates strongly in practice with more expensive human judgments of system quality. Our analysis reverses several major tenets of the mainstream machine translation research agenda, suggesting in particular that the use of linguistic models should be reexamined. By Ali Mohammad. Sc.D.

DSpace@MIT

A Formal Model of Ambiguity and its Applications in Machine Translation

Author: Dyer Christopher
Publication date: 01/01/2010

Systems that process natural language must cope with and resolve ambiguity. In this dissertation, a model of language processing is advocated in which multiple inputs and multiple analyses of inputs are considered concurrently and a single analysis is only a last resort. Compared to conventional models, this approach can be understood as replacing single-element inputs and outputs with weighted sets of inputs and outputs. Although processing components must deal with sets (rather than individual elements), constraints are imposed on the elements of these sets, and the representations from existing models may be reused. However, to deal efficiently with large (or infinite) sets, compact representations of sets that share structure between elements, such as weighted finite-state transducers and synchronous context-free grammars, are necessary. These representations and algorithms for manipulating them are discussed in depth. To establish the effectiveness and tractability of the proposed processing model, it is applied to several problems in machine translation. Starting with spoken language translation, it is shown that translating a set of transcription hypotheses yields better translations compared to a baseline in which a single (1-best) transcription hypothesis is selected and then translated, independent of the translation model formalism used. More subtle forms of ambiguity that arise even in text-only translation (such as decisions conventionally made during system development about how to preprocess text) are then discussed, and it is shown that the ambiguity-preserving paradigm can be employed in these cases as well, again leading to improved translation quality.
A model for supervised learning is also introduced that learns from training data in which sets (rather than single elements) of correct labels are provided for each training instance. It is used to learn a model of compound word segmentation, which serves as a preprocessing step in machine translation.

Digital Repository at the University of Maryland