
    A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor in its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orient the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature on advanced reordering modeling. We then ask why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistics.
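    The survey distinguishes between measuring the amount of reordering in a language pair and characterizing its kinds. A minimal sketch of one common way to quantify the amount follows: the normalized Kendall tau distance of the permutation induced by a word alignment. The one-source-position-per-target-word alignment format is an assumption for illustration, not the survey's own notation.

```python
# Minimal sketch: quantify reordering in a sentence pair as the normalized
# Kendall tau distance of the permutation induced by a word alignment.
# perm[i] is the source position aligned to target position i (assumed
# one-to-one for illustration).

def kendall_tau_distance(perm):
    """Fraction of word pairs in crossing order: 0.0 means monotone."""
    inversions = 0
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n):
            if perm[i] > perm[j]:
                inversions += 1
    return inversions / (n * (n - 1) / 2) if n > 1 else 0.0

print(kendall_tau_distance([0, 1, 2, 3]))  # 0.0 -> no reordering
print(kendall_tau_distance([3, 0, 1, 2]))  # 0.5 -> one long-distance jump
```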

    Context-aware graph segmentation for graph-based translation

    In this paper, we present an improved graph-based translation model which segments an input graph into node-induced subgraphs by taking source context into consideration. Translations are generated by combining subgraph translations left-to-right using beam search. Experiments on Chinese–English and German–English demonstrate that the context-aware segmentation significantly improves the baseline graph-based model.
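    A minimal sketch of the left-to-right combination step under beam search, assuming the segmentation and the per-subgraph translation options (with toy log-probability scores) are given as inputs; the data layout is illustrative, not the paper's implementation.

```python
# Minimal sketch: combine subgraph translations left-to-right with beam
# search. Each hypothesis is a (cumulative log-prob, partial translation)
# pair; at every step we extend all hypotheses and keep the best few.

import heapq

def beam_search(subgraph_options, beam_size=4):
    """subgraph_options: list, in source order, of [(translation, logprob), ...]."""
    beam = [(0.0, [])]  # (cumulative score, partial translation)
    for options in subgraph_options:
        candidates = [
            (score + lp, partial + [trans])
            for score, partial in beam
            for trans, lp in options
        ]
        beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])

best_score, best_translation = beam_search([
    [("the house", -0.2), ("a house", -0.9)],
    [("is red", -0.4), ("reddens", -1.5)],
])
print(best_score, " ".join(best_translation))  # -0.6 the house is red
```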

    Dependency reordering features for Japanese-English phrase-based translation

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 101-106). Translating Japanese into English is very challenging because of the vast difference in word order between the two languages. For example, the main verb is always at the very end of a Japanese sentence, whereas it comes near the beginning of an English sentence. In this thesis, we develop a Japanese-to-English translation system capable of performing the long-distance reordering necessary to fluently translate Japanese into English. Our system uses novel feature functions, based on a dependency parse of the input Japanese sentence, which identify candidate translations that put dependency relationships into correct English order. For example, one feature identifies translations that put verbs before their objects. The weights for these feature functions are discriminatively trained, and so can be used for any language pair. In our Japanese-to-English system, they improve the BLEU score from 27.96 to 28.54, and we show clear improvements in subjective quality. We also experiment with a well-known technique of training the translation system on a Japanese training corpus that has been reordered into an English-like word order. Impressive results can be achieved by naively reordering each Japanese sentence into reverse order. Translating these reversed sentences with the dependency-parse-based feature functions gives further improvement. Finally, we evaluate our translation systems with human judgment, BLEU score, and METEOR score. We compare these metrics at the corpus and sentence level and examine how well they capture improvements in translation word order. By Jason Edward Katz-Brown. M.Eng.
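    A minimal sketch of one such dependency-based reordering feature, rewarding candidate translations that place a verb before its object; the parse and alignment data structures are assumptions for illustration, not the thesis's exact formulation.

```python
# Minimal sketch: a reordering feature that counts how many verb-object
# dependencies in the source parse end up in English order (verb before
# object) in a candidate translation. The edge list and alignment map
# are assumed encodings for illustration.

def verb_before_object_feature(dependencies, src_to_tgt):
    """dependencies: [(head_idx, dep_idx, label), ...] over source tokens.
    src_to_tgt: source token index -> target-side position."""
    score = 0
    for head, dep, label in dependencies:
        if label == "obj" and head in src_to_tgt and dep in src_to_tgt:
            # reward candidates that place the verb before its object
            score += 1 if src_to_tgt[head] < src_to_tgt[dep] else -1
    return score

# Japanese "hon o yomu" (book-OBJ read): verb at index 2, object at 0.
deps = [(2, 0, "obj")]
print(verb_before_object_feature(deps, {2: 0, 0: 1}))  # +1: "read the book"
print(verb_before_object_feature(deps, {0: 0, 2: 1}))  # -1: object first
```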

    Dependency structures and lexicalized grammars

    In this dissertation, we show that both the generative capacity and the parsing complexity of lexicalized grammar formalisms are systematically related to structural properties of the dependency structures that these formalisms can induce. Dependency structures model the syntactic dependencies among the words of a sentence. We identify three empirically relevant classes of dependency structures, and show how they can be characterized both in terms of restrictions on the relation between dependency and word order and within an algebraic framework. In the second part of the dissertation, we develop natural notions of automata and grammars for dependency structures, show how these yield infinite hierarchies of ever more powerful dependency languages, and classify several grammar formalisms with respect to the languages in these hierarchies that they are able to characterize. Our results provide fundamental insights into the relation between dependency structures and lexicalized grammars.
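    A minimal sketch of one classic restriction on the relation between dependency and word order: projectivity, which requires the yield of every node to be a contiguous interval of the sentence. The edge-list encoding is an assumption for illustration, not the dissertation's algebraic framework.

```python
# Minimal sketch: projectivity check for a dependency structure over
# tokens 0..n-1, given as (head, dependent) edges.

def yield_of(head, edges, n):
    """All positions reachable from `head` via dependency edges."""
    children = {i: [] for i in range(n)}
    for h, d in edges:
        children[h].append(d)
    stack, seen = [head], {head}
    while stack:
        node = stack.pop()
        for c in children[node]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def is_projective(edges, n):
    """True iff the yield of every node is a contiguous interval."""
    for node in range(n):
        y = sorted(yield_of(node, edges, n))
        if y and y[-1] - y[0] + 1 != len(y):
            return False
    return True

print(is_projective([(0, 1), (1, 2), (1, 3)], 4))  # True
print(is_projective([(0, 3), (3, 1), (0, 2)], 4))  # False: yield of 3 is {1, 3}
```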

    Syntax-based machine translation using dependency grammars and discriminative machine learning

    Machine translation has undergone huge improvements since the groundbreaking introduction of statistical methods in the early 2000s, going from very domain-specific systems that still performed relatively poorly despite the painstaking crafting of thousands of ad-hoc rules, to general-purpose systems automatically trained on large collections of bilingual texts which manage to deliver understandable translations that convey the general meaning of the original input. These approaches, however, still perform well below the level of human translators, typically failing to convey detailed meaning and register, and producing translations that, while readable, are often ungrammatical and unidiomatic. This quality gap, which is considerably large compared to most other natural language processing tasks, has been the focus of research in recent years, with the development of increasingly sophisticated models that attempt to exploit the syntactic structure of human languages, leveraging the technology of statistical parsers, as well as advanced machine learning methods such as margin-based structured prediction algorithms and neural networks. The translation software itself became more complex in order to accommodate the sophistication of these advanced models: the main translation engine (the decoder) is now often combined with a pre-processor which reorders the words of the source sentences into a target-language word order, or with a post-processor that ranks and selects a translation according to a fine model from a list of candidate translations generated by a coarse model. In this thesis we investigate the statistical machine translation problem from various angles, focusing on translation from non-analytic languages whose syntax is best described by fluid non-projective dependency grammars rather than the relatively strict phrase-structure grammars or projective dependency grammars which are most commonly used in the literature. We propose a framework for modeling word reordering phenomena between language pairs as transitions on non-projective source dependency parse graphs. We quantitatively characterize reordering phenomena for the German-to-English language pair as captured by this framework, specifically investigating the incidence and effects of the non-projectivity of source syntax and the non-locality of word movement w.r.t. the graph structure. We evaluate several variants of hand-coded pre-ordering rules in order to assess the impact of these phenomena on translation quality. We propose a class of dependency-based source pre-ordering approaches that reorder sentences based on flexible models trained with SVMs and several recurrent neural network architectures. We also propose a class of translation reranking models, both syntax-free and source dependency-based, which make use of a type of neural network known as graph echo state networks, which is highly flexible and requires very few training resources, overcoming one of the main limitations of neural network models for natural language processing tasks.
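    A minimal sketch of a hand-coded, dependency-based pre-ordering rule of the kind evaluated here for German-to-English: moving a clause-final verb directly after its subject before translation. The parse encoding and the "subj" label are assumptions for illustration, not the thesis's rule set.

```python
# Minimal sketch: pre-ordering rule that moves a clause-final verb to
# directly after its subject, approximating English SVO order. head[i]
# and label[i] give the dependency head and label of token i; the root
# verb is marked with head == -1 (assumed encoding).

def preorder_verb_after_subject(tokens, head, label):
    order = list(range(len(tokens)))
    root = head.index(-1)
    subjects = [i for i, (h, l) in enumerate(zip(head, label))
                if h == root and l == "subj"]
    if subjects and root > subjects[0]:
        order.remove(root)
        order.insert(order.index(subjects[0]) + 1, root)
    return [tokens[i] for i in order]

# German-like subordinate-clause order "(dass) er das Buch liest"
tokens = ["er", "das", "Buch", "liest"]
head = [3, 2, 3, -1]
label = ["subj", "det", "obj", "root"]
print(preorder_verb_after_subject(tokens, head, label))
# -> ['er', 'liest', 'das', 'Buch']  (English-like SVO order)
```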

    Getting Past the Language Gap: Innovations in Machine Translation

    In this chapter, we review state-of-the-art machine translation systems and discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful, complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods were introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge, though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. The outlook is also good for the mobile environment, as more advanced speech recognition and speech synthesis technologies will enable speech-to-speech machine translation on hand-held devices. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT.

    Deep Syntax in Statistical Machine Translation

    Statistical Machine Translation (SMT) via deep syntactic transfer employs a three-stage architecture: (i) parse the source language (SL) input, (ii) transfer the SL deep syntactic structure to the target language (TL), and (iii) generate a TL translation. The deep syntactic transfer architecture achieves a high level of language pair independence compared to other Machine Translation (MT) approaches, as translation is carried out at the more language-independent deep syntactic representation. TL word order can be generated independently of SL word order, and therefore no reordering model between source and target words is required. In addition, words in dependency relations are adjacent in the deep syntactic structure, allowing the extraction of more general transfer rules, compared to other rules/phrases extracted from the surface-form corpus, as such words are often distant in surface-form strings. It also allows the use of a TL deep syntax language model, which models a deeper notion of fluency than a string-based language model and may lead to better lexical choice. The deep syntactic representation also contains words in lemma form with morpho-syntactic information, and this enables new inflections of lemmas not observed in bilingual training data, which are out of coverage for other SMT approaches, to fall within coverage of deep syntactic transfer. In this thesis, we adapt existing methods already successful in Phrase-Based SMT (PB-SMT) to deep syntactic transfer, as well as presenting new methods of our own. We present a new definition for consistent deep syntax transfer rules, inspired by the definition of a consistent phrase in PB-SMT, and we extract all rules consistent with the node alignment, as smaller rules provide high coverage of unseen data, while larger rules provide more fluent combinations of TL words. Since large numbers of consistent transfer rules exist per sentence pair, we also provide an efficient method of extracting rules, as well as an efficient method of storing them. We also present a deep syntax translation model: as in other SMT approaches, we use a log-linear combination of feature functions, including a translation model computed from relative frequencies of transfer rules, lexical weighting, a deep syntax language model, and a string-based language model. In addition, we describe methods of carrying out transfer decoding, the search for TL deep syntactic structures, and how we efficiently integrate a deep syntax trigram language model into decoding, as well as methods of translating morpho-syntactic information separately from lemmas, using an adaptation of Factored Models. Finally, we include an experimental evaluation, in which we compare MT output for different configurations of our SMT via deep syntactic transfer system. We investigate various methods of word alignment, methods of translating morpho-syntactic information, limits on transfer rule size, different beam sizes during transfer decoding, generation from different-sized lists of TL decoder output structures, and deterministic versus non-deterministic generation. We also include an evaluation of the deep syntax language model in isolation from the MT system and compare it to a string-based language model.
    We then compare the performance and types of translations our system produces with a state-of-the-art phrase-based statistical machine translation system. Although the deep syntax system currently under-performs in general, it does achieve state-of-the-art performance for translation of a specific syntactic construction, the compound noun, and for translations within coverage of the TL precision grammar used for generation. We provide the software for transfer rule extraction, as well as the transfer decoder, as open-source tools to assist future research.
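    A minimal sketch of the log-linear model structure described above: each candidate is scored as a weighted sum of log feature values such as rule relative frequency, lexical weighting, and the two language models. The feature names and weights are illustrative assumptions, not the thesis's tuned configuration.

```python
# Minimal sketch: log-linear combination of feature functions to rank
# candidate transfer outputs. Feature names and weights are assumed for
# illustration; in practice the weights would be tuned on held-out data.

import math

WEIGHTS = {"rule_rel_freq": 0.3, "lex_weight": 0.2,
           "deep_lm": 0.3, "string_lm": 0.2}

def loglinear_score(features):
    """features: feature name -> log-probability for one hypothesis."""
    return sum(WEIGHTS[name] * value for name, value in features.items())

hypotheses = [
    {"rule_rel_freq": math.log(0.4), "lex_weight": math.log(0.5),
     "deep_lm": math.log(0.2), "string_lm": math.log(0.3)},
    {"rule_rel_freq": math.log(0.1), "lex_weight": math.log(0.6),
     "deep_lm": math.log(0.4), "string_lm": math.log(0.4)},
]
best = max(hypotheses, key=loglinear_score)  # pick the highest-scoring one
print(loglinear_score(best))
```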