3,066 research outputs found

    Target-Side Context for Discriminative Models in Statistical Machine Translation

    Get PDF
    Discriminative translation models utilizing source context have been shown to help statistical machine translation performance. We propose a novel extension of this work using target context information. Surprisingly, we show that this model can be efficiently integrated directly in the decoding process. Our approach scales to large training data sizes and results in consistent improvements in translation quality on four language pairs. We also provide an analysis comparing the strengths of the baseline source-context model with our extended source-context and target-context model and we show that our extension allows us to better capture morphological coherence. Our work is freely available as part of Moses.Comment: Accepted as a long paper for ACL 201

    A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Get PDF
    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orientate the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic

    A syntactified direct translation model with linear-time decoding

    Get PDF
    Recent syntactic extensions of statistical translation models work with a synchronous context-free or tree-substitution grammar extracted from an automatically parsed parallel corpus. The decoders accompanying these extensions typically exceed quadratic time complexity. This paper extends the Direct Translation Model 2 (DTM2) with syntax while maintaining linear-time decoding. We employ a linear-time parsing algorithm based on an eager, incremental interpretation of Combinatory Categorial Grammar (CCG). As every input word is processed, the local parsing decisions resolve ambiguity eagerly, by selecting a single supertag–operator pair for extending the dependency parse incrementally. Alongside translation features extracted from the derived parse tree, we explore syntactic features extracted from the incremental derivation process. Our empirical experiments show that our model significantly outperforms the state-of-the art DTM2 system

    Tuning syntactically enhanced word alignment for statistical machine translation

    Get PDF
    We introduce a syntactically enhanced word alignment model that is more flexible than state-of-the-art generative word alignment models and can be tuned according to different end tasks. First of all, this model takes the advantages of both unsupervised and supervised word alignment approaches by obtaining anchor alignments from unsupervised generative models and seeding the anchor alignments into a supervised discriminative model. Second, this model offers the flexibility of tuning the alignment according to different optimisation criteria. Our experiments show that using our word alignment in a Phrase-Based Statistical Machine Translation system yields a 5.38% relative increase on IWSLT 2007 task in terms of BLEU score

    Deep Recurrent Generative Decoder for Abstractive Text Summarization

    Full text link
    We propose a new framework for abstractive text summarization based on a sequence-to-sequence oriented encoder-decoder model equipped with a deep recurrent generative decoder (DRGN). Latent structure information implied in the target summaries is learned based on a recurrent latent random model for improving the summarization quality. Neural variational inference is employed to address the intractable posterior inference for the recurrent latent variables. Abstractive summaries are generated based on both the generative latent variables and the discriminative deterministic states. Extensive experiments on some benchmark datasets in different languages show that DRGN achieves improvements over the state-of-the-art methods.Comment: 10 pages, EMNLP 201

    Using TERp to augment the system combination for SMT

    Get PDF
    TER-Plus (TERp) is an extended TER evaluation metric incorporating morphology, synonymy and paraphrases. There are three new edit operations in TERp: Stem Matches, Synonym Matches and Phrase Substitutions (Para-phrases). In this paper, we propose a TERp-based augmented system combination in terms of the backbone selection and consensus decoding network. Combining the new properties\ud of the TERp, we also propose a two-pass decoding strategy for the lattice-based phrase-level confusion network(CN) to generate the final result. The experiments conducted on the NIST2008 Chinese-to-English test set show that our TERp-based augmented system combination framework achieves significant improvements in terms of BLEU and TERp scores compared to the state-of-the-art word-level system combination framework and a TER-based combination strategy

    A memory-based classification approach to marker-based EBMT

    Get PDF
    We describe a novel approach to example-based machine translation that makes use of marker-based chunks, in which the decoder is a memory-based classifier. The classifier is trained to map trigrams of source-language chunks onto trigrams of target-language chunks; then, in a second decoding step, the predicted trigrams are rearranged according to their overlap. We present the first results of this method on a Dutch-to-English translation system using Europarl data. Sparseness of the class space causes the results to lag behind a baseline phrase-based SMT system. In a further comparison, we also apply the method to a word-aligned version of the same data, and report a smaller difference with a word-based SMT system. We explore the scaling abilities of the memory-based approach, and observe linear scaling behavior in training and classification speed and memory costs, and loglinear BLEU improvements in the amount of training examples
    corecore