292 research outputs found

    Gradient-based Inference for Networks with Output Constraints

    Full text link
    Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling that have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for each input at test-time, we nudge continuous model weights until the network's unconstrained inference procedure generates an output that satisfies the constraints. We study the efficacy of GBI on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction. In each case, the algorithm not only satisfies constraints but improves accuracy, even when the underlying network is state-of-the-art.Comment: AAAI 201

    Efficient Lagrangian relaxation algorithms for exact inference in natural language tasks

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.Cataloged from PDF version of thesis.Includes bibliographical references (p. 95-99).For many tasks in natural language processing, finding the best solution requires a search over a large set of possible structures. Solving these combinatorial search problems exactly can be inefficient, and so researchers often use approximate techniques at the cost of model accuracy. In this thesis, we turn to Lagrangian relaxation as an alternative to approximate inference in natural language tasks. We demonstrate that Lagrangian relaxation algorithms provide efficient solutions while still maintaining formal guarantees. The approach leads to inference algorithms with the following properties: " The resulting algorithms are simple and efficient, building on standard combinatorial algorithms for relaxed problems. " The algorithms provably solve a linear programming (LP) relaxation of the original inference problem. " Empirically, the relaxation often leads to an exact solution to the original problem. We develop Lagrangian relaxation algorithms for several important tasks in natural language processing including higher-order non-projective dependency parsing, syntactic machine translation, integrated constituency and dependency parsing, and part-of-speech tagging with inter-sentence constraints. For each of these tasks, we show that the Lagrangian relaxation algorithms are often significantly faster than exact methods while finding the exact solution with a certificate of optimality in the vast majority of examples.by Alexander M. Rush.S.M

    Exact Decoding for Phrase-Based Statistical Machine Translation

    Full text link

    Exact decoding of phrase-based translation models through Lagrangian relaxation

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (p. 69-72).This thesis describes two algorithms for exact decoding of phrase-based translation models, based on Lagrangian relaxation. Both methods recovers exact solutions, with certificates of optimality, on over 99% of test examples. The first method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our methods to MOSES [6], and give precise estimates of the number and magnitude of search errors that MOSES makes.by Yin-Wen Chang.S.M

    Exact decoding for phrase-based statistical machine translation

    Get PDF
    © 2014 Association for Computational Linguistics. The combinatorial space of translation derivations in phrase-based statistical machine translation is given by the intersection between a translation lattice and a target language model. We replace this intractable intersection by a tractable relaxation which incorporates a low-order upperbound on the language model. Exact optimisation is achieved through a coarseto- fine strategy with connections to adaptive rejection sampling. We perform exact optimisation with unpruned language models of order 3 to 5 and show searcherror curves for beam search and cube pruning on standard test sets. This is the first work to tractably tackle exact optimisation with language models of orders higher than 3

    Exact Decoding for Phrase-Based Statistical Machine Translation

    Get PDF
    Abstract The combinatorial space of translation derivations in phrase-based statistical machine translation is given by the intersection between a translation lattice and a target language model. We replace this intractable intersection by a tractable relaxation which incorporates a low-order upperbound on the language model. Exact optimisation is achieved through a coarseto-fine strategy with connections to adaptive rejection sampling. We perform exact optimisation with unpruned language models of order 3 to 5 and show searcherror curves for beam search and cube pruning on standard test sets. This is the first work to tractably tackle exact optimisation with language models of orders higher than 3
    • …
    corecore