292 research outputs found

Gradient-based Inference for Networks with Output Constraints

Authors: Carbonell Jaime; Lee Jay Yoon; Mehta Sanket Vaibhav; Tristan Jean-Baptiste; Wick Michael
Publication date: 22/04/2019

Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling, that have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for each input at test-time, we nudge continuous model weights until the network's unconstrained inference procedure generates an output that satisfies the constraints. We study the efficacy of GBI on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction. In each case, the algorithm not only satisfies constraints but improves accuracy, even when the underlying network is state-of-the-art. Comment: AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
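
The test-time weight nudging described in the abstract above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "network" is a set of independent logistic units, the constraint (`max_ones`) is a hypothetical cap on how many positions may be labelled 1, and the update rule simply lowers the log-probability of the currently decoded output in proportion to how badly it violates the constraint.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def decode(weights, x):
    """Unconstrained inference: threshold each independent probability at 0.5."""
    probs = [sigmoid(w * xi) for w, xi in zip(weights, x)]
    labels = [1 if p > 0.5 else 0 for p in probs]
    return labels, probs

def violation(labels, max_ones):
    """Hypothetical constraint: at most `max_ones` positions may be labelled 1."""
    return max(0, sum(labels) - max_ones)

def gradient_based_inference(weights, x, max_ones, lr=0.5, max_steps=50):
    """Nudge a per-input copy of the weights until the decoded output is valid.

    Returns the decoded labels and the number of weight updates performed.
    The result may still violate the constraint if max_steps is exhausted.
    """
    w = list(weights)  # the trained weights themselves are left untouched
    for step in range(max_steps):
        labels, probs = decode(w, x)
        v = violation(labels, max_ones)
        if v == 0:
            return labels, step
        # d/dw[i] of log P(labels) is (labels[i] - probs[i]) * x[i].
        # Descending it, scaled by the violation, makes this output less likely.
        w = [wi - lr * v * (yi - pi) * xi
             for wi, yi, pi, xi in zip(w, labels, probs, x)]
    labels, _ = decode(w, x)
    return labels, max_steps

# Two units start above 0.5, so the initial output breaks the "at most one" rule.
labels, steps = gradient_based_inference([2.0, 0.3, -1.5], [1.0, 1.0, 1.0], max_ones=1)
print(labels, steps)
```

In this toy setting the least confident positive label is the first to flip, because its gradient is the largest; whether a real network behaves this way depends on the model and the constraint.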


Exact and Approximate Methods for Machine Translation Decoding

Author: Chang Yin-Wen
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2015

Statistical methods have been the major force driving the advance of machine translation in recent years. Complex models are designed to improve translation performance, but the added complexity also makes decoding more challenging. In this thesis, we focus on designing exact and approximate algorithms for machine translation decoding. More specifically, we discuss the decoding problems for phrase-based translation models and bidirectional word alignment. The techniques explored in this thesis are Lagrangian relaxation and local search. Lagrangian relaxation based algorithms give us exact methods that have formal guarantees while being efficient in practice. We study extensions to Lagrangian relaxation that improve the convergence rate on machine translation decoding problems. The extensions include a tightening technique that adds constraints incrementally, optimality-preserving pruning to manage the search space size, and the use of the bounding properties of Lagrangian relaxation to develop an exact beam search algorithm. In addition to having the potential to improve translation accuracy, exact decoding deepens our understanding of the model that we are using, since it separates model errors from optimization errors. This leads to the question of designing models that improve the translation quality. We design a syntactic phrase-based model that incorporates a dependency language model to evaluate the fluency level of the target language. By employing local search, an approximate method, to decode this richer model, we discuss the trade-off between the complexity of a model and the decoding efficiency with the model.

Columbia University Academic Commons

Efficient Lagrangian relaxation algorithms for exact inference in natural language tasks

Author: Rush Alexander M. (Alexander Matthew)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2011

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 95-99).

For many tasks in natural language processing, finding the best solution requires a search over a large set of possible structures. Solving these combinatorial search problems exactly can be inefficient, and so researchers often use approximate techniques at the cost of model accuracy. In this thesis, we turn to Lagrangian relaxation as an alternative to approximate inference in natural language tasks. We demonstrate that Lagrangian relaxation algorithms provide efficient solutions while still maintaining formal guarantees. The approach leads to inference algorithms with the following properties: (1) The resulting algorithms are simple and efficient, building on standard combinatorial algorithms for relaxed problems. (2) The algorithms provably solve a linear programming (LP) relaxation of the original inference problem. (3) Empirically, the relaxation often leads to an exact solution to the original problem. We develop Lagrangian relaxation algorithms for several important tasks in natural language processing including higher-order non-projective dependency parsing, syntactic machine translation, integrated constituency and dependency parsing, and part-of-speech tagging with inter-sentence constraints. For each of these tasks, we show that the Lagrangian relaxation algorithms are often significantly faster than exact methods while finding the exact solution with a certificate of optimality in the vast majority of examples.

by Alexander M. Rush. S.M

DSpace@MIT
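
The Lagrangian relaxation idea in the abstract above (solve easy relaxed subproblems, and obtain a certificate of optimality when they agree) can be sketched on a toy problem. The problem, the function name `dual_decompose`, the scores, and the 1/t step size are all illustrative assumptions, not taken from the thesis: one subproblem picks any binary vector, the other must pick exactly `k` ones, and multipliers `u` push the two towards agreement by subgradient descent on the dual.

```python
def dual_decompose(a, b, k, iters=200):
    """Maximise sum((a[i] + b[i]) * y[i]) over binary y with exactly k ones.

    The objective is split into two copies y and z tied by y == z, and that
    tie is relaxed with multipliers u. Returns (solution, best dual bound,
    agreed); agreed=True means the solution is provably optimal.
    """
    n = len(a)
    u = [0.0] * n  # Lagrange multipliers for the relaxed constraint y == z
    best_bound = float("inf")
    z = [0] * n
    for t in range(1, iters + 1):
        # Subproblem 1: unconstrained binary vector, solved position by position.
        y = [1 if a[i] + u[i] > 0 else 0 for i in range(n)]
        # Subproblem 2: exactly k ones, solved by taking the k largest adjusted scores.
        adj = [b[i] - u[i] for i in range(n)]
        top = set(sorted(range(n), key=lambda i: adj[i], reverse=True)[:k])
        z = [1 if i in top else 0 for i in range(n)]
        # The dual value is an upper bound on the optimum of the original problem.
        bound = (sum((a[i] + u[i]) * y[i] for i in range(n))
                 + sum(adj[i] * z[i] for i in range(n)))
        best_bound = min(best_bound, bound)
        if y == z:
            # Agreement: the bound is attained by a feasible solution.
            return y, best_bound, True
        # Subgradient step on the dual with a decaying step size.
        step = 1.0 / t
        u = [u[i] - step * (y[i] - z[i]) for i in range(n)]
    # No agreement within the budget: fall back to the feasible copy z.
    return z, best_bound, False

a = [1.0, -2.0, 0.5, 3.0]
b = [-0.5, 1.0, 2.0, -1.0]
solution, bound, agreed = dual_decompose(a, b, k=2)
print(solution, bound, agreed)
```

This toy problem could be solved directly by sorting `a[i] + b[i]`; it is used only because both subproblems are trivial, so the multiplier updates and the agreement check are easy to follow.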

Exact Decoding for Phrase-Based Statistical Machine Translation

Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2014

Crossref

Exact decoding of phrase-based translation models through Lagrangian relaxation

Author: Chang Yin-Wen, S.M. Massachusetts Institute of Technology
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 69-72).

This thesis describes two algorithms for exact decoding of phrase-based translation models, based on Lagrangian relaxation. Both methods recover exact solutions, with certificates of optimality, on over 99% of test examples. The first method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our methods to MOSES [6], and give precise estimates of the number and magnitude of search errors that MOSES makes.

by Yin-Wen Chang. S.M

DSpace@MIT

Exact decoding for phrase-based statistical machine translation

Authors: Aziz W.; Dymetman M.; Specia L.
Publication date: 01/01/2014

© 2014 Association for Computational Linguistics. The combinatorial space of translation derivations in phrase-based statistical machine translation is given by the intersection between a translation lattice and a target language model. We replace this intractable intersection by a tractable relaxation which incorporates a low-order upper bound on the language model. Exact optimisation is achieved through a coarse-to-fine strategy with connections to adaptive rejection sampling. We perform exact optimisation with unpruned language models of order 3 to 5 and show search-error curves for beam search and cube pruning on standard test sets. This is the first work to tractably tackle exact optimisation with language models of orders higher than 3.

CiteSeerX

Crossref

White Rose Research Online

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Exact Decoding for Phrase-Based Statistical Machine Translation

Authors: Aziz W.; Dymetman M.; Specia L.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2014

International Migration, Integration and Social Cohesion online publications

Exact Decoding for Phrase-Based Statistical Machine Translation

Author: L
M
W Aziz
Publication date: 06/03/2020

The combinatorial space of translation derivations in phrase-based statistical machine translation is given by the intersection between a translation lattice and a target language model. We replace this intractable intersection by a tractable relaxation which incorporates a low-order upper bound on the language model. Exact optimisation is achieved through a coarse-to-fine strategy with connections to adaptive rejection sampling. We perform exact optimisation with unpruned language models of order 3 to 5 and show search-error curves for beam search and cube pruning on standard test sets. This is the first work to tractably tackle exact optimisation with language models of orders higher than 3.

CiteSeerX