Efficient Lagrangian relaxation algorithms for exact inference in natural language tasks
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 95-99).
For many tasks in natural language processing, finding the best solution requires a search over a large set of possible structures. Solving these combinatorial search problems exactly can be inefficient, and so researchers often use approximate techniques at the cost of model accuracy. In this thesis, we turn to Lagrangian relaxation as an alternative to approximate inference in natural language tasks. We demonstrate that Lagrangian relaxation algorithms provide efficient solutions while still maintaining formal guarantees. The approach leads to inference algorithms with the following properties: (1) the resulting algorithms are simple and efficient, building on standard combinatorial algorithms for relaxed problems; (2) the algorithms provably solve a linear programming (LP) relaxation of the original inference problem; (3) empirically, the relaxation often leads to an exact solution to the original problem. We develop Lagrangian relaxation algorithms for several important tasks in natural language processing including higher-order non-projective dependency parsing, syntactic machine translation, integrated constituency and dependency parsing, and part-of-speech tagging with inter-sentence constraints. For each of these tasks, we show that the Lagrangian relaxation algorithms are often significantly faster than exact methods while finding the exact solution with a certificate of optimality in the vast majority of examples.
by Alexander M. Rush. S.M.
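The general idea of Lagrangian relaxation with a certificate of optimality can be illustrated with a minimal toy sketch. It is not taken from the thesis: the function names `argmax` and `dual_decomposition`, the score vectors `a` and `b`, and the 1/t step size are all assumptions chosen for illustration. Two "subproblems" each pick one index, and the agreement constraint between them is relaxed with multipliers `u` that are updated by subgradient steps.

```python
def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda k: scores[k])


def dual_decomposition(a, b, max_iters=100):
    """Toy problem: maximize a[i] + b[j] subject to i == j.

    The agreement constraint is relaxed with multipliers u, so each
    subproblem is solved independently by a plain argmax. If both
    subproblems pick the same index, that index is provably optimal
    for the original constrained problem.
    """
    u = [0.0] * len(a)
    i = j = 0
    for t in range(1, max_iters + 1):
        i = argmax([a_k + u_k for a_k, u_k in zip(a, u)])  # subproblem 1
        j = argmax([b_k - u_k for b_k, u_k in zip(b, u)])  # subproblem 2
        if i == j:
            return i, True  # agreement: certificate of optimality
        step = 1.0 / t  # assumed diminishing step size
        u[i] -= step  # subgradient step on the dual objective
        u[j] += step
    return i, False  # no certificate within the iteration budget


if __name__ == "__main__":
    print(dual_decomposition([3.0, 1.0, 2.0], [0.0, 2.0, 2.5]))
```

On these hypothetical scores the two subproblems disagree at first and agree on index 2 at the third iteration, which is also the index that maximizes `a[k] + b[k]` directly. In real applications each argmax would be replaced by a combinatorial algorithm such as a parser or tagger.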
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Language is increasingly being used to define rich visual recognition
problems with supporting image collections sourced from the web. Structured
prediction models are used in these tasks to take advantage of correlations
between co-occurring labels and visual input but risk inadvertently encoding
social biases found in web corpora. In this work, we study data and models
associated with multilabel object classification and visual semantic role
labeling. We find that (a) datasets for these tasks contain significant gender
bias and (b) models trained on these datasets further amplify existing bias.
For example, the activity cooking is over 33% more likely to involve females
than males in a training set, and a trained model further amplifies the
disparity to 68% at test time. We propose to inject corpus-level constraints
for calibrating existing structured prediction models and design an algorithm
based on Lagrangian relaxation for collective inference. Our method results in
almost no performance loss for the underlying recognition task but decreases
the magnitude of bias amplification by 47.5% and 40.5% for multilabel
classification and visual semantic role labeling, respectively.
Comment: 11 pages, published in EMNLP 201
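A rough sketch of how a corpus-level constraint might be enforced with Lagrangian relaxation follows. It is a simplified illustration and not the authors' algorithm: the function `constrained_decode`, the two-label setup, the per-instance `margins`, the bounds `lo` and `hi`, and the constant step size are all assumptions. Each instance is decoded independently, and two multipliers shift the scores until the fraction of "f" predictions across the whole corpus falls inside the allowed range.

```python
def constrained_decode(margins, lo, hi, step=0.1, max_iters=100):
    """Label each instance "f" or "m" under a corpus-level constraint.

    margins[i] is a hypothetical model score for "f" minus the score
    for "m" on instance i. The constraint requires the fraction of
    "f" predictions to lie in [lo, hi]. It is relaxed with two
    non-negative multipliers, one per bound.
    """
    n = len(margins)
    lam_hi = 0.0  # penalizes too many "f" predictions
    lam_lo = 0.0  # penalizes too few "f" predictions
    labels = []
    for _ in range(max_iters):
        shift = lam_lo - lam_hi
        # Per-instance inference stays independent given the multipliers.
        labels = ["f" if m + shift > 0 else "m" for m in margins]
        count_f = labels.count("f")
        if lo * n <= count_f <= hi * n:
            return labels, True  # corpus-level constraint satisfied
        # Projected subgradient update on each multiplier.
        lam_hi = max(0.0, lam_hi + step * (count_f - hi * n))
        lam_lo = max(0.0, lam_lo + step * (lo * n - count_f))
    return labels, False


if __name__ == "__main__":
    scores = [2.0, 1.5, 1.0, 0.5, -0.5, -1.0]
    print(constrained_decode(scores, lo=0.4, hi=0.6))
```

With these made-up margins, unconstrained decoding labels four of six instances "f". Under the bounds 0.4 and 0.6 the multiplier grows until the least confident "f" prediction flips, leaving three of six.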
Gradient-based Inference for Networks with Output Constraints
Practitioners apply neural networks to increasingly complex problems in
natural language processing, such as syntactic parsing and semantic role
labeling that have rich output structures. Many such structured-prediction
problems require deterministic constraints on the output values; for example,
in sequence-to-sequence syntactic parsing, we require that the sequential
outputs encode valid trees. While hidden units might capture such properties,
the network is not always able to learn such constraints from the training data
alone, and practitioners must then resort to post-processing. In this paper, we
present an inference method for neural networks that enforces deterministic
constraints on outputs without performing rule-based post-processing or
expensive discrete search. Instead, in the spirit of gradient-based training,
we enforce constraints with gradient-based inference (GBI): for each input at
test-time, we nudge continuous model weights until the network's unconstrained
inference procedure generates an output that satisfies the constraints. We
study the efficacy of GBI on three tasks with hard constraints: semantic role
labeling, syntactic parsing, and sequence transduction. In each case, the
algorithm not only satisfies constraints but improves accuracy, even when the
underlying network is state-of-the-art.
Comment: AAAI 201
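The weight-nudging idea can be sketched on a toy model. This is a simplified illustration and not the paper's implementation: the linear scorer, the functions `decode`, `violation` and `gbi`, the "at most one label 1" constraint, and the learning rate are all assumptions. For one input, a copy of the weights is adjusted by gradient steps that lower the score of the currently decoded output for as long as that output violates the constraint.

```python
def decode(weights, inputs):
    """Unconstrained inference: per-position argmax of a linear scorer."""
    output = []
    for x in inputs:
        scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
        output.append(max(range(len(scores)), key=lambda k: scores[k]))
    return output


def violation(output):
    """Hypothetical constraint: label 1 may appear at most once."""
    return max(0, output.count(1) - 1)


def gbi(weights, inputs, lr=0.1, max_iters=50):
    """Nudge a per-input copy of the weights until decoding is valid."""
    w = [row[:] for row in weights]  # trained weights stay untouched
    output = decode(w, inputs)
    for _ in range(max_iters):
        g = violation(output)
        if g == 0:
            break
        # Gradient of the decoded output's total score w.r.t. row
        # y_t is x_t; step against it, scaled by the violation.
        for x, y in zip(inputs, output):
            for d, v in enumerate(x):
                w[y][d] -= lr * g * v
        output = decode(w, inputs)
    return output


if __name__ == "__main__":
    W = [[0.0, 0.5], [1.0, 0.2]]
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(decode(W, X), gbi(W, X))
```

On this made-up input the unconstrained output is `[1, 0, 1]`, which violates the constraint. After four updates the decoder produces `[1, 0, 0]`, and the original `W` is left unchanged for the next input.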
Generalized sequential tree-reweighted message passing
This paper addresses the problem of approximate MAP-MRF inference in general
graphical models. Following [36], we consider a family of linear programming
relaxations of the problem where each relaxation is specified by a set of
nested pairs of factors for which the marginalization constraint needs to be
enforced. We develop a generalization of the TRW-S algorithm [9] for this
problem, where we use a decomposition into junction chains, monotonic w.r.t.
some ordering on the nodes. This generalizes the monotonic chains in [9] in a
natural way. We also show how to deal with nested factors in an efficient way.
Experiments show an improvement over min-sum diffusion, MPLP and subgradient
ascent algorithms on a number of computer vision and natural language
processing problems.