On Correcting Inputs: Inverse Optimization for Online Structured Prediction
Algorithm designers typically assume that the input data is correct, and then
proceed to find "optimal" or "sub-optimal" solutions using this input data.
However, this assumption of correct data does not always hold in practice,
especially in the context of online learning systems where the objective is to
learn appropriate feature weights given some training samples. Such scenarios
necessitate the study of inverse optimization problems where one is given an
input instance as well as a desired output and the task is to adjust the input
data so that the given output is indeed optimal. Motivated by learning
structured prediction models, in this paper we consider inverse optimization
with a margin, i.e., we require the given output to be better than all other
feasible outputs by a desired margin. We consider such inverse optimization
problems for maximum weight matroid basis, matroid intersection, perfect
matchings, minimum cost maximum flows, and shortest paths and derive the first
known results for such problems with a non-zero margin. The effectiveness of
these algorithmic approaches to online learning for structured prediction is
also discussed.
Comment: Conference version to appear in FSTTCS, 201
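The margin requirement in the abstract above can be illustrated on a toy instance. The sketch below is not the paper's algorithm: it only checks, by brute force, whether a desired output beats every other feasible output by a given margin. It assumes the feasible outputs are all k-element subsets (the bases of a uniform matroid) and that the margin is a fixed constant; the function name `has_margin` and the example weights are hypothetical.

```python
from itertools import combinations


def has_margin(weights, target, k, margin):
    """Return True if `target` outweighs every other k-subset by at least `margin`.

    Assumption: feasible outputs are all k-element index subsets
    (bases of a uniform matroid), and the margin is a fixed constant.
    """
    target = frozenset(target)
    target_weight = sum(weights[i] for i in target)
    for other in combinations(range(len(weights)), k):
        if frozenset(other) == target:
            continue  # the target is not compared against itself
        other_weight = sum(weights[i] for i in other)
        if target_weight < other_weight + margin:
            return False  # margin violated by this competing basis
    return True


# Hypothetical toy weights for four elements.
weights = [5.0, 4.0, 1.0, 0.5]
ok_small_margin = has_margin(weights, {0, 1}, k=2, margin=2.0)
ok_large_margin = has_margin(weights, {0, 1}, k=2, margin=4.0)
```

When the check fails, an inverse optimization procedure would adjust `weights` until it passes; how to do that with minimal change is the subject of the paper and is not attempted here.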
Learning Semantic Correspondences in Technical Documentation
We consider the problem of translating high-level textual descriptions to
formal representations in technical documentation as part of an effort to model
the meaning of such documentation. We focus specifically on the problem of
learning translational correspondences between text descriptions and grounded
representations in the target documentation, such as formal representation of
functions or code templates. Our approach exploits the parallel nature of such
documentation, or the tight coupling between high-level text and the low-level
representations we aim to learn. Data is collected by mining technical
documents for such parallel text-representation pairs, which we use to train a
simple semantic parsing model. We report new baseline results on sixteen novel
datasets, including the standard library documentation for nine popular
programming languages across seven natural languages, and a small collection of
Unix utility manuals.
Comment: accepted to ACL-201
Neighborhood Matching Network for Entity Alignment
Structural heterogeneity between knowledge graphs is an outstanding challenge
for entity alignment. This paper presents Neighborhood Matching Network (NMN),
a novel entity alignment framework for tackling the structural heterogeneity
challenge. NMN estimates the similarities between entities to capture both the
topological structure and the neighborhood difference. It provides two
innovative components for learning better representations for entity alignment.
It first uses a novel graph sampling method to distill a discriminative
neighborhood for each entity. It then adopts a cross-graph neighborhood
matching module to jointly encode the neighborhood difference for a given
entity pair. Such strategies allow NMN to effectively construct
matching-oriented entity representations while ignoring noisy neighbors that
have a negative impact on the alignment task. Extensive experiments performed
on three entity alignment datasets show that NMN estimates neighborhood
similarity well in harder cases and significantly outperforms 12
previous state-of-the-art methods.
Comment: 11 pages, accepted by ACL 202
Gap between theory and practice: noise sensitive word alignment in machine translation
Word alignment estimates either a lexical translation probability p(e|f) or a correspondence g(e, f), where the function g outputs either 0 or 1, between a source word f and a target word e in given bilingual sentences. In practice, this formulation does not account for ‘noise’ (or outliers), which may cause problems depending on the corpus. N-to-m mapping objects, such as paraphrases, non-literal translations, and multiword expressions, may appear both as noise and as valid training data. From this perspective, this paper tries to answer the following two questions: 1) how to detect stable patterns where noise seems legitimate, and 2) how to reduce such noise, where applicable, by supplying extra information as prior knowledge to a word aligner
- …
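To make the quantity p(e|f) in the word alignment abstract concrete, here is a minimal sketch of one standard way to estimate it: expectation-maximization in the style of IBM Model 1. This is a textbook technique, not necessarily the method used in the paper, and it does no noise handling. The function name `ibm_model1`, the toy corpus, and the omission of a NULL source word are all assumptions made for illustration.

```python
from collections import defaultdict


def ibm_model1(pairs, iterations=10):
    """Estimate t[(e, f)] ~ p(e|f) with IBM Model 1 style EM.

    pairs: list of (source_words, target_words) sentence pairs.
    Assumption: no NULL source word, uniform initialization.
    """
    target_vocab = {e for _, es in pairs for e in es}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) links
        total = defaultdict(float)  # expected count of links from f
        # E-step: distribute each target word over the source words.
        for fs, es in pairs:
            for e in es:
                norm = sum(t[(e, f)] for f in fs)
                for f in fs:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalize the expected counts per source word.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    return t


# Hypothetical toy corpus: German source, English target.
corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = ibm_model1(corpus)
```

On this toy corpus the estimates concentrate on the pairs that co-occur consistently, so `t[("the", "das")]` ends up larger than `t[("house", "das")]`. A noisy sentence pair, such as a non-literal translation, contributes expected counts in exactly the same way, which is the kind of sensitivity the abstract refers to.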