Blending Learning and Inference in Structured Prediction
In this paper we derive an efficient algorithm to learn the parameters of
structured predictors in general graphical models. This algorithm blends the
learning and inference tasks, which results in a significant speedup over
traditional approaches, such as conditional random fields and structured
support vector machines. For this purpose we utilize the structures of the
predictors to describe a low dimensional structured prediction task which
encourages local consistencies within the different structures while learning
the parameters of the model. Convexity of the learning task provides the means
to enforce the consistencies between the different parts. The
inference-learning blending algorithm that we propose is guaranteed to converge
to the optimum of the low dimensional primal and dual programs. Unlike many
existing approaches, the inference-learning blending allows us to efficiently
learn high-order graphical models over regions of any size and with a very
large number of parameters. We demonstrate the effectiveness of our approach
and present state-of-the-art results in stereo estimation, semantic
segmentation, shape reconstruction, and indoor scene understanding.
Emergent Predication Structure in Hidden State Vectors of Neural Readers
A significant number of neural architectures for reading comprehension have
recently been developed and evaluated on large cloze-style datasets. We present
experiments supporting the emergence of "predication structure" in the hidden
state vectors of these readers. More specifically, we provide evidence that the
hidden state vectors represent atomic formulas $\Phi[c]$, where $\Phi$ is a
semantic property (predicate) and $c$ is a constant symbol entity identifier.
Comment: Accepted for Repl4NLP: 2nd Workshop on Representation Learning for
NL
Who did What: A Large-Scale Person-Centered Cloze Dataset
We have constructed a new "Who-did-What" dataset of over 200,000
fill-in-the-gap (cloze) multiple choice reading comprehension problems
constructed from the LDC English Gigaword newswire corpus. The WDW dataset has
a variety of novel features. First, in contrast with the CNN and Daily Mail
datasets (Hermann et al., 2015) we avoid using article summaries for question
formation. Instead, each problem is formed from two independent articles --- an
article given as the passage to be read and a separate article on the same
events used to form the question. Second, we avoid anonymization --- each
choice is a person named entity. Third, the problems have been filtered to
remove a fraction that are easily solved by simple baselines, while remaining
84% solvable by humans. We report performance benchmarks of standard systems
and propose the WDW dataset as a challenge task for the community.
Comment: To appear at EMNLP 2016. Our dataset is available at
tticnlp.github.io/who_did_wha
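The baseline-filtering step described in the abstract above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `ClozeProblem` fields, the "XXX" blank marker, the `most_frequent_baseline` heuristic, and `drop_easy` are hypothetical and are not taken from the WDW release. The sketch also drops every problem the baseline solves, whereas the abstract says only a fraction of such problems is removed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClozeProblem:
    """Hypothetical container for one multiple-choice cloze problem."""
    passage: str        # article given as the passage to be read
    question: str       # sentence from a separate article; "XXX" marks the blank
    choices: List[str]  # candidate person names
    answer: str         # the correct choice


def most_frequent_baseline(problem: ClozeProblem) -> str:
    """Assumed simple baseline: pick the choice mentioned most often in the passage.

    Ties go to the choice listed first, because max() keeps the first maximum.
    """
    return max(problem.choices, key=problem.passage.count)


def drop_easy(problems: List[ClozeProblem],
              baseline: Callable[[ClozeProblem], str]) -> List[ClozeProblem]:
    """Keep only the problems that the given baseline answers incorrectly."""
    return [p for p in problems if baseline(p) != p.answer]


if __name__ == "__main__":
    passage = "Alice Smith met Bob Jones. Alice Smith then spoke."
    easy = ClozeProblem(passage, "XXX gave a speech.",
                        ["Alice Smith", "Bob Jones"], "Alice Smith")
    hard = ClozeProblem(passage, "XXX listened quietly.",
                        ["Alice Smith", "Bob Jones"], "Bob Jones")
    kept = drop_easy([easy, hard], most_frequent_baseline)
    print(len(kept), kept[0].answer)
```

In this toy run the first problem is discarded because the most frequently mentioned name is also the answer, while the second survives because the baseline guesses wrong.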