Neural CRF Parsing
This paper describes a parsing model that combines the exact dynamic
programming of CRF parsing with the rich nonlinear featurization of neural net
approaches. Our model is structurally a CRF that factors over anchored rule
productions, but instead of linear potential functions based on sparse
features, we use nonlinear potentials computed via a feedforward neural
network. Because potentials are still local to anchored rules, structured
inference (CKY) is unchanged from the sparse case. Computing gradients during
learning involves backpropagating an error signal formed from standard CRF
sufficient statistics (expected rule counts). Using only dense features, our
neural CRF already exceeds a strong baseline CRF model (Hall et al., 2014). In
combination with sparse features, our system achieves 91.1 F1 on section 23 of
the Penn Treebank, and more generally outperforms the best prior single parser
results on a range of languages. Comment: Accepted for publication at ACL 2015
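The abstract's key point is that the potentials stay local to anchored rule productions, so swapping linear scoring for a feedforward network leaves CKY inference untouched. A minimal sketch of that idea (not the paper's actual architecture; dimensions, weights, and feature names here are hypothetical, and weights would in practice be learned by backpropagating gold-minus-expected rule counts):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: dense span features -> hidden layer -> one score
# per rule at this anchoring. Random weights stand in for learned ones.
N_FEATS, N_HIDDEN, N_RULES = 8, 16, 5
W1 = rng.normal(size=(N_HIDDEN, N_FEATS))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(size=(N_RULES, N_HIDDEN))

def rule_potentials(span_features):
    """Nonlinear log-potentials for every rule at one anchored span."""
    hidden = np.tanh(W1 @ span_features + b1)  # nonlinear featurization
    return W2 @ hidden                         # one score per rule

feats = rng.normal(size=N_FEATS)  # dense features for one span
scores = rule_potentials(feats)
print(scores.shape)  # (5,)
```

Because `rule_potentials` depends only on one anchored span, a CKY chart can consume these scores exactly as it would consume linear sparse-feature scores.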
Modeling Semantic Plausibility by Injecting World Knowledge
Distributional data tells us that a man can swallow candy, but not that a man
can swallow a paintball, since this is never attested. However, both are
physically plausible events. This paper introduces the task of semantic
plausibility: recognizing plausible but possibly novel events. We present a new
crowdsourced dataset of semantic plausibility judgments of single events such
as "man swallow paintball". Simple models based on distributional
representations perform poorly on this task, despite doing well on selectional
preference, but injecting manually elicited knowledge about entity properties
provides a substantial performance boost. Our error analysis shows that our new
dataset is a great testbed for semantic plausibility models: more sophisticated
knowledge representation and propagation could address many of the remaining
errors. Comment: camera-ready draft (with link to data), published at NAACL 2018 as a
conference paper (oral)
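The gap the abstract identifies, that unattested events like "man swallow paintball" are physically plausible, can be illustrated with a toy scorer. This is not the paper's model; the property table and size scale below are invented solely to show how injected entity knowledge (here, relative size) licenses plausible-but-novel events that raw co-occurrence counts would reject:

```python
# Hypothetical ordinal size bins standing in for elicited entity properties.
SIZE = {"man": 5, "candy": 1, "paintball": 1, "house": 8}

def swallow_plausible(subject, obj):
    """Toy rule: an agent can plausibly swallow something smaller than itself."""
    return SIZE[obj] < SIZE[subject]

print(swallow_plausible("man", "candy"))      # True, and well attested
print(swallow_plausible("man", "paintball"))  # True, though rarely attested
print(swallow_plausible("man", "house"))      # False
```

The point of the sketch: a property-based check makes no reference to corpus frequency, so it generalizes to events that distributional representations have never seen.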
Domain Agnostic Real-Valued Specificity Prediction
Sentence specificity quantifies the level of detail in a sentence,
characterizing the organization of information in discourse. While this
information is useful for many downstream applications, specificity prediction
systems predict very coarse labels (binary or ternary) and are trained on and
tailored toward specific domains (e.g., news). The goal of this work is to
generalize specificity prediction to domains where no labeled data is available
and output more nuanced real-valued specificity ratings.
We present an unsupervised domain adaptation system for sentence specificity
prediction, specifically designed to output real-valued estimates from binary
training labels. To calibrate the values of these predictions appropriately, we
regularize the posterior distribution of the labels towards a reference
distribution. We show that our framework generalizes well to three different
domains, with a 50–68% reduction in mean absolute error relative to the current
state-of-the-art system trained for news sentence specificity. We also
demonstrate the potential of our work in improving the quality and
informativeness of dialogue generation systems. Comment: AAAI 2019 camera ready
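The calibration idea described above, regularizing the posterior distribution of predicted labels toward a reference distribution, can be sketched as a KL penalty on a batch of real-valued predictions. This is only an illustration of the general mechanism, not the paper's training objective; the bin count and uniform reference below are assumptions:

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

# Hypothetical batch of real-valued specificity predictions in [0, 1].
predictions = np.array([0.05, 0.1, 0.4, 0.45, 0.5, 0.9])

# Bin predictions to get the batch's empirical posterior distribution.
counts, _ = np.histogram(predictions, bins=4, range=(0.0, 1.0))
posterior = (counts + 1e-9) / (counts + 1e-9).sum()

reference = np.array([0.25, 0.25, 0.25, 0.25])  # assumed uniform reference
penalty = kl(posterior, reference)              # added to the training loss
print(round(penalty, 4))
```

During training, minimizing this penalty alongside the supervised loss pulls the shape of the model's output distribution toward the reference, which is what lets binary training labels yield calibrated real-valued ratings.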
