TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
We present TriviaQA, a challenging reading comprehension dataset containing
over 650K question-answer-evidence triples. TriviaQA includes 95K
question-answer pairs authored by trivia enthusiasts and independently gathered
evidence documents, six per question on average, that provide high quality
distant supervision for answering the questions. We show that, in comparison to
other recently introduced large-scale datasets, TriviaQA (1) has relatively
complex, compositional questions, (2) has considerable syntactic and lexical
variability between questions and corresponding answer-evidence sentences, and
(3) requires more cross sentence reasoning to find answers. We also present two
baseline algorithms: a feature-based classifier and a state-of-the-art neural
network that performs well on the SQuAD reading comprehension task. Neither
approach comes close to human performance (23% and 40% vs. 80%), suggesting
that TriviaQA is a challenging testbed worth significant future study. Data
and code are available at http://nlp.cs.washington.edu/triviaqa/
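
To make the distant-supervision setup concrete, here is a minimal Python sketch of the signal described above: an evidence document counts as a noisy positive example when it contains the answer string. The Example fields and the alias list are illustrative assumptions, not the actual TriviaQA data schema.

```python
# Minimal sketch of distant supervision over evidence documents:
# a document is a (noisy) positive if it contains an answer alias.
# Field names are illustrative, not the actual TriviaQA JSON schema.
from dataclasses import dataclass

@dataclass
class Example:
    question: str
    answer_aliases: list[str]   # trivia answers often have several surface forms
    evidence_docs: list[str]    # ~6 independently gathered documents per question

def distant_labels(example: Example) -> list[bool]:
    """Mark each evidence document as a distant positive if any
    answer alias occurs in it (case-insensitive substring match)."""
    return [
        any(alias.lower() in doc.lower() for alias in example.answer_aliases)
        for doc in example.evidence_docs
    ]

ex = Example(
    question="Which river flows through Baghdad?",
    answer_aliases=["Tigris", "River Tigris"],
    evidence_docs=["The Tigris flows through Baghdad ...",
                   "Baghdad is the capital of Iraq ..."],
)
print(distant_labels(ex))  # [True, False]
```

Substring matching of this kind is what makes the supervision "distant": it is cheap but noisy, since a document can mention the answer without actually supporting it.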
Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
In NLP, convolutional neural networks (CNNs) have benefited less than
recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that
this is because the attention in CNNs has been mainly implemented as attentive
pooling (i.e., it is applied to pooling) rather than as attentive convolution
(i.e., it is integrated into convolution). Convolution is the differentiator of
CNNs in that it can powerfully model the higher-level representation of a word
by taking into account its local fixed-size context in the input text t^x. In
this work, we propose an attentive convolution network, ATTCONV. It extends the
context scope of the convolution operation, deriving higher-level features for
a word not only from its local context but also from information extracted from
nonlocal context by the attention mechanism commonly used in RNNs. This
nonlocal context can come (i) from parts of the input text t^x that are distant
or (ii) from extra (i.e., external) contexts t^y. Experiments on sentence
modeling with zero-context (sentiment analysis), single-context (textual
entailment) and multiple-context (claim verification) demonstrate the
effectiveness of ATTCONV in sentence representation learning with the
incorporation of context. In particular, attentive convolution outperforms
attentive pooling and is a strong competitor to popular attentive RNNs.
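
A rough numpy sketch of the attentive-convolution idea follows: each position's feature combines a width-3 local convolution with a context vector attended over a (possibly external) text t^y. The additive combination, tanh nonlinearity, and shapes are simplifying assumptions for illustration, not the exact ATTCONV formulation.

```python
# Sketch: features at each position of t^x mix a local convolution window
# with an attention-pooled nonlocal context from t^y (set ty = tx for
# self-attention). Simplified relative to the ATTCONV paper.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_convolution(tx, ty, W_local, W_ctx, b):
    """tx: (n, d) input text; ty: (m, d) context text.
    W_local: (3*d, h) filter over a width-3 window; W_ctx: (d, h); b: (h,)."""
    n, d = tx.shape
    padded = np.vstack([np.zeros((1, d)), tx, np.zeros((1, d))])  # pad for width-3 window
    scores = tx @ ty.T                  # (n, m) dot-product attention scores
    attn = softmax(scores, axis=1)      # each tx position attends over ty
    context = attn @ ty                 # (n, d) attended nonlocal context
    out = np.empty((n, W_local.shape[1]))
    for i in range(n):
        window = padded[i:i + 3].reshape(-1)              # local fixed-size context
        out[i] = np.tanh(window @ W_local + context[i] @ W_ctx + b)
    return out

rng = np.random.default_rng(0)
d, h = 8, 16
tx, ty = rng.normal(size=(5, d)), rng.normal(size=(7, d))
feats = attentive_convolution(tx, ty, rng.normal(size=(3 * d, h)),
                              rng.normal(size=(d, h)), np.zeros(h))
print(feats.shape)  # (5, 16)
```

The key contrast with attentive pooling is visible in the loop: the attended context enters the feature computation itself rather than being applied only after convolution, at the pooling step.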
Ranking and Selecting Multi-Hop Knowledge Paths to Better Predict Human Needs
To make machines better understand sentiments, research needs to move from
polarity identification to understanding the reasons that underlie the
expression of sentiment. Categorizing the goals or needs of humans is one way
to explain the expression of sentiment in text. Humans are good at
understanding situations described in natural language and can easily connect
them to the character's psychological needs using commonsense knowledge. We
present a novel method to extract, rank, filter and select multi-hop relation
paths from a commonsense knowledge resource to interpret the expression of
sentiment in terms of their underlying human needs. We efficiently integrate
the acquired knowledge paths in a neural model that interfaces context
representations with knowledge using a gated attention mechanism. We assess the
model's performance on a recently published dataset for categorizing human
needs. Selectively integrating knowledge paths boosts performance and
establishes a new state-of-the-art. Our model offers interpretability through
the learned attention map over commonsense knowledge paths. Human evaluation
highlights the relevance of the encoded knowledge.
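
As a hedged illustration of the gated attention mechanism the abstract describes, the following sketch attends over knowledge-path embeddings with the context vector as query, then uses a sigmoid gate to interpolate between context and knowledge. The dot-product attention and the gate parameterization are common choices assumed here, not necessarily the paper's exact design.

```python
# Sketch: fuse a context representation with attention-pooled embeddings
# of selected multi-hop knowledge paths via a learned sigmoid gate.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gated_knowledge_fusion(context, paths, W_g, b_g):
    """context: (d,) context vector; paths: (k, d) embeddings of selected
    multi-hop knowledge paths; W_g: (2*d, d); b_g: (d,)."""
    attn = softmax(paths @ context)        # (k,) relevance of each path
    knowledge = attn @ paths               # (d,) attention-pooled knowledge
    gate = sigmoid(np.concatenate([context, knowledge]) @ W_g + b_g)  # (d,)
    fused = gate * context + (1.0 - gate) * knowledge  # gated interpolation
    return fused, attn

rng = np.random.default_rng(1)
d, k = 8, 4
fused, attn = gated_knowledge_fusion(rng.normal(size=d), rng.normal(size=(k, d)),
                                     rng.normal(size=(2 * d, d)), np.zeros(d))
print(fused.shape, attn.round(2))
```

The attention weights over paths double as the kind of interpretability map mentioned above: they show which commonsense paths the model relied on.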