Structured Attentions for Visual Question Answering
Visual attention, which assigns weights to image regions according to their
relevance to a question, is an indispensable component of most Visual
Question Answering models. Although the questions may involve complex relations
among multiple regions, few attention models can effectively encode such
cross-region relations. In this paper, we demonstrate the importance of
encoding such relations by showing the limited effective receptive field of
ResNet on two datasets, and propose to model the visual attention as a
multivariate distribution over a grid-structured Conditional Random Field on
image regions. We demonstrate how to cast the iterative inference
algorithms, Mean Field and Loopy Belief Propagation, as recurrent layers of an
end-to-end neural network. We empirically evaluate our model on three
datasets; it surpasses the best baseline model on the newly released CLEVR
dataset by 9.5% and the best published model on the VQA dataset by 1.25%. Source code
is available at https://github.com/zhuchen03/vqa-sva.
Comment: ICCV201
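The mean-field idea above can be illustrated with a minimal sketch. Note the assumptions: the real model learns its unary and pairwise CRF terms end-to-end, whereas here the unary scores are given and a fixed scalar `coupling` stands in for the learned pairwise potentials.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def mean_field_attention(unary, coupling=0.5, iters=5):
    """Mean-field inference for attention over an HxW grid-structured CRF.

    unary: (H, W) question-conditioned relevance scores per image region.
    Each iteration mixes every cell's unary score with the current beliefs
    of its 4-connected neighbours, so attended regions form spatially
    coherent blobs instead of isolated peaks.
    """
    H, W = unary.shape
    q = softmax(unary.ravel()).reshape(H, W)  # initial attention map
    for _ in range(iters):
        # sum of neighbour beliefs (zero-padded at the border)
        pad = np.pad(q, 1)
        nbr = pad[:-2, 1:-1] + pad[2:, 1:-1] + pad[1:-1, :-2] + pad[1:-1, 2:]
        q = softmax((unary + coupling * nbr).ravel()).reshape(H, W)
    return q
```

Because each iteration is a fixed differentiable map, the loop unrolls naturally into recurrent layers of an end-to-end network, which is the conversion the paper describes.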
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
We propose a multi-task learning framework to learn a joint Machine Reading
Comprehension (MRC) model that can be applied to a wide range of MRC tasks in
different domains. Inspired by recent ideas of data selection in machine
translation, we develop a novel sample re-weighting scheme to assign
sample-specific weights to the loss. Empirical study shows that our approach
can be applied to many existing MRC models. Combined with contextual
representations from pre-trained language models (such as ELMo), we achieve new
state-of-the-art results on a set of MRC benchmark datasets. We release our
code at https://github.com/xycforgithub/MultiTask-MRC.
Comment: North American Chapter of the Association for Computational Linguistics (NAACL) 201
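The core of a sample re-weighting scheme like the one described can be sketched as follows. This is an illustrative stand-in, not the paper's exact formula: the exponential scoring and temperature are assumptions, and `domain_scores` abstracts whatever data-selection signal rates each sample's similarity to the target domain.

```python
import numpy as np

def reweighted_loss(losses, domain_scores, temperature=1.0):
    """Multi-task loss with sample-specific weights.

    losses: per-sample loss values from the shared MRC model.
    domain_scores: higher = more similar to the target domain, so that
    sample should contribute more to the joint objective.
    """
    w = np.exp(np.asarray(domain_scores) / temperature)
    w = w / w.sum() * len(w)  # normalise so the mean weight is 1
    return float((w * np.asarray(losses)).mean())
```

With uniform scores this reduces to the plain mean loss, so the scheme only changes training where the domain signal is informative.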
Multi-hop Reading Comprehension via Deep Reinforcement Learning based Document Traversal
Reading Comprehension has received significant attention in recent years as
high quality Question Answering (QA) datasets have become available. Despite
state-of-the-art methods achieving strong overall accuracy, Multi-Hop (MH)
reasoning remains particularly challenging. To address MH-QA specifically, we
propose a Deep Reinforcement Learning based method capable of learning
sequential reasoning across large collections of documents so as to pass a
query-aware, fixed-size context subset to existing models for answer
extraction. Our method comprises two stages: a linker, which decomposes
the provided support documents into a graph of sentences, and an extractor,
which learns where to look based on the current question and already-visited
sentences. The result of the linker is a novel graph structure at the sentence
level that preserves logical flow while still allowing rapid movement between
documents. Importantly, we demonstrate that the sparsity of the resultant graph
is invariant to context size. This translates to fewer decisions required from
the Deep-RL trained extractor, allowing the system to scale effectively to
large collections of documents.
The importance of sequential decision making in the document traversal step
is demonstrated by comparison to standard IE methods, and we additionally
introduce a BM25-based IR baseline that retrieves documents relevant to the
query only. We examine the integration of our method with existing models on
the recently proposed QAngaroo benchmark and achieve consistent increases in
accuracy across the board, as well as a 2-3x reduction in training time.
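The BM25-based IR baseline mentioned above is a standard scoring scheme and can be sketched in a few lines. This is a generic Okapi BM25 implementation, not the authors' code; the tokenisation (lowercased whitespace split) and the k1/b defaults are the usual conventions, assumed here.

```python
import math
from collections import Counter

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank documents against a query with Okapi BM25.

    Returns document indices sorted from most to least relevant;
    a query-only retriever like this ignores already-visited sentences,
    which is exactly what the sequential Deep-RL extractor improves on.
    """
    toks = [d.lower().split() for d in docs]
    N = len(docs)
    avgdl = sum(len(t) for t in toks) / N       # average document length
    df = Counter()                              # document frequency per term
    for t in toks:
        df.update(set(t))
    scores = []
    for t in toks:
        tf, dl, s = Counter(t), len(t), 0.0
        for w in query.lower().split():
            if w not in tf:
                continue
            idf = math.log(1 + (N - df[w] + 0.5) / (df[w] + 0.5))
            s += idf * tf[w] * (k1 + 1) / (tf[w] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return sorted(range(N), key=lambda i: -scores[i])
```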
An Introductory Survey on Attention Mechanisms in NLP Problems
First derived from human intuition and later adapted to machine translation
for automatic token alignment, the attention mechanism is a simple method for
encoding sequence data according to an importance score assigned to each
element. It has been widely applied to, and has brought significant
improvements in, various natural language processing tasks, including
sentiment classification, text summarization, question answering, and
dependency parsing. In this paper, we survey recent works and offer an introductory
summary of the attention mechanism in different NLP problems, aiming to provide
our readers with basic knowledge on this widely used method, discuss its
different variants for different tasks, explore its association with other
techniques in machine learning, and examine methods for evaluating its
performance.
Comment: 9 pages
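The basic pattern the survey covers, score each element against a query, softmax the scores into weights, and return the weighted sum, can be sketched as follows (dot-product scoring, one of the simplest of the variants surveyed):

```python
import numpy as np

def attention(query, keys, values):
    """Dot-product attention over a sequence.

    Each key is scored against the query; a softmax turns the scores
    into weights summing to 1; the output is the weighted sum of values.
    """
    scores = keys @ query              # importance score per element
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w = w / w.sum()
    return w @ values, w
```

Other variants in the literature mainly swap the scoring function (additive, scaled dot-product, multi-head) while keeping this score-normalise-sum skeleton.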
Neural Approaches to Conversational AI
The present paper surveys neural approaches to conversational AI that have
been developed in the last few years. We group conversational systems into
three categories: (1) question answering agents, (2) task-oriented dialogue
agents, and (3) chatbots. For each category, we present a review of
state-of-the-art neural approaches, draw the connection between them and
traditional approaches, and discuss the progress that has been made and
challenges still being faced, using specific systems and models as case
studies.
Comment: Foundations and Trends in Information Retrieval (95 pages)
Neural Models for Key Phrase Detection and Question Generation
We propose a two-stage neural model to tackle question generation from
documents. First, a neural key-phrase extractor, trained on the answers in a
question-answering corpus, estimates the probability that word sequences in a
document are ones a human would pick as candidate answers. Predicted key
phrases then act as target answers and condition a
sequence-to-sequence question-generation model with a copy mechanism.
Empirically, our key-phrase extraction model significantly outperforms an
entity-tagging baseline and existing rule-based approaches. We further
demonstrate that our question generation system formulates fluent, answerable
questions from key phrases. This two-stage system could be used to augment or
generate reading comprehension datasets, which may be leveraged to improve
machine reading systems or in educational settings.
Comment: Machine Reading for Question Answering workshop at ACL 201
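The two-stage flow can be sketched as plain orchestration. The `span_scorer` and `qg_model` callables below are hypothetical stand-ins for the neural key-phrase extractor and the copy-mechanism seq2seq generator; only the pipeline shape (enumerate candidate spans, keep the top-scoring ones, generate one question per span) follows the description above.

```python
def generate_questions(doc_tokens, span_scorer, qg_model, top_k=2, max_len=3):
    """Two-stage question generation: extract key phrases, then generate.

    doc_tokens:  tokenised document.
    span_scorer: (tokens, (i, j)) -> answer-likelihood of span tokens[i:j].
    qg_model:    (tokens, (i, j)) -> a question whose answer is tokens[i:j].
    """
    # stage 1: enumerate candidate spans and keep the top-scoring ones
    spans = [(i, j)
             for i in range(len(doc_tokens))
             for j in range(i + 1, min(i + max_len, len(doc_tokens)) + 1)]
    ranked = sorted(spans, key=lambda s: -span_scorer(doc_tokens, s))[:top_k]
    # stage 2: condition the generator on each predicted answer span
    return [(doc_tokens[i:j], qg_model(doc_tokens, (i, j))) for i, j in ranked]
```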
Scaffolding Networks: Incremental Learning and Teaching Through Questioning
We introduce a new paradigm of learning for reasoning, understanding, and
prediction, as well as the scaffolding network to implement this paradigm. The
scaffolding network embodies an incremental learning approach that is
formulated as a teacher-student network architecture to teach machines how to
understand text and do reasoning. The key to our computational scaffolding
approach is the interactions between the teacher and the student through
sequential questioning. The student observes each sentence in the text
incrementally, and it uses an attention-based neural net to discover and
register the key information in relation to its current memory. Meanwhile, the
teacher asks questions about the observed text, and the student network gets
rewarded by correctly answering these questions. The entire network is updated
continually using reinforcement learning. Our experimental results on synthetic
and real datasets show that the scaffolding network not only outperforms
state-of-the-art methods but also learns to do reasoning in a scalable way even
with little human generated input.
Comment: 11 pages + Abstract + 3 figures
Recent Advances in Neural Question Generation
Emerging research in Neural Question Generation (NQG) has started to
integrate a larger variety of inputs and to generate questions requiring
higher levels of cognition. These trends point to NQG as a bellwether for NLP,
reflecting how human intelligence embodies the skills of curiosity and integration.
We present a comprehensive survey of neural question generation, examining
the corpora, methodologies, and evaluation methods. From this, we elaborate on
what we see as NQG's emerging trends in terms of the learning paradigms,
input modalities, and cognitive levels considered. We end by pointing out
potential directions ahead.
Comment: Survey of neural question generation
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
Inspired by how humans summarize long documents, we propose an accurate and
fast summarization model that first selects salient sentences and then rewrites
them abstractively (i.e., compresses and paraphrases) to generate a concise
overall summary. We use a novel sentence-level policy gradient method to bridge
the non-differentiable computation between these two neural networks in a
hierarchical way, while maintaining language fluency. Empirically, we achieve
the new state-of-the-art on all metrics (including human evaluation) on the
CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.
Moreover, by first operating at the sentence-level and then the word-level, we
enable parallel decoding of our neural generative model that results in
substantially faster (10-20x) inference speed as well as 4x faster training
convergence than previous long-paragraph encoder-decoder models. We also
demonstrate the generalization of our model on the test-only DUC-2002 dataset,
where we achieve higher scores than a state-of-the-art model.
Comment: ACL 2018 (17 pages)
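The sentence-level policy gradient that bridges the two non-differentiable networks can be illustrated with a bare REINFORCE step. This is a minimal sketch, not the paper's full actor-critic setup: `logits` stand in for the extractor's scores over sentences, and `reward_fn` abstracts the ROUGE-style reward computed after the abstractive rewriter runs.

```python
import numpy as np

def reinforce_step(logits, reward_fn, rng, lr=0.5):
    """One policy-gradient step for a sentence-selection policy.

    Sample a sentence from the softmax of the logits, obtain a scalar
    reward for the rewritten result, and move the logits along
    reward * grad(log p(action)), which needs no gradient through the
    reward itself.
    """
    p = np.exp(logits - logits.max())
    p = p / p.sum()                   # extractor's sentence distribution
    a = rng.choice(len(p), p=p)       # sample a sentence to extract
    r = reward_fn(a)                  # non-differentiable reward (e.g. ROUGE)
    grad = -p
    grad[a] += 1.0                    # gradient of log p(a) w.r.t. logits
    return logits + lr * r * grad, a, r
```

Repeated steps concentrate probability on the sentences whose rewrites earn high reward, while language fluency is preserved because the rewriter itself is trained with ordinary supervised decoding.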
Progressive Attention Memory Network for Movie Story Question Answering
This paper proposes the progressive attention memory network (PAMN) for movie
story question answering (QA). Movie story QA is challenging compared to VQA in
two aspects: (1) pinpointing the temporal parts relevant to answering the
question is difficult, as movies are typically longer than an hour, and (2) it
involves both video and subtitles, where different questions require different
modalities to infer the answer. To overcome these challenges, PAMN involves three main
features: (1) progressive attention mechanism that utilizes cues from both
question and answer to progressively prune out irrelevant temporal parts in
memory, (2) dynamic modality fusion that adaptively determines the contribution
of each modality for answering the current question, and (3) belief correction
answering scheme that successively corrects the prediction score on each
candidate answer. Experiments on publicly available benchmark datasets, MovieQA
and TVQA, demonstrate that each feature contributes to our movie story QA
architecture, PAMN, and improves performance to achieve the state-of-the-art
result. Qualitative analysis by visualizing the inference mechanism of PAMN is
also provided.
Comment: CVPR 2019, Accepted
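Dynamic modality fusion, feature (2) above, can be sketched as a question-conditioned gate over the two modalities. The dot-product gating below is an assumption for illustration; PAMN learns its gate, and the feature vectors here stand in for the pooled video and subtitle memories.

```python
import numpy as np

def dynamic_modality_fusion(video_feat, subtitle_feat, question_feat):
    """Softmax gate that decides, per question, how much each modality
    contributes to the fused representation."""
    scores = np.array([video_feat @ question_feat,
                       subtitle_feat @ question_feat])
    g = np.exp(scores - scores.max())
    g = g / g.sum()                    # modality gate sums to 1
    return g[0] * video_feat + g[1] * subtitle_feat, g
```

A question about on-screen action would push the gate toward the video features, while a dialogue question would push it toward the subtitles.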