42,879 research outputs found
Incorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks
Visual Question Answering (VQA) has attracted much attention since it offers
insight into the relationships between the multi-modal analysis of images and
natural language. Most current algorithms cannot answer open-domain
questions that require reasoning beyond the image contents. To address this
issue, we propose a novel framework that endows the model with the capability
to answer more complex questions by leveraging massive external knowledge
through dynamic memory networks. Specifically, the questions
along with the corresponding images trigger a process to retrieve the relevant
information in external knowledge bases, which are embedded into a continuous
vector space by preserving the entity-relation structures. Afterwards, we
employ dynamic memory networks to attend to the large body of facts in the
knowledge graph and images, and then perform reasoning over these facts to
generate corresponding answers. Extensive experiments demonstrate that our
model not only achieves state-of-the-art performance on the visual question
answering task, but also answers open-domain questions effectively by
leveraging external knowledge.
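As a rough illustration of the mechanism this abstract describes, the sketch below attends over knowledge-base facts that have been embedded as vectors, producing an attention-weighted memory state. This is a minimal, illustrative dot-product attention in plain Python; the paper's actual model uses gated, GRU-based episodic memory, and every name and value here is hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(question, facts):
    """One memory-update step: score each embedded fact against the
    question vector with a dot product, then return the
    attention-weighted sum of the facts as the new memory state."""
    scores = [sum(q * f for q, f in zip(question, fact)) for fact in facts]
    weights = softmax(scores)
    dim = len(question)
    return [sum(w * fact[i] for w, fact in zip(weights, facts))
            for i in range(dim)]

# Toy example: two embedded facts; the question is closer to the first.
facts = [[1.0, 0.0], [0.0, 1.0]]
memory = attend([2.0, 0.0], facts)
```

The retrieved fact most relevant to the question dominates the updated memory, which is the behaviour the episodic attention is meant to produce.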
UHop: An Unrestricted-Hop Relation Extraction Framework for Knowledge-Based Question Answering
In relation extraction for knowledge-based question answering, searching from
one entity to another entity via a single relation is called "one hop". In
related work, an exhaustive search from all one-hop relations, two-hop
relations, and so on to the max-hop relations in the knowledge graph is
necessary but expensive. Therefore, the number of hops is generally restricted
to two or three. In this paper, we propose UHop, an unrestricted-hop framework
that relaxes this restriction by using a transition-based search framework in
place of the relation-chain-based one. We conduct experiments on
conventional 1- and 2-hop questions as well as lengthy questions, including
datasets such as WebQSP, PathQuestion, and Grid World. Results show that the
proposed framework enables the model to halt, works well with
state-of-the-art models, achieves competitive performance without exhaustive
searches, and opens the performance gap for long relation paths.
Comment: To appear in NAACL-HLT 2019.
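The transition-based idea above, choosing one relation at a time and deciding whether to halt rather than enumerating all multi-hop chains, can be sketched as a greedy search loop. The `score` function below is a hypothetical stand-in for UHop's learned relation-path scorer; the toy graph and relation names are invented for illustration.

```python
def uhop_search(start, graph, score, max_steps=10):
    """Transition-based relation search: at each hop, pick the single
    best-scoring outgoing relation, and halt when no extension beats
    the current path -- avoiding exhaustive enumeration of all
    one-hop, two-hop, ... max-hop relation chains."""
    path, node = [], start
    for _ in range(max_steps):
        candidates = graph.get(node, [])  # list of (relation, next_node)
        if not candidates:
            break
        best = max(candidates, key=lambda rn: score(path + [rn[0]]))
        # Halting decision: stop if extending the path does not help.
        if score(path + [best[0]]) <= score(path):
            break
        path.append(best[0])
        node = best[1]
    return path

# Toy setup: paths get better up to length 2, then a bad relation appears.
graph = {"A": [("r1", "B")], "B": [("r2", "C")], "C": [("bad", "D")]}
score = lambda p: len(p) if all(r != "bad" for r in p) else -1
```

Running `uhop_search("A", graph, score)` follows two hops and then halts, so the number of hops is bounded by the learned halting decision rather than by a fixed limit.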
Dynamic Memory Networks for Visual and Textual Question Answering
Neural network architectures with memory and attention mechanisms exhibit
certain reasoning capabilities required for question answering. One such
architecture, the dynamic memory network (DMN), obtained high accuracy on a
variety of language tasks. However, it was not shown whether the architecture
achieves strong results for question answering when supporting facts are not
marked during training or whether it could be applied to other modalities such
as images. Based on an analysis of the DMN, we propose several improvements to
its memory and input modules. Together with these changes we introduce a novel
input module for images in order to be able to answer visual questions. Our new
DMN+ model improves the state of the art on both the Visual Question Answering
dataset and the bAbI-10k text question-answering dataset, without supporting
fact supervision.
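One concrete piece of the visual input module described above is turning a 2-D grid of CNN region features into an ordered fact sequence. The snake-pattern traversal below is a minimal sketch of that idea; the grid values are toy placeholders, not real CNN features.

```python
def image_facts(feature_grid):
    """Flatten a 2-D grid of region feature vectors into an ordered
    fact sequence by traversing rows in a snake (boustrophedon)
    pattern, so spatially adjacent regions stay adjacent in the
    sequence fed to the memory module."""
    facts = []
    for i, row in enumerate(feature_grid):
        facts.extend(row if i % 2 == 0 else reversed(row))
    return facts

# Toy 2x2 grid of 1-D "feature vectors".
grid = [[[1], [2]],
        [[3], [4]]]
```

`image_facts(grid)` yields `[[1], [2], [4], [3]]`: the second row is read right to left, so region `[4]` follows its spatial neighbour `[2]`.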
R^3: Reinforced Ranker-Reader for Open-Domain Question Answering
In recent years researchers have achieved considerable success applying
neural network methods to question answering (QA). These approaches have
achieved state of the art results in simplified closed-domain settings such as
the SQuAD (Rajpurkar et al., 2016) dataset, which provides a pre-selected
passage, from which the answer to a given question may be extracted. More
recently, researchers have begun to tackle open-domain QA, in which the model
is given a question and access to a large corpus (e.g., Wikipedia) instead of a
pre-selected passage (Chen et al., 2017a). This setting is more complex as it
requires large-scale search for relevant passages by an information retrieval
component, combined with a reading comprehension model that "reads" the
passages to generate an answer to the question. Performance in this setting
lags considerably behind closed-domain performance. In this paper, we present a
novel open-domain QA system called Reinforced Ranker-Reader (R^3), based on
two algorithmic innovations. First, we propose a new pipeline for open-domain
QA with a Ranker component, which learns to rank retrieved passages in terms of
likelihood of generating the ground-truth answer to a given question. Second,
we propose a novel method that jointly trains the Ranker along with an
answer-generation Reader model, based on reinforcement learning. We report
extensive experimental results showing that our method significantly improves
on the state of the art for multiple open-domain QA datasets.
Comment: 8 pages, accepted by AAAI 2018.
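The joint training described above, where the Ranker is rewarded when the Reader extracts the right answer from the passage it picked, can be sketched as a single REINFORCE-style update. This is only an illustrative sketch on raw scores; the actual R^3 system backpropagates through neural ranker parameters, and all names here are hypothetical.

```python
import math

def rank_probs(scores):
    # Softmax over passage scores gives the Ranker's selection policy.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_update(scores, chosen, reward, lr=1.0):
    """One REINFORCE step for the Ranker: raise the score of the
    sampled passage in proportion to the Reader's reward (e.g. F1 of
    the extracted answer) and lower the others."""
    probs = rank_probs(scores)
    return [s + lr * reward * ((1.0 if i == chosen else 0.0) - p)
            for i, (s, p) in enumerate(zip(scores, probs))]

# The Reader answered correctly from passage 0, so reward is 1.0.
scores = [0.0, 0.0]
new_scores = reinforce_update(scores, chosen=0, reward=1.0)
```

After the update the Ranker prefers the passage from which the Reader produced the ground-truth answer, which is exactly the signal the joint training exploits.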
Memory-augmented Dialogue Management for Task-oriented Dialogue Systems
Dialogue management (DM) decides the next action of a dialogue system
according to the current dialogue state, and thus plays a central role in
task-oriented dialogue systems. Since dialogue management requires access
access to not only local utterances, but also the global semantics of the
entire dialogue session, modeling the long-range history information is a
critical issue. To this end, we propose a novel Memory-Augmented Dialogue
management model (MAD) which employs a memory controller and two additional
memory structures, i.e., a slot-value memory and an external memory. The
slot-value memory tracks the dialogue state by memorizing and updating the
values of semantic slots (for instance, cuisine, price, and location), and the
external memory augments the representation of hidden states of traditional
recurrent neural networks through storing more context information. To update
the dialogue state efficiently, we also propose slot-level attention on user
utterances to extract specific semantic information for each slot. Experiments
show that our model can obtain state-of-the-art performance and outperforms
existing baselines.
Comment: 25 pages, 9 figures, under review at ACM Transactions on Information
Systems (TOIS).
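The slot-value memory described above tracks the dialogue state by memorizing and updating slot values across turns. The sketch below shows only that tracking behaviour with plain strings; the paper's memory stores distributed representations updated by slot-level attention, so treat this as a hypothetical simplification.

```python
def update_slot_memory(memory, utterance_slots):
    """Slot-value memory update: memorize or overwrite the value of
    each semantic slot (cuisine, price, location, ...) mentioned in
    the latest user utterance, keeping earlier values for slots the
    utterance did not touch."""
    new_memory = dict(memory)
    new_memory.update(utterance_slots)
    return new_memory

# A short session: the user later changes their mind about cuisine.
state = {}
state = update_slot_memory(state, {"cuisine": "italian"})
state = update_slot_memory(state, {"price": "cheap"})
state = update_slot_memory(state, {"cuisine": "thai"})
```

The final state keeps `price` from an earlier turn while reflecting the revised `cuisine`, which is the long-range tracking the memory structure provides.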
Cognitive Graph for Multi-Hop Reading Comprehension at Scale
We propose a new CogQA framework for multi-hop question answering in
web-scale documents. Inspired by the dual process theory in cognitive science,
the framework gradually builds a \textit{cognitive graph} in an iterative
process by coordinating an implicit extraction module (System 1) and an
explicit reasoning module (System 2). While giving accurate answers, our
framework further provides explainable reasoning paths. Specifically, our
implementation based on BERT and graph neural network efficiently handles
millions of documents for multi-hop reasoning questions in the HotpotQA
fullwiki dataset, achieving a winning joint score of 34.9 on the
leaderboard, compared to 23.6 for the best competitor.
Comment: ACL 2019.
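The iterative System 1 / System 2 coordination above can be sketched as a frontier-expansion loop over a growing graph. Both callables below are hypothetical stand-ins: `extract` for the BERT-based extraction module and `reason` for the graph-neural-network reasoning module; the toy entities are invented.

```python
def build_cognitive_graph(question, extract, reason, max_nodes=50):
    """Iteratively grow a cognitive graph: System 1 (`extract`) pulls
    candidate next-hop entities out of a node's document, System 2
    (`reason`) scores nodes over the graph built so far. The resulting
    graph doubles as an explainable reasoning path."""
    graph = {question: []}
    frontier = [question]
    while frontier and len(graph) < max_nodes:
        node = frontier.pop(0)
        for nxt in extract(node):
            graph[node].append(nxt)
            if nxt not in graph:
                graph[nxt] = []
                frontier.append(nxt)
    return graph, reason(graph)

# Toy modules: a fixed extraction table and a degree-based "reasoner".
extract = {"q": ["e1", "e2"], "e1": ["e2"], "e2": []}.get
reason = lambda g: max(g, key=lambda n: len(g[n]))
graph, answer = build_cognitive_graph("q", extract, reason)
```

The loop stops once the frontier is exhausted, so the graph stays small even when the underlying corpus is web-scale.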
Dynamically Fused Graph Network for Multi-hop Reasoning
Text-based question answering (TBQA) has been studied extensively in recent
years. Most existing approaches focus on finding the answer to a question
within a single paragraph. However, many difficult questions require multiple
pieces of supporting evidence scattered across two or more documents. In this
paper, we propose the Dynamically Fused Graph Network (DFGN), a novel method to
answer questions requiring multiple pieces of scattered evidence and reasoning over
them. Inspired by humans' step-by-step reasoning behavior, DFGN includes a
dynamic fusion layer that starts from the entities mentioned in the given
query, explores along the entity graph dynamically built from the text, and
gradually finds relevant supporting entities from the given documents. We
evaluate DFGN on HotpotQA, a public TBQA dataset requiring multi-hop reasoning.
DFGN achieves competitive results on the public leaderboard. Furthermore, our
analysis shows that DFGN produces interpretable reasoning chains.
Comment: Accepted by ACL 2019.
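The dynamic fusion layer described above starts from the query's entities and gradually spreads relevance along the entity graph built from the text. The sketch below uses a hard 0/1 spread purely for illustration; the real DFGN uses soft, learned attention masks, and the toy entities are invented.

```python
def propagate(entity_graph, query_entities, hops=2):
    """Dynamic-fusion sketch: start from entities mentioned in the
    query and spread relevance one hop at a time along the entity
    graph, gradually collecting supporting entities from the
    documents."""
    relevant = set(query_entities)
    for _ in range(hops):
        reached = {nbr for e in relevant
                   for nbr in entity_graph.get(e, [])}
        relevant |= reached
    return relevant

# Toy entity graph extracted from two documents.
graph = {"Paris": ["France"], "France": ["Europe"], "Tokyo": ["Japan"]}
```

Starting from `"Paris"`, two hops reach `"France"` and then `"Europe"`, while entities unconnected to the query (here `"Tokyo"`, `"Japan"`) stay out of the reasoning chain.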
Representation Learning for Dynamic Graphs: A Survey
Graphs arise naturally in many real-world applications including social
networks, recommender systems, ontologies, biology, and computational finance.
Traditionally, machine learning models for graphs have been mostly designed for
static graphs. However, many applications involve evolving graphs. This
introduces important challenges for learning and inference since nodes,
attributes, and edges change over time. In this survey, we review the recent
advances in representation learning for dynamic graphs, including dynamic
knowledge graphs. We describe existing models from an encoder-decoder
perspective, categorize these encoders and decoders based on the techniques
they employ, and analyze the approaches in each category. We also review
several prominent applications and widely used datasets and highlight
directions for future research.
Comment: Accepted at JMLR, 73 pages, 2 figures.
Visual Relationship Detection using Scene Graphs: A Survey
Understanding a scene by decoding the visual relationships depicted in an
image has been a long studied problem. While the recent advances in deep
learning and the usage of deep neural networks have achieved near-human
accuracy on many tasks, there still exists a considerable gap between human-
and machine-level performance on various visual relationship
detection tasks. Building on earlier tasks like object recognition,
segmentation, and captioning, which focused on relatively coarse image
understanding, newer tasks have recently been introduced to deal with a finer
level of image understanding. A Scene Graph is one such technique to better
represent a scene and the various relationships present in it. With its wide
range of applications in tasks such as Visual Question Answering,
Semantic Image Retrieval, and Image Generation, it has proved to
be a useful tool for deeper and better visual relationship understanding. In
this paper, we present a detailed survey on the various techniques for scene
graph generation, their efficacy in representing visual relationships, and how
they have been used to solve various downstream tasks. We also analyze
directions in which the field might advance. As one of the first papers to give
a detailed survey on this topic, we hope to provide a succinct introduction to
scene graphs and to guide practitioners in developing approaches for their
applications.
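A scene graph, as described above, represents a scene's objects as nodes and their relationships as labeled edges, commonly written as subject-predicate-object triples. The sketch below is a minimal illustrative representation, not any particular paper's data format.

```python
from collections import namedtuple

# A scene graph as subject-predicate-object triples over detected objects.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

scene = [
    Triple("man", "riding", "horse"),
    Triple("man", "wearing", "hat"),
    Triple("horse", "on", "grass"),
]

def relations_of(scene_graph, entity):
    """All relationships an object participates in -- the kind of
    query a VQA or semantic image-retrieval system runs over a
    scene graph."""
    return [t for t in scene_graph
            if entity in (t.subject, t.object)]
```

For example, `relations_of(scene, "horse")` recovers both the `riding` and `on` relationships, giving a finer-grained description of the image than object labels alone.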
Attentive Memory Networks: Efficient Machine Reading for Conversational Search
Recent advances in conversational systems have changed the search paradigm.
Traditionally, a user poses a query to a search engine that returns an answer
based on its index, possibly leveraging external knowledge bases and
conditioning the response on earlier interactions in the search session. In a
natural conversation, there is an additional source of information to take into
account: utterances produced earlier in a conversation can also be referred to
and a conversational IR system has to keep track of information conveyed by the
user during the conversation, even if it is implicit.
We argue that the process of building a representation of the conversation
can be framed as a machine reading task, where an automated system is presented
with a number of statements about which it should answer questions. The
questions should be answered solely by referring to the statements provided,
without consulting external knowledge. The time is right for the information
retrieval community to embrace this task, both as a stand-alone task and
integrated in a broader conversational search setting.
In this paper, we focus on machine reading as a stand-alone task and present
the Attentive Memory Network (AMN), an end-to-end trainable machine reading
algorithm. Its key contribution is efficiency, achieved with a
hierarchical input encoder that iterates over the input only once. Speed is an
important requirement in the setting of conversational search, as gaps between
conversational turns have a detrimental effect on naturalness. On 20 datasets
commonly used for evaluating machine reading algorithms we show that the AMN
achieves performance comparable to state-of-the-art models while using
considerably fewer computations.
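The hierarchical, single-pass encoding described above can be sketched as two nested folds: each statement is encoded from its word embeddings, then the statement vectors are folded into one conversation representation. Both `word_vec` and `combine` below are hypothetical stand-ins for learned encoders, instantiated here with toy functions.

```python
def hierarchical_encode(statements, word_vec, combine):
    """Hierarchical single-pass encoding: turn each statement into a
    vector from its word embeddings, then fold the statement vectors
    into one conversation representation -- touching each input token
    exactly once, which is where the efficiency comes from."""
    sentence_vecs = [combine([word_vec(w) for w in s.split()])
                     for s in statements]
    return combine(sentence_vecs)

# Toy instantiation: 1-D "embeddings" (word length) and mean pooling.
word_vec = len
combine = lambda vs: sum(vs) / len(vs)
rep = hierarchical_encode(["go north", "pick up key"], word_vec, combine)
```

Because each token is read once, cost grows linearly with conversation length, in contrast to memory networks that make repeated passes over the input.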