Modeling Semantics with Gated Graph Neural Networks for Knowledge Base Question Answering
Most approaches to Knowledge Base Question Answering are based on
semantic parsing. In this paper, we address the problem of learning vector
representations for complex semantic parses that consist of multiple entities
and relations. Previous work largely focused on selecting the correct semantic
relations for a question and disregarded the structure of the semantic parse:
the connections between entities and the directions of the relations. We
propose to use Gated Graph Neural Networks to encode the graph structure of the
semantic parse. We show on two data sets that the graph networks outperform all
baseline models that do not explicitly model the structure. The error analysis
confirms that our approach can successfully process complex semantic parses.
Comment: Accepted as COLING 2018 Long Paper, 12 pages
Visual Question Answering: A Survey of Methods and Datasets
Visual Question Answering (VQA) is a challenging task that has received
increasing attention from both the computer vision and the natural language
processing communities. Given an image and a question in natural language, it
requires reasoning over visual elements of the image and general knowledge to
infer the correct answer. In the first part of this survey, we examine the
state of the art by comparing modern approaches to the problem. We classify
methods by their mechanism to connect the visual and textual modalities. In
particular, we examine the common approach of combining convolutional and
recurrent neural networks to map images and questions to a common feature
space. We also discuss memory-augmented and modular architectures that
interface with structured knowledge bases. In the second part of this survey,
we review the datasets available for training and evaluating VQA systems. The
various datasets contain questions at different levels of complexity, which
require different capabilities and types of reasoning. We examine in depth the
question/answer pairs from the Visual Genome project, and evaluate the
relevance of the structured annotations of images with scene graphs for VQA.
Finally, we discuss promising future directions for the field, in particular
the connection to structured knowledge bases and the use of natural language
processing models.
Comment: 25 pages