Question Answering by Reasoning Across Documents with Graph Convolutional Networks
Most research in reading comprehension has focused on answering questions
based on individual documents or even single paragraphs. We introduce a neural
model that integrates and reasons over information spread within and across
multiple documents. We frame it as an inference problem on
a graph. Mentions of entities are nodes of this graph while edges encode
relations between different mentions (e.g., within- and cross-document
co-reference). Graph convolutional networks (GCNs) are applied to these graphs
and trained to perform multi-step reasoning. Our Entity-GCN method is scalable
and compact, and it achieves state-of-the-art results on a multi-document
question answering dataset, WikiHop (Welbl et al., 2018).
Comment: To appear in the Conference of the North American Chapter of the
Association for Computational Linguistics (NAACL), 2019. 13 pages, 3 figures,
6 tables.
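
To make the graph construction concrete, here is a minimal sketch of one
relational graph-convolution layer of the kind the abstract describes: entity
mentions are nodes, and each edge type (e.g., within- and cross-document
co-reference) gets its own transformation. The class name, dimensions, and
adjacency encoding are illustrative assumptions, not the authors'
implementation.

```python
import torch
import torch.nn as nn

class EntityGCNLayer(nn.Module):
    """One relational GCN layer over a graph of entity mentions (sketch)."""
    def __init__(self, dim, num_relations):
        super().__init__()
        # One linear transform per edge type (e.g., within-document vs.
        # cross-document co-reference), plus a self-loop transform.
        self.rel_transforms = nn.ModuleList(
            nn.Linear(dim, dim) for _ in range(num_relations)
        )
        self.self_loop = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h:   (num_mentions, dim) mention-node representations
        # adj: (num_relations, num_mentions, num_mentions) 0/1 adjacency,
        #      one slice per relation type
        out = self.self_loop(h)
        for r, transform in enumerate(self.rel_transforms):
            deg = adj[r].sum(dim=-1, keepdim=True).clamp(min=1.0)
            # Average messages from neighbours linked by relation r.
            out = out + (adj[r] @ transform(h)) / deg
        return torch.relu(out)

# Example: 10 mention nodes, 2 relation types, 64-dim features.
layer = EntityGCNLayer(dim=64, num_relations=2)
h = layer(torch.randn(10, 64), torch.randint(0, 2, (2, 10, 10)).float())
```

Stacking several such layers propagates information one hop per layer across
mentions, which is what supports the multi-step reasoning described above.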
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
In this paper, we present a two-stage model for multi-hop question answering.
The first stage is a hierarchical graph network, which reasons over the
multi-hop question and captures different levels of granularity using the
natural structure (i.e., paragraphs, questions, sentences, and entities) of
the documents. The reasoning process is converted into a node classification
task (i.e., over paragraph nodes and sentence nodes). The second stage is a
language-model fine-tuning task. In short, stage one uses a graph neural
network to select and concatenate supporting sentences into one paragraph,
and stage two finds the answer span in the language-model fine-tuning
paradigm.
Comment: the experimental results are not as good as I expected.
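
As a rough illustration of the two stages, the sketch below scores sentence
nodes with a stand-in graph layer (stage one) and concatenates the top-scoring
sentences into a single paragraph for span-extraction fine-tuning (stage two).
The single averaging layer, the top-k selection, and all names are assumptions
for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SentenceSelector(nn.Module):
    """Stage one (sketch): score the sentence nodes of a document graph."""
    def __init__(self, dim):
        super().__init__()
        self.gnn = nn.Linear(dim, dim)        # stand-in for the graph network
        self.classifier = nn.Linear(dim, 1)   # supporting-sentence score

    def forward(self, node_feats, adj):
        # node_feats: (num_nodes, dim); adj: (num_nodes, num_nodes) 0/1.
        # One round of neighbourhood averaging stands in for the
        # hierarchical graph network described above.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        h = torch.relu(self.gnn((adj @ node_feats) / deg))
        return self.classifier(h).squeeze(-1)

def build_stage_two_input(question, sentences, scores, k=5):
    # Stage two (sketch): concatenate the top-k scoring sentences into one
    # paragraph, on which a span-extraction language model is fine-tuned.
    top = scores.topk(min(k, len(sentences))).indices.tolist()
    context = " ".join(sentences[i] for i in sorted(top))
    return question, context
```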
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Existing visual reasoning datasets, such as Visual Question Answering (VQA),
often suffer from biases conditioned on the question, image, or answer
distributions. The recently proposed CLEVR dataset addresses these limitations
and requires fine-grained reasoning, but it is synthetic and uses similar
objects and sentence structures throughout.
In this paper, we introduce a new inference task, Visual Entailment (VE),
consisting of image-sentence pairs in which the premise is defined by an
image rather than by a natural language sentence, as in traditional Textual
Entailment tasks. The goal of a trained VE model is to predict whether the
image semantically entails the text. To realize this task, we build a
dataset, SNLI-VE, based on the Stanford Natural Language Inference corpus and
the Flickr30k dataset.
We evaluate various existing VQA baselines and build an Explainable Visual
Entailment (EVE) system to address the VE task. EVE achieves up to 71%
accuracy and outperforms several other state-of-the-art VQA-based models.
Finally, we demonstrate the explainability of EVE through cross-modal attention
visualizations. The SNLI-VE dataset is publicly available at
https://github.com/necla-ml/SNLI-VE
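
For concreteness, below is a minimal sketch of the VE task interface, not the
EVE system itself: assumed premise-image features and a hypothesis-sentence
encoding are fused and classified into the three SNLI labels. The encoder
dimensions and the fusion head are placeholder assumptions.

```python
import torch
import torch.nn as nn

LABELS = ["entailment", "neutral", "contradiction"]  # SNLI label set

class SimpleVEClassifier(nn.Module):
    """Toy VE head: fuse image premise and text hypothesis, then classify."""
    def __init__(self, img_dim=2048, txt_dim=768, hidden=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(img_dim + txt_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, len(LABELS)),
        )

    def forward(self, img_feats, txt_feats):
        # img_feats: (batch, img_dim) features of the premise image
        # txt_feats: (batch, txt_dim) encoding of the hypothesis sentence
        return self.fuse(torch.cat([img_feats, txt_feats], dim=-1))

# Example: classify one (image, sentence) pair with random features.
model = SimpleVEClassifier()
logits = model(torch.randn(1, 2048), torch.randn(1, 768))
print(LABELS[logits.argmax(dim=-1).item()])
```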