Assertion-based QA with Question-Aware Open Information Extraction
We present assertion based question answering (ABQA), an open domain question
answering task that takes a question and a passage as inputs, and outputs a
semi-structured assertion consisting of a subject, a predicate and a list of
arguments. An assertion conveys more evidence than a short answer span in
reading comprehension, and it is more concise than a tedious passage in
passage-based QA. These advantages make ABQA more suitable for human-computer
interaction scenarios such as voice-controlled speakers. Further progress
towards improving ABQA requires richer supervised datasets and powerful models
of text understanding. To remedy this, we introduce a new dataset called
WebAssertions, which includes hand-annotated QA labels for 358,427 assertions
in 55,960 web passages. To address ABQA, we develop both generative and
extractive approaches. The backbone of our generative approach is sequence to
sequence learning. In order to capture the structure of the output assertion,
we introduce a hierarchical decoder that first generates the structure of the
assertion and then generates the words of each field. The extractive approach
is based on learning to rank. Features at different levels of granularity are
designed to measure the semantic relevance between a question and an assertion.
Experimental results show that our approaches have the ability to infer
question-aware assertions from a passage. We further evaluate our approaches by
incorporating the ABQA results as additional features in passage-based QA.
Results on two datasets show that ABQA features significantly improve the
accuracy of passage-based QA.
Comment: To be published at AAAI 201
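As a rough illustration of the output format and of the extractive idea described above, the sketch below represents an assertion as a subject, a predicate, and a list of arguments, and orders candidate assertions by their relevance to a question. The names `Assertion`, `overlap_score`, and `rank_assertions` are hypothetical, and the single word-overlap feature is an assumption standing in for the learned ranker and multi-granularity features of the paper, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assertion:
    """Semi-structured answer: a subject, a predicate, and a list of arguments."""
    subject: str
    predicate: str
    arguments: List[str]

    def text(self) -> str:
        # Flatten the assertion into one string for word-level matching.
        return " ".join([self.subject, self.predicate, *self.arguments])


def overlap_score(question: str, assertion: Assertion) -> float:
    """Fraction of question words that also occur in the assertion.

    Assumption: a hand-written stand-in for a learned relevance score.
    """
    q_words = set(question.lower().split())
    a_words = set(assertion.text().lower().split())
    if not q_words or not a_words:
        return 0.0
    return len(q_words & a_words) / len(q_words)


def rank_assertions(question: str, candidates: List[Assertion]) -> List[Assertion]:
    """Return the candidates sorted from most to least relevant to the question."""
    return sorted(candidates, key=lambda c: overlap_score(question, c), reverse=True)


if __name__ == "__main__":
    candidates = [
        Assertion("Microsoft", "is headquartered", ["in Redmond"]),
        Assertion("Bill Gates", "founded", ["Microsoft", "in 1975"]),
    ]
    best = rank_assertions("who founded microsoft", candidates)[0]
    print(best.subject, "|", best.predicate, "|", best.arguments)
```

A real learning-to-rank system would replace `overlap_score` with a model trained on labeled question-assertion pairs; the data structure and the sort-by-score step would stay the same.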
Knowledge Based Machine Reading Comprehension
Machine reading comprehension (MRC) requires reasoning about both the
knowledge involved in a document and knowledge about the world. However,
existing datasets are typically dominated by questions that can be well solved
by context matching, which fail to test this capability. To encourage
progress on knowledge-based reasoning in MRC, we present knowledge-based MRC in
this paper, and build a new dataset consisting of 40,047 question-answer pairs.
The annotation of this dataset is designed so that successfully answering the
questions requires both understanding of the document and the knowledge involved in it. We
implement a framework consisting of both a question answering model and a
question generation model, both of which take as input the knowledge extracted from the
document as well as relevant facts from an external knowledge base such as
Freebase/ProBase/Reverb/NELL. Results show that incorporating side information
from an external KB improves the accuracy of the baseline question answering system.
We compare it with a standard MRC model, BiDAF, and also report the difficulty
of the dataset and lay out remaining challenges.
Tag and Correct: Question aware Open Information Extraction with Two-stage Decoding
Question Aware Open Information Extraction (Question aware Open IE) takes a
question and a passage as inputs and outputs an answer tuple that contains a
subject, a predicate, and one or more arguments. Each field of the answer is a
natural language word sequence extracted from the passage. The
semi-structured answer has two advantages over a span answer: it is more
readable and more falsifiable. There are two approaches to solving this
problem. One is an extractive method which extracts candidate answers from the
passage with an Open IE model and ranks them by matching them against the question. It
fully uses the passage information at the extraction step, but the extraction
is independent of the question. The other is a generative method, which
uses a sequence to sequence model to generate answers directly. It takes the
question and the passage together as input, but it generates the answer
from scratch and so does not exploit the fact that most of the answer words come
from the passage. To guide generation with the passage, we present a two-stage
decoding model which contains a tagging decoder and a correction decoder. In
the first stage, the tagging decoder tags keywords in the passage. In the
second stage, the correction decoder generates answers based on the tagged
keywords. Our model can be trained end-to-end although it has two stages.
Compared to previous generative models, we generate better answers by
generating from coarse to fine. We evaluate our model on WebAssertions (Yan et al.,
2018), which is a Question aware Open IE dataset. Our model achieves a BLEU
score of 59.32, which is better than previous generative methods.
Comment: 11 pages, 1 figure, 4 tables
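For readers unfamiliar with the metric reported above, the following is a minimal sketch of BLEU for one candidate and one reference: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty. The abstract does not say which BLEU variant, tokenization, or smoothing was used, so this unsmoothed sentence-level version is an assumption for illustration only, and the function names are hypothetical. It returns a value in [0, 1]; a figure such as 59.32 is presumably the same quantity scaled by 100.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Unsmoothed BLEU of one candidate against one reference, in [0, 1]."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "bill gates founded microsoft in 1975".split()
    candidate = "bill gates founded microsoft".split()
    print(round(sentence_bleu(candidate, reference), 4))
```

Published results are usually computed at corpus level with a standard tool rather than by averaging sentence scores, so numbers from this sketch are not directly comparable to the reported figure.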