Retrieve-and-Read: Multi-task Learning of Information Retrieval and Reading Comprehension
This study considers the task of machine reading at scale (MRS) wherein,
given a question, a system first performs the information retrieval (IR) task
of finding relevant passages in a knowledge source and then carries out the
reading comprehension (RC) task of extracting an answer span from the passages.
Previous MRS studies, in which the IR component was trained without considering
answer spans, struggled to accurately find a small number of relevant passages
from a large set of passages. In this paper, we propose a simple and effective
approach that incorporates the IR and RC tasks through supervised multi-task
learning, so that the IR component can be trained by considering answer
spans. Experimental results on the standard benchmark, answering SQuAD
questions using the full Wikipedia as the knowledge source, showed that our
model achieved state-of-the-art performance. Moreover, we thoroughly evaluated
the individual contributions of our model components with our new Japanese
dataset and SQuAD. The results showed significant improvements in the IR task
and provided a new perspective on IR for RC: it is effective to teach which
part of the passage answers the question rather than to give only a relevance
score to the whole passage.Comment: 10 pages, 6 figure. Accepted as a full paper at CIKM 201
Incorporating Relation Knowledge into Commonsense Reading Comprehension with Multi-task Learning
This paper focuses on how to take advantage of external relational knowledge
to improve machine reading comprehension (MRC) with multi-task learning. Most
of the traditional methods in MRC assume that the knowledge used to get the
correct answer generally exists in the given documents. However, in real-world
tasks, part of the knowledge may not be mentioned, and machines should be equipped
with the ability to leverage external knowledge. In this paper, we integrate
relational knowledge into the MRC model for commonsense reasoning.
Specifically, on top of a pre-trained language model (LM), we design two
auxiliary relation-aware tasks that predict whether any commonsense relation
exists between two words and, if so, what the relation type is, in order to
better model the interactions between the document and each candidate answer
option. We conduct experiments on two multiple-choice benchmark datasets: the
SemEval-2018 Task 11
and the Cloze Story Test. The experimental results demonstrate the
effectiveness of the proposed method, which achieves superior performance
compared with comparable baselines on both datasets.
Comment: Accepted at CIKM'19, 4 pages.
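The two auxiliary relation-aware tasks can be sketched as a pair of labels derived per word pair. This is a toy illustration: the relation store stands in for an external knowledge base such as ConceptNet, and the entries below are made up.

```python
# Toy relation store standing in for an external commonsense KB
# (hypothetical entries, not real KB data).
RELATIONS = {
    ("rain", "umbrella"): "UsedFor",
    ("dog", "animal"): "IsA",
}

def relation_labels(word_a, word_b):
    """Supervision for the two auxiliary relation-aware tasks on a word pair:
    (1) does any commonsense relation exist, and (2) which relation type."""
    rel = RELATIONS.get((word_a, word_b))
    return rel is not None, rel
```

Training the LM encoder on these two signals alongside the main answer-selection task is what injects the relational knowledge into the MRC model.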
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
In this paper, we present a two-stage model for multi-hop question answering.
The first stage is a hierarchical graph network, which is used to reason over
multi-hop questions and is capable of capturing different levels of granularity
using the natural structure of documents (i.e., paragraphs, questions,
sentences, and entities). The reasoning process is converted into a node
classification task (i.e., over paragraph nodes and sentence nodes). The
second stage is a language-model fine-tuning task. In short, stage one uses a
graph neural network to select and concatenate supporting sentences into one
paragraph, and stage two finds the answer span in the language-model
fine-tuning paradigm.
Comment: the experimental results are not as good as I expected.
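Stage one's output can be pictured as a sentence-selection step. This is a minimal sketch under stated assumptions: the scores would come from the graph network's node classifier, and the threshold value is an assumption.

```python
def select_support(sentences, node_scores, threshold=0.5):
    """Stage one (sketch): treat reasoning as node classification over
    sentence nodes, keep the sentences classified as supporting, and
    concatenate them into one paragraph for the stage-two reader."""
    kept = [s for s, score in zip(sentences, node_scores) if score > threshold]
    return " ".join(kept)
```

The concatenated paragraph is then what the fine-tuned language model in stage two searches for the answer span.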
Neural Skill Transfer from Supervised Language Tasks to Reading Comprehension
Reading comprehension is a challenging task in natural language processing
and requires a set of skills to be solved. While current approaches focus on
solving the task as a whole, in this paper, we propose to use a neural network
`skill' transfer approach. We transfer knowledge from several lower-level
language tasks (skills) including textual entailment, named entity recognition,
paraphrase detection and question type classification into the reading
comprehension model.
We conduct an empirical evaluation and show that transferring language-skill
knowledge leads to significant improvements on the task with far fewer training
steps than the baseline model. We also show that the skill transfer approach
is effective even with small amounts of training data. Another finding of this
work is that using token-wise deep label supervision for text classification
improves the performance of transfer learning.
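The token-wise deep label supervision idea can be sketched as follows. This is an illustrative reading rather than the paper's code: the sentence-level class is repeated at every token position, so the classifier receives a label per token instead of one label per sentence.

```python
def tokenwise_labels(tokens, sentence_label):
    """Token-wise deep label supervision (sketch): broadcast the sentence's
    class label to every token, giving a dense per-token training signal
    rather than a single sentence-level one."""
    return [(tok, sentence_label) for tok in tokens]
```

A denser supervision signal of this kind gives every encoder position a gradient, which is one plausible reason it helps the transferred representations.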
Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach
Named Entity Recognition (NER) aims to extract and classify entity mentions
in the text into pre-defined types (e.g., organization or person name).
Recently, many works have proposed to cast NER as a machine reading
comprehension problem (also termed MRC-based NER), in which entity recognition
is achieved by answering questions formulated for the pre-defined entity
types over the given contexts. However, these works ignore the label
dependencies among entity types, which are critical for precisely recognizing
named entities. In this paper, we propose to incorporate the label dependencies
among entity types into a multi-task learning framework for better MRC-based
NER. We decompose MRC-based NER into multiple tasks and use a self-attention
module to capture label dependencies. Comprehensive experiments on both nested
NER and flat NER datasets are conducted to validate the effectiveness of the
proposed Multi-NER. Experimental results show that Multi-NER can achieve better
performance on all datasets.
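The MRC-based NER formulation amounts to issuing one query per entity type over the same context. In this sketch the question templates are hypothetical examples, not the paper's actual queries.

```python
# Hypothetical question templates, one per pre-defined entity type.
TYPE_QUESTIONS = {
    "PER": "Which person is mentioned in the text?",
    "ORG": "Which organization is mentioned in the text?",
}

def ner_as_mrc(context, entity_types):
    """MRC-based NER (sketch): build one (question, context) pair per entity
    type; a span-extraction reader then answers each question, and the
    extracted spans become the recognized entities of that type."""
    return [(TYPE_QUESTIONS[t], context) for t in entity_types]
```

The multi-task framing in the paper runs these per-type tasks jointly, with a self-attention module over them to capture the label dependencies the single-question setup misses.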
Reading Wikipedia to Answer Open-Domain Questions
This paper proposes to tackle open-domain question answering using Wikipedia
as the unique knowledge source: the answer to any factoid question is a text
span in a Wikipedia article. This task of machine reading at scale combines the
challenges of document retrieval (finding the relevant articles) with that of
machine comprehension of text (identifying the answer spans from those
articles). Our approach combines a search component based on bigram hashing and
TF-IDF matching with a multi-layer recurrent neural network model trained to
detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA
datasets indicate that (1) both modules are highly competitive with respect to
existing counterparts and (2) multitask learning using distant supervision on
their combination is an effective complete system on this challenging task.
Comment: ACL 2017, 10 pages.
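The retrieval component based on bigram hashing and TF-IDF matching can be approximated in a few lines. This is a sketch under stated assumptions: the bucket count is arbitrary, and Python's built-in `hash` stands in for the unsigned murmur3 hash used in the paper.

```python
import math
from collections import Counter

def bigram_buckets(text, num_buckets=2**20):
    """Hash word bigrams into a fixed number of buckets (sketch; plain
    `hash` here as a stand-in for murmur3)."""
    words = text.lower().split()
    return Counter(hash(bg) % num_buckets for bg in zip(words, words[1:]))

def tfidf_match(query_vec, doc_vec, doc_freq, num_docs):
    """TF-IDF match between the hashed bigram vectors of a query and a
    document: term frequency weighted by inverse document frequency."""
    score = 0.0
    for bucket, q_tf in query_vec.items():
        if bucket in doc_vec:
            idf = math.log((num_docs + 1) / (doc_freq.get(bucket, 0) + 1))
            score += q_tf * doc_vec[bucket] * idf
    return score
```

Hashing bigrams into a fixed vocabulary keeps memory bounded regardless of corpus size, which is what makes this retriever practical over all of Wikipedia.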