A Span-Extraction Dataset for Chinese Machine Reading Comprehension
Machine Reading Comprehension (MRC) has become enormously popular recently and has attracted considerable attention. However, existing reading comprehension datasets are mostly in English. In this paper, we introduce a span-extraction dataset for Chinese machine reading comprehension to add linguistic diversity in this area. The dataset is composed of nearly 20,000 real questions annotated on Wikipedia paragraphs by human experts. We also annotated a challenge set containing questions that require comprehensive understanding and multi-sentence inference over the context. We present several baseline systems, as well as anonymous submissions, to demonstrate the difficulty of this dataset. With the release of the dataset, we hosted the Second Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC 2018). We hope the release of the dataset will further accelerate Chinese machine reading comprehension research. Resources are available at: https://github.com/ymcui/cmrc2018
Comment: 6 pages, accepted as a conference paper at EMNLP-IJCNLP 2019 (short paper)
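To make the span-extraction setup concrete, here is a minimal Python sketch of the task format and a character-level EM/F1 scorer of the kind commonly used for Chinese MRC; the example data and scorer details are illustrative assumptions, and the official CMRC 2018 evaluation script may differ (e.g., in punctuation handling).

```python
from collections import Counter

# One span-extraction example: the answer is a contiguous span of the context.
example = {
    "context": "苏轼，字子瞻，号东坡居士，北宋著名文学家。",
    "question": "苏轼的号是什么？",
    "answer": "东坡居士",
}

def exact_match(pred: str, gold: str) -> float:
    return float(pred == gold)

def char_f1(pred: str, gold: str) -> float:
    # Chinese answers are typically compared character by character.
    common = Counter(pred) & Counter(gold)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# A span-extraction system outputs (start, end) character offsets; here we
# fake a perfect prediction just to exercise the scorer.
start = example["context"].find(example["answer"])
end = start + len(example["answer"])
prediction = example["context"][start:end]
print(exact_match(prediction, example["answer"]),
      char_f1(prediction, example["answer"]))
```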
Reinforced Mnemonic Reader for Machine Reading Comprehension
In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
Comment: Published in the 27th International Joint Conference on Artificial Intelligence (IJCAI), 2018
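The reattention idea can be sketched as follows: the similarity scores of the current alignment round are adjusted using the attention distribution memorized from the previous round. This is a simplified illustration under assumed shapes and a single blending weight gamma, not the paper's exact equations.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reattention(Q, K, prev_attn=None, gamma=0.5):
    """Q: (m, d) query-side vectors; K: (n, d) context-side vectors;
    prev_attn: (m, n) attention memorized from the previous round."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    if prev_attn is not None:
        # Blend in the temporally memorized attention so a later round
        # refines, rather than recomputes, the earlier alignment.
        scores = scores + gamma * prev_attn
    return softmax(scores, axis=-1)

rng = np.random.default_rng(0)
Q, K = rng.normal(size=(4, 8)), rng.normal(size=(6, 8))
attn_round1 = reattention(Q, K)                          # first alignment round
attn_round2 = reattention(Q, K, prev_attn=attn_round1)   # refined by past attention
```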
STARC: Structured Annotations for Reading Comprehension
We present STARC (Structured Annotations for Reading Comprehension), a new
annotation framework for assessing reading comprehension with multiple choice
questions. Our framework introduces a principled structure for the answer
choices and ties them to textual span annotations. The framework is implemented
in OneStopQA, a new high-quality dataset for evaluation and analysis of reading
comprehension in English. We use this dataset to demonstrate that STARC can be leveraged for a key new application in the development of SAT-like reading comprehension materials: automatic annotation quality probing via span ablation
experiments. We further show that it enables in-depth analyses and comparisons
between machine and human reading comprehension behavior, including error
distributions and guessing ability. Our experiments also reveal that the
standard multiple choice dataset in NLP, RACE, is limited in its ability to
measure reading comprehension. 47% of its questions can be guessed by machines
without accessing the passage, and 18% are unanimously judged by humans as not
having a unique correct answer. OneStopQA provides an alternative test set for
reading comprehension which alleviates these shortcomings and has a
substantially higher human ceiling performance.
Comment: ACL 2020. The OneStopQA dataset, STARC guidelines, and human experiment data are available at https://github.com/berzak/onestop-qa
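A rough illustration of span ablation probing: if the annotated critical span is removed from the passage, a reader should no longer answer the question correctly; if it still does, the span annotation (or the question) is suspect. The toy overlap-based "model" below is a hypothetical stand-in for any machine reading system, and the helper names are my own.

```python
def toy_model(passage: str, question: str) -> str:
    # Hypothetical stand-in reader: return the passage sentence with the
    # largest word overlap with the question.
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    q_words = set(question.lower().split())
    return max(sentences, key=lambda s: len(q_words & set(s.lower().split())))

def ablate_span(passage: str, span: str) -> str:
    return passage.replace(span, "")

def probe_annotation(model, passage, question, span, is_correct) -> bool:
    with_span = is_correct(model(passage, question))
    without_span = is_correct(model(ablate_span(passage, span), question))
    # A sound annotation: answerable with the span present, not without it.
    return with_span and not without_span

passage = ("The harbor froze in January. Ships were rerouted south. "
           "Trade resumed only in April.")
question = "When did trade resume?"
span = "Trade resumed only in April"
print(probe_annotation(toy_model, passage, question, span,
                       lambda answer: "April" in answer))  # True
```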
Training a Ranking Function for Open-Domain Question Answering
In recent years, there have been remarkable advances in deep learning methods for machine reading. In machine reading, the machine reader has to extract the answer from a given ground-truth paragraph. Recently, state-of-the-art machine reading models have achieved human-level performance on SQuAD, a reading comprehension-style question answering (QA) task. The success of machine reading has inspired researchers to combine information retrieval with machine reading to tackle open-domain QA. However, these systems perform poorly compared to reading comprehension-style QA because it is difficult to retrieve the paragraphs that contain the answer to the question. In this study, we propose two neural network rankers that assign scores to different passages based on their likelihood of containing the answer to a given
question. Additionally, we analyze the relative importance of semantic similarity and word-level relevance matching in open-domain QA.
Comment: To appear at NAACL-SRW 2018
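As a concrete, if heavily simplified, picture of such a ranker, the sketch below encodes the question and each passage, scores passages by similarity, and normalizes the scores with a softmax, so that training could push probability toward the answer-bearing passage. The hashed bag-of-words encoder is an assumption standing in for the learned encoders the paper actually studies.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def encode(text, vocab, table):
    # Bag-of-words average over fixed random word vectors: a stand-in
    # for a trained neural encoder.
    idx = [vocab[w] for w in text.lower().split() if w in vocab]
    return table[idx].mean(axis=0) if idx else np.zeros(table.shape[1])

question = "who wrote hamlet"
passages = [
    "hamlet is a tragedy written by william shakespeare",
    "the eiffel tower is located in paris",
]

vocab = {w: i for i, w in enumerate(
    sorted(set(" ".join([question] + passages).split())))}
table = np.random.default_rng(0).normal(size=(len(vocab), 16))

q = encode(question, vocab, table)
scores = np.array([q @ encode(p, vocab, table) for p in passages])
probs = softmax(scores)  # training would maximize the answer-bearing passage's probability
print(probs)
```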
Stochastic Answer Networks for Machine Reading Comprehension
We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension. Compared to previous work such as ReasoNet, which used reinforcement learning to determine the number of steps, the unique feature is the use of a form of stochastic prediction dropout on the answer module (the final layer) of the neural network during training. We show that this simple trick improves robustness and achieves results competitive with the state of the art on the Stanford Question Answering Dataset (SQuAD), Adversarial SQuAD, and the Microsoft MAchine Reading COmprehension dataset (MS MARCO).
Comment: 11 pages, 5 figures, accepted to ACL 2018
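A simplified sketch of stochastic prediction dropout: the answer module emits a distribution at each reasoning step; during training a random subset of the per-step predictions is dropped before averaging, while at inference all steps are averaged. The drop rate and renormalization here are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def stochastic_answer(step_predictions, train=True, drop_rate=0.4, rng=None):
    """step_predictions: (T, C) array of per-step probability vectors."""
    P = np.asarray(step_predictions, dtype=float)
    if train:
        rng = rng or np.random.default_rng()
        keep = rng.random(len(P)) > drop_rate   # drop whole steps at random
        if not keep.any():                      # always keep at least one step
            keep[rng.integers(len(P))] = True
        P = P[keep]
    avg = P.mean(axis=0)
    return avg / avg.sum()                      # renormalized averaged prediction

steps = [[0.7, 0.2, 0.1],
         [0.5, 0.4, 0.1],
         [0.6, 0.3, 0.1]]
print(stochastic_answer(steps, train=False))    # deterministic average at inference
```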
