Multi-grained Evidence Inference for Multi-choice Reading Comprehension
Multi-choice Machine Reading Comprehension (MRC) is a major and challenging
task in which machines must answer questions by selecting among provided
options. Answers in multi-choice MRC cannot be directly extracted from the
given passages; instead, machines must reason over accurately extracted
evidence. However, the critical evidence may be as short as a single word or
phrase, hidden within a redundant, noisy passage that spans multiple
linguistic levels, from phrase and fragment to sentence and the entire
passage. We therefore propose a novel general-purpose model enhancement that
integrates evidence at multiple granularities comprehensively, named the
Multi-grained evidence inferencer (Mugen), to address this limitation. Mugen
extracts evidence at three granularities: coarse-, middle-, and fine-grained,
and integrates the evidence with the original passages, achieving significant
and consistent performance improvements on four multi-choice MRC benchmarks.
Comment: Accepted by TASLP 2023, vol. 31, pp. 3896-390
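The three granularities above can be illustrated with a minimal sketch. This is not Mugen's learned extractor: simple lexical overlap with the question stands in for the model's evidence scoring, and `overlap_score` and `extract_evidence` are hypothetical names introduced here for illustration.

```python
import re

def overlap_score(text, question):
    """Fraction of question words appearing in the text (a crude relevance proxy)."""
    q = set(re.findall(r"\w+", question.lower()))
    t = set(re.findall(r"\w+", text.lower()))
    return len(q & t) / max(len(q), 1)

def extract_evidence(passage, question):
    """Pick evidence at three granularities: sentence (coarse),
    clause (middle), and word (fine)."""
    sentences = re.split(r"(?<=[.!?])\s+", passage)
    coarse = max(sentences, key=lambda s: overlap_score(s, question))
    clauses = re.split(r"[,;]\s*", coarse)
    middle = max(clauses, key=lambda c: overlap_score(c, question))
    q_words = set(re.findall(r"\w+", question.lower()))
    fine = [w for w in re.findall(r"\w+", middle) if w.lower() in q_words]
    return coarse, middle, fine

passage = ("The mitochondrion produces ATP. "
           "The nucleus, which stores DNA, controls the cell.")
question = "What stores DNA?"
coarse, middle, fine = extract_evidence(passage, question)
```

A model in the spirit of the abstract would then concatenate all three evidence pieces with the original passage before answer selection.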
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
In this paper, we present a two-stage model for multi-hop question answering.
The first stage is a hierarchical graph network, which is used to reason over
multi-hop questions and can capture different levels of granularity using the
natural structure of documents (i.e., paragraphs, questions, sentences, and
entities). The reasoning process is cast as a node classification task (i.e.,
over paragraph nodes and sentence nodes). The second stage is a language-model
fine-tuning task. In short, stage one uses a graph neural network to select
and concatenate supporting sentences into one paragraph, and stage two finds
the answer span under the language-model fine-tuning paradigm.
Comment: the experimental results are not as good as I expected
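Stage one can be sketched in miniature. The toy below replaces the learned hierarchical graph network with a single hand-written message-passing step over sentence nodes; the relevance features, the threshold, and the names `aggregate`, `support`, and `context` are all assumptions made for illustration.

```python
def aggregate(features, adj):
    """One message-passing step: each node averages its own feature
    with those of its graph neighbors."""
    out = {}
    for node, feat in features.items():
        vals = [features[n] for n in adj.get(node, [])] + [feat]
        out[node] = sum(vals) / len(vals)
    return out

sentences = {"s1": "Paris is the capital of France.",
             "s2": "France borders Spain.",
             "s3": "The Eiffel Tower is in Paris."}
# toy per-node relevance features; a real graph network learns these
features = {"s1": 0.9, "s2": 0.1, "s3": 0.8}
# edges link sentences that share an entity (here: "Paris")
adj = {"s1": ["s3"], "s2": [], "s3": ["s1"]}

scores = aggregate(features, adj)
# stage-one output: classify sentence nodes as supporting, then
# concatenate the selected sentences into one paragraph for stage two
support = [sid for sid in sorted(scores) if scores[sid] > 0.5]
context = " ".join(sentences[sid] for sid in support)
```

Stage two would then run extractive span prediction with a fine-tuned language model over `context`.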
Exploiting Sentence Embedding for Medical Question Answering
Despite the great success of word embedding, sentence embedding remains an
open problem. In this paper, we present a supervised learning framework that
exploits sentence embedding for the medical question answering task. The
framework consists of two main parts: 1) a sentence embedding module and 2) a
scoring module. The former combines contextual self-attention with multi-scale
techniques to encode a sentence into an embedding tensor; we call this module
Contextual self-Attention Multi-scale Sentence Embedding (CAMSE) for short.
The latter employs two scoring strategies: Semantic Matching Scoring (SMS) and
Semantic Association Scoring (SAS). SMS measures similarity, while SAS
captures association, between sentence pairs: a medical question concatenated
with a candidate choice, and a piece of corresponding supportive evidence. The
proposed framework is evaluated on two Medical Question Answering (MedicalQA)
datasets collected from real-world applications: medical exams and clinical
diagnosis based on electronic medical records (EMR). The comparison results
show that our framework achieves significant improvements over competitive
baseline approaches. Additionally, a series of controlled experiments
illustrates that the multi-scale strategy and the contextual self-attention
layer play important roles in producing effective sentence embeddings, and
that the two scoring strategies are highly complementary for question
answering.
Comment: 8 pages
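The self-attention pooling idea behind CAMSE can be sketched as follows. This is a bare dot-product attention over toy word vectors, not the paper's multi-scale architecture; the function name `self_attention_embed` and the 2-d vectors are illustrative assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention_embed(word_vecs):
    """Pool word vectors into one sentence embedding: every word attends
    to all words via dot-product scores, and the attended vectors are
    averaged into a single fixed-size vector."""
    dim = len(word_vecs[0])
    attended = []
    for q in word_vecs:
        w = softmax([sum(a * b for a, b in zip(q, k)) for k in word_vecs])
        attended.append([sum(wi * k[d] for wi, k in zip(w, word_vecs))
                         for d in range(dim)])
    return [sum(v[d] for v in attended) / len(attended) for d in range(dim)]

# two toy 2-d word vectors; a real system would use learned embeddings
sentence = [[1.0, 0.0], [0.0, 1.0]]
embedding = self_attention_embed(sentence)
```

A scoring module in the spirit of SMS would then compare such embeddings, e.g. by cosine similarity between the question-plus-choice embedding and each evidence embedding.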
Constructing Datasets for Multi-hop Reading Comprehension Across Documents
Most Reading Comprehension methods limit themselves to queries which can be
answered using a single sentence, paragraph, or document. Enabling models to
combine disjoint pieces of textual evidence would extend the scope of machine
comprehension methods, but currently there exist no resources to train and test
this capability. We propose a novel task to encourage the development of models
for text understanding across multiple documents and to investigate the limits
of existing methods. In our task, a model learns to seek and combine evidence -
effectively performing multi-hop (alias multi-step) inference. We devise a
methodology to produce datasets for this task, given a collection of
query-answer pairs and thematically linked documents. Two datasets from
different domains are induced, and we identify potential pitfalls and devise
circumvention strategies. We evaluate two previously proposed competitive
models and find that one can integrate information across documents. However,
both models struggle to select relevant information, as providing documents
guaranteed to be relevant greatly improves their performance. While the models
outperform several strong baselines, their best accuracy reaches 42.9% compared
to human performance at 74.0% - leaving ample room for improvement.
Comment: This paper directly corresponds to the TACL version
(https://transacl.org/ojs/index.php/tacl/article/view/1325) apart from minor
changes in wording, additional footnotes, and appendices
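The construction idea, chaining thematically linked documents from a query's subject to its answer, can be sketched as a graph search. The link structure, document names, and the helper `collect_support_docs` are hypothetical; the actual dataset methodology involves additional filtering and pitfall-circumvention steps described in the paper.

```python
from collections import deque

def collect_support_docs(links, start, answer_doc):
    """BFS over linked documents: return the chain of documents connecting
    the document that mentions the query subject to the one containing the
    answer. Each (query, answer) pair plus its chain becomes one
    multi-hop example."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        if doc == answer_doc:
            chain = []
            while doc is not None:
                chain.append(doc)
                doc = parent[doc]
            return chain[::-1]
        for nxt in links.get(doc, []):
            if nxt not in parent:
                parent[nxt] = doc
                queue.append(nxt)
    return None  # answer not reachable: discard this candidate example

# toy link graph: subject document -> bridging document -> answer document
links = {"d_subject": ["d_bridge"], "d_bridge": ["d_answer"], "d_other": []}
chain = collect_support_docs(links, "d_subject", "d_answer")
```

A model given all documents in `chain` must combine evidence from each hop, which is exactly the multi-step inference the task is designed to test.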
XTQA: Span-Level Explanations of the Textbook Question Answering
Textbook Question Answering (TQA) is a task in which one must answer a
diagram or non-diagram question given a large multi-modal context consisting
of abundant essays and diagrams. We argue that explainability for students
should be a key consideration in this task. To address this issue, we devise
a novel architecture for span-level eXplanations of TQA (XTQA) based on our
proposed coarse-to-fine-grained algorithm, which provides students not only
the answers but also the span-level evidence for choosing them. The algorithm
first coarsely selects the top paragraphs relevant to a question using the
TF-IDF method, and then finely selects the top evidence spans from all
candidate spans within these paragraphs by computing each span's information
gain with respect to the question. Experimental results show that XTQA
significantly improves state-of-the-art performance compared with baselines.
The source code is available at
https://github.com/keep-smile-001/opentqa
Comment: 10 pages
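The coarse step, ranking paragraphs against the question with TF-IDF, can be sketched with a bag-of-words scorer. This is a generic TF-IDF formulation, not XTQA's exact implementation, and `tfidf_score` is an illustrative name.

```python
import math
import re
from collections import Counter

def tfidf_score(paragraph, question, corpus):
    """Score a paragraph against a question as a TF-IDF dot product:
    sum over question words of (term frequency in paragraph) x
    (inverse document frequency over the paragraph corpus)."""
    docs = [set(re.findall(r"\w+", p.lower())) for p in corpus]
    tf = Counter(re.findall(r"\w+", paragraph.lower()))
    score = 0.0
    for word in set(re.findall(r"\w+", question.lower())):
        df = sum(word in d for d in docs)
        if df:  # words in no paragraph contribute nothing
            score += tf[word] * math.log(len(docs) / df)
    return score

corpus = ["Plants use photosynthesis to make food.",
          "Animals eat plants or other animals."]
question = "How do plants make food?"
# coarse stage: keep the highest-scoring paragraph(s) for fine-grained
# span selection by information gain
best = max(corpus, key=lambda p: tfidf_score(p, question, corpus))
```

The fine step would then enumerate candidate spans inside the retained paragraphs and rank them by their information gain with respect to the question.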