Propagate-Selector: Detecting Supporting Sentences for Question Answering via Graph Neural Networks
In this study, we propose a novel graph neural network called
propagate-selector (PS), which propagates information over sentences to
understand information that cannot be inferred when considering sentences in
isolation. First, we design a graph structure in which each node represents an
individual sentence, and some pairs of nodes are selectively connected based on
the text structure. Then, we develop an iterative attentive aggregation and a
skip-combine method in which a node interacts with its neighborhood nodes to
accumulate the necessary information. To evaluate the performance of the
proposed approaches, we conduct experiments with the standard HotpotQA dataset.
The empirical results demonstrate the superiority of our proposed approach,
which achieves the best performance compared with widely used
answer-selection models that do not consider intersentential relationships.
Comment: 8 pages, accepted as a conference paper at LREC 2020
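The propagation step described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the dot-product attention and the averaging skip-combine are simplifying assumptions, since the abstract does not spell out the exact scoring or combination functions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attentive_aggregate(node_vecs, adjacency):
    """One propagation round: each sentence node attends over its
    neighbors, aggregates their vectors, and skip-combines the result
    with its own state."""
    new_vecs = []
    for i, h_i in enumerate(node_vecs):
        neighbors = adjacency[i]
        if not neighbors:
            new_vecs.append(list(h_i))  # isolated node keeps its state
            continue
        # attention scores: dot product between the node and each neighbor
        scores = [sum(a * b for a, b in zip(h_i, node_vecs[j]))
                  for j in neighbors]
        weights = softmax(scores)
        # attention-weighted sum of the neighbor vectors
        agg = [sum(w * node_vecs[j][d] for w, j in zip(weights, neighbors))
               for d in range(len(h_i))]
        # skip-combine: mix the aggregated message with the node's own state
        new_vecs.append([(x + y) / 2 for x, y in zip(h_i, agg)])
    return new_vecs
```

Stacking this round k times lets a sentence node accumulate information from nodes up to k hops away, which is what allows reasoning that fails when each sentence is considered in isolation.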
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection
We propose TANDA, an effective technique for fine-tuning pre-trained
Transformer models for natural language tasks. Specifically, we first transfer
a pre-trained model into a model for a general task by fine-tuning it with a
large and high-quality dataset. We then perform a second fine-tuning step to
adapt the transferred model to the target domain. We demonstrate the benefits
of our approach for answer sentence selection, which is a well-known inference
task in Question Answering. We built a large-scale dataset to enable the
transfer step, exploiting the Natural Questions dataset. Our approach
establishes the state of the art on two well-known benchmarks, WikiQA and
TREC-QA, achieving MAP scores of 92% and 94.3%, respectively, which largely
outperform the previous highest scores of 83.4% and 87.5%, obtained in very
recent work. We empirically show that TANDA generates more stable and robust
models, reducing the effort required to select optimal hyper-parameters.
Additionally, we show that the transfer step of TANDA makes the adaptation step
more robust to noise. This enables a more effective use of noisy datasets for
fine-tuning. Finally, we also confirm the positive impact of TANDA in an
industrial setting, using domain specific datasets subject to different types
of noise.
Comment: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), Oral Presentation
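The two-step schedule (transfer on a large general dataset, then adapt on the target domain) can be illustrated with a toy one-dimensional linear model and plain SGD standing in for Transformer fine-tuning. The datasets, learning rates, and epoch counts below are invented for illustration and are not from the paper.

```python
def sgd_fit(w, data, lr=0.1, epochs=50):
    """Fine-tune a 1-D linear model y = w * x on (x, y) pairs with SGD."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of squared error
            w -= lr * grad
    return w

# Hypothetical datasets: a large "general task" set and a small
# target-domain set whose relationship is slightly shifted.
transfer_data = [(1.0, 2.0), (2.0, 4.0)]   # general task: y ~ 2.0 * x
adapt_data = [(1.0, 2.2), (2.0, 4.4)]      # target domain: y ~ 2.2 * x

w = 0.0
w = sgd_fit(w, transfer_data)                   # step 1: transfer
w = sgd_fit(w, adapt_data, lr=0.02, epochs=20)  # step 2: adapt, smaller lr
```

The transfer step moves the model most of the way to a good solution, so the adaptation step needs only a gentle, low-learning-rate refinement; this mirrors the paper's observation that transfer makes adaptation more stable and less sensitive to hyper-parameter choices.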