Propagate-Selector: Detecting Supporting Sentences for Question Answering via Graph Neural Networks
In this study, we propose a novel graph neural network called
propagate-selector (PS), which propagates information over sentences to
understand information that cannot be inferred when considering sentences in
isolation. First, we design a graph structure in which each node represents an
individual sentence, and some pairs of nodes are selectively connected based on
the text structure. Then, we develop an iterative attentive aggregation and a
skip-combine method in which a node interacts with its neighborhood nodes to
accumulate the necessary information. To evaluate the performance of the
proposed approaches, we conduct experiments with the standard HotpotQA dataset.
The empirical results demonstrate the superiority of our proposed approach,
which achieves the best performance compared with widely used
answer-selection models that do not consider intersentential relationships.
Comment: 8 pages, accepted as a conference paper at LREC 2020
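The propagation step described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the dot-product attention and the averaging skip-combine are simplifying assumptions, since the abstract does not spell out the exact scoring or combination functions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attentive_aggregate(node_vecs, adjacency):
    """One propagation round: each sentence node attends over its
    neighbors, aggregates their vectors, and skip-combines the result
    with its own state."""
    new_vecs = []
    for i, h_i in enumerate(node_vecs):
        neighbors = adjacency[i]
        if not neighbors:
            new_vecs.append(list(h_i))  # isolated node keeps its state
            continue
        # attention scores: dot product between the node and each neighbor
        scores = [sum(a * b for a, b in zip(h_i, node_vecs[j]))
                  for j in neighbors]
        weights = softmax(scores)
        # attention-weighted sum of the neighbor vectors
        agg = [sum(w * node_vecs[j][d] for w, j in zip(weights, neighbors))
               for d in range(len(h_i))]
        # skip-combine: mix the aggregated message with the node's own state
        new_vecs.append([(x + y) / 2 for x, y in zip(h_i, agg)])
    return new_vecs
```

Stacking this round k times lets a sentence node accumulate information from nodes up to k hops away, which is what allows reasoning that fails when each sentence is considered in isolation.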
TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection
We propose TANDA, an effective technique for fine-tuning pre-trained
Transformer models for natural language tasks. Specifically, we first transfer
a pre-trained model into a model for a general task by fine-tuning it with a
large and high-quality dataset. We then perform a second fine-tuning step to
adapt the transferred model to the target domain. We demonstrate the benefits
of our approach for answer sentence selection, which is a well-known inference
task in Question Answering. We built a large-scale dataset to enable the
transfer step, exploiting the Natural Questions dataset. Our approach
establishes the state of the art on two well-known benchmarks, WikiQA and
TREC-QA, achieving MAP scores of 92% and 94.3%, respectively, which largely
outperform the previous highest scores of 83.4% and 87.5%, obtained in very
recent work. We empirically show that TANDA generates more stable and robust
models, reducing the effort required to select optimal hyper-parameters.
Additionally, we show that the transfer step of TANDA makes the adaptation step
more robust to noise. This enables a more effective use of noisy datasets for
fine-tuning. Finally, we also confirm the positive impact of TANDA in an
industrial setting, using domain specific datasets subject to different types
of noise.
Comment: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), Oral Presentation
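The two-step schedule (transfer on a large general dataset, then adapt on the target domain) can be illustrated with a toy one-dimensional linear model and plain SGD standing in for Transformer fine-tuning. The datasets, learning rates, and epoch counts below are invented for illustration and are not from the paper.

```python
def sgd_fit(w, data, lr=0.1, epochs=50):
    """Fine-tune a 1-D linear model y = w * x on (x, y) pairs with SGD."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of squared error
            w -= lr * grad
    return w

# Hypothetical datasets: a large "general task" set and a small
# target-domain set whose relationship is slightly shifted.
transfer_data = [(1.0, 2.0), (2.0, 4.0)]   # general task: y ~ 2.0 * x
adapt_data = [(1.0, 2.2), (2.0, 4.4)]      # target domain: y ~ 2.2 * x

w = 0.0
w = sgd_fit(w, transfer_data)                   # step 1: transfer
w = sgd_fit(w, adapt_data, lr=0.02, epochs=20)  # step 2: adapt, smaller lr
```

The transfer step moves the model most of the way to a good solution, so the adaptation step needs only a gentle, low-learning-rate refinement; this mirrors the paper's observation that transfer makes adaptation more stable and less sensitive to hyper-parameter choices.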