Unsupervised Multi-hop Question Answering by Question Generation
Obtaining training data for multi-hop question answering (QA) is
time-consuming and resource-intensive. We explore the possibility of training a
well-performing multi-hop QA model without referencing any human-labeled
multi-hop question-answer pairs, i.e., unsupervised multi-hop QA. We propose
MQA-QG, an unsupervised framework that can generate human-like multi-hop
training data from both homogeneous and heterogeneous data sources. MQA-QG
generates questions by first selecting/generating relevant information from
each data source and then integrating the resulting pieces of information to
form a multi-hop question. Using only generated training data, we can train a
competent multi-hop QA model that achieves 61% and 83% of the supervised
learning performance on the HybridQA and HotpotQA datasets, respectively. We also
show that pretraining the QA system with the generated data greatly reduces
the demand for human-annotated training data. Our code is publicly
available at https://github.com/teacherpeterpan/Unsupervised-Multi-hop-QA.
Comment: NAACL 2021 (long paper)
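The two-step recipe described above (select/generate relevant information from each source, then integrate it into one multi-hop question) can be illustrated with a minimal bridge-entity composition. All function names, templates, and the toy data below are illustrative assumptions, not the paper's actual operators:

```python
# Hypothetical sketch of two-step multi-hop question composition:
# step 1 derives a description/template from each data source,
# step 2 chains them through a shared "bridge" entity.

def describe_from_table(row):
    # Step 1a: turn a table row into a phrase naming its bridge entity.
    return f"the {row['relation']} of {row['entity']}"

def template_from_text(attribute):
    # Step 1b: turn a text-derived attribute into a single-hop
    # question template with a slot for the bridge entity.
    return f"What is the {attribute} of {{bridge}}?"

def compose_multi_hop(table_row, attribute):
    # Step 2: substitute the table-derived description into the
    # text-derived template, fusing two hops into one question.
    bridge = describe_from_table(table_row)
    return template_from_text(attribute).replace("{bridge}", bridge)

row = {"entity": "France", "relation": "capital"}
question = compose_multi_hop(row, "population")
# -> "What is the population of the capital of France?"
```

Answering the composed question requires one hop over the table (capital of France) and one over text (population of that city), which is the kind of reasoning chain the generated training data is meant to teach.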
Summary-Oriented Question Generation for Informational Queries
Users frequently ask simple factoid questions of question answering (QA)
systems, limiting the impact of the many recent works that support more
complex questions. Prompting users with automatically generated suggested
questions (SQs) can improve user understanding of QA system capabilities and
thus facilitate more effective use. We aim to produce self-explanatory
questions that focus on main document topics and are answerable with variable
length passages as appropriate. We satisfy these requirements by using a
BERT-based Pointer-Generator Network trained on the Natural Questions (NQ)
dataset. Our model shows SOTA performance of SQ generation on the NQ dataset
(20.1 BLEU-4). We further apply our model on out-of-domain news articles,
evaluating with a QA system due to the lack of gold questions and demonstrate
that our model produces better SQs for news articles -- with further
confirmation via a human evaluation.Comment: 17 page
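The reported 20.1 BLEU-4 score compares generated questions against gold NQ questions via overlapping n-grams. A minimal, self-contained sentence-level BLEU-4 (add-1 smoothed so short outputs do not zero out; corpus-level BLEU as reported in papers aggregates counts differently):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu4(candidate, reference):
    """Geometric mean of smoothed modified 1..4-gram precisions,
    multiplied by a brevity penalty for short candidates."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, 5):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its reference count.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append((overlap + 1) / (total + 1))  # add-1 smoothing
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / 4)
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * geo_mean

score = bleu4("what is the capital of france",
              "what is the capital of france")
# identical strings score 1.0
```

In practice one would use an established implementation (e.g. NLTK's or sacrebleu's corpus BLEU) rather than this sketch, but the clipped-overlap and brevity-penalty mechanics are the same.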