Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus
The ability to ask questions is important in both human and machine
intelligence. Learning to ask questions helps knowledge acquisition, improves
question-answering and machine reading comprehension tasks, and helps a chatbot
to keep the conversation flowing with a human. Existing question generation
models are ineffective at generating large numbers of high-quality
question-answer pairs from unstructured text, since given an answer and an
input passage, question generation is inherently a one-to-many mapping. In this
paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which
aims at automatically generating high-quality and diverse question-answer pairs
from an unlabeled text corpus at scale by imitating the way a human asks
questions. Our system consists of: i) an information extractor, which samples
from the text multiple types of assistive information to guide question
generation; ii) neural question generators, which generate diverse and
controllable questions, leveraging the extracted assistive information; and
iii) a neural quality controller, which removes low-quality generated data
based on text entailment. We compare our question generation models with
existing approaches and use voluntary human evaluation to assess the
quality of the generated question-answer pairs. The evaluation results suggest
that our system dramatically outperforms state-of-the-art neural question
generation models in terms of generation quality while remaining
scalable. With models trained on a relatively small amount of data, we
can generate 2.8 million quality-assured question-answer pairs from a million
sentences found in Wikipedia.
Comment: Accepted by The Web Conference 2020 (WWW 2020) as full paper (oral
presentation)
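
The three-stage structure described in the abstract (information extractor, question generator, quality controller) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Guidance`, `QAPair`, `run_pipeline`, `toy_extract`, `toy_generate`, and `toy_keep` are all hypothetical, and the toy functions are simple string heuristics standing in for the neural components and the entailment-based filter.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Guidance:
    """Assistive information sampled from a sentence (hypothetical structure)."""
    answer: str  # the span the question should be answered by
    clue: str    # context the question may reuse
    style: str   # question style, e.g. "what"


@dataclass(frozen=True)
class QAPair:
    question: str
    answer: str


def run_pipeline(
    sentences: Iterable[str],
    extract: Callable[[str], List[Guidance]],
    generate: Callable[[str, Guidance], str],
    keep: Callable[[str, QAPair], bool],
) -> Iterator[QAPair]:
    """Extract guidance, generate one question per guidance, filter the results."""
    for sentence in sentences:
        # One sentence can yield several guidance items, hence several questions.
        for guidance in extract(sentence):
            pair = QAPair(generate(sentence, guidance), guidance.answer)
            if keep(sentence, pair):
                yield pair


def toy_extract(sentence: str) -> List[Guidance]:
    # Toy assumption: every capitalized token is a candidate answer.
    body = sentence.rstrip(".")
    return [Guidance(answer=w, clue=body, style="what")
            for w in body.split() if w[0].isupper()]


def toy_generate(sentence: str, guidance: Guidance) -> str:
    # Toy assumption: replace the answer with the style word inside the clue.
    return guidance.clue.replace(guidance.answer, guidance.style) + "?"


def toy_keep(sentence: str, pair: QAPair) -> bool:
    # Toy stand-in for the entailment-based quality check: the answer must
    # come from the sentence and must not leak into the question.
    return pair.answer in sentence and pair.answer not in pair.question


pairs = list(run_pipeline(["Paris is the capital of France."],
                          toy_extract, toy_generate, toy_keep))
for p in pairs:
    print(p.question, "->", p.answer)
```

Passing the three stages as callables keeps the loop independent of how each stage is realized, so a trained extractor, generator, or filter could replace the toy functions without changing `run_pipeline`.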