Open-Retrieval Conversational Question Answering
Conversational search is one of the ultimate goals of information retrieval.
Recent research approaches conversational search through simplified settings of
response ranking and conversational question answering, where an answer is
either selected from a given candidate set or extracted from a given passage.
These simplifications neglect the fundamental role of retrieval in
conversational search. To address this limitation, we introduce an
open-retrieval conversational question answering (ORConvQA) setting, where we
learn to retrieve evidence from a large collection before extracting answers,
as a further step towards building functional conversational search systems. We
create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an
end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader
that are all based on Transformers. Our extensive experiments on OR-QuAC
demonstrate that a learnable retriever is crucial for ORConvQA. We further show
that our system can make a substantial improvement when we enable history
modeling in all system components. Moreover, we show that the reranker
component contributes to the model performance by providing a regularization
effect. Finally, further in-depth analyses are performed to provide new
insights into ORConvQA.
Comment: Accepted to SIGIR'2
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension
In this paper, we study machine reading comprehension (MRC) on long texts,
where a model takes as inputs a lengthy document and a question and then
extracts a text span from the document as an answer. State-of-the-art models
tend to use a pretrained transformer model (e.g., BERT) to encode the joint
contextual information of the document and the question. However, these
transformer-based models can only take a fixed-length text (e.g., 512 tokens)
as their input. To deal with longer text inputs, previous approaches usually chunk
them into equally-spaced segments and predict answers based on each segment
independently without considering the information from other segments. As a
result, they may form segments that fail to cover the correct answer span or
retain insufficient context around it, which significantly degrades
performance. Moreover, they are less capable of answering questions that need
cross-segment information.
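
The limitation of equally spaced chunking described above can be illustrated with a minimal sketch. It is not the authors' code: the function names (`chunk_fixed`, `covers`, `context_around`), the half-open token-index convention, and the window and stride values are assumptions made only for this example.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)


def chunk_fixed(n_tokens: int, window: int, stride: int) -> List[Span]:
    """Split a document of n_tokens into segments of at most `window`
    tokens, starting a new segment every `stride` tokens."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    segments: List[Span] = []
    start = 0
    while True:
        end = min(start + window, n_tokens)
        segments.append((start, end))
        if end >= n_tokens:  # the last segment reaches the document end
            break
        start += stride
    return segments


def covers(segment: Span, answer: Span) -> bool:
    """True if the answer span lies entirely inside the segment."""
    return segment[0] <= answer[0] and answer[1] <= segment[1]


def context_around(segment: Span, answer: Span) -> Tuple[int, int]:
    """Number of context tokens to the left and right of the answer
    within a segment that covers it."""
    return answer[0] - segment[0], segment[1] - answer[1]


if __name__ == "__main__":
    answer = (505, 520)  # hypothetical answer location, in token indices
    for stride in (512, 256):
        segs = chunk_fixed(1000, window=512, stride=stride)
        hits = [s for s in segs if covers(s, answer)]
        print(stride, segs, hits)
```

With 1000 tokens, a 512-token window, and a 512-token stride, the segments are [0, 512) and [512, 1000), so an answer at tokens [505, 520) is split across the boundary and no segment contains it. Halving the stride to 256 adds the segment [256, 768), which covers the answer with roughly 250 tokens of context on each side, at the cost of processing more segments.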
We propose to let a model learn to chunk in a more flexible way via
reinforcement learning: a model can decide the next segment that it wants to
process in either direction. We also employ recurrent mechanisms to enable
information to flow across segments. Experiments on three MRC datasets -- CoQA,
QuAC, and TriviaQA -- demonstrate the effectiveness of our proposed recurrent
chunking mechanisms: we can obtain segments that are more likely to contain
complete answers and, at the same time, provide sufficient context around the
ground-truth answers for better predictions.