Open-Retrieval Conversational Question Answering
Conversational search is one of the ultimate goals of information retrieval.
Recent research approaches conversational search by simplified settings of
response ranking and conversational question answering, where an answer is
either selected from a given candidate set or extracted from a given passage.
These simplifications neglect the fundamental role of retrieval in
conversational search. To address this limitation, we introduce an
open-retrieval conversational question answering (ORConvQA) setting, where we
learn to retrieve evidence from a large collection before extracting answers,
as a further step towards building functional conversational search systems. We
create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an
end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader
that are all based on Transformers. Our extensive experiments on OR-QuAC
demonstrate that a learnable retriever is crucial for ORConvQA. We further show
that our system can make a substantial improvement when we enable history
modeling in all system components. Moreover, we show that the reranker
component contributes to the model performance by providing a regularization
effect. Finally, further in-depth analyses are performed to provide new
insights into ORConvQA.
Comment: Accepted to SIGIR'2
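The retriever–reranker–reader pipeline described above can be sketched end to end. This is a toy illustration, not the paper's Transformer-based system: token-overlap scoring stands in for the learned retriever, reranker, and reader, and the collection, history, and question are made up for the example. Naive history modeling is shown by prepending prior turns to the current question.

```python
# Toy sketch of the ORConvQA pipeline: retrieve from a collection,
# rerank, then extract an answer span. Overlap scoring is a stand-in
# for the Transformer components described in the abstract.

def tokenize(text):
    return [t.strip(".?!,").lower() for t in text.split()]

def score(query, passage):
    return len(set(tokenize(query)) & set(tokenize(passage)))

def retrieve(query, collection, k=3):
    # first stage: cheap scoring over the whole collection
    return sorted(collection, key=lambda p: score(query, p), reverse=True)[:k]

def rerank(query, passages):
    # second stage: a real reranker would jointly encode (query, passage)
    return max(passages, key=lambda p: score(query, p))

def read(query, passage):
    # toy "reader": return the sentence with the highest overlap
    return max(passage.split(". "), key=lambda s: score(query, s))

collection = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany.",
    "The Eiffel Tower is in Paris.",
]
history = ["Tell me about France."]
question = "What is its capital?"
# naive history modeling: prepend prior turns to the current question
query = " ".join(history + [question])
answer = read(query, rerank(query, retrieve(query, collection)))
```

Without the history turn, the bare question "What is its capital?" would not favor the France passage, which is the intuition behind history modeling in all components.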
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Large pre-trained language models have been shown to store factual knowledge
in their parameters, and achieve state-of-the-art results when fine-tuned on
downstream NLP tasks. However, their ability to access and precisely manipulate
knowledge is still limited, and hence on knowledge-intensive tasks, their
performance lags behind task-specific architectures. Additionally, providing
provenance for their decisions and updating their world knowledge remain open
research problems. Pre-trained models with a differentiable access mechanism to
explicit non-parametric memory can overcome this issue, but have so far been
only investigated for extractive downstream tasks. We explore a general-purpose
fine-tuning recipe for retrieval-augmented generation (RAG) -- models which
combine pre-trained parametric and non-parametric memory for language
generation. We introduce RAG models where the parametric memory is a
pre-trained seq2seq model and the non-parametric memory is a dense vector index
of Wikipedia, accessed with a pre-trained neural retriever. We compare two RAG
formulations, one which conditions on the same retrieved passages across the
whole generated sequence, the other can use different passages per token. We
fine-tune and evaluate our models on a wide range of knowledge-intensive NLP
tasks and set the state-of-the-art on three open domain QA tasks, outperforming
parametric seq2seq models and task-specific retrieve-and-extract architectures.
For language generation tasks, we find that RAG models generate more specific,
diverse and factual language than a state-of-the-art parametric-only seq2seq
baseline.
Comment: Accepted at NeurIPS 202
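The two RAG formulations contrasted above differ only in where the marginalization over retrieved passages happens. A minimal numeric sketch, with toy probabilities in place of a trained retriever and generator: RAG-Sequence conditions the whole output on one passage and sums over passages at the sequence level, while RAG-Token sums over passages independently at each token.

```python
# Numeric sketch of the two RAG marginalizations (toy numbers,
# not from any trained model).
from math import prod, isclose

p_ret = {"z1": 0.7, "z2": 0.3}                 # retriever: p(z|x)
p_gen = {"z1": [0.9, 0.8], "z2": [0.2, 0.5]}   # generator: p(y_i|x,z) per token

# RAG-Sequence: same passage for the whole sequence, then marginalize:
#   sum_z p(z|x) * prod_i p(y_i|x,z)
p_sequence = sum(p_ret[z] * prod(p_gen[z]) for z in p_ret)

# RAG-Token: a different passage may be used per token:
#   prod_i sum_z p(z|x) * p(y_i|x,z)
n_tokens = 2
p_token = prod(sum(p_ret[z] * p_gen[z][i] for z in p_ret)
               for i in range(n_tokens))
```

The two quantities generally differ (here 0.534 vs. 0.4899), which is why the choice of formulation matters for both training and decoding.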
Cited Text Spans for Citation Text Generation
An automatic citation generation system aims to concisely and accurately
describe the relationship between two scientific articles. To do so, such a
system must ground its outputs to the content of the cited paper to avoid
non-factual hallucinations. Due to the length of scientific documents, existing
abstractive approaches have conditioned only on cited paper abstracts. We
demonstrate empirically that the abstract is not always the most appropriate
input for citation generation and that models trained in this way learn to
hallucinate. We propose to condition instead on the cited text span (CTS) as an
alternative to the abstract. Because manual CTS annotation is extremely time-
and labor-intensive, we experiment with distant labeling of candidate CTS
sentences, achieving sufficiently strong performance to substitute for
expensive human annotations in model training, and we propose a
human-in-the-loop, keyword-based CTS retrieval approach that makes generating
citation texts grounded in the full text of cited papers both promising and
practical.
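The keyword-based CTS retrieval idea can be sketched simply: rank the cited paper's sentences by overlap with keywords drawn from the citing context. This is only an illustrative stand-in for the approach named above; the sentences and keywords below are invented, and in the proposed human-in-the-loop setup the keywords would be supplied or refined by a person.

```python
# Toy sketch of keyword-based cited-text-span (CTS) retrieval:
# rank cited-paper sentences by overlap with context keywords.

def cts_retrieve(keywords, cited_sentences, k=2):
    kw = {w.lower() for w in keywords}
    scored = [(sum(w.strip(".,") in kw for w in s.lower().split()), s)
              for s in cited_sentences]
    scored.sort(key=lambda t: t[0], reverse=True)
    # keep only sentences that matched at least one keyword
    return [s for sc, s in scored[:k] if sc > 0]

cited_sentences = [
    "We introduce a dual-encoder trained with contrastive loss.",
    "Experiments were run on eight GPUs.",
    "The dual-encoder outperforms the cross-encoder baseline.",
]
spans = cts_retrieve(["dual-encoder", "contrastive"], cited_sentences)
```

The retrieved spans, rather than the abstract, would then be fed to the generation model as grounding.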
Dense Retrieval as Indirect Supervision for Large-space Decision Making
Many discriminative natural language understanding (NLU) tasks have large
label spaces. Learning such a process of large-space decision making is
particularly challenging due to the lack of training instances per label and
the difficulty of selection among many fine-grained labels. Inspired by dense
retrieval methods for passage finding in open-domain QA, we propose a
reformulation of large-space discriminative NLU tasks as a learning-to-retrieve
task, leading to a novel solution named Dense Decision Retrieval (DDR).
Instead of predicting fine-grained decisions as logits, DDR adopts a
dual-encoder architecture that learns to predict by retrieving from a decision
thesaurus. This approach not only leverages rich indirect supervision signals
from easy-to-consume learning resources for dense retrieval but also leads to
enhanced prediction generalizability with a semantically meaningful
representation of the large decision space. When evaluated on tasks with
decision spaces ranging from hundreds to hundreds of thousands of labels, DDR
outperforms strong baselines by 27.54% in P@1 on two extreme multi-label
classification tasks, 1.17% in F1 score on ultra-fine entity typing, and 1.26%
in accuracy on three few-shot intent classification tasks on average.
Code and resources are available at https://github.com/luka-group/DDR
Comment: EMNLP 2023 (Findings)
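The dual-encoder reformulation described above can be sketched with a toy example: instead of producing a logit per label, embed the input and every label description from the decision thesaurus, then predict by nearest-neighbor retrieval. Bag-of-words vectors stand in for the learned encoders, and the thesaurus entries below are invented for illustration.

```python
# Toy sketch of the Dense Decision Retrieval (DDR) idea: predict by
# retrieving the closest label description, not by a classifier head.
from collections import Counter
from math import sqrt

def embed(text):
    # stand-in for a learned encoder: bag-of-words counts
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    nu = sqrt(sum(c * c for c in u.values()))
    nv = sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# "decision thesaurus": a textual description for every label
decision_thesaurus = {
    "book_flight": "reserve a plane ticket for travel",
    "check_weather": "get the weather forecast for a city",
}

def predict(utterance):
    q = embed(utterance)
    return max(decision_thesaurus,
               key=lambda lbl: cosine(q, embed(decision_thesaurus[lbl])))

label = predict("what is the weather forecast for paris")
```

Because labels are represented by text, unseen or rarely seen labels still get a meaningful position in the embedding space, which is the source of the generalizability claim.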
History Modeling for Conversational Information Retrieval
Conversational search is an embodiment of an iterative and interactive approach to information retrieval (IR) that has been studied for decades. Due to the recent rise of intelligent personal assistants, such as Siri, Alexa, AliMe, Cortana, and Google Assistant, a growing part of the population is moving their information-seeking activities to voice- or text-based conversational interfaces. One of the major challenges of conversational search is to leverage the conversation history to understand and fulfill the users' information needs. In this dissertation work, we investigate history modeling approaches for conversational information retrieval.

We start from history modeling for user intent prediction. We analyze information-seeking conversations by user intent distribution, co-occurrence, and flow patterns, followed by a study of user intent prediction in an information-seeking setting with both feature-based methods and deep learning methods. We then move to history modeling for conversational question answering (ConvQA), which can be considered a simplified setting of conversational search. We first propose a positional history answer embedding (PosHAE) method to seamlessly integrate conversation history into a ConvQA model based on BERT. We then build upon this method and design a history attention mechanism (HAM) to conduct a "soft selection" of conversation history.

After this, we extend the previous ConvQA task to an open-retrieval (ORConvQA) setting to emphasize the fundamental role of retrieval in conversational search. In this setting, we learn to retrieve evidence from a large collection before extracting answers. We build an end-to-end system for ORConvQA, featuring a learnable dense retriever. We conduct experiments with both fully-supervised and weakly-supervised approaches to tackle the training challenges of ORConvQA. Finally, we study history modeling for conversational re-ranking.
Given a history of user feedback behaviors, such as issuing a query, clicking a document, and skipping a document, we propose to introduce behavior awareness to a neural ranker. Our experimental results show that the history modeling approaches proposed in this dissertation can effectively improve the performance of different conversation tasks and provide new insights into conversational information retrieval.
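The "soft selection" of conversation history mentioned above can be sketched numerically: score each history turn against the current question, turn the scores into attention weights with a softmax, and take the weighted combination of the turn representations. The three-dimensional vectors below are toy stand-ins for learned turn encodings, not the dissertation's actual mechanism.

```python
# Numeric sketch of attention-based soft selection over history turns.
from math import exp

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    es = [exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# toy 3-d representations of two history turns and the current question
history = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
question = [0.9, 0.1, 0.0]

weights = softmax([dot(question, h) for h in history])
aggregated = [sum(w * h[i] for w, h in zip(weights, history))
              for i in range(3)]
```

The softmax makes the selection soft: the turn most similar to the current question dominates, but no turn is discarded outright.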
NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework
Information retrieval aims to find information that meets users' needs from
the corpus. Different needs correspond to different IR tasks such as document
retrieval, open-domain question answering, retrieval-based dialogue, etc.,
while they share the same schema for estimating the relationship between texts.
This suggests that a good IR model should generalize to different tasks and
domains. However, previous studies indicate that state-of-the-art neural
information retrieval (NIR) models, e.g., pre-trained language models (PLMs),
are hard to generalize, mainly because the end-to-end fine-tuning paradigm
makes the model overemphasize task-specific signals and domain biases while
losing the ability to capture generalized essential signals. To address this
problem, we propose a
novel NIR training framework named NIR-Prompt for retrieval and reranking
stages based on the idea of decoupling signal capturing and combination.
NIR-Prompt uses an Essential Matching Module (EMM) to capture the essential
matching signals and obtains a description of each task from a Matching
Description Module (MDM). The description is used as task-adaptation
information to combine
the essential matching signals to adapt to different tasks. Experiments under
in-domain multi-task, out-of-domain multi-task, and new task adaptation
settings show that NIR-Prompt can improve the generalization of PLMs in NIR for
both retrieval and reranking stages compared with baselines.
Comment: This article is an extension of arXiv:2204.02725 and accepted by
TOI
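The decoupling of signal capturing and combination described above can be illustrated with a toy example: a shared module computes generic matching signals for a text pair, and a per-task description decides how those signals are weighted. The signal names, tasks, and weights below are invented for illustration and are not NIR-Prompt's actual prompt templates.

```python
# Toy sketch of decoupled matching: shared signals (EMM-like) combined
# under task-specific descriptions (MDM-like). Weights are illustrative.

def essential_signals(query, doc):
    # shared, task-agnostic matching signals over a text pair
    q, d = set(query.lower().split()), set(doc.lower().split())
    return {"overlap": len(q & d) / max(len(q), 1),
            "coverage": len(q & d) / max(len(d), 1)}

# task-adaptation info: each task weighs the shared signals differently
task_descriptions = {
    "document_retrieval": {"overlap": 0.8, "coverage": 0.2},
    "qa": {"overlap": 0.3, "coverage": 0.7},
}

def score(query, doc, task):
    sig = essential_signals(query, doc)
    w = task_descriptions[task]
    return sum(w[k] * sig[k] for k in sig)
```

Because the signal extractor is shared across tasks, adapting to a new task only requires a new description, which is the intuition behind the framework's generalization claim.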