Neural Vector Spaces for Unsupervised Information Retrieval
We propose the Neural Vector Space Model (NVSM), a method that learns
representations of documents in an unsupervised manner for news article
retrieval. In the NVSM paradigm, we learn low-dimensional representations of
words and documents from scratch using gradient descent and rank documents
according to their similarity with query representations that are composed from
word representations. We show that NVSM performs better at document ranking
than existing latent semantic vector space methods. The addition of NVSM to a
mixture of lexical language models and a state-of-the-art baseline vector space
model yields a statistically significant increase in retrieval effectiveness.
Consequently, NVSM adds a complementary relevance signal. In addition to
semantic matching, we find that NVSM performs well in cases where lexical
matching is needed.
NVSM learns a notion of term specificity directly from the document
collection without feature engineering. We also show that NVSM learns
regularities related to Luhn significance. Finally, we give advice on how to
deploy NVSM in situations where model selection (e.g., cross-validation) is
infeasible. We find that an unsupervised ensemble of multiple models trained
with different hyperparameter values performs better than a single
cross-validated model. Therefore, NVSM can safely be used for ranking
documents without supervised relevance judgments.
Comment: TOIS 201
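The core NVSM ranking step, composing a query representation from word vectors and scoring documents by similarity, can be sketched as follows. Averaging and cosine scoring are simplifications here; the learned projection between the word and document spaces that NVSM trains is omitted.

```python
import numpy as np

def compose_query(word_vecs, query_terms):
    """Compose a query vector by averaging its word vectors (a simplification)."""
    vecs = [word_vecs[t] for t in query_terms if t in word_vecs]
    return np.mean(vecs, axis=0)

def rank_documents(doc_vecs, query_vec):
    """Rank document ids by cosine similarity to the query vector."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {doc_id: cos(v, query_vec) for doc_id, v in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy vectors; in NVSM both spaces are learned from the collection.
word_vecs = {"neural": np.array([1.0, 0.0]), "ranking": np.array([0.0, 1.0])}
doc_vecs = {"d1": np.array([1.0, 1.0]), "d2": np.array([-1.0, 0.0])}
q = compose_query(word_vecs, ["neural", "ranking"])
print(rank_documents(doc_vecs, q))  # d1 ranks above d2
```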
Training Curricula for Open Domain Answer Re-Ranking
In precision-oriented tasks like answer ranking, it is more important to rank
many relevant answers highly than to retrieve all relevant answers. It follows
that a good ranking strategy would be to learn how to identify the easiest
correct answers first (i.e., assign a high ranking score to answers that have
characteristics that usually indicate relevance, and a low ranking score to
those with characteristics that do not), before incorporating more complex
logic to handle difficult cases (e.g., semantic matching or reasoning). In this
work, we apply this idea to the training of neural answer rankers using
curriculum learning. We propose several heuristics to estimate the difficulty
of a given training sample. We show that the proposed heuristics can be used to
build a training curriculum that down-weights difficult samples early in the
training process. As the training process progresses, our approach gradually
shifts to weighting all samples equally, regardless of difficulty. We present a
comprehensive evaluation of our proposed idea on three answer ranking datasets.
Results show that our approach leads to superior performance of two leading
neural ranking architectures, namely BERT and ConvKNRM, using both pointwise
and pairwise losses. When applied to a BERT-based ranker, our method yields up
to a 4% improvement in MRR and a 9% improvement in P@1 (compared to the model
trained without a curriculum). This results in models that can achieve
comparable performance to more expensive state-of-the-art techniques.
Comment: Accepted at SIGIR 2020 (long paper)
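The weighting scheme described above can be sketched as an interpolation between difficulty-based weights early in training and uniform weights later. The linear schedule and the [0, 1] difficulty scale below are illustrative assumptions, not the paper's actual heuristics.

```python
def curriculum_weight(difficulty, step, total_steps):
    """
    Weight for one training sample. difficulty in [0, 1] (1 = hardest).
    Early in training, hard samples are down-weighted; as training
    progresses, all samples converge to weight 1.0 (linear schedule assumed).
    """
    progress = min(step / total_steps, 1.0)
    return (1.0 - difficulty) * (1.0 - progress) + progress
```

Early on, an easy sample (difficulty 0) gets weight 1 while the hardest gets weight 0; at the end of the schedule every sample is weighted equally.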
Efficient Document Re-Ranking for Transformers by Precomputing Term Representations
Deep pretrained transformer networks are effective at various ranking tasks,
such as question answering and ad-hoc document ranking. However, their
computational expense renders them cost-prohibitive in practice. Our proposed
approach, called PreTTR (Precomputing Transformer Term Representations),
considerably reduces the query-time latency of deep transformer networks (up to
a 42x speedup on web document ranking), making these networks more practical to
use in a real-time ranking scenario. Specifically, we precompute part of the
document term representations at indexing time (without a query), and merge
them with the query representation at query time to compute the final ranking
score. Due to the large size of the token representations, we also propose an
effective approach to reduce the storage requirement by training a compression
layer to match attention scores. Our compression technique reduces the
required storage by up to 95% and can be applied without substantial
degradation in ranking performance.
Comment: Accepted at SIGIR 2020 (long paper)
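The split between index time and query time can be sketched as follows. In PreTTR the cached representations come from the first transformer layers and are merged by the remaining layers; here plain term vectors and a max-similarity interaction serve as stand-ins.

```python
import numpy as np

# Index time: cache per-term document representations once, with no query.
def index_document(doc_term_vecs):
    return np.stack(doc_term_vecs)  # shape: (doc_len, dim)

# Query time: merge the cached document matrix with the query representation.
def score(doc_matrix, query_vec):
    # Cheap stand-in interaction: best similarity over document terms.
    return float(np.max(doc_matrix @ query_vec))

cached = index_document([np.array([1.0, 0.0]), np.array([0.0, 1.0])])
print(score(cached, np.array([0.0, 2.0])))  # 2.0
```

The latency win comes from the fact that `index_document` runs once per document offline, so only the cheap merge step remains at query time.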
Context Aware Query Rewriting for Text Rankers using LLM
Query rewriting refers to an established family of approaches that are
applied to underspecified and ambiguous queries to overcome the vocabulary
mismatch problem in document ranking. Queries are typically rewritten during
query processing time for better query modelling for the downstream ranker.
With the advent of large language models (LLMs), there have been initial
investigations into using generative approaches to generate pseudo documents to
tackle this inherent vocabulary gap. In this work, we analyze the utility of
LLMs for improved query rewriting for text ranking tasks. We find that there
are two inherent limitations of using LLMs as query rewriters -- concept drift
when using only queries as prompts and large inference costs during query
processing. We adopt a simple, yet surprisingly effective, approach called
context aware query rewriting (CAR) to leverage the benefits of LLMs for query
understanding. Firstly, we rewrite ambiguous training queries by context-aware
prompting of LLMs, where we use only relevant documents as context. Unlike
existing approaches, we use LLM-based query rewriting only during the training
phase. A ranker is then fine-tuned on the rewritten queries instead of
the original queries during training. In our extensive experiments, we find
that fine-tuning a ranker using rewritten queries offers a significant
improvement of up to 33% on the passage ranking task and up to 28% on the
document ranking task when compared to the baseline performance of using
original queries.
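The context-aware prompting step can be sketched as a simple template that grounds the rewrite in relevant documents. The wording of the template below is an assumption for illustration, not the prompt used in the paper.

```python
def car_prompt(query, relevant_docs, max_docs=2):
    """Build a context-aware rewriting prompt (template is illustrative)."""
    context = "\n".join(f"- {d}" for d in relevant_docs[:max_docs])
    return (
        "Rewrite the query below so it is unambiguous, using only the "
        "relevant documents as context.\n"
        f"Documents:\n{context}\n"
        f"Query: {query}\n"
        "Rewritten query:"
    )
```

Because the rewriting happens only on training queries, no LLM call is needed at query-processing time, which is how the approach avoids the large inference costs noted above.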
Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval
State-of-the-art neural (re)rankers are notoriously data hungry which - given
the lack of large-scale training data in languages other than English - makes
them rarely used in multilingual and cross-lingual retrieval settings. Current
approaches therefore typically transfer rankers trained on English data to
other languages and cross-lingual setups by means of multilingual encoders:
they fine-tune all the parameters of a pretrained massively multilingual
Transformer (MMT, e.g., multilingual BERT) on English relevance judgments and
then deploy it in the target language. In this work, we show that two
parameter-efficient approaches to cross-lingual transfer, namely Sparse
Fine-Tuning Masks (SFTMs) and Adapters, allow for a more lightweight and more
effective zero-shot transfer to multilingual and cross-lingual retrieval tasks.
We first train language adapters (or SFTMs) via Masked Language Modelling and
then train retrieval (i.e., reranking) adapters (SFTMs) on top while keeping
all other parameters fixed. At inference, this modular design allows us to
compose the ranker by applying the task adapter (or SFTM) trained with source
language data together with the language adapter (or SFTM) of a target
language. Besides improved transfer performance, these two approaches offer
faster ranker training, with only a fraction of parameters being updated
compared to full MMT fine-tuning. We benchmark our models on the CLEF-2003
benchmark, showing that our parameter-efficient methods outperform standard
zero-shot transfer with full MMT fine-tuning, while enabling modularity and
reducing training times. Further, we show on the example of Swahili and Somali
that, for low(er)-resource languages, our parameter-efficient neural re-rankers
can improve the ranking of the competitive machine translation-based ranker.
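The modular composition at inference can be sketched as follows: a frozen multilingual encoder, a target-language adapter, and a reranking adapter trained on source-language data are stacked into one scoring function. All module names here are illustrative stand-ins for the real trained components.

```python
def compose_ranker(mmt_encode, task_adapter, language_adapter):
    """
    Compose a zero-shot ranker from independently trained modules:
    a frozen MMT encoder, a reranking (task) adapter trained on the
    source language, and the language adapter of the target language.
    """
    def rank_score(query, doc):
        h = mmt_encode(query, doc)   # frozen multilingual encoder body
        h = language_adapter(h)      # target-language adapter
        return task_adapter(h)       # reranking adapter
    return rank_score
```

Swapping in a different language adapter retargets the same ranker to a new language without retraining the task adapter, which is the modularity benefit the abstract describes.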
Selecting which Dense Retriever to use for Zero-Shot Search
We propose the new problem of choosing which dense retrieval model to use
when searching on a new collection for which no labels are available, i.e. in a
zero-shot setting. Many dense retrieval models are readily available. Each
model, however, exhibits very different search effectiveness -- not just on
the test portion of the datasets on which its dense representations were
learned but, importantly, also across datasets whose data was not used to
learn those representations. This is because dense
retrievers typically require training on a large amount of labeled data to
achieve satisfactory search effectiveness in a specific dataset or domain.
Moreover, effectiveness gains obtained by dense retrievers on datasets for
which they are able to observe labels during training, do not necessarily
generalise to datasets that were not observed during training. Selecting a
model in this setting is, however, a hard problem: through empirical
experimentation we show that methods
inspired by recent work in unsupervised performance evaluation with the
presence of domain shift in the area of computer vision and machine learning
are not effective for choosing highly performing dense retrievers in our setup.
The availability of reliable methods for the selection of dense retrieval
models in zero-shot settings that do not require the collection of labels for
evaluation would streamline the widespread adoption of dense retrieval. This
is therefore an important new problem that we believe the
information retrieval community should consider. Implementations of the
methods, along with raw result files and analysis scripts, are made publicly
available at https://www.github.com/anonymized
Query Resolution for Conversational Search with Limited Supervision
In this work we focus on multi-turn passage retrieval as a crucial component
of conversational search. One of the key challenges in multi-turn passage
retrieval comes from the fact that the current turn query is often
underspecified due to zero anaphora, topic change, or topic return. Context
from the conversational history can be used to arrive at a better expression of
the current turn query, defined as the task of query resolution. In this paper,
we model the query resolution task as a binary term classification problem: for
each term appearing in the previous turns of the conversation decide whether to
add it to the current turn query or not. We propose QuReTeC (Query Resolution
by Term Classification), a neural query resolution model based on bidirectional
transformers. We propose a distant supervision method to automatically generate
training data by using query-passage relevance labels. Such labels are often
readily available in a collection either as human annotations or inferred from
user interactions. We show that QuReTeC outperforms state-of-the-art models,
and furthermore, that our distant supervision method can be used to
substantially reduce the amount of human-curated data required to train
QuReTeC. We incorporate QuReTeC in a multi-turn, multi-stage passage retrieval
architecture and demonstrate its effectiveness on the TREC CAsT dataset.
Comment: SIGIR 2020 full conference paper
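The binary term-classification view of query resolution can be sketched as follows. The `classify` argument stands in for the trained bidirectional-transformer classifier; the whitespace tokenization is a simplification.

```python
def resolve_query(current_query, history_terms, classify):
    """
    Query resolution as binary term classification: for each term from the
    previous conversation turns, decide whether to add it to the current
    turn query. `classify` is a stand-in for the trained QuReTeC model.
    """
    current_terms = current_query.split()
    added = [t for t in history_terms
             if t not in current_terms and classify(t, current_query)]
    return " ".join(current_terms + added)
```

With a classifier that flags "amsterdam" as relevant, the underspecified query "where is the museum" would be resolved to "where is the museum amsterdam".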