Constructing Tree-based Index for Efficient and Effective Dense Retrieval
Recent studies have shown that Dense Retrieval (DR) techniques can
significantly improve the performance of first-stage retrieval in IR systems.
Despite its empirical effectiveness, the application of DR is still limited. In
contrast to statistical retrieval models that rely on highly efficient inverted-index
solutions, DR models produce dense embeddings that are difficult to pre-process
with most existing search indexing systems. To avoid the
prohibitive cost of brute-force search, the Approximate Nearest Neighbor (ANN)
algorithm and corresponding indexes are widely applied to speed up the
inference process of DR models. Unfortunately, while ANN can improve the
efficiency of DR models, it usually comes at a significant cost in retrieval
performance.
To solve this issue, we propose JTR, which stands for Joint optimization of
TRee-based index and query encoding. Specifically, we design a new unified
contrastive learning loss to train the tree-based index and the query encoder in an
end-to-end manner. A tree-based negative sampling strategy is applied so that
the tree satisfies the maximum heap property, which in turn supports effective
beam search. Moreover, we treat cluster assignment as an optimization
problem when updating the tree-based index, which allows overlapping clusters. We
evaluate JTR on numerous popular retrieval benchmarks. Experimental results
show that JTR achieves better retrieval performance while retaining high system
efficiency compared with widely-adopted baselines. It provides a potential
solution to balance efficiency and effectiveness in neural retrieval system
designs.
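To illustrate the retrieval step that the maximum heap property is designed to support, the following is a minimal sketch of beam search over a tree of cluster nodes. The node structure, field names, and inner-product scoring are illustrative assumptions rather than the authors' implementation, and a balanced tree with documents only at the leaves is assumed.

```python
# Minimal sketch of beam search over a tree-based index (illustrative only).
# Assumptions: a balanced tree, centroid embeddings at every node, documents
# stored only at leaves, and inner-product scoring against the query.
import numpy as np

class TreeNode:
    def __init__(self, centroid, children=None, doc_ids=None):
        self.centroid = centroid        # cluster embedding for this node
        self.children = children or []  # child nodes (empty at leaves)
        self.doc_ids = doc_ids or []    # candidate documents (leaves only)

def beam_search(root, query_emb, beam_size=4):
    """Descend level by level, keeping the beam_size best-scoring nodes."""
    frontier = [root]
    while frontier and frontier[0].children:              # stop at the leaf level
        candidates = [c for node in frontier for c in node.children]
        scores = [float(query_emb @ c.centroid) for c in candidates]
        top = np.argsort(scores)[::-1][:beam_size]
        frontier = [candidates[i] for i in top]
    # With the maximum heap property, high-scoring leaves are reachable through
    # high-scoring ancestors, so this greedy pruning loses little recall.
    return [doc_id for leaf in frontier for doc_id in leaf.doc_ids]
```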
Dense Text Retrieval based on Pretrained Language Models: A Survey
Text retrieval is a long-standing research topic in information seeking,
where a system is required to return relevant information resources in response
to users' natural-language queries. From classic retrieval methods to learning-based
ranking functions, the underlying retrieval models have continually
evolved alongside ongoing technical innovation. To design effective
retrieval models, a key point lies in how to learn the text representation and
model the relevance matching. The recent success of pretrained language models
(PLMs) sheds light on developing more capable text retrieval approaches by
leveraging the excellent modeling capacity of PLMs. With powerful PLMs, we can
effectively learn the representations of queries and texts in the latent
representation space, and further construct the semantic matching function
between the dense vectors for relevance modeling. Such a retrieval approach is
referred to as dense retrieval, since it employs dense vectors (a.k.a.,
embeddings) to represent the texts. Considering the rapid progress on dense
retrieval, in this survey we systematically review recent advances in
PLM-based dense retrieval. Unlike previous surveys on dense retrieval,
we take a new perspective to organize the related work by four major aspects,
including architecture, training, indexing and integration, and summarize the
mainstream techniques for each aspect. We thoroughly survey the literature, and
include 300+ related reference papers on dense retrieval. To support our
survey, we create a website that provides useful resources, and we release a code
repository and toolkit for implementing dense retrieval models. This survey aims
to provide a comprehensive, practical reference on the major progress
in dense text retrieval.
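As a concrete anchor for the paradigm the survey covers, here is a minimal sketch of bi-encoder-style dense retrieval: queries and texts are mapped into a shared embedding space, and relevance is modeled as a similarity between the dense vectors. The random placeholder encoder stands in for a PLM-based encoder and is purely an assumption for illustration.

```python
# Minimal sketch of dense retrieval: encode texts into a shared vector space,
# then score relevance as a dot product between query and document embeddings.
import numpy as np

def encode(texts, dim=768, seed=0):
    """Placeholder encoder returning L2-normalized vectors (stands in for a PLM)."""
    rng = np.random.default_rng(seed)
    vecs = rng.normal(size=(len(texts), dim)).astype(np.float32)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

corpus = ["dense retrieval survey", "inverted index basics", "contrastive learning"]
doc_embs = encode(corpus)                      # indexed offline
query_emb = encode(["what is dense retrieval"])[0]

scores = doc_embs @ query_emb                  # semantic matching function
ranking = np.argsort(scores)[::-1]             # documents ranked by relevance
print([corpus[i] for i in ranking])
```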
EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval
Dense embedding-based retrieval is now the industry standard for semantic
search and ranking problems, like obtaining relevant web documents for a given
query. Such techniques use a two-stage process: (a) contrastive learning to
train a dual encoder to embed both the query and documents and (b) approximate
nearest neighbor search (ANNS) for finding similar documents for a given query.
These two stages are disjoint; the learned embeddings might be ill-suited for
the ANNS method and vice-versa, leading to suboptimal performance. In this
work, we propose End-to-end Hierarchical Indexing -- EHI -- that jointly learns
both the embeddings and the ANNS structure to optimize retrieval performance.
EHI uses a standard dual encoder model for embedding queries and documents
while learning an inverted file index (IVF) style tree structure for efficient
ANNS. To ensure stable and efficient learning of the discrete tree-based ANNS
structure, EHI introduces the notion of dense path embedding that captures the
position of a query/document in the tree. We demonstrate the effectiveness of
EHI on several benchmarks, including the de facto industry-standard MS MARCO (Dev
set and TREC DL19) datasets. For example, with the same compute budget, EHI
outperforms the state of the art (SOTA) by 0.6% (MRR@10) on the MS MARCO dev set and
by 4.2% (nDCG@10) on the TREC DL19 benchmark.
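For context, the disjoint pipeline that EHI aims to replace can be sketched as follows: a frozen dual encoder produces embeddings, and a conventional (non-learned) IVF index is built on top of them with FAISS. The dimensions, cluster count, and random vectors are illustrative assumptions; EHI's contribution is to learn this kind of partitioning jointly with the encoder rather than fitting it post hoc.

```python
# Sketch of the conventional two-stage pipeline (encoder + post-hoc IVF index)
# that end-to-end approaches like EHI aim to improve on. Illustrative values only.
import numpy as np
import faiss

d, n_docs, nlist = 128, 10_000, 64
doc_embs = np.random.rand(n_docs, d).astype(np.float32)    # stand-in for dual-encoder output
query_embs = np.random.rand(5, d).astype(np.float32)

quantizer = faiss.IndexFlatIP(d)                           # coarse centroids, inner product
index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_INNER_PRODUCT)
index.train(doc_embs)                                      # k-means over document embeddings
index.add(doc_embs)                                        # assign documents to clusters

index.nprobe = 8                                           # clusters visited per query
scores, doc_ids = index.search(query_embs, 10)             # top-10 documents per query
```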
Efficient Neural Ranking using Forward Indexes and Lightweight Encoders
Dual-encoder-based dense retrieval models have become the standard in IR.
They employ large Transformer-based language models, which are notoriously
inefficient in terms of resources and latency. We propose Fast-Forward indexes
-- vector forward indexes which exploit the semantic matching capabilities of
dual-encoder models for efficient and effective re-ranking. Our framework
enables re-ranking at very high retrieval depths and combines the merits of
both lexical and semantic matching via score interpolation. Furthermore, in
order to mitigate the limitations of dual-encoders, we tackle two main
challenges: Firstly, we improve computational efficiency by either
pre-computing representations, avoiding unnecessary computations altogether, or
reducing the complexity of encoders. This allows us to considerably improve
ranking efficiency and reduce latency. Secondly, we optimize the memory footprint and
maintenance cost of indexes; we propose two complementary techniques to reduce
the index size and show that, by dynamically dropping irrelevant document
tokens, the index maintenance efficiency can be improved substantially. We
evaluate our approach to demonstrate the effectiveness and efficiency of Fast-Forward
indexes -- our method has low latency and achieves competitive results without
the need for hardware acceleration, such as GPUs.
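The core re-ranking idea, interpolating first-stage lexical scores with semantic scores read from a pre-computed forward index, can be sketched as follows. The dictionary-based forward index, the interpolation weight alpha, and the toy values are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch of interpolation-based re-ranking with a vector forward index:
# document embeddings are pre-computed offline, so only the query is encoded
# at query time. Structure and parameter values are illustrative assumptions.
import numpy as np

forward_index = {                              # doc_id -> pre-computed document embedding
    "d1": np.array([0.1, 0.9, 0.3]),
    "d2": np.array([0.8, 0.2, 0.5]),
}

def rerank(query_emb, lexical_scores, alpha=0.5):
    """Interpolate lexical (e.g. BM25) and semantic scores for first-stage candidates."""
    # In practice the two score distributions would be normalized to a common scale.
    reranked = {}
    for doc_id, lex in lexical_scores.items():
        sem = float(query_emb @ forward_index[doc_id])   # no document encoding at query time
        reranked[doc_id] = alpha * lex + (1 - alpha) * sem
    return sorted(reranked.items(), key=lambda kv: kv[1], reverse=True)

query_emb = np.array([0.2, 0.7, 0.1])
print(rerank(query_emb, {"d1": 1.0, "d2": 0.6}))         # lexical scores assumed pre-normalized
```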
Vector Search with OpenAI Embeddings: Lucene Is All You Need
We provide a reproducible, end-to-end demonstration of vector search with
OpenAI embeddings using Lucene on the popular MS MARCO passage ranking test
collection. The main goal of our work is to challenge the prevailing narrative
that a dedicated vector store is necessary to take advantage of recent advances
in deep neural networks as applied to search. Quite the contrary, we show that
hierarchical navigable small-world network (HNSW) indexes in Lucene are
adequate to provide vector search capabilities in a standard bi-encoder
architecture. This suggests that, from a simple cost-benefit analysis, there
does not appear to be a compelling reason to introduce a dedicated vector store
into a modern "AI stack" for search, since such applications have already
received substantial investments in existing, widely deployed infrastructure
- …
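Lucene exposes its HNSW support through a Java API; as a language-consistent illustration of the same index structure, the sketch below builds an HNSW index with the hnswlib Python library over 1536-dimensional vectors (the size of OpenAI's ada-style embeddings). The library choice, parameters, and random vectors are assumptions for demonstration, not the paper's setup.

```python
# Illustrative HNSW indexing and search with hnswlib (not Lucene itself);
# parameters and random vectors are assumptions for demonstration.
import numpy as np
import hnswlib

dim, n_docs = 1536, 10_000                     # 1536 matches OpenAI ada-style embeddings
doc_embs = np.random.rand(n_docs, dim).astype(np.float32)
query_embs = np.random.rand(3, dim).astype(np.float32)

index = hnswlib.Index(space="ip", dim=dim)     # inner-product similarity
index.init_index(max_elements=n_docs, ef_construction=200, M=16)
index.add_items(doc_embs, np.arange(n_docs))   # graph construction over document vectors
index.set_ef(64)                               # search-time accuracy/speed trade-off

labels, distances = index.knn_query(query_embs, k=10)   # approximate top-10 neighbors
```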