Survey of Vector Database Management Systems
There are now over 20 commercial vector database management systems (VDBMSs),
all produced within the past five years. But embedding-based retrieval has been
studied for over ten years, and similarity search for over half a century.
Driving this shift from algorithms to systems are new data-intensive
applications, notably large language models, that demand vast stores of
unstructured data coupled with reliable, secure, fast, and scalable query
processing capability. A variety of new data management techniques now exist
for addressing these needs; however, there is no comprehensive survey that
thoroughly reviews these techniques and systems. We start by identifying five
main obstacles to vector data management, namely vagueness of semantic
similarity, large size of vectors, high cost of similarity comparison, lack of
natural partitioning that can be used for indexing, and difficulty of
efficiently answering hybrid queries that require both attributes and vectors.
Overcoming these obstacles has led to new approaches to query processing,
storage and indexing, and query optimization and execution. For query
processing, a variety of similarity scores and query types are now well
understood; for storage and indexing, techniques include vector compression,
namely quantization, and partitioning based on randomization, learned
partitioning, and navigable partitioning; for query optimization and execution,
we describe new operators for hybrid queries, as well as techniques for plan
enumeration, plan selection, and hardware accelerated execution. These
techniques lead to a variety of VDBMSs across a spectrum of design and runtime
characteristics, including native systems specialized for vectors and extended
systems that incorporate vector capabilities into existing systems. We then
discuss benchmarks, and finally we outline research challenges and point the
direction for future work.
Comment: 25 pages
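To make the compression idea concrete, here is a minimal sketch of product quantization, one of the quantization techniques such surveys cover. It is illustrative only: the corpus is random, the subspace and centroid counts are toy values, and NumPy/scikit-learn are assumed.

```python
# A minimal sketch of product quantization (PQ). Parameters are toy values;
# real systems tune subspace and centroid counts per dataset.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
vectors = rng.standard_normal((1000, 64)).astype(np.float32)  # toy corpus

M, K = 8, 16             # subspaces and centroids per subspace (small for demo)
D = vectors.shape[1]
sub = D // M

# Train one codebook per subspace and encode each vector as M small codes.
codebooks, codes = [], np.empty((len(vectors), M), dtype=np.uint8)
for m in range(M):
    block = vectors[:, m * sub:(m + 1) * sub]
    km = KMeans(n_clusters=K, n_init=4, random_state=0).fit(block)
    codebooks.append(km.cluster_centers_)
    codes[:, m] = km.labels_

# Asymmetric distance: compare a raw query against compressed codes via
# per-subspace lookup tables, without decompressing the corpus.
def pq_distances(query):
    tables = np.stack([
        ((query[m * sub:(m + 1) * sub] - codebooks[m]) ** 2).sum(axis=1)
        for m in range(M)
    ])                                   # shape (M, K)
    return tables[np.arange(M)[:, None], codes.T].sum(axis=0)

query = rng.standard_normal(D).astype(np.float32)
print(np.argsort(pq_distances(query))[:5])  # approximate top-5 neighbors
```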
Vector Search with OpenAI Embeddings: Lucene Is All You Need
We provide a reproducible, end-to-end demonstration of vector search with
OpenAI embeddings using Lucene on the popular MS MARCO passage ranking test
collection. The main goal of our work is to challenge the prevailing narrative
that a dedicated vector store is necessary to take advantage of recent advances
in deep neural networks as applied to search. Quite the contrary, we show that
hierarchical navigable small-world network (HNSW) indexes in Lucene are
adequate to provide vector search capabilities in a standard bi-encoder
architecture. This suggests that, from a simple cost-benefit analysis, there
does not appear to be a compelling reason to introduce a dedicated vector store
into a modern "AI stack" for search, since such applications have already
received substantial investments in existing, widely deployed infrastructure.
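The bi-encoder-plus-HNSW pattern the paper argues is sufficient can be sketched in a few lines. Lucene's own HNSW support is Java-side and not shown here; the hnswlib library stands in below for the same index structure, and the document embeddings are assumed to be precomputed (e.g., via OpenAI's embedding API). Names and parameters are illustrative.

```python
# A minimal sketch of the bi-encoder + HNSW pattern, using hnswlib as a
# stand-in for Lucene's HNSW index. Embeddings are assumed precomputed.
import numpy as np
import hnswlib

dim, n_docs = 1536, 10_000    # 1536 matches OpenAI's text-embedding-ada-002
doc_vecs = np.random.rand(n_docs, dim).astype(np.float32)  # stand-in corpus

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=n_docs, M=16, ef_construction=200)
index.add_items(doc_vecs, ids=np.arange(n_docs))

index.set_ef(64)              # search-time accuracy/speed knob
query_vec = np.random.rand(dim).astype(np.float32)  # embed queries the same way
labels, distances = index.knn_query(query_vec, k=10)
print(labels[0], distances[0])
```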
Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search
Retrieval pipelines commonly rely on a term-based search to obtain candidate
records, which are subsequently re-ranked. Some candidates are missed by this
approach, e.g., due to a vocabulary mismatch. We address this issue by
replacing the term-based search with a generic k-NN retrieval algorithm, where
a similarity function can take into account subtle term associations. While an
exact brute-force k-NN search using this similarity function is slow, we
demonstrate that an approximate algorithm can be nearly two orders of magnitude
faster at the expense of only a small loss in accuracy. A retrieval pipeline
using an approximate k-NN search can be more effective and efficient than the
term-based pipeline. This opens up new possibilities for designing effective
retrieval pipelines. Our software (including data-generating code) and
derivative data based on the Stack Overflow collection are available online.
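The accuracy cost of approximation is conventionally measured as recall against an exact baseline. Below is a minimal sketch of that measurement, assuming NumPy: a brute-force k-NN baseline and a recall@k check. Plain cosine similarity stands in for the paper's richer similarity functions, and the approximate result set is a placeholder where an ANN index (the paper uses NMSLIB) would plug in.

```python
# A minimal sketch of the exact-vs-approximate trade-off: an exact
# brute-force k-NN baseline plus recall@k against an approximate result set.
import numpy as np

def brute_force_knn(corpus, query, k=10):
    # Exact k-NN by cosine similarity: O(n * d) per query, hence slow.
    sims = corpus @ query / (
        np.linalg.norm(corpus, axis=1) * np.linalg.norm(query) + 1e-12
    )
    return np.argsort(-sims)[:k]

def recall_at_k(exact_ids, approx_ids):
    # Fraction of the true top-k recovered by the approximate search.
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)

rng = np.random.default_rng(0)
corpus = rng.standard_normal((50_000, 128)).astype(np.float32)
query = rng.standard_normal(128).astype(np.float32)

exact = brute_force_knn(corpus, query)
approx = exact[:9]                 # placeholder: pretend the ANN found 9/10
print(recall_at_k(exact, approx))  # 0.9
```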
Vector database management systems: Fundamental concepts, use-cases, and current challenges
Vector database management systems have emerged as an important component in
modern data management, driven by the growing need to computationally
describe rich data such as text, images, and video in various
domains such as recommender systems, similarity search, and chatbots. These
data descriptions are captured as numerical vectors that are computationally
inexpensive to store and compare. However, the unique characteristics of
vectorized data, including high dimensionality and sparsity, demand specialized
solutions for efficient storage, retrieval, and processing. This study provides
an accessible introduction to the fundamental concepts, use-cases, and current
challenges associated with vector database management systems, offering an
overview for researchers and practitioners seeking to explore this burgeoning
technology aimed at facilitating effective vector data management.
Comment: 12 pages, 5 figures
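One characteristic the paper calls out, high dimensionality and sparsity, has direct storage implications. A minimal sketch, assuming NumPy/SciPy, with purely illustrative sizes:

```python
# Dense vs. sparse vector storage: a dense embedding matrix stores every
# coordinate, while a sparse (CSR) matrix stores only nonzero entries.
import numpy as np
from scipy import sparse

dim = 30_000                       # e.g., a vocabulary-sized sparse space
dense = np.random.rand(1000, 768).astype(np.float32)  # typical text embeddings
rows = sparse.random(1000, dim, density=0.001, format="csr", dtype=np.float32)

print("dense :", dense.nbytes, "bytes")   # 1000 * 768 * 4
print("sparse:", rows.data.nbytes + rows.indices.nbytes
      + rows.indptr.nbytes, "bytes")      # only nonzeros plus index arrays
```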
Serving Deep Learning Model in Relational Databases
Serving deep learning (DL) models on relational data has become a critical
requirement across diverse commercial and scientific domains, sparking growing
interest recently. In this visionary paper, we embark on a comprehensive
exploration of representative architectures to address the requirement. We
highlight three pivotal paradigms: The state-of-the-art DL-Centric
architecture offloads DL computations to dedicated DL frameworks. The
potential UDF-Centric architecture encapsulates one or more tensor
computations into User Defined Functions (UDFs) within the database system.
The potential Relation-Centric architecture aims to represent a large-scale tensor
computation through relational operators. While each of these architectures
demonstrates promise in specific use scenarios, we identify urgent requirements
for seamless integration of these architectures and the middle ground between
these architectures. We delve into the gaps that impede the integration and
explore innovative strategies to close them. We present a pathway to establish
a novel database system for enabling a broad class of data-intensive DL
inference applications.
Comment: Authors are ordered alphabetically; Jia Zou is the corresponding author
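The UDF-Centric paradigm is easy to sketch with an embedded database. The example below registers a Python function as a SQLite UDF and invokes a toy linear "model" per row; the model and schema are hypothetical stand-ins for a real DL model and table.

```python
# A minimal sketch of the UDF-Centric paradigm: wrap a tiny model's
# inference in a user-defined function so the database invokes it per row.
import json
import sqlite3
import numpy as np

weights = np.array([0.4, -0.2, 0.1], dtype=np.float32)  # toy "model"

def predict(features_json):
    # Deserialize the row's feature vector and run inference in-process.
    x = np.array(json.loads(features_json), dtype=np.float32)
    return float(weights @ x)

conn = sqlite3.connect(":memory:")
conn.create_function("predict", 1, predict)   # register the UDF
conn.execute("CREATE TABLE t (id INTEGER, features TEXT)")
conn.execute("INSERT INTO t VALUES (1, '[1.0, 2.0, 3.0]')")

# Inference is now expressible as ordinary SQL over relational data.
for row in conn.execute("SELECT id, predict(features) FROM t"):
    print(row)
```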
CAPS: A Practical Partition Index for Filtered Similarity Search
With the surging popularity of approximate near-neighbor search (ANNS),
driven by advances in neural representation learning, the ability to serve
queries accompanied by a set of constraints has become an area of intense
interest. While the community has recently proposed several algorithms for
constrained ANNS, almost all of these methods focus on integration with
graph-based indexes, the predominant class of algorithms achieving
state-of-the-art performance in latency-recall tradeoffs. In this work, we take
a different approach and focus on developing a constrained ANNS algorithm via
space partitioning as opposed to graphs. To that end, we introduce Constrained
Approximate Partitioned Search (CAPS), an index for ANNS with filters via space
partitions that not only retains the benefits of a partition-based algorithm
but also outperforms state-of-the-art graph-based constrained search techniques
in recall-latency tradeoffs, with only 10% of the index size.
Comment: 14 pages
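A partition-based filtered search can be sketched in a few lines. The snippet below is not the CAPS algorithm itself, only the general pattern it builds on: cluster the corpus, probe the partitions nearest the query, and apply the attribute filter within the probed partitions. It assumes NumPy/scikit-learn and uses toy data.

```python
# A minimal sketch of filtered similarity search over space partitions,
# in the spirit of (but much simpler than) CAPS.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
vectors = rng.standard_normal((10_000, 64)).astype(np.float32)
attrs = rng.integers(0, 5, size=10_000)     # one categorical attribute

km = KMeans(n_clusters=32, n_init=4, random_state=0).fit(vectors)
parts = km.labels_                           # partition id per vector

def filtered_search(query, want_attr, k=10, nprobe=4):
    # Probe the nprobe partitions whose centroids are closest to the query,
    # then rank only the candidates that satisfy the attribute constraint.
    d2c = ((km.cluster_centers_ - query) ** 2).sum(axis=1)
    probe = np.argsort(d2c)[:nprobe]
    cand = np.flatnonzero(np.isin(parts, probe) & (attrs == want_attr))
    dists = ((vectors[cand] - query) ** 2).sum(axis=1)
    return cand[np.argsort(dists)[:k]]

query = rng.standard_normal(64).astype(np.float32)
print(filtered_search(query, want_attr=2))
```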