1 research outputs found
BrePartition: Optimized High-Dimensional kNN Search with Bregman Distances
Bregman distances (also known as Bregman divergences) are widely used in
machine learning, speech recognition and signal processing, and kNN searches
with Bregman distances have become increasingly important with the rapid
advances of multimedia applications. Data in multimedia applications such as
images and videos are commonly transformed into space of hundreds of
dimensions. Such high-dimensional space has posed significant challenges for
existing kNN search algorithms with Bregman distances, which could only handle
data of medium dimensionality (typically less than 100). This paper addresses
the urgent problem of high-dimensional kNN search with Bregman distances. We
propose a novel partition-filter-refinement framework. Specifically, we propose
an optimized dimensionality partitioning scheme to solve several non-trivial
issues. First, an effective bound from each partitioned subspace to obtain
exact kNN results is derived. Second, we conduct an in-depth analysis of the
optimized number of partitions and devise an effective strategy for
partitioning. Third, we design an efficient integrated index structure for all
the subspaces together to accelerate the search processing. Moreover, we extend
our exact solution to an approximate version by a trade-off between the
accuracy and efficiency. Experimental results on four real-world datasets and
two synthetic datasets show the clear advantage of our method in comparison to
state-of-the-art algorithms