Hybrid LSH: Faster Near Neighbors Reporting in High-dimensional Space
We study the $r$-near neighbors reporting problem ($r$-NN), i.e., reporting
\emph{all} points in a high-dimensional point set that lie within a radius
$r$ of a given query point. Our approach builds on the
locality-sensitive hashing (LSH) framework due to its appealing asymptotic
sublinear query time for near neighbor search problems in high-dimensional
space. A bottleneck of the traditional LSH scheme for solving $r$-NN is that
its performance is sensitive to data and query-dependent parameters. On
datasets whose data distributions have diverse local density patterns, LSH with
inappropriate tuning parameters can sometimes be outperformed by a simple
linear search.
In this paper, we introduce a hybrid search strategy between LSH-based search
and linear search for $r$-NN in high-dimensional space. By integrating an
auxiliary data structure into LSH hash tables, we can efficiently estimate the
computational cost of LSH-based search for a given query regardless of the data
distribution. This means that we are able to choose the appropriate search
strategy between LSH-based search and linear search to achieve better
performance. Moreover, the integrated data structure is time efficient and fits
well with many recent state-of-the-art LSH-based approaches. Our experiments on
real-world datasets show that the hybrid search approach outperforms (or is
comparable to) both LSH-based search and linear search for a wide range of
search radii and data distributions in high-dimensional space. Comment: Accepted as a short paper in EDBT 201
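The hybrid strategy described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the auxiliary data structure or the cost model, so everything here is an assumption made for illustration: the class name `HybridLSH`, the random-projection hash family, the parameters `num_tables`, `hashes_per_table` and `width`, and a cost estimate that counts bucket collisions plus distinct candidates exactly (a stand-in for whatever estimator the paper integrates into the hash tables).

```python
import math
import random


class HybridLSH:
    """Toy hybrid of LSH-based search and linear search (illustrative only)."""

    def __init__(self, points, radius, num_tables=8, hashes_per_table=4,
                 width=4.0, seed=0):
        rng = random.Random(seed)
        self.points = points
        self.radius = radius
        self.width = width
        dim = len(points[0])
        # One list of (direction, offset) pairs per hash table.
        self.funcs = [
            [([rng.gauss(0, 1) for _ in range(dim)], rng.uniform(0, width))
             for _ in range(hashes_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = []
        for funcs in self.funcs:
            table = {}
            for idx, p in enumerate(points):
                table.setdefault(self._key(funcs, p), []).append(idx)
            self.tables.append(table)

    def _key(self, funcs, p):
        # Concatenate quantized random projections into one bucket key.
        return tuple(
            math.floor((sum(a_i * x_i for a_i, x_i in zip(a, p)) + b) / self.width)
            for a, b in funcs
        )

    def _buckets(self, q):
        return [table.get(self._key(funcs, q), [])
                for funcs, table in zip(self.funcs, self.tables)]

    def estimate_lsh_cost(self, q):
        # Assumed cost model: bucket entries to scan plus distinct candidates
        # to verify. Distinct candidates are counted exactly here for clarity.
        buckets = self._buckets(q)
        collisions = sum(len(b) for b in buckets)
        distinct = len(set().union(*buckets))
        return collisions + distinct

    def query(self, q):
        n = len(self.points)
        if self.estimate_lsh_cost(q) >= n:
            strategy, candidates = "linear", range(n)
        else:
            strategy, candidates = "lsh", set().union(*self._buckets(q))
        hits = sorted(i for i in candidates
                      if math.dist(self.points[i], q) <= self.radius)
        return strategy, hits
```

On a query that falls in a dense region, most tables return large buckets, the estimated cost exceeds the number of points, and the sketch falls back to a linear scan. On a sparse region it verifies only the colliding candidates, which may miss some true neighbors, as with any LSH-based search.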
Orthogonal Range Reporting and Rectangle Stabbing for Fat Rectangles
In this paper we study two geometric data structure problems in the special
case when input objects or queries are fat rectangles. We show that in this
case a significant improvement compared to the general case can be achieved.
We describe data structures that answer two- and three-dimensional orthogonal
range reporting queries in the case when the query range is a \emph{fat}
rectangle. Our two-dimensional data structure uses words and supports
queries in time, where is the number of points in the
data structure, is the size of the universe and is the number of points
in the query range. Our three-dimensional data structure needs
words of space and answers queries in time. We also consider the rectangle stabbing problem on a set of
three-dimensional fat rectangles. Our data structure uses space and
answers stabbing queries in time. Comment: extended version of a WADS'19 paper
Optimal Color Range Reporting in One Dimension
Color (or categorical) range reporting is a variant of the orthogonal range
reporting problem in which every point in the input is assigned a \emph{color}.
While the answer to an orthogonal point reporting query contains all points in
the query range, the answer to a color reporting query contains only the
distinct colors of points in that range. In this paper we describe an O(N)-space data
structure that answers one-dimensional color reporting queries in optimal
time, where is the number of colors in the answer and is the
number of points in the data structure. Our result can also be dynamized and
extended to the external memory model.
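The difference between point reporting and color reporting in one dimension can be made concrete with a brute-force baseline. This is not the data structure from the abstract: it scans every point in the query range, so its query time grows with the number of points in the range rather than the number of colors. The class name `ColoredPoints` and its methods are hypothetical names chosen for this sketch.

```python
from bisect import bisect_left, bisect_right


class ColoredPoints:
    """Naive one-dimensional baseline over (coordinate, color) pairs."""

    def __init__(self, points):
        # Sort once by coordinate so each query can binary-search its range.
        self.points = sorted(points, key=lambda pc: pc[0])
        self.coords = [x for x, _ in self.points]

    def report_points(self, lo, hi):
        # Point reporting: every point with lo <= coordinate <= hi.
        start = bisect_left(self.coords, lo)
        end = bisect_right(self.coords, hi)
        return self.points[start:end]

    def report_colors(self, lo, hi):
        # Color reporting: each color in the range once, in first-seen order.
        seen = {}
        for _, color in self.report_points(lo, hi):
            seen.setdefault(color, None)
        return list(seen)
```

For example, if a range holds three points colored red, blue and red, `report_points` returns three pairs while `report_colors` returns only the two distinct colors.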