Adaptive Range Counting and Other Frequency-Based Range Query Problems
We consider variations of range searching in which, given a query range, our goal is to compute some function based on frequencies of points that lie in the range. The most basic such computation involves counting the number of points in a query range. Data structures that compute this function solve the well-studied range counting problem. We consider adaptive and approximate data structures for the 2-D orthogonal range counting problem under the w-bit word RAM model. The query time of an adaptive range counting data structure is sensitive to k, the number of points being counted. We give an adaptive data structure that requires O(n loglog n) space and O(loglog n + log_w k) query time. Non-adaptive data structures on the other hand require Ω(log_w n) query time (Pătraşcu, 2007). Our specific bounds are interesting for two reasons. First, when k=O(1), our bounds match the state of the art for the 2-D orthogonal range emptiness problem (Chan et al., 2011). Second, when k=Θ(n), our data structure is tight to the aforementioned Ω(log_w n) query time lower bound.
We also give approximate data structures for 2-D orthogonal range counting whose bounds match the state of the art for the 2-D orthogonal range emptiness problem. Our first data structure requires O(n loglog n) space and O(loglog n) query time. Our second data structure requires O(n) space and O(log^ε n) query time for any fixed constant ε>0. These data structures compute an approximation k' such that (1-δ)k≤k'≤(1+δ)k for any fixed constant δ>0.
The range selection query problem in an array involves finding the kth lowest element in a given subarray. Range selection in an array is very closely related to 3-sided 2-D orthogonal range counting. An extension of our technique for 3-sided 2-D range counting yields an efficient solution to adaptive range selection in an array. In particular, we present an adaptive data structure that requires O(n) space and O(log_w k) query time, exactly matching a recent lower bound (Jørgensen and Larsen, 2011).
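As a point of reference for the definition above, here is a minimal brute-force sketch of range selection and of its link to 3-sided counting. It is not the adaptive data structure from the abstract; the function names `range_select` and `count_at_most` are hypothetical, and indices are assumed to be 0-based and inclusive.

```python
def count_at_most(a, i, j, y):
    """3-sided count: number of positions p in [i, j] with a[p] <= y."""
    return sum(1 for v in a[i:j + 1] if v <= y)


def range_select(a, i, j, k):
    """Return the k-th lowest element (1-based k) of a[i..j], inclusive."""
    if not (0 <= i <= j < len(a)):
        raise IndexError("invalid subarray bounds")
    if not (1 <= k <= j - i + 1):
        raise ValueError("k is out of range for this subarray")
    # Sorting costs O(m log m) for a subarray of length m; a real data
    # structure answers the query without touching every element.
    return sorted(a[i:j + 1])[k - 1]
```

Viewing each entry as the point (index, value), the k-th lowest element of a[i..j] is the smallest value y for which `count_at_most(a, i, j, y)` is at least k, which is why the two problems are so closely related.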
We next consider a variety of frequency-based range query problems in arrays. We give efficient data structures for the range mode and least frequent element query problems and also exhibit the hardness of these problems by reducing Boolean matrix multiplication to the construction and use of a range mode or least frequent element data structure. We also give data structures for the range α-majority and α-minority query problems. An α-majority is an element whose frequency in a subarray is greater than an α fraction of the size of the subarray; any other element is an α-minority. Surprisingly, geometric insights prove to be useful even in the design of our 1-D range α-majority and α-minority data structures.
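The α-majority definition can be made concrete with a short brute-force sketch. It scans the whole subarray, so it only illustrates the query semantics, not the data structures from the abstract. The function names are hypothetical, and restricting α-minorities to elements that actually occur in the subarray is an assumption made here for illustration.

```python
from collections import Counter


def alpha_majorities(a, i, j, alpha):
    """Elements whose frequency in a[i..j] exceeds alpha * (j - i + 1)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    window = a[i:j + 1]
    threshold = alpha * len(window)
    counts = Counter(window)
    return sorted(x for x, c in counts.items() if c > threshold)


def alpha_minorities(a, i, j, alpha):
    """Elements present in a[i..j] that are not alpha-majorities (assumed reading)."""
    majorities = set(alpha_majorities(a, i, j, alpha))
    return sorted(set(a[i:j + 1]) - majorities)
```

Since every α-majority occupies more than an α fraction of the subarray, at most ⌊1/α⌋ distinct elements can be returned by `alpha_majorities`.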
On Geometric Range Searching, Approximate Counting and Depth Problems
In this thesis we deal with problems connected to range searching,
which is one of the central areas of computational geometry.
The dominant problems in this area are
halfspace range searching, simplex range searching and orthogonal range searching and
research into these problems has spanned decades.
For many range searching problems, the best possible
data structures cannot offer fast (i.e., polylogarithmic) query
times if we limit ourselves to near linear storage.
Even worse, it is conjectured (and proved in some cases)
that only very small improvements to these might be possible.
This inefficiency has encouraged many researchers to seek alternatives through approximations.
In this thesis we continue this line of research and focus on
relative approximation of range counting problems.
One important problem where it is possible to achieve significant speedup
through approximation is halfspace range counting in 3D.
Here we build on previous research
and obtain the first optimal data structure for approximate halfspace range counting in 3D.
Our data structure has the slight advantage of being Las Vegas (the result is always correct) in contrast
to the previous methods that were Monte Carlo (the correctness holds with high probability).
Another series of problems where approximation can provide us with
substantial speedup comes from robust statistics.
We recognize three problems here:
approximate Tukey depth, regression depth and simplicial depth queries.
In 2D, we obtain an optimal data structure capable of approximating
the regression depth of a query hyperplane.
We also offer a linear space data structure which can answer approximate
Tukey depth queries efficiently in 3D.
These data structures are obtained by applying our ideas for the
approximate halfspace counting problem.
Approximating the simplicial depth turns out to be much more
difficult, however.
Computing the simplicial depth of a given point is more computationally
challenging than most other definitions of data depth.
In 2D we obtain the first data structure which uses near linear space
and can answer approximate simplicial depth queries in polylogarithmic time.
As applications of this result, we provide two non-trivial methods to
approximate the simplicial depth of a given point in higher dimension.
Along the way, we establish a tight combinatorial relationship between
the Tukey depth of any given point and its simplicial depth.
Another problem investigated in this thesis is the dominance reporting problem,
an important special case of orthogonal range reporting.
In three dimensions, we solve this
problem in the pointer machine model and the external memory model
by offering the first optimal data structures in these models of computation.
Also, in the RAM model and for points from
an integer grid we reduce the space complexity of the fastest
known data structure to optimal.
Using known techniques in the literature, we can use our
results to obtain solutions for the orthogonal range searching problem as well.
The query complexity offered by our orthogonal range reporting data structures
matches the most efficient query complexities
known in the literature, but our space bounds are lower than those of the previous methods in the external
memory model and in the RAM model where the input is a subset of an integer grid.
The results also yield improved orthogonal range searching in
higher dimensions (which shows the significance
of the dominance reporting problem).
Intersection searching is a generalization of range searching where
we deal with more complicated geometric objects instead of points.
We investigate the rectilinear disjoint polygon counting problem
which is a specialized intersection counting problem.
We provide a linear-size data structure capable of counting
the number of disjoint rectilinear polygons
intersecting any rectilinear polygon of constant size.
The query time (as well as some other properties of our data structure) resembles
that of the classical simplex range searching data structures.
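To make the dominance reporting problem concrete, here is a minimal brute-force sketch in three dimensions. It assumes the common convention that a point p dominates q when every coordinate of p is at least the corresponding coordinate of q, and that a query reports the input points dominated by the query point. The abstract does not spell out its convention, and the names `dominates` and `dominance_report` are hypothetical.

```python
def dominates(p, q):
    """True if p dominates q: every coordinate of p is >= that of q (assumed convention)."""
    return all(pc >= qc for pc, qc in zip(p, q))


def dominance_report(points, query):
    """Report every input point dominated by the query point, in input order."""
    # Linear scan over all points; the structures in the abstract avoid this.
    return [p for p in points if dominates(query, p)]
```

A dominance query is an orthogonal range query whose box is unbounded on one side in every dimension, which is why it is a special case of orthogonal range reporting.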
Succinct Color Searching in One Dimension
In this paper we study succinct data structures for one-dimensional color reporting and color counting problems.
We are given a set of n points with integer coordinates in the range [1,m] and every point is assigned a color from the set {1,...,σ}.
A color reporting query asks for the list of distinct colors that occur in a query interval [a,b] and a color counting query asks for the number of distinct colors in [a,b].
We describe a succinct data structure that answers approximate color counting queries in O(1) time and uses ℬ(n,m) + O(n) + o(ℬ(n,m)) bits,
where ℬ(n,m) is the minimum number of bits required to represent an arbitrary set of size n from a universe of m elements. Thus we show, somewhat counterintuitively,
that it is not necessary to store colors of points in order to answer approximate color counting queries.
In the special case when points are in the rank space (i.e., when n=m), our data structure needs only O(n) bits.
Also, we show that Ω(n) bits are necessary in that case.
Then we turn to succinct data structures for color reporting.
We describe a data structure that uses ℬ(n,m) + nH_d(S) + o(ℬ(n,m)) + o(n lg σ) bits and answers queries in O(k+1) time,
where k is the number of colors in the answer, and nH_d(S) (d = log_σ n) is the d-th order empirical entropy of the color sequence. Finally, we consider succinct color reporting under restricted updates. Our dynamic data structure uses nH_d(S) + o(n lg σ) bits and supports queries in O(k+1) time.
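The two query types can be illustrated with a small non-succinct baseline: keep the points sorted by coordinate, locate the query interval by binary search, and collect the distinct colors inside it. This sketch stores every color explicitly and spends time proportional to the number of points in the interval, so it shows only the query semantics; the class name `ColorIndex` and its methods are hypothetical.

```python
from bisect import bisect_left, bisect_right


class ColorIndex:
    """Naive 1-D color reporting and counting over (coordinate, color) pairs."""

    def __init__(self, points):
        self._pts = sorted(points)              # sort by coordinate
        self._xs = [x for x, _ in self._pts]    # coordinates only, for bisect

    def report(self, a, b):
        """Sorted list of distinct colors of points with coordinate in [a, b]."""
        lo = bisect_left(self._xs, a)
        hi = bisect_right(self._xs, b)
        return sorted({color for _, color in self._pts[lo:hi]})

    def count(self, a, b):
        """Exact number of distinct colors in [a, b]."""
        return len(self.report(a, b))
```

For example, with points `[(1, 2), (3, 1), (4, 2), (7, 3), (9, 1)]`, the interval [3, 7] contains the three colors 1, 2 and 3.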
Dynamic Range Majority Data Structures
Given a set of coloured points on the real line, we study the problem of
answering range α-majority (or "heavy hitter") queries on this set. More
specifically, for a query range, we want to return each colour that is
assigned to more than an α-fraction of the points contained in the range. We
present a new data structure for answering range α-majority queries on a
dynamic set of points, where α ∈ (0,1). Our data structure uses O(n)
space, supports queries in time, and updates in amortized time. If the coordinates of the points are integers,
then the query time can be improved to . For constant values of α, this improved query
time matches an existing lower bound, for any data structure with
polylogarithmic update time. We also generalize our data structure to handle
sets of points in d dimensions, for d ≥ 2, as well as dynamic arrays, in
which each entry is a colour.
Comment: 16 pages, Preliminary version appeared in ISAAC 201
Linear-Space Data Structures for Range Mode Query in Arrays
A mode of a multiset is an element of maximum multiplicity;
that is, it occurs at least as frequently as any other element in the multiset. Given a
list of items, we consider the problem of constructing a data
structure that efficiently answers range mode queries on the list. Each query
consists of an input pair of indices for which a mode of the corresponding sublist must
be returned. We present an -space static data structure
that supports range mode queries in time in the worst case, for
any fixed . When , this corresponds to
the first linear-space data structure to guarantee query time. We
then describe three additional linear-space data structures that provide
, , and query time, respectively, where denotes the
number of distinct elements in and denotes the frequency of the mode of
. Finally, we examine generalizing our data structures to higher dimensions.
Comment: 13 pages, 2 figure
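For orientation, here is a sketch of the classic block-decomposition approach to range mode with roughly linear space. It is not the structure from this paper: it precomputes a mode for every run of whole blocks and checks the elements at the two ends of the query with binary searches, so a query costs about O(√n log n) rather than the bounds claimed in the abstract. The class name `RangeMode` and its 0-based inclusive indexing are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from math import isqrt


class RangeMode:
    def __init__(self, a):
        self.a = list(a)
        n = len(self.a)
        self.s = max(1, isqrt(n))                 # block size, about sqrt(n)
        self.pos = defaultdict(list)              # value -> sorted positions
        for idx, x in enumerate(self.a):
            self.pos[x].append(idx)
        nb = (n + self.s - 1) // self.s           # number of blocks
        # table[bi][bj] holds a mode of whole blocks bi..bj (inclusive).
        self.table = [[None] * nb for _ in range(nb)]
        for bi in range(nb):
            counts = defaultdict(int)
            best, best_c = None, 0
            for idx in range(bi * self.s, n):
                x = self.a[idx]
                counts[x] += 1
                if counts[x] > best_c:
                    best, best_c = x, counts[x]
                if (idx + 1) % self.s == 0 or idx == n - 1:
                    self.table[bi][idx // self.s] = best

    def _count(self, x, i, j):
        """Occurrences of x in a[i..j], via binary search on its positions."""
        p = self.pos[x]
        return bisect_right(p, j) - bisect_left(p, i)

    def query(self, i, j):
        """Return (a mode of a[i..j], its frequency in a[i..j])."""
        if not (0 <= i <= j < len(self.a)):
            raise IndexError("invalid query range")
        bi, bj = i // self.s, j // self.s
        if bj - bi <= 1:
            candidates = set(self.a[i:j + 1])     # short range: scan it
        else:
            # Either the mode of the whole blocks in between, or some
            # element occurring in one of the two partial end blocks.
            candidates = {self.table[bi + 1][bj - 1]}
            candidates.update(self.a[i:(bi + 1) * self.s])
            candidates.update(self.a[bj * self.s:j + 1])
        best = max(candidates, key=lambda x: self._count(x, i, j))
        return best, self._count(best, i, j)
```

The candidate set is sufficient because an element that does not occur in either partial end block has the same frequency in the query range as in the whole blocks, where the precomputed mode is at least as frequent.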
Planar Visibility: Testing and Counting
In this paper we consider query versions of visibility testing and visibility
counting. Given a set of disjoint line segments in the plane and one
segment of that set, visibility testing is to preprocess the set so that we can
quickly determine if the chosen segment is visible from a query point. Visibility counting
involves preprocessing the set so that one can quickly estimate the number of
segments visible from a query point.
We present several data structures for the two query problems. The structures
build upon a result by O'Rourke and Suri (1984) who showed that the subset
of the plane that is weakly visible from a segment can be
represented as the union of a set of triangles, even though
the complexity of this subset can be . We define a variant of their
covering, give efficient output-sensitive algorithms for computing it, and
prove additional properties needed to obtain approximation bounds. Some of our
bounds rely on a new combinatorial result that relates the number of segments
visible from a point to the number of triangles in the covering that contain that point.
Comment: 22 page