A New Lower Bound for Semigroup Orthogonal Range Searching
We report the first improvement in the space-time trade-off of lower bounds for the orthogonal range searching problem in the semigroup model since Chazelle's result from 1990. This is one of the most fundamental problems in range searching, with a long history. Previously, Andrew Yao's influential result had shown that the problem is already non-trivial in one dimension [Yao, 1982]: using m units of space, the query time Q(n) must be Omega(alpha(m,n) + n/(m-n+1)), where alpha(*,*) is the inverse Ackermann function, a very slowly growing function. In d dimensions, Bernard Chazelle [Chazelle, 1990] proved that the query time must be Q(n) = Omega((log_beta n)^{d-1}), where beta = 2m/n. Chazelle's lower bound is known to be tight when space consumption is "high", i.e., m = Omega(n log^{d+epsilon} n).
We have two main results. The first is a lower bound showing that Chazelle's lower bound was not tight for "low space": we prove that we must have m Q(n) = Omega(n (log n log log n)^{d-1}). Our lower bound does not close the gap to the existing data structures; however, our second result is that our analysis is tight. Thus, we believe the gap is in fact natural, since lower bounds are proven for idempotent semigroups while the data structures are built for general semigroups and thus cannot assume (and use) the properties of an idempotent semigroup. As a result, we believe that to close the gap one must either study lower bounds for non-idempotent semigroups or build data structures for idempotent semigroups. We develop significantly new ideas for both of our results that could be useful in pursuing either of these directions.
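To make the "low space" improvement concrete, the two bounds can be compared at linear space. The choice m = 2n below is an assumption made here purely for illustration; it is not a case singled out in the abstract. With m = 2n we get beta = 2m/n = 4, so:

```latex
\begin{align*}
\text{Chazelle (1990):}\quad
  & Q(n) = \Omega\big((\log_{4} n)^{d-1}\big) = \Omega\big(\log^{d-1} n\big), \\
\text{new bound:}\quad
  & m\,Q(n) = \Omega\big(n(\log n \log\log n)^{d-1}\big)
    \;\Longrightarrow\;
    Q(n) = \Omega\big((\log n \log\log n)^{d-1}\big).
\end{align*}
```

Under this assumption the new bound is larger by a factor of (log log n)^{d-1}.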
On Range Searching with Semialgebraic Sets II
Let $P$ be a set of $n$ points in $\mathbb{R}^d$. We present a linear-size data
structure for answering range queries on $P$ with constant-complexity
semialgebraic sets as ranges, in time close to $O(n^{1-1/d})$. It essentially
matches the performance of similar structures for simplex range searching, and,
for $d \ge 5$, significantly improves earlier solutions by the first two authors
obtained in 1994. This almost settles a long-standing open problem in range
searching.
The data structure is based on the polynomial-partitioning technique of Guth
and Katz [arXiv:1011.4105], which shows that for a parameter $r$, $1 < r \le n$,
there exists a $d$-variate polynomial $f$ of degree $O(r^{1/d})$ such that
each connected component of $\mathbb{R}^d \setminus Z(f)$ contains at most $n/r$ points
of $P$, where $Z(f)$ is the zero set of $f$. We present an efficient randomized
algorithm for computing such a polynomial partition, which is of independent
interest and is likely to have additional applications.
Weighted Min-Cut: Sequential, Cut-Query and Streaming Algorithms
Consider the following 2-respecting min-cut problem. Given a weighted graph
$G$ and its spanning tree $T$, find the minimum cut among the cuts that contain
at most two edges in $T$. This problem is an important subroutine in Karger's
celebrated randomized near-linear-time min-cut algorithm [STOC'96]. We present
a new approach for this problem which can be easily implemented in many
settings, leading to the following randomized min-cut algorithms for weighted
graphs.
* An $O(m \frac{\log^2 n}{\log\log n} + n\log^6 n)$-time sequential algorithm:
This improves Karger's $O(m \log^3 n)$ and
$O(m \frac{(\log^2 n)\log(n^2/m)}{\log\log n} + n\log^6 n)$ bounds when the input
graph is not extremely sparse or dense. Improvements over Karger's bounds were
previously known only under a rather strong assumption that the input graph is
simple [Henzinger et al. SODA'17; Ghaffari et al. SODA'20]. For unweighted
graphs with parallel edges, our bound can be improved to
$O(m\log^{1.5} n + n\log^6 n)$.
* An algorithm requiring $\tilde O(n)$ cut queries to compute the min-cut of
a weighted graph: This answers an open problem by Rubinstein et al. ITCS'18,
who obtained a similar bound for simple graphs.
* A streaming algorithm that requires $\tilde O(n)$ space and $O(\log n)$
passes to compute the min-cut: The only previous non-trivial exact min-cut
algorithm in this setting is the 2-pass $\tilde O(n)$-space algorithm on simple
graphs [Rubinstein et al., ITCS'18] (observed by Assadi et al. STOC'19).
In contrast to Karger's 2-respecting min-cut algorithm which deploys
sophisticated dynamic programming techniques, our approach exploits some cute
structural properties so that it only needs to compute the values of
$\tilde O(n)$ cuts corresponding to removing $\tilde O(n)$ pairs of tree edges, an
operation that can be done quickly in many settings.
Comment: Updates on this version: (1) Minor corrections in Section 5.1, 5.2;
(2) Reference to newer results by GMW SOSA21 (arXiv:2008.02060v2), DEMN
STOC21 (arXiv:2004.09129v2) and LMN 21 (arXiv:2102.06565v1)
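To illustrate the problem statement only, here is a minimal brute-force sketch in Python. It is not the algorithm from the abstract: it simply tries every single tree edge and every pair of tree edges, which is far slower than the near-linear bounds discussed above. The function name, the edge-list input format, and the assumption that vertices are numbered 0..n-1 are all hypothetical choices made for this example.

```python
from itertools import combinations

def two_respecting_min_cut(n, edges, tree_edges):
    """Brute-force 2-respecting min-cut (illustrative sketch only).

    n          -- number of vertices, assumed to be labelled 0..n-1
    edges      -- list of (u, v, weight) for the weighted graph G
    tree_edges -- list of (u, v) forming a spanning tree T of G
    Returns (cut value, tuple of indices of the tree edges the cut crosses).
    """
    # Adjacency list of the tree; each entry remembers the tree-edge index.
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(tree_edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    def side_labels(removed):
        # Walk the tree from vertex 0 and switch sides each time a removed
        # tree edge is crossed, so exactly the removed edges cross the cut.
        side = [None] * n
        side[0] = 0
        stack = [0]
        while stack:
            u = stack.pop()
            for v, idx in adj[u]:
                if side[v] is None:
                    side[v] = side[u] ^ (idx in removed)
                    stack.append(v)
        return side

    indices = range(len(tree_edges))
    candidates = [(i,) for i in indices] + list(combinations(indices, 2))

    best_value, best_removed = float("inf"), None
    for removed in candidates:
        side = side_labels(set(removed))
        # Cut value: total weight of graph edges whose endpoints are separated.
        value = sum(w for u, v, w in edges if side[u] != side[v])
        if value < best_value:
            best_value, best_removed = value, removed
    return best_value, best_removed


# A 4-cycle whose spanning tree is the path 0-1-2-3.
cycle = [(0, 1, 1), (1, 2, 5), (2, 3, 1), (3, 0, 5)]
path = [(0, 1), (1, 2), (2, 3)]
print(two_respecting_min_cut(4, cycle, path))  # (2, (0, 2))
```

In this small example the best cut separates {1, 2} from {0, 3} and crosses two tree edges, so a search restricted to single tree edges would miss it.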
Stronger Tradeoffs for Orthogonal Range Querying in the Semigroup Model
In this paper, we focus on lower bounds for data structures supporting orthogonal range querying on m points in n dimensions in the semigroup model. Such a data structure usually maintains a family of "canonical subsets" of the given set of points and, on a range query, outputs a disjoint union of the appropriate subsets. Fredman showed that in order to prove lower bounds in the semigroup model, it suffices to prove a lower bound on a certain combinatorial tradeoff between two parameters: (a) the total sizes of the canonical subsets, and (b) the total number of canonical subsets required to cover all query ranges. In particular, he showed that the arithmetic mean of these two parameters is Omega(m log^n m). We strengthen this tradeoff by showing that the geometric mean of the same two parameters is Omega(m log^n m).
Our second result is an alternate proof of Fredman's tradeoff in the one-dimensional setting. The problem of answering range queries using canonical subsets can be formulated as factoring a specific boolean matrix as a product of two boolean matrices, one representing the canonical sets and the other capturing the appropriate disjoint unions of the former to output all possible range queries. In this formulation, we can ask what an optimal data structure is, i.e., a data structure that minimizes the sum of the two parameters mentioned above, and how the balanced binary search tree compares with this optimal data structure in the two parameters. The problem of finding an optimal data structure is a non-linear optimization problem. In one dimension, Fredman's result implies that the minimum value of the objective function is Omega(m log m), which means that at least one of the parameters has to be Omega(m log m). We show that both the parameters in an optimal solution have to be Omega(m log m). This implies that balanced binary search trees are near-optimal data structures for range querying in one dimension. We derive intermediate results on factoring matrices, not necessarily boolean, while trying to minimize the norms of the factors, that may be of independent interest.
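The canonical-subset idea in one dimension can be sketched with a balanced binary tree over the point indices. The Python below is a minimal illustration under assumptions made here, not code from the paper: the function names are hypothetical, points are assumed to be pre-sorted and addressed by index, and the semigroup is passed in as a `combine` function. Each tree node stores the semigroup sum of one canonical interval, and a query is answered by combining a disjoint cover of canonical intervals.

```python
def build_canonical_sums(weights, combine):
    """Map each tree node to the semigroup sum of its canonical interval."""
    tree = {}

    def build(node, lo, hi):
        if hi - lo == 1:
            tree[node] = weights[lo]
        else:
            mid = (lo + hi) // 2
            build(2 * node, lo, mid)
            build(2 * node + 1, mid, hi)
            tree[node] = combine(tree[2 * node], tree[2 * node + 1])

    build(1, 0, len(weights))
    return tree


def canonical_cover(m, left, right):
    """Split the half-open range [left, right) into disjoint canonical intervals."""
    cover = []

    def visit(node, lo, hi):
        if right <= lo or hi <= left:
            return  # no overlap with the query
        if left <= lo and hi <= right:
            cover.append((node, lo, hi))  # fully inside: use this canonical subset
            return
        mid = (lo + hi) // 2
        visit(2 * node, lo, mid)
        visit(2 * node + 1, mid, hi)

    visit(1, 0, m)
    return cover


def range_query(tree, m, left, right, combine):
    """Combine the stored sums of the canonical intervals covering [left, right)."""
    pieces = canonical_cover(m, left, right)
    if not pieces:
        raise ValueError("empty range")
    result = tree[pieces[0][0]]
    for node, _, _ in pieces[1:]:
        result = combine(result, tree[node])
    return result


weights = [3, 1, 4, 1, 5, 9, 2, 6]
sums = build_canonical_sums(weights, lambda a, b: a + b)
print([(lo, hi) for _, lo, hi in canonical_cover(len(weights), 2, 7)])
# [(2, 4), (4, 6), (6, 7)]
print(range_query(sums, len(weights), 2, 7, lambda a, b: a + b))  # 21
```

Only `combine` is ever applied to stored values, with no subtraction or inverse, which is the restriction the semigroup model imposes. In this sketch the canonical intervals have total size O(m log m) and each query uses O(log m) of them.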
On the complexity of range searching among curves
Modern tracking technology has made the collection of large numbers of
densely sampled trajectories of moving objects widely available. We consider a
fundamental problem encountered when analysing such data: Given $n$ polygonal
curves $S$ in $\mathbb{R}^D$, preprocess $S$ into a data structure that answers
queries with a query curve $q$ and radius $\rho$ for the curves of $S$ that
have Fréchet distance at most $\rho$ to $q$.
We initiate a comprehensive analysis of the space/query-time trade-off for
this data structuring problem. Our lower bounds imply that any data structure
in the pointer model that achieves $Q(n) + O(k)$ query time, where $k$ is
the output size, has to use roughly $\Omega((n/Q(n))^2)$ space in
the worst case, even if queries are mere points (for the discrete Fréchet
distance) or line segments (for the continuous Fréchet distance). More
importantly, we show that more complex queries and input curves lead to
additional logarithmic factors in the lower bound. Roughly speaking, the number
of logarithmic factors added is linear in the number of edges added to the
query and input curve complexity. This means that the space/query time
trade-off worsens by an exponential factor of input and query complexity. This
behaviour addresses an open question in the range searching literature: whether
it is possible to avoid the additional logarithmic factors in the space and
query time of a multilevel partition tree. We answer this question negatively.
On the positive side, we show we can build data structures for the Fréchet
distance by using semialgebraic range searching. Our solution for the discrete
Fréchet distance is in line with the lower bound, as the number of levels in
the data structure is $O(t)$, where $t$ denotes the maximal number of vertices
of a curve. For the continuous Fréchet distance, the number of levels
increases to
- …