12 research outputs found
Bringing Order to Special Cases of Klee's Measure Problem
Klee's Measure Problem (KMP) asks for the volume of the union of n
axis-aligned boxes in d-space. Omitting logarithmic factors, the best algorithm
has runtime O*(n^{d/2}) [Overmars,Yap'91]. There are faster algorithms known
for several special cases: Cube-KMP (where all boxes are cubes), Unitcube-KMP
(where all boxes are cubes of equal side length), Hypervolume (where all boxes
share a vertex), and k-Grounded (where the projection onto the first k
dimensions is a Hypervolume instance).
In this paper we bring some order to these special cases by providing
reductions among them. In addition to the trivial inclusions, we establish
Hypervolume as the easiest of these special cases, and show that the runtimes
of Unitcube-KMP and Cube-KMP are polynomially related. More importantly, we
show that any algorithm for one of the special cases with runtime T(n,d)
implies an algorithm for the general case with runtime T(n,2d), yielding the
first non-trivial relation between KMP and its special cases. This allows us to
transfer W[1]-hardness of KMP to all special cases, proving that no n^{o(d)}
algorithm exists for any of the special cases under reasonable complexity
theoretic assumptions. Furthermore, assuming that there is no improved
algorithm for the general case of KMP (no algorithm with runtime O(n^{d/2 -
eps})) this reduction shows that there is no algorithm with runtime
O(n^{floor(d/2)/2 - eps}) for any of the special cases. Under the same
assumption we show a tight lower bound for a recent algorithm for 2-Grounded
[Yildiz,Suri'12]. Comment: 17 pages
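To make the problem statement above concrete, here is a minimal Python sketch that computes the volume of a union of axis-aligned boxes by coordinate compression. It is a naive baseline written for illustration only, not the Overmars-Yap algorithm or any method from the paper: it enumerates roughly (2n)^d grid cells, so it is far slower than the O*(n^{d/2}) bound quoted in the abstract. The function name `union_volume` and the box representation (a pair of corner tuples) are assumptions made for this example.

```python
from itertools import product


def union_volume(boxes):
    """Volume of the union of axis-aligned boxes (naive illustration).

    boxes: list of (lo, hi) pairs, where lo and hi are tuples of length d
    giving the minimum and maximum corner of each box.
    """
    if not boxes:
        return 0.0
    d = len(boxes[0][0])
    # Distinct box coordinates per dimension define a grid of elementary cells.
    coords = [
        sorted({b[0][k] for b in boxes} | {b[1][k] for b in boxes})
        for k in range(d)
    ]
    total = 0.0
    for cell in product(*(range(len(c) - 1) for c in coords)):
        lo = [coords[k][cell[k]] for k in range(d)]
        hi = [coords[k][cell[k] + 1] for k in range(d)]
        # Each grid cell is either fully inside a given box or disjoint
        # from its interior, so a containment test is sufficient.
        covered = any(
            all(b[0][k] <= lo[k] and hi[k] <= b[1][k] for k in range(d))
            for b in boxes
        )
        if covered:
            vol = 1.0
            for k in range(d):
                vol *= hi[k] - lo[k]
            total += vol
    return total
```

For two overlapping 2 x 2 squares, `union_volume([((0, 0), (2, 2)), ((1, 1), (3, 3))])` counts the shared unit cell only once.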
A practical and robust method to compute the boundary of three-dimensional axis-aligned boxes
The union of axis-aligned boxes results in a constrained structure that is advantageous for solving certain geometrical problems. A widely used scheme for solid modelling systems is the boundary representation (B-rep). We present a method to obtain the B-rep of a union of axis-aligned boxes. Our method computes all boundary vertices, and additional information for each vertex that allows us to apply already existing methods to extract the B-rep. It is based on dividing the three-dimensional problem into two-dimensional boundary computations and combining their results. The method can deal with all geometrical degeneracies that may arise. Experimental results prove that our approach outperforms existing general methods, both in efficiency and robustness. Peer Reviewed. Postprint (author’s final draft).
Faster Algorithms for Rectangular Matrix Multiplication
Let {\alpha} be the maximal value such that the product of an n x n^{\alpha}
matrix by an n^{\alpha} x n matrix can be computed with n^{2+o(1)} arithmetic
operations. In this paper we show that \alpha>0.30298, which improves the
previous record \alpha>0.29462 by Coppersmith (Journal of Complexity, 1997).
More generally, we construct a new algorithm for multiplying an n x n^k matrix
by an n^k x n matrix, for any value k\neq 1. The complexity of this algorithm
is better than all known algorithms for rectangular matrix multiplication. In
the case of square matrix multiplication (i.e., for k=1), we recover exactly
the complexity of the algorithm by Coppersmith and Winograd (Journal of
Symbolic Computation, 1990).
These new upper bounds can be used to improve the time complexity of several
known algorithms that rely on rectangular matrix multiplication. For example,
we directly obtain an O(n^{2.5302})-time algorithm for the all-pairs shortest
paths problem over directed graphs with small integer weights, improving over
the O(n^{2.575})-time algorithm by Zwick (JACM 2002), and also improve the time
complexity of sparse square matrix multiplication. Comment: 37 pages; v2: some additions in the acknowledgment
Approximate Range Counting Revisited
We study range-searching for colored objects, where one has to count (approximately) the number of colors present in a query range. The problems studied mostly involve orthogonal range-searching in two and three dimensions, and the dual setting of rectangle stabbing by points. We present optimal and near-optimal solutions for these problems. Most of the results are obtained via reductions to the approximate uncolored version, and improved data-structures for them. An additional contribution of this work is the introduction of nested shallow cuttings.
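As a point of reference for the query described above, the following Python sketch answers a colored orthogonal range-counting query exactly by brute force in two dimensions. It is not one of the paper's data structures: it scans every point per query, whereas the abstract is about preprocessed structures that answer approximately and much faster. The function name `count_colors`, the closed-rectangle convention, and the input layout are assumptions made for this example.

```python
def count_colors(points, query):
    """Exact number of distinct colors inside an axis-aligned query rectangle.

    points: list of ((x, y), color) pairs.
    query:  (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2; the rectangle is
            treated as closed (an assumption of this sketch).
    """
    x1, y1, x2, y2 = query
    # Collect colors into a set so each color is counted once,
    # no matter how many points of that color fall in the range.
    return len({
        color
        for (x, y), color in points
        if x1 <= x <= x2 and y1 <= y <= y2
    })
```

An approximate structure would be judged against this exact count; note that two points of the same color inside the range contribute one to the answer, which is what separates the colored problem from ordinary range counting.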
Fast 2-Approximate All-Pairs Shortest Paths
In this paper, we revisit the classic approximate All-Pairs Shortest Paths
(APSP) problem in undirected graphs. For unweighted graphs, we provide an
algorithm for -approximate APSP in time,
for any . This is time, using known bounds for
rectangular matrix multiplication [Le Gall, Urrutia, SODA
2018]. Our result improves on the bound of [Roditty, STOC
2023], and on the bound of [Baswana, Kavitha, SICOMP
2010] for graphs with edges.
For weighted graphs, we obtain -approximate APSP in time, for any . This is
time using known bounds for . It improves on the state of the art
bound of by [Kavitha, Algorithmica 2012]. Our techniques further
lead to improved bounds in a wide range of density for weighted graphs. In
particular, for the sparse regime we construct a distance oracle in time that supports -approximate queries in constant time. For
sparse graphs, the preprocessing time of the algorithm matches conditional
lower bounds [Patrascu, Roditty, Thorup, FOCS 2012; Abboud, Bringmann, Fischer,
STOC 2023]. To the best of our knowledge, this is the first 2-approximate
distance oracle that has subquadratic preprocessing time in sparse graphs.
We also obtain new bounds in the near additive regime for unweighted graphs.
We give faster algorithms for -approximate APSP, for
.
We obtain these results by incorporating fast rectangular matrix
multiplications into various combinatorial algorithms that carefully balance
out distance computation on layers of sparse graphs preserving certain distance
information.
Doctor of Philosophy
The contributions of this dissertation are centered around designing new algorithms in the general area of sublinear algorithms such as streaming, core sets and sublinear verification, with a special interest in problems arising from data analysis including data summarization, clustering, matrix problems and massive graphs. In the first part, we focus on summaries and coresets, which are among the main techniques for designing sublinear algorithms for massive data sets. We initiate the study of coresets for uncertain data and study coresets for various types of range counting queries on uncertain data. We focus mainly on the indecisive model of locational uncertainty since it comes up frequently in real-world applications when multiple readings of the same object are made. In this model, each uncertain point has a probability density describing its location, defined as distinct locations. Our goal is to construct a subset of the uncertain points, including their locational uncertainty, so that range counting queries can be answered by examining only this subset. For each type of query we provide coreset constructions with approximation-size trade-offs. We show that random sampling can be used to construct each type of coreset, and we also provide significantly improved bounds using discrepancy-based techniques on axis-aligned range queries. In the second part, we focus on designing sublinear-space algorithms for approximate computations on massive graphs. In particular, we consider graph MAXCUT and correlation clustering problems and develop sampling based approaches to construct truly sublinear () sized coresets for graphs that have polynomial (i.e., for any ) average degree. Our technique is based on analyzing properties of random induced subprograms of the linear program formulations of the problems. We demonstrate this technique with two examples.
Firstly, we present a sublinear sized core set to approximate the value of the MAX CUT in a graph to a factor. To the best of our knowledge, all the known methods in this regime rely crucially on near-regularity assumptions. Secondly, we apply the same framework to construct a sublinear-sized coreset for correlation clustering. Our coreset construction also suggests 2-pass streaming algorithms for computing the MAX CUT and correlation clustering objective values which are left as future work at the time of writing this dissertation. Finally, we focus on streaming verification algorithms as another model for designing sublinear algorithms. We give the first polylog space and sublinear (in number of edges) communication protocols for any streaming verification problems in graphs. We present efficient streaming interactive proofs that can verify maximum matching exactly. Our results cover all flavors of matchings (bipartite/ nonbipartite and weighted). In addition, we also present streaming verifiers for approximate metric TSP and exact triangle counting, as well as for graph primitives such as the number of connected components, bipartiteness, minimum spanning tree and connectivity. In particular, these are the first results for weighted matchings and for metric TSP in any streaming verification model. Our streaming verifiers use only polylogarithmic space while exchanging only polylogarithmic communication with the prover in addition to the output size of the relevant solution. We also initiate a study of streaming interactive proofs (SIPs) for problems in data analysis and present efficient SIPs for some fundamental problems. We present protocols for clustering and shape fitting including minimum enclosing ball (MEB), width of a point set, -centers and -slab problem. 
We also present protocols for fundamental matrix analysis problems: we provide an improved protocol for rectangular matrix problems, which in turn can be used to verify (approximate) eigenvectors of an integer matrix . In general our solutions use polylogarithmic rounds of communication and polylogarithmic total communication and verifier space.
Algorithms and Data Structures for Geometric Intersection Query Problems
University of Minnesota Ph.D. dissertation. September 2017. Major: Computer Science. Advisor: Ravi Janardan. 1 computer file (PDF); xi, 126 pages. The focus of this thesis is the topic of geometric intersection queries (GIQ) which has been very well studied by the computational geometry community and the database community. In a GIQ problem, the user is not interested in the entire input geometric dataset, but only in a small subset of it and requests an informative summary of that small subset of data. Formally, the goal is to preprocess a set A of n geometric objects into a data structure so that given a query geometric object q, a certain aggregation function can be applied efficiently on the objects of A intersecting q. The classical aggregation functions studied in the literature are reporting or counting the objects of A intersecting q. In many applications, the same set A is queried several times, in which case one would like to answer a query faster by preprocessing A into a data structure. The goal is to organize the data into a data structure which occupies a small amount of space and yet responds to any user query in real-time. In this thesis the study of the GIQ problems was conducted from the point-of-view of a computational geometry researcher. Given a model of computation and a GIQ problem, what are the best possible upper bounds (resp., lower bounds) on the space and the query time that can be achieved by a data structure? Also, what is the relative hardness of various GIQ problems and aggregate functions? Here relative hardness means that given two GIQ problems A and B (or, two aggregate functions f(A, q) and g(A, q)), which of them can be answered faster by a computer (assuming data structures for both of them occupy asymptotically the same amount of space)? This thesis presents results which increase our understanding of the above questions.
For many GIQ problems, data structures with optimal (or near-optimal) space and query time bounds have been achieved. The geometric settings studied are primarily orthogonal range searching where the input is points and the query is an axes-aligned rectangle, and the dual setting of rectangle stabbing where the input is a set of axes-aligned rectangles and the query is a point. The aggregation functions studied are primarily reporting, top-k, and approximate counting. Most of the data structures are built for the internal memory model (word-RAM or pointer machine model), but in some settings they are generic enough to be efficient in the I/O-model as well.
Computing Volumes and Convex Hulls: Variations and Extensions
Geometric techniques are frequently utilized to analyze and reason about multi-dimensional data. When confronted with large quantities of such data, simplifying geometric statistics or summaries are often a necessary first step. In this thesis, we make contributions to two such fundamental concepts of computational geometry: Klee's Measure and Convex Hulls. The former is concerned with computing the total volume occupied by a set of overlapping rectangular boxes in d-dimensional space, while the latter is concerned with identifying extreme vertices in a multi-dimensional set of points. Both problems are frequently used to analyze optimal solutions to multi-objective optimization problems: a variant of Klee's problem called the Hypervolume Indicator gives a quantitative measure for the quality of a discrete Pareto Optimal set, while the Convex Hull represents the subset of solutions that are optimal with respect to at least one linear optimization function. In the first part of the thesis, we investigate several practical and natural variations of Klee's Measure Problem. We develop a specialized algorithm for a specific case of Klee's problem called the “grounded” case, which also solves the Hypervolume Indicator problem faster than any earlier solution for certain dimensions. Next, we extend Klee's problem to an uncertainty setting where the existence of the input boxes is defined probabilistically, and study computing the expectation of the volume. Additionally, we develop efficient algorithms for a discrete version of the problem, where the volume of a box is redefined to be the cardinality of its overlap with a given point set. The second part of the thesis investigates the convex hull problem on uncertain input. To this end, we examine two probabilistic uncertainty models for point sets. The first model incorporates uncertainty in the existence of the input points. The second model extends the first one by incorporating locational uncertainty.
For both models, we study the problem of computing the probability that a given point is contained in the convex hull of the uncertain points. We also consider the problem of finding the most likely convex hull, i.e., the mode of the convex hull random variable.
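The Hypervolume Indicator mentioned in this abstract can be illustrated in two dimensions with a short sweep. The Python sketch below is a textbook-style O(n log n) illustration, not the thesis's algorithm for the grounded case. It assumes a maximization setting with the reference point at the origin and non-negative coordinates, so each point (x, y) stands for the box [0, x] x [0, y]; the function name `hypervolume_2d` is hypothetical.

```python
def hypervolume_2d(points):
    """Area of the union of boxes [0, x] x [0, y] over all given points.

    points: iterable of (x, y) pairs with x >= 0 and y >= 0. Using the
    origin as the shared corner is an assumption of this sketch.
    """
    pts = sorted(points, reverse=True)  # sweep from largest x to smallest
    area = 0.0
    best_y = 0.0
    for i, (x, y) in enumerate(pts):
        # Highest box among all points whose x-coordinate is at least x.
        best_y = max(best_y, y)
        next_x = pts[i + 1][0] if i + 1 < len(pts) else 0.0
        # The vertical slab between next_x and x is covered up to best_y.
        area += (x - next_x) * best_y
    return area
```

For the points (3, 1) and (1, 3), the two boxes have area 3 each and overlap in a unit square, so the sweep accumulates 2 for the slab from x = 1 to x = 3 and 3 for the slab from x = 0 to x = 1.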