Fully Dynamic Maximum Independent Sets of Disks in Polylogarithmic Update Time
A fundamental question in computational geometry is for a dynamic collection
of geometric objects in Euclidean space, whether it is possible to maintain a
maximum independent set in polylogarithmic update time. Already, for a set of
intervals, it is known that no dynamic algorithm can maintain an exact maximum
independent set with sublinear update time. Therefore, the typical objective is
to explore the trade-off between update time and solution size. Substantial
efforts have been made in recent years to understand this question for various
families of geometric objects, such as intervals, hypercubes, hyperrectangles,
and fat objects.
We present the first fully dynamic approximation algorithm for disks of
arbitrary radii in the plane that maintains a constant-factor approximate
maximum independent set in polylogarithmic update time. First, we show that for
a fully dynamic set of unit disks in the plane, a -approximate maximum
independent set can be maintained with worst-case update time ,
and optimal output-sensitive reporting. Moreover, this result generalizes to
fat objects of comparable sizes in any fixed dimension , where the
approximation ratio depends on the dimension and the fatness parameter. Our
main result is that for a fully dynamic set of disks of arbitrary radii in the
plane, an -approximate maximum independent set can be maintained in
polylogarithmic expected amortized update time.
Comment: Abstract is shortened to meet arXiv's requirement on the number of
characters.
Algorithms and Data Structures for Geometric Intersection Query Problems
University of Minnesota Ph.D. dissertation. September 2017. Major: Computer Science. Advisor: Ravi Janardan. 1 computer file (PDF); xi, 126 pages.
The focus of this thesis is the topic of geometric intersection queries (GIQ), which has been very well studied by the computational geometry community and the database community. In a GIQ problem, the user is not interested in the entire input geometric dataset, but only in a small subset of it, and requests an informative summary of that small subset of data. Formally, the goal is to preprocess a set A of n geometric objects into a data structure so that, given a query geometric object q, a certain aggregation function can be applied efficiently on the objects of A intersecting q. The classical aggregation functions studied in the literature are reporting or counting the objects of A intersecting q. In many applications, the same set A is queried several times, in which case one would like to answer a query faster by preprocessing A into a data structure. The goal is to organize the data into a data structure which occupies a small amount of space and yet responds to any user query in real time. In this thesis the study of the GIQ problems was conducted from the point of view of a computational geometry researcher. Given a model of computation and a GIQ problem, what are the best possible upper bounds (resp., lower bounds) on the space and the query time that can be achieved by a data structure? Also, what is the relative hardness of various GIQ problems and aggregate functions? Here relative hardness means that given two GIQ problems A and B (or two aggregate functions f(A, q) and g(A, q)), which of them can be answered faster by a computer (assuming data structures for both of them occupy asymptotically the same amount of space)? This thesis presents results which increase our understanding of the above questions.
For many GIQ problems, data structures with optimal (or near-optimal) space and query time bounds have been achieved. The geometric settings studied are primarily orthogonal range searching, where the input is points and the query is an axes-aligned rectangle, and the dual setting of rectangle stabbing, where the input is a set of axes-aligned rectangles and the query is a point. The aggregation functions studied are primarily reporting, top-k, and approximate counting. Most of the data structures are built for the internal memory model (word-RAM or pointer machine model), but in some settings they are generic enough to be efficient in the I/O-model as well.
Computing Volumes and Convex Hulls: Variations and Extensions
Geometric techniques are frequently utilized to analyze and reason about multi-dimensional data. When confronted with large quantities of such data, simplifying geometric statistics or summaries are often a necessary first step. In this thesis, we make contributions to two such fundamental concepts of computational geometry: Klee's Measure and Convex Hulls. The former is concerned with computing the total volume occupied by a set of overlapping rectangular boxes in d-dimensional space, while the latter is concerned with identifying extreme vertices in a multi-dimensional set of points. Both problems are frequently used to analyze optimal solutions to multi-objective optimization problems: a variant of Klee's problem called the Hypervolume Indicator gives a quantitative measure for the quality of a discrete Pareto Optimal set, while the Convex Hull represents the subset of solutions that are optimal with respect to at least one linear optimization function.
In the first part of the thesis, we investigate several practical and natural variations of Klee's Measure Problem. We develop a specialized algorithm for a specific case of Klee's problem called the "grounded" case, which also solves the Hypervolume Indicator problem faster than any earlier solution for certain dimensions. Next, we extend Klee's problem to an uncertainty setting where the existence of the input boxes is defined probabilistically, and study computing the expectation of the volume. Additionally, we develop efficient algorithms for a discrete version of the problem, where the volume of a box is redefined to be the cardinality of its overlap with a given point set.
The second part of the thesis investigates the convex hull problem on uncertain input. To this end, we examine two probabilistic uncertainty models for point sets. The first model incorporates uncertainty in the existence of the input points. The second model extends the first one by incorporating locational uncertainty.
For both models, we study the problem of computing the probability that a given point is contained in the convex hull of the uncertain points. We also consider the problem of finding the most likely convex hull, i.e., the mode of the convex hull random variable.
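As a rough illustration of Klee's Measure Problem as described in the abstract above, the following sketch computes the area of the union of axis-aligned rectangles in two dimensions. It is a plain coordinate-compression baseline, not the thesis's algorithm; the names `Box` and `union_area` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

# Hypothetical box type for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def union_area(boxes: List[Box]) -> float:
    """Area of the union of axis-aligned boxes (2-D Klee's measure).

    Naive coordinate compression: the distinct x and y coordinates cut the
    plane into grid cells, and every cell is either fully inside or fully
    outside each box. Runs in O(n^3) time, so it is only a baseline.
    """
    xs = sorted({x for x1, _, x2, _ in boxes for x in (x1, x2)})
    ys = sorted({y for _, y1, _, y2 in boxes for y in (y1, y2)})
    area = 0.0
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            # Count the cell once if at least one box covers it.
            covered = any(
                x1 <= xs[i] and xs[i + 1] <= x2 and y1 <= ys[j] and ys[j + 1] <= y2
                for x1, y1, x2, y2 in boxes
            )
            if covered:
                area += (xs[i + 1] - xs[i]) * (ys[j + 1] - ys[j])
    return area
```

For example, two 2x2 squares that overlap in a unit square, `union_area([(0, 0, 2, 2), (1, 1, 3, 3)])`, cover an area of 7 rather than 8, because the overlap is counted only once.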
Acceleration of Computational Geometry Algorithms for High Performance Computing Based Geo-Spatial Big Data Analysis
Geo-spatial computing and data analysis is the branch of computer science that deals with real-world location-based data. Computational geometry algorithms, which process geometry and shapes, are one of the pillars of geo-spatial computing. Real-world map and location-based data can be huge, and the data structures used to process them extremely large, leading to huge computational costs. Furthermore, geo-spatial datasets are growing on all V's (Volume, Variety, Value, etc.) and are becoming larger and more complex to process, in turn demanding more computational resources. High Performance Computing is a way to break down the problem so that it can run in parallel on big computers with massive processing power, and hence reduce the computing time, delivering the same results but much faster.
This dissertation explores different techniques to accelerate the processing of computational geometry algorithms and geo-spatial computing, such as using many-core Graphics Processing Units (GPU), multi-core Central Processing Units (CPU), multi-node setups with Message Passing Interface (MPI), cache optimizations, memory and communication optimizations, load balancing, algorithmic modifications, directive-based parallelization with OpenMP or OpenACC, and vectorization with compiler intrinsics (AVX). This dissertation applies at least one of these techniques to each of the following problems. A novel method to parallelize plane-sweep-based geometric intersection for GPUs with directives is presented. Parallelization of plane-sweep-based Voronoi construction, segment tree construction, segment tree queries, and segment-tree-based operations is presented. Spatial autocorrelation and the computation of Getis-Ord hotspots are also presented. Acceleration performance and speedup results are presented in each corresponding chapter.
Efficient bulk-loading methods for temporal and multidimensional index structures
Nearly all areas of the natural sciences benefit from the latest methods for analyzing and processing large volumes of data. These methods presuppose efficient processing of spatial and temporal data, since time and position are important attributes of much of this data.
Efficient query processing is made possible in particular by the use of index structures.
This thesis focuses on two index structures: the multiversion B-tree
(MVBT) and the R-tree. The first structure is used to manage temporal data,
the second to index multidimensional rectangle data.
Constantly and rapidly growing data volumes pose a major challenge for computer science.
Building and updating indexes with conventional methods (record
by record) is no longer efficient. To enable timely and cost-efficient data processing,
methods for fast loading of index structures are urgently needed.
In the first part of the thesis, we address the question of whether there is a method for loading an MVBT
that has the same I/O complexity as external sorting. Until now this
question has remained unanswered. In this thesis we developed a new construction method and
showed that it has the same time complexity as external sorting. To this end
we employed two algorithmic techniques: weight balancing and buffer trees. Our
experiments show that the result is not only of theoretical significance.
In the second part of the thesis, we address the question of whether and how statistical information
about spatial queries can be exploited to improve the query performance of R-trees.
Our new method uses information such as the aspect ratio and side lengths
of a representative query rectangle to build a good R-tree with respect to a commonly used
cost model. If this information is not available, we optimize
R-trees with respect to the sum of the volumes of the minimum bounding rectangles of the leaf nodes.
Since the problem of building optimal R-trees with respect to this cost measure
is NP-hard, we first reduce the problem to a one-dimensional partitioning problem
by sorting the data according to optimized space-filling curves. We then solve
this problem using dynamic programming. The I/O complexity of the
method equals that of external sorting, since the I/O running time of the method is dominated by the
running time of sorting.
In the last part of the thesis, we applied the developed partitioning methods to the construction
of spatial histograms, since these, like R-trees, produce a disjoint partitioning
of space. Results of extensive experiments show that, using
the new partitioning techniques, both R-trees with better query performance and
spatial histograms with better estimation quality can be generated compared to competing
methods.
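The reduction described in the abstract above (order the rectangles along a space-filling curve, then solve a one-dimensional partitioning problem by dynamic programming) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's method: a plain Z-order (Morton) key stands in for the "optimized" space-filling curves, coordinates are assumed to be non-negative integers, the cost is the sum of leaf bounding-rectangle areas in 2-D, and the names `morton`, `partition_leaves`, `capacity`, and `min_fill` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical rectangle type: (x1, y1, x2, y2), non-negative integer coordinates.
Rect = Tuple[int, int, int, int]


def morton(x: int, y: int, bits: int = 16) -> int:
    """Z-order key: interleave the low `bits` bits of x and y."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def partition_leaves(rects: List[Rect], capacity: int, min_fill: int = 1):
    """Split curve-ordered rectangles into consecutive leaves of size
    min_fill..capacity, minimizing the sum of the leaves' bounding-box areas."""
    # Step 1: linearize the rectangles by the Z-order key of their centers.
    order = sorted(rects, key=lambda r: morton((r[0] + r[2]) // 2, (r[1] + r[3]) // 2))
    n = len(order)
    inf = float("inf")
    best = [0.0] + [inf] * n  # best[i]: minimum cost of packing the first i rectangles
    cut = [0] * (n + 1)       # cut[i]: start index of the last leaf in that packing
    # Step 2: dynamic programming over the position of the last cut.
    for i in range(1, n + 1):
        x1 = y1 = inf
        x2 = y2 = -inf
        for j in range(i - 1, max(i - capacity, 0) - 1, -1):
            r = order[j]
            # Grow the bounding box of the candidate leaf order[j:i].
            x1, y1 = min(x1, r[0]), min(y1, r[1])
            x2, y2 = max(x2, r[2]), max(y2, r[3])
            if i - j < min_fill:
                continue
            cost = best[j] + (x2 - x1) * (y2 - y1)
            if cost < best[i]:
                best[i], cut[i] = cost, j
    if best[n] == inf:
        raise ValueError("no partition satisfies the leaf size bounds")
    # Step 3: recover the leaves by walking the cuts backwards.
    leaves, i = [], n
    while i > 0:
        leaves.append(order[cut[i]:i])
        i = cut[i]
    leaves.reverse()
    return best[n], leaves
```

A lower bound on leaf occupancy (`min_fill`) is included because, without it, very small leaves can trivially minimize the area sum. The dynamic program itself takes O(n * capacity) time after sorting, which is consistent with the abstract's remark that sorting dominates the cost, although this in-memory sketch says nothing about I/O behavior.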
Dynamic Smooth Compressed Quadtrees
We introduce dynamic smooth (a.k.a. balanced) compressed quadtrees with worst-case constant time updates in constant dimensions. We distinguish two versions of the problem. First, we show that quadtrees as a space-division data structure can be made smooth and dynamic subject to split and merge operations on the quadtree cells. Second, we show that quadtrees used to store a set of points in R^d can be made smooth and dynamic subject to insertions and deletions of points. The second version uses the first but must additionally deal with compression and alignment of quadtree components. In both cases our updates take 2^{O(d log d)} time, except for the point location part in the second version, which has a lower bound of Omega(log n); but if a pointer (finger) to the correct quadtree cell is given, the rest of the updates take worst-case constant time. Our result implies that several classic and recent results (ranging from ray tracing to planar point location) in computational geometry which use quadtrees can deal with arbitrary point sets on a real RAM pointer machine.