A fast implementation of near neighbors queries for Fr\'echet distance (GIS Cup)
This paper describes an implementation of fast near-neighbours queries (also
known as range searching) with respect to the Fr\'echet distance. The algorithm
is designed to be efficient on practical data such as GPS trajectories. Our
approach is to use a quadtree data structure to enumerate all curves in the
database that have similar start and endpoints as the query curve. On these
curves we run positive and negative filters to narrow the set of potential
results. Only for those trajectories where these heuristics fail, we compute
the Fr\'echet distance exactly, by running a novel recursive variant of the
classic free-space diagram algorithm.
Our implementation won the ACM SIGSPATIAL GIS Cup 2017.
Comment: ACM SIGSPATIAL'17 invited paper. 9 pages
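The filter-then-verify idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a linear scan stands in for the quadtree, the negative filter is only the endpoint test (the Fr\'echet distance is at least the distance between the two start points and between the two end points), and the positive filter is the discrete Fr\'echet distance, which upper-bounds the continuous one. The names `filter_candidates` and `discrete_frechet` are hypothetical.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Frechet distance of two point sequences, O(len(p) * len(q)) time."""
    prev = []
    for i, a in enumerate(p):
        cur = []
        for j, b in enumerate(q):
            d = dist(a, b)
            if i == 0 and j == 0:
                cur.append(d)
            elif i == 0:
                cur.append(max(cur[j - 1], d))
            elif j == 0:
                cur.append(max(prev[j], d))
            else:
                cur.append(max(min(prev[j], prev[j - 1], cur[j - 1]), d))
        prev = cur
    return prev[-1]


def filter_candidates(query, curves, delta):
    """Split curve indices into (accepted, undecided); everything else is rejected.

    Assumption: a linear scan replaces the quadtree lookup from the abstract.
    """
    accepted, undecided = [], []
    for idx, curve in enumerate(curves):
        # Negative filter: start and end points must be matched to each other,
        # so either endpoint distance exceeding delta rules the curve out.
        if dist(curve[0], query[0]) > delta or dist(curve[-1], query[-1]) > delta:
            continue
        # Positive filter: the discrete distance upper-bounds the continuous one.
        if discrete_frechet(curve, query) <= delta:
            accepted.append(idx)
        else:
            # Both filters failed; an exact continuous computation is needed here.
            undecided.append(idx)
    return accepted, undecided
```

Only the indices returned in `undecided` would have to be passed to an exact decision procedure such as the free-space diagram algorithm mentioned in the abstract.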
Why walking the dog takes time: Frechet distance has no strongly subquadratic algorithms unless SETH fails
The Frechet distance is a well-studied and very popular measure of similarity
of two curves. Many variants and extensions have been studied since Alt and
Godau introduced this measure to computational geometry in 1991. Their original
algorithm to compute the Frechet distance of two polygonal curves with n
vertices has a runtime of O(n^2 log n). More than 20 years later, the state of
the art algorithms for most variants still take time more than O(n^2 / log n),
but no matching lower bounds are known, not even under reasonable complexity
theoretic assumptions.
To obtain a conditional lower bound, in this paper we assume the Strong
Exponential Time Hypothesis or, more precisely, that there is no
O*((2-delta)^N) algorithm for CNF-SAT for any delta > 0. Under this assumption
we show that the Frechet distance cannot be computed in strongly subquadratic
time, i.e., in time O(n^{2-delta}) for any delta > 0. This means that finding
faster algorithms for the Frechet distance is as hard as finding faster CNF-SAT
algorithms, and the existence of a strongly subquadratic algorithm can be
considered unlikely.
Our result holds for both the continuous and the discrete Frechet distance.
We extend the main result in various directions. Based on the same assumption
we (1) show non-existence of a strongly subquadratic 1.001-approximation, (2)
present tight lower bounds in case the numbers of vertices of the two curves
are imbalanced, and (3) examine realistic input assumptions (c-packed curves).
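To make the quadratic behaviour concrete, here is a minimal sketch of the textbook decision procedure for the discrete Frechet distance. It fills a reachability table with one cell per pair of vertices, which is where the n^2 term comes from. The function name `discrete_frechet_at_most` is hypothetical, and the code is the standard dynamic program, not a construction from the paper.

```python
from math import dist


def discrete_frechet_at_most(p, q, delta):
    """Decide whether the discrete Frechet distance of p and q is at most delta.

    reach[i][j] is True when the prefixes p[:i+1] and q[:j+1] can be traversed
    monotonically with every matched pair of vertices within distance delta.
    """
    n, m = len(p), len(q)
    reach = [[False] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if dist(p[i], q[j]) > delta:
                continue  # this cell is blocked
            if i == 0 and j == 0:
                reach[i][j] = True
            else:
                reach[i][j] = (
                    (i > 0 and reach[i - 1][j])
                    or (j > 0 and reach[i][j - 1])
                    or (i > 0 and j > 0 and reach[i - 1][j - 1])
                )
    return reach[n - 1][m - 1]
```

The table has n * m cells and each is filled in constant time, so the sketch runs in quadratic time; the conditional lower bound in the abstract says that no algorithm can beat this by a polynomial factor unless SETH fails.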
Improved approximation for Fr\'echet distance on c-packed curves matching conditional lower bounds
The Fr\'echet distance is a well-studied and very popular measure of
similarity of two curves. The best known algorithms have quadratic time
complexity, which has recently been shown to be optimal assuming the Strong
Exponential Time Hypothesis (SETH) [Bringmann FOCS'14].
To overcome the worst-case quadratic time barrier, restricted classes of
curves have been studied that attempt to capture realistic input curves. The
most popular such class are c-packed curves, for which the Fr\'echet distance
has a -approximation in time [Driemel
et al. DCG'12]. In dimension this cannot be improved to
for any unless SETH fails
[Bringmann FOCS'14].
In this paper, exploiting properties that prevent stronger lower bounds, we
present an improved algorithm with runtime .
This is optimal in high dimensions apart from lower order factors unless SETH
fails. Our main new ingredients are as follows: For filling the classical
free-space diagram we project short subcurves onto a line, which yields
one-dimensional separated curves with roughly the same pairwise distances
between vertices. Then we tackle this special case in near-linear time by
carefully extending a greedy algorithm for the Fr\'echet distance of
one-dimensional separated curves
Four Soviets Walk the Dog - Improved Bounds for Computing the Fr\'echet Distance
Given two polygonal curves in the plane, there are many ways to define a
notion of similarity between them. One popular measure is the Fr\'echet
distance. Since it was proposed by Alt and Godau in 1992, many variants and
extensions have been studied. Nonetheless, even more than 20 years later, the
original algorithm by Alt and Godau for computing the Fr\'echet
distance remains the state of the art (here, denotes the number of edges on
each curve). This has led Helmut Alt to conjecture that the associated decision
problem is 3SUM-hard.
In recent work, Agarwal et al. show how to break the quadratic barrier for
the discrete version of the Fr\'echet distance, where one considers sequences
of points instead of polygonal curves. Building on their work, we give a
randomized algorithm to compute the Fr\'echet distance between two polygonal
curves in time on a pointer machine
and in time on a word RAM. Furthermore, we show that
there exists an algebraic decision tree for the decision problem of depth
, for some . We believe that this
reveals an intriguing new aspect of this well-studied problem. Finally, we show
how to obtain the first subquadratic algorithm for computing the weak Fr\'echet
distance on a word RAM.
Comment: 34 pages, 15 figures. A preliminary version appeared in SODA 201
Approximating -center clustering for curves
The Euclidean -center problem is a classical problem that has been
extensively studied in computer science. Given a set of
points in Euclidean space, the problem is to determine a set of
centers (not necessarily part of ) such that the maximum
distance between a point in and its nearest neighbor in
is minimized. In this paper we study the corresponding
-center problem for polygonal curves under the Fr\'echet distance,
that is, given a set of polygonal curves in ,
each of complexity , determine a set of polygonal curves
in , each of complexity , such that the maximum Fr\'echet
distance of a curve in to its closest curve in is
minimized. In this paper, we substantially extend and improve the known
approximation bounds for curves in dimension and higher. We show that, if
is part of the input, then there is no polynomial-time approximation
scheme unless . Our constructions yield different
bounds for one and two-dimensional curves and the discrete and continuous
Fr\'echet distance. In the case of the discrete Fr\'echet distance on
two-dimensional curves, we show hardness of approximation within a factor close
to . This result also holds when , and the -hardness
extends to the case that , i.e., for the problem of computing the
minimum-enclosing ball under the Fr\'echet distance. Finally, we observe that a
careful adaptation of Gonzalez' algorithm in combination with a curve
simplification yields a -approximation in any dimension, provided that an
optimal simplification can be computed exactly. We conclude that our
approximation bounds are close to being tight.
Comment: 24 pages; results on minimum-enclosing ball added, additional author
added, general revision
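For orientation, the following is a minimal sketch of Gonzalez' farthest-first greedy for center clustering in an arbitrary metric space, where it is known to give a 2-approximation. It is not the adaptation described in the abstract: the curve simplification step is omitted, centers are chosen among the input items, and the `distance` argument is an assumption standing in for a Fr\'echet distance routine. The name `gonzalez_centers` is hypothetical.

```python
from math import dist


def gonzalez_centers(items, k, distance):
    """Greedy farthest-first traversal.

    Returns (indices of chosen centers, resulting clustering radius).
    `distance` may be any metric on the items, e.g. a Frechet distance on curves.
    """
    centers = [0]  # arbitrary first center
    # nearest[i] = distance from item i to its closest chosen center
    nearest = [distance(items[0], x) for x in items]
    while len(centers) < min(k, len(items)):
        # Pick the item that is currently farthest from all chosen centers.
        far = max(range(len(items)), key=nearest.__getitem__)
        centers.append(far)
        for i, x in enumerate(items):
            nearest[i] = min(nearest[i], distance(items[far], x))
    return centers, max(nearest)


# Usage on plain points with the Euclidean metric:
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
chosen, radius = gonzalez_centers(points, 2, dist)
```

Passing a curve distance as `distance` turns the same loop into a clustering of curves; restricting the complexity of the centers, as the abstract requires, would need an additional simplification step that this sketch does not attempt.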
Fr\'echet Distance for Uncertain Curves
In this paper we study a wide range of variants for computing the (discrete
and continuous) Fr\'echet distance between uncertain curves. We define an
uncertain curve as a sequence of uncertainty regions, where each region is a
disk, a line segment, or a set of points. A realisation of a curve is a
polyline connecting one point from each region. Given an uncertain curve and a
second (certain or uncertain) curve, we seek to compute the lower and upper
bound Fr\'echet distance, which are the minimum and maximum Fr\'echet distance
for any realisations of the curves.
We prove that both the upper and lower bound problems are NP-hard for the
continuous Fr\'echet distance in several uncertainty models, and that the upper
bound problem remains hard for the discrete Fr\'echet distance. In contrast,
the lower bound (discrete and continuous) Fr\'echet distance can be computed in
polynomial time. Furthermore, we show that computing the expected discrete
Fr\'echet distance is #P-hard when the uncertainty regions are modelled as
point sets or line segments. The construction also extends to show #P-hardness
for computing the continuous Fr\'echet distance when regions are modelled as
point sets.
On the positive side, we argue that in any constant dimension there is a
FPTAS for the lower bound problem when is polynomially
bounded, where is the Fr\'echet distance and bounds the
diameter of the regions. We then argue there is a near-linear-time
3-approximation for the decision problem when the regions are convex and
roughly -separated. Finally, we also study the setting with
Sakoe--Chiba time bands, where we restrict the alignment between the two
curves, and give polynomial-time algorithms for upper bound and expected
discrete and continuous Fr\'echet distance for uncertainty regions modelled as
point sets.
Comment: 48 pages, 11 figures. This is the full version of the paper to be
published in ICALP 202
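The Sakoe--Chiba restriction can be sketched on ordinary (certain) curves: the usual dynamic program for the discrete Fr\'echet distance is limited to cells (i, j) with |i - j| at most the band width. This is only an illustration of the band constraint, not one of the paper's algorithms for uncertain curves; the name `banded_discrete_frechet` and the parameter `band` are hypothetical.

```python
from math import dist, inf


def banded_discrete_frechet(p, q, band):
    """Discrete Frechet distance using only alignments with |i - j| <= band.

    Returns inf when no alignment fits inside the band.
    """
    n, m = len(p), len(q)
    table = [[inf] * m for _ in range(n)]
    for i in range(n):
        # Only visit the columns inside the band around the diagonal.
        for j in range(max(0, i - band), min(m, i + band + 1)):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d
                continue
            best = inf
            if i > 0:
                best = min(best, table[i - 1][j])
            if j > 0:
                best = min(best, table[i][j - 1])
            if i > 0 and j > 0:
                best = min(best, table[i - 1][j - 1])
            table[i][j] = max(best, d)
    return table[n - 1][m - 1]
```

A narrow band visits fewer cells but can exclude the best alignment, so the banded value is never smaller than the unrestricted discrete Fr\'echet distance.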
Latitude, longitude, and beyond: mining mobile objects' behavior
Rapid advances in Micro-Electro-Mechanical Systems (MEMS) and wireless communications have resulted in a surge in data generation. Mobility data, which is collected ubiquitously by location-sensing devices, is one such form of data. Extensive knowledge about the behavior of humans and wildlife is buried in raw mobility data. This knowledge can enable numerous applications, ranging from wildlife movement analysis to location-based recommendation systems, urban planning, and disaster relief. In this thesis, we therefore focus on data analytics for understanding the behavior and interaction of mobile entities (humans and animals). The main research question is: How can behaviors and interactions of mobile entities be determined accurately and efficiently from mobility data acquired by (mobile) wireless sensor nodes? To answer this question, the thesis considers both application requirements and technological constraints. On the one hand, application requirements call for accurate data analytics that uncover hidden information about the individual behavior and social interaction of mobile entities and that deal with the uncertainties in mobility data. Technological constraints, on the other hand, require these data analytics to be efficient in terms of energy consumption and to have a low memory footprint and low processing complexity.