12 research outputs found

    A fast implementation of near neighbors queries for Fr\'echet distance (GIS Cup)

    This paper describes an implementation of fast near-neighbours queries (also known as range searching) with respect to the Fr\'echet distance. The algorithm is designed to be efficient on practical data such as GPS trajectories. Our approach is to use a quadtree data structure to enumerate all curves in the database that have similar start and endpoints as the query curve. On these curves we run positive and negative filters to narrow the set of potential results. Only for those trajectories where these heuristics fail, we compute the Fr\'echet distance exactly, by running a novel recursive variant of the classic free-space diagram algorithm. Our implementation won the ACM SIGSPATIAL GIS Cup 2017. Comment: ACM SIGSPATIAL'17 invited paper. 9 page
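
    The filtering idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the quadtree with a linear scan and uses only one simple negative filter, based on the fact that any Fr\'echet matching must pair the two start points and the two end points, so the larger of those two distances is a lower bound on the Fr\'echet distance. The function names `endpoint_lower_bound` and `negative_filter` are hypothetical.

```python
import math

def endpoint_lower_bound(p, q):
    """Lower bound on the Frechet distance of curves p and q.

    Every valid matching pairs start with start and end with end,
    so the distance is at least the larger of those two gaps.
    """
    return max(math.dist(p[0], q[0]), math.dist(p[-1], q[-1]))

def negative_filter(candidates, query, radius):
    """Discard curves that provably lie farther than `radius` from `query`.

    Hypothetical helper: survivors still need further filters or an
    exact Frechet computation before they can be reported.
    """
    return [c for c in candidates if endpoint_lower_bound(c, query) <= radius]

query = [(0.0, 0.0), (1.0, 0.0)]
near = [(0.0, 0.5), (1.0, 0.5)]
far = [(5.0, 5.0), (6.0, 5.0)]
survivors = negative_filter([near, far], query, radius=1.0)
```

    Curves that pass this filter are only candidates; the abstract describes further filters and an exact computation for the remaining cases.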

    Why walking the dog takes time: Frechet distance has no strongly subquadratic algorithms unless SETH fails

    The Frechet distance is a well-studied and very popular measure of similarity of two curves. Many variants and extensions have been studied since Alt and Godau introduced this measure to computational geometry in 1991. Their original algorithm to compute the Frechet distance of two polygonal curves with n vertices has a runtime of O(n^2 log n). More than 20 years later, the state of the art algorithms for most variants still take time more than O(n^2 / log n), but no matching lower bounds are known, not even under reasonable complexity theoretic assumptions. To obtain a conditional lower bound, in this paper we assume the Strong Exponential Time Hypothesis or, more precisely, that there is no O*((2-delta)^N) algorithm for CNF-SAT for any delta > 0. Under this assumption we show that the Frechet distance cannot be computed in strongly subquadratic time, i.e., in time O(n^{2-delta}) for any delta > 0. This means that finding faster algorithms for the Frechet distance is as hard as finding faster CNF-SAT algorithms, and the existence of a strongly subquadratic algorithm can be considered unlikely. Our result holds for both the continuous and the discrete Frechet distance. We extend the main result in various directions. Based on the same assumption we (1) show non-existence of a strongly subquadratic 1.001-approximation, (2) present tight lower bounds in case the numbers of vertices of the two curves are imbalanced, and (3) examine realistic input assumptions (c-packed curves)
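
    For reference, the quadratic behaviour discussed in the abstract above can be seen in the textbook dynamic program for the discrete Frechet distance, sketched below. This is the standard O(nm) recurrence, not an algorithm from the listed paper; the function name `discrete_frechet` is chosen here for illustration.

```python
import math

def discrete_frechet(p, q):
    """Discrete Frechet distance of point sequences p and q in O(len(p) * len(q)) time.

    cur[j] holds the best achievable bottleneck distance for a coupling
    of the prefixes p[:i+1] and q[:j+1].
    """
    prev = []
    for i, a in enumerate(p):
        cur = []
        for j, b in enumerate(q):
            d = math.dist(a, b)
            if i == 0 and j == 0:
                best = d
            elif i == 0:
                best = max(cur[j - 1], d)          # only q can advance
            elif j == 0:
                best = max(prev[j], d)             # only p can advance
            else:
                # advance on p, on q, or on both; keep the cheapest predecessor
                best = max(min(prev[j], prev[j - 1], cur[j - 1]), d)
            cur.append(best)
        prev = cur
    return prev[-1]

parallel = discrete_frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)])
detour = discrete_frechet([(0, 0), (2, 0)], [(0, 0), (1, 3), (2, 0)])
```

    The two nested loops visit every pair of vertices, which is the quadratic cost that the conditional lower bound says cannot be improved to strongly subquadratic time unless SETH fails.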

    Improved approximation for Fr\'echet distance on c-packed curves matching conditional lower bounds

    The Fr\'echet distance is a well-studied and very popular measure of similarity of two curves. The best known algorithms have quadratic time complexity, which has recently been shown to be optimal assuming the Strong Exponential Time Hypothesis (SETH) [Bringmann FOCS'14]. To overcome the worst-case quadratic time barrier, restricted classes of curves have been studied that attempt to capture realistic input curves. The most popular such class are c-packed curves, for which the Fr\'echet distance has a $(1+\epsilon)$-approximation in time $\tilde{O}(c n /\epsilon)$ [Driemel et al. DCG'12]. In dimension $d \ge 5$ this cannot be improved to $O((cn/\sqrt{\epsilon})^{1-\delta})$ for any $\delta > 0$ unless SETH fails [Bringmann FOCS'14]. In this paper, exploiting properties that prevent stronger lower bounds, we present an improved algorithm with runtime $\tilde{O}(cn/\sqrt{\epsilon})$. This is optimal in high dimensions apart from lower order factors unless SETH fails. Our main new ingredients are as follows: For filling the classical free-space diagram we project short subcurves onto a line, which yields one-dimensional separated curves with roughly the same pairwise distances between vertices. Then we tackle this special case in near-linear time by carefully extending a greedy algorithm for the Fr\'echet distance of one-dimensional separated curves

    Four Soviets Walk the Dog-Improved Bounds for Computing the Fr\'echet Distance

    Given two polygonal curves in the plane, there are many ways to define a notion of similarity between them. One popular measure is the Fr\'echet distance. Since it was proposed by Alt and Godau in 1992, many variants and extensions have been studied. Nonetheless, even more than 20 years later, the original $O(n^2 \log n)$ algorithm by Alt and Godau for computing the Fr\'echet distance remains the state of the art (here, $n$ denotes the number of edges on each curve). This has led Helmut Alt to conjecture that the associated decision problem is 3SUM-hard. In recent work, Agarwal et al. show how to break the quadratic barrier for the discrete version of the Fr\'echet distance, where one considers sequences of points instead of polygonal curves. Building on their work, we give a randomized algorithm to compute the Fr\'echet distance between two polygonal curves in time $O(n^2 \sqrt{\log n}(\log\log n)^{3/2})$ on a pointer machine and in time $O(n^2(\log\log n)^2)$ on a word RAM. Furthermore, we show that there exists an algebraic decision tree for the decision problem of depth $O(n^{2-\varepsilon})$, for some $\varepsilon > 0$. We believe that this reveals an intriguing new aspect of this well-studied problem. Finally, we show how to obtain the first subquadratic algorithm for computing the weak Fr\'echet distance on a word RAM. Comment: 34 pages, 15 figures. A preliminary version appeared in SODA 201

    Approximating $(k,\ell)$-center clustering for curves

    The Euclidean $k$-center problem is a classical problem that has been extensively studied in computer science. Given a set $\mathcal{G}$ of $n$ points in Euclidean space, the problem is to determine a set $\mathcal{C}$ of $k$ centers (not necessarily part of $\mathcal{G}$) such that the maximum distance between a point in $\mathcal{G}$ and its nearest neighbor in $\mathcal{C}$ is minimized. In this paper we study the corresponding $(k,\ell)$-center problem for polygonal curves under the Fr\'echet distance, that is, given a set $\mathcal{G}$ of $n$ polygonal curves in $\mathbb{R}^d$, each of complexity $m$, determine a set $\mathcal{C}$ of $k$ polygonal curves in $\mathbb{R}^d$, each of complexity $\ell$, such that the maximum Fr\'echet distance of a curve in $\mathcal{G}$ to its closest curve in $\mathcal{C}$ is minimized. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension $2$ and higher. We show that, if $\ell$ is part of the input, then there is no polynomial-time approximation scheme unless $\mathsf{P}=\mathsf{NP}$. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fr\'echet distance. In the case of the discrete Fr\'echet distance on two-dimensional curves, we show hardness of approximation within a factor close to $2.598$. This result also holds when $k=1$, and the $\mathsf{NP}$-hardness extends to the case that $\ell=\infty$, i.e., for the problem of computing the minimum-enclosing ball under the Fr\'echet distance. Finally, we observe that a careful adaptation of Gonzalez' algorithm in combination with a curve simplification yields a $3$-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight. Comment: 24 pages; results on minimum-enclosing ball added, additional author added, general revisio
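
    The Gonzalez algorithm mentioned in the abstract above is a farthest-first traversal. The sketch below shows only that traversal, applied to curves under the discrete Fr\'echet distance. It is a simplified illustration under stated assumptions: it omits the curve simplification step the abstract combines it with, so the chosen centers are input curves of full complexity rather than curves of complexity $\ell$, and it does not claim the $3$-approximation guarantee. The names `discrete_frechet` and `gonzalez_centers` are hypothetical.

```python
import math

def discrete_frechet(p, q):
    """Standard quadratic dynamic program for the discrete Frechet distance."""
    prev = []
    for i, a in enumerate(p):
        cur = []
        for j, b in enumerate(q):
            d = math.dist(a, b)
            if i == 0 and j == 0:
                best = d
            elif i == 0:
                best = max(cur[j - 1], d)
            elif j == 0:
                best = max(prev[j], d)
            else:
                best = max(min(prev[j], prev[j - 1], cur[j - 1]), d)
            cur.append(best)
        prev = cur
    return prev[-1]

def gonzalez_centers(curves, k, dist=discrete_frechet):
    """Farthest-first traversal: returns (center indices, clustering radius).

    Assumption of this sketch: the first center is simply curves[0].
    """
    centers = [0]
    # nearest[i] = distance from curve i to its closest chosen center
    nearest = [dist(curves[0], c) for c in curves]
    while len(centers) < min(k, len(curves)):
        far = max(range(len(curves)), key=nearest.__getitem__)
        centers.append(far)
        nearest = [min(old, dist(curves[far], c)) for old, c in zip(nearest, curves)]
    return centers, max(nearest)

curves = [
    [(0, 0), (1, 0)],
    [(0, 0.1), (1, 0.1)],
    [(0, 10), (1, 10)],
    [(0, 10.2), (1, 10.2)],
]
centers, radius = gonzalez_centers(curves, k=2)
```

    Each round promotes the curve that is currently worst served to a new center, which is the greedy step that the adapted algorithm in the abstract builds on.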

    Fr\'echet Distance for Uncertain Curves

    In this paper we study a wide range of variants for computing the (discrete and continuous) Fr\'echet distance between uncertain curves. We define an uncertain curve as a sequence of uncertainty regions, where each region is a disk, a line segment, or a set of points. A realisation of a curve is a polyline connecting one point from each region. Given an uncertain curve and a second (certain or uncertain) curve, we seek to compute the lower and upper bound Fr\'echet distance, which are the minimum and maximum Fr\'echet distance for any realisations of the curves. We prove that both the upper and lower bound problems are NP-hard for the continuous Fr\'echet distance in several uncertainty models, and that the upper bound problem remains hard for the discrete Fr\'echet distance. In contrast, the lower bound (discrete and continuous) Fr\'echet distance can be computed in polynomial time. Furthermore, we show that computing the expected discrete Fr\'echet distance is #P-hard when the uncertainty regions are modelled as point sets or line segments. The construction also extends to show #P-hardness for computing the continuous Fr\'echet distance when regions are modelled as point sets. On the positive side, we argue that in any constant dimension there is an FPTAS for the lower bound problem when $\Delta / \delta$ is polynomially bounded, where $\delta$ is the Fr\'echet distance and $\Delta$ bounds the diameter of the regions. We then argue there is a near-linear-time 3-approximation for the decision problem when the regions are convex and roughly $\delta$-separated. Finally, we also study the setting with Sakoe--Chiba time bands, where we restrict the alignment between the two curves, and give polynomial-time algorithms for upper bound and expected discrete and continuous Fr\'echet distance for uncertainty regions modelled as point sets. Comment: 48 pages, 11 figures. This is the full version of the paper to be published in ICALP 202

    Latitude, longitude, and beyond: mining mobile objects' behavior

    Rapid advancements in Micro-Electro-Mechanical Systems (MEMS) and wireless communications have resulted in a surge in data generation. Mobility data is one of the various forms of data that are ubiquitously collected by different location sensing devices. Extensive knowledge about the behavior of humans and wildlife is buried in raw mobility data. This knowledge can be used for realizing numerous viable applications, ranging from wildlife movement analysis to various location-based recommendation systems, urban planning, and disaster relief. Against this background, in this thesis we mainly focus on providing data analytics for understanding the behavior and interaction of mobile entities (humans and animals). To this end, the main research question to be addressed is: How can behaviors and interactions of mobile entities be determined from mobility data acquired by (mobile) wireless sensor nodes in an accurate and efficient manner? To answer this question, both application requirements and technological constraints are considered in this thesis. On the one hand, application requirements call for accurate data analytics to uncover hidden information about individual behavior and social interaction of mobile entities, and to deal with the uncertainties in mobility data. Technological constraints, on the other hand, require these data analytics to be efficient in terms of their energy consumption and to have a low memory footprint and low processing complexity