    Probabilistic embeddings of the Fréchet distance

    The Fréchet distance is a popular distance measure for curves which naturally lends itself to fundamental computational tasks, such as clustering, nearest-neighbor searching, and spherical range searching in the corresponding metric space. However, its inherent complexity poses considerable computational challenges in practice. To address this problem we study the distortion of the probabilistic embedding that results from projecting the curves to a randomly chosen line. Such an embedding could be used in combination with, e.g., locality-sensitive hashing. We show that in the worst case and under reasonable assumptions, the discrete Fréchet distance between two polygonal curves of complexity t in ℝ^d, where d ∈ {2, 3, 4, 5}, degrades by a factor linear in t with constant probability. We show upper and lower bounds on the distortion. We also evaluate our findings empirically on a benchmark data set. The preliminary experimental results stand in stark contrast with our lower bounds. They indicate that highly distorted projections happen very rarely in practice, and only for strongly conditioned input curves.
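    As a concrete illustration of the embedding analyzed above, the sketch below projects two polygonal curves onto a uniformly random line and compares the discrete Fréchet distance before and after projection. The synthetic curves and parameter choices are assumptions for demonstration only, not the paper's benchmark setup.

```python
# Minimal sketch: random-line embedding and its distortion of the discrete
# Frechet distance. Synthetic data; not the paper's experimental setup.
import numpy as np

def discrete_frechet(P, Q):
    """Discrete Frechet distance between two point sequences (O(t^2) DP)."""
    n, m = len(P), len(Q)
    D = np.full((n, m), np.inf)
    for i in range(n):
        for j in range(m):
            d = np.linalg.norm(np.atleast_1d(P[i] - Q[j]))
            if i == 0 and j == 0:
                D[i, j] = d
            else:
                prev = min(D[i - 1, j] if i > 0 else np.inf,
                           D[i, j - 1] if j > 0 else np.inf,
                           D[i - 1, j - 1] if i > 0 and j > 0 else np.inf)
                D[i, j] = max(d, prev)
    return D[-1, -1]

def project_to_line(P, direction):
    """Project a curve in R^d onto the line spanned by a unit vector."""
    return P @ direction

rng = np.random.default_rng(0)
d = 3
P = rng.random((20, d))            # two synthetic curves of complexity t = 20
Q = rng.random((20, d))
u = rng.normal(size=d)
u /= np.linalg.norm(u)             # uniformly random direction

original = discrete_frechet(P, Q)
projected = discrete_frechet(project_to_line(P, u), project_to_line(Q, u))
print("distortion (original / projected):", original / projected)
```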

    Jaywalking your dog: computing the Fréchet distance with shortcuts

    The similarity of two polygonal curves can be measured using the Fréchet distance. We introduce the notion of a more robust Fréchet distance, where one is allowed to shortcut between vertices of one of the curves. This is a natural approach for handling noise, in particular batched outliers. We compute a (3 + ε)-approximation to the minimum Fréchet distance over all possible such shortcuts, in near linear time, if the curve is c-packed and the number of shortcuts is either small or unbounded. To facilitate the new algorithm we develop several new tools: (a) a data structure for preprocessing a curve (not necessarily c-packed) that supports (1 + ε)-approximate Fréchet distance queries between a subcurve (of the original curve) and a line segment; (b) a near linear time algorithm that computes a permutation of the vertices of a curve, such that any prefix of 2k − 1 vertices of this permutation forms an optimal approximation (up to a constant factor) to the original curve compared to any polygonal curve with k vertices, for any k > 0; and (c) a data structure for preprocessing a curve that supports approximate Fréchet distance queries between a subcurve and a query polygonal curve. The query time depends quadratically on the complexity of the query curve and only (roughly) logarithmically on the complexity of the original curve. To our knowledge, these are the first data structures to support these kinds of queries efficiently.
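    Tool (b) above orders the vertices of a curve so that short prefixes already yield good simplifications. As an illustration only, the sketch below produces such an insertion order by repeatedly adding the vertex farthest from the current simplification (a Douglas–Peucker-style refinement). This is an assumed stand-in for intuition, not the paper's permutation construction, and it does not carry the paper's approximation guarantee.

```python
# Illustrative refinement order: endpoints first, then always the vertex
# farthest from the current simplification. A prefix of the returned order
# plays the role of a coarse simplification of the curve.
import numpy as np

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab."""
    ab = b - a
    denom = float(ab @ ab)
    t = 0.0 if denom == 0.0 else float(np.clip((p - a) @ ab / denom, 0.0, 1.0))
    return float(np.linalg.norm(p - (a + t * ab)))

def refinement_order(curve):
    """Return vertex indices in insertion order."""
    n = len(curve)
    kept = [0, n - 1]                 # always keep the endpoints
    order = [0, n - 1]
    while len(kept) < n:
        kept.sort()
        best_idx, best_err = None, -1.0
        for a, b in zip(kept, kept[1:]):          # scan each gap
            for i in range(a + 1, b):
                err = point_segment_distance(curve[i], curve[a], curve[b])
                if err > best_err:
                    best_idx, best_err = i, err
        kept.append(best_idx)
        order.append(best_idx)
    return order

curve = np.cumsum(np.random.default_rng(1).normal(size=(50, 2)), axis=0)
print(refinement_order(curve)[:9])    # a prefix of 2k - 1 vertices, here k = 5
```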

    Clustering time series under the Fréchet distance

    The Fréchet distance is a popular distance measure for curves. We study the problem of clustering time series under the Fréchet distance. In particular, we give (1 + ε)-approximation algorithms for variations of the following problem with parameters k and ℓ. Given n univariate time series P, each of complexity at most m, we find k time series, not necessarily from P, which we call cluster centers and which each have complexity at most ℓ, such that (a) the maximum distance of an element of P to its nearest cluster center or (b) the sum of these distances is minimized. Our algorithms have running time near-linear in the input size for constant ε, k, and ℓ. To the best of our knowledge, our algorithms are the first clustering algorithms for the Fréchet distance which achieve an approximation factor of (1 + ε) or better.
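    For intuition about the (k, ℓ)-clustering problem described above, the sketch below runs a plain farthest-first (Gonzalez-style) k-center heuristic under the discrete Fréchet distance and crudely downsamples each chosen center to at most ℓ vertices. It is only a baseline illustration, not the paper's (1 + ε)-approximation algorithm; all function names and parameter choices are assumptions.

```python
# Baseline sketch: farthest-first k-center for univariate time series under
# the discrete Frechet distance, with centers downsampled to <= l vertices.
import numpy as np

def discrete_frechet_1d(p, q):
    """Discrete Frechet distance between two univariate time series."""
    n, m = len(p), len(q)
    D = np.full((n, m), np.inf)
    for i in range(n):
        for j in range(m):
            d = abs(p[i] - q[j])
            if i == 0 and j == 0:
                D[i, j] = d
            else:
                prev = min(D[i - 1, j] if i > 0 else np.inf,
                           D[i, j - 1] if j > 0 else np.inf,
                           D[i - 1, j - 1] if i > 0 and j > 0 else np.inf)
                D[i, j] = max(d, prev)
    return D[-1, -1]

def downsample(series, l):
    """Keep at most l roughly equally spaced vertices of a series."""
    idx = np.linspace(0, len(series) - 1, num=min(l, len(series))).round().astype(int)
    return series[idx]

def k_center(P, k, l):
    """Pick k centers of complexity <= l by farthest-first traversal."""
    centers = [downsample(P[0], l)]
    dist = np.array([discrete_frechet_1d(s, centers[0]) for s in P])
    for _ in range(1, k):
        far = int(dist.argmax())                  # series farthest from its center
        centers.append(downsample(P[far], l))
        dist = np.minimum(dist, [discrete_frechet_1d(s, centers[-1]) for s in P])
    return centers, dist.max()                    # centers and k-center radius

rng = np.random.default_rng(2)
P = [np.cumsum(rng.normal(size=int(rng.integers(30, 60)))) for _ in range(40)]
centers, radius = k_center(P, k=3, l=10)
print("max distance to nearest center:", radius)
```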

    Approximating the Fréchet distance for realistic curves in near linear time

    We present a simple and practical (1 + ε)-approximation algorithm for the Fréchet distance between two polygonal curves in ℝ^d. To analyze this algorithm we introduce a new realistic family of curves, c-packed curves, that is closed under simplification. We believe the notion of c-packed curves to be of independent interest. We show that our algorithm has near linear running time for c-packed polygonal curves, and that similar results hold for other input models, such as low-density polygonal curves.
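    The family of c-packed curves is closed under simplification, and the approximation algorithm operates on simplified curves. The sketch below shows a common greedy μ-simplification (keep the next vertex only once it is at least μ away from the last kept vertex), which yields a curve within Fréchet distance μ of the input; whether this matches the paper's exact simplification step is not asserted here.

```python
# Greedy mu-simplification of a polygonal curve in R^d. The result has
# Frechet distance at most mu to the input curve.
import numpy as np

def greedy_simplify(curve, mu):
    kept = [0]
    for i in range(1, len(curve)):
        if np.linalg.norm(curve[i] - curve[kept[-1]]) >= mu:
            kept.append(i)
    if kept[-1] != len(curve) - 1:      # always keep the last vertex
        kept.append(len(curve) - 1)
    return curve[kept]

curve = np.cumsum(np.random.default_rng(3).normal(size=(1000, 2)), axis=0)
print(len(greedy_simplify(curve, mu=5.0)), "of", len(curve), "vertices kept")
```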

    On the expected complexity of Voronoi diagrams on terrains

    We investigate the combinatorial complexity of geodesic Voronoi diagrams on polyhedral terrains using a probabilistic analysis. Aronov et al. [abt-cbvdrt-08] prove that, if one makes certain realistic input assumptions on the terrain, this complexity is Θ(n + m√n) in the worst case, where n denotes the number of triangles that define the terrain and m denotes the number of Voronoi sites. We prove that under a relaxed set of assumptions the Voronoi diagram has expected complexity O(n + m), given that the sites have a uniform distribution on the domain of the terrain (or the surface of the terrain). Furthermore, we present a worst-case construction of a terrain which implies a lower bound of Ω(nm^{2/3}) on the expected worst-case complexity if these assumptions on the terrain are dropped.

    Computing the Fréchet distance with shortcuts is NP-hard

    We study the shortcut Fréchet distance, a natural variant of the Fréchet distance that allows us to take shortcuts from and to any point along one of the curves. The classic Fréchet distance is a bottleneck distance measure and hence quite sensitive to outliers. The shortcut Fréchet distance allows us to cut across outliers and hence produces more meaningful results when dealing with real-world data. Driemel and Har-Peled recently described approximation algorithms for the restricted case where shortcuts have to start and end at input vertices. We show that, in the general case, the problem of computing the shortcut Fréchet distance is NP-hard. This is the first hardness result for a variant of the Fréchet distance between two polygonal curves in the plane. We also present two algorithms for the decision problem: a 3-approximation algorithm for the general case and an exact algorithm for the vertex-restricted case. Both algorithms run in O(n^3 log n) time.

    klcluster: center-based clustering of trajectories

    Center-based clustering, in particular k-means clustering, is frequently used for point data. Its advantages include that the resulting clustering is often easy to interpret and that the cluster centers provide a compact representation of the data. Recent theoretical advances have been made in generalizing center-based clustering to trajectory data. Building upon these theoretical results, we present practical algorithms for center-based trajectory clustering.

    Segmentation of trajectories for non-monotone criteria

    In the trajectory segmentation problem we are given a polygonal trajectory with n vertices that we have to subdivide into a minimum number of disjoint segments (subtrajectories) that all satisfy a given criterion. The problem is known to be solvable efficiently for monotone criteria: criteria with the property that if they hold on a certain segment, they also hold on every subsegment of that segment [4]. To the best of our knowledge, no theoretical results are known for non-monotone criteria. We present a broader study of the segmentation problem, and suggest a general framework for solving it, based on the start-stop diagram: a 2-dimensional diagram that represents all valid and invalid segments of a given trajectory. This yields two subproblems: (i) computing the start-stop diagram, and (ii) finding the optimal segmentation for a given diagram. We show that (ii) is NP-hard in general. However, we identify properties of the start-stop diagram that make the problem tractable, and give a polynomial-time algorithm for this case. We study in more detail two concrete non-monotone criteria that arise in practical applications. Both are based on a given univariate attribute function f over the domain of the trajectory. We say a segment satisfies an outlier-tolerant criterion if the value of f lies within a certain range for at least a given percentage of the length of the segment. We say a segment satisfies a standard deviation criterion if the standard deviation of f over the length of the segment lies below a given threshold. We show that both criteria satisfy the properties that make the segmentation problem tractable. In particular, we compute an optimal segmentation of a trajectory based on the outlier-tolerant criterion in O(n^2 log n + kn^2) time, and on the standard deviation criterion in O(kn^2) time, where n is the number of vertices of the input trajectory and k is the number of segments in an optimal solution.
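    The start-stop framework above can be illustrated with a small discrete sketch: mark every pair of start and stop vertices whose segment satisfies the criterion, then minimize the number of segments by dynamic programming over the diagram. The sketch uses the standard deviation criterion evaluated on vertex samples rather than on arc length, and it partitions vertices instead of splitting a continuous trajectory; both are simplifying assumptions, and the O(kn^2) algorithm from the abstract is not reproduced here.

```python
# Discrete sketch of the start-stop diagram and a minimum segmentation.
import numpy as np

def start_stop_diagram(f, max_std):
    """valid[i, j] is True iff the standard deviation of f[i..j] is at most
    max_std (vertex-sampled standard deviation criterion)."""
    n = len(f)
    s1 = np.concatenate(([0.0], np.cumsum(f)))
    s2 = np.concatenate(([0.0], np.cumsum(f ** 2)))
    valid = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i, n):
            m = j - i + 1
            mean = (s1[j + 1] - s1[i]) / m
            var = (s2[j + 1] - s2[i]) / m - mean ** 2
            valid[i, j] = var <= max_std ** 2 + 1e-12
    return valid

def min_segmentation(valid):
    """Minimum number of contiguous valid segments that partition the
    vertices 0..n-1 (np.inf if no valid segmentation exists)."""
    n = valid.shape[0]
    best = np.full(n + 1, np.inf)
    best[0] = 0.0                                  # zero vertices, zero segments
    for j in range(1, n + 1):                      # best[j] covers vertices 0..j-1
        for i in range(j):
            if valid[i, j - 1] and best[i] + 1 < best[j]:
                best[j] = best[i] + 1
    return best[n]

rng = np.random.default_rng(4)
# an attribute function with two regimes, e.g. speed along a trajectory
f = np.concatenate([rng.normal(0.0, 0.2, 80), rng.normal(5.0, 0.2, 80)])
valid = start_stop_diagram(f, max_std=0.5)
print("minimum number of segments:", min_segmentation(valid))
```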