475 research outputs found

    Approximating Dynamic Time Warping and Edit Distance for a Pair of Point Sequences

    Get PDF
    We give the first subquadratic-time approximation schemes for dynamic time warping (DTW) and edit distance (ED) of several natural families of point sequences in Rd\mathbb{R}^d, for any fixed d1d \ge 1. In particular, our algorithms compute (1+ε)(1+\varepsilon)-approximations of DTW and ED in time near-linear for point sequences drawn from k-packed or k-bounded curves, and subquadratic for backbone sequences. Roughly speaking, a curve is κ\kappa-packed if the length of its intersection with any ball of radius rr is at most κr\kappa \cdot r, and a curve is κ\kappa-bounded if the sub-curve between two curve points does not go too far from the two points compared to the distance between the two points. In backbone sequences, consecutive points are spaced at approximately equal distances apart, and no two points lie very close together. Recent results suggest that a subquadratic algorithm for DTW or ED is unlikely for an arbitrary pair of point sequences even for d=1d=1. Our algorithms work by constructing a small set of rectangular regions that cover the entries of the dynamic programming table commonly used for these distance measures. The weights of entries inside each rectangle are roughly the same, so we are able to use efficient procedures to approximately compute the cheapest paths through these rectangles

    Approximating the Geometric Edit Distance

    Get PDF
    Edit distance is a measurement of similarity between two sequences such as strings, point sequences, or polygonal curves. Many matching problems from a variety of areas, such as signal analysis, bioinformatics, etc., need to be solved in a geometric space. Therefore, the geometric edit distance (GED) has been studied. In this paper, we describe the first strictly sublinear approximate near-linear time algorithm for computing the GED of two point sequences in constant dimensional Euclidean space. Specifically, we present a randomized O(n log^2n) time O(sqrt n)-approximation algorithm. Then, we generalize our result to give a randomized alpha-approximation algorithm for any alpha in [1, sqrt n], running in time O~(n^2/alpha^2). Both algorithms are Monte Carlo and return approximately optimal solutions with high probability

    Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings

    Get PDF

    Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation

    Get PDF
    Dynamic time warping distance (DTW) is a widely used distance measure between time series. The best known algorithms for computing DTW run in near quadratic time, and conditional lower bounds prohibit the existence of significantly faster algorithms. The lower bounds do not prevent a faster algorithm for the special case in which the DTW is small, however. For an arbitrary metric space Σ\Sigma with distances normalized so that the smallest non-zero distance is one, we present an algorithm which computes dtw(x,y)\operatorname{dtw}(x, y) for two strings xx and yy over Σ\Sigma in time O(ndtw(x,y))O(n \cdot \operatorname{dtw}(x, y)). We also present an approximation algorithm which computes dtw(x,y)\operatorname{dtw}(x, y) within a factor of O(nϵ)O(n^\epsilon) in time O~(n2ϵ)\tilde{O}(n^{2 - \epsilon}) for 0<ϵ<10 < \epsilon < 1. The algorithm allows for the strings xx and yy to be taken over an arbitrary well-separated tree metric with logarithmic depth and at most exponential aspect ratio. Extending our techniques further, we also obtain the first approximation algorithm for edit distance to work with characters taken from an arbitrary metric space, providing an nϵn^\epsilon-approximation in time O~(n2ϵ)\tilde{O}(n^{2 - \epsilon}), with high probability. Additionally, we present a simple reduction from computing edit distance to computing DTW. Applying our reduction to a conditional lower bound of Bringmann and K\"unnemann pertaining to edit distance over {0,1}\{0, 1\}, we obtain a conditional lower bound for computing DTW over a three letter alphabet (with distances of zero and one). This improves on a previous result of Abboud, Backurs, and Williams. With a similar approach, we prove a reduction from computing edit distance to computing longest LCS length. This means that one can recover conditional lower bounds for LCS directly from those for edit distance, which was not previously thought to be the case

    Dynamic Time Warping and Geometric Edit Distance: Breaking the Quadratic Barrier

    Get PDF
    Dynamic Time Warping (DTW) and Geometric Edit Distance (GED) are basic similarity measures between curves or general temporal sequences (e.g., time series) that are represented as sequences of points in some metric space (X, dist). The DTW and GED measures are massively used in various fields of computer science and computational biology, consequently, the tasks of computing these measures are among the core problems in P. Despite extensive efforts to find more efficient algorithms, the best-known algorithms for computing the DTW or GED between two sequences of points in X = R^d are long-standing dynamic programming algorithms that require quadratic runtime, even for the one-dimensional case d = 1, which is perhaps one of the most used in practice. In this paper, we break the nearly 50 years old quadratic time bound for computing DTW or GED between two sequences of n points in R, by presenting deterministic algorithms that run in O( n^2 log log log n / log log n ) time. Our algorithms can be extended to work also for higher dimensional spaces R^d, for any constant d, when the underlying distance-metric dist is polyhedral (e.g., L_1, L_infty)

    The One-Way Communication Complexity of Dynamic Time Warping Distance

    Get PDF
    We resolve the randomized one-way communication complexity of Dynamic Time Warping (DTW) distance. We show that there is an efficient one-way communication protocol using O~(n/alpha) bits for the problem of computing an alpha-approximation for DTW between strings x and y of length n, and we prove a lower bound of Omega(n / alpha) bits for the same problem. Our communication protocol works for strings over an arbitrary metric of polynomial size and aspect ratio, and we optimize the logarithmic factors depending on properties of the underlying metric, such as when the points are low-dimensional integer vectors equipped with various metrics or have bounded doubling dimension. We also consider linear sketches of DTW, showing that such sketches must have size Omega(n)

    Probabilistic embeddings of the Fr\'echet distance

    Full text link
    The Fr\'echet distance is a popular distance measure for curves which naturally lends itself to fundamental computational tasks, such as clustering, nearest-neighbor searching, and spherical range searching in the corresponding metric space. However, its inherent complexity poses considerable computational challenges in practice. To address this problem we study distortion of the probabilistic embedding that results from projecting the curves to a randomly chosen line. Such an embedding could be used in combination with, e.g. locality-sensitive hashing. We show that in the worst case and under reasonable assumptions, the discrete Fr\'echet distance between two polygonal curves of complexity tt in Rd\mathbb{R}^d, where d{2,3,4,5}d\in\lbrace 2,3,4,5\rbrace, degrades by a factor linear in tt with constant probability. We show upper and lower bounds on the distortion. We also evaluate our findings empirically on a benchmark data set. The preliminary experimental results stand in stark contrast with our lower bounds. They indicate that highly distorted projections happen very rarely in practice, and only for strongly conditioned input curves. Keywords: Fr\'echet distance, metric embeddings, random projectionsComment: 27 pages, 11 figure
    corecore