475 research outputs found
Approximating Dynamic Time Warping and Edit Distance for a Pair of Point Sequences
We give the first subquadratic-time approximation schemes for dynamic time
warping (DTW) and edit distance (ED) of several natural families of point
sequences in $\mathbb{R}^d$, for any fixed $d \geq 1$. In particular, our
algorithms compute $(1+\varepsilon)$-approximations of DTW and ED in time
near-linear for point sequences drawn from $\kappa$-packed or $\kappa$-bounded curves, and
subquadratic for backbone sequences. Roughly speaking, a curve is
$\kappa$-packed if the length of its intersection with any ball of radius $r$
is at most $\kappa \cdot r$, and a curve is $\kappa$-bounded if the sub-curve
between two curve points does not go too far from the two points compared to
the distance between the two points. In backbone sequences, consecutive points
are spaced at approximately equal distances apart, and no two points lie very
close together. Recent results suggest that a subquadratic algorithm for DTW or
ED is unlikely for an arbitrary pair of point sequences even for $d = 1$. Our
algorithms work by constructing a small set of rectangular regions that cover
the entries of the dynamic programming table commonly used for these distance
measures. The weights of entries inside each rectangle are roughly the same, so
we are able to use efficient procedures to approximately compute the cheapest
paths through these rectangles.
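For orientation, the dynamic programming table mentioned in the abstract above can be sketched as follows. This is only the classic quadratic-time baseline for DTW, not the paper's rectangle-cover approximation scheme; the function name `dtw` and the use of Euclidean distance as the entry weight are assumptions made for illustration.

```python
import math


def dtw(p, q):
    """Classic O(len(p) * len(q)) dynamic program for DTW.

    p and q are non-empty sequences of points (tuples of equal dimension).
    Entry (i, j) of the table holds the cheapest cost of warping the
    first i points of p onto the first j points of q.
    """
    n, m = len(p), len(q)
    inf = float("inf")
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Weight of this table entry: distance between the paired points
            # (Euclidean distance is an assumption of this sketch).
            weight = math.dist(p[i - 1], q[j - 1])
            # A warping path reaches (i, j) from the left, from below,
            # or diagonally.
            table[i][j] = weight + min(
                table[i - 1][j], table[i][j - 1], table[i - 1][j - 1]
            )
    return table[n][m]
```

Every entry of the table is filled in here, which is where the quadratic running time comes from; the abstract describes covering these entries with a small number of rectangles of roughly uniform weight instead.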
Approximating the Geometric Edit Distance
Edit distance is a measure of similarity between two sequences such as strings, point sequences, or polygonal curves. Many matching problems from a variety of areas, such as signal analysis and bioinformatics, need to be solved in a geometric space. Therefore, the geometric edit distance (GED) has been studied. In this paper, we describe the first near-linear time algorithm for computing the GED of two point sequences in constant-dimensional Euclidean space with a strictly sublinear approximation factor. Specifically, we present a randomized O(n log^2 n) time O(sqrt n)-approximation algorithm. Then, we generalize our result to give a randomized alpha-approximation algorithm for any alpha in [1, sqrt n], running in time O~(n^2/alpha^2). Both algorithms are Monte Carlo and return approximately optimal solutions with high probability.
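As a point of reference for the abstract above, the exact GED can be computed with a quadratic-time dynamic program. The sketch below is that baseline, not the randomized approximation algorithm the paper describes. It assumes the common formulation in which a monotone matching pays the Euclidean distance for each matched pair plus a constant penalty for each unmatched point; the name `ged` and the `gap` parameter are hypothetical.

```python
import math


def ged(p, q, gap=1.0):
    """Exact geometric edit distance by an O(len(p) * len(q)) dynamic program.

    Assumed cost model: each matched pair costs its Euclidean distance,
    and each unmatched point costs the constant `gap`.
    """
    n, m = len(p), len(q)
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Matching a prefix against the empty sequence leaves every point unmatched.
    for i in range(1, n + 1):
        table[i][0] = i * gap
    for j in range(1, m + 1):
        table[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = min(
                table[i - 1][j] + gap,  # leave p[i-1] unmatched
                table[i][j - 1] + gap,  # leave q[j-1] unmatched
                table[i - 1][j - 1] + math.dist(p[i - 1], q[j - 1]),  # match them
            )
    return table[n][m]
```

With `gap=1.0`, two points farther than 2 apart are never matched, since leaving both unmatched is cheaper.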
Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation
Dynamic time warping distance (DTW) is a widely used distance measure between
time series. The best known algorithms for computing DTW run in near quadratic
time, and conditional lower bounds prohibit the existence of significantly
faster algorithms. The lower bounds do not prevent a faster algorithm for the
special case in which the DTW is small, however. For an arbitrary metric space
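$\Sigma$ with distances normalized so that the smallest non-zero distance is
one, we present an algorithm which computes $\operatorname{dtw}(x, y)$ for two
strings $x$ and $y$ over $\Sigma$ in time $O(n \cdot \operatorname{dtw}(x, y))$. We also present an approximation algorithm which computes
$\operatorname{dtw}(x, y)$ within a factor of $O(n^\epsilon)$ in time
$\tilde{O}(n^{2 - \epsilon})$ for $0 < \epsilon < 1$. The algorithm allows for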
the strings $x$ and $y$ to be taken over an arbitrary well-separated tree
metric with logarithmic depth and at most exponential aspect ratio. Extending
our techniques further, we also obtain the first approximation algorithm for
edit distance to work with characters taken from an arbitrary metric space,
providing an $n^\epsilon$-approximation in time $\tilde{O}(n^{2 - \epsilon})$,
with high probability. Additionally, we present a simple reduction from
computing edit distance to computing DTW. Applying our reduction to a
conditional lower bound of Bringmann and Künnemann pertaining to edit
distance over $\{0, 1\}$, we obtain a conditional lower bound for computing DTW
over a three letter alphabet (with distances of zero and one). This improves on
a previous result of Abboud, Backurs, and Williams. With a similar approach, we
prove a reduction from computing edit distance to computing LCS length.
This means that one can recover conditional lower bounds for LCS directly from
those for edit distance, which was not previously thought to be the case.
Dynamic Time Warping and Geometric Edit Distance: Breaking the Quadratic Barrier
Dynamic Time Warping (DTW) and Geometric Edit Distance (GED) are basic similarity measures between curves or general temporal sequences (e.g., time series) that are represented as sequences of points in some metric space (X, dist). The DTW and GED measures are widely used in various fields of computer science and computational biology; consequently, computing these measures is among the core problems in P. Despite extensive efforts to find more efficient algorithms, the best-known algorithms for computing the DTW or GED between two sequences of points in X = R^d are long-standing dynamic programming algorithms that require quadratic runtime, even for the one-dimensional case d = 1, which is perhaps one of the most used in practice.
In this paper, we break the nearly 50-year-old quadratic time bound for computing DTW or GED between two sequences of n points in R by presenting deterministic algorithms that run in O(n^2 log log log n / log log n) time. Our algorithms can also be extended to higher-dimensional spaces R^d, for any constant d, when the underlying distance metric dist is polyhedral (e.g., L_1, L_infty).
The One-Way Communication Complexity of Dynamic Time Warping Distance
We resolve the randomized one-way communication complexity of Dynamic Time Warping (DTW) distance. We show that there is an efficient one-way communication protocol using O~(n/alpha) bits for the problem of computing an alpha-approximation for DTW between strings x and y of length n, and we prove a lower bound of Omega(n/alpha) bits for the same problem. Our communication protocol works for strings over an arbitrary metric of polynomial size and aspect ratio, and we optimize the logarithmic factors depending on properties of the underlying metric, such as when the points are low-dimensional integer vectors equipped with various metrics or have bounded doubling dimension. We also consider linear sketches of DTW, showing that such sketches must have size Omega(n).
Probabilistic embeddings of the Fréchet distance
The Fréchet distance is a popular distance measure for curves which
naturally lends itself to fundamental computational tasks, such as clustering,
nearest-neighbor searching, and spherical range searching in the corresponding
metric space. However, its inherent complexity poses considerable computational
challenges in practice. To address this problem we study distortion of the
probabilistic embedding that results from projecting the curves to a randomly
chosen line. Such an embedding could be used in combination with, e.g.,
locality-sensitive hashing. We show that in the worst case and under reasonable
assumptions, the discrete Fréchet distance between two polygonal curves of
complexity $t$ in $\mathbb{R}^d$, where $d \in \{2, 3, 4, 5\}$, degrades
by a factor linear in $t$ with constant probability. We show upper and lower
bounds on the distortion. We also evaluate our findings empirically on a
benchmark data set. The preliminary experimental results stand in stark
contrast with our lower bounds. They indicate that highly distorted projections
happen very rarely in practice, and only for strongly conditioned input curves.
Keywords: Fréchet distance, metric embeddings, random projections
Comment: 27 pages, 11 figures
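The embedding studied in the abstract above (projecting curves onto a randomly chosen line) can be illustrated with a short sketch. It uses the standard quadratic dynamic program for the discrete Fréchet distance and a Gaussian-normalized random unit direction; the helper names `discrete_frechet`, `random_direction`, and `project` are hypothetical and are not taken from the paper.

```python
import math
import random


def discrete_frechet(p, q):
    """Standard O(len(p) * len(q)) dynamic program for the discrete
    Fréchet distance between two non-empty point sequences."""
    n, m = len(p), len(q)
    inf = float("inf")
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(p[i - 1], q[j - 1])
            # Bottleneck cost: the largest pairwise distance along the
            # best coupling that ends at (i, j).
            table[i][j] = max(
                d, min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
            )
    return table[n][m]


def random_direction(dim, rng):
    """Uniformly random unit vector, obtained by normalizing a Gaussian sample."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def project(curve, direction):
    """Project every vertex onto the line spanned by `direction`.

    Each projected vertex is returned as a 1-tuple so that the result
    is again a point sequence (now in one dimension).
    """
    return [(sum(a * b for a, b in zip(pt, direction)),) for pt in curve]
```

Projection onto a line never increases pairwise distances, so the projected discrete Fréchet distance is at most the original one; the distortion discussed in the abstract is the factor by which it can shrink. For example, `discrete_frechet(project(p, u), project(q, u))` with `u = random_direction(2, random.Random(0))` compares two planar curves `p` and `q` after projection.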