Matrix Profile XXVII: A Novel Distance Measure for Comparing Long Time Series
The most useful data mining primitives are distance measures. With an
effective distance measure, it is possible to perform classification,
clustering, anomaly detection, segmentation, etc. For single-event time series,
Euclidean Distance and Dynamic Time Warping distance are known to be extremely
effective. However, for time series containing cyclical behaviors, the semantic
meaningfulness of such comparisons is less clear. For example, on two separate
days the telemetry from an athlete's workout routine might be very similar. The
second day may change the order of push-ups and squats, add
repetitions of pull-ups, or completely omit dumbbell curls. Any of these
minor changes would defeat existing time series distance measures. Some
bag-of-features methods have been proposed to address this problem, but we
argue that in many cases, similarity is intimately tied to the shapes of
subsequences within these longer time series. In such cases, summative features
will lack discrimination ability. In this work we introduce PRCIS, which stands
for Pattern Representation Comparison in Series. PRCIS is a distance measure
for long time series, which exploits recent progress in our ability to
summarize time series with dictionaries. We will demonstrate the utility of our
ideas on diverse tasks and datasets.
Comment: Accepted at IEEE ICKG 2022 (previously entitled IEEE ICBK). Abridged
abstract as per arXiv's requirement.
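
The abstract's workout example can be illustrated with a small sketch. This is not PRCIS and not the authors' method. It is a toy comparison, under assumed data, showing why a pointwise measure such as Euclidean distance reacts strongly to reordered patterns while a comparison based on matching subsequence shapes does not. The two "exercise" shapes, the chunk width, and the helper `order_free_distance` are all hypothetical and chosen only for illustration.

```python
import math

def euclidean(x, y):
    """Pointwise Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Two hypothetical "exercise" shapes, 20 samples each (assumed data).
n = 20
pushups = [math.sin(2 * math.pi * i / n) for i in range(n)]  # one sine cycle
squats = [1.0 if i < n // 2 else -1.0 for i in range(n)]     # square pulse

day1 = pushups + squats  # push-ups, then squats
day2 = squats + pushups  # same exercises, order swapped

def chunked(series, width):
    """Split a series into consecutive, non-overlapping chunks."""
    return [series[i:i + width] for i in range(0, len(series), width)]

def order_free_distance(x, y, width):
    """Toy order-insensitive comparison (hypothetical helper, not PRCIS):
    match each chunk of x to its nearest chunk of y and sum the distances."""
    y_chunks = chunked(y, width)
    return sum(min(euclidean(cx, cy) for cy in y_chunks)
               for cx in chunked(x, width))

pointwise = euclidean(day1, day2)
shape_based = order_free_distance(day1, day2, n)
print(pointwise, shape_based)
```

In this sketch the pointwise distance is clearly non-zero even though both days contain the same two shapes, while the chunk-matching comparison is zero because every shape on day one has an identical counterpart on day two. A real measure would also need to handle patterns of unknown position and length, which the toy ignores by assuming the chunk boundaries are known.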