Times series averaging from a probabilistic interpretation of time-elastic kernel
In the light of regularized dynamic time warping kernels, this paper
reconsiders the concept of time elastic centroid (TEC) for a set of time
series. From this perspective, we first show how TEC can easily be cast as a
preimage problem. Unfortunately, this preimage problem is ill-posed, may suffer
from over-fitting, especially for long time series, and obtaining even a
sub-optimal solution incurs heavy computational costs. We then derive two new
algorithms based on a probabilistic interpretation of kernel alignment
matrices, expressed in terms of probability distributions over sets of
alignment paths.
The first algorithm is an iterative agglomerative heuristic inspired by the
state-of-the-art DTW barycenter averaging (DBA) algorithm, proposed
specifically for the Dynamic Time Warping measure. The second proposed
algorithm performs a classical averaging of the aligned samples but also
averages their times of occurrence, using a straightforward progressive
agglomerative heuristic. An experiment on 45 time series datasets compares the
classification error rates of first-nearest-neighbor classifiers that use a
single medoid or centroid estimate to represent each category. It shows that:
i) centroid-based approaches significantly outperform medoid-based approaches;
ii) in the considered experiments, the two proposed algorithms outperform the
state-of-the-art DBA algorithm; and iii) the second proposed algorithm, which
averages jointly in the sample space and along the time axis, emerges as the
most robust time elastic averaging heuristic, with an interesting noise
reduction capability. Index Terms: Time series averaging, time elastic
kernels, Dynamic Time Warping, time series clustering and classification.
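The DBA-style averaging that the proposed heuristics build on can be sketched as follows: each sample of the current centroid is replaced by the mean of all series samples that DTW aligns to it. This is a minimal illustrative sketch, not the paper's implementation; the function names `dtw_path` and `dba_iteration` are hypothetical.

```python
import math

def dtw_path(a, b):
    """DTW cost matrix plus one optimal warping path as (i, j) pairs."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = abs(a[i - 1] - b[j - 1]) + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the top-right corner to recover one optimal path.
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    return list(reversed(path))

def dba_iteration(centroid, series_set):
    """One DBA-style update: replace each centroid sample by the mean
    of all series samples aligned to it under DTW."""
    buckets = [[] for _ in centroid]
    for s in series_set:
        for i, j in dtw_path(centroid, s):
            buckets[i].append(s[j])
    return [sum(b) / len(b) for b in buckets]
```

In the full algorithm this update is iterated until the centroid stabilizes; a fixed point is reached when every aligned bucket averages back to the current centroid value.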
Fast dynamic time warping and clustering in C++
We present an approach for computationally efficient dynamic time warping
(DTW) and clustering of time-series data. The method frames the dynamic warping
of time series datasets as an optimisation problem solved using dynamic
programming, and then clusters time series data by solving a second
optimisation problem using mixed-integer programming (MIP). There is also an
option to use k-medoids clustering for increased speed, when a certificate for
global optimality is not essential. The improved efficiency of our approach is
due to task-level parallelisation of the clustering alongside DTW. Our approach
was tested using the UCR Time Series Archive, and was found to be, on average,
33% faster than the next fastest option when using the same clustering method.
This increases to 64% faster when considering only larger datasets (with more
than 1000 time series). The MIP clustering is most effective on small numbers
of longer time series, because the DTW computation is faster than other
approaches, but the clustering problem becomes increasingly computationally
expensive as the number of time series to be clustered increases.
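The dynamic-programming formulation of DTW described above can be sketched in a few lines: a cumulative-cost table is filled so that each cell holds the cheapest alignment of the two prefixes ending there. This is a generic textbook sketch, not the paper's C++ implementation; the function name `dtw` is illustrative.

```python
import math

def dtw(a, b):
    """Return the DTW distance between two 1-D sequences a and b."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # expand a
                                 cost[i][j - 1],      # expand b
                                 cost[i - 1][j - 1])  # match step
    return cost[n][m]
```

For example, `dtw([0, 0, 1, 1], [0, 1])` is 0, because warping stretches the shorter series to match; the quadratic table fill is what the parallelisation described above amortizes across many pairwise comparisons.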
Efficient Kernel-Based Subsequence Search for Enabling Health Monitoring Services in IoT-Based Home Setting
This paper presents an efficient approach for subsequence search in data streams. The problem consists of identifying coherent repetitions of a given reference time series, also in the multivariate case, within a longer data stream. The most widely adopted measure for this problem is Dynamic Time Warping (DTW), but its computational complexity is a well-known issue. In this paper, we present an approach that learns a kernel approximating DTW for efficiently analyzing streaming data collected from wearable sensors, while reducing the burden of DTW computation. Unlike kernel methods, DTW allows comparing two time series of different lengths. To enable a kernel to compare two time series of different lengths, a feature embedding is required to obtain a fixed-length vector representation: each vector component is the DTW between the given time series and one of a set of randomly chosen "basis" series. The approach has been validated on two benchmark datasets and on a real-life application supporting self-rehabilitation in elderly subjects. A comparison with traditional DTW implementations and other state-of-the-art algorithms is provided: results show a slight decrease in accuracy, counterbalanced by a significant reduction in computational cost.
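The fixed-length embedding described above can be sketched as follows: each series, whatever its length, is mapped to the vector of its DTW distances to the random basis series, after which any standard kernel applies. A minimal sketch with illustrative names; the RBF kernel on top is one plausible choice, not necessarily the one used in the paper.

```python
import math

def dtw(a, b):
    """Plain dynamic-programming DTW between 1-D sequences."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = abs(a[i - 1] - b[j - 1]) + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def embed(series, basis):
    """Fixed-length representation: DTW distance to each basis series."""
    return [dtw(series, b) for b in basis]

def rbf_kernel(x, y, gamma=0.1):
    """A standard RBF kernel on the embedded vectors (assumed choice)."""
    return math.exp(-gamma * sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```

Two series of different lengths now yield embeddings of identical dimension (the number of basis series), so the kernel comparison is well defined.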
Proximity Forest 2.0: A new effective and scalable similarity-based classifier for time series
Time series classification (TSC) is a challenging task due to the diversity
of types of feature that may be relevant for different classification tasks,
including trends, variance, frequency, magnitude, and various patterns. To
address this challenge, several alternative classes of approaches have been
developed, including similarity-based, feature-based, interval-based,
shapelet-based, dictionary-based, kernel, neural network, and hybrid
approaches. While kernel, neural
network, and hybrid approaches perform well overall, some specialized
approaches are better suited for specific tasks. In this paper, we propose a
new similarity-based classifier, Proximity Forest version 2.0 (PF 2.0), which
outperforms previous state-of-the-art similarity-based classifiers across the
UCR benchmark and outperforms state-of-the-art kernel, neural network, and
hybrid methods on specific datasets in the benchmark that are best addressed by
similarity-based methods. PF 2.0 incorporates three recent advances in time
series similarity measures: (1) computationally efficient early abandoning
and pruning to speed up elastic similarity computations; (2) a new elastic
similarity measure, Amerced Dynamic Time Warping (ADTW); and (3) cost function
tuning. It rationalizes the set of similarity measures employed, reducing the
eight base measures of the original PF to three, and uses the first derivative
transform with all similarity measures rather than a limited subset. We have
implemented both PF 1.0 and PF 2.0 in a single C++ framework, making the PF
framework more efficient.
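The early-abandoning idea in advance (1) can be illustrated as follows: since per-step costs are non-negative, cumulative DTW costs only grow along a warping path, so the computation can stop as soon as every cell of the current row exceeds a best-so-far threshold. This is a generic row-wise sketch of the principle, not PF 2.0's actual pruning scheme; the function name is illustrative.

```python
import math

def dtw_early_abandon(a, b, threshold):
    """DTW that abandons early: returns inf once the true distance is
    guaranteed to exceed `threshold` (e.g. the best distance so far in a
    nearest-neighbor search)."""
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        # Every path must pass through this row, and costs never decrease,
        # so if the whole row exceeds the threshold we can stop now.
        if min(cur) > threshold:
            return math.inf
        prev = cur
    return prev[m]
```

In a nearest-neighbor search this prunes most candidates after a few rows, which is where much of the speedup over naive DTW comes from.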
- …