Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Subsequence clustering of multivariate time series is a useful tool for
discovering repeated patterns in temporal data. Once these patterns have been
discovered, seemingly complicated datasets can be interpreted as a temporal
sequence of only a small number of states, or clusters. For example, raw sensor
data from a fitness-tracking application can be expressed as a timeline of a
select few actions (e.g., walking, sitting, running). However, discovering
these patterns is challenging because it requires simultaneous segmentation and
clustering of the time series. Furthermore, interpreting the resulting clusters
is difficult, especially when the data is high-dimensional. Here we propose a
new method of model-based clustering, which we call Toeplitz Inverse
Covariance-based Clustering (TICC). Each cluster in the TICC method is defined
by a correlation network, or Markov random field (MRF), characterizing the
interdependencies between different observations in a typical subsequence of
that cluster. Based on this graphical representation, TICC simultaneously
segments and clusters the time series data. We solve the TICC problem through
alternating minimization, using a variation of the expectation maximization
(EM) algorithm. We derive closed-form solutions to efficiently solve the two
resulting subproblems in a scalable way, through dynamic programming and the
alternating direction method of multipliers (ADMM), respectively. We validate
our approach by comparing TICC to several state-of-the-art baselines in a
series of synthetic experiments, and we then demonstrate on an automobile
sensor dataset how TICC can be used to learn interpretable clusters in
real-world scenarios. Comment: This revised version fixes two small typos in the published version.
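The abstract states that one of the two TICC subproblems, assigning points to clusters, is solved through dynamic programming. The following is a minimal sketch of that kind of step, not the authors' implementation. It assumes a precomputed cost matrix `cost[t, k]` (for example, a negative log-likelihood of point `t` under cluster `k`) and a hypothetical switching penalty `beta` that discourages changing clusters between consecutive points; both names are illustrative and do not come from the abstract.

```python
import numpy as np

def assign_clusters(cost, beta):
    """Return the minimum-cost cluster label for each time step.

    cost[t, k] is the (assumed, precomputed) cost of putting point t in
    cluster k; beta >= 0 is an assumed penalty paid each time the label
    changes between consecutive points.
    """
    cost = np.asarray(cost, dtype=float)
    T, K = cost.shape
    total = np.empty((T, K))            # best cost of a path ending in (t, k)
    back = np.zeros((T, K), dtype=int)  # previous label on that best path
    total[0] = cost[0]
    for t in range(1, T):
        best_prev = int(np.argmin(total[t - 1]))
        switch = total[t - 1, best_prev] + beta  # arrive from the cheapest label
        stay = total[t - 1]                      # keep the same label
        use_switch = switch < stay
        total[t] = cost[t] + np.where(use_switch, switch, stay)
        back[t] = np.where(use_switch, best_prev, np.arange(K))
    # Trace the best path backwards from the cheapest final label.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmin(total[-1]))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With `beta = 0` each point simply takes its cheapest cluster; a larger `beta` smooths out isolated label changes, which is how segmentation and clustering can be handled in one pass.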
Down-Sampling coupled to Elastic Kernel Machines for Efficient Recognition of Isolated Gestures
In the field of gestural action recognition, many studies have focused on
dimensionality reduction along the spatial axis, to reduce both the variability
of gestural sequences expressed in the reduced space, and the computational
complexity of their processing. It is noticeable that very few of these methods
have explicitly addressed the dimensionality reduction along the time axis.
This is, however, a major issue with regard to the use of elastic distances
characterized by a quadratic complexity. To partially fill this apparent gap,
we present in this paper an approach based on temporal down-sampling associated
with elastic kernel machine learning. We experimentally show, on two data sets
that are widely referenced in the domain of human gesture recognition, and very
different in terms of quality of motion capture, that it is possible to
significantly reduce the number of skeleton frames while maintaining a good
recognition rate. The method proves to give satisfactory results at a level
currently reached by state-of-the-art methods on these data sets. The
computational complexity reduction makes this approach eligible for real-time
applications. Comment: ICPR 2014, International Conference on Pattern Recognition, Stockholm, Sweden (2014).
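As an illustration of temporal down-sampling, the sketch below keeps a fixed number of evenly spaced frames from a gesture sequence. The abstract does not specify the down-sampling scheme, so uniform sampling, the function name `downsample_frames`, and the parameter `n_frames` are all assumptions made for this example.

```python
import numpy as np

def downsample_frames(sequence, n_frames):
    """Keep n_frames evenly spaced frames (rows) of a (frames x features) array.

    Uniform sampling is an assumption here, not necessarily the paper's scheme.
    """
    sequence = np.asarray(sequence)
    if n_frames >= len(sequence):
        return sequence  # nothing to reduce
    # Evenly spaced positions from the first to the last frame.
    idx = np.round(np.linspace(0, len(sequence) - 1, n_frames)).astype(int)
    return sequence[idx]
```

Because an elastic distance between two sequences of lengths n and m fills an n x m table, reducing both sequences from 100 frames to 10 shrinks that table from 10,000 cells to 100.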
A Review of Subsequence Time Series Clustering
Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and background related to subsequence time series clustering. The reviewed literature is categorized into three periods: preproof, interproof, and postproof. Moreover, various state-of-the-art approaches to subsequence time series clustering are discussed under each of these categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies.
Approximating Dynamic Time Warping and Edit Distance for a Pair of Point Sequences
We give the first subquadratic-time approximation schemes for dynamic time
warping (DTW) and edit distance (ED) of several natural families of point
sequences in $\mathbb{R}^d$, for any fixed $d \ge 1$. In particular, our
algorithms compute $(1+\varepsilon)$-approximations of DTW and ED in time
near-linear for point sequences drawn from $\kappa$-packed or $\kappa$-bounded
curves, and subquadratic for backbone sequences. Roughly speaking, a curve is
$\kappa$-packed if the length of its intersection with any ball of radius $r$
is at most $\kappa r$, and a curve is $\kappa$-bounded if the sub-curve
between two curve points does not go too far from the two points compared to
the distance between the two points. In backbone sequences, consecutive points
are spaced at approximately equal distances apart, and no two points lie very
close together. Recent results suggest that a subquadratic algorithm for DTW or
ED is unlikely for an arbitrary pair of point sequences even for $d = 1$. Our
algorithms work by constructing a small set of rectangular regions that cover
the entries of the dynamic programming table commonly used for these distance
measures. The weights of entries inside each rectangle are roughly the same, so
we are able to use efficient procedures to approximately compute the cheapest
paths through these rectangles.
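For context, the dynamic programming table mentioned in the abstract can be sketched for exact DTW as follows. This is the standard quadratic-time baseline, not the subquadratic approximation scheme the abstract describes; the function name `dtw` and the choice of Euclidean distance between points are assumptions for this example.

```python
import numpy as np

def dtw(a, b):
    """Exact dynamic time warping cost between two point sequences.

    Fills the full (len(a)+1) x (len(b)+1) table, so time is quadratic.
    Points are compared with Euclidean distance (an assumption here).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat scalars as 1-dimensional points
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    table = np.full((n + 1, m + 1), np.inf)
    table[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            table[i, j] = d + min(table[i - 1, j],
                                  table[i, j - 1],
                                  table[i - 1, j - 1])
    return float(table[n, m])
```

Each table entry depends only on its three neighbours, which is why covering the table with rectangles of nearly uniform weight, as the abstract describes, can avoid visiting every cell.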
Clustering of Bi-Dimensional and Heterogeneous Times Series: Application to Social Sciences Data
We present an application of bi-dimensional and heterogeneous time series clustering to a social sciences research question. The dataset comes from a survey of more than eight thousand people with disabilities. Sociologists want to know whether the dataset contains homogeneous classes of people according to two attributes: how they perceive their quality of life and their couple status (i.e., whether or not they have a partner). Because both attributes are time series, we adapted the k-Means clustering algorithm to handle this kind of data. For this purpose, we use the Longest Common Subsequence time series distance, which handles time stretching well, and we extend it to the bi-dimensional and heterogeneous case. Our unsupervised process yields pertinent and surprising clusters that sociologists can easily analyze.
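A Longest Common Subsequence distance over two-attribute series can be sketched as below. The abstract does not say how the extension to the bi-dimensional and heterogeneous case is defined, so the matching rule here (numeric scores within a tolerance `eps` and identical categorical status) is purely an assumption, as are the names `lcss_length`, `lcss_distance`, and `match_pair`.

```python
def lcss_length(a, b, match):
    """Length of the longest common subsequence of a and b under `match`."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if match(a[i - 1], b[j - 1]):
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]

def lcss_distance(a, b, match):
    """Distance in [0, 1]: 0 when the shorter series is fully matched."""
    if len(a) == 0 or len(b) == 0:
        return 1.0
    return 1.0 - lcss_length(a, b, match) / min(len(a), len(b))

def match_pair(x, y, eps=1.0):
    """Assumed rule for (numeric score, categorical status) observations:
    scores within eps of each other and identical status."""
    return abs(x[0] - y[0]) <= eps and x[1] == y[1]
```

Because unmatched observations are skipped rather than penalised point by point, two people whose histories follow the same course at different speeds can still end up close together, which is the time-stretching property the abstract refers to.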