548 research outputs found

    Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data

    Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (e.g., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios.
    Comment: This revised version fixes two small typos in the published version.

    Down-Sampling coupled to Elastic Kernel Machines for Efficient Recognition of Isolated Gestures

    In the field of gestural action recognition, many studies have focused on dimensionality reduction along the spatial axis, to reduce both the variability of gestural sequences expressed in the reduced space and the computational complexity of their processing. Notably, very few of these methods have explicitly addressed dimensionality reduction along the time axis. This is, however, a major issue for elastic distances, whose complexity is quadratic in the sequence length. To partially fill this gap, we present an approach based on temporal down-sampling combined with elastic kernel machine learning. We show experimentally, on two data sets that are widely referenced in human gesture recognition and that differ greatly in motion-capture quality, that it is possible to significantly reduce the number of skeleton frames while maintaining a good recognition rate. The method gives results at a level currently reached by state-of-the-art methods on these data sets. The reduction in computational complexity makes this approach eligible for real-time applications.
    Comment: ICPR 2014, International Conference on Pattern Recognition, Stockholm, Sweden (2014)

    A Review of Subsequence Time Series Clustering

    Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and background related to subsequence time series clustering. The reviewed literature is divided into three groups: the preproof, interproof, and postproof periods. Moreover, various state-of-the-art approaches to subsequence time series clustering are discussed under each of these categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies.

    Approximating Dynamic Time Warping and Edit Distance for a Pair of Point Sequences

    We give the first subquadratic-time approximation schemes for dynamic time warping (DTW) and edit distance (ED) of several natural families of point sequences in $\mathbb{R}^d$, for any fixed $d \ge 1$. In particular, our algorithms compute $(1+\varepsilon)$-approximations of DTW and ED in time near-linear for point sequences drawn from $\kappa$-packed or $\kappa$-bounded curves, and subquadratic for backbone sequences. Roughly speaking, a curve is $\kappa$-packed if the length of its intersection with any ball of radius $r$ is at most $\kappa \cdot r$, and a curve is $\kappa$-bounded if the sub-curve between two curve points does not go too far from the two points compared to the distance between the two points. In backbone sequences, consecutive points are spaced at approximately equal distances apart, and no two points lie very close together. Recent results suggest that a subquadratic algorithm for DTW or ED is unlikely for an arbitrary pair of point sequences even for $d=1$. Our algorithms work by constructing a small set of rectangular regions that cover the entries of the dynamic programming table commonly used for these distance measures. The weights of entries inside each rectangle are roughly the same, so we are able to use efficient procedures to approximately compute the cheapest paths through these rectangles.
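    For orientation, the dynamic programming table that the abstract refers to can be sketched as follows. This is only the standard exact, quadratic-time DTW computation that such approximation schemes aim to speed up, not the paper's algorithm. The function name `dtw_distance` and the use of Euclidean distance between points are assumptions made for illustration.

```python
import math


def dtw_distance(p, q):
    """Exact DTW cost between two non-empty point sequences.

    Each point is a tuple of coordinates. Fills the full
    (len(p) + 1) x (len(q) + 1) table, so time and space are quadratic.
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # table[i][j] = cheapest cost of aligning p[:i] with q[:j]
    table = [[math.inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Entry weight: Euclidean distance between the paired points
            # (an assumption; other ground distances are possible).
            cost = math.dist(p[i - 1], q[j - 1])
            table[i][j] = cost + min(
                table[i - 1][j],      # advance in p only
                table[i][j - 1],      # advance in q only
                table[i - 1][j - 1],  # advance in both
            )
    return table[n][m]


if __name__ == "__main__":
    a = [(0.0,), (1.0,), (2.0,)]
    b = [(0.0,), (2.0,)]
    print(dtw_distance(a, b))
```

    Every entry depends only on its three neighbours above and to the left, which is why covering the table with rectangles of near-uniform weight, as the abstract describes, can stand in for filling each cell individually.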

    Clustering of Bi-Dimensional and Heterogeneous Times Series: Application to Social Sciences Data

    We present an application of bi-dimensional and heterogeneous time series clustering to a Social Sciences research question. The dataset is the result of a survey involving more than eight thousand handicapped persons. Sociologists need to know whether this dataset contains homogeneous classes of people with respect to two attributes: how handicapped people perceive the quality of their life, and their couple status (i.e., whether they have a partner or not). These two attributes are time series, so we had to adapt the k-Means clustering algorithm to work with this kind of data. For this purpose, we use the Longest Common Subsequence time series distance for its ability to handle time stretching, and we extend it to the bi-dimensional and heterogeneous case. The results of our unsupervised process give some pertinent and surprising clusters that can be easily analyzed by sociologists.
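    As a point of reference, a Longest Common Subsequence (LCSS) distance for a single numeric series can be sketched as below. This is the common one-dimensional formulation, not the bi-dimensional, heterogeneous extension described in the abstract. The function names, the matching tolerance `eps`, and the normalization by the shorter length are assumptions chosen for illustration.

```python
def lcss_length(a, b, eps=0.0):
    """Length of the longest common subsequence of two numeric series.

    Two values are treated as matching when they differ by at most eps
    (a hypothetical tolerance parameter).
    """
    n, m = len(a), len(b)
    # table[i][j] = LCSS length of a[:i] and b[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Skip one element of either series; this skipping is what
                # lets LCSS tolerate stretching along the time axis.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_distance(a, b, eps=0.0):
    """Distance in [0, 1]: 0 when the shorter series is fully matched."""
    if not a or not b:
        raise ValueError("both series must be non-empty")
    return 1.0 - lcss_length(a, b, eps) / min(len(a), len(b))


if __name__ == "__main__":
    print(lcss_distance([1, 2, 3, 4], [2, 4]))
    print(lcss_distance([1, 2, 3], [7, 8, 9]))
```

    A distance of this kind has no natural mean, so plugging it into k-Means requires some adaptation of the centroid step, which is consistent with the abstract's remark that the algorithm had to be adapted.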