Finding Motif Sets in Time Series

Authors: Bagnall Anthony; Hills Jon; Lines Jason
Publication date: 01/01/2014

Time-series motifs are representative subsequences that occur frequently in a time series; a motif set is the set of subsequences deemed to be instances of a given motif. We focus on finding motif sets. Our motivation is to detect motif sets in household electricity-usage profiles, representing repeated patterns of household usage. We propose three algorithms for finding motif sets. Two are greedy algorithms based on pairwise comparison, and the third uses a heuristic measure of set quality to find the motif set directly. We compare these algorithms on simulated datasets and on electricity-usage data. We show that Scan MK, the simplest way of using the best-matching pair to find motif sets, is less accurate on our synthetic data than Set Finder and Cluster MK, although the latter is very sensitive to parameter settings. We qualitatively analyse the outputs for the electricity-usage data and demonstrate that both Scan MK and Set Finder can discover useful motif sets in such data.

arXiv.org e-Print Archive

University of East Anglia digital repository
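
The abstract of "Finding Motif Sets in Time Series" describes using the best-matching pair of subsequences as a seed for a motif set. The sketch below is only a rough illustration of that idea and is not the authors' Scan MK, Cluster MK or Set Finder: it uses a brute-force search in place of the MK algorithm, and the z-normalised Euclidean distance, the `radius` threshold and all function names are assumptions introduced here.

```python
import math

def znorm(window):
    """Z-normalise a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]

def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_pair(series, m):
    """Brute-force closest pair of non-overlapping length-m windows.

    Returns ((distance, i, j), windows). This O(n^2) search is a stand-in
    for an exact motif-pair algorithm (assumption, not the MK algorithm).
    """
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, None, None)
    for i in range(len(windows)):
        for j in range(i + m, len(windows)):  # j >= i + m, so no overlap
            d = dist(windows[i], windows[j])
            if d < best[0]:
                best = (d, i, j)
    return best, windows

def scan_motif_set(series, m, radius):
    """Seed with the best pair, then collect every non-overlapping window
    within `radius` (hypothetical parameter) of the first pair member."""
    (_, i, _), windows = best_pair(series, m)
    if i is None:
        return []
    members = [i]
    # Visit candidates from nearest to farthest so close matches win ties.
    ranked = sorted(range(len(windows)), key=lambda k: dist(windows[i], windows[k]))
    for k in ranked:
        if dist(windows[i], windows[k]) > radius:
            break
        if all(abs(k - s) >= m for s in members):
            members.append(k)
    return sorted(members)

# Toy series: the shape [0, 1, 3, 1, 0] is planted at offsets 0, 8 and 16.
series = [0, 1, 3, 1, 0, 5, 2, 7, 0, 1, 3, 1, 0, 4, 9, 6, 0, 1, 3, 1, 0]
motif_set = scan_motif_set(series, m=5, radius=0.5)
print(motif_set)
```

On this toy series the three planted occurrences are recovered as start offsets. The result depends strongly on the chosen `radius`, which is consistent with the abstract's remark that some of the approaches are sensitive to parameter settings.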

Multivariate time series classification with temporal abstractions

Authors: Batal L; Bellazzi R; Hauskrecht M; Sacchi L
Publication date: 01/01/2009

The increase in the number of complex temporal datasets collected today has prompted the development of methods that extend classical machine learning and data mining methods to time-series data. This work focuses on methods for multivariate time-series classification. Time-series classification is a challenging problem, mostly because the number of temporal features that describe the data and are potentially useful for classification is enormous. We study and develop a temporal abstraction framework for generating multivariate time-series features suitable for classification tasks. We propose the STF-Mine algorithm, which automatically mines discriminative temporal abstraction patterns from the time-series data and uses them to learn a classification model. Our experimental evaluations, carried out on both synthetic and real-world medical data, demonstrate the benefit of our approach in learning accurate classifiers for time-series datasets. Copyright © 2009, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

CiteSeerX

D-Scholarship@Pitt

University of Pittsburgh ETD Submission Page

Clustering Time Series from Mixture Polynomial Models with Discretised Data

Authors: Bagnall AJ; Janacek GJ; Zhang M
Publication venue: University of East Anglia
Publication date: 01/01/2003

Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly affect the quality of the clusters formed. This paper evaluates a method of overcoming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation-based distance metric. For data derived from this class of model, we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data has outliers. More specifically, we show that, first, discretisation does not significantly affect the accuracy of the clusters when there are no outliers and, second, it significantly increases the accuracy in the presence of outliers, even when the probability of an outlier is very low.

University of East Anglia digital repository
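
The median discretisation described in the abstract of "Clustering Time Series from Mixture Polynomial Models with Discretised Data" can be illustrated with a short sketch. The abstract does not define its correlation-based distance, so the `1 - Pearson correlation` form used here is an assumption, as are the function names and the toy data; the k-means step itself is left out.

```python
from statistics import median

def discretise(series):
    """Binary series: 1 where the value is above the series median, else 0."""
    m = median(series)
    return [1 if x > m else 0 for x in series]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5

def correlation_distance(a, b):
    """Assumed distance: 1 - Pearson correlation (0 = identical shape)."""
    return 1.0 - pearson(a, b)

clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noisy = [1.0, 2.0, 300.0, 4.0, 5.0, 6.0]  # same trend with one large outlier

raw_distance = correlation_distance(clean, noisy)
binary_distance = correlation_distance(discretise(clean), discretise(noisy))
print(raw_distance, binary_distance)
```

In this toy case the single outlier makes the raw series look negatively correlated, while the discretised series remain positively correlated, so the distance between them shrinks. This is only a hand-made example and does not reproduce the paper's experiments.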

Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data

Authors: Banerjee O.; Berndt D. J.; Boyd S.; Cover T. M.; Cuturi M.; Das G.; Gray R. M.; Hsieh C.-J.; Lauritzen S. L.; Mohan K.; Smyth P.; Wytock M.
Publication date: 14/05/2018

Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (e.g., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios. Comment: This revised version fixes two small typos in the published version.

arXiv.org e-Print Archive

Crossref

Multidimensional Time Series Fuzzy Association Rules Mining

Authors: Gao Xuedong; Guo Hongwei
Publication venue: CSUSB ScholarWorks
Publication date: 07/01/2015

In this paper, we present a new solution to the problem of multidimensional time series fuzzy association rule mining, one that takes into account the fuzziness of both the subsequences and the intervals between subsequences. To handle this new concept, we put forward several key algorithms of the solution. Finally, we illustrate an application example of multidimensional time series fuzzy association rule mining. The results show that rules with fuzzy intervals can be mined only by the new method.

CSUSB ScholarWorks