298 research outputs found

    Unsupervised Multivariate Time Series Clustering

    Clustering is widely used in unsupervised machine learning to partition a given set of data into non-overlapping groups. Many real-world applications require processing more complex multivariate time series data characterized by more than one dependent variable. A few works in the literature have reported multivariate classification using Shapelet learning. However, the clustering of multivariate time series signals using Shapelet learning has not yet been explored. Shapelet learning is the process of discovering the Shapelets that contain the most informative features of a time series signal. Discovering suitable Shapelets from many candidate Shapelets has been broadly studied for classification and clustering of univariate time series signals, where Shapelet learning has shown promising results. The analysis of multivariate time series signals is not widely explored because of the dimensionality issue. This work proposes a generalized Shapelet learning method for unsupervised multivariate time series clustering. The proposed method utilizes spectral clustering and Shapelet similarity minimization with least-squares regularization to obtain the optimal Shapelets for unsupervised clustering. The proposed method is evaluated using an in-house multivariate time series dataset on detection of radio frequency (RF) faults in Jefferson Lab's Continuous Electron Beam Accelerator Facility (CEBAF). The dataset consists of three-dimensional time series recordings of three RF fault types. The proposed method shows successful clustering performance, with an average precision of 0.732, recall of 0.717, F-score of 0.732, Rand index (RI) of 0.812, and normalized mutual information (NMI) of 0.56, with an overall standard deviation of less than 3% in a five-fold cross-validation evaluation.
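
The Rand index (RI) reported in the abstract above measures how often two partitions agree on whether a pair of samples belongs together. The following is a minimal sketch of the standard pair-counting definition, not the authors' evaluation code; the function name `rand_index` and the toy labels in the usage note are illustrative assumptions.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which two labelings agree.

    A pair counts as an agreement when both labelings place the two
    samples in the same group, or both place them in different groups.
    (Hypothetical helper for illustration only.)
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    if len(labels_true) < 2:
        raise ValueError("at least two samples are required")

    pairs = list(combinations(range(len(labels_true)), 2))
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

Because only co-membership is compared, the score does not depend on how clusters are named: `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0 even though every label differs.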

    Mining time-series data using discriminative subsequences

    Time-series data is abundant, and must be analysed to extract usable knowledge. Local-shape-based methods offer improved performance for many problems, and a comprehensible method of understanding both data and models. For time-series classification, we transform the data into a local-shape space using a shapelet transform. A shapelet is a time-series subsequence that is discriminative of the class of the original series. We use a heterogeneous ensemble classifier on the transformed data. The accuracy of our method is significantly better than the time-series classification benchmark (1-nearest-neighbour with dynamic time-warping distance), and significantly better than the previous best shapelet-based classifiers. We use two methods to increase interpretability. First, we cluster the shapelets using a novel, parameterless clustering method based on Minimum Description Length, reducing dimensionality and removing duplicate shapelets. Second, we transform the shapelet data into binary data reflecting the presence or absence of particular shapelets, a representation that is straightforward to interpret and understand. We supplement the ensemble classifier with partial classification. We generate rule sets on the binary-shapelet data, improving performance on certain classes, and revealing the relationship between the shapelets and the class label. To aid interpretability, we use a novel algorithm, BruteSuppression, that can substantially reduce the size of a rule set without negatively affecting performance, leading to a more compact, comprehensible model. Finally, we propose three novel algorithms for unsupervised mining of approximately repeated patterns in time-series data, testing their performance in terms of speed and accuracy on synthetic data, and on a real-world electricity-consumption device-disambiguation problem. We show that individual devices can be found automatically and in an unsupervised manner using a local-shape-based approach.
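
The shapelet transform and the binary presence/absence representation described in the abstract above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the thesis's implementation: it assumes plain Euclidean distance over sliding windows without normalisation, and the per-shapelet `thresholds` used to decide "presence" are hypothetical parameters introduced here.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    window of the series that has the same length as the shapelet."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best_sq = math.inf
    for start in range(len(series) - m + 1):
        sq = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best_sq = min(best_sq, sq)
    return math.sqrt(best_sq)


def shapelet_transform(dataset, shapelets):
    """Map each series to a vector holding its distance to every shapelet."""
    return [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]


def binarize(features, thresholds):
    """Mark a shapelet as present (1) when its distance is within the
    threshold assumed for that shapelet, otherwise absent (0)."""
    return [
        [1 if d <= t else 0 for d, t in zip(row, thresholds)]
        for row in features
    ]
```

For example, `shapelet_transform([[0, 0, 1, 2, 1, 0], [0, 0, 0, 0]], [[1, 2, 1]])` gives one distance per series, and `binarize` with a threshold of 0.5 marks the shapelet as present only in the first series, where it occurs exactly.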

    Multi-Sensor Event Detection using Shape Histograms

    Vehicular sensor data consists of multiple time-series arising from a number of sensors. Using such multi-sensor data we would like to detect occurrences of specific events that vehicles encounter, e.g., corresponding to particular maneuvers that a vehicle makes or conditions that it encounters. Events are characterized by similar waveform patterns re-appearing within one or more sensors. Further, such patterns can be of variable duration. In this work, we propose a method for detecting such events in time-series data using a novel feature descriptor motivated by similar ideas in image processing. We define the shape histogram: a constant-dimension descriptor that nevertheless captures patterns of variable duration. We demonstrate the efficacy of using shape histograms as features to detect events in an SVM-based, multi-sensor, supervised learning scenario, i.e., multiple time-series are used to detect an event. We present results on real-life vehicular sensor data and show that our technique performs better than available pattern detection implementations on our data, and that it can also be used to combine features from multiple sensors, resulting in better accuracy than using any single sensor. Since previous work on pattern detection in time-series has been in the single-series context, we also present results using our technique on multiple standard time-series datasets and show that it is the most versatile in terms of how it ranks compared to other published results.

    SE-shapelets: Semi-supervised Clustering of Time Series Using Representative Shapelets

    Shapelets that discriminate time series using local features (subsequences) are promising for time series clustering. Existing time series clustering methods may fail to capture representative shapelets because they discover shapelets from a large pool of uninformative subsequences, and thus result in low clustering accuracy. This paper proposes a Semi-supervised Clustering of Time Series Using Representative Shapelets (SE-Shapelets) method, which utilizes a small number of labeled and propagated pseudo-labeled time series to help discover representative shapelets, thereby improving the clustering accuracy. In SE-Shapelets, we propose two techniques to discover representative shapelets for the effective clustering of time series. 1) A salient subsequence chain (SSC) that can extract salient subsequences (as candidate shapelets) of a labeled/pseudo-labeled time series, which helps remove massive uninformative subsequences from the pool. 2) A linear discriminant selection (LDS) algorithm to identify shapelets that can capture representative local features of time series in different classes, for convenient clustering. Experiments on UCR time series datasets demonstrate that SE-Shapelets discovers representative shapelets and achieves higher clustering accuracy than counterpart semi-supervised time series clustering methods.