Clustering Method for Time-Series Images Using Quantum-Inspired Computing Technology
Time-series clustering serves as a powerful data mining technique for
time-series data in the absence of prior knowledge about clusters. A large
amount of large-scale time-series data has been acquired and used in
various research fields. Hence, a clustering method with low computational cost
is required. Given that quantum-inspired computing technology, such as a
simulated annealing machine, surpasses conventional computers in solving
combinatorial optimization problems quickly and accurately, it holds promise
for accomplishing clustering tasks that are challenging to achieve using
existing methods. This study proposes a novel time-series clustering method
that leverages an annealing machine. The proposed method facilitates an even
classification of time-series data into clusters close to each other while
maintaining robustness against outliers. Moreover, its applicability extends to
time-series images. We compared the proposed method with a standard existing
method for clustering an online distributed dataset. In the existing method,
the distances between data points are calculated based on the Euclidean distance
metric, and the clustering is performed using the k-means++ method. We found
that both methods yielded comparable results. Furthermore, the proposed method
was applied to a flow measurement image dataset containing noticeable noise
with a signal-to-noise ratio of approximately 1. Despite a small signal
variation of approximately 2%, the proposed method effectively classified the
data without any overlap among the clusters. In contrast, the clustering
results by the standard existing method and the conditional image sampling
(CIS) method, a specialized technique for flow measurement data, displayed
overlapping clusters. Consequently, the proposed method provides better results
than the other two methods, demonstrating its potential as a superior
clustering method. Comment: 13 pages, 4 figures
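The baseline described in this abstract (Euclidean distances plus k-means++) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `euclidean`, `kmeanspp_init`, and `kmeans`, the iteration cap, and the fixed seed are all assumptions, and each time series is assumed to be an equal-length list of numbers.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeanspp_init(series, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(series))]
    while len(centers) < k:
        d2 = [min(euclidean(s, c) ** 2 for c in centers) for s in series]
        if sum(d2) == 0:
            # All points coincide with existing centers; fall back to uniform.
            centers.append(list(rng.choice(series)))
        else:
            centers.append(list(rng.choices(series, weights=d2, k=1)[0]))
    return centers


def kmeans(series, k, n_iter=50, seed=0):
    """Lloyd iterations started from k-means++ seeds (illustrative defaults)."""
    rng = random.Random(seed)
    centers = kmeanspp_init(series, k, rng)
    labels = [0] * len(series)
    for _ in range(n_iter):
        # Assignment step: nearest center by Euclidean distance.
        labels = [
            min(range(k), key=lambda j: euclidean(s, centers[j]))
            for s in series
        ]
        # Update step: each center becomes the pointwise mean of its members.
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(series, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

Because the distance is computed pointwise, this baseline is sensitive to noise and temporal misalignment, which is consistent with the overlapping clusters the abstract reports for it on the noisy flow measurement data.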
Finite Mixture Models for Clustering Auto-Correlated Sales Series Data Influenced by Promotions
The focus of the present paper is on clustering, namely the problem of finding distinct groups in a dataset so that each group consists of similar observations. We consider finite mixtures of regression models, given their flexibility in modeling heterogeneous time series. Our study aims to implement a novel approach, which fits mixture models based on spline and polynomial regression in the case of auto-correlated data, to cluster time series in an unsupervised machine learning framework. Given the assumption of auto-correlated data and the usage of exogenous variables in the mixture model, the usual approach of estimating the maximum likelihood parameters using the Expectation–Maximization (EM) algorithm is computationally prohibitive. Therefore, we provide a novel algorithm for model fitting combining auto-correlated observations with spline and polynomial regression. The case study of this paper consists of clustering the time series of sales data influenced by promotional campaigns. We demonstrate the effectiveness of our method in a case study of 131 sales series from a real-world company. Numerical outcomes demonstrate the efficacy of the proposed method for clustering auto-correlated time series. Although the case study is specific, the proposed method can be used in several real-world application fields.
Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features
Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. This thesis presents a novel algorithm, Deep Temporal Clustering (DTC), to naturally integrate dimensionality reduction and temporal clustering into a single end-to-end learning framework, fully unsupervised. The algorithm utilizes an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment. Then it jointly optimizes the clustering objective and the dimensionality reduction objective. Based on requirement and application, the temporal clustering layer can be customized with any temporal similarity metric. Several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into temporal features that the network has learned for its clustering, a visualization method is applied that generates a region of interest heatmap for the time series. The viability of the algorithm is demonstrated using time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, the proposed algorithm outperforms traditional methods. The superior performance is attributed to the fully integrated temporal dimensionality reduction and clustering criterion. Dissertation/Thesis: Masters Thesis, Computer Engineering 201
Functional Singular Spectrum Analysis and the Clustering of Time-Dependent Data
The present work extends the application of the recently submitted functional singular spectrum analysis (FSSA) into the realm of structure-level subsequence clustering. We begin with a comprehensive review of principal component analysis (PCA), functional principal component analysis (FPCA), singular spectrum analysis (SSA), and the recently submitted FSSA. We computationally show that the novel FSSA-FPCA hybrid clustering technique can be employed as an effective structure-based subsequence clustering method for call-center functional time series data, where the method behaves as a dimension reduction technique for time-dependent data. Metrics such as the F-ratio from k-means clustering, the w-correlation between reconstructed functional time series, and the Rand index are offered to determine the quality of clustering results of labeled functional data. We find that these outcomes are dependent on the grouping stage of FSSA for the call-center data. We also find that our measurements are not significantly sensitive to changes in groupings. Our investigations show that FSSA behaves as a type of temporal-to-frequency-domain transformation similar to that of a Fourier analysis. The results shown in the present essay can be used to extend FSSA in its maturation and offer insight into how the hybrid method should be used and the challenges one faces with it.
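One of the quality metrics named above, the Rand index, is the fraction of sample pairs on which two labelings agree (both place the pair in the same cluster, or both separate it). A minimal sketch follows; the function name `rand_index` is illustrative and not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which the two labelings agree."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # fewer than two samples: nothing to disagree on
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

The index depends only on the partition, not on the label values, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` gives 1.0. This pairwise loop is quadratic in the number of samples, which is fine for small labeled sets.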
ShapeDBA: Generating Effective Time Series Prototypes using ShapeDTW Barycenter Averaging
Time series data can be found in almost every domain, ranging from the
medical field to manufacturing and wireless communication. Generating realistic
and useful exemplars and prototypes is a fundamental data analysis task. In
this paper, we investigate a novel approach to generating realistic and useful
exemplars and prototypes for time series data. Our approach uses a new form of
time series average, the ShapeDTW Barycentric Average. We therefore turn our
attention to accurately generating time series prototypes with a novel
approach. Existing time series prototyping approaches, such as DTW Barycenter
Averaging (DBA) and SoftDBA, rely on the Dynamic Time Warping (DTW) similarity
measure. These approaches share a common problem: they generate
out-of-distribution artifacts in their prototypes. This is mostly caused by the
DTW variant used, which detects absolute similarities rather than neighborhood
similarities. Our proposed method, ShapeDBA, uses the ShapeDTW variant of DTW,
which overcomes this issue. We chose time series clustering, a popular form of
time series analysis, to evaluate the outcome of
ShapeDBA compared to the other prototyping approaches. Coupled with the k-means
clustering algorithm, and evaluated on a total of 123 datasets from the UCR
archive, our proposed averaging approach is able to achieve new
state-of-the-art results in terms of Adjusted Rand Index. Comment: Published in AALTD workshop at ECML/PKDD 202
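For readers unfamiliar with the DTW measure that DBA and SoftDBA build on, a minimal dynamic-programming sketch is given below. It implements plain DTW with a squared pointwise cost, not the ShapeDTW variant used by ShapeDBA, and the function name `dtw_distance` is illustrative.

```python
import math


def dtw_distance(a, b):
    """Plain DTW between two 1-D series, using squared pointwise cost."""
    if not a or not b:
        raise ValueError("both series must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return math.sqrt(cost[n][m])
```

Each point is matched by its value alone, so `dtw_distance([0, 1, 2], [0, 0, 1, 2])` is 0.0 even though the series differ in length. This value-only matching is the "absolute similarity" behaviour the abstract contrasts with ShapeDTW's neighborhood-based matching.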
Fuzzy clustering with spatial-temporal information
Clustering geographical units based on a set of quantitative features observed at several time occasions requires dealing with the complexity of both space and time information. In particular, one should consider (1) the spatial nature of the units to be clustered, (2) the characteristics of the space of multivariate time trajectories, and (3) the uncertainty related to the assignment of a geographical unit to a given cluster on the basis of the above complex features. This paper discusses a novel spatially constrained multivariate time series clustering for units characterised by different levels of spatial proximity. In particular, the Fuzzy Partitioning Around Medoids algorithm with Dynamic Time Warping dissimilarity measure and spatial penalization terms is applied to classify multivariate Spatial-Temporal series. The clustering method has been theoretically presented and discussed using both simulated and real data, highlighting its main features. In particular, the capability of embedding different levels of proximity among units, and the ability of considering time series with different length