61 research outputs found

    A Review of Subsequence Time Series Clustering

    Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequence of time series data is used. This paper reviews some definitions and backgrounds related to subsequence time series clustering. The categorization of the literature reviews is divided into three groups: preproof, interproof, and postproof period. Moreover, various state-of-the-art approaches in performing subsequence time series clustering are discussed under each of the following categories. The strengths and weaknesses of the employed methods are evaluated as potential issues for future studies
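    As a minimal sketch of what "subsequence" means here, the snippet below extracts fixed-length windows from a single series with a sliding window; such windows are the usual input to a clustering step. The function name `sliding_windows` and the parameters `width` and `step` are illustrative assumptions, not taken from the reviewed paper.

```python
def sliding_windows(series, width, step=1):
    """Return contiguous subsequences of length `width`, starting every `step` points."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    # The last valid start index is len(series) - width.
    return [list(series[i:i + width])
            for i in range(0, len(series) - width + 1, step)]


# Hypothetical toy series; each window would become one object to cluster.
series = [1, 2, 3, 4, 5]
windows = sliding_windows(series, width=3)
print(windows)
```

    A series shorter than `width` yields no windows, which a caller would need to handle before clustering.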

    Online pattern recognition in subsequence time series clustering

    One of the open issues in the context of subsequence time series clustering is online pattern recognition. This clustering is applied in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. Among these fields, pattern recognition is one of the essential concepts. To implement the idea of online pattern recognition, we choose sequences of ECG data as subsequence time series data. Additionally, using ECG data can help to interpret heart activity for finding heart diseases. This paper offers a way to perform online pattern recognition in subsequence time series clustering in order to obtain results at runtime

    Unsupervised Shift-invariant Feature Learning from Time-series Data

    Unsupervised feature learning is one of the key components of machine learning and artificial intelligence. Learning features from high dimensional streaming data is an important and difficult problem which comes with a number of challenges. Moreover, feature learning algorithms need to be evaluated and generalized for time series with different patterns and components. A detailed study is needed to clarify when simple algorithms fail to learn features and whether we need more complicated methods. In this thesis, we show that the systematic way to learn meaningful features from time-series is by using convolutional or shift-invariant versions of unsupervised feature learning. We experimentally compare the shift-invariant versions of clustering, sparse coding and non-negative matrix factorization algorithms for: reconstruction, noise separation, prediction, classification and simulating auditory filters from acoustic signals. The results show that the most efficient and highly scalable clustering algorithm with a simple modification in inference and learning phase is able to produce meaningful results. Clustering features are also comparable with sparse coding and non-negative matrix factorization in most of the tasks (e.g. classification) and even more successful in some (e.g. prediction). Shift invariant sparse coding is also used on a novel application, inferring hearing loss from speech signal, and produced promising results. Performance of algorithms with regard to some important factors such as: time series components, number of features and size of receptive field is also analyzed. The results show that there is a significant positive correlation between performance of clustering and degree of trend, frequency skewness, frequency kurtosis and serial correlation of data, whereas the correlation is negative in the case of dataset average bandwidth. Performance of shift invariant sparse coding is affected by frequency skewness, frequency kurtosis and serial correlation of data. Non-negative matrix factorization is influenced by the same data characteristics as clustering

    Semi-Supervised Time Point Clustering for Multivariate Time Series


    Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data

    Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (i.e., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios.
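    The abstract mentions dynamic programming for one of the two subproblems. The sketch below shows, under stated assumptions, how a dynamic program can assign each time point to a cluster while discouraging frequent switches, which yields segmentation and clustering together. The cost table `costs` and the `switch_penalty` parameter are hypothetical stand-ins for illustration; this is not TICC's likelihood model or its published code.

```python
def assign_clusters(costs, switch_penalty):
    """Assign each time point to a cluster, minimizing the summed point costs
    plus `switch_penalty` for every change of cluster between neighbors.

    costs[t][k] is an assumed, precomputed cost of placing point t in cluster k.
    """
    n, k = len(costs), len(costs[0])
    best = list(costs[0])   # best[j]: cheapest path ending in cluster j so far
    back = []               # back[t-1][j]: predecessor cluster of j at time t
    for t in range(1, n):
        cheapest_prev = min(range(k), key=lambda j: best[j])
        new_best, pointers = [], []
        for j in range(k):
            stay = best[j]
            switch = best[cheapest_prev] + switch_penalty
            if stay <= switch:
                new_best.append(stay + costs[t][j])
                pointers.append(j)
            else:
                new_best.append(switch + costs[t][j])
                pointers.append(cheapest_prev)
        best = new_best
        back.append(pointers)
    # Trace the cheapest path backwards from the final time point.
    label = min(range(k), key=lambda j: best[j])
    labels = [label]
    for pointers in reversed(back):
        label = pointers[label]
        labels.append(label)
    labels.reverse()
    return labels


# Hypothetical costs for five time points and two clusters.
costs = [[0, 1], [0, 1], [0.6, 0.4], [1, 0], [1, 0]]
labels = assign_clusters(costs, switch_penalty=0.5)
print(labels)
```

    A larger `switch_penalty` favors longer segments; with a very large value every point ends up in a single cluster.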

    Development of a Realistic Driving Cycle Using Time Series Clustering Technique for Buses: Thailand Case Study

    Realistic driving cycles for Thailand’s road conditions have been studied only for the pollution problems of passenger cars and light-duty trucks in urban areas of the metropolitan region. Such driving cycles did not cover rural areas or the in-use operation of buses. Furthermore, the driving patterns produced by such methods were very irregular and complicated for chassis dynamometer operation. This study proposes a new method for developing a realistic driving cycle for bus routes in rural areas using a time series clustering technique. As a case study, the method was applied to driving data collected by on-board measurement on route 323 in Kanchanaburi province. The procedure for selecting a suitable speed and interval time for driving cycle construction is described. Similarity of driving characteristics was identified with the clustering technique for each time duration to decide the best driving cycle. In conclusion, the most frequent speed range over entire trips is 30-40 km/h, with the highest ratio of deceleration time to the entire trip. Furthermore, discrete average speeds at each time point with a 40-second interval are the best match to the real driving conditions in this case study

    Using Moving Average with a Hybrid Artificial Neural Network and Fuzzy Inference System Method for Weather Prediction

    Prediction is needed in many sectors of life, and weather prediction is one of them. Weather can be predicted over a given time span, so to predict weather conditions over such a span this study uses a moving average with a hybrid artificial neural network and fuzzy inference system method. The data come from BMKG Karangploso, Malang, and use four parameters that influence weather conditions: temperature, air pressure, air humidity, and wind speed. The model achieves an accuracy of up to 73.91%
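    A minimal sketch of the moving average step named in this entry, assuming a simple (unweighted) trailing window. The function name, the window length, and the temperature readings are hypothetical; the entry does not state which variant or window the study used.

```python
def moving_average(values, window):
    """Simple trailing moving average; returns one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


# Hypothetical temperature readings, not data from the study.
temperatures = [24.0, 26.0, 28.0, 27.0, 25.0]
smoothed = moving_average(temperatures, window=3)
print(smoothed)
```

    The smoothed series is shorter than the input by `window - 1` points, which matters when aligning it with the other weather parameters.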

    Comparing Time Series Through Event Clustering

    The comparison of two time series and the extraction of subsequences that are common to the two is a complex data mining problem. Many existing techniques, like the Discrete Fourier Transform (DFT), offer solutions for comparing two whole time series. Often, however, the important thing is to analyse certain regions, known as events, rather than the whole time series. This applies to domains like the stock market, seismography or medicine. In this paper, we propose a method for comparing two time series by analysing the events present in the two. The proposed method is applied to time series generated by stabilometric and posturographic systems within a branch of medicine studying balance-related functions in human beings
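    For contrast with the event-based method, here is a minimal sketch of the whole-series comparison the abstract attributes to DFT-based techniques: keep the first few Fourier coefficients of each series and measure the distance between them. The function names and the choice of `n_coeffs` are assumptions for illustration, and the paper's own event-based method is not shown.

```python
import cmath
import math


def dft_coefficients(series, n_coeffs):
    """First `n_coeffs` DFT coefficients, scaled by 1/sqrt(n) so that the
    full transform preserves Euclidean distance."""
    n = len(series)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(series)) / math.sqrt(n)
            for k in range(n_coeffs)]


def dft_distance(a, b, n_coeffs=4):
    """Distance between two equal-length series using only low-order coefficients."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    ca = dft_coefficients(a, n_coeffs)
    cb = dft_coefficients(b, n_coeffs)
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(ca, cb)))


# Hypothetical series of equal length.
a = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
b = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
print(dft_distance(a, b))
```

    Because only a subset of coefficients is used, this distance never exceeds the plain Euclidean distance between the two series.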