80 research outputs found
Deep learning for time series classification: a review
Time Series Classification (TSC) is an important and challenging problem in
data mining. With the increase of time series data availability, hundreds of
TSC algorithms have been proposed. Among these methods, only a few have
considered Deep Neural Networks (DNNs) to perform this task. This is surprising
as deep learning has seen very successful applications in recent years. DNNs
have indeed revolutionized the field of computer vision especially with the
advent of novel deeper architectures such as Residual and Convolutional Neural
Networks. Apart from images, sequential data such as text and audio can also be
processed with DNNs to reach state-of-the-art performance for document
classification and speech recognition. In this article, we study the current
state-of-the-art performance of deep learning algorithms for TSC by presenting
an empirical study of the most recent DNN architectures for TSC. We give an
overview of the most successful deep learning applications in various time
series domains under a unified taxonomy of DNNs for TSC. We also provide an
open source deep learning framework to the TSC community where we implemented
each of the compared approaches and evaluated them on a univariate TSC
benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By
training 8,730 deep learning models on 97 time series datasets, we propose the
most exhaustive study of DNNs for TSC to date.
Comment: Accepted at Data Mining and Knowledge Discovery
GENDIS : genetic discovery of shapelets
In the time series classification domain, shapelets are subsequences that are discriminative of a certain class. It has been shown that classifiers can achieve state-of-the-art results by taking the distances from the input time series to different discriminative shapelets as their input. Additionally, these shapelets can be visualized and are therefore interpretable, making them appealing in critical domains, where longitudinal data are ubiquitous. In this study, a new paradigm for shapelet discovery is proposed, based on evolutionary computation. The advantages of the proposed approach are that: (i) it is gradient-free, which may allow it to escape local optima more easily and supports non-differentiable objectives; (ii) no brute-force search is required, making the algorithm scalable; (iii) the total number of shapelets and the length of each shapelet are evolved jointly with the shapelets themselves, alleviating the need to specify these beforehand; (iv) entire sets are evaluated at once, as opposed to single shapelets, resulting in smaller final sets with fewer redundant shapelets and similar predictive performance; and (v) the discovered shapelets do not need to be subsequences of the input time series. We present the results of experiments which validate the enumerated advantages.
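The distance-based feature extraction this abstract builds on is easy to sketch. The snippet below is a minimal illustration, not GENDIS itself: it computes the classic shapelet-transform feature (minimum sliding-window Euclidean distance from a series to each shapelet). The series and shapelet values are invented for the example.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and all
    equal-length subsequences of the series (sliding window)."""
    n, l = len(series), len(shapelet)
    return min(np.linalg.norm(series[i:i + l] - shapelet)
               for i in range(n - l + 1))

def shapelet_transform(series, shapelets):
    """Map a time series to its vector of shapelet distances,
    which a downstream classifier can consume as features."""
    return np.array([shapelet_distance(series, s) for s in shapelets])

# Toy example: a series containing a bump vs. a flat series.
bump = np.array([0., 0., 1., 2., 1., 0., 0.])
flat = np.zeros(7)
shapelets = [np.array([1., 2., 1.])]  # hypothetical discriminative shape
print(shapelet_transform(bump, shapelets))  # distance 0: exact match
print(shapelet_transform(flat, shapelets))  # strictly larger distance
```

Evolutionary approaches such as the one described above search over the shapelet values themselves, using this distance-based transform inside the fitness evaluation.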
Generalised Interpretable Shapelets for Irregular Time Series
The shapelet transform is a form of feature extraction for time series, in
which a time series is described by its similarity to each of a collection of
`shapelets'. However it has previously suffered from a number of limitations,
such as being limited to regularly-spaced fully-observed time series, and
having to choose between efficient training and interpretability. Here, we
extend the method to continuous time, and in doing so handle the general case
of irregularly-sampled partially-observed multivariate time series.
Furthermore, we show that a simple regularisation penalty may be used to train
efficiently without sacrificing interpretability. The continuous-time
formulation additionally allows for learning the length of each shapelet
(previously a discrete object) in a differentiable manner. Finally, we
demonstrate that the measure of similarity between time series may be
generalised to a learnt pseudometric. We validate our method by demonstrating
its performance and interpretability on several datasets; for example we
discover (purely from data) that the digits 5 and 6 may be distinguished by the
chirality of their bottom loop, and that a kind of spectral gap exists in
spoken audio classification.
Contrastive Shapelet Learning for Unsupervised Multivariate Time Series Representation Learning
Recent studies have shown great promise in unsupervised representation
learning (URL) for multivariate time series, because URL can learn
generalizable representations for many downstream tasks without using
inaccessible labels. However, existing approaches usually adopt the models
originally designed for other domains (e.g., computer vision) to encode the
time series data and rely on strong assumptions to design learning objectives,
which limits their ability to perform well. To deal with these problems, we
propose a novel URL framework for multivariate time series by learning
time-series-specific shapelet-based representations through a popular
contrastive learning paradigm. To the best of our knowledge, this is the first
work that explores shapelet-based embeddings in unsupervised
general-purpose representation learning. A unified shapelet-based encoder and a
novel learning objective with multi-grained contrasting and multi-scale
alignment are particularly designed to achieve our goal, and a data
augmentation library is employed to improve the generalization. We conduct
extensive experiments using tens of real-world datasets to assess the
representation quality on many downstream tasks, including classification,
clustering, and anomaly detection. The results demonstrate the superiority of
our method against not only URL competitors, but also techniques specially
designed for downstream tasks. Our code has been made publicly available at
https://github.com/real2fish/CSL
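The multi-grained contrasting objective described above is a variant of contrastive learning; the paper's exact loss is more elaborate, but its core ingredient can be sketched with a standard InfoNCE loss over pairs of embeddings. The array shapes, noise level, and temperature below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss: each anchor should be closest to its
    own positive (augmented view) among all positives in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # matched pairs on the diagonal

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))                   # hypothetical shapelet embeddings
aligned = info_nce(emb, emb + 0.01 * rng.normal(size=emb.shape))
mismatched = info_nce(emb, np.roll(emb, 1, axis=0))
print(aligned < mismatched)  # correctly paired views score a lower loss
```

In a framework like the one described, the embeddings would come from a shapelet-based encoder applied to two augmented views of the same series.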
A scalable machine learning system for anomaly detection in manufacturing
Reports of recalls in the automotive industry have become part of everyday media coverage. In fact, their frequency and the number of affected vehicles have continued to increase in recent years. Most recall campaigns can be traced back to faults in production. For manufacturers, beyond improvements in quality management, the intelligent and automated analysis of production process data represents a largely untapped potential. The technical challenges, however, are enormous: the data volumes are vast, and the data patterns characteristic of a fault are necessarily unknown. The use of machine learning (ML) is a promising approach to enable this search for the proverbial needle in a haystack. Algorithms are to learn autonomously from the data to distinguish between normal and anomalous process behaviour, so that process experts can be warned at an early stage. Industry and research have been trying for years to establish such ML systems in production environments. Most ML projects, however, fail before reaching the productive phase, or consume enormous resources in operation while delivering no economic value.
The goal of this thesis is the development of a technical framework for implementing a scalable ML system for anomaly detection in process data. The training processes for initialising and adapting the models are designed to be highly automatable in order to enable a structured scaling process. The developed DM/ML method makes it possible to reduce the long-term effort of system operation through an initial additional investment in the model training process, and has proven both relatively and absolutely scalable in practice. This reduces system-level complexity to a manageable level, enabling subsequent productive operation.
Deep learning for time series classification
Time series analysis is a field of data science which is interested in
analyzing sequences of numerical values ordered in time. Time series are
particularly interesting because they allow us to visualize and understand the
evolution of a process over time. Their analysis can reveal trends,
relationships and similarities across the data. There exist numerous fields
containing data in the form of time series: health care (electrocardiogram,
blood sugar, etc.), activity recognition, remote sensing, finance (stock market
price), industry (sensors), etc. Time series classification consists of
constructing algorithms dedicated to automatically label time series data. The
sequential aspect of time series data requires the development of algorithms
that are able to harness this temporal property, thus making the existing
off-the-shelf machine learning models for traditional tabular data suboptimal
for solving the underlying task. In this context, deep learning has emerged in
recent years as one of the most effective methods for tackling the supervised
classification task, particularly in the field of computer vision. The main
objective of this thesis was to study and develop deep neural networks
specifically constructed for the classification of time series data. We thus
carried out the first large scale experimental study allowing us to compare the
existing deep methods and to position them relative to other non-deep-learning
state-of-the-art methods. Subsequently, we made numerous contributions in
this area, notably in the context of transfer learning, data augmentation,
ensembling and adversarial attacks. Finally, we have also proposed a novel
architecture, based on the famous Inception network (Google), which ranks among
the most efficient to date.
Comment: PhD thesis
timeXplain -- A Framework for Explaining the Predictions of Time Series Classifiers
Modern time series classifiers display impressive predictive capabilities,
yet their decision-making processes mostly remain black boxes to the user. At
the same time, model-agnostic explainers, such as the recently proposed SHAP,
promise to make the predictions of machine learning models interpretable,
provided there are well-designed domain mappings. We bring both worlds together
in our timeXplain framework, extending the reach of explainable artificial
intelligence to time series classification and value prediction. We present
novel domain mappings for the time and the frequency domain as well as series
statistics and analyze their explicative power as well as their limits. We
employ timeXplain in a large-scale experimental comparison of several
state-of-the-art time series classifiers and discover similarities between
seemingly distinct classification concepts such as residual neural networks and
elastic ensembles.
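The time-domain mapping idea, perturbing stretches of the series and observing the model's response, can be illustrated with a simple occlusion baseline. This is not timeXplain's implementation (which feeds such perturbations into SHAP); the toy `predict` function below is a made-up stand-in for a real classifier.

```python
import numpy as np

def occlusion_importance(predict, series, n_slices=7, background=0.0):
    """Crude time-domain attribution: replace each contiguous slice with a
    background value and record how much the model's score drops."""
    base = predict(series)
    bounds = np.linspace(0, len(series), n_slices + 1).astype(int)
    drops = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        perturbed = series.copy()
        perturbed[lo:hi] = background
        drops.append(base - predict(perturbed))
    return np.array(drops)

# Hypothetical "classifier": scores a series by the height of its peak.
predict = lambda s: float(s.max())
series = np.array([0., 0., 0., 5., 0., 0., 0.])
imp = occlusion_importance(predict, series)
print(imp)  # only the slice covering the peak matters
```

A SHAP-style explainer generalises this idea by attributing the score change over coalitions of slices rather than one slice at a time.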
Time Series Anomaly Detection using Diffusion-based Models
Diffusion models have been recently used for anomaly detection (AD) in
images. In this paper we investigate whether they can also be leveraged for AD
on multivariate time series (MTS). We test two diffusion-based models and
compare them to several strong neural baselines. We also extend the PA%K
protocol, by computing a ROCK-AUC metric, which is agnostic to both the
detection threshold and the ratio K of correctly detected points. Our models
outperform the baselines on synthetic datasets and are competitive on
real-world datasets, illustrating the potential of diffusion-based methods for
AD in multivariate time series.
Comment: Accepted at the AI4TS workshop of the 23rd IEEE International
Conference on Data Mining (ICDM 2023), 9 pages, 7 figures, 2 tables
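The PA%K evaluation protocol the paper extends can be sketched compactly: a ground-truth anomaly segment counts as detected once at least K% of its points are flagged. The following is an assumed reading of the protocol, not the paper's evaluation code; the scores, labels, and threshold are invented.

```python
import numpy as np

def pa_percent_k(scores, labels, threshold, k):
    """Point adjustment at level K: within each ground-truth anomaly
    segment, if at least K% of points exceed the threshold, the whole
    segment is marked as detected."""
    preds = scores > threshold
    adjusted = preds.copy()
    # Find contiguous anomalous segments in the binary labels.
    padded = np.concatenate([[0], labels, [0]])
    starts = np.flatnonzero(np.diff(padded) == 1)
    ends = np.flatnonzero(np.diff(padded) == -1)
    for lo, hi in zip(starts, ends):
        if preds[lo:hi].mean() * 100 >= k:
            adjusted[lo:hi] = True
    return adjusted

labels = np.array([0, 1, 1, 1, 1, 0, 0])
scores = np.array([.1, .9, .2, .8, .1, .3, .2])
print(pa_percent_k(scores, labels, threshold=0.5, k=50))
# 2 of the 4 in-segment points exceed 0.5 -> whole segment flagged
```

Sweeping K from 0 to 100 and aggregating the resulting detection rates yields a threshold-and-K-agnostic summary, which is the role a metric like the paper's ROCK-AUC plays.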
- 
