
75 research outputs found

Automatic Feature Engineering for Time Series Classification: Evaluation and Discussion

Authors: Bondu Alexis, Gay Dominique, Lemaire Vincent, Renault Aurélien
Publication date: 02/08/2023

Time Series Classification (TSC) has received much attention in the past two decades and is still a crucial and challenging problem in data science and knowledge engineering. Indeed, along with the increasing availability of time series data, many TSC algorithms have been proposed in the literature. Besides state-of-the-art methods based on similarity measures, intervals, shapelets, dictionaries, deep learning methods or hybrid ensemble methods, several tools for extracting unsupervised, informative summary statistics (features) from time series have been designed in recent years. Although these feature engineering tools were originally designed for descriptive analysis and visualization of time series with informative and interpretable features, very few of them have been benchmarked for TSC problems and compared with state-of-the-art TSC algorithms in terms of predictive performance. In this article, we aim to fill this gap and propose a simple TSC process to evaluate the potential predictive performance of the feature sets obtained with existing feature engineering tools. Thus, we present an empirical study of 11 feature engineering tools combined with 9 supervised classifiers over 112 time series data sets. The analysis of the results of more than 10,000 learning experiments indicates that feature-based methods perform as accurately as current state-of-the-art TSC algorithms, and thus should rightfully be considered further in the TSC literature.
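The two-stage process the abstract describes (extract a fixed-length feature vector from each series, then train a supervised classifier on those vectors) can be illustrated with a minimal sketch. The five summary statistics and the nearest-centroid classifier below are illustrative assumptions, not one of the 11 tools or 9 classifiers evaluated in the article, and the toy series are made up for the example.

```python
import math
import statistics


def extract_features(series):
    """Map a 1-D time series to a fixed-length vector of summary statistics."""
    n = len(series)
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    # Least-squares slope of the values against their time index.
    t_mean = (n - 1) / 2
    num = sum((t - t_mean) * (x - mean) for t, x in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den if den else 0.0
    return [mean, std, min(series), max(series), slope]


def fit_centroids(feature_rows, labels):
    """Train a nearest-centroid classifier: average the feature vectors per class."""
    grouped = {}
    for row, label in zip(feature_rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [statistics.fmean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Toy training set (hypothetical data): rising versus falling series.
train_series = [
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [4.0, 3.0, 2.0, 1.0, 0.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
]
train_labels = ["up", "up", "down", "down"]

centroids = fit_centroids([extract_features(s) for s in train_series], train_labels)
prediction = predict(centroids, extract_features([2.0, 3.0, 4.0, 5.0, 6.0]))
print(prediction)
```

Because the classifier only ever sees the feature vectors, either stage can be swapped independently, which is what makes a tools-by-classifiers benchmark grid of the kind described in the abstract possible.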

arXiv.org e-Print Archive

Optimising maintenance operations in photovoltaic solar plants using data analysis for predictive maintenance

Author: Gedde-Dahl Gøran Sildnes
Publication venue: Norwegian University of Life Sciences, Ås
Publication date: 01/01/2022

In PV (photovoltaic) solar power plants, high reliability of critical assets must be ensured; these include inverters, which combine the power from multiple solar cell modules. While avoiding unexpected failures and downtime, maintenance schedules aim to take advantage of the full equipment lifetime. Predictive maintenance schedules trigger maintenance actions by modelling the current equipment condition and the time until a particular failure type occurs, known as residual useful lifetime (RUL). However, predicting the RUL of equipment is complex in this case since the equipment condition is not directly measurable; it is affected by numerous error types with corresponding influencing factors. This work compares statistical and machine learning models using sensor and weather data for the purpose of optimising maintenance decisions. Our methods allow the user to perform maintenance before failure occurs and hence contribute to maximising reliability. We present two distinct data handling and analysis pipelines for predictive maintenance. The first method is based on a Hidden Markov Model, which estimates the degree of degradation on a discrete scale of latent states. The multivariate input time series is transformed using PCA to reduce dimensionality. This approach delivers a profound statistical model providing insight into the temporal dynamics of the degradation process. The second method pursues a machine learning approach by using a Random Forest Regression algorithm on top of a feature selection step from time series data. Both methods are assessed by their ability to predict the RUL from a random point in time prior to failure. The machine learning approach is able to exploit its favourable properties in high-dimensional input data and delivers high predictive performance. Further, we discuss qualitative aspects, such as the interpretability of model parameters and results. Both approaches are benchmarked and compared to one another.
We conclude that both approaches have practical merits and may contribute to more favourable decisions and optimised maintenance operations.
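The idea behind the first pipeline, estimating a hidden degradation level on a discrete scale from observable readings, can be sketched with the standard forward (filtering) recursion of a Hidden Markov Model. Everything specific below is an assumption made for illustration: the three states, the transition and emission probabilities, and the use of a binary "normal/anomalous" observation in place of the PCA-transformed sensor data the thesis works with.

```python
def forward_filter(observations, start, transition, emission):
    """Return P(state_t | observations up to t) for each t (normalised forward pass)."""
    n_states = len(start)
    # Initial belief: prior weighted by the likelihood of the first observation.
    belief = [start[s] * emission[s][observations[0]] for s in range(n_states)]
    total = sum(belief)
    belief = [b / total for b in belief]
    history = [belief]
    for obs in observations[1:]:
        # Predict step: push the current belief through the transition matrix.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update step: weight by the observation likelihood, then renormalise.
        belief = [predicted[j] * emission[j][obs] for j in range(n_states)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Hypothetical left-to-right model: 0 = healthy, 1 = degraded, 2 = near failure.
start = [1.0, 0.0, 0.0]
transition = [
    [0.9, 0.1, 0.0],  # degradation can only stay or move one level forward
    [0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0],
]
# Observation symbols: 0 = normal reading, 1 = anomalous reading.
emission = [
    [0.95, 0.05],
    [0.60, 0.40],
    [0.20, 0.80],
]

readings = [0, 0, 1, 1, 1, 1]
beliefs = forward_filter(readings, start, transition, emission)
most_likely_state = max(range(len(start)), key=lambda s: beliefs[-1][s])
print(beliefs[-1], most_likely_state)
```

A maintenance rule could then act on the filtered belief, for example by scheduling an inspection once the probability mass on the most degraded state passes a chosen threshold; that threshold would be a design choice and is not taken from the thesis.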

pyts: A Python Package for Time Series Classification

Authors: Faouzi Johann, Janati Hicham
Publication venue: Microtome Publishing
Publication date: 01/01/2020

pyts is an open-source Python package for time series classification. This versatile toolbox provides implementations of many algorithms published in the literature, preprocessing functionalities, and data set loading utilities. pyts relies on the standard scientific Python packages numpy, scipy, scikit-learn, joblib, and numba, and is distributed under the BSD-3-Clause license. Documentation contains installation instructions, a detailed user guide, a full API description, and concrete self-contained examples. Source code and documentation can be downloaded from https://github.com/johannfaouzi/pyts

INRIA a CCSD electronic archive server