3 research outputs found

    SCMFTS: Scalable and Distributed Complexity Measures and Features for Univariate and Multivariate Time Series in Big Data Environments

    Time series data are becoming increasingly important as the world grows more interconnected. Classical problems keep growing in size and require ever more resources to process, and Big Data technologies offer many solutions. Although the principal algorithms for traditional vector-based problems are available in Big Data environments, tools for time series processing in these environments are still lacking. In this work, we propose SCMFTS, a scalable and distributed time series transformation for Big Data environments based on well-known time series features, which allows practitioners to apply traditional vector-based algorithms to time series problems. The proposed transformation, combined with the algorithms available in Spark, improved on the best state-of-the-art results on the Wearable Stress and Affect Detection dataset, the largest publicly available multivariate time series dataset in the University of California Irvine (UCI) Machine Learning Repository. In addition, the runtime of SCMFTS grew linearly with the number of processed time series, which is the linear scalability that Big Data environments require. SCMFTS has been implemented in the Scala programming language for the Apache Spark framework, and the code is publicly available.
    This research has been partially funded by the following grants: TIN2016-81113-R from the Spanish Ministry of Economy and Competitiveness, P12-TIC-2985 and P18-TP-5168 from the Andalusian Regional Government, Spain, and the EU Commission with FEDER funds. Francisco J. Baldan holds the FPI grant BES-2017-080137 from the Spanish Ministry of Economy and Competitiveness. D. Peralta is a Postdoctoral Fellow of the Research Foundation of Flanders (170303/12X1619N). Y. Saeys is an ISAC Marylou Ingram Scholar.

    Complexity Measures and Features for Times Series classification

    Classification of time series is a growing problem in many disciplines due to the progressive digitalization of the world. Currently, the state of the art in time series classification is dominated by the Hierarchical Vote Collective of Transformation-based Ensembles. This algorithm is composed of several classifiers from different domains, distributed across five large modules. Combining the results of each module, weighted according to an internal evaluation process, allows this algorithm to obtain the best results in the state of the art. One Nearest Neighbour with Dynamic Time Warping remains the baseline classifier for any time series classification problem because of its simplicity and good results. Despite their performance, both share a weakness: they are not interpretable. In time series classification, there is a tradeoff between accuracy and interpretability. In this work, we propose a set of characteristics that extract information about the structure of a time series in order to tackle time series classification problems. These characteristics allow traditional classification algorithms to be used on time series problems. The experimental results of our proposal show no statistically significant differences from the second and third best models of the state of the art. Beyond competitive accuracy, our proposal offers interpretable results based on the proposed set of characteristics.
    Funding: Spanish Government: TIN2016-81113-R, PID2020-118224RB-I00, BES-2017-080137. Andalusian Regional Government, Spain: P12-TIC-2958, P18-TP-5168, A-TIC-388-UGR-1

    Time series analysis in big data environments

    This thesis focuses on time series analysis, specifically on supervised classification tasks. Although the time series classification field offers a large number of approaches to this problem, the proposals made in the field fall into three main groups: distance-based, feature-based, and deep learning. Thesis, Univ. Granada.
    This doctoral thesis has been supported by the Spanish National Research Project TIN2016-81113-R and by the Andalusian Regional Government Project P12-TIC-2958. Francisco Javier Baldán Lozano holds the FPI grant BES-2017-080137 from the Spanish Ministry of Economy and Competitiveness.