SCMFTS: Scalable and Distributed Complexity Measures and Features for Univariate and Multivariate Time Series in Big Data Environments
This research has been partially funded by the following grants: TIN2016-81113-R from the Spanish Ministry of Economy and Competitiveness, P12-TIC-2985 and P18-TP-5168 from the Andalusian Regional Government, Spain, and the EU Commission with FEDER funds. Francisco J. Baldan holds the FPI grant BES-2017-080137 from the Spanish Ministry of Economy and Competitiveness. D. Peralta is a Postdoctoral Fellow of the Research Foundation of Flanders (170303/12X1619N). Y. Saeys is an ISAC Marylou Ingram Scholar.
Time series data are becoming increasingly important due to the interconnectedness of the world. Classical problems
keep growing in size and require ever more resources to process, and Big Data technologies offer many
solutions. Although the principal algorithms for traditional vector-based problems are available in Big Data environments,
the lack of tools for time series processing in these environments needs to be addressed. In this work, we propose a scalable
and distributed time series transformation for Big Data environments based on well-known time series features (SCMFTS),
which allows practitioners to apply traditional vector-based algorithms to time series problems. The proposed transformation,
combined with the algorithms available in Spark, improved on the best state-of-the-art results on the Wearable Stress
and Affect Detection dataset, which is the biggest publicly available multivariate time series dataset in the University of
California Irvine (UCI) Machine Learning Repository. In addition, SCMFTS showed a linear relationship between its runtime
and the number of processed time series, demonstrating the linearly scalable behavior that is mandatory in Big Data
environments. SCMFTS has been implemented in the Scala programming language for the Apache Spark framework, and
the code is publicly available.
Spanish Government: TIN2016-81113-R, BES-2017-080137
Andalusian Regional Government, Spain: P12-TIC-2985, P18-TP-5168
European Commission
European Commission Joint Research Centre
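The idea of turning each time series into a fixed-length feature vector, so that ordinary vector-based algorithms can be applied, can be illustrated with a minimal sketch. This is not the SCMFTS implementation (which is written in Scala for Apache Spark): the four features, the function names, and the `FEATURE_NAMES` constant are assumptions chosen only for illustration. Each series is transformed independently of the others, which is the property that lets such a transformation be distributed as a map over the series.

```python
import statistics
from typing import Dict, List, Sequence

# Hypothetical, illustrative feature set; NOT the feature list used by SCMFTS.
FEATURE_NAMES = ["mean", "std", "lag1_autocorr", "mean_abs_change"]


def lag1_autocorrelation(series: Sequence[float]) -> float:
    """Lag-1 autocorrelation; defined as 0.0 for a constant series."""
    mean = statistics.fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0.0:
        return 0.0
    num = sum(
        (series[i] - mean) * (series[i + 1] - mean)
        for i in range(len(series) - 1)
    )
    return num / denom


def extract_features(series: Sequence[float]) -> Dict[str, float]:
    """Map one univariate time series to a small dictionary of features."""
    if len(series) < 2:
        raise ValueError("a series needs at least two points")
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "lag1_autocorr": lag1_autocorrelation(series),
        "mean_abs_change": statistics.fmean(
            abs(b - a) for a, b in zip(series, series[1:])
        ),
    }


def transform(dataset: List[Sequence[float]]) -> List[List[float]]:
    """Turn a list of series into rows of a feature matrix.

    Every row depends on a single series only, so this loop could be
    replaced by a distributed map without changing the result.
    """
    rows = []
    for series in dataset:
        feats = extract_features(series)
        rows.append([feats[name] for name in FEATURE_NAMES])
    return rows
```

The rows returned by `transform` have the same length regardless of the length of the input series, so they could be passed to any classifier that expects tabular input.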
Complexity Measures and Features for Times Series classification
Classification of time series is a growing problem in different disciplines due
to the progressive digitalization of the world. Currently, the state of the art
in time series classification is dominated by the Hierarchical Vote Collective
of Transformation-based Ensembles (HIVE-COTE). This algorithm is composed of several
classifiers from different domains distributed across five large modules. The results
of each module are combined, weighted according to an internal evaluation
process, which allows this algorithm to obtain the best results in the state of the art. One
Nearest Neighbour with Dynamic Time Warping remains the baseline classifier
in any time series classification problem for its simplicity and good results.
Despite their performance, both share a weakness: they are not
interpretable. In the field of time series classification, there is a tradeoff between
accuracy and interpretability. In this work, we propose a set of characteristics
capable of extracting information on the structure of the time series to face time
series classification problems. These characteristics make it possible to apply
traditional classification algorithms to time series problems. The experimental
results of our proposal show no statistically significant differences from the second
and third best models of the state of the art. Apart from competitive
accuracy, our proposal is able to offer interpretable results based on the set of
characteristics proposed.
Spanish Government: TIN2016-81113-R, PID2020-118224RB-I00, BES-2017-080137
Andalusian Regional Government, Spain: P12-TIC-2958, P18-TP-5168, A-TIC-388-UGR-1
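For readers unfamiliar with the One Nearest Neighbour with Dynamic Time Warping baseline mentioned in the abstract above, the following is a minimal sketch of the textbook technique. It uses an unconstrained warping path with squared point-wise cost, and the function names are assumptions made for this example; it is not taken from the paper's code or experimental setup.

```python
import math
from typing import Hashable, List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic Time Warping distance between two univariate series.

    Classic dynamic programme, keeping only two rows of the cost matrix.
    No warping-window constraint is applied (an assumption of this sketch).
    """
    if not a or not b:
        raise ValueError("both series must be non-empty")
    m = len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, len(a) + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def predict_1nn_dtw(
    train: List[Tuple[Sequence[float], Hashable]],
    query: Sequence[float],
) -> Hashable:
    """Return the label of the training series closest to `query` under DTW."""
    if not train:
        raise ValueError("training set must not be empty")
    best_label, best_dist = None, math.inf
    for series, label in train:
        dist = dtw_distance(series, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

The prediction is just the label of the nearest stored series, so the classifier reports which training example was closest but says nothing about which properties of the series drove the decision. That limitation is what the abstract contrasts with a feature-based representation.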
Time series analysis in big data environments
This doctoral thesis has been supported by the Spanish National Research Project TIN2016-81113-R and
by the Andalusian Regional Government Project P12-TIC-2958. Francisco Javier Baldán Lozano holds the FPI
grant BES-2017-080137 from the Spanish Ministry of Economy and Competitiveness.
This thesis focuses on time series analysis, specifically on supervised classification tasks. Although the time series classification field offers a large number of approaches to this problem, the proposals made in this field can be classified into three main groups: distance-based, feature-based, and deep learning.
Thesis, Univ. Granada.