40 research outputs found

    A review on distance based time series classification

    Time series classification is a growing research topic due to the vast amount of time series data being created across a wide variety of fields. The particular nature of the data makes it a challenging task, and different approaches have been taken, including the distance based approach. 1-NN has been widely used within distance based time series classification because it is simple yet performs well. However, its supremacy may be attributable to its ability to use time-series-specific distances within the classification process rather than to the classifier itself. With the aim of exploiting these distances within more complex classifiers, new approaches have arisen in the past few years that are competitive with, or outperform, the 1-NN based approaches. In some cases, these new methods use the distance measure to transform the series into feature vectors, bridging the gap between time series and traditional classifiers. In other cases, the distances are employed to obtain a time series kernel and enable the use of kernel methods for time series classification. One of the main challenges is that a kernel function must be positive semi-definite, a matter that is also addressed within this review. The review includes a taxonomy of the methods that aim to classify time series using a distance based approach, as well as a discussion of the strengths and weaknesses of each method. TIN2016-78365-
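    The two uses of a time series distance mentioned in this abstract (inside a 1-NN classifier, and as a way to turn a series into a feature vector) can be illustrated with a minimal sketch. It assumes dynamic time warping (DTW) as the distance, which the abstract does not name, and the function names `dtw_distance`, `predict_1nn`, and `distance_features` are hypothetical, not taken from the review.

```python
import math


def dtw_distance(a, b):
    """Unconstrained DTW distance with an absolute-difference local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat b[j-1]
                cost[i][j - 1],      # repeat a[i-1]
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def predict_1nn(train_series, train_labels, query, distance=dtw_distance):
    """Return the label of the training series closest to the query."""
    nearest = min(
        range(len(train_series)),
        key=lambda i: distance(train_series[i], query),
    )
    return train_labels[nearest]


def distance_features(train_series, series, distance=dtw_distance):
    """Represent a series by its distances to every training series,
    so that any vector-based classifier could be trained on the result."""
    return [distance(series, reference) for reference in train_series]
```

    With a feature vector of this kind, a standard classifier can be applied to series of different lengths, which is one way to read the "bridging the gap" remark above.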

    Contributions to Time Series Classification: Meta-Learning and Explainability

    This thesis includes three contributions of different types to the area of supervised time series classification, a growing field of research due to the number of time series collected daily in a wide variety of domains. In this context, the number of methods available for classifying time series is increasing, and the classifiers are becoming more competitive and varied. The first contribution of the thesis is therefore a taxonomy of distance-based time series classifiers, together with an exhaustive review of the existing methods and their computational costs. Moreover, from the point of view of a non-expert user (and even from that of an expert), choosing a suitable classifier for a given problem is a difficult task. The second contribution therefore deals with the recommendation of time series classifiers, for which we use a meta-learning approach. Finally, the third contribution is a method to explain the predictions of time series classifiers, in which we calculate the relevance of each region of a series to the prediction. This explanation method is based on perturbations, for which we consider specific and realistic transformations for time series. BES-2016-07689
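    The idea of perturbation-based relevance can be sketched generically as follows. This is not the thesis's method: the thesis uses specific, realistic transformations, whereas this sketch assumes the simplest possible perturbation (replacing a region with the series mean). The names `region_relevance` and `score` are hypothetical; `score` stands for any function that maps a series to the classifier's confidence in the predicted class.

```python
def region_relevance(series, score, region_length):
    """Relevance of each consecutive region, measured as the drop in the
    classifier score when that region is replaced by the series mean."""
    base = score(series)
    fill = sum(series) / len(series)
    relevance = []
    for start in range(0, len(series), region_length):
        end = min(start + region_length, len(series))
        # Perturb only the current region; leave the rest untouched.
        perturbed = series[:start] + [fill] * (end - start) + series[end:]
        relevance.append(base - score(perturbed))
    return relevance
```

    A large positive value indicates that the prediction depends heavily on that region; values near zero indicate that the region can be altered without changing the score.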

    Contributions to Time Series Classification: Meta-Learning and Explainability

    141 p. This thesis includes three contributions of different kinds to the area of supervised time series classification, a booming field owing to the number of time series collected every day across a wide variety of domains. In this context, the number of methods available for classifying time series keeps growing, and the classifiers are increasingly competitive and varied. Accordingly, the first contribution of the thesis proposes a taxonomy of distance-based time series classifiers, with an exhaustive review of the existing methods and their computational costs. Moreover, from the point of view of a non-expert user (and even from that of an expert), choosing a suitable classifier for a specific problem is a difficult task. The second contribution therefore addresses the recommendation of time series classifiers, for which we use an approach based on meta-learning. Finally, the third contribution proposes a method for explaining the predictions of time series classifiers, in which we compute the relevance of each region of a series to the prediction. This explanation method is based on perturbations, for which we consider specific and realistic transformations for time series.

    Deep Time-Series Clustering: A Review

    We present a comprehensive, detailed review of time-series data analysis, with emphasis on deep time-series clustering (DTSC), and a case study in the context of movement behavior clustering utilizing the deep clustering method. Specifically, we modified the DCAE architectures to suit time-series data at the time of our prior deep clustering work. Lately, several works have been carried out on deep clustering of time-series data. We also review these works, identify the state of the art, and present an outlook on this important field of DTSC from five important perspectives.

    Shapelet Transforms for Univariate and Multivariate Time Series Classification

    Time Series Classification (TSC) is a growing field of machine learning research. One particular algorithm from the TSC literature is the Shapelet Transform (ST). Shapelets are phase-independent subsequences that are extracted from time series to form discriminatory features. It has been shown that using the shapelets to transform the datasets into a new space can improve performance. One of the major problems with ST is that the algorithm is O(n^2 m^4), where n is the number of time series and m is the length of the series. As a problem increases in size, or additional dimensions are added, the algorithm quickly becomes computationally infeasible. The research question addressed is whether the shapelet transform can be improved in terms of accuracy and speed. Making algorithmic improvements to shapelets will enable the development of multivariate shapelet algorithms that can attempt to solve much larger problems in realistic time frames. In support of this thesis a new distance early abandon method is proposed. A class balancing algorithm is implemented, which uses a one vs. all multi-class information gain that enables heuristics which were developed for two-class problems. To support these improvements a large scale analysis of the best shapelet algorithms is conducted as part of a larger experimental evaluation. ST is shown to be one of the most accurate algorithms in TSC on the UCR-UEA datasets. Contract classification is proposed for shapelets, where a fixed run time is set, and the number of shapelets is bounded. Four search algorithms are evaluated with fixed run times of one hour and one day, three of which are not significantly worse than a full enumeration. Finally, three multivariate shapelet algorithms are developed and compared to benchmark results and multivariate dynamic time warping.
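    To make the transform concrete, the following is a minimal sketch of a shapelet distance and the resulting feature space. It assumes squared Euclidean distance over sliding windows without the z-normalisation that shapelet methods commonly apply, and it uses the generic textbook form of early abandoning, not the new early abandon method proposed in the thesis. The names `shapelet_distance` and `shapelet_transform` are hypothetical.

```python
def shapelet_distance(shapelet, series):
    """Minimum squared Euclidean distance between the shapelet and any
    window of the series with the same length."""
    length = len(shapelet)
    best = float("inf")
    for start in range(len(series) - length + 1):
        total = 0.0
        for k in range(length):
            diff = series[start + k] - shapelet[k]
            total += diff * diff
            if total >= best:
                break  # early abandon: this window cannot beat the best so far
        else:
            # The inner loop finished without abandoning, so this is a new best.
            best = total
    return best


def shapelet_transform(shapelets, dataset):
    """Map every series to a vector of its distances to each shapelet."""
    return [
        [shapelet_distance(shapelet, series) for shapelet in shapelets]
        for series in dataset
    ]
```

    Because the distance takes the minimum over all window positions, the feature value does not depend on where in the series the pattern occurs, which is what makes shapelets phase independent.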

    Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey

    Data Augmentation (DA) has emerged as an indispensable strategy in Time Series Classification (TSC), primarily due to its capacity to amplify training samples, thereby bolstering model robustness, diversifying datasets, and curtailing overfitting. However, the current landscape of DA in TSC is plagued with fragmented literature reviews, nebulous methodological taxonomies, inadequate evaluative measures, and a dearth of accessible, user-oriented tools. In light of these challenges, this study embarks on an exhaustive dissection of DA methodologies within the TSC realm. Our initial approach involved an extensive literature review spanning a decade, revealing that contemporary surveys scarcely capture the breadth of advancements in DA for TSC, prompting us to meticulously analyze over 100 scholarly articles to distill more than 60 unique DA techniques. This rigorous analysis precipitated the formulation of a novel taxonomy, purpose-built for the intricacies of DA in TSC, categorizing techniques into five principal echelons: Transformation-Based, Pattern-Based, Generative, Decomposition-Based, and Automated Data Augmentation. Our taxonomy promises to serve as a robust navigational aid for scholars, offering clarity and direction in method selection. Addressing the conspicuous absence of holistic evaluations for prevalent DA techniques, we executed an all-encompassing empirical assessment, wherein upwards of 15 DA strategies were subjected to scrutiny across 8 UCR time-series datasets, employing ResNet and a multi-faceted evaluation paradigm encompassing Accuracy, Method Ranking, and Residual Analysis, yielding a benchmark accuracy of 88.94 ± 11.83%. Our investigation underscored the inconsistent efficacies of DA techniques, with..