Highly comparative feature-based time-series classification

Authors: Fulcher Ben D.; Jones Nick S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/05/2014

A highly comparative, feature-based approach to time-series classification is introduced that uses an extensive database of algorithms to extract thousands of interpretable features from time series. These features are derived from across the scientific time-series analysis literature, and include summaries of time series in terms of their correlation structure, distribution, entropy, stationarity, scaling properties, and fits to a range of time-series models. After computing thousands of features for each time series in a training set, those that are most informative of the class structure are selected using greedy forward feature selection with a linear classifier. The resulting feature-based classifiers automatically learn the differences between classes using a reduced number of time-series properties, and circumvent the need to calculate distances between time series. Representing time series in this way results in orders of magnitude of dimensionality reduction, allowing the method to perform well on very large datasets containing long time series or time series of different lengths. For many of the datasets studied, classification performance exceeded that of conventional instance-based classifiers, including one-nearest-neighbor classifiers using Euclidean distances and dynamic time warping. Most importantly, the features selected provide an understanding of the properties of the dataset, insight that can guide further scientific investigation.
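The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid scorer is a stand-in assumption for their linear classifier, training accuracy is used in place of whatever validation scheme the paper applies, and the names `nearest_centroid_accuracy`, `greedy_forward_selection`, and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def nearest_centroid_accuracy(X: Sequence[Sequence[float]],
                              y: Sequence[int],
                              features: List[int]) -> float:
    """Training accuracy of a nearest-centroid classifier using only `features`.

    Assumption: this simple classifier stands in for the linear classifier
    mentioned in the abstract.
    """
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, label in zip(X, y):
        point = [x[f] for f in features]
        pred = min(
            classes,
            key=lambda c: sum((p - m) ** 2 for p, m in zip(point, centroids[c])),
        )
        correct += int(pred == label)
    return correct / len(y)


def greedy_forward_selection(X: Sequence[Sequence[float]],
                             y: Sequence[int],
                             max_features: int) -> Tuple[List[int], float]:
    """Add one feature at a time, keeping the one that most improves the score."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    best_score = 0.0
    while remaining and len(selected) < max_features:
        scored = [(nearest_centroid_accuracy(X, y, selected + [f]), f)
                  for f in remaining]
        cand_score, cand = max(scored, key=lambda t: t[0])
        if cand_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected, best_score


# Toy feature matrix (hypothetical): only column 1 separates the two classes.
X = [[0.5, 0.0, 3.0], [0.1, 0.2, 1.0], [0.4, 0.1, 2.0],
     [0.3, 1.0, 2.5], [0.2, 1.1, 1.5], [0.45, 0.9, 2.2]]
y = [0, 0, 0, 1, 1, 1]
selected, score = greedy_forward_selection(X, y, max_features=2)
```

On this toy data the loop stops after choosing the single separating column, which mirrors the abstract's point that a small number of informative features can replace distance computations between whole time series.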

arXiv.org e-Print Archive

CiteSeerX

The Hybrid Dynamic Prototype Construction and Parameter Optimization with Genetic Algorithm for Support Vector Machine

Authors: Chung I-Fang; Li Tsun-Chen; Lu Chun-Liang
Publication venue: 'Taiwan Association of Engineering and Technology Innovation'
Publication date: 01/10/2015

The optimized hybrid artificial intelligence model is a potential tool for dealing with construction engineering and management problems. The support vector machine (SVM) has achieved excellent performance in a wide variety of applications. Nevertheless, effectively reducing the training complexity of the SVM remains a serious challenge. In this paper, a novel order-independent approach for instance selection, called the dynamic condensed nearest neighbor (DCNN) rule, is proposed to adaptively construct prototypes in the training dataset and to reduce the redundant or noisy instances in a classification process for the SVM. Furthermore, a hybrid model based on the genetic algorithm (GA) is proposed to simultaneously optimize the prototype construction and the SVM kernel parameter settings to enhance the classification accuracy. Several UCI benchmark datasets are considered to compare the proposed hybrid GA-DCNN-SVM approach with the previously published GA-based method. The experimental results illustrate that the proposed hybrid model outperforms the existing method and effectively improves the classification performance for the SVM.
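For background on instance selection, the sketch below shows the classic condensed nearest neighbor rule (Hart's CNN), not the DCNN rule proposed in the paper, whose details the abstract does not give. Unlike DCNN as described, this classic rule depends on the order of the training data. The function names and toy data are hypothetical.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def condensed_nearest_neighbor(X: Sequence[Sequence[float]],
                               y: Sequence[int]) -> List[int]:
    """Return indices of a prototype subset chosen by the classic CNN rule.

    Start with the first instance, then repeatedly add any instance that the
    current prototypes misclassify with 1-NN, until a full pass adds nothing.
    """
    store = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in store:
                continue
            nearest = min(store, key=lambda j: sq_dist(X[i], X[j]))
            if y[nearest] != y[i]:
                store.append(i)  # misclassified, so keep it as a prototype
                changed = True
    return store


# Toy one-dimensional data (hypothetical): two well-separated classes.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [0, 0, 0, 1, 1, 1]
prototypes = condensed_nearest_neighbor(X, y)
```

Here six training instances reduce to two prototypes, one per class. A reduced set like this is what would then be passed to the SVM in place of the full training set.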

Taiwan Association of Engineering and Technology Innovation: E-Journals

k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)

Authors: Cunningham Padraig; Delany Sarah Jane
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/04/2020

Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier: classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because poor run-time performance is no longer such a problem, given the computational power now available. This paper presents an overview of techniques for Nearest Neighbour classification, focusing on mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours, and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time series, retrieval speed-up, and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods. Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN
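The classification rule described in this abstract can be sketched in a few lines. This is an illustrative brute-force version with Euclidean distance and majority voting, not the code from the paper's appendix. The name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def knn_predict(train_X: Sequence[Sequence[float]],
                train_y: Sequence[Hashable],
                query: Sequence[float],
                k: int = 3) -> Hashable:
    """Predict the class of `query` by majority vote among its k nearest neighbours."""
    # Rank all training examples by Euclidean distance to the query.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy two-dimensional data (hypothetical): two clusters labelled "a" and "b".
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["a", "a", "a", "b", "b", "b"]
near_origin = knn_predict(train_X, train_y, [0.5, 0.5], k=3)
near_far_cluster = knn_predict(train_X, train_y, [5.5, 5.5], k=3)
```

Every query scans the full training set, which is the run-time cost the abstract refers to and the motivation for the retrieval speed-up techniques it mentions.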

arXiv.org e-Print Archive

Arrow@TUDublin