The Extended Edit Distance Metric

Author: Fuad Muhammad Marwan Muhammad
Marteau Pierre-François
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/09/2007

Similarity search is an important problem in information retrieval, and it relies on a distance between objects. Symbolic representation of time series has recently attracted many researchers because it reduces the dimensionality of these high-dimensional data objects. We propose a new distance metric that is applied to symbolic data objects and test it on time series databases in a classification task. We compare it to other well-known distances for symbolic data objects from the literature. We also prove mathematically that our distance is a metric. Comment: Technical repor
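For orientation, the kind of baseline distance the abstract alludes to can be sketched in a few lines. The block below implements the classic (Levenshtein) edit distance between two symbol strings; it is not the extended metric proposed in the paper, whose definition is not given in this listing. The function name `edit_distance` is an illustrative choice.

```python
def edit_distance(s: str, t: str) -> int:
    """Classic Levenshtein distance between two symbol sequences.

    Baseline sketch only: this is NOT the Extended Edit Distance
    from the paper, just the standard edit distance.
    """
    # prev[j] = distance between the symbols of s consumed so far and t[:j]
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty string
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            curr.append(min(
                prev[j] + 1,         # delete a from s
                curr[j - 1] + 1,     # insert b into s
                prev[j - 1] + cost,  # substitute a with b (or match)
            ))
        prev = curr
    return prev[-1]
```

Symbolic time series representations turn each series into a short string over a small alphabet, so a function of this shape can be applied directly to the symbolized series.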

arXiv.org e-Print Archive

Crossref

HAL Descartes

SVM Parameter Optimization using Grid Search and Genetic Algorithm to Improve Classification Performance

Author: Prugel-Bennett Adam
Syarif Iwan
Wills Gary
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/12/2016

Machine learning algorithms have been widely used to solve various kinds of data classification problems. Classification problems, especially for high-dimensional datasets, have attracted many researchers seeking efficient approaches to address them. However, the classification problem becomes very complicated and computationally expensive, especially when the number of possible combinations of variables is very high. Support Vector Machine (SVM) has been shown to perform much better when dealing with high-dimensional datasets and numerical features. Although SVM works well with default values, its performance can be improved significantly using parameter optimization. We applied two methods, Grid Search and Genetic Algorithm (GA), to optimize the SVM parameters. Our experiment showed that SVM parameter optimization using grid search always finds a near-optimal parameter combination within the given ranges. However, grid search was very slow; it was therefore reliable only for low-dimensional datasets with few parameters. SVM parameter optimization using GA can be used to solve this problem of grid search. GA proved to be more stable than grid search. Based on average running time on 9 datasets, GA was almost 16 times faster than grid search. Furthermore, the GA's results were slightly better than those of grid search in 8 of 9 datasets.
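A minimal sketch of exhaustive grid search may help show why its cost grows quickly with the number of parameters. Everything in it is an assumption for illustration: `toy_score` is a hypothetical stand-in for a cross-validated SVM accuracy, and the parameter names `c` and `gamma` and their ranges are not taken from the paper.

```python
import itertools
import math


def toy_score(c, gamma):
    """Hypothetical stand-in for cross-validated accuracy (not from the paper).

    Peaks at c = 10, gamma = 0.1 when measured in log10 space.
    """
    return -((math.log10(c) - 1.0) ** 2 + (math.log10(gamma) + 1.0) ** 2)


def grid_search(score_fn, grid):
    """Score every parameter combination; return (best_params, best_score, n_evals)."""
    names = list(grid)
    best_params, best_score, n_evals = None, -math.inf, 0
    # The Cartesian product is what makes the cost multiply with each added parameter.
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        n_evals += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_evals


# Illustrative ranges only; real ranges depend on the dataset and kernel.
grid = {"c": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
best_params, best_score, n_evals = grid_search(toy_score, grid)
```

With 4 candidate values per parameter the loop runs 4 x 4 = 16 evaluations, and each further parameter multiplies that count again. A genetic algorithm, by contrast, evaluates only a sampled population of candidates per generation.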

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Variable selection and updating in model-based discriminant analysis for high dimensional data with food authenticity applications

Author: Adrian E. Raftery
Brendan Murphy
Nema Dean
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

Food authenticity studies are concerned with determining whether food samples have been correctly labeled. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervised manner using both labeled and unlabeled data. The method is shown to give excellent classification performance on several high-dimensional multiclass food authenticity datasets with more variables than observations. The variables selected by the proposed method provide information about which variables are meaningful for classification purposes. A headlong search strategy for variable selection is shown to be computationally efficient and achieves excellent classification performance. In applications to several food authenticity datasets, our proposed method outperformed default implementations of Random Forests, AdaBoost, transductive SVMs and Bayesian Multinomial Regression by substantial margins.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Research Repository UCD

PubMed Central

Enlighten