
    Using a multi-objective genetic algorithm for SVM construction

    Support Vector Machines are kernel machines useful for classification and regression problems. In this paper, they are used for non-linear regression of environmental data. From a structural point of view, Support Vector Machines are particular Artificial Neural Networks, and their training paradigm has some positive implications: the original training approach helps overcome the curse of dimensionality and overly strict assumptions on the statistics of the errors in the data. Support Vector Machines and Radial Basis Function Regularised Networks are presented within a common structural framework for non-linear regression, in order to emphasise the training strategy for Support Vector Machines and to better explain the multi-objective approach to their construction. A Support Vector Machine's performance depends on the kernel parameter, input selection and the optimal dimension of the ε-tube. These serve as decision variables for an evolutionary strategy based on a Genetic Algorithm, whose objective functions are the number of support vectors (for the capacity of the machine) and the fitness to a validation subset (for the model's accuracy in mapping the underlying physical phenomenon). The strategy is tested on a case study in groundwater modelling, using time series of past measured rainfalls and levels to predict levels at variable time horizons.
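The evolutionary strategy described above can be sketched as a simple genetic-algorithm loop. This is a minimal illustration only: a synthetic surrogate fitness stands in for the abstract's two real objectives (support-vector count and validation fitness, which would each require a full SVM fit), the two objectives are scalarised into one weighted sum for brevity, and all names and constants are hypothetical.

```python
import random

random.seed(0)

# Decision variables from the abstract: kernel parameter (gamma),
# epsilon-tube width, and an input-selection mask over 4 candidate inputs.
def surrogate_fitness(gamma, eps, mask):
    # Hypothetical scalarisation of the two objectives: pretend the
    # "best" machine has gamma=0.5, eps=0.1 and uses exactly two inputs.
    complexity = abs(sum(mask) - 2)                    # stands in for #SV
    accuracy = (gamma - 0.5) ** 2 + (eps - 0.1) ** 2   # stands in for validation error
    return accuracy + 0.1 * complexity                 # lower is better

def random_individual():
    return (random.uniform(0.01, 2.0),                 # gamma
            random.uniform(0.01, 0.5),                 # epsilon
            [random.randint(0, 1) for _ in range(4)])  # input mask

def mutate(ind):
    gamma, eps, mask = ind
    mask = [b ^ (random.random() < 0.1) for b in mask]  # flip bits rarely
    return (gamma * random.uniform(0.8, 1.2),
            eps * random.uniform(0.8, 1.2),
            mask)

def evolve(generations=30, pop_size=20):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda i: surrogate_fitness(*i))
        survivors = pop[:pop_size // 2]                # truncation selection
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=lambda i: surrogate_fitness(*i))

best = evolve()
```

In the paper's actual multi-objective setting one would keep a Pareto front of (capacity, accuracy) trade-offs rather than collapsing them into one scalar as done here.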

    TENFOLD BOOTSTRAP PROCEDURE FOR SUPPORT VECTOR MACHINES

    Cross validation is often used to split input data into training and test sets for Support Vector Machines. The two most commonly used versions are tenfold and leave-one-out cross validation; another common resampling method is the random test/train split. These methods help avoid overfitting and perform model selection, but their computational cost for fitting Support Vector Machines grows with the size of the dataset. In this research, we propose an alternative for fitting SVMs, which we call the tenfold bootstrap for Support Vector Machines. This resampling procedure can significantly reduce execution time on datasets with a large number of observations while preserving the model's accuracy, offering a solution to the slow fitting of Support Vector Machines on big datasets.
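A bootstrap-based resampling loop of the kind the abstract names can be sketched as follows. This is an assumption-laden illustration, not the paper's exact procedure: ten bootstrap training sets are drawn with replacement, each fit is scored on its out-of-bag points, and a dependency-free nearest-centroid classifier stands in for the SVM.

```python
import random

random.seed(1)

# Toy two-class data: class 0 near (0,0), class 1 near (3,3).
data = ([((random.gauss(0, 1), random.gauss(0, 1)), 0) for _ in range(50)] +
        [((random.gauss(3, 1), random.gauss(3, 1)), 1) for _ in range(50)])

def fit_centroids(train):
    # Nearest-centroid classifier standing in for an SVM fit.
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for (x, y), label in train:
        sums[label][0] += x
        sums[label][1] += y
        sums[label][2] += 1
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}

def predict(centroids, point):
    return min(centroids, key=lambda c: (point[0] - centroids[c][0]) ** 2 +
                                        (point[1] - centroids[c][1]) ** 2)

def tenfold_bootstrap(data, rounds=10):
    accuracies = []
    for _ in range(rounds):
        train = [random.choice(data) for _ in data]        # bootstrap sample
        picked = set(id(r) for r in train)
        oob = [r for r in data if id(r) not in picked]     # out-of-bag points
        model = fit_centroids(train)
        hits = sum(predict(model, p) == lab for p, lab in oob)
        accuracies.append(hits / len(oob))
    return sum(accuracies) / len(accuracies)

acc = tenfold_bootstrap(data)
```

Each bootstrap sample leaves roughly a third of the observations out-of-bag, so every round gets a free held-out set without a separate cross-validation split.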

    Spatial prediction models for landslide hazards: review, comparison and evaluation

    The predictive power of logistic regression, support vector machines and bootstrap-aggregated classification trees (bagging, double-bagging) is compared using misclassification error rates on independent test data sets. Based on a resampling approach that takes spatial autocorrelation into account, error rates for predicting 'present' and 'future' landslides are estimated within and outside the training area. In a case study from the Ecuadorian Andes, logistic regression with stepwise backward variable selection yields the lowest error rates and demonstrates the best generalisation capabilities. The evaluation outside the training area reveals that tree-based methods tend to overfit the data.
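One common way to make a resampling scheme respect spatial autocorrelation (not necessarily the exact scheme used in this paper) is to assign whole spatial blocks to the test set, so that nearby, correlated observations never straddle the train/test boundary. A minimal sketch with hypothetical coordinates:

```python
import random

random.seed(2)

# Hypothetical observation points (x, y) in a 4x4 study area.
points = [(random.uniform(0, 4), random.uniform(0, 4)) for _ in range(200)]

def spatial_block_split(points, block_size=1.0, test_fraction=0.25):
    """Assign entire spatial blocks to the test set, so spatially
    autocorrelated neighbours stay on the same side of the split."""
    def block_of(p):
        return (int(p[0] // block_size), int(p[1] // block_size))
    blocks = sorted(set(block_of(p) for p in points))
    random.shuffle(blocks)
    n_test = max(1, int(len(blocks) * test_fraction))
    test_blocks = set(blocks[:n_test])
    train = [p for p in points if block_of(p) not in test_blocks]
    test = [p for p in points if block_of(p) in test_blocks]
    return train, test

train, test = spatial_block_split(points)
```

A purely random split would scatter near neighbours across both sets and optimistically bias the estimated error rates, which is exactly the effect spatial resampling is meant to avoid.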

    Bayesian Model Selection for Support Vector Machines, Gaussian Processes and Other Kernel Classifiers

    We present a variational Bayesian method for model selection over families of kernel classifiers such as Support Vector Machines or Gaussian processes. The algorithm needs no user interaction and is able to adapt a large number of kernel parameters to given data without having to sacrifice training cases for validation. This opens the possibility of using sophisticated families of kernels in situations where the small "standard kernel" classes are clearly inappropriate. We relate the method to other work on Gaussian processes and clarify the relation between Support Vector Machines and certain Gaussian process models.
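The paper's variational machinery handles many kernel parameters at once, but the underlying idea, selecting kernel parameters from the training data alone without sacrificing cases for validation, can be illustrated with the simplest Gaussian-process version: maximising the log marginal likelihood (evidence) over a grid of RBF lengthscales. All data and grid values below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 1-D regression data drawn from a smooth function plus noise.
X = np.linspace(0, 5, 30)
y = np.sin(X) + 0.1 * rng.standard_normal(30)

def rbf_kernel(X, lengthscale):
    d = X[:, None] - X[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def log_marginal_likelihood(X, y, lengthscale, noise=0.1):
    # Standard GP evidence: -1/2 y^T K^-1 y - 1/2 log|K| - n/2 log(2 pi)
    K = rbf_kernel(X, lengthscale) + noise ** 2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * len(X) * np.log(2 * np.pi))

# Select the lengthscale by evidence maximisation: no training cases
# are held out for validation, mirroring the abstract's selling point.
grid = [0.05, 0.2, 0.5, 1.0, 2.0, 5.0]
best_ls = max(grid, key=lambda ls: log_marginal_likelihood(X, y, ls))
```

The evidence automatically trades data fit against model complexity, which is why very short lengthscales (which interpolate the noise) and very long ones (which underfit) both score poorly.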

    Is it worth changing pattern recognition methods for structural health monitoring?

    The key element of this work is to demonstrate alternative strategies for using pattern recognition algorithms in structural health monitoring. This paper seeks to determine whether it makes any difference to choose among a range of established classification techniques, from decision trees and support vector machines to Gaussian processes. The classification algorithms are first tested on adjustable synthetic data to establish performance metrics, and then all techniques are applied to real SHM data. To aid the selection of training data, an informative chain of artificial intelligence tools is used to explore an active learning interaction between meaningful clusters of data.
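The active-learning interaction mentioned above can be sketched in its simplest form: repeatedly query the label of the most ambiguous unlabelled point, here the one with the smallest margin between the two nearest cluster centroids. The data, the 1-D features and the margin criterion are all hypothetical stand-ins for the paper's actual tools.

```python
import random

random.seed(4)

# Two hypothetical clusters of SHM feature values (1-D for brevity).
pool = ([(random.gauss(0, 1), 0) for _ in range(40)] +
        [(random.gauss(4, 1), 1) for _ in range(40)])
labelled = [pool[0], pool[40]]            # one seed example per cluster
unlabelled = pool[1:40] + pool[41:]

def centroids(examples):
    groups = {}
    for x, lab in examples:
        groups.setdefault(lab, []).append(x)
    return {lab: sum(xs) / len(xs) for lab, xs in groups.items()}

def margin(x, cents):
    # Small margin = point sits between centroids = most informative.
    d = sorted(abs(x - c) for c in cents.values())
    return d[1] - d[0]

# Active-learning loop: always query the most ambiguous point.
for _ in range(10):
    cents = centroids(labelled)
    query = min(unlabelled, key=lambda e: margin(e[0], cents))
    unlabelled.remove(query)
    labelled.append(query)                # the oracle reveals the label
```

Compared with labelling points at random, this concentrates the labelling budget on the boundary between clusters, which is where classifier decisions are actually contested.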

    Large scale musical instrument identification

    In this paper, automatic musical instrument identification using a variety of classifiers is addressed. Experiments are performed on a large set of recordings that stem from 20 instrument classes. Several features from general audio data classification applications as well as MPEG-7 descriptors are measured for 1000 recordings. Branch-and-bound feature selection is applied in order to select the most discriminating features for instrument classification. The first classifier is based on non-negative matrix factorization (NMF) techniques, where training is performed for each audio class individually. A novel NMF testing method is proposed, where each recording is projected onto several training matrices, which have been Gram-Schmidt orthogonalized. Several NMF variants are utilized besides the standard NMF method, such as the local NMF and the sparse NMF. In addition, 3-layered multilayer perceptrons, normalized Gaussian radial basis function networks, and support vector machines employing a polynomial kernel have also been tested as classifiers. The classification accuracy is high, ranging from 88.7% to 95.3%, outperforming the state-of-the-art techniques tested in the aforementioned experiment.
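The ingredients of the NMF testing method, per-class NMF training, Gram-Schmidt orthogonalisation of the learned basis, and projection of a test recording onto it, can be sketched as follows. This is a generic illustration under assumed toy dimensions, not a reproduction of the paper's exact scoring rule.

```python
import numpy as np

rng = np.random.default_rng(5)

def nmf(V, rank, iters=200):
    """Standard multiplicative-update NMF: V ~= W @ H, all nonnegative."""
    n, m = V.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

def gram_schmidt(W):
    """Orthonormalise the columns of W, as in the proposed testing step."""
    Q = np.zeros_like(W)
    for j in range(W.shape[1]):
        v = W[:, j] - Q[:, :j] @ (Q[:, :j].T @ W[:, j])
        Q[:, j] = v / np.linalg.norm(v)
    return Q

# Toy per-class training matrix: 8-D features for 20 recordings.
V = rng.random((8, 20))
W, H = nmf(V, rank=3)
Q = gram_schmidt(W)

# Project a test recording onto the orthogonalised class basis; the
# reconstruction error can then score membership in each class.
x = rng.random(8)
proj = Q @ (Q.T @ x)
error = np.linalg.norm(x - proj)
```

With one orthogonalised basis per instrument class, a recording would be assigned to the class whose basis reconstructs it with the smallest error.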

    Feature Selection and Classification Pairwise Combinations for High-dimensional Tumour Biomedical Datasets

    This paper concerns classification of high-dimensional yet small sample size biomedical data, and feature selection aimed at reducing the dimensionality of the microarray data. The research presents a comparison of pairwise combinations of six classification strategies, including decision trees, logistic model trees, Bayes network, Naïve Bayes, k-nearest neighbours and the sequential minimal optimization (SMO) algorithm for training support vector machines, as well as seven attribute selection methods: Correlation-based Feature Selection, chi-squared, information gain, gain ratio, symmetrical uncertainty, ReliefF and SVM-RFE (Support Vector Machine-Recursive Feature Elimination). In this paper, the SVM-RFE feature selection technique combined with the SMO classifier has demonstrated its potential to accurately and efficiently classify both binary and multiclass high-dimensional sets of tumour specimens.
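SVM-RFE ranks features by the magnitude of a linear SVM's weight vector and recursively removes the weakest. The loop below sketches this idea under one substitution, labelled plainly: an ordinary least-squares fit stands in for the linear SVM, and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic "microarray-like" data: 40 samples, 12 features, of which
# only features 0 and 1 actually carry the class signal.
n, p = 40, 12
X = rng.standard_normal((n, p))
y = np.sign(2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.standard_normal(n))

def rfe(X, y, keep=2):
    """Recursive feature elimination; least-squares weights stand in
    for the linear-SVM weights that SVM-RFE would use."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > keep:
        w, *_ = np.linalg.lstsq(X[:, remaining], y, rcond=None)
        weakest = int(np.argmin(np.abs(w)))   # feature with smallest |weight|
        del remaining[weakest]                # eliminate it and refit
    return remaining

selected = rfe(X, y)
```

Refitting after every elimination is what makes the procedure recursive: a feature that looks weak only in the presence of a correlated rival can regain weight once that rival is gone.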