
    Benchmarking least squares support vector machine classifiers.

    In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LS-SVMs), a least squares cost function is proposed so as to obtain a linear set of equations in the dual space. While the SVM classifier has a large margin interpretation, the LS-SVM formulation is related in this paper to a ridge regression approach for classification with binary targets and to Fisher's linear discriminant analysis in the feature space. Multiclass categorization problems are represented by a set of binary classifiers using different output coding schemes. While regularization is used to control the effective number of parameters of the LS-SVM classifier, the sparseness property of SVMs is lost due to the choice of the 2-norm. Sparseness can be imposed in a second stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark datasets are used to evaluate the test set performance of LS-SVM classifiers with linear, polynomial and radial basis function (RBF) kernels. Both the SVM and LS-SVM classifier with RBF kernel in combination with standard cross-validation procedures for hyperparameter selection achieve comparable test set performances. These SVM and LS-SVM performances are consistently very good when compared to a variety of methods described in the literature including decision tree based algorithms, statistical algorithms and instance based learning methods. We show on ten UCI datasets that the LS-SVM sparse approximation procedure can be successfully applied.
    Keywords: least squares support vector machines; multiclass support vector machines; sparse approximation; discriminant analysis; learning algorithms; classification; framework; kernels; time; SISTA
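
    The linear set of equations mentioned above is small enough to sketch directly. Below is a minimal illustration of an LS-SVM classifier with an RBF kernel, assuming binary targets in {-1, +1}; the function names and the hyperparameters gamma (regularization) and sigma (kernel width) are placeholders chosen for this example, not values or code from the paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """RBF kernel matrix: K[i, j] = exp(-||a_i - b_j||^2 / (2 * sigma^2))."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def lssvm_train(X, y, gamma=1.0, sigma=1.0):
    """Solve the LS-SVM dual linear system for (alpha, b); y must be in {-1, +1}.
    Block system: [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    with Omega[i, j] = y_i * y_j * K(x_i, x_j)."""
    n = X.shape[0]
    Omega = (y[:, None] * y[None, :]) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]          # alpha (support values), bias b

def lssvm_predict(X_train, y_train, alpha, b, X_test, sigma=1.0):
    """Classify test points: sign(sum_i alpha_i * y_i * K(x, x_i) + b)."""
    K = rbf_kernel(X_test, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)
```

    Sparseness could then be recovered by pruning the training points with the smallest |alpha_i| and re-solving on the remainder, mirroring the second-stage procedure described in the abstract.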

    Least squares support vector machine with self-organizing multiple kernel learning and sparsity

    In recent years, least squares support vector machines (LSSVMs) with various kernel functions have been widely used in the field of machine learning. However, the selection of kernel functions is often ignored in practice. In this paper, an improved LSSVM method based on self-organizing multiple kernel learning is proposed for black-box problems. To strengthen the generalization ability of the LSSVM, appropriate kernel functions are selected and the corresponding model parameters are optimized using a differential evolution algorithm based on an improved mutation strategy. Due to the large computation cost, a sparse selection strategy is developed to extract useful data and remove redundant data without loss of accuracy. To demonstrate the effectiveness of the proposed method, benchmark problems from the UCI machine learning repository are tested. The results show that the proposed method performs better than other state-of-the-art methods. In addition, to verify the practicability of the proposed method, it is applied to a real-world converter steelmaking process. The results illustrate that the proposed model can precisely predict the molten steel quality and satisfy the actual production demand.
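
    As an illustration of the idea (not the authors' implementation), the sketch below combines an RBF and a polynomial kernel with learned weights and tunes the weights, kernel parameters and regularization constant by differential evolution, using SciPy's stock strategy as a stand-in for the paper's improved mutation strategy; the kernel choice, search bounds and the simple train/validation objective are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import differential_evolution

def rbf(A, B, sigma):
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def poly(A, B, degree):
    return (A @ B.T + 1.0) ** degree

def lssvm_solve(K, y, gamma):
    """LS-SVM dual system for a precomputed (possibly combined) kernel matrix."""
    n = len(y)
    Omega = (y[:, None] * y[None, :]) * K
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.r_[0.0, np.ones(n)])
    return sol[1:], sol[0]

def objective(theta, X_tr, y_tr, X_va, y_va):
    """Validation error of one candidate: kernel weights, kernel parameters, gamma."""
    w_rbf, w_poly, sigma, degree, gamma = theta
    d = int(round(degree))
    K_tr = w_rbf * rbf(X_tr, X_tr, sigma) + w_poly * poly(X_tr, X_tr, d)
    alpha, b = lssvm_solve(K_tr, y_tr, gamma)
    K_va = w_rbf * rbf(X_va, X_tr, sigma) + w_poly * poly(X_va, X_tr, d)
    pred = np.sign(K_va @ (alpha * y_tr) + b)
    return np.mean(pred != y_va)

# Toy data and search bounds (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 2))
y = np.sign(X[:, 0] + X[:, 1])
X_tr, y_tr, X_va, y_va = X[:60], y[:60], X[60:], y[60:]
bounds = [(0.0, 1.0), (0.0, 1.0), (0.1, 10.0), (1.0, 4.0), (0.01, 100.0)]
result = differential_evolution(objective, bounds, args=(X_tr, y_tr, X_va, y_va),
                                seed=0, maxiter=20)
```

    A sparse selection step, as described in the abstract, would subsample the training rows before building the kernel matrices; it is omitted here for brevity.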

    Sparse LS-SVMs with L0-norm minimization

    This is an electronic version of the paper presented at the 19th European Symposium on Artificial Neural Networks, held in Bruges in 2011. Least-Squares Support Vector Machines (LS-SVMs) have been successfully applied in many classification and regression tasks. Their main drawback is the lack of sparseness of the final models, so a procedure to sparsify LS-SVMs is a frequent desideratum. In this paper, we adapt to the LS-SVM case a recent method for sparsifying classical SVM classifiers, which is based on an iterative approximation to the L0-norm. Experiments on real-world classification and regression datasets illustrate that this adaptation achieves very sparse models without significant loss of accuracy compared to standard LS-SVMs or SVMs.
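
    A common way to approximate the L0-norm is iterative reweighting: solve the LS-SVM system repeatedly, each time penalizing every support value alpha_i with a weight proportional to 1/(alpha_i^2 + eps), so that small support values are driven towards zero. The sketch below follows that generic scheme and is only an assumption about the flavor of the method; the paper's exact update rule, stopping criterion and hyperparameters may differ.

```python
import numpy as np

def sparse_lssvm_l0(K, y, gamma=1.0, eps=1e-4, n_iter=20):
    """Iteratively reweighted LS-SVM approximating an L0 penalty on the support
    values: each pass solves the usual LS-SVM block system, but the uniform
    ridge I/gamma is replaced by diag(lambda)/gamma with
    lambda_i = 1/(alpha_i^2 + eps), so samples with small alpha_i are penalized
    ever harder and effectively pruned. (Generic sketch, not the paper's exact rule.)"""
    n = len(y)
    Omega = (y[:, None] * y[None, :]) * K
    lam = np.ones(n)
    alpha, b = np.zeros(n), 0.0
    for _ in range(n_iter):
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = y
        A[1:, 0] = y
        A[1:, 1:] = Omega + np.diag(lam) / gamma
        sol = np.linalg.solve(A, np.r_[0.0, np.ones(n)])
        alpha, b = sol[1:], sol[0]
        lam = 1.0 / (alpha**2 + eps)        # reweighting step
    support = np.abs(alpha) > 1e-6 * np.max(np.abs(alpha))
    return alpha, b, support

# Toy usage with a linear kernel on 2-D data.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0])
alpha, b, support = sparse_lssvm_l0(X @ X.T, y, gamma=10.0)
print(support.sum(), "of", len(y), "support values retained")
```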

    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise from mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences, and it can identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we present a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
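
    As a concrete instance of the classical statistical techniques such surveys cover (chosen here purely for illustration, not taken from the paper), the snippet below flags univariate outliers with a modified z-score built from the median and the median absolute deviation; the 3.5 threshold is a conventional choice.

```python
import numpy as np

def modified_zscore_outliers(x, threshold=3.5):
    """Flag points whose modified z-score exceeds a threshold.
    The score uses the median and the median absolute deviation (MAD), so a few
    extreme values cannot mask themselves; 0.6745 rescales the MAD to the
    standard deviation of a normal distribution."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    z = 0.6745 * (x - med) / (mad + 1e-12)
    return np.abs(z) > threshold

# Example: the last value is flagged, the rest are not.
print(modified_zscore_outliers([10.1, 9.8, 10.0, 10.2, 9.9, 25.0]))
```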

    Improving Sparse Representation-Based Classification Using Local Principal Component Analysis

    Sparse representation-based classification (SRC), proposed by Wright et al., seeks the sparsest decomposition of a test sample over the dictionary of training samples, with classification to the most-contributing class. Because it assumes test samples can be written as linear combinations of their same-class training samples, the success of SRC depends on the size and representativeness of the training set. Our proposed classification algorithm enlarges the training set by using local principal component analysis to approximate the basis vectors of the tangent hyperplane of the class manifold at each training sample. The dictionary in SRC is replaced by a local dictionary that adapts to the test sample and includes training samples and their corresponding tangent basis vectors. We use a synthetic data set and three face databases to demonstrate that this method can achieve higher classification accuracy than SRC in cases of sparse sampling, nonlinear class manifolds, and stringent dimension reduction.
    Comment: Published in "Computational Intelligence for Pattern Recognition," editors Shyi-Ming Chen and Witold Pedrycz. The original publication is available at http://www.springerlink.co
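
    For reference, a minimal SRC sketch under stated assumptions: the test sample is coded over the dictionary of training samples with an l1 solver (scikit-learn's Lasso is used here as a generic stand-in for the sparse coder), and the class whose atoms alone give the smallest reconstruction residual wins. The local-PCA variant proposed in the paper would append tangent-basis vectors as extra dictionary columns per class; that step is only noted in a comment.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(D, labels, x, alpha=0.01):
    """Sparse representation-based classification: code test sample x over the
    dictionary D (one training sample per column), then pick the class whose
    atoms alone give the smallest reconstruction residual. The paper's local-PCA
    variant would append tangent-basis vectors as extra columns for each class
    before this step (not implemented here)."""
    D = D / (np.linalg.norm(D, axis=0, keepdims=True) + 1e-12)   # unit-norm atoms
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D, x)                      # l1-regularized coding of x over D
    c = coder.coef_
    labels = np.asarray(labels)
    best_cls, best_res = None, np.inf
    for cls in np.unique(labels):
        c_cls = np.where(labels == cls, c, 0.0)   # keep only this class's coefficients
        res = np.linalg.norm(x - D @ c_cls)
        if res < best_res:
            best_cls, best_res = cls, res
    return best_cls

# Toy usage: two well-separated Gaussian classes in 5 dimensions.
rng = np.random.default_rng(1)
D = np.hstack([rng.normal(0, 1, (5, 10)), rng.normal(3, 1, (5, 10))])
labels = np.array([0] * 10 + [1] * 10)
print(src_classify(D, labels, rng.normal(3, 1, 5)))   # test point drawn from class 1
```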