
    Extending twin support vector machine classifier for multi-category classification problems

    © 2013 – IOS Press and the authors. All rights reserved. The twin support vector machine classifier (TWSVM) was proposed by Jayadeva et al. for binary classification problems. TWSVM not only overcomes the difficulty of handling unbalanced exemplars in binary classification, but also trains a classifier four times faster than classical support vector machines. This paper proposes one-versus-all twin support vector machine classifiers (OVA-TWSVM) for multi-category classification problems, building on the strengths of TWSVM. OVA-TWSVM extends TWSVM to k-category classification by constructing k TWSVMs with the well-known one-versus-all (OVA) approach: the ith TWSVM solves only the quadratic programming problems (QPPs) for the ith class and yields the ith nonparallel hyperplane, corresponding to the ith class data. We analyze the efficiency of OVA-TWSVM theoretically and test it experimentally on both synthetic data sets and several benchmark data sets from the UCI machine learning repository. Both the theoretical analysis and the experimental results demonstrate that OVA-TWSVM can outperform the traditional OVA-SVM classifier. Further experimental comparisons with other multiclass classifiers show comparable performance. This work is supported in part by the Fundamental Research Funds for the Central Universities (grant GK201102007) in PR China, by the Natural Science Basis Research Plan in Shaanxi Province of China (Program No. 2010JM3004), by the Chinese Academy of Sciences under the Innovative Group Overseas Partnership Grant, and by the Natural Science Foundation of China Major International Joint Research Project (No. 71110107026).
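
    The abstract above does not spell out how the k nonparallel hyperplanes are combined at prediction time. A minimal sketch, assuming the nearest-hyperplane rule commonly used with twin SVMs (assign a point to the class whose hyperplane is closest), might look as follows. The function name and the hand-picked hyperplanes are hypothetical; no QPP is solved here.

```python
import math


def nearest_hyperplane_class(x, hyperplanes):
    """Return the index of the class whose hyperplane lies closest to x.

    hyperplanes: list of (w, b) pairs, one per class, describing w.x + b = 0.
    Assumption: the nearest-hyperplane rule; the abstract does not confirm it.
    """
    best_class, best_dist = None, math.inf
    for label, (w, b) in enumerate(hyperplanes):
        norm = math.sqrt(sum(wi * wi for wi in w))
        # Perpendicular distance from x to the hyperplane w.x + b = 0
        dist = abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        if dist < best_dist:
            best_class, best_dist = label, dist
    return best_class


# Hypothetical hyperplanes for k = 3 classes in 2-D (illustrative values only)
planes = [
    ([1.0, 0.0], 0.0),    # class 0 clusters near the line x1 = 0
    ([0.0, 1.0], 0.0),    # class 1 clusters near the line x2 = 0
    ([1.0, 1.0], -4.0),   # class 2 clusters near the line x1 + x2 = 4
]

print(nearest_hyperplane_class([0.1, 3.0], planes))
```

    In a full OVA-TWSVM, each (w, b) pair would come from solving the QPPs for the corresponding class against all remaining classes.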

    Classification of patients with broncho-pulmonary diseases based on analysis of absorption spectra of exhaled air samples with SVM and neural network algorithm application

    This work presents results on classifying patients with broncho-pulmonary diseases based on analysis of exhaled air samples. The results were obtained by applying laser photoacoustic spectroscopy together with intelligent data analysis methods (principal component analysis, support vector machines, neural networks). Absorption spectra of the exhaled air of recruited volunteers were recorded, and the absorption spectra of healthy and sick people were prepared for the classification procedure. Error matrices were obtained for the neural networks, and sensitivity/specificity values for classification with the SVM method. This work was partially supported by the Federal Target Program for Research and Development, Contract No. 14.578.21.0082 (unique identifier of applied scientific research and experimental development RFMEFI57814X0082).

    PhysicsGP: A Genetic Programming Approach to Event Selection

    We present a novel multivariate classification technique based on Genetic Programming. The technique is distinct from Genetic Algorithms and offers several advantages compared to Neural Networks and Support Vector Machines. The technique optimizes a set of human-readable classifiers with respect to some user-defined performance measure. We calculate the Vapnik-Chervonenkis dimension of this class of learning machines and consider a practical example: the search for the Standard Model Higgs Boson at the LHC. The resulting classifier is very fast to evaluate, human-readable, and easily portable. The software may be downloaded at: http://cern.ch/~cranmer/PhysicsGP.html Comment: 16 pages, 9 figures, 1 table. Submitted to Comput. Phys. Commu

    Structured variable selection in support vector machines

    When applying the support vector machine (SVM) to high-dimensional classification problems, we often impose a sparse structure on the SVM to eliminate the influence of irrelevant predictors. The lasso and other variable selection techniques have been successfully used in the SVM to perform automatic variable selection. In some problems, there is a natural hierarchical structure among the variables. Thus, in order to have an interpretable SVM classifier, it is important to respect the heredity principle when enforcing sparsity in the SVM. Many variable selection methods, however, do not respect the heredity principle. In this paper we enforce both sparsity and the heredity principle in the SVM by using the so-called structured variable selection (SVS) framework originally proposed in Yuan, Joseph and Zou (2007). We minimize the empirical hinge loss under a set of linear inequality constraints and a lasso-type penalty. The solution always obeys the desired heredity principle and enjoys sparsity. The new SVM classifier can be fitted efficiently, because the optimization problem is a linear program. Another contribution of this work is a nonparametric extension of the SVS framework, for which we propose nonparametric heredity SVMs. Simulated and real data are used to illustrate the merits of the proposed method. Comment: Published at http://dx.doi.org/10.1214/07-EJS125 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org
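
    To make the two ingredients named above concrete, the following sketch evaluates a hinge loss with a lasso-type penalty and checks a coefficient vector against a heredity rule. It is an illustration under stated assumptions, not the paper's linear program: the function names, the `parents` mapping, and the choice of strong heredity (an interaction may be nonzero only if all of its main effects are nonzero) are hypothetical and are not taken from the abstract.

```python
def hinge_lasso_objective(X, y, beta, intercept, lam):
    """Mean hinge loss plus lam times the L1 norm of beta; labels y are -1 or +1."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (intercept + sum(b * v for b, v in zip(beta, xi)))
        total += max(0.0, 1.0 - margin)  # hinge loss of one sample
    return total / len(X) + lam * sum(abs(b) for b in beta)


def obeys_strong_heredity(beta, parents):
    """parents maps an interaction's column index to its main-effect indices.

    Assumption: strong heredity, i.e. an interaction coefficient may be
    nonzero only when every one of its parent coefficients is nonzero.
    """
    return all(
        beta[child] == 0.0 or all(beta[p] != 0.0 for p in ps)
        for child, ps in parents.items()
    )


# Toy design: columns are x1, x2 and the interaction x1*x2 (illustrative data)
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
y = [1, -1]
parents = {2: (0, 1)}

print(hinge_lasso_objective(X, y, [1.0, -1.0, 0.0], 0.0, 0.1))
print(obeys_strong_heredity([0.0, 1.0, 0.5], parents))
```

    Because both the hinge loss and the absolute values are piecewise linear, an objective of this shape can be rewritten with slack variables as a linear program, which is consistent with the abstract's remark about efficient fitting.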

    Robustness Verification of k-Nearest Neighbor Classifiers by Abstract Interpretation

    Abstract interpretation is an established mathematical framework introduced by Cousot and Cousot in 1977 and ubiquitously used in static program analysis. In recent years, many noteworthy works have shown how abstract interpretation can be successfully applied to formally verify robustness properties of some major machine learning techniques like (deep) neural networks, decision trees and support vector machines. This research work aims to pursue this line of research by proposing a novel abstract interpretation-based framework for designing a sound abstract version of the k-Nearest Neighbors (kNN) algorithm, a well-known non-parametric supervised learning method widely used for classification and regression tasks, which is then instantiated to the standard interval domain approximating the range of numerical features, to verify its robustness and stability properties. This verification approach has been fully implemented and evaluated on several datasets, including standard benchmark datasets for individual fairness verification, and then compared with some related works finding adversarial examples on kNNs. The experimental results turned out to be very promising and showed high percentages of provable robustness and stability in most of the reference datasets, thus making a step forward in the current state-of-the-art of formal verification of machine learning models.
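
    The interval-domain idea described above can be illustrated on the simplest case, k = 1. The sketch below is not the framework from the thesis: it assumes an L-infinity perturbation of radius eps around the input, squared Euclidean distance, and hypothetical function names. It bounds the distance from each training point to every point of the perturbation box and reports a label only when one training point is provably closer than every differently labelled point.

```python
def dist2_bounds(lo, hi, t):
    """Lower and upper bound on the squared distance from t to any point in the box [lo, hi]."""
    d_min = d_max = 0.0
    for l, h, v in zip(lo, hi, t):
        # Closest coordinate gap is zero when v lies inside the interval
        near = 0.0 if l <= v <= h else min(abs(l - v), abs(h - v))
        far = max(abs(l - v), abs(h - v))
        d_min += near * near
        d_max += far * far
    return d_min, d_max


def certify_1nn(x, eps, train):
    """Return the label proven stable on the L-inf ball of radius eps, else None.

    Sound but incomplete: None means "could not prove", not "not robust".
    """
    lo = [v - eps for v in x]
    hi = [v + eps for v in x]
    bounds = [(dist2_bounds(lo, hi, t), label) for t, label in train]
    for (_, d_max), label in bounds:
        rivals = [d_min for (d_min, _), other in bounds if other != label]
        # This point beats every rival even in the worst case over the box
        if all(d_max < d for d in rivals):
            return label
    return None


# Illustrative two-point training set
train = [([0.0, 0.0], "a"), ([1.0, 1.0], "b")]
print(certify_1nn([0.1, 0.1], 0.1, train))
print(certify_1nn([0.1, 0.1], 0.5, train))
```

    With the small radius, the check proves label "a"; with the larger radius the distance intervals overlap, so the check returns None and the result stays inconclusive.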

    C Language Extensions for Hybrid CPU/GPU Programming with StarPU

    Modern platforms used for high-performance computing (HPC) include machines with both general-purpose CPUs and "accelerators", often in the form of graphical processing units (GPUs). StarPU is a C library to exploit such platforms. It provides users with ways to define "tasks" to be executed on CPUs or GPUs, along with the dependencies among them, and automatically schedules them over all the available processing units. In doing so, it also relieves programmers from the need to know the underlying architecture details: it adapts to the available CPUs and GPUs, and automatically transfers data between main memory and GPUs as needed. While StarPU's approach is successful at addressing run-time scheduling issues, being a C library makes for a poor and error-prone programming interface. This paper presents an effort started in 2011 to promote some of the concepts exported by the library as C language constructs, by means of an extension of the GCC compiler suite. Our main contribution is the design and implementation of language extensions that map to StarPU's task programming paradigm. We argue that the proposed extensions make it easier to get started with StarPU, eliminate errors that can occur when using the C library, and help diagnose possible mistakes. We conclude with future work.

    Machine learning and Datamining

    The thesis is a contribution to the study of the MOOC format. It describes existing MOOC portals and the author's personal experience with the Machine Learning course on Coursera. Based on this course, the thesis analyzes several supervised machine learning methods: linear regression, logistic regression, neural networks and support vector machines (SVM). The result of the thesis is an educational program for experimenting with a linear SVM with one target variable. The program helps users understand the consequences of choosing each parameter. It is implemented as a web application in the Python programming language using the Django framework. It is available to participants of the datamining course in the e-learning portal ALS at TUL. The thesis also describes how the topic of linear regression was prepared in the MOOC format.

    An Exponential Lower Bound on the Complexity of Regularization Paths

    For a variety of regularized optimization problems in machine learning, algorithms computing the entire solution path have been developed recently. Most of these methods are quadratic programs that are parameterized by a single parameter, as for example the Support Vector Machine (SVM). Solution path algorithms do not only compute the solution for one particular value of the regularization parameter but the entire path of solutions, making the selection of an optimal parameter much easier. It has been assumed that these piecewise linear solution paths have only linear complexity, i.e. linearly many bends. We prove that for the support vector machine this complexity can be exponential in the number of training points in the worst case. More strongly, we construct a single instance of n input points in d dimensions for an SVM such that at least \Theta(2^{n/2}) = \Theta(2^d) many distinct subsets of support vectors occur as the regularization parameter changes. Comment: Journal version, 28 Pages, 5 Figure