139,739 research outputs found
Extending twin support vector machine classifier for multi-category classification problems
© 2013 – IOS Press and the authors. All rights reserved.
The twin support vector machine classifier (TWSVM) was proposed by Jayadeva et al. for binary classification problems. TWSVM not only overcomes the difficulty of handling exemplar imbalance in binary classification problems, but is also four times faster to train than a classical support vector machine. This paper proposes one-versus-all twin support vector machine classifiers (OVA-TWSVM) for multi-category classification problems, building on the strengths of TWSVM. OVA-TWSVM extends TWSVM to k-category classification problems by developing k TWSVMs: in the ith TWSVM, only the Quadratic Programming Problems (QPPs) for the ith class are solved, which yields the ith nonparallel hyperplane corresponding to the ith class data. OVA-TWSVM uses the well-known one-versus-all (OVA) approach to construct the corresponding twin support vector machine classifier. We analyze the efficiency of OVA-TWSVM theoretically, and perform experiments to test its efficiency on both synthetic data sets and several benchmark data sets from the UCI machine learning repository. Both the theoretical analysis and the experimental results demonstrate that OVA-TWSVM can outperform the traditional OVA-SVMs classifier. Further experimental comparisons with other multiclass classifiers demonstrated that comparable performance could be achieved.
This work is supported in part by the Fundamental Research Funds for the Central Universities (grant GK201102007) in PR China, by the Natural Science Basis Research Plan in Shaanxi Province of China (Program No. 2010JM3004), by the Chinese Academy of Sciences under the Innovative Group Overseas Partnership Grant, and by the Natural Science Foundation of China Major International Joint Research Project (No. 71110107026).
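The one-versus-all decision step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the k hyperplanes (w_i, b_i) have already been obtained by solving the per-class QPPs, and it assumes the usual TWSVM-style rule of assigning a point to the class whose hyperplane is nearest. The names `distance_to_hyperplane` and `predict_ova` and the example hyperplanes are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# A hyperplane w.x + b = 0, stored as (w, b). One per class (assumed given).
Hyperplane = Tuple[Sequence[float], float]


def distance_to_hyperplane(x: Sequence[float], plane: Hyperplane) -> float:
    """Perpendicular distance |w.x + b| / ||w|| from point x to the hyperplane."""
    w, b = plane
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def predict_ova(x: Sequence[float], planes: List[Hyperplane]) -> int:
    """Return the index of the class whose hyperplane lies nearest to x."""
    distances = [distance_to_hyperplane(x, plane) for plane in planes]
    return min(range(len(planes)), key=distances.__getitem__)


# Hypothetical hyperplanes for a 3-class problem in two dimensions.
example_planes: List[Hyperplane] = [
    ((1.0, 0.0), 0.0),    # class 0: the line x1 = 0
    ((1.0, 0.0), -5.0),   # class 1: the line x1 = 5
    ((0.0, 2.0), -10.0),  # class 2: the line x2 = 5
]
```

Calling `predict_ova((0.5, 1.0), example_planes)` selects class 0, because that point lies closest to the line x1 = 0. Normalizing by ||w|| keeps the comparison fair when the per-class hyperplanes have different scales.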
Classification of patients with broncho-pulmonary diseases based on analysis of absorption spectra of exhaled air samples with SVM and neural network algorithm application
This work presents results on classifying patients with broncho-pulmonary diseases based on an analysis of exhaled air samples. The results were obtained by applying laser photoacoustic spectroscopy together with intelligent data analysis methods (principal component analysis, support vector machines, neural networks). Absorption spectra of the exhaled air of volunteers were recorded, and the spectra of healthy and sick people were prepared for the classification procedure. Error matrices for the neural networks and sensitivity/specificity values for classification with the SVM method were also obtained. This work was partially supported by the Federal Target Program for Research and Development, Contract No. 14.578.21.0082 (unique identifier of applied scientific research and experimental development RFMEFI57814X0082).
PhysicsGP: A Genetic Programming Approach to Event Selection
We present a novel multivariate classification technique based on Genetic
Programming. The technique is distinct from Genetic Algorithms and offers
several advantages compared to Neural Networks and Support Vector Machines. The
technique optimizes a set of human-readable classifiers with respect to some
user-defined performance measure. We calculate the Vapnik-Chervonenkis
dimension of this class of learning machines and consider a practical example:
the search for the Standard Model Higgs Boson at the LHC. The resulting
classifier is very fast to evaluate, human-readable, and easily portable. The
software may be downloaded at: http://cern.ch/~cranmer/PhysicsGP.html
Comment: 16 pages, 9 figures, 1 table. Submitted to Comput. Phys. Commu
Structured variable selection in support vector machines
When applying the support vector machine (SVM) to high-dimensional
classification problems, we often impose a sparse structure in the SVM to
eliminate the influences of the irrelevant predictors. The lasso and other
variable selection techniques have been successfully used in the SVM to perform
automatic variable selection. In some problems, there is a natural hierarchical
structure among the variables. Thus, in order to have an interpretable SVM
classifier, it is important to respect the heredity principle when enforcing
the sparsity in the SVM. Many variable selection methods, however, do not
respect the heredity principle. In this paper we enforce both sparsity and the
heredity principle in the SVM by using the so-called structured variable
selection (SVS) framework originally proposed in Yuan, Joseph and Zou (2007).
We minimize the empirical hinge loss under a set of linear inequality
constraints and a lasso-type penalty. The solution always obeys the desired
heredity principle and enjoys sparsity. The new SVM classifier can be
efficiently fitted, because the optimization problem is a linear program.
Another contribution of this work is to present a nonparametric extension of
the SVS framework, and we propose nonparametric heredity SVMs. Simulated and
real data are used to illustrate the merits of the proposed method.
Comment: Published at http://dx.doi.org/10.1214/07-EJS125 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Robustness Verification of k-Nearest Neighbor Classifiers by Abstract Interpretation
Abstract interpretation is an established mathematical framework introduced by Cousot and Cousot in 1977 and ubiquitously used in static program analysis. In recent years, many noteworthy works have shown how abstract interpretation can be successfully applied to formally verify robustness properties of some major machine learning techniques like (deep) neural networks, decision trees and support vector machines.
This research work aims to pursue this line of research by proposing a novel abstract interpretation-based framework for designing a sound abstract version of the k-Nearest Neighbors (kNN) algorithm, a well-known non-parametric supervised learning method widely used for classification and regression tasks, which is then instantiated to the standard interval domain approximating the range of numerical features, to verify its robustness and stability properties. This verification approach has been fully implemented and evaluated on several datasets, including standard benchmark datasets for individual fairness verification, and then compared with some related works finding adversarial examples on kNNs. The experimental results turned out to be very promising and showed high percentages of provable robustness and stability in most of the reference datasets, thus making a step forward in the current state-of-the-art of formal verification of machine learning models.
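To make the idea of an interval-based robustness check for kNN concrete, here is a simplified sketch. It is not the framework from the abstract: it assumes an L-infinity perturbation of radius `eps` around the input, squared Euclidean distance, and a deliberately coarse criterion (all training points that could possibly be among the k nearest share one label). The check is sound but incomplete, so a `False` result means "not proven", not "not robust". All function names are hypothetical.

```python
from typing import Hashable, List, Sequence, Set, Tuple

Sample = Tuple[Sequence[float], Hashable]  # (feature vector, label)


def distance_bounds(x: Sequence[float], eps: float,
                    p: Sequence[float]) -> Tuple[float, float]:
    """Bounds on the squared distance from p to any point of the box [x-eps, x+eps]."""
    lo = hi = 0.0
    for xi, pi in zip(x, p):
        gap = abs(xi - pi)
        near = max(gap - eps, 0.0)  # smallest possible |x'_i - p_i| inside the box
        far = gap + eps             # largest possible |x'_i - p_i| inside the box
        lo += near * near
        hi += far * far
    return lo, hi


def possible_labels(train: List[Sample], x: Sequence[float],
                    eps: float, k: int) -> Set[Hashable]:
    """Labels of training points that may be among the k nearest for some x' in the box."""
    bounds = [distance_bounds(x, eps, p) for p, _ in train]
    labels: Set[Hashable] = set()
    for i, (lo_i, _) in enumerate(bounds):
        # Count points that are strictly closer than point i for every x' in the box.
        surely_closer = sum(1 for j, (_, hi_j) in enumerate(bounds)
                            if j != i and hi_j < lo_i)
        if surely_closer < k:  # point i cannot be excluded from the k nearest
            labels.add(train[i][1])
    return labels


def is_provably_robust(train: List[Sample], x: Sequence[float],
                       eps: float, k: int) -> bool:
    """True only if every possible k-nearest neighbour carries the same label."""
    return len(possible_labels(train, x, eps, k)) == 1


# Hypothetical toy data: two well-separated clusters.
toy_train: List[Sample] = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"),
                           ((5.0, 5.0), "b"), ((5.0, 6.0), "b")]
```

With this toy data, a small perturbation around `(0.2, 0.2)` can be proven robust for k = 1, while a large one cannot, because points from both clusters may then become the nearest neighbour.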
C Language Extensions for Hybrid CPU/GPU Programming with StarPU
Modern platforms used for high-performance computing (HPC) include machines
with both general-purpose CPUs, and "accelerators", often in the form of
graphical processing units (GPUs). StarPU is a C library to exploit such
platforms. It provides users with ways to define "tasks" to be executed on CPUs
or GPUs, along with the dependencies among them, and automatically
schedules them over all the available processing units. In doing so, it also
relieves programmers from the need to know the underlying architecture details:
it adapts to the available CPUs and GPUs, and automatically transfers data
between main memory and GPUs as needed. While StarPU's approach is successful
at addressing run-time scheduling issues, being a C library makes for a poor
and error-prone programming interface. This paper presents an effort started in
2011 to promote some of the concepts exported by the library as C language
constructs, by means of an extension of the GCC compiler suite. Our main
contribution is the design and implementation of language extensions that map
to StarPU's task programming paradigm. We argue that the proposed extensions
make it easier to get started with StarPU, eliminate errors that can occur when
using the C library, and help diagnose possible mistakes. We conclude with
future work.
Machine learning and Datamining
The thesis is a contribution to the study of the MOOC format. It describes existing MOOC portals and the author's personal experience with the Machine Learning course on Coursera. Based on this course, it analyzes supervised machine learning methods, namely linear regression, logistic regression, neural networks and support vector machines (SVM). The result of the thesis is an educational program for experimenting with a linear SVM with one target variable. The program helps users understand the consequences of choosing individual parameters. It is implemented as a web application in Python using the Django framework, and it is available to participants of the data mining course in the ALS e-learning portal at TUL. The thesis also describes how the topic of linear regression was prepared in the MOOC format.
An Exponential Lower Bound on the Complexity of Regularization Paths
For a variety of regularized optimization problems in machine learning,
algorithms computing the entire solution path have been developed recently.
Most of these methods are quadratic programs that are parameterized by a single
parameter, as for example the Support Vector Machine (SVM). Solution path
algorithms do not only compute the solution for one particular value of the
regularization parameter but the entire path of solutions, making the selection
of an optimal parameter much easier.
It has been assumed that these piecewise linear solution paths have only
linear complexity, i.e. linearly many bends. We prove that for the support
vector machine this complexity can be exponential in the number of training
points in the worst case. More strongly, we construct a single instance of n
input points in d dimensions for an SVM such that at least \Theta(2^{n/2}) =
\Theta(2^d) many distinct subsets of support vectors occur as the
regularization parameter changes.
Comment: Journal version, 28 pages, 5 figures