2,851 research outputs found
A Divide-and-Conquer Solver for Kernel Support Vector Machines
The kernel support vector machine (SVM) is one of the most widely used
classification methods; however, the amount of computation required becomes the
bottleneck when facing millions of samples. In this paper, we propose and
analyze a novel divide-and-conquer solver for kernel SVMs (DC-SVM). In the
division step, we partition the kernel SVM problem into smaller subproblems by
clustering the data, so that each subproblem can be solved independently and
efficiently. We show theoretically that the support vectors identified by the
subproblem solution are likely to be support vectors of the entire kernel SVM
problem, provided that the problem is partitioned appropriately by kernel
clustering. In the conquer step, the local solutions from the subproblems are
used to initialize a global coordinate descent solver, which converges quickly
as suggested by our analysis. By extending this idea, we develop a multilevel
Divide-and-Conquer SVM algorithm with adaptive clustering and early prediction
strategy, which outperforms state-of-the-art methods in terms of training
speed, testing accuracy, and memory usage. As an example, on the covtype
dataset with half-a-million samples, DC-SVM is 7 times faster than LIBSVM in
obtaining the exact SVM solution (to within relative error) which
achieves 96.15% prediction accuracy. Moreover, with our proposed early
prediction strategy, DC-SVM achieves about 96% accuracy in only 12 minutes,
which is more than 100 times faster than LIBSVM.
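The divide and conquer steps described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Plain k-means in input space stands in for kernel clustering, the bias term is dropped from the SVM dual, only a single level is used instead of the multilevel scheme, and every name (`rbf`, `kmeans`, `dual_cd`, `dual_objective`, `dc_svm`) as well as the toy data is a hypothetical choice made for illustration.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def kmeans(X, k, iters=20):
    """Plain k-means with evenly spaced initial centers (an arbitrary choice)."""
    centers = [list(X[(c * len(X)) // k]) for c in range(k)]
    labels = [0] * len(X)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((p - q) ** 2 for p, q in zip(x, centers[c])))
                  for x in X]
        for c in range(k):
            members = [x for x, l in zip(X, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def dual_cd(X, y, C=1.0, alpha=None, tol=1e-8, max_sweeps=1000):
    """Coordinate descent on the bias-free dual:
    minimize 0.5 * a^T Q a - sum(a) subject to 0 <= a_i <= C.
    Returns (alpha, number of sweeps used)."""
    n = len(X)
    alpha = list(alpha) if alpha is not None else [0.0] * n
    Q = [[y[i] * y[j] * rbf(X[i], X[j]) for j in range(n)] for i in range(n)]
    grad = [sum(Q[i][j] * alpha[j] for j in range(n)) - 1.0 for i in range(n)]
    for sweep in range(1, max_sweeps + 1):
        biggest = 0.0
        for i in range(n):
            # Exact minimization along coordinate i, clipped to the box.
            new = min(C, max(0.0, alpha[i] - grad[i] / Q[i][i]))
            delta = new - alpha[i]
            if delta != 0.0:
                for j in range(n):
                    grad[j] += Q[j][i] * delta
                alpha[i] = new
            biggest = max(biggest, abs(delta))
        if biggest < tol:
            return alpha, sweep
    return alpha, max_sweeps

def dual_objective(X, y, alpha):
    """Value of the dual objective that dual_cd minimizes."""
    n = len(X)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * rbf(X[i], X[j])
               for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)

def dc_svm(X, y, k=2, C=1.0):
    """Divide: solve one subproblem per cluster.
    Conquer: warm-start the global solver from the concatenated local solutions."""
    labels = kmeans(X, k)
    alpha0 = [0.0] * len(X)
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        if not idx:
            continue
        sub, _ = dual_cd([X[i] for i in idx], [y[i] for i in idx], C)
        for i, a in zip(idx, sub):
            alpha0[i] = a
    return dual_cd(X, y, C, alpha=alpha0)

# Toy data: two well-separated 4x4 grids, each containing both classes.
X = [(ox + i, j) for ox in (0, 10) for i in range(4) for j in range(4)]
y = [1 if j >= 2 else -1 for (_, j) in X]

cold_alpha, cold_sweeps = dual_cd(X, y)   # global solver started from zero
warm_alpha, warm_sweeps = dc_svm(X, y)    # global solver started from local solutions
print("cold-start sweeps:", cold_sweeps, "warm-start sweeps:", warm_sweeps)
```

On this toy data the two clusters barely interact through the kernel, so the local solutions are already close to the global one and the warm-started solver should need no more sweeps than the cold start. Real data would not separate this cleanly.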
Scale-sensitive Psi-dimensions: the Capacity Measures for Classifiers Taking Values in R^Q
Bounds on the risk play a crucial role in statistical learning theory. They
usually involve, as the capacity measure of the model studied, the VC dimension
or one of its extensions. In classification, such "VC dimensions" exist for models
taking values in {0, 1}, {1,..., Q} and R. We introduce the generalizations
appropriate for the missing case, the one of models with values in R^Q. This
provides us with a new guaranteed risk for M-SVMs which appears superior to the
existing one.
Support Vector Machines in High Energy Physics
This lecture will introduce the Support Vector algorithms for classification
and regression. They are an application of the so-called kernel trick, which
allows the extension of a certain class of linear algorithms to the nonlinear
case. The kernel trick will be introduced and, in the context of structural risk
minimization, large margin algorithms for classification and regression will be
presented. Current applications in high energy physics will be discussed.
Comment: 11 pages, 12 figures. Part of the proceedings of the Track
'Computational Intelligence for HEP Data Analysis' at iCSC 200
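As a small illustration of the kernel trick mentioned in the abstract above, the following Python sketch checks that a degree-2 polynomial kernel on 2-D inputs equals an ordinary inner product after an explicit feature map. The kernel choice, the feature map `phi`, and the sample points are assumptions made for this example and are not taken from the lecture.

```python
import math

def dot(u, v):
    """Ordinary inner product."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2

def phi(x):
    """Explicit feature map for poly2_kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly2_kernel(x, z)      # computed in the 2-D input space
explicit_value = dot(phi(x), phi(z))   # computed in the 3-D feature space
print(kernel_value, explicit_value)
```

A linear algorithm that only uses inner products can therefore work in the feature space by calling the kernel, without ever forming `phi(x)`.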
Statistical Learning Theory for Location Fingerprinting in Wireless LANs
In this paper, techniques and algorithms developed in the framework of statistical learning theory are analyzed and applied to the problem of determining the location of a wireless device by measuring the signal strengths from a set of access points (location fingerprinting). Statistical learning theory provides a rich theoretical basis for the development of models starting from a set of examples. Signal strength measurement is part of the normal operating mode of wireless equipment, in particular Wi-Fi, so that no custom hardware is required. The proposed techniques, based on the Support Vector Machine paradigm, have been implemented and compared, on the same data set, with other approaches considered in the literature. Tests performed in a real-world environment show that results are comparable, with the advantage of a low algorithmic complexity in the normal operating phase. Moreover, the algorithm is particularly suitable for classification, where it outperforms the other techniques.
Making Indefinite Kernel Learning Practical
In this paper we embed evolutionary computation into statistical learning theory. First, we outline the connection between large margin optimization and statistical learning and see why this paradigm is successful for many pattern recognition problems. We then embed evolutionary computation into the most prominent representative of this class of learning methods, namely into Support Vector Machines (SVM). In contrast to former applications of evolutionary algorithms to SVM, we do not only optimize the method or kernel parameters. We rather use evolution strategies in order to directly solve the posed constrained optimization problem. Transforming the problem into the Wolfe dual reduces the total runtime and allows the usage of kernel functions just as for traditional SVM. We will show that evolutionary SVM are at least as accurate as their quadratic programming counterparts on eight real-world benchmark data sets in terms of generalization performance. They always outperform traditional approaches in terms of the original optimization problem. Additionally, the proposed algorithm is more generic than existing traditional solutions since it will also work for non-positive semidefinite or indefinite kernel functions. The evolutionary SVM variants frequently outperform their quadratic programming competitors in cases where such an indefinite kernel function is used.
A complexity analysis of statistical learning algorithms
We apply information-based complexity analysis to support vector machine
(SVM) algorithms, with the goal of a comprehensive continuous algorithmic
analysis of such algorithms. This involves complexity measures in which some
higher order operations (e.g., certain optimizations) are considered primitive
for the purposes of measuring complexity. We consider classes of information
operators and algorithms made up of scaled families, and investigate the
utility of scaling the complexities to minimize error. We look at the division
of statistical learning into information and algorithmic components, at the
complexities of each, and at applications to support vector machine (SVM) and
more general machine learning algorithms. We give applications to SVM
algorithms graded into linear and higher order components, and give an example
in biomedical informatics.