A Novel Model of Working Set Selection for SMO Decomposition Methods
In the process of training Support Vector Machines (SVMs) by decomposition
methods, working set selection is an important technique, and several promising
schemes have been applied in this field. To improve working set selection, we
propose a new model for working set selection in sequential minimal
optimization (SMO) decomposition methods. In this model, the working set B is
selected without reselection. Some properties are established with simple proofs, and
experiments demonstrate that the proposed method is in general faster than
existing methods.
Comment: 8 pages, 12 figures; submitted to the IEEE International
Conference on Tools with Artificial Intelligence
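The abstract does not detail how the working set B is built, so as background, here is a minimal Python sketch of the maximal-violating-pair rule, the classic working set selection scheme used by SMO decomposition methods (the function name and tolerance are illustrative, not taken from the paper):

```python
import numpy as np

def select_working_set(grad, y, alpha, C, tol=1e-3):
    """Pick two indices by the maximal-violating-pair rule.

    grad  : gradient of the dual objective w.r.t. alpha
    y     : labels in {-1, +1}
    alpha : current dual variables
    C     : box constraint
    """
    # I_up: indices whose alpha can still move "up" along y
    up = ((y == 1) & (alpha < C)) | ((y == -1) & (alpha > 0))
    # I_low: indices whose alpha can still move "down" along y
    low = ((y == 1) & (alpha > 0)) | ((y == -1) & (alpha < C))

    f = -y * grad                       # violation scores
    i = np.where(up)[0][np.argmax(f[up])]
    j = np.where(low)[0][np.argmin(f[low])]

    if f[i] - f[j] < tol:               # KKT conditions met within tolerance
        return None
    return i, j
```

An SMO solver would call a routine like this at each outer iteration and then optimize the two returned dual variables analytically.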
Data selection based on decision tree for SVM classification on large data sets
The Support Vector Machine (SVM) has important properties such as a strong mathematical foundation and better generalization capability than other classification methods. On the other hand, the major drawback of the SVM lies in its training phase, which is computationally expensive and highly dependent on the size of the input data set. In this study, a new algorithm to speed up the training of SVMs is presented; this method selects a small, representative subset of the data to improve the training time of the SVM. The novel method uses an induction tree to reduce the training data set for the SVM, producing a very fast and highly accurate algorithm. According to the results, the proposed algorithm achieves accuracy similar to that of current SVM implementations while running faster.
Proyecto UAEM 3771/2014/C
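As a rough illustration of the general idea (not the paper's exact procedure), the sketch below uses scikit-learn to keep only samples falling in impure decision-tree leaves, which tend to lie near the class boundary, and trains the SVM on that reduced set; the tree depth and the purity heuristic are assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=20000, n_features=20, random_state=0)

# Fit an induction tree and record which leaf each sample lands in.
tree = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X, y)
leaf = tree.apply(X)

keep = np.zeros(len(X), dtype=bool)
for l in np.unique(leaf):
    members = leaf == l
    # Impure leaves mix both classes, so they likely straddle the boundary.
    if 0 < y[members].mean() < 1:
        keep |= members

# Train the SVM on the reduced, boundary-focused subset only.
svm = SVC(kernel="rbf").fit(X[keep], y[keep])
print(f"trained on {keep.sum()} of {len(X)} samples")
```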
Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
For supervised and unsupervised learning, positive definite kernels make it
possible to use large and potentially infinite-dimensional feature spaces at a
computational cost that depends only on the number of observations. This is
usually done through the penalization of predictor functions by Euclidean or
Hilbertian norms. In this paper, we explore penalization by sparsity-inducing
norms such as the l1-norm or the block l1-norm. We assume that the kernel
decomposes into a large sum of individual basis kernels that can be embedded
in a directed acyclic graph; we show that it is then possible to perform kernel
selection through a hierarchical multiple kernel learning framework, in
polynomial time in the number of selected kernels. This framework applies
naturally to nonlinear variable selection; our extensive simulations on
synthetic datasets and datasets from the UCI repository show that efficiently
exploring the large feature space through sparsity-inducing norms leads to
state-of-the-art predictive performance.
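To make the sparsity-inducing penalty concrete, here is a toy Python sketch of l1-penalized multiple kernel learning by projected gradient, with one Gaussian basis kernel per input variable so that sparsity in the kernel weights acts as nonlinear variable selection. This is a simplified stand-in, not the paper's DAG-structured hierarchical algorithm, and the step sizes and penalty strengths are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)  # only variable 0 matters

def rbf(v):
    # Gaussian kernel on a single input variable.
    d = v[:, None] - v[None, :]
    return np.exp(-d ** 2)

Ks = np.stack([rbf(X[:, m]) for m in range(X.shape[1])])
eta = np.ones(len(Ks))            # nonnegative kernel weights
lam, mu, lr = 1e-2, 1e-2, 0.05    # ridge, l1 strength, step size

for _ in range(300):
    K = np.tensordot(eta, Ks, axes=1)                # combined kernel
    a = np.linalg.solve(K + lam * np.eye(len(y)), y) # kernel ridge fit
    # d/d eta_m of the fit term y^T (K + lam I)^{-1} y is -a^T K_m a.
    grad = -np.einsum("i,mij,j->m", a, Ks, a)
    eta = np.maximum(0.0, eta - lr * (grad + mu))    # projected l1 step

print("learned kernel weights:", np.round(eta, 3))
```

The l1 term mu * sum(eta), combined with the nonnegativity projection, is what drives irrelevant kernel weights toward zero.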
Training Support Vector Machines Using Frank-Wolfe Optimization Methods
Training a Support Vector Machine (SVM) requires the solution of a quadratic
programming (QP) problem whose computational complexity becomes prohibitively
expensive for large-scale datasets. Traditional optimization methods cannot be
directly applied in these cases, mainly due to memory restrictions.
By adopting a slightly different objective function and under mild conditions
on the kernel used within the model, efficient algorithms to train SVMs have
been devised under the name of Core Vector Machines (CVMs). This framework
exploits the equivalence of the resulting learning problem with the task of
computing a Minimal Enclosing Ball (MEB) in a feature space, where data
is implicitly embedded by a kernel function.
In this paper, we improve on the CVM approach by proposing two novel methods
to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast
method to approximate the solution of a MEB problem. In contrast to CVMs, our
algorithms do not require computing the solutions of a sequence of
increasingly complex QPs and are defined using only analytic optimization
steps. Experiments on a large collection of datasets show that our methods
scale better than CVMs in most cases, sometimes at the price of slightly
lower accuracy. Like CVMs, the proposed methods can be easily extended to machine
learning problems other than binary classification. However, effective
classifiers are also obtained with kernels that do not satisfy the condition
required by CVMs, so the proposed methods can be applied to a wider set of problems.
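For intuition about the optimization core, here is a minimal Python sketch of the Frank-Wolfe algorithm applied to a Euclidean Minimal Enclosing Ball problem; the paper's methods work analogously in the kernel-induced feature space, and the step-size rule and iteration count here are illustrative:

```python
import numpy as np

def meb_frank_wolfe(X, iters=500):
    """Approximate the Minimal Enclosing Ball of the rows of X.

    Maximizes the MEB dual over the simplex: at each step the farthest
    point from the current center is the Frank-Wolfe vertex, and the
    weights move toward it with the standard 2/(k+2) step size.
    """
    n = len(X)
    u = np.full(n, 1.0 / n)              # convex weights over the points
    for k in range(iters):
        c = u @ X                        # current center
        d2 = ((X - c) ** 2).sum(axis=1)  # squared distances to the center
        i = np.argmax(d2)                # farthest point = FW vertex
        gamma = 2.0 / (k + 2)
        e = np.zeros(n); e[i] = 1.0
        u = (1 - gamma) * u + gamma * e  # analytic update, no QP solve
    c = u @ X
    r = np.sqrt(((X - c) ** 2).sum(axis=1).max())
    return c, r

X = np.random.default_rng(0).normal(size=(1000, 3))
c, r = meb_frank_wolfe(X)
print("center:", np.round(c, 3), "radius:", round(r, 3))
```

Each iteration touches only one data point beyond the distance computation, which is what lets such schemes avoid the sequence of increasingly complex QPs solved by CVMs.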