106 research outputs found
Training Support Vector Machines Using Frank-Wolfe Optimization Methods
Training a Support Vector Machine (SVM) requires the solution of a quadratic
programming problem (QP) whose computational complexity becomes prohibitively
expensive for large scale datasets. Traditional optimization methods cannot be
directly applied in these cases, mainly due to memory restrictions.
By adopting a slightly different objective function and under mild conditions
on the kernel used within the model, efficient algorithms to train SVMs have
been devised under the name of Core Vector Machines (CVMs). This framework
exploits the equivalence of the resulting learning problem with the task of
building a Minimal Enclosing Ball (MEB) in a feature space, where data
is implicitly embedded by a kernel function.
In this paper, we improve on the CVM approach by proposing two novel methods
to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast
method to approximate the solution of an MEB problem. In contrast to CVMs, our
algorithms do not require computing the solutions of a sequence of
increasingly complex QPs and are defined by using only analytic optimization
steps. Experiments on a large collection of datasets show that our methods
scale better than CVMs in most cases, sometimes at the price of a slightly
lower accuracy. Like CVMs, the proposed methods can be easily extended to machine
learning problems other than binary classification. Unlike CVMs, however, they
also yield effective classifiers with kernels that do not satisfy the condition
required by CVMs, and can thus be used for a wider set of problems.
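The MEB connection described in the abstract above can be illustrated with a minimal sketch. It assumes plain Euclidean input space instead of a kernel-induced feature space, and it uses the generic Frank-Wolfe step size 2/(k+2) on the dual MEB problem; it is not the two methods proposed in the paper. The names `frank_wolfe_meb` and `sq_dist` are hypothetical.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def frank_wolfe_meb(points, iterations=1000):
    """Approximate the minimal enclosing ball of `points`.

    The center is kept as a convex combination of the points
    (weights `alpha` on the simplex). Each step moves weight toward
    the point farthest from the current center, which is the linear
    maximization step of Frank-Wolfe on the dual MEB objective.
    """
    n = len(points)
    alpha = [0.0] * n
    alpha[0] = 1.0
    center = list(points[0])
    for k in range(iterations):
        # Farthest point from the current center (no QP is solved).
        far = max(range(n), key=lambda i: sq_dist(points[i], center))
        eta = 2.0 / (k + 2.0)  # generic step size, an assumption of this sketch
        alpha = [(1.0 - eta) * a for a in alpha]
        alpha[far] += eta
        center = [(1.0 - eta) * c + eta * p
                  for c, p in zip(center, points[far])]
    # Radius large enough to enclose every point from the final center.
    radius = math.sqrt(max(sq_dist(p, center) for p in points))
    return center, radius, alpha


if __name__ == "__main__":
    pts = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
    center, radius, alpha = frank_wolfe_meb(pts, iterations=2000)
    print(center, radius)
```

Each iteration needs only one pass over the data to find the farthest point, which is the property the abstract contrasts with solving a sequence of QPs.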
A Graph-Based Semi-Supervised k Nearest-Neighbor Method for Nonlinear Manifold Distributed Data Classification
k Nearest Neighbors (kNN) is one of the most widely used supervised
learning algorithms to classify Gaussian distributed data, but it does not
achieve good results when it is applied to nonlinear manifold distributed data,
especially when only a very limited number of labeled samples is available. In this
paper, we propose a new graph-based kNN algorithm which can effectively
handle both Gaussian distributed data and nonlinear manifold distributed data.
To achieve this goal, we first propose a constrained Tired Random Walk (TRW) by
constructing an -level nearest-neighbor strengthened tree over the graph,
and then compute a TRW matrix for similarity measurement purposes. After this,
the nearest neighbors are identified according to the TRW matrix and the class
label of a query point is determined by the sum of all the TRW weights of its
nearest neighbors. To deal with online situations, we also propose a new
algorithm to handle sequential samples based on local neighborhood
reconstruction. Comparison experiments are conducted on both synthetic data
sets and real-world data sets to demonstrate the validity of the proposed new
kNN algorithm and its improvements over other versions of kNN algorithms.
Given the widespread appearance of manifold structures in real-world problems
and the popularity of the traditional kNN algorithm, the proposed manifold
version of kNN shows promising potential for classifying manifold-distributed
data. Comment: 32 pages, 12 figures, 7 tables
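The labeling rule described in the abstract above (neighbors chosen by similarity, class decided by the sum of their weights) can be sketched as follows. The sketch assumes the similarity values between the query and each labeled sample are already available; it does not compute the TRW matrix itself, and the name `weighted_knn_label` is hypothetical.

```python
from collections import defaultdict


def weighted_knn_label(similarities, labels, k):
    """Pick the k most similar labeled samples and return the class
    whose selected neighbors have the largest total similarity."""
    # Indices of the k largest similarity values.
    nearest = sorted(range(len(similarities)),
                     key=lambda i: similarities[i],
                     reverse=True)[:k]
    totals = defaultdict(float)
    for i in nearest:
        totals[labels[i]] += similarities[i]
    return max(totals, key=totals.get)


if __name__ == "__main__":
    sims = [0.9, 0.1, 0.5, 0.45, 0.05]   # illustrative values only
    labels = ["a", "b", "b", "b", "a"]
    print(weighted_knn_label(sims, labels, k=3))
```

With a summed-weight vote, two moderately similar neighbors of one class can outweigh a single very similar neighbor of another, unlike a plain majority count.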
Multi-class pairwise linear dimensionality reduction using heteroscedastic schemes
Linear dimensionality reduction (LDR) techniques have become increasingly important in pattern recognition (PR) because they permit a relatively simple mapping of the problem onto a lower-dimensional subspace, leading to simple and computationally efficient classification strategies. Although the field has been well developed for the two-class problem, the corresponding issues encountered when dealing with multiple classes are far from trivial. In this paper, we argue that, as opposed to the traditional LDR multi-class schemes, if we are dealing with multiple classes, it is not expedient to treat the task as a multi-class problem per se. Rather, we shall show that it is better to treat it as an ensemble of Chernoff-based two-class reductions onto different subspaces, whence the overall solution is achieved by resorting to either a Voting, a Weighting, or a Decision Tree strategy. The experimental results obtained on benchmark datasets demonstrate that the proposed methods are not only efficient, but that they also yield accuracies comparable to those obtained by the optimal Bayes classifier.
Convex Optimization for Binary Classifier Aggregation in Multiclass Problems
Multiclass problems are often decomposed into multiple binary problems that
are solved by individual binary classifiers whose results are integrated into a
final answer. Various methods, including all-pairs (APs), one-versus-all (OVA),
and error-correcting output codes (ECOC), have been studied for decomposing
multiclass problems into binary problems. However, little work has addressed
how to optimally aggregate the binary problems to determine a final answer to the
multiclass problem. In this paper we present a convex optimization method for
an optimal aggregation of binary classifiers to estimate class membership
probabilities in multiclass problems. We model the class membership probability
as a softmax function which takes a conic combination of discrepancies induced
by individual binary classifiers, as an input. With this model, we formulate
the regularized maximum likelihood estimation as a convex optimization problem,
which is solved by the primal-dual interior point method. Connections of our
method to large margin classifiers are presented, showing that the large margin
formulation can be considered as a limiting case of our convex formulation.
Numerical experiments on synthetic and real-world data sets demonstrate that
our method outperforms existing aggregation methods as well as direct methods,
in terms of the classification accuracy and the quality of class membership
probability estimates. Comment: Appeared in Proceedings of the 2014 SIAM International Conference on
Data Mining (SDM 2014)
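The probability model described in the abstract above (a softmax over a conic combination of per-classifier discrepancies) can be sketched as follows. Several details are assumptions of this sketch and are not taken from the paper: the discrepancy is the absolute difference between a ±1 code entry and a classifier score, the combination is negated so that smaller discrepancy gives higher probability, and the weights are fixed by hand instead of being learned by regularized maximum likelihood. The name `class_probabilities` is hypothetical.

```python
import math


def class_probabilities(code_matrix, scores, weights):
    """Softmax over negated, non-negatively weighted discrepancies.

    code_matrix[c][j] is the target output (+1 or -1) of binary
    classifier j for class c; scores[j] is classifier j's output;
    weights[j] >= 0 makes the weighted sum a conic combination.
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    logits = []
    for row in code_matrix:
        # Assumed discrepancy: |target code - classifier score|.
        total = sum(w * abs(c - s)
                    for w, c, s in zip(weights, row, scores))
        logits.append(-total)
    # Numerically stable softmax.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


if __name__ == "__main__":
    ova_code = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]  # one-versus-all
    scores = [0.9, -0.8, -0.7]                           # illustrative values
    print(class_probabilities(ova_code, scores, [1.0, 1.0, 1.0]))
```

Restricting the weights to be non-negative is what makes the combination conic; in the paper the weights are the variables of the convex problem, whereas here they are simply inputs.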