Performance and optimization of support vector machines in high-energy physics classification problems
In this paper we promote the use of Support Vector Machines (SVM) as a
machine learning tool for searches in high-energy physics. As an example of a
new-physics search we discuss the popular case of Supersymmetry at the Large
Hadron Collider. We demonstrate that the SVM is a valuable tool and show that
an automated discovery-significance-based optimization of the SVM
hyper-parameters is a highly efficient way to prepare an SVM for such
applications. A new C++ LIBSVM interface called SVM-HINT is developed and
available on GitHub. Comment: 20 pages, 6 figures
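A significance-driven tuning loop needs a figure of merit to maximize. The sketch below is only an illustration and is not taken from SVM-HINT: it assumes the Asimov approximation for the discovery significance and scans cuts on classifier scores, where `asimov_significance` and `best_threshold` are hypothetical helper names. The abstract does not say which significance definition the authors use.

```python
import math


def asimov_significance(s, b):
    """Approximate discovery significance for s signal and b background events.

    Assumption: the Asimov formula Z = sqrt(2 * ((s + b) * ln(1 + s / b) - s)).
    Returns 0.0 when either yield is non-positive, where the formula is undefined.
    """
    if s <= 0 or b <= 0:
        return 0.0
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


def best_threshold(scores, labels, weights=None):
    """Scan cuts on classifier scores; return (cut, significance) of the best cut.

    labels: 1 for signal, 0 for background. Events with score >= cut are selected.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    best_cut, best_z = None, 0.0
    for cut in sorted(set(scores)):
        s = sum(w for x, y, w in zip(scores, labels, weights) if x >= cut and y == 1)
        b = sum(w for x, y, w in zip(scores, labels, weights) if x >= cut and y == 0)
        z = asimov_significance(s, b)
        if z > best_z:
            best_cut, best_z = cut, z
    return best_cut, best_z
```

In a hyper-parameter scan, the significance returned by `best_threshold` on a validation sample would play the role that accuracy plays in an ordinary grid search.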
Multi-class SVMs: From Tighter Data-Dependent Generalization Bounds to Novel Algorithms
This paper studies the generalization performance of multi-class
classification algorithms, for which we obtain, for the first time, a
data-dependent generalization error bound with a logarithmic dependence on the
class size, substantially improving the state-of-the-art linear dependence in
the existing data-dependent generalization analysis. The theoretical analysis
motivates us to introduce a new multi-class classification machine based on
$\ell_p$-norm regularization, where the parameter $p$ controls the complexity
of the corresponding bounds. We derive an efficient optimization algorithm
based on Fenchel duality theory. Benchmarks on several real-world datasets show
that the proposed algorithm can achieve significant accuracy gains over the
state of the art.
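As a rough illustration of a norm-based regularizer for a multi-class model, the sketch below computes a block norm over per-class weight vectors. It assumes the regularizer takes the form (sum_c ||w_c||_2^p)^(1/p), which the abstract does not spell out, and `lp_group_norm` is a hypothetical name.

```python
def lp_group_norm(W, p):
    """Block norm (sum_c ||w_c||_2^p)^(1/p) over the rows w_c of W.

    W: list of per-class weight vectors (one row per class).
    p: norm parameter, assumed here to satisfy p >= 1.
    """
    if p < 1:
        raise ValueError("p must be >= 1")
    row_norms = [sum(v * v for v in row) ** 0.5 for row in W]
    return sum(r ** p for r in row_norms) ** (1.0 / p)
```

With p = 1 the per-class norms are simply summed, and with p = 2 the value equals the Frobenius norm of W.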
Regularization and Kernelization of the Maximin Correlation Approach
Robust classification becomes challenging when each class consists of
multiple subclasses. Examples include multi-font optical character recognition
and automated protein function prediction. In correlation-based
nearest-neighbor classification, the maximin correlation approach (MCA)
provides the worst-case optimal solution by minimizing the maximum
misclassification risk through an iterative procedure. Despite the optimality,
the original MCA has drawbacks that have limited its wide applicability in
practice. That is, the MCA tends to be sensitive to outliers, cannot
effectively handle nonlinearities in datasets, and suffers from having high
computational complexity. To address these limitations, we propose an improved
solution, named regularized maximin correlation approach (R-MCA). We first
reformulate MCA as a quadratically constrained linear programming (QCLP)
problem, incorporate regularization by introducing slack variables in the
primal problem of the QCLP, and derive the corresponding Lagrangian dual. The
dual formulation enables us to apply the kernel trick to R-MCA so that it can
better handle nonlinearities. Our experimental results demonstrate that the
regularization and kernelization make the proposed R-MCA more robust and
accurate for various classification tasks than the original MCA. Furthermore,
when the data size or dimensionality grows, R-MCA runs substantially faster by
solving either the primal or dual (whichever has a smaller variable dimension)
of the QCLP. Comment: Submitted to IEEE Access
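The maximin idea can be shown with a toy search over a finite candidate set. This is a simplification for illustration only: it does not solve the QCLP described in the abstract, and `correlation` and `maximin_template` are hypothetical names. For one class, it picks the template whose worst-case correlation with the subclass members is largest.

```python
import math


def correlation(u, v):
    """Normalized inner product of two vectors; 0.0 if either vector is zero."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def maximin_template(candidates, members):
    """Return (template, score) maximizing the minimum correlation to members."""
    best, best_score = None, -math.inf
    for cand in candidates:
        worst = min(correlation(cand, m) for m in members)
        if worst > best_score:
            best, best_score = cand, worst
    return best, best_score
```

For two orthogonal subclass vectors, a candidate lying between them wins because it is never badly correlated with either member.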
Inhibition in multiclass classification
The role of inhibition is investigated in a multiclass support vector machine formalism inspired by the brain structure of insects. The so-called mushroom bodies have a set of output neurons, or classification functions,
that compete with each other to encode a particular input. Strongly active output neurons depress or inhibit the remaining outputs without knowing which is correct or incorrect. Accordingly, we propose to use a
classification function that embodies unselective inhibition and train it in the large margin classifier framework. Inhibition leads to more robust classifiers in the sense that they perform better on larger areas of appropriate hyperparameters when assessed with leave-one-out strategies. We also show that the classifier with inhibition is a tight bound to probabilistic exponential models and is Bayes consistent for 3-class problems.
These properties make this approach useful for data sets with a limited number of labeled examples. For larger data sets, there is no significant advantage over other multiclass SVM approaches.
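One way to picture unselective inhibition is to subtract a shared term, proportional to the pooled activity of all outputs, from every output. The sketch below assumes that form, with a hypothetical inhibition strength `mu`. The abstract does not give the exact classification function, so this is not the authors' definition.

```python
def inhibited_outputs(scores, mu):
    """Subtract mu times the summed activity from every output score.

    scores: raw per-class outputs, e.g. w_j . x for each class j.
    mu: assumed inhibition strength; mu = 0 recovers the raw scores.
    """
    total = sum(scores)
    return [s - mu * total for s in scores]


def predict(scores, mu):
    """Index of the largest inhibited output."""
    out = inhibited_outputs(scores, mu)
    return max(range(len(out)), key=out.__getitem__)
```

Because the same amount is subtracted from every output, the winning class for a fixed set of scores does not change. Under this assumed form, inhibition would matter through the margins seen during training rather than through the final argmax.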