Multiple Locally Linear Kernel Machines
In this paper we propose a new non-linear classifier based on a combination
of locally linear classifiers. A well-known optimization formulation is
obtained by casting the problem as a Multiple Kernel Learning (MKL) problem
using many locally linear kernels. Since the number of such kernels is huge,
we provide a scalable generic MKL training algorithm that handles streaming
kernels. With respect to inference time, the resulting classifier fills the
gap between high-accuracy but slow non-linear classifiers (such as classical
MKL) and fast but low-accuracy linear classifiers.

Comment: This paper was written in 2014 and was originally submitted but
rejected at ICML'1
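A minimal sketch of the general idea, not the paper's formulation: one way to build a "locally linear" kernel is to weight the ordinary linear kernel by soft assignments to anchor points, so the overall kernel is a sum of many local linear kernels that an MKL solver could then weight. The anchor points, the Gaussian soft-assignment with width beta, and the toy dataset below are all illustrative assumptions.

```python
# Hedged sketch: a toy locally linear kernel built from anchor points.
# Assumption (not from the paper): K(x, z) = sum_m gamma_m(x) * gamma_m(z) * <x, z>,
# where gamma_m are soft assignments to M anchor points, i.e. a sum of M
# "local" linear kernels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)      # non-linearly separable labels

M = 8
anchors = KMeans(n_clusters=M, n_init=10, random_state=0).fit(X).cluster_centers_

def soft_assign(A, beta=2.0):
    d2 = ((A[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    w = np.exp(-beta * d2)
    return w / (w.sum(axis=1, keepdims=True) + 1e-12)    # (n, M) local weights

def locally_linear_kernel(A, B):
    ga, gb = soft_assign(A), soft_assign(B)
    return (ga @ gb.T) * (A @ B.T)                        # sum of M local linear kernels

clf = SVC(kernel=locally_linear_kernel).fit(X, y)
print("train accuracy:", clf.score(X, y))
```

Each term gamma_m(x) gamma_m(z) <x, z> is a product of positive semi-definite kernels, so the sum is itself a valid kernel; the sketch simply hands it to a standard SVM rather than to an MKL solver.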
Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Model Coefficients
Kernel approximation is widely used to scale up kernel SVM training and
prediction. However, the memory and computation costs of kernel approximation
models are still too high if we want to deploy them on memory-limited devices
such as mobile phones, smartwatches, and IoT devices. To address this
challenge, we propose a novel memory and computation-efficient kernel SVM model
by using both binary embedding and ternary model coefficients. First, we propose
an efficient way to generate compact binary embedding of the data, preserving
the kernel similarity. Second, we propose a simple but effective algorithm to
learn a linear classification model with ternary coefficients that can support
different types of loss functions and regularizers. Our algorithm can achieve
better generalization accuracy than existing works on learning binary
coefficients, since we allow coefficients to be -1, 0, or 1 during the
training stage, and coefficients equal to 0 can be removed during model
inference for binary classification. Moreover, we provide a detailed analysis of the
convergence of our algorithm and the inference complexity of our model. The
analysis shows that the convergence to a local optimum is guaranteed, and the
inference complexity of our model is much lower than other competing methods.
Our experimental results on five large real-world datasets demonstrate
that our proposed method can build accurate nonlinear SVM models with memory
costs of less than 30 KB.
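A minimal sketch of the two ingredients, not the paper's algorithm: a sign random projection serves as a binary embedding that roughly preserves angular similarity, and the weights of a linear SVM trained on the codes are then crudely ternarized to {-1, 0, +1}, so inference needs only additions and subtractions. The embedding dimension, the threshold tau, and the use of LinearSVC are illustrative assumptions.

```python
# Hedged sketch (not the paper's method): binary embedding + ternary weights.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=50, random_state=0)

D = 256                                     # binary embedding dimension (assumption)
R = rng.normal(size=(X.shape[1], D))
Z = np.sign(X @ R)                          # +/-1 codes; Hamming distance tracks angles

svm = LinearSVC(C=1.0, max_iter=5000).fit(Z, y)
w = svm.coef_.ravel()

tau = 0.3 * np.abs(w).max()                 # keep only the largest weights (assumption)
w_ternary = np.sign(w) * (np.abs(w) > tau)  # coefficients in {-1, 0, +1}

scores = Z @ w_ternary + svm.intercept_     # zero coefficients drop out of inference
pred = (scores > 0).astype(int)
print("ternary-model accuracy:", (pred == y).mean())
```

The storage argument is visible even in this toy version: the binary codes and the ternary weight vector can be packed into a few bits per entry instead of 32-bit floats.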
Optimal and Efficient Learning In Classification
We study a natural extension of classical empirical risk minimization, where the hypothesis space is a random subspace of a given space. In particular, we consider possibly data-dependent subspaces spanned by a random subset of the data, recovering as a special case Nyström approaches for kernel methods. Considering random subspaces naturally leads to computational savings, but the question is whether the corresponding learning accuracy is degraded. These statistical-computational tradeoffs have recently been explored for the least squares loss and for self-concordant loss functions such as the logistic loss. Here, we extend these results to convex Lipschitz loss functions that might not be smooth, such as the hinge loss used in support vector machines. This unified analysis requires developing new proofs that use different technical tools to establish fast rates. Our main results show the existence of different settings, depending on how hard the learning problem is, in which computational efficiency can be improved with no loss in performance. The analysis is also specialized to smooth loss functions. In the final part of the paper, we convert our surrogate risk bounds into classification error bounds and compare the choice of the hinge loss with that of the square loss.
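A minimal sketch of the computational shortcut the abstract refers to: empirical risk minimization with the hinge loss carried out in a random (Nyström) subspace spanned by a small subset of the training points. The dataset, the number of landmarks, and the kernel width are illustrative assumptions, not the paper's setup or its theoretical guarantees.

```python
# Hedged sketch: hinge-loss ERM on a Nystrom random subspace.
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=2000, noise=0.2, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Features are built from a random subset of m = 100 training points
# (the "random subspace"); the hinge-loss ERM then runs in that subspace,
# costing far less than solving the full kernel problem.
model = make_pipeline(
    Nystroem(kernel="rbf", gamma=1.0, n_components=100, random_state=0),
    LinearSVC(C=1.0, max_iter=10000),
)
model.fit(Xtr, ytr)
print("test accuracy:", model.score(Xte, yte))
```

Varying n_components trades computation against accuracy, which is exactly the statistical-computational tradeoff the abstract analyzes for non-smooth losses.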