Parallel Perceptrons and Training Set Selection for Imbalanced Classification Problems
This is an electronic version of the paper presented at Learning 2004, held in Spain in 2004. Parallel perceptrons are a novel approach to the study of committee machines that allows, among other things, fast training with minimal communications between outputs and hidden units. Moreover, their training naturally defines margins for hidden unit activations. In this work we shall show how to use those margins to perform subsample selections over a given training set that reduce training complexity while enhancing classification accuracy and allowing for a balanced classifier performance when class sizes are greatly different.
With partial support of Spain's CICyT, TIC 01-57
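As a rough illustration of the idea, a minimal sketch of margin-based subsample selection for a single linear perceptron follows. This is not the paper's exact procedure: the perceptron details, the margin threshold, and the toy data are all assumptions; the paper works with parallel perceptron committees.

```python
def train_perceptron(data, epochs=50, lr=0.1):
    """Plain perceptron training; data is a list of (features, label), label in {-1, +1}."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            act = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * act <= 0:  # misclassified: standard perceptron update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def margin_subsample(data, w, b, margin=1.0):
    """Keep only patterns whose activation margin is small (near the boundary).
    Small-margin points are the most informative ones for retraining, which is
    the intuition behind margin-based training set selection."""
    kept = []
    for x, y in data:
        act = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * act < margin:  # inside the margin band, or misclassified
            kept.append((x, y))
    return kept

# Toy imbalanced problem: many negatives, few positives (hypothetical data).
data = [([float(i), 1.0], -1) for i in range(2, 20)] + [([-1.0, 1.0], 1), ([-2.0, 1.0], 1)]
w, b = train_perceptron(data)
subset = margin_subsample(data, w, b, margin=2.0)
```

Retraining on `subset` rather than `data` is the complexity reduction the abstract refers to; class-balanced selection would additionally cap the number of kept majority-class points.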
General noise support vector regression with non-constant uncertainty intervals for solar radiation prediction
General noise cost functions have recently been proposed for support vector regression (SVR). When applied to tasks whose underlying noise distribution is similar to the one assumed for the cost function, these models should perform better than classical ε-SVR. On the other hand, uncertainty estimates for SVR have received somewhat limited attention in the literature until now and still have unaddressed problems. Keeping this in mind, three main goals are addressed here. First, we propose a framework that uses a combination of general noise SVR models with the Naive Online R Minimization Algorithm (NORMA) as the optimization method, and then gives non-constant error intervals that depend on the input data, aided by the use of clustering techniques. We give the theoretical details required to implement this framework for Laplace, Gaussian, Beta, Weibull and Marshall–Olkin generalized exponential distributions. Second, we test the proposed framework on two real-world regression problems using data from two public competitions about solar energy. Results show the validity of our models and an improvement over classical ε-SVR. Finally, in accordance with the principle of reproducible research, we make sure that the data and model implementations used for the experiments are easily and publicly accessible.
With partial support from Spain's grants TIN2013-42351-P, TIN2016-76406-P, TIN2015-70308-REDT, as well as S2013/ICE-2845 CASI-CAM-CM. This work was also supported by project FACIL–Ayudas Fundación BBVA a Equipos de Investigación Científica 2016 and the UAM–ADIC Chair for Data Science and Machine Learning. We gratefully acknowledge the use of the facilities of Centro de Computación Científica, CCC, at Universidad Autónoma de Madrid, UA
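To make the NORMA ingredient concrete, here is a minimal sketch of NORMA-style stochastic gradient descent for linear ε-insensitive SVR. This is a deliberate simplification: the paper uses kernels and general noise losses, while this sketch covers only the classical linear ε-SVR case, and the toy stream and step sizes are assumptions.

```python
def norma_svr(stream, dim, eta=0.1, lam=0.01, eps=0.1):
    """One pass of NORMA-style online SGD for linear epsilon-insensitive SVR.
    Each step shrinks w (regularization gradient) and, when the residual falls
    outside the epsilon tube, takes a sign-based step toward the target."""
    w = [0.0] * dim
    b = 0.0
    for x, y in stream:
        pred = sum(wi * xi for wi, xi in zip(w, x)) + b
        err = y - pred
        # Shrinkage from the gradient of (lam/2) * ||w||^2.
        w = [(1.0 - eta * lam) * wi for wi in w]
        if abs(err) > eps:  # outside the insensitive tube: loss gradient is +-1
            g = 1.0 if err > 0 else -1.0
            w = [wi + eta * g * xi for wi, xi in zip(w, x)]
            b += eta * g
    return w, b

# Hypothetical data stream with y roughly 2*x + 0.5.
stream = [([float(i % 5)], 2.0 * (i % 5) + 0.5) for i in range(200)]
w, b = norma_svr(stream, dim=1)
```

Swapping the `abs(err) > eps` branch for the derivative of a Laplace, Beta or Weibull negative log-likelihood is, loosely speaking, where the general noise variants of the paper would differ.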
Rosen’s projection method for SVM training
This is an electronic version of the paper presented at the 17th European Symposium on Artificial Neural Networks, held in Bruges in 2009. In this work we will give explicit formulae for the application of Rosen's gradient projection method to SVM training, leading to a very simple implementation. We shall experimentally show that the method provides good descent directions that result in fewer training iterations, particularly when high precision is wanted. However, a naive kernelization may end up in a procedure requiring more kernel operations (KOs) than SMO, and further work is needed to arrive at an efficient implementation.
With partial support of Spain's TIN 2007–66862 project and Cátedra UAM–IIC en Modelado y Predicción. The first author is kindly supported by FPU-MICINN grant reference AP2007–00142
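The core step of Rosen's method in its simplest form is projecting the gradient onto the SVM dual's equality constraint, so that moving along the projected direction keeps sum_i alpha_i y_i = 0. A minimal sketch (the box constraints 0 <= alpha_i <= C, which the paper also handles, are omitted here):

```python
def rosen_projected_direction(grad, y):
    """Project a gradient onto the hyperplane {d : y . d = 0}, i.e. the
    equality constraint of the SVM dual.  The projection subtracts the
    component of grad along y: d = grad - ((y . grad) / (y . y)) * y."""
    yg = sum(gi * yi for gi, yi in zip(grad, y))
    yy = sum(yi * yi for yi in y)
    return [gi - (yg / yy) * yi for gi, yi in zip(grad, y)]

# Hypothetical gradient and labels for a 3-pattern problem.
g = [1.0, -2.0, 0.5]
y = [1.0, -1.0, 1.0]
d = rosen_projected_direction(g, y)
# By construction y . d = 0, so a step along d preserves the constraint.
```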
Least 1-Norm SVMs: a new SVM variant between standard and LS-SVMs
This is an electronic version of the paper presented at the 18th European Symposium on Artificial Neural Networks, held in Bruges in 2010. Least Squares Support Vector Machines (LS-SVMs) were proposed by replacing the inequality constraints inherent to L1-SVMs with equality constraints. So far this idea has only been suggested for a least squares (L2) loss. We describe how this can also be done for the sum-of-slacks (L1) loss, yielding a new classifier (Least 1-Norm SVMs) which gives similar models in terms of complexity and accuracy and which may also be more robust than LS-SVMs with respect to outliers.
With partial support of Spain's TIN 2007–66862 project and Cátedra IIC en Modelado y Predicción. The first author is kindly supported by FPU-MICINN grant reference AP2007–00142
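From the abstract's description, the optimization problem can be sketched as follows (a reconstruction, not quoted from the paper): keep the sum-of-slacks loss of the L1-SVM but, as in LS-SVMs, turn the inequality constraints into equalities, which forces absolute values in the objective since equality slacks may be negative:

```latex
\min_{w,\,b,\,\xi} \; \frac{1}{2}\,\|w\|^{2} + C \sum_{i=1}^{n} |\xi_i|
\quad \text{s.t.} \quad y_i\,(w^{\top} x_i + b) = 1 - \xi_i, \qquad i = 1, \dots, n.
```

For comparison, the standard L1-SVM uses $y_i(w^{\top} x_i + b) \ge 1 - \xi_i$ with $\xi_i \ge 0$, while the LS-SVM keeps the equalities but penalizes $\sum_i \xi_i^{2}$.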
Discriminant parallel perceptrons
Proceedings of the 22nd International Conference on Artificial Neural Networks, Lausanne, Switzerland, September 11–14, 2012. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-33266-1_70. In this work we will apply Diffusion Maps (DM), a recent technique for dimensionality reduction and clustering, to build local models for wind energy forecasting. We will compare ridge regression models for K-means clusters obtained over DM features against the models obtained for clusters constructed over the original meteorological data or principal components, and also against a global model. We will see that a combination of the DM model for the low wind power region and the global model elsewhere outperforms the other options.
With partial support of Spain's CICyT, projects TIC 01–572, TIN2004–07676
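The local-models idea, once clusters are available, reduces to fitting one regressor per cluster and routing each test point to its cluster's model. A minimal sketch for 1-D ridge regression with closed form w = sum(x*y) / (sum(x^2) + lam); the cluster labels, the regularizer, and the toy wind data are stand-ins, since in the paper the clusters come from K-means on Diffusion Map features of meteorological variables:

```python
def fit_local_models(xs, ys, labels, lam=1e-3):
    """Fit one 1-D ridge regression slope per cluster, in closed form."""
    models = {}
    for k in set(labels):
        sxy = sum(x * y for x, y, l in zip(xs, ys, labels) if l == k)
        sxx = sum(x * x for x, l in zip(xs, labels) if l == k)
        models[k] = sxy / (sxx + lam)  # ridge solution for a no-intercept 1-D model
    return models

# Two hypothetical wind regimes: low wind (slope ~1) and high wind (slope ~3).
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [1.0, 2.0, 3.0, 30.0, 33.0, 36.0]
labels = [0, 0, 0, 1, 1, 1]
models = fit_local_models(xs, ys, labels)
# models[0] is close to 1.0 and models[1] close to 3.0.
```

The combination the abstract mentions would then use the local model only in the low-power regime and a single global fit elsewhere.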
Convex formulation for multi-task L1-, L2-, and LS-SVMs
Quite often a machine learning problem lends itself to being split into several well-defined subproblems, or tasks. The goal of Multi-Task Learning (MTL) is to leverage the joint learning of the problem from two different perspectives: on the one hand, a single, overall model, and on the other hand, task-specific models. In this way, the solution found by MTL may be better than those of either the common or the task-specific models. Starting with the work of Evgeniou et al., support vector machines (SVMs) have lent themselves naturally to this approach. This paper proposes a convex formulation of MTL for the L1-, L2- and LS-SVM models that results in dual problems quite similar to the single-task ones, but with multi-task kernels; in turn, this makes it possible to train the convex MTL models using standard solvers. As an alternative approach, the direct optimal combination of the already trained common and task-specific models can also be considered. In this paper, a procedure to compute the optimal combining parameter with respect to four different error functions is derived. As shown experimentally, the proposed convex MTL approach generally performs better than the alternative optimal convex combination, and both of them are better than the straight use of either common or task-specific models.
With partial support from Spain's grant TIN2016-76406-P. Work supported also by the UAM–ADIC Chair for Data Science and Machine Learning.
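A multi-task kernel of the general Evgeniou-style shape can be sketched as a shared part plus a task-specific part; this is one common form, not necessarily the paper's exact kernel, and `mu`, the base kernel, and the toy inputs are assumptions:

```python
def multitask_gram(xs, tasks, base_kernel, mu=1.0):
    """Build the multi-task Gram matrix
    K[i][j] = (mu + [t_i == t_j]) * k(x_i, x_j):
    a common component shared by all tasks plus a component active only when
    both patterns belong to the same task; mu trades off common vs task models."""
    n = len(xs)
    return [[(mu + (1.0 if tasks[i] == tasks[j] else 0.0)) * base_kernel(xs[i], xs[j])
             for j in range(n)] for i in range(n)]

dot = lambda a, b: sum(u * v for u, v in zip(a, b))

# Hypothetical patterns: two from task 0, one from task 1.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tasks = [0, 0, 1]
K = multitask_gram(xs, tasks, dot)
```

Feeding `K` to any standard kernel SVM solver is what makes the convex MTL formulation trainable with off-the-shelf machinery, as the abstract notes.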
Companion Losses for Deep Neural Networks
Modern Deep Neural Network backends allow great flexibility in defining network architectures. This allows for multiple outputs with their own specific losses, which can make the networks more suitable for particular goals. In this work we shall explore this possibility for classification networks, combining the categorical cross-entropy loss, typical of softmax probabilistic outputs, the categorical hinge loss, which extends the hinge loss standard in SVMs, and a novel Fisher loss which seeks to concentrate class members near their centroids while keeping these centroids apart.
The authors acknowledge financial support from the European Regional Development Fund and the Spanish State Research Agency of the Ministry of Economy, Industry, and Competitiveness under the project PID2019-106827GB-I00. They also thank the UAM–ADIC Chair for Data Science and Machine Learning and gratefully acknowledge the use of the facilities of Centro de Computación Científica (CCC) at UA
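As a toy illustration of combining two of the mentioned losses on a single sample, the sketch below mixes softmax cross-entropy with the categorical hinge loss through a weight `alpha`. The weighting scheme and the single-output simplification are assumptions; the paper attaches each loss to its own network output rather than summing them on one.

```python
import math

def companion_loss(scores, target, alpha=0.5):
    """Weighted sum of categorical cross-entropy (on a softmax of the raw
    class scores) and the categorical hinge loss, for one sample."""
    # Numerically stable softmax cross-entropy.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    ce = -math.log(exps[target] / z)
    # Categorical hinge: max(0, 1 + best wrong score - true score).
    wrong = max(s for i, s in enumerate(scores) if i != target)
    hinge = max(0.0, 1.0 + wrong - scores[target])
    return alpha * ce + (1.0 - alpha) * hinge

# Hypothetical 3-class scores with a confident, correct prediction.
loss = companion_loss([2.0, 0.5, -1.0], target=0)
```

Here the margin is large enough that the hinge term vanishes and only the cross-entropy contributes, which shows how the two losses penalize different aspects of the same output.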
An accelerated MDM algorithm for SVM training
This is an electronic version of the paper presented at the 16th European Symposium on Artificial Neural Networks, held in Bruges in 2008. In this work we will propose an acceleration procedure for the Mitchell–Demyanov–Malozemov (MDM) algorithm (a fast geometric algorithm for SVM construction) that may yield quite large training savings. While decomposition algorithms such as SVMLight or SMO are usually the SVM methods of choice, we shall show that there is a relationship between SMO and MDM which suggests that, at least in their simplest implementations, they should have similar training speeds. Thus, and although we will not discuss it here, the proposed MDM acceleration might be used as a starting point for new ways of accelerating SMO.
With partial support of Spain's TIN 2004–07676 and TIN 2007–66862 projects. The first author is kindly supported by FPU-MEC grant reference AP2006-02285
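The geometric core of MDM is the nearest-point problem: find the point of a convex hull closest to the origin by repeatedly transferring weight from the worst active point to the best candidate. A minimal sketch of the basic (unaccelerated) iteration follows; the stopping tolerance and toy points are assumptions, and the paper's accelerated variant adds further weight transfers per step.

```python
def mdm_nearest_point(points, iters=200):
    """Basic Mitchell-Demyanov-Malozemov iteration for the point of the
    convex hull of `points` nearest to the origin.  Maintains convex weights
    alpha, with w = sum_i alpha_i * x_i; each step moves weight from the
    active point with largest projection w.x to the point with smallest one,
    using the exact line-search step clipped to keep alpha feasible."""
    n, dim = len(points), len(points[0])
    alpha = [1.0 / n] * n
    dot = lambda a, b: sum(u * v for u, v in zip(a, b))
    for _ in range(iters):
        w = [sum(alpha[i] * points[i][d] for i in range(n)) for d in range(dim)]
        proj = [dot(w, p) for p in points]
        i_max = max((i for i in range(n) if alpha[i] > 0), key=lambda i: proj[i])
        i_min = min(range(n), key=lambda i: proj[i])
        d = [points[i_min][k] - points[i_max][k] for k in range(dim)]
        dd = dot(d, d)
        if dd == 0.0 or proj[i_max] - proj[i_min] < 1e-12:
            break  # all active points project equally: optimality reached
        lam = min(alpha[i_max], (proj[i_max] - proj[i_min]) / dd)
        alpha[i_max] -= lam
        alpha[i_min] += lam
    return [sum(alpha[i] * points[i][d] for i in range(n)) for d in range(dim)]

# The hull's nearest point to the origin lies on the segment between the
# first two points: (2, 0).
w = mdm_nearest_point([[2.0, 1.0], [2.0, -1.0], [3.0, 0.0]])
```

In SVM terms the points are (mapped, label-signed) training patterns and w defines the maximum-margin hyperplane, which is why speeding up this loop translates into SVM training savings.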
- …