Learning from data with uncertainty: Robust multiclass kernel-based classifiers and regressors.
Motivated by the presence of uncertainty in real data, in this research we investigate a robust optimization approach applied to multiclass support vector machines (SVMs) and support vector regression. Two new kernel-based methods are developed to address data with uncertainty, where each data point lies inside a sphere of uncertainty. For classification problems, the models are called robust SVM (R-SVM) and robust feasibility approach (R-FA), respectively, as extensions of the SVM approach. The two models are compared in terms of robustness and generalization error. For comparison purposes, the robust minimax probability machine (MPM) is applied and compared with the above methods. From the empirical results, we conclude that the R-SVM performs better than the robust MPM. For regression problems, the models are called robust support vector regression (R-SVR) and robust feasibility approach for regression (R-FAR). The proposed robust methods can improve the mean square error (MSE) in regression problems.
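The abstract does not reproduce the R-SVM formulation, but the core idea it describes (each point lies in a sphere of uncertainty) has a standard worst-case form: a perturbation of radius r can shift the functional margin by at most r·‖w‖. A minimal numpy sketch of that robust hinge loss, with the function name and toy data being illustrative assumptions, not the paper's notation:

```python
import numpy as np

def robust_hinge_loss(w, b, X, y, r):
    """Hinge loss when every point x_i may lie anywhere in a Euclidean
    sphere of radius r around its nominal value.  The worst-case
    perturbation shifts the margin by r * ||w||, so the robust
    constraint becomes  y_i * (w . x_i + b) >= 1 + r * ||w||.
    """
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins + r * np.linalg.norm(w))

# Toy check: a point stays at zero loss only if its whole uncertainty
# sphere clears the margin; a closer point picks up extra loss r*||w||.
w, b = np.array([1.0, 0.0]), 0.0
X = np.array([[2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, 1.0])
print(robust_hinge_loss(w, b, X, y, r=0.5))  # -> [0. 1.]
```

Setting r = 0 recovers the ordinary soft-margin hinge loss, which is one way to see R-SVM as an extension of the SVM approach.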
Multiclass optimal classification trees with SVM‑splits
In this paper we present a novel mathematical optimization-based methodology to construct tree-shaped classification rules for multiclass instances. Our approach consists of building Classification Trees in which, except for the leaf nodes, the labels are temporarily left out and grouped into two classes by means of an SVM separating hyperplane. We provide a Mixed Integer Nonlinear Programming formulation for the problem and report the results of an extended battery of computational experiments to assess the performance of our proposal with respect to other benchmark classification methods.
Funding: Universidad de Sevilla/CBUA; Spanish Ministerio de Ciencia y Tecnología, Agencia Estatal de Investigación, and Fondos Europeos de Desarrollo Regional (FEDER) via project PID2020-114594GB-C21; Junta de Andalucía projects FEDER-US-1256951, P18-FR-1422, CEI-3-FQM331, B-FQM-322-UGR20, AT 21_00032; Fundación BBVA through project NetmeetData: Big Data 2019; UE-NextGenerationEU (mobility grants for the requalification of university teaching staff); IMAG-María de Maeztu grant CEX2020-001105-M / AEI / 10.13039/50110001103
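The paper solves an exact mixed-integer nonlinear program; as a rough illustration of what happens at one internal node, the sketch below (my own simplification, not the authors' method) relabels the multiclass data into two temporary super-classes and fits a soft-margin separating hyperplane by plain subgradient descent:

```python
import numpy as np

def svm_split(X, y, grouping, epochs=200, lr=0.1, C=1.0, seed=0):
    """One SVM-split at an internal tree node.

    `grouping` maps each original class label to 0 or 1 -- the temporary
    two-class relabelling described in the abstract.  The hyperplane is
    fit by subgradient descent on the soft-margin SVM objective
    0.5*||w||^2 + (C/n) * sum_i hinge_i  (a heuristic stand-in for the
    paper's exact optimization).
    """
    z = np.where(np.vectorize(grouping.get)(y) == 1, 1.0, -1.0)
    rng = np.random.default_rng(seed)
    w, b = rng.normal(size=X.shape[1]) * 0.01, 0.0
    for _ in range(epochs):
        margins = z * (X @ w + b)
        mask = margins < 1.0                     # margin violators
        gw = w - C * (X[mask].T @ z[mask]) / len(X)
        gb = -C * z[mask].sum() / len(X)
        w, b = w - lr * gw, b - lr * gb
    return w, b

# Three classes on a line; this node groups class 2 against classes {0, 1}.
X = np.array([[0.0], [1.0], [4.0], [5.0]])
y = np.array([0, 1, 2, 2])
w, b = svm_split(X, y, grouping={0: 0, 1: 0, 2: 1})
side = np.sign(X @ w + b)   # the two class-2 points land on the +1 side
```

Each child node would then recurse on the points falling on its side of the hyperplane, with the true labels restored only at the leaves.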
Supervised classification and mathematical optimization
Data Mining techniques often ask for the resolution of optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely
useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers, or dealing with vagueness/noise in the data.
Funding: Ministerio de Ciencia e Innovación; Junta de Andalucía
Second order cone programming approaches for handling missing and uncertain data
We propose a novel second order cone programming formulation for designing robust classifiers which can handle uncertainty in observations. Similar formulations are also derived for designing regression functions which are robust to uncertainties in the regression setting. The proposed formulations are independent of the underlying distribution, requiring only the existence of second order moments. These formulations are then specialized to the case of missing values in observations for both classification and regression problems. Experiments show that the proposed formulations outperform imputation.
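The abstract does not spell the constraint out, but distribution-free formulations of this kind typically rest on a multivariate Chebyshev bound: if a class is known only through its mean and covariance, correct classification with probability at least eta is guaranteed whenever y(wᵀμ + b) ≥ κ‖Σ^{1/2}w‖ with κ = sqrt(eta/(1 − eta)), a second order cone constraint. A small numpy sketch of checking that constraint (the function name and toy moments are illustrative assumptions):

```python
import numpy as np

def chebyshev_robust_margin(w, b, mu, Sigma, y, eta=0.9):
    """Slack in the distribution-free (Chebyshev) classification constraint

        y * (w . mu + b) >= kappa * || Sigma^{1/2} w ||,
        kappa = sqrt(eta / (1 - eta)).

    A non-negative return value means every distribution with mean `mu`
    and covariance `Sigma` puts probability at least `eta` on the
    correct side of the hyperplane.
    """
    kappa = np.sqrt(eta / (1.0 - eta))
    spread = np.sqrt(w @ Sigma @ w)       # equals ||Sigma^{1/2} w||
    return y * (w @ mu + b) - kappa * spread

w, b = np.array([1.0, 0.0]), -1.0
mu = np.array([5.0, 0.0])                 # class-conditional mean
Sigma = np.diag([0.25, 0.25])             # class-conditional covariance
print(chebyshev_robust_margin(w, b, mu, Sigma, y=1.0) >= 0)  # -> True
```

Because only the first two moments enter, missing values can be handled by plugging in moment estimates rather than imputed point values, which is the specialization the abstract refers to.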
- …