545 research outputs found
Supervised Classification and Mathematical Optimization
Data Mining techniques often ask for the resolution of optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers or dealing with vagueness/noise in the data.
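As an illustration of why Support Vector Machines are a natural example of an optimization problem, the standard soft-margin SVM can be written as the quadratic program below. This is the textbook formulation, not one quoted from the abstract; the symbols $w$, $b$, $\xi_i$ and the trade-off constant $C$ are the usual notation and are assumptions here.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i, \qquad i = 1,\dots,n, \\
& \xi_i \ge 0, \qquad i = 1,\dots,n.
\end{aligned}
```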
Supervised classification and mathematical optimization
Data Mining techniques often ask for the resolution of optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers or dealing with vagueness/noise in the data.
Ministerio de Ciencia e Innovación; Junta de Andalucí
HawkEye: Advancing Robust Regression with Bounded, Smooth, and Insensitive Loss Function
Support vector regression (SVR) has garnered significant popularity over the
past two decades owing to its wide range of applications across various fields.
Despite its versatility, SVR encounters challenges when confronted with
outliers and noise, primarily due to the use of the ε-insensitive
loss function. To address this limitation, SVR with bounded loss functions has
emerged as an appealing alternative, offering enhanced generalization
performance and robustness. Notably, recent developments focus on designing
bounded loss functions with smooth characteristics, facilitating the adoption
of gradient-based optimization algorithms. However, it's crucial to highlight
that these bounded and smooth loss functions do not possess an insensitive
zone. In this paper, we address the aforementioned constraints by introducing a
novel symmetric loss function named the HawkEye loss function. It is worth
noting that the HawkEye loss function stands out as the first loss function in
SVR literature to be bounded, smooth, and simultaneously possess an insensitive
zone. Leveraging this breakthrough, we integrate the HawkEye loss function into
the least squares framework of SVR and yield a new fast and robust model termed
HE-LSSVR. The optimization problem inherent to HE-LSSVR is addressed by
harnessing the adaptive moment estimation (Adam) algorithm, known for its
adaptive learning rate and efficacy in handling large-scale problems. To our
knowledge, this is the first time Adam has been employed to solve an SVR
problem. To empirically validate the proposed HE-LSSVR model, we evaluate it on
UCI, synthetic, and time series datasets. The experimental outcomes
unequivocally reveal the superiority of the HE-LSSVR model both in terms of its
remarkable generalization performance and its efficiency in training time.
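The sensitivity to outliers mentioned in this abstract can be seen from the standard ε-insensitive loss itself: it is zero inside a tube of half-width ε and grows linearly, without bound, outside it. The sketch below shows only that standard loss. The HawkEye loss is not reproduced because the abstract does not give its formula, and the function name and the sample residuals are illustrative assumptions.

```python
def eps_insensitive(residual: float, eps: float = 0.1) -> float:
    """Standard epsilon-insensitive loss: zero inside the tube, linear outside."""
    return max(0.0, abs(residual) - eps)


# Hypothetical residuals: two inside the tube, one moderate error, one outlier.
residuals = [0.05, -0.08, 0.5, 10.0]
losses = [eps_insensitive(r) for r in residuals]

# The outlier dominates the total loss because the loss is unbounded.
outlier_share = losses[-1] / sum(losses)
```

A bounded loss would cap the contribution of the last residual, which is the motivation the abstract gives for bounded alternatives.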
Heuristic approaches for support vector machines with the ramp loss
Recently, Support Vector Machines with the ramp loss (RLM) have attracted attention from the computational point of view. In this technical note, we propose two heuristics, the first one based on solving the continuous relaxation of a Mixed Integer Nonlinear formulation of the RLM and the second one based on the training of an SVM classifier on a reduced dataset identified by an integer linear problem. Our computational results illustrate the ability of our heuristics to handle datasets of much larger size than those previously addressed in the literature.
Ministerio de Economía y Competitividad; Junta de Andalucía; European Regional Development Fund
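For context, the ramp loss can be read as a hinge loss truncated at a fixed cap, which is what limits the influence of badly misclassified points. A minimal sketch follows, assuming a cap of 2 (a common choice, not stated in the abstract); the function names are illustrative and the paper's heuristics are not reproduced here.

```python
def hinge(margin: float) -> float:
    """Hinge loss on the signed margin y * f(x); unbounded for large negative margins."""
    return max(0.0, 1.0 - margin)


def ramp(margin: float, cap: float = 2.0) -> float:
    """Ramp loss: hinge loss truncated at `cap` (cap = 2 is an assumption)."""
    return min(cap, hinge(margin))


# Hypothetical margins: well classified, inside the margin, badly misclassified.
margins = [1.5, 0.5, -10.0]
hinge_values = [hinge(m) for m in margins]
ramp_values = [ramp(m) for m in margins]
```

The two losses agree until the hinge value reaches the cap; beyond that point the ramp loss stays constant, which makes the training problem non-convex.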
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.
Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector Machine Applications'
A sequential dual method for the structured ramp loss minimization
The paper presents a sequential dual method for the non-convex structured ramp loss minimization. The method uses the concave-convex procedure, which iteratively transforms a non-convex problem into a series of convex ones. The sequential minimal optimization is used to deal with the convex optimization by sequentially traversing through the data and optimizing parameters associated with the incrementally built set of active structures inside each of the training examples. The paper includes the results on two sequence labeling problems, shallow parsing and part-of-speech tagging, and also presents the results on artificial data when the method is exposed to outliers. The comparison with a primal sub-gradient method with the structured ramp and hinge loss is also presented.
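The concave-convex procedure described above can be sketched on a much simpler problem than the structured one in the paper. The example below assumes a scalar, bias-free binary classifier f(x) = w·x, writes the ramp loss as hinge(1 − z) − hinge(s − z) with s = −1, and replaces the concave part by its tangent at each iteration. The convex subproblem is solved with a ternary search, which stands in for the sequential minimal optimization used in the paper; the data, names, and parameter values are all hypothetical.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, int]  # (feature x, label y in {-1, +1})


def ramp_objective(w: float, data: List[Sample], lam: float, s: float) -> float:
    """Non-convex objective: L2 penalty plus ramp loss hinge(1 - z) - hinge(s - z)."""
    total = 0.5 * lam * w * w
    for x, y in data:
        z = y * w * x
        total += max(0.0, 1.0 - z) - max(0.0, s - z)
    return total


def convex_surrogate(w: float, data: List[Sample], betas: List[int], lam: float) -> float:
    """Convex upper bound: the concave part is replaced by its tangent beta * z."""
    total = 0.5 * lam * w * w
    for (x, y), beta in zip(data, betas):
        z = y * w * x
        total += max(0.0, 1.0 - z) + beta * z
    return total


def minimize_convex(f: Callable[[float], float], lo: float = -10.0,
                    hi: float = 10.0, iters: int = 200) -> float:
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def cccp(data: List[Sample], lam: float = 0.1, s: float = -1.0,
         max_iter: int = 20) -> Tuple[float, List[int]]:
    """Alternate between linearizing the concave part and solving the convex problem."""
    w = 0.0
    betas = None
    for _ in range(max_iter):
        # Slope of the concave part -hinge(s - z) at the current margins.
        new_betas = [1 if y * w * x < s else 0 for x, y in data]
        if new_betas == betas:
            break  # linearization unchanged, so the iterate is a fixed point
        betas = new_betas
        w = minimize_convex(lambda v: convex_surrogate(v, data, betas, lam))
    return w, betas


# Hypothetical toy data: six clean points and one mislabelled outlier at x = 4.
DATA = [(-3.0, -1), (-2.0, -1), (-1.5, -1), (1.5, 1), (2.0, 1), (3.0, 1), (4.0, -1)]
w_final, betas_final = cccp(DATA)
```

In this toy run the first convex problem is an ordinary hinge-loss fit; once the outlier's margin falls below s, its tangent term cancels its hinge term and the next fit effectively ignores it.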