Linear Maximum Margin Classifier for Learning from Uncertain Data
In this paper, we propose a maximum margin classifier that deals with
uncertainty in data input. More specifically, we reformulate the SVM framework
such that each training example can be modeled by a multi-dimensional Gaussian
distribution described by its mean vector and its covariance matrix -- the
latter modeling the uncertainty. We address the classification problem and
define a cost function that is the expected value of the classical SVM cost
when data samples are drawn from the multi-dimensional Gaussian distributions
that form the set of the training examples. Our formulation approximates the
classical SVM formulation when the training examples are isotropic Gaussians
with variance tending to zero. We arrive at a convex optimization problem,
which we solve efficiently in the primal form using a stochastic gradient
descent approach. The resulting classifier, which we name SVM with Gaussian
Sample Uncertainty (SVM-GSU), is tested on synthetic data and five publicly
available and popular datasets; namely, the MNIST, WDBC, DEAP, TV News Channel
Commercial Detection, and TRECVID MED datasets. Experimental results verify the
effectiveness of the proposed method.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence. (c)
2017 IEEE. DOI: 10.1109/TPAMI.2017.2772235. Author's accepted version. The
final publication is available at
http://ieeexplore.ieee.org/document/8103808
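The idea of taking the expected value of the hinge loss over a Gaussian training example can be illustrated with a minimal sketch. It is not taken from the paper: it relies only on the standard fact that, for x ~ N(mu, Sigma), the quantity 1 - y(w.x + b) is a one-dimensional Gaussian with mean d = 1 - y(w.mu + b) and variance w^T Sigma w, so that E[max(0, .)] has a closed form. The function names `expected_hinge` and `sampled_hinge` are hypothetical, and the sampling check assumes a diagonal covariance for simplicity.

```python
import math
import random


def expected_hinge(w, b, mu, cov, y):
    """Closed-form E[max(0, 1 - y (w.x + b))] for x ~ N(mu, cov)."""
    # Mean of the Gaussian random variable 1 - y (w.x + b).
    d = 1.0 - y * (sum(wi * mi for wi, mi in zip(w, mu)) + b)
    # Its variance is w^T cov w (y is +1 or -1, so y^2 = 1).
    n = len(w)
    var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    s = math.sqrt(max(var, 0.0))
    if s == 0.0:
        # No uncertainty: this is the ordinary hinge loss at the mean.
        return max(0.0, d)
    z = d / s
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return d * cdf + s * pdf


def sampled_hinge(w, b, mu, variances, y, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the same expectation (diagonal covariance only)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, variances)]
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        total += max(0.0, 1.0 - margin)
    return total / n_samples
```

With a zero covariance matrix the function falls back to the ordinary hinge loss at the mean, which mirrors the abstract's remark about the limit of vanishing variance.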
A mathematical programming approach to SVM-based classification with label noise
The authors of this research acknowledge financial support by the Spanish Ministerio de Ciencia y Tecnologia, Agencia Estatal de Investigacion and Fondos
Europeos de Desarrollo Regional (FEDER) via project PID2020114594GB-C21. The authors also acknowledge partial support from projects FEDER-US-1256951,
Junta de Andalucía P18-FR-1422, CEI-3-FQM331, NetmeetData: Ayudas Fundación BBVA a equipos de investigación científica 2019. The first author was
also supported by projects P18-FR-2369 (Junta de Andalucía) and IMAG-Maria de Maeztu grant CEX2020-001105-M /AEI /10.13039/501100011033. (Spanish
Ministerio de Ciencia y Tecnologia).
In this paper we propose novel methodologies to optimally construct Support Vector Machine-based classifiers that take into account that label noise occurs in the training sample. We propose different alternatives based on solving Mixed Integer Linear and Nonlinear models that incorporate decisions on relabeling some of the observations in the training dataset. The first method incorporates relabeling directly in the SVM model, while a second family of methods combines clustering with classification, giving rise to a model that simultaneously applies similarity measures and SVM. Extensive computational experiments are reported on a battery of standard datasets taken from the UCI Machine Learning Repository, showing the effectiveness of the proposed approaches.
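To make the notion of relabeling decisions inside an SVM concrete, one generic way to write such a model is sketched below. This is an illustrative assumption, not the formulation from the paper: the binary variable z_i (1 if observation i is relabeled), the relabeling penalty \lambda, and the trade-off constant C are all hypothetical names introduced here.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,z}\quad
  & \tfrac{1}{2}\lVert w\rVert_2^2 + C\sum_{i=1}^{n}\xi_i + \lambda\sum_{i=1}^{n} z_i \\
\text{s.t.}\quad
  & (1 - 2z_i)\,y_i\,(w^\top x_i + b) \ge 1 - \xi_i, && i = 1,\dots,n, \\
  & \xi_i \ge 0,\quad z_i \in \{0,1\}, && i = 1,\dots,n.
\end{aligned}
```

Setting z_i = 1 flips the sign of y_i in the margin constraint. The product of z_i with (w, b) makes the constraint nonlinear in the decision variables, which is one reason mixed-integer nonlinear models arise in this setting.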
Positive Semidefinite Metric Learning with Boosting
The learning of appropriate distance metrics is a critical problem in image
classification and retrieval. In this work, we propose a boosting-based
technique, termed BoostMetric, for learning a Mahalanobis distance metric. One
of the primary difficulties in learning such a metric is to ensure that the
Mahalanobis matrix remains positive semidefinite. Semidefinite programming is
sometimes used to enforce this constraint, but does not scale well.
BoostMetric is instead based on a key observation that any positive
semidefinite matrix can be decomposed into a linear positive combination of
trace-one rank-one matrices. BoostMetric thus uses rank-one positive
semidefinite matrices as weak learners within an efficient and scalable
boosting-based learning process. The resulting method is easy to implement,
does not require tuning, and can accommodate various types of constraints.
Experiments on various datasets show that the proposed algorithm compares
favorably to state-of-the-art methods in terms of classification accuracy
and running time.
Comment: 11 pages, Twenty-Third Annual Conference on Neural Information
Processing Systems (NIPS 2009), Vancouver, Canada
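The structural observation in this abstract can be illustrated with a minimal sketch. It is not the BoostMetric algorithm: it only shows that a nonnegative combination of trace-one rank-one matrices u u^T (with u a unit vector) is positive semidefinite, and that its trace equals the sum of the weights. The helper names `rank_one_trace_one`, `combine`, and `quad_form` are hypothetical.

```python
import math


def rank_one_trace_one(v):
    """Return u u^T for u = v / ||v||: a rank-one PSD matrix with trace one."""
    norm = math.sqrt(sum(x * x for x in v))
    u = [x / norm for x in v]
    return [[ui * uj for uj in u] for ui in u]


def combine(weights, directions):
    """Accumulate sum_k weights[k] * u_k u_k^T with nonnegative weights."""
    n = len(directions[0])
    M = [[0.0] * n for _ in range(n)]
    for wk, v in zip(weights, directions):
        if wk < 0:
            raise ValueError("weights must be nonnegative to keep M PSD")
        Z = rank_one_trace_one(v)
        for i in range(n):
            for j in range(n):
                M[i][j] += wk * Z[i][j]
    return M


def quad_form(M, x):
    """Return x^T M x, which is nonnegative for every x when M is PSD."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))
```

Because each term is added with a nonnegative weight, the accumulated matrix stays positive semidefinite at every step without any projection or semidefinite solver, which is the property the abstract relies on for scalability.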
A scalable algorithm for learning a Mahalanobis distance metric
A distance metric that can accurately reflect the intrinsic characteristics of data is critical for visual recognition tasks. An effective solution to defining such a metric is to learn it from a set of training samples. In this work, we propose a fast and scalable algorithm to learn a Mahalanobis distance. By employing the principle of margin maximization to secure better generalization performances, this algorithm formulates the metric learning as a convex optimization problem with a positive semidefinite (psd) matrix variable. Based on an important theorem that a psd matrix with trace of one can always be represented as a convex combination of multiple rank-one matrices, our algorithm employs a differentiable loss function and solves the above convex optimization with gradient descent methods. This algorithm not only naturally maintains the psd requirement of the matrix variable that is essential for metric learning, but also significantly cuts down computational overhead, making it much more efficient with the increasing dimensions of feature vectors. Experimental study on benchmark data sets indicates that, compared with the existing metric learning algorithms, our algorithm can achieve higher classification accuracy with much less computational load.
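For readers unfamiliar with the quantity being learned, here is a minimal sketch of the squared Mahalanobis distance and of a margin on a triplet of points. It is not the algorithm from the paper; the function names `mahalanobis_sq` and `triplet_margin` are hypothetical, and the matrix M is assumed to be positive semidefinite.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def triplet_margin(anchor, similar, dissimilar, M):
    """Distance to the dissimilar point minus distance to the similar point.

    A positive value means the metric M ranks the similar point as closer.
    """
    return mahalanobis_sq(anchor, dissimilar, M) - mahalanobis_sq(anchor, similar, M)


identity = [[1.0, 0.0], [0.0, 1.0]]      # plain squared Euclidean distance
first_axis = [[1.0, 0.0], [0.0, 0.0]]    # PSD metric that ignores the second feature

anchor, similar, dissimilar = [0.0, 0.0], [0.0, 3.0], [2.0, 0.0]
print(triplet_margin(anchor, similar, dissimilar, identity))    # negative margin
print(triplet_margin(anchor, similar, dissimilar, first_axis))  # positive margin
```

Changing M changes which point counts as closer, which is why a margin-based objective over such triplets or pairs can be used to choose M.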
Aeronautical Engineering: A special bibliography, supplement 60
This bibliography lists 284 reports, articles, and other documents introduced into the NASA scientific and technical information system in July 1975