Search CORE

16,094 research outputs found

Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval

Author: Li Xuelong
Tang X.
Tao D.
Wu X.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

Relevance feedback schemes based on support vector machines (SVM) have been widely used in content-based image retrieval (CBIR). However, the performance of SVM-based relevance feedback is often poor when the number of labeled positive feedback samples is small. This is mainly due to three reasons: 1) an SVM classifier is unstable on a small-sized training set, 2) SVM's optimal hyperplane may be biased when the positive feedback samples are much less than the negative feedback samples, and 3) overfitting happens because the number of feature dimensions is much higher than the size of the training set. In this paper, we develop a mechanism to overcome these problems. To address the first two problems, we propose an asymmetric bagging-based SVM (AB-SVM). For the third problem, we combine the random subspace method and SVM for relevance feedback, which is named random subspace SVM (RS-SVM). Finally, by integrating AB-SVM and RS-SVM, an asymmetric bagging and random subspace SVM (ABRS-SVM) is built to solve these three problems and further improve the relevance feedback performance

Crossref

OPUS - University of Technology Sydney

Birkbeck Institutional Research Online

A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning

Author: Balcan Maria-Florina
Bellet Aurélien
Garakani Alireza Bagheri
Liang Yingyu
Sha Fei
Publication venue
Publication date: 12/01/2015
Field of study

Learning sparse combinations is a frequent theme in machine learning. In this paper, we study its associated optimization problem in the distributed setting where the elements to be combined are not centrally located but spread over a network. We address the key challenges of balancing communication costs and optimization errors. To this end, we propose a distributed Frank-Wolfe (dFW) algorithm. We obtain theoretical guarantees on the optimization error

\epsilon

and communication cost that do not depend on the total number of combining elements. We further show that the communication cost of dFW is optimal by deriving a lower-bound on the communication cost required to construct an

\epsilon

-approximate solution. We validate our theoretical analysis with empirical studies on synthetic and real-world data, which demonstrate that dFW outperforms both baselines and competing methods. We also study the performance of dFW when the conditions of our analysis are relaxed, and show that dFW is fairly robust.Comment: Extended version of the SIAM Data Mining 2015 pape

arXiv.org e-Print Archive

Crossref