Pruning of Error Correcting Output Codes by optimization of accuracy–diversity trade off
Ensemble learning combines multiple learners to obtain more reliable and accurate predictions in supervised and unsupervised learning. However, ensembles are sometimes unnecessarily large, which leads to additional memory usage, computational overhead, and decreased effectiveness. To overcome these side effects, pruning algorithms have been developed; since pruning is a combinatorial problem, finding the exact optimal sub-ensemble is computationally infeasible. Various heuristic algorithms have been developed to obtain an approximate solution, but they lack theoretical guarantees. Error Correcting Output Codes (ECOC) is a well-known ensemble technique for multiclass classification that combines the outputs of binary base learners to predict the classes of multiclass data. In this paper, we propose a novel approach for pruning the ECOC matrix that uses accuracy and diversity information simultaneously. Existing pruning methods all require the ensemble size as a parameter, so their performance depends on that choice. Our pruning method is novel in being parameter-free: it does not depend on a prespecified ensemble size. Experimental results show that our pruning method mostly outperforms the existing approaches.
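To make the pruning idea concrete, here is a minimal Python sketch of pruning the columns of a random ECOC matrix by a combined accuracy/diversity score. The scoring rule (per-column validation accuracy plus mean pairwise disagreement) and the parameter-free cut at the ensemble-mean score are illustrative assumptions, not the paper's exact criterion.

```python
# Illustrative ECOC pruning: score each column (binary learner) by
# validation accuracy plus disagreement with the other columns, then
# keep the columns that clear the mean score -- no target size needed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def random_code_matrix(n_classes, n_cols):
    """Random {-1,+1} code matrix whose columns all use both signs."""
    cols = []
    while len(cols) < n_cols:
        c = rng.choice([-1, 1], size=n_classes)
        if c.min() < c.max():               # reject one-sign columns
            cols.append(c)
    return np.array(cols).T                 # shape (n_classes, n_cols)

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

n_classes, n_cols = 4, 12
M = random_code_matrix(n_classes, n_cols)

# One binary learner per column, trained on that column's relabelling.
preds = []
for j in range(n_cols):
    clf = LogisticRegression(max_iter=1000).fit(X_tr, M[y_tr, j])
    preds.append(clf.predict(X_val))
P = np.array(preds)                         # (n_cols, n_val), values in {-1,+1}

# Accuracy of each column against its own binary labels.
acc = np.array([(P[j] == M[y_val, j]).mean() for j in range(n_cols)])
# Diversity of each column: mean disagreement with the other columns.
div = np.array([np.mean([(P[j] != P[k]).mean()
                         for k in range(n_cols) if k != j])
                for j in range(n_cols)])

keep = (acc + div) >= (acc + div).mean()    # assumed parameter-free cut
print(f"kept {keep.sum()} of {n_cols} columns")

# Decode by nearest codeword (Hamming distance) over the kept columns.
dists = np.array([[(P[keep, i] != M[c, keep]).sum() for c in range(n_classes)]
                  for i in range(P.shape[1])])
y_hat = dists.argmin(axis=1)
print(f"pruned-ECOC validation accuracy: {(y_hat == y_val).mean():.3f}")
```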
Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision
Feature selection is essential for effective visual recognition. We propose an efficient joint classifier-learning and feature-selection method that discovers sparse, compact representations of input features from a vast sea of candidates, with an almost unsupervised formulation. Our method requires only the following knowledge, which we call the "feature sign": whether or not a particular feature has, on average, stronger values over positive samples than over negative ones. We show how this can be estimated using as few as a single labeled training sample per class. Then, using these feature signs, we extend an initial supervised learning problem into an (almost) unsupervised clustering formulation that can incorporate new data without requiring ground-truth labels. Our method works both as a feature selection mechanism and as a fully competitive classifier. It has important properties: low computational cost and excellent accuracy, especially in difficult cases of very limited training data. We experiment on large-scale recognition in video and show speed and performance superior to established feature selection approaches such as AdaBoost, Lasso, and greedy forward-backward selection, and to powerful classifiers such as SVM.