Improved Multi-Class Cost-Sensitive Boosting via Estimation of the Minimum-Risk Class
We present a simple unified framework for multi-class cost-sensitive boosting. The minimum-risk class is estimated directly, rather than via an approximation of the posterior distribution. Our method jointly optimizes binary weak learners and their corresponding output vectors, requiring classes to share features at each iteration. By training in a cost-sensitive manner, weak learners are invested in separating the classes whose discrimination is important, at the expense of less relevant classification boundaries. Additional contributions are a family of loss functions, a proof that our algorithm is boostable in the theoretical sense, and an efficient procedure for growing decision trees for use as weak learners. We evaluate our method on a variety of datasets: a collection of synthetic planar data, common UCI datasets, MNIST digits, SUN scenes, and CUB-200 birds. Results show state-of-the-art performance across all datasets against several strong baselines, including non-boosting multi-class approaches.
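As a concrete illustration of the decision rule being targeted, the sketch below contrasts minimum-risk prediction with plain argmax prediction under a user-supplied cost matrix. Note that the method above estimates the minimum-risk class directly rather than through posteriors; posteriors appear here only to make the rule explicit, and all names are ours, not the paper's API.

```python
import numpy as np

def min_risk_predict(posteriors, C):
    """Pick the class that minimizes expected cost.

    posteriors: (n, K) class-probability estimates per example.
    C:          (K, K) cost matrix; C[y, k] is the cost of predicting
                class k when the true class is y (zero diagonal).
    """
    risk = posteriors @ C            # (n, K): expected cost of each prediction
    return np.argmin(risk, axis=1)

# Toy example: 3 classes, where mistaking class 2 for class 0 is expensive.
C = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [5.0, 1.0, 0.0]])
p = np.array([[0.40, 0.25, 0.35]])  # one ambiguous example
print(min_risk_predict(p, C))       # -> [2]; plain argmax would pick 0
```

With a uniform 0-1 cost matrix the rule reduces to ordinary argmax prediction; it is the cost matrix that redirects effort toward the class boundaries that matter.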
The Recurrent Temporal Discriminative Restricted Boltzmann Machines
Classification of sequence data is a topic of interest for both dynamic Bayesian models and Recurrent Neural Networks (RNNs). While the former can explicitly model the temporal dependencies between class variables, the latter are capable of learning representations. Several attempts have been made to improve performance by combining these two approaches or by increasing the processing capability of the hidden units in RNNs, which often results in complex models with a large number of learning parameters. In this paper, a compact model is proposed which offers both representation learning and temporal inference of class variables by rolling Restricted Boltzmann Machines (RBMs) and class variables over time. We address the key issue of intractability in this variant of RBMs by optimising a conditional distribution instead of a joint distribution. Experiments reported in the paper on melody modelling and optical character recognition show that the proposed model can outperform the state of the art. The experimental results on optical character recognition, part-of-speech tagging and text chunking further demonstrate that our model is comparable to recurrent neural networks with complex memory gates while requiring far fewer parameters.
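The conditional-versus-joint trick mentioned above is easiest to see in the static case: in a discriminative RBM the hidden units can be summed out analytically, so p(y|x) has a closed form that can be optimised directly. A minimal sketch, with weight names of our choosing; the recurrent rolling of RBMs and class variables over time is omitted.

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)      # log(1 + exp(z)), numerically stable

def label_conditional(x, W, U, c, d):
    """Exact p(y|x) for a discriminative RBM.

    x: (n_vis,) visible input          W: (n_hid, n_vis) input weights
    U: (n_hid, n_cls) label weights    c: (n_hid,) hidden biases
    d: (n_cls,) label biases
    """
    pre = c + W @ x                                      # (n_hid,)
    # Hidden units summed out analytically: log p(y|x) up to a constant.
    log_p = d + softplus(pre[:, None] + U).sum(axis=0)   # (n_cls,)
    log_p -= log_p.max()                                 # numerical stability
    p = np.exp(log_p)
    return p / p.sum()

rng = np.random.default_rng(0)
n_vis, n_hid, n_cls = 8, 16, 3
p = label_conditional(rng.random(n_vis),
                      rng.normal(size=(n_hid, n_vis)),
                      rng.normal(size=(n_hid, n_cls)),
                      np.zeros(n_hid), np.zeros(n_cls))
print(p, p.sum())   # a valid distribution over the 3 classes
```

Because p(y|x) is exact, its log can be maximized by plain gradient ascent; it is the joint distribution whose partition function, summing over all visible configurations, is intractable.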
A simple multi-class boosting framework with theoretical guarantees and empirical proficiency
There is a need for simple yet accurate white-box learning systems that train quickly and with little data. To this end, we showcase REBEL, a multi-class boosting method, and present a novel family of weak learners called localized similarities. Our framework provably minimizes the training error of any dataset at an exponential rate. We carry out experiments on a variety of synthetic and real datasets, demonstrating a consistent tendency to avoid overfitting. We evaluate our method on MNIST and standard UCI datasets against other state-of-the-art methods, showing its empirical proficiency.
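The abstract does not define localized similarities, so the sketch below is only a plausible reading with hypothetical names: a binary weak learner that scores inputs by similarity to a stored prototype and abstains outside a local neighborhood. Abstaining (zero-output) learners are standard in confidence-rated boosting, which is what makes this reading plausible; it is not the paper's definition.

```python
import numpy as np

class LocalizedSimilarity:
    """Hypothetical 'localized similarity' weak learner (our guess,
    not REBEL's definition): +1 if the input is very close to a stored
    prototype, -1 if merely nearby, 0 (abstain) outside a local ball.
    """
    def __init__(self, prototype, radius, threshold):
        self.prototype = prototype   # reference point, e.g. a training example
        self.radius = radius         # extent of the localized region
        self.threshold = threshold   # similarity cutoff inside the region

    def predict(self, X):
        d = np.linalg.norm(X - self.prototype, axis=1)
        votes = np.where(d <= self.threshold, 1.0, -1.0)
        return np.where(d <= self.radius, votes, 0.0)

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.8, 0.0], [3.0, 0.0]])
h = LocalizedSimilarity(prototype=np.array([0.0, 0.0]),
                        radius=1.0, threshold=0.5)
print(h.predict(X))   # -> [ 1.  1. -1.  0.]
```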