4,011 research outputs found
Multi-Label Classifier Chains for Bird Sound
Bird sound data collected with unattended microphones for automatic surveys,
or mobile devices for citizen science, typically contain multiple
simultaneously vocalizing birds of different species. However, few works have
considered the multi-label structure in birdsong. We propose to use an ensemble
of classifier chains combined with a histogram-of-segments representation for
multi-label classification of birdsong. The proposed method is compared with
binary relevance and three multi-instance multi-label learning (MIML)
algorithms from prior work (which focus more on structure in the sound, and
less on structure in the label sets). Experiments are conducted on two
real-world birdsong datasets, and show that the proposed method usually
outperforms binary relevance (using the same features and base-classifier), and
is better in some cases and worse in others compared to the MIML algorithms.Comment: 6 pages, 1 figure, submission to ICML 2013 workshop on bioacoustics.
Note: this is a minor revision- the blind submission format has been replaced
with one that shows author names, and a few corrections have been mad
Locally Non-linear Embeddings for Extreme Multi-label Learning
The objective in extreme multi-label learning is to train a classifier that
can automatically tag a novel data point with the most relevant subset of
labels from an extremely large label set. Embedding based approaches make
training and prediction tractable by assuming that the training label matrix is
low-rank and hence the effective number of labels can be reduced by projecting
the high dimensional label vectors onto a low dimensional linear subspace.
Still, leading embedding approaches have been unable to deliver high prediction
accuracies or scale to large problems as the low rank assumption is violated in
most real world applications.
This paper develops the X-One classifier to address both limitations. The
main technical contribution in X-One is a formulation for learning a small
ensemble of local distance preserving embeddings which can accurately predict
infrequently occurring (tail) labels. This allows X-One to break free of the
traditional low-rank assumption and boost classification accuracy by learning
embeddings which preserve pairwise distances between only the nearest label
vectors.
We conducted extensive experiments on several real-world as well as benchmark
data sets and compared our method against state-of-the-art methods for extreme
multi-label classification. Experiments reveal that X-One can make
significantly more accurate predictions then the state-of-the-art methods
including both embeddings (by as much as 35%) as well as trees (by as much as
6%). X-One can also scale efficiently to data sets with a million labels which
are beyond the pale of leading embedding methods
CIFAR-10: KNN-based Ensemble of Classifiers
In this paper, we study the performance of different classifiers on the
CIFAR-10 dataset, and build an ensemble of classifiers to reach a better
performance. We show that, on CIFAR-10, K-Nearest Neighbors (KNN) and
Convolutional Neural Network (CNN), on some classes, are mutually exclusive,
thus yield in higher accuracy when combined. We reduce KNN overfitting using
Principal Component Analysis (PCA), and ensemble it with a CNN to increase its
accuracy. Our approach improves our best CNN model from 93.33% to 94.03%
- …