Hellinger Distance Trees for Imbalanced Streams
Classifiers trained on data sets possessing an imbalanced class distribution
are known to exhibit poor generalisation performance. This is known as the
imbalanced learning problem. The problem becomes particularly acute when we
consider incremental classifiers operating on imbalanced data streams,
especially when the learning objective is rare class identification. As
accuracy may provide a misleading impression of performance on imbalanced data,
existing stream classifiers based on accuracy can suffer poor minority class
performance on imbalanced streams, with the result being low minority class
recall rates. In this paper we address this deficiency by proposing the use of
the Hellinger distance measure as a very fast decision tree split criterion.
We demonstrate that by using Hellinger a statistically significant improvement
in recall rates on imbalanced data streams can be achieved, with an acceptable
increase in the false positive rate.
Comment: 6 Pages, 2 figures, to be published in Proceedings 22nd International
Conference on Pattern Recognition (ICPR) 201
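To make the split criterion mentioned in this abstract concrete, here is a minimal sketch of the Hellinger distance for a binary split in a two-class problem. It uses the standard batch formulation over per-branch class counts; the streaming estimator used in the paper itself is not given in the abstract and may differ. The function name and its count arguments are illustrative assumptions.

```python
import math


def hellinger_split_value(pos_left, neg_left, pos_right, neg_right):
    """Hellinger distance between the class-conditional branch distributions.

    Each argument is a count of examples of one class (pos = minority,
    neg = majority) that a candidate split sends to one branch.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    pos_total = pos_left + pos_right
    neg_total = neg_left + neg_right
    if pos_total == 0 or neg_total == 0:
        # With only one class present, no split can separate anything.
        return 0.0

    total = 0.0
    for pos_j, neg_j in ((pos_left, neg_left), (pos_right, neg_right)):
        # Compare the fraction of each class that falls in this branch.
        diff = math.sqrt(pos_j / pos_total) - math.sqrt(neg_j / neg_total)
        total += diff * diff
    return math.sqrt(total)


if __name__ == "__main__":
    # Same per-class branch fractions, majority class ten times larger.
    print(hellinger_split_value(8, 100, 2, 900))
    print(hellinger_split_value(8, 1000, 2, 9000))
```

Because each class is normalised by its own total, the value depends only on how each class is distributed across the branches, not on the class ratio. This is why the measure is regarded as insensitive to skew, whereas accuracy-driven criteria tend to favour the majority class.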
Unsupervised Learning with Imbalanced Data via Structure Consolidation Latent Variable Model
Unsupervised learning on imbalanced data is challenging because, when given
imbalanced data, current models are often dominated by the majority category and
ignore the categories with small amounts of data. We develop a latent variable
model that can cope with imbalanced data by dividing the latent space into a
shared space and a private space. Based on Gaussian Process Latent Variable
Models, we propose a new kernel formulation that enables the separation of
the latent space, and derive an efficient variational inference method. The
performance of our model is demonstrated with an imbalanced medical image
dataset.
Comment: ICLR 2016 Workshop
Ferrimagnetism of dilute Ising antiferromagnets
It is shown that nearest-neighbor antiferromagnetic interactions of identical
Ising spins on an imbalanced bipartite lattice and an imbalanced bipartite
hierarchical fractal result in ferrimagnetic order instead of antiferromagnetic
order. On some crystal lattices, dilute Ising antiferromagnets may also become
ferrimagnets due to the imbalanced nature of the magnetic percolation cluster
when it coexists with the percolation cluster of vacancies. As evidenced by the
existing experiments on , such ferrimagnetism is an inherent
property of the bcc lattice, so the thermodynamics of these compounds at low can be
similar to that of an antiferromagnet on an imbalanced hierarchical fractal.
Comment: 6 pages, 4 figures
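The central claim of this abstract can be illustrated at zero temperature with a brute-force sketch: for nearest-neighbor antiferromagnetic Ising spins on a bipartite graph whose two sublattices differ in size, the ground state carries a net moment. The toy graphs and the function name below are assumptions chosen for illustration; they are not the lattices or fractals studied in the paper, and the sketch says nothing about finite-temperature behaviour.

```python
from itertools import product


def ground_state_magnetisations(edges, n_spins):
    """Enumerate all spin configurations of a small antiferromagnetic Ising model.

    Energy is E = sum over edges of s_i * s_j (coupling J = 1), so
    anti-aligned neighbours lower the energy. Returns the minimum energy
    and the set of |total magnetisation| values found among ground states.
    """
    best_energy = None
    magnetisations = set()
    for spins in product((-1, 1), repeat=n_spins):
        energy = sum(spins[i] * spins[j] for i, j in edges)
        if best_energy is None or energy < best_energy:
            best_energy = energy
            magnetisations = {abs(sum(spins))}
        elif energy == best_energy:
            magnetisations.add(abs(sum(spins)))
    return best_energy, magnetisations


# Imbalanced bipartite graph: one central site bonded to three outer sites.
star_edges = [(0, 1), (0, 2), (0, 3)]
# Balanced bipartite graph: a four-site ring with two sites per sublattice.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

if __name__ == "__main__":
    print(ground_state_magnetisations(star_edges, 4))
    print(ground_state_magnetisations(ring_edges, 4))
```

In the star graph the sublattices hold one and three spins, so the fully anti-aligned ground state has an uncompensated moment equal to the difference in sublattice sizes. In the balanced ring the two sublattices cancel exactly.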
Imbalanced Ensemble Classifier for learning from imbalanced business school data set
Private business schools in India face a common problem of selecting quality
students for their MBA programs to achieve the desired placement percentage.
Generally, such data sets are biased towards one class, i.e., imbalanced in
nature, and learning from an imbalanced dataset is difficult.
This paper proposes an imbalanced ensemble classifier which can handle the
imbalanced nature of the dataset and achieves higher accuracy on the combined
feature selection (selection of important characteristics of students) and
classification (prediction of placements based on the students'
characteristics) problem for an Indian business school dataset. The optimal value of an
important model parameter is found. Numerical evidence is also provided using
the Indian business school dataset to assess the outstanding performance of the
proposed classifier.