
    Hellinger Distance Trees for Imbalanced Streams

    Classifiers trained on data sets with an imbalanced class distribution are known to exhibit poor generalisation performance. This is known as the imbalanced learning problem. The problem becomes particularly acute for incremental classifiers operating on imbalanced data streams, especially when the learning objective is rare class identification. Because accuracy can give a misleading impression of performance on imbalanced data, existing stream classifiers based on accuracy can perform poorly on the minority class, resulting in low minority class recall rates. In this paper we address this deficiency by proposing the use of the Hellinger distance measure as a very fast decision tree split criterion. We demonstrate that using the Hellinger distance yields a statistically significant improvement in recall rates on imbalanced data streams, with an acceptable increase in the false positive rate. Comment: 6 Pages, 2 figures, to be published in Proceedings 22nd International Conference on Pattern Recognition (ICPR) 201
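To make the split criterion concrete, the following is a minimal sketch of how a Hellinger distance score for a candidate binary-class split could be computed from per-branch class counts. It uses the standard definition of the Hellinger distance between the two class-conditional branch distributions; the function name `hellinger_split_value` and its arguments are hypothetical and are not taken from the paper, and the streaming (incremental) bookkeeping is omitted.

```python
import math


def hellinger_split_value(pos_counts, neg_counts):
    """Hellinger distance between the branch distributions of two classes.

    pos_counts[j] / neg_counts[j] are the numbers of minority / majority
    examples routed to branch j by a candidate split (hypothetical inputs).
    Larger values mean the split separates the two classes better.
    """
    total_pos = sum(pos_counts)
    total_neg = sum(neg_counts)
    if total_pos == 0 or total_neg == 0:
        # With only one class present the distance is undefined; treat the
        # split as uninformative (an assumption of this sketch).
        return 0.0
    squared = sum(
        (math.sqrt(p / total_pos) - math.sqrt(n / total_neg)) ** 2
        for p, n in zip(pos_counts, neg_counts)
    )
    return math.sqrt(squared)


# The score depends only on within-class proportions, so it is unchanged
# when the majority class is made 100 times larger.
balanced = hellinger_split_value([8, 2], [2, 8])
skewed = hellinger_split_value([8, 2], [200, 800])
```

Because each class is normalised by its own total, the score does not reward a split simply for following the majority class, which is the property that makes it attractive when recall on the rare class matters.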

    Unsupervised Learning with Imbalanced Data via Structure Consolidation Latent Variable Model

    Unsupervised learning on imbalanced data is challenging because current models are often dominated by the major category and ignore the categories with small amounts of data. We develop a latent variable model that can cope with imbalanced data by dividing the latent space into a shared space and a private space. Based on Gaussian Process Latent Variable Models, we propose a new kernel formulation that enables the separation of the latent space and derive an efficient variational inference method. The performance of our model is demonstrated with an imbalanced medical image dataset. Comment: ICLR 2016 Worksho

    Ferrimagnetism of dilute Ising antiferromagnets

    It is shown that nearest-neighbor antiferromagnetic interactions of identical Ising spins on an imbalanced bipartite lattice and on an imbalanced bipartite hierarchical fractal result in ferrimagnetic order instead of antiferromagnetic order. On some crystal lattices, dilute Ising antiferromagnets may also become ferrimagnets due to the imbalanced nature of the magnetic percolation cluster when it coexists with the percolation cluster of vacancies. As evidenced by the existing experiments on $Fe_pZn_{1-p}F_2$, such ferrimagnetism is an inherent property of the bcc lattice, so the thermodynamics of these compounds at low $p$ can be similar to that of an antiferromagnet on an imbalanced hierarchical fractal. Comment: 6 pages, 4 figure
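The link between sublattice imbalance and a net moment can be illustrated with a small sketch. The lattice used here, a periodic square lattice with one extra site placed on every bond, is chosen only as a convenient example of an imbalanced bipartite lattice and is an assumption of this sketch, not a lattice taken from the paper. With +1 spins on one sublattice and -1 on the other, every nearest-neighbor bond is antiferromagnetically satisfied, yet the unequal sublattice sizes leave a nonzero total magnetization.

```python
def decorated_square_lattice(size):
    """Build a periodic size x size square lattice with a site on every bond.

    Returns (spins, bonds). Vertex sites form sublattice A (spin +1) and
    bond-centre sites form sublattice B (spin -1), i.e. the staggered state.
    All names here are hypothetical and for illustration only.
    """
    spins = {}
    bonds = []
    for i in range(size):
        for j in range(size):
            vertex = ("v", i, j)
            spins[vertex] = +1
            for label, (di, dj) in (("x", (1, 0)), ("y", (0, 1))):
                centre = (label, i, j)
                spins[centre] = -1
                neighbour = ("v", (i + di) % size, (j + dj) % size)
                # Each bond-centre site links the two vertices it sits between.
                bonds.append((vertex, centre))
                bonds.append((centre, neighbour))
    return spins, bonds


spins, bonds = decorated_square_lattice(4)
# Every bond joins opposite spins, so the antiferromagnetic energy is minimal.
all_bonds_satisfied = all(spins[a] * spins[b] == -1 for a, b in bonds)
# Sublattice B is twice as large as A, so the moments do not cancel.
magnetization_per_site = sum(spins.values()) / len(spins)
```

On a balanced bipartite lattice the same staggered state would have zero total magnetization; here the two sublattices contain 16 and 32 sites, so one third of the saturation moment survives.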

    Imbalanced Ensemble Classifier for learning from imbalanced business school data set

    Private business schools in India face a common problem of selecting quality students for their MBA programs to achieve the desired placement percentage. The data sets involved are generally biased towards one class, i.e., imbalanced, and learning from an imbalanced data set is difficult. This paper proposes an imbalanced ensemble classifier that can handle the imbalanced nature of the data set and achieves higher accuracy on the combined feature selection (selection of important characteristics of students) and classification (prediction of placements based on the students' characteristics) problem for an Indian business school data set. The optimal value of an important model parameter is found. Numerical evidence is also provided using the Indian business school data set to assess the outstanding performance of the proposed classifier.