Hellinger Distance Trees for Imbalanced Streams

Authors: Brooke J. M., Knowles J. D., Lyon R. J., Stappers B. W.
Publication date: 01/01/2014

Classifiers trained on data sets possessing an imbalanced class distribution are known to exhibit poor generalisation performance. This is known as the imbalanced learning problem. The problem becomes particularly acute when we consider incremental classifiers operating on imbalanced data streams, especially when the learning objective is rare class identification. As accuracy may provide a misleading impression of performance on imbalanced data, existing stream classifiers based on accuracy can suffer poor minority class performance on imbalanced streams, resulting in low minority class recall rates. In this paper we address this deficiency by proposing the use of the Hellinger distance measure as a very fast decision tree split criterion. We demonstrate that by using Hellinger a statistically significant improvement in recall rates on imbalanced data streams can be achieved, with an acceptable increase in the false positive rate.

Comment: 6 pages, 2 figures, to be published in Proceedings 22nd International Conference on Pattern Recognition (ICPR) 201
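As an illustration of the split criterion named in this abstract, here is a minimal sketch of the count-based Hellinger distance for a candidate split in a two-class problem. The function name and the per-branch count inputs are assumptions made for this example; the paper may compute the distance differently (for instance from incrementally maintained per-class feature distributions), so this should not be read as the authors' implementation.

```python
import math


def hellinger_split_score(pos_counts, neg_counts):
    """Hellinger distance between the two class-conditional distributions
    over the branches of a candidate split.

    pos_counts[j] / neg_counts[j]: number of minority / majority class
    examples routed to branch j (hypothetical input format).
    """
    total_pos = sum(pos_counts)
    total_neg = sum(neg_counts)
    if total_pos == 0 or total_neg == 0:
        # One class is absent, so the split cannot separate anything.
        return 0.0
    squared = 0.0
    for p, n in zip(pos_counts, neg_counts):
        # Each count is normalised by its own class total, not the node total.
        squared += (math.sqrt(p / total_pos) - math.sqrt(n / total_neg)) ** 2
    return math.sqrt(squared)
```

Because every branch count is divided by the total for its own class, the score is unchanged when the majority class is scaled up or down, and it ranges from 0 (uninformative split) to the square root of 2 (perfect separation). A tree learner would pick the candidate split with the highest score.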

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal

Edge Hill University Research Information Repository

The University of Manchester - Institutional Repository

Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations

Authors: Cardoso M. Jorge, Li Wenqi, Ourselin Sébastien, Sudre Carole H, Vercauteren Tom
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/07/2017

Deep learning has proved in recent years to be a powerful tool for image analysis and is now widely used to segment both 2D and 3D medical images. Deep-learning segmentation frameworks rely not only on the choice of network architecture but also on the choice of loss function. When the segmentation process targets rare observations, a severe class imbalance is likely to occur between candidate labels, resulting in sub-optimal performance. To mitigate this issue, strategies such as the weighted cross-entropy function, the sensitivity function, or the Dice loss function have been proposed. In this work, we investigate the behavior of these loss functions and their sensitivity to learning rate tuning in the presence of different rates of label imbalance across 2D and 3D segmentation tasks. We also propose to use the class re-balancing properties of the Generalized Dice overlap, a known metric for segmentation assessment, as a robust and accurate deep-learning loss function for unbalanced tasks.
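To make the class re-balancing idea concrete, the following is a minimal pure-Python sketch of a Generalised Dice loss in which each class is weighted by the inverse square of its reference volume. The function name, the list-of-lists input layout, and the `eps` stabiliser are assumptions made for this example; a practical implementation would operate on tensors inside a deep-learning framework.

```python
def generalised_dice_loss(reference, predicted, eps=1e-8):
    """Generalised Dice loss with inverse squared-volume class weights.

    reference[l][n]: one-hot ground truth for class l at pixel n (0 or 1).
    predicted[l][n]: predicted probability for class l at pixel n.
    eps is an assumed stabiliser that avoids division by zero.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_l, pred_l in zip(reference, predicted):
        volume = sum(ref_l)
        # Rare classes (small volume) receive a large weight.
        weight = 1.0 / (volume * volume + eps)
        numerator += weight * sum(r * p for r, p in zip(ref_l, pred_l))
        denominator += weight * sum(r + p for r, p in zip(ref_l, pred_l))
    return 1.0 - 2.0 * numerator / (denominator + eps)
```

With this weighting, a class covering a handful of pixels contributes to the loss on a similar scale to a class covering most of the image. Note that in this sketch a class absent from the reference gets a very large weight through `eps`, which a production implementation would need to handle explicitly.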

arXiv.org e-Print Archive

UCL Discovery

PubMed Central

King's Research Portal

CLINICAL: Targeted Active Learning for Imbalanced Medical Image Classification

Authors: Iyer Rishabh, Iyer Venkat, Kothawade Suraj, Ramakrishnan Ganesh, Savarkar Atharv, Tamil Lakshman
Publication date: 04/10/2022

Training deep learning models that perform well for all classes on medical datasets is a challenging task. Suboptimal performance is often obtained on some classes because of the natural class imbalance that comes with medical data. An effective way to tackle this problem is targeted active learning, where we iteratively add data points that belong to the rare classes to the training data. However, existing active learning methods are ineffective in targeting rare classes in medical datasets. In this work, we propose Clinical (targeted aCtive Learning for ImbalaNced medICal imAge cLassification), a framework that uses submodular mutual information functions as acquisition functions to mine critical data points from rare classes. We apply our framework to a wide array of medical imaging datasets on a variety of real-world class imbalance scenarios, namely binary imbalance and long-tail imbalance. We show that Clinical outperforms the state-of-the-art active learning methods by acquiring a diverse set of data points that belong to the rare classes.

Comment: Accepted to MICCAI 2022 MILLanD Worksho

arXiv.org e-Print Archive

Improving traffic sign recognition by active search

Authors: Gustafsson H., Gustafsson N., Jaghouar S., Mehlig B., Werner E.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/11/2021

We describe an iterative active-learning algorithm to recognise rare traffic signs. A standard ResNet is trained on a training set containing only a single sample of the rare class. We demonstrate that by sorting the samples of a large, unlabeled set by the estimated probability of belonging to the rare class, we can efficiently identify samples from the rare class. This works despite the fact that this estimated probability is usually quite low. A reliable active-learning loop is obtained by labeling these candidate samples, including them in the training set, and iterating the procedure. Further, we show that we get similar results starting from a single synthetic sample. Our results are important as they indicate a straightforward way of improving traffic-sign recognition for automated driving systems. In addition, they show that we can make use of the information hidden in low confidence outputs, which is usually ignored.

Comment: 6 pages, 7 Figure
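The loop described in this abstract (rank the unlabeled pool by estimated rare-class probability, label the top candidates, add them to the training set, repeat) can be sketched as follows. All names here, including the `fit`, `score`, and `ask_label` callables, are hypothetical placeholders for illustration; the paper trains a ResNet, which this sketch does not attempt to reproduce.

```python
def select_candidates(rare_probs, k):
    """Indices of the k pool samples with the highest estimated
    rare-class probability, even if those probabilities are all low."""
    ranked = sorted(range(len(rare_probs)),
                    key=lambda i: rare_probs[i], reverse=True)
    return ranked[:k]


def active_search(train_set, pool, fit, score, ask_label, k, rounds):
    """Iterate: train, rank the pool, label the top k, grow the training set.

    fit(train_set) -> model                 (hypothetical trainer)
    score(model, x) -> rare-class probability (hypothetical scorer)
    ask_label(x) -> label                   (hypothetical human annotator)
    """
    for _ in range(rounds):
        model = fit(train_set)
        probs = [score(model, x) for x in pool]
        picked = select_candidates(probs, k)
        # Label the candidates and move them from the pool to the training set.
        train_set = train_set + [(pool[i], ask_label(pool[i])) for i in picked]
        chosen = set(picked)
        pool = [x for i, x in enumerate(pool) if i not in chosen]
    return train_set, pool
```

The ranking uses only the relative order of the scores, which matches the abstract's observation that the procedure works even when the estimated probabilities are small in absolute terms.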

arXiv.org e-Print Archive

Chalmers Research