Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning
Semi-supervised learning, i.e. jointly learning from labeled and unlabeled
samples, is an active research topic due to its key role in relaxing human
supervision. In the context of image classification, recent advances in
learning from unlabeled samples mainly focus on consistency regularization
methods that encourage invariant predictions for different perturbations of
unlabeled samples. We, conversely, propose to learn from unlabeled data by
generating soft pseudo-labels from the network predictions. We show that naive
pseudo-labeling overfits to incorrect pseudo-labels due to the so-called
confirmation bias, and demonstrate that mixup augmentation and setting a
minimum number of labeled samples per mini-batch are effective regularization
techniques for reducing it. The proposed approach achieves state-of-the-art
results on CIFAR-10/100, SVHN, and Mini-ImageNet despite being much simpler
than other methods. These results demonstrate that pseudo-labeling alone can
outperform consistency regularization methods, whereas previous work assumed
the opposite. Source code is available at https://git.io/fjQsC
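The two regularizers the abstract names — soft pseudo-labels from network predictions, and mixup — can be sketched in a few lines of numpy. This is an illustrative sketch under stated assumptions, not the paper's implementation; the helper names and the Beta(alpha, alpha) mixing coefficient are standard-practice choices, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_pseudo_labels(logits):
    # Softmax over the network's logits yields soft pseudo-labels for
    # unlabeled samples (hypothetical helper, not the paper's code).
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def mixup(x1, y1, x2, y2, alpha=1.0):
    # Mixup: train on convex combinations of inputs and (soft) labels,
    # which regularizes against overfitting to incorrect pseudo-labels.
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

logits = np.array([[2.0, 0.5, -1.0]])
pl = soft_pseudo_labels(logits)  # each row sums to 1
x_mix, y_mix = mixup(np.ones(4), pl[0], np.zeros(4), np.array([0.0, 0.0, 1.0]))
```

Because both inputs to `mixup` carry distributions that sum to 1, the mixed target `y_mix` is again a valid soft label.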
Training from a Better Start Point: Active Self-Semi-Supervised Learning for Few Labeled Samples
Training with fewer annotations is a key issue for applying deep models to
various practical domains. To date, semi-supervised learning has achieved great
success in training with few annotations. However, confirmation bias increases
dramatically as the number of annotations decreases, making it difficult to
continue reducing the annotation count. Based on the observation that the
quality of pseudo-labels early in semi-supervised training plays an important
role in mitigating confirmation bias, in this paper we propose an active
self-semi-supervised learning (AS3L) framework. AS3L bootstraps semi-supervised
models with prior pseudo-labels (PPL), where PPL is obtained by label
propagation over self-supervised features. We illustrate that the accuracy of
PPL is not only affected by the quality of features, but also by the selection
of the labeled samples. We develop active learning and label propagation
strategies to obtain better PPL. Consequently, our framework can significantly
improve the performance of models in the case of few annotations while reducing
the training time. Experiments on four semi-supervised learning benchmarks
demonstrate the effectiveness of the proposed methods. Our method outperforms
the baseline by an average of 7% on the four datasets, and exceeds its
accuracy while taking about one third of the training time.
Comment: 12 pages, 8 figures
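The prior pseudo-labels (PPL) described above come from label propagation over self-supervised features. A minimal numpy sketch of graph-based label propagation follows; the cosine-similarity affinities and the iterative diffusion update are illustrative assumptions, not AS3L's exact construction.

```python
import numpy as np

def propagate_labels(features, labels, alpha=0.9, iters=50):
    """Spread a few known labels over a feature-similarity graph.
    labels: (n, c) array with one-hot rows for labeled samples and
    all-zero rows for unlabeled ones (illustrative convention)."""
    # Cosine similarity as the affinity (a common, assumed choice).
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    W = np.clip(f @ f.T, 0, None)
    np.fill_diagonal(W, 0)                    # no self-loops
    S = W / (W.sum(axis=1, keepdims=True) + 1e-12)  # row-normalize
    F = labels.astype(float).copy()
    for _ in range(iters):
        # Diffuse neighbors' labels, anchored to the known labels.
        F = alpha * (S @ F) + (1 - alpha) * labels
    return F

# Two tight clusters in feature space, one labeled point per class.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
Y = np.array([[1, 0], [0, 0], [0, 1], [0, 0]], dtype=float)
ppl = propagate_labels(X, Y)
```

The unlabeled points (rows 1 and 3) inherit the label of the cluster they sit in, which is the behavior the abstract relies on when it says PPL quality depends on both the features and which samples are labeled.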
How To Overcome Confirmation Bias in Semi-Supervised Image Classification By Active Learning
Do we need active learning? The rise of strong deep semi-supervised methods
raises doubt about the usability of active learning in limited labeled data
settings. This is caused by results showing that combining semi-supervised
learning (SSL) methods with a random selection for labeling can outperform
existing active learning (AL) techniques. However, these results are obtained
from experiments on well-established benchmark datasets, which can overestimate
their external validity. Moreover, the literature lacks sufficient research on
the performance of active semi-supervised learning methods in realistic data
scenarios, leaving a notable gap in our understanding. We therefore present
three data challenges common in real-world applications: between-class
imbalance, within-class imbalance, and between-class similarity. These
challenges can hurt SSL performance due to confirmation bias. We conduct
experiments with SSL and AL on simulated data challenges and find that random
sampling does not mitigate confirmation bias and, in some cases, leads to worse
performance than supervised learning. In contrast, we demonstrate that AL can
overcome confirmation bias in SSL in these realistic settings. Our results
provide insights into the potential of combining active and semi-supervised
learning in the presence of common real-world challenges, which is a promising
direction for robust methods when learning with limited labeled data in
real-world applications.
Comment: Accepted @ ECML PKDD 2023. This is the author's version of the work.
The definitive Version of Record will be published in the Proceedings of ECML
PKDD 2023.
On the Importance of Calibration in Semi-supervised Learning
State-of-the-art (SOTA) semi-supervised learning (SSL) methods have been
highly successful in leveraging a mix of labeled and unlabeled data by
combining techniques of consistency regularization and pseudo-labeling. During
pseudo-labeling, the model's predictions on unlabeled data are used for
training and thus, model calibration is important in mitigating confirmation
bias. Yet, many SOTA methods are optimized for model performance, with little
focus directed to improve model calibration. In this work, we empirically
demonstrate that model calibration is strongly correlated with model
performance and propose to improve calibration via approximate Bayesian
techniques. We introduce a family of new SSL models that optimizes for
calibration and demonstrate their effectiveness across standard vision
benchmarks of CIFAR-10, CIFAR-100 and ImageNet, giving up to 15.9% improvement
in test accuracy. Furthermore, we also demonstrate their effectiveness in
additional realistic and challenging problems, such as class-imbalanced
datasets and in photonics science.
Comment: 24 pages
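Calibration here means the model's confidence matches its accuracy, which matters because pseudo-labels are only as trustworthy as the confidences behind them. The standard way to quantify the mismatch is the expected calibration error (ECE); a small numpy sketch with the usual equal-width binning (an illustrative choice, not this paper's exact metric code):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    # ECE: bin predictions by confidence, then average the per-bin
    # |accuracy - mean confidence| gap, weighted by bin size.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

# An underconfident model: both predictions are correct (accuracy 1.0)
# but mean confidence is below 1.0, so ECE > 0.
probs = np.array([[0.9, 0.1], [0.8, 0.2]])
labels = np.array([0, 0])
e = expected_calibration_error(probs, labels)
```

A perfectly calibrated model would land every bin's mean confidence exactly on its accuracy, driving ECE to zero; the approximate-Bayesian techniques the abstract proposes aim to shrink exactly this gap.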
Boosting Semi-Supervised Learning by bridging high and low-confidence predictions
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL),
where artificial labels are generated for unlabeled data by a trained model,
allowing for the simultaneous training of labeled and unlabeled data in a
supervised setting. However, several studies have identified three main issues
with pseudo-labeling-based approaches. Firstly, these methods heavily rely on
predictions from the trained model, which may not always be accurate, leading
to a confirmation bias problem. Secondly, the trained model may overfit
to easy-to-learn examples while ignoring hard-to-learn ones, resulting in the
"Matthew effect", where the already strong become stronger and the weak
weaker. Thirdly, most low-confidence predictions on unlabeled data are
discarded due to the use of a high threshold, leading to an underutilization of
unlabeled data during training. To address these issues, we propose a new
method called ReFixMatch, which aims to utilize all of the unlabeled data
during training, thus improving the generalizability of the model and
performance on SSL benchmarks. Notably, ReFixMatch achieves 41.05\% top-1
accuracy with 100k labeled examples on ImageNet, outperforming the baseline
FixMatch and current state-of-the-art methods.Comment: Accepted to ICCVW2023 (Workshop on representation learning with very
limited images: the potential of self-, synthetic- and formula-supervision
- …
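The third issue above — discarding low-confidence predictions — comes from confidence thresholding as popularized by FixMatch. A toy numpy sketch of that filtering step (the 0.95 threshold mirrors FixMatch's widely used default; the helper name is hypothetical):

```python
import numpy as np

def pseudo_label_mask(probs, threshold=0.95):
    # FixMatch-style filtering: convert predictions to hard pseudo-labels,
    # but keep only samples whose top confidence clears the threshold.
    # Everything below the threshold is excluded from the unlabeled loss.
    conf = probs.max(axis=1)
    hard = probs.argmax(axis=1)
    mask = conf >= threshold
    return hard, mask

probs = np.array([[0.97, 0.03],   # confident  -> kept
                  [0.60, 0.40],   # uncertain  -> discarded
                  [0.10, 0.90]])  # fairly confident, still below 0.95
hard, mask = pseudo_label_mask(probs)
```

In this toy batch only one of three unlabeled samples contributes to training; ReFixMatch's premise is that the discarded two still carry usable signal.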