Deep Anomaly Detection under Labeling Budget Constraints
Selecting informative data points for expert feedback can significantly
improve the performance of anomaly detection (AD) in various contexts, such as
medical diagnostics or fraud detection. In this paper, we determine a set of
theoretical conditions under which anomaly scores generalize from labeled
queries to unlabeled data. Motivated by these results, we propose a data
labeling strategy with optimal data coverage under labeling budget constraints.
In addition, we propose a new learning framework for semi-supervised AD.
Extensive experiments on image, tabular, and video data sets show that our
approach results in state-of-the-art semi-supervised AD performance under
labeling budget constraints. Comment: deep anomaly detection, active learning, semi-supervised learning
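The abstract's idea of "optimal data coverage under labeling budget constraints" can be illustrated with a generic coverage-based query strategy. The paper's actual labeling strategy is not given here; this is a minimal sketch of k-center greedy selection, a standard way to pick a budget-sized set of expert queries that covers the data. All names are illustrative.

```python
import math

def k_center_greedy(points, budget):
    """Pick `budget` points that greedily minimize the maximum distance
    from any point to its nearest selected (queried) point."""
    selected = [0]  # seed with an arbitrary first point
    dist = [math.dist(p, points[0]) for p in points]
    while len(selected) < budget:
        # query the point currently farthest from every selected point
        far = max(range(len(points)), key=lambda i: dist[i])
        selected.append(far)
        for i, p in enumerate(points):
            dist[i] = min(dist[i], math.dist(p, points[far]))
    return selected

# two tight clusters plus an outlier: the budget is spent on coverage
queries = k_center_greedy([(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0)], 3)
# queries covers all three regions: indices 0, 4, and 2
```

The greedy rule gives a 2-approximation to the optimal covering radius, which is why coverage-style objectives pair naturally with a hard labeling budget.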
The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation
In this paper, we introduce a novel learning scheme named weakly
semi-supervised instance segmentation (WSSIS) with point labels for
budget-efficient and high-performance instance segmentation. Namely, we
consider a dataset setting consisting of a few fully-labeled images and a lot
of point-labeled images. Motivated by the observation that the main challenge
of semi-supervised approaches derives from the trade-off between
false-negative and false-positive instance proposals, we propose a method for WSSIS that can
effectively leverage the budget-friendly point labels as a powerful weak
supervision source to resolve the challenge. Furthermore, to deal with the hard
case where the amount of fully-labeled data is extremely limited, we propose a
MaskRefineNet that refines noise in rough masks. We conduct extensive
experiments on COCO and BDD100K datasets, and the proposed method achieves
promising results comparable to those of the fully-supervised model, even with
50% of the fully labeled COCO data (38.8% vs. 39.7%). Moreover, when using as
little as 5% of fully labeled COCO data, our method shows significantly
superior performance over the state-of-the-art semi-supervised learning method
(33.7% vs. 24.9%). The code is available at
https://github.com/clovaai/PointWSSIS. Comment: CVPR 202
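One way point labels act as "a powerful weak supervision source" against the false-positive/false-negative trade-off is to keep only the instance proposals that contain an annotated point. This is a minimal sketch of that idea, not the paper's actual pipeline; the box format, function names, and data are assumptions.

```python
def filter_proposals(proposals, points):
    """proposals: list of (x0, y0, x1, y1, score); points: list of (x, y).
    A proposal survives only if it contains at least one labeled point,
    suppressing false-positive proposals that have no point support."""
    def contains(box, pt):
        x0, y0, x1, y1, _score = box
        x, y = pt
        return x0 <= x <= x1 and y0 <= y <= y1
    return [b for b in proposals if any(contains(b, p) for p in points)]

# one point label at (5, 5): the second proposal has no support and is dropped
kept = filter_proposals(
    [(0, 0, 10, 10, 0.9), (20, 20, 30, 30, 0.8)],
    [(5, 5)],
)
# kept retains only the first proposal
```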
Self-Path: self-supervision for classification of pathology images with limited annotations
While high-resolution pathology images lend themselves well to ‘data hungry’ deep learning algorithms, obtaining exhaustive annotations on these images for learning is a major challenge. In this paper, we propose a self-supervised convolutional neural network (CNN) framework to leverage unlabeled data for learning generalizable and domain-invariant representations in pathology images. Our proposed framework, termed Self-Path, employs multi-task learning where the main task is tissue classification and the pretext tasks are a variety of self-supervised tasks with labels inherent to the input images. We introduce novel pathology-specific self-supervision tasks that leverage contextual, multi-resolution and semantic features in pathology images for semi-supervised learning and domain adaptation. We investigate the effectiveness of Self-Path on 3 different pathology datasets. Our results show that Self-Path with the pathology-specific pretext tasks achieves state-of-the-art performance for semi-supervised learning when small amounts of labeled data are available. Further, we show that Self-Path improves domain adaptation for histopathology image classification when there is no labeled data available for the target domain. This approach can potentially be employed for other applications in computational pathology, where the annotation budget is often limited or a large amount of unlabeled image data is available.
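The phrase "pretext tasks … with labels inherent to the input images" can be made concrete with a generic example: predict which rotation was applied to an image, so the labels come for free from the data itself. This rotation task is a stand-in for the pathology-specific tasks in the paper; the grid representation and names are illustrative.

```python
import random

def rotate90(img, k):
    """Rotate a 2-D grid (list of rows) by k * 90 degrees counter-clockwise."""
    for _ in range(k % 4):
        img = [list(row) for row in zip(*img)][::-1]
    return img

def make_pretext_batch(images, rng):
    """Turn each unlabeled image into a (rotated_image, rotation_label) pair;
    the label is inherent to the transformation, so no annotator is needed."""
    batch = []
    for img in images:
        k = rng.randrange(4)  # pretext label: 0, 1, 2, or 3 quarter-turns
        batch.append((rotate90(img, k), k))
    return batch

pairs = make_pretext_batch([[[1, 2], [3, 4]]], random.Random(0))
```

In a multi-task setup like the one described, the same CNN backbone would feed both the tissue-classification head (on labeled data) and a rotation-prediction head (on these free pretext labels).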
Accuracy versus time frontiers of semi-supervised and self-supervised learning on medical images
For many applications of classifiers to medical images, a trustworthy label
for each image can be difficult or expensive to obtain. In contrast, images
without labels are more readily available. Two major research directions both
promise that additional unlabeled data can improve classifier performance:
self-supervised learning pretrains useful representations on unlabeled data
only, then fine-tunes a classifier on these representations via the labeled
set; semi-supervised learning directly trains a classifier on labeled and
unlabeled data simultaneously. Recent methods from both directions have claimed
significant gains on non-medical tasks, but do not systematically assess
medical images and mostly compare only to methods in the same direction. This
study contributes a carefully-designed benchmark to help answer a
practitioner's key question: given a small labeled dataset and a limited budget
of hours to spend on training, what gains from additional unlabeled images are
possible and which methods best achieve them? Unlike previous benchmarks, ours
uses realistic-sized validation sets to select hyperparameters, assesses
runtime-performance tradeoffs, and bridges two research fields. By comparing 6
semi-supervised methods and 5 self-supervised methods to strong labeled-only
baselines on 3 medical datasets with 30-1000 labels per class, we offer
insights to resource-constrained, results-focused practitioners: MixMatch,
SimCLR, and BYOL represent strong choices that were not surpassed by more
recent methods. After much effort selecting hyperparameters on one dataset, we
publish settings that enable strong methods to perform well on new medical
tasks within a few hours, with further search over dozens of hours delivering
modest additional gains. Comment: Semi-supervised Learning; Self-supervised Learning; Medical Imaging
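The abstract's description of semi-supervised learning, which "directly trains a classifier on labeled and unlabeled data simultaneously", is most easily seen in its simplest form: pseudo-labeling, where confident predictions on unlabeled data are promoted to training labels. This is a minimal sketch under assumed names; the threshold and the toy predictor are illustrative, not from any benchmarked method.

```python
def pseudo_label(predict, unlabeled, threshold=0.95):
    """predict(x) -> (label, confidence).
    Keep only confident predictions as new (input, label) training pairs."""
    extra = []
    for x in unlabeled:
        label, conf = predict(x)
        if conf >= threshold:
            extra.append((x, label))
    return extra

# toy predictor: confident on even inputs, unsure on odd ones
toy = lambda x: (x % 2, 0.99 if x % 2 == 0 else 0.6)
new_pairs = pseudo_label(toy, [1, 2, 3, 4])
# new_pairs == [(2, 0), (4, 0)]
```

Methods like MixMatch, mentioned above, refine this basic loop with consistency regularization and label sharpening, but the labeled-plus-unlabeled training objective is the same.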