Learning with Out-of-Distribution Data for Audio Classification
In supervised machine learning, the assumption that training data is labelled
correctly is not always satisfied. In this paper, we investigate an instance of
labelling error for classification tasks in which the dataset is corrupted with
out-of-distribution (OOD) instances: data that does not belong to any of the
target classes, but is labelled as such. We show that detecting and relabelling
certain OOD instances, rather than discarding them, can have a positive effect
on learning. The proposed method uses an auxiliary classifier, trained on data
that is known to be in-distribution, for detection and relabelling. The amount
of data required for this is shown to be small. Experiments are carried out on
the FSDnoisy18k audio dataset, where OOD instances are very prevalent. The
proposed method is shown to improve the performance of convolutional neural
networks by a significant margin. Comparisons with other noise-robust
techniques are similarly encouraging.
Comment: Paper accepted for the 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020).
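A minimal sketch of the detect-and-relabel idea described above, assuming precomputed feature vectors and a scikit-learn-style auxiliary classifier; the thresholds `tau_ood` and `tau_relabel` are illustrative choices, not values from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def detect_and_relabel(X_trusted, y_trusted, X_noisy, y_noisy,
                       tau_ood=0.5, tau_relabel=0.9):
    """Train an auxiliary classifier on a small trusted in-distribution
    set, then use it to screen a noisy set: instances whose labels it
    confidently contradicts are relabelled rather than discarded, and
    instances it cannot place in any class are treated as OOD.
    """
    aux = LogisticRegression(max_iter=1000).fit(X_trusted, y_trusted)
    probs = aux.predict_proba(X_noisy)          # (n, n_classes)
    conf = probs.max(axis=1)                    # per-instance confidence
    pred = probs.argmax(axis=1)                 # auxiliary prediction

    y_clean = y_noisy.copy()
    keep = np.ones(len(y_noisy), dtype=bool)

    disagree = pred != y_noisy                  # candidate labelling errors
    relabel = disagree & (conf >= tau_relabel)  # confident: relabel
    discard = conf < tau_ood                    # no class fits: likely OOD

    y_clean[relabel] = pred[relabel]
    keep[discard] = False
    return X_noisy[keep], y_clean[keep]
```

The key point from the abstract is that the trusted set can be small; everything else here (classifier family, thresholds) is a design choice in this sketch.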
Classification with unknown class-conditional label noise on non-compact feature spaces
We investigate the problem of classification in the presence of unknown
class-conditional label noise in which the labels observed by the learner have
been corrupted with some unknown class dependent probability. In order to
obtain finite sample rates, previous approaches to classification with unknown
class-conditional label noise have required that the regression function is
close to its extrema on sets of large measure. We shall consider this problem
in the setting of non-compact metric spaces, where the regression function need
not attain its extrema.
In this setting we determine the minimax optimal learning rates (up to
logarithmic factors). The rate displays interesting threshold behaviour: When
the regression function approaches its extrema at a sufficient rate, the
optimal learning rates are of the same order as those obtained in the
label-noise-free setting. If the regression function approaches its extrema
more gradually, then classification performance necessarily degrades. In
addition, we present an adaptive algorithm which attains these rates without
prior knowledge of either the distributional parameters or the local density.
This identifies for the first time a scenario in which finite sample rates are
achievable in the label noise setting, but they differ from the optimal rates
without label noise.
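For concreteness, the class-conditional noise model the abstract refers to can be written as follows (binary case, in our notation; the paper's setting on non-compact metric spaces is more general):

```latex
% The clean label Y \in \{0,1\} is flipped independently of the
% features X, with unknown class-dependent probabilities:
\[
  \mathbb{P}\bigl(\tilde{Y} = 1 \mid Y = 0\bigr) = \rho_0, \qquad
  \mathbb{P}\bigl(\tilde{Y} = 0 \mid Y = 1\bigr) = \rho_1 .
\]
% The regression function seen through the noise is then an affine
% distortion of the clean one, \eta(x) = \mathbb{P}(Y = 1 \mid X = x):
\[
  \tilde{\eta}(x) = (1 - \rho_0 - \rho_1)\,\eta(x) + \rho_0 .
\]
```

The threshold behaviour in the rates then hinges on how fast \eta approaches its extrema 0 and 1, which on a non-compact space it need never attain.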
Binary Classification with Instance and Label Dependent Label Noise
Learning with label dependent label noise has been extensively explored in
both theory and practice; however, dealing with instance (i.e., feature) and
label dependent label noise continues to be a challenging task. The difficulty
arises from the fact that the noise rate varies for each instance, making it
challenging to estimate accurately. The question of whether it is possible to
learn a reliable model using only noisy samples remains unresolved. We answer
this question with a theoretical analysis that provides matching upper and
lower bounds. Surprisingly, our results show that, without any additional
assumptions, empirical risk minimization achieves the optimal excess risk
bound. Specifically, we first derive a novel excess risk bound proportional to
the noise level, which holds in very general settings, by comparing the
empirical risk minimizers obtained from clean and noisy samples. Second, we show
that the minimax lower bound for the 0-1 loss is a constant proportional to the
average noise rate. Our findings suggest that learning solely with noisy
samples is impossible without access to clean samples or strong assumptions on
the distribution of the data.
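In schematic form (our notation, with constants and the precise complexity terms suppressed), the two results read:

```latex
% Upper bound: ERM on n noisy samples, with instance- and
% label-dependent noise rate \rho(x, y), pays an additive penalty
% proportional to the average noise on top of the clean-sample
% rate \varepsilon_n:
\[
  R(\hat{f}_{\mathrm{noisy}}) - R(f^{*})
  \;\lesssim\;
  \mathbb{E}\bigl[\rho(X, Y)\bigr] + \varepsilon_n .
\]
% Matching minimax lower bound for the 0-1 loss: no learner seeing
% only noisy samples can beat the average noise rate in the worst case:
\[
  \inf_{\hat{f}} \sup_{P}
  \Bigl( R(\hat{f}) - R(f^{*}) \Bigr)
  \;\gtrsim\;
  \mathbb{E}\bigl[\rho(X, Y)\bigr].
\]
```

Together these say the noise-level term is unavoidable, which is what makes learning from noisy samples alone impossible without clean data or distributional assumptions.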
Towards Label-free Scene Understanding by Vision Foundation Models
Vision foundation models such as Contrastive Vision-Language Pre-training
(CLIP) and Segment Anything (SAM) have demonstrated impressive zero-shot
performance on image classification and segmentation tasks. However, the
incorporation of CLIP and SAM for label-free scene understanding has yet to be
explored. In this paper, we investigate the potential of vision foundation
models in enabling networks to comprehend 2D and 3D worlds without labelled
data. The primary challenge lies in effectively supervising networks under
extremely noisy pseudo labels, which are generated by CLIP and further
exacerbated during the propagation from the 2D to the 3D domain. To tackle
these challenges, we propose a novel Cross-modality Noisy Supervision (CNS)
method that leverages the strengths of CLIP and SAM to supervise 2D and 3D
networks simultaneously. In particular, we introduce a prediction consistency
regularization to co-train 2D and 3D networks, then further impose the
networks' latent space consistency using SAM's robust feature
representation. Experiments conducted on diverse indoor and outdoor datasets
demonstrate the superior performance of our method in understanding 2D and 3D
open environments. Our 2D and 3D networks achieve label-free semantic
segmentation with 28.4% and 33.5% mIoU on ScanNet, improvements of 4.7% and
7.9%, respectively. On the nuScenes dataset, our method reaches 26.8% mIoU, an
improvement of 6%. Code will be released at
https://github.com/runnanchen/Label-Free-Scene-Understanding.
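A minimal PyTorch-style sketch of the prediction-consistency and latent-consistency ideas, assuming already-paired 2D pixel and 3D point predictions; the tensor shapes, the symmetric-KL and cosine choices, and the pairing itself are our assumptions, and the full CNS method is more involved.

```python
import torch
import torch.nn.functional as F

def consistency_losses(logits_2d, logits_3d, feat_2d, feat_3d, feat_sam):
    """Illustrative co-training losses for corresponding pixel/point pairs.

    logits_2d, logits_3d: (n, n_classes) predictions from the 2D and
        3D networks for paired pixels/points.
    feat_2d, feat_3d:     (n, d) latent features from the two networks.
    feat_sam:             (n, d) frozen SAM features used as an anchor.
    """
    # Prediction consistency: pull the two networks' class posteriors
    # together with a symmetric KL divergence.
    log_p2d = F.log_softmax(logits_2d, dim=1)
    log_p3d = F.log_softmax(logits_3d, dim=1)
    pred_loss = 0.5 * (
        F.kl_div(log_p2d, log_p3d.exp(), reduction="batchmean")
        + F.kl_div(log_p3d, log_p2d.exp(), reduction="batchmean")
    )

    # Latent consistency: align both networks' latent spaces with the
    # robust (frozen) SAM representation; cosine distance is one
    # simple choice.
    latent_loss = (
        (1 - F.cosine_similarity(feat_2d, feat_sam.detach(), dim=1)).mean()
        + (1 - F.cosine_similarity(feat_3d, feat_sam.detach(), dim=1)).mean()
    )
    return pred_loss, latent_loss
```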
PI-GNN: A Novel Perspective on Semi-Supervised Node Classification against Noisy Labels
Semi-supervised node classification, as a fundamental problem in graph
learning, leverages unlabeled nodes along with a small portion of labeled nodes
for training. Existing methods rely heavily on high-quality labels, which,
however, are expensive to obtain in real-world applications since some noise
is inevitably introduced during the labeling process. This poses an
unavoidable challenge for the learning algorithm to generalize well. In this
paper, we propose a novel robust learning objective, dubbed pairwise
interactions (PI), for models such as Graph Neural Networks (GNNs), to combat
noisy labels. Unlike classic robust training approaches that operate on the
pointwise interactions between node and class label pairs, PI explicitly forces
the embeddings for node pairs that hold a positive PI label to be close to each
other, which can be applied to both labeled and unlabeled nodes. We design
several instantiations for PI labels based on the graph structure and the node
class labels, and further propose a new uncertainty-aware training technique to
mitigate the negative effect of the sub-optimal PI labels. Extensive
experiments on different datasets and GNN architectures demonstrate the
effectiveness of PI, yielding a promising improvement over the state-of-the-art
methods.
Comment: 16 pages, 3 figures.
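Reading the abstract literally, the PI objective pulls together embeddings of node pairs carrying a positive PI label; a contrastive-style sketch follows (the margin form is our simplification, and the paper's exact loss and its uncertainty-aware weighting of sub-optimal PI labels are not shown).

```python
import torch
import torch.nn.functional as F

def pi_loss(z, pairs, pi_labels, margin=1.0):
    """Pairwise-interaction loss over GNN node embeddings.

    z:         (n_nodes, d) node embeddings.
    pairs:     (m, 2) long tensor of node index pairs; may mix labeled
               and unlabeled nodes, as the abstract allows.
    pi_labels: (m,) float tensor, 1.0 for positive PI pairs else 0.0.
    margin:    illustrative margin for negative pairs (our choice).
    """
    zi, zj = z[pairs[:, 0]], z[pairs[:, 1]]
    dist = F.pairwise_distance(zi, zj)

    # Positive pairs: penalize squared distance.
    # Negative pairs: hinge on the margin.
    pos = pi_labels * dist.pow(2)
    neg = (1 - pi_labels) * F.relu(margin - dist).pow(2)
    return (pos + neg).mean()
```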