Using search engines for web image retrieval is a tempting alternative to
manual curation when creating an image dataset, but their main drawback remains
the proportion of incorrect (noisy) samples retrieved. Previous work has shown
these noisy samples to be a mixture of in-distribution (ID)
samples, assigned to the incorrect category but presenting similar visual
semantics to other classes in the dataset, and out-of-distribution (OOD)
images, which share no semantic correlation with any category from the dataset.
The latter are, in practice, the dominant type of noisy images retrieved. To
tackle this noise duality, we propose a two-stage algorithm starting with a
detection step where we use unsupervised contrastive feature learning to
represent images in a feature space. We find that the alignment and uniformity
principles of contrastive learning allow OOD samples to be linearly separated
from ID samples on the unit hypersphere. We then spectrally embed the
unsupervised representations using a fixed neighborhood size and apply an
outlier-sensitive clustering at the class level to detect the clean and OOD
clusters as well as ID noisy outliers. We finally train a noise-robust neural
network that relabels ID noisy samples to their correct category and utilizes OOD samples
in a guided contrastive objective, clustering them to improve low-level
features. Our algorithm improves the state-of-the-art results on synthetic
noise image datasets as well as real-world web-crawled data. Our work is fully
reproducible: github.com/PaulAlbert31/SNCF.
Comment: Accepted at ECCV 202
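
The alignment and uniformity principles mentioned above can be made concrete with a small sketch. It assumes the commonly used definitions of the two quantities (mean distance between two views of the same image, and the log of the mean Gaussian potential over pairs of points on the unit hypersphere); it is not taken from the authors' code, and the function names, the exponent `alpha`, and the temperature `t` are illustrative choices.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project feature vectors onto the unit hypersphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def alignment(z1, z2, alpha=2):
    """Mean distance (raised to alpha) between two views of the same images.

    Lower values mean the two views are embedded closer together.
    """
    return float(np.mean(np.linalg.norm(z1 - z2, axis=1) ** alpha))


def uniformity(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower (more negative) values mean the features are spread more
    evenly over the hypersphere.
    """
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(len(z), k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dists[upper]))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical features standing in for a contrastive encoder's output.
    z = l2_normalize(rng.normal(size=(256, 16)))
    z_view = l2_normalize(z + 0.05 * rng.normal(size=z.shape))
    print("alignment:", alignment(z, z_view))
    print("uniformity:", uniformity(z))
```

Random Gaussian features are used here only as a stand-in; with a real encoder, `z` and `z_view` would be the normalized embeddings of two augmentations of the same batch of images.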