Collecting large-scale datasets is crucial for training deep models. Annotating
the data, however, inevitably yields noisy labels, which poses
challenges to deep learning algorithms. Previous efforts tend to mitigate this
problem by identifying and removing noisy samples or correcting their labels
according to the statistical properties (e.g., loss values) of training
samples. In this paper, we aim to tackle this problem from a new perspective.
Delving into the deep feature maps, we empirically find that models trained
with clean and mislabeled samples manifest distinguishable activation feature
distributions. From this observation, a novel robust training approach termed
adversarial noisy masking is proposed. The idea is to regularize deep features
with a label-quality-guided masking scheme, which adaptively modulates the
input data and label simultaneously, preventing the model from overfitting noisy
samples. Further, an auxiliary task is designed to reconstruct the input data, which
naturally provides noise-free self-supervised signals to reinforce the
generalization ability of deep models. The proposed method is simple and
flexible. It is tested on both synthetic and real-world noisy datasets, where
significant improvements are achieved over previous state-of-the-art methods.
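
To make the masking idea concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes that the per-sample loss serves as a proxy for label quality, that a 2D activation map is available for each input, and that the label is softened toward a uniform distribution by the same ratio used to mask the input. The function names, the `max_ratio` parameter, and the linear loss-to-ratio rule are all hypothetical choices made for illustration.

```python
import numpy as np

def mask_ratio_from_loss(losses, max_ratio=0.5):
    """Map per-sample losses to mask ratios in [0, max_ratio].

    Assumption: a higher loss suggests a lower-quality label, so the
    sample receives a larger mask ratio (hypothetical linear rule).
    """
    losses = np.asarray(losses, dtype=float)
    span = losses.max() - losses.min()
    if span == 0:
        return np.zeros_like(losses)
    return max_ratio * (losses - losses.min()) / span

def mask_sample(image, activation, label_onehot, ratio):
    """Mask the most activated pixels and soften the label accordingly.

    image: array of shape (H, W) or (H, W, C)
    activation: array of shape (H, W), e.g. a class activation map
    label_onehot: array of shape (num_classes,)
    ratio: fraction of pixels to mask, also used as the smoothing weight
    """
    h, w = activation.shape
    k = int(round(ratio * h * w))
    masked = image.copy()
    if k > 0:
        # Indices of the k largest activations in the flattened map.
        flat_idx = np.argsort(activation.ravel())[-k:]
        rows, cols = np.unravel_index(flat_idx, (h, w))
        masked[rows, cols] = 0.0
    num_classes = label_onehot.shape[0]
    # Blend the given label with a uniform distribution.
    soft_label = (1.0 - ratio) * label_onehot + ratio / num_classes
    return masked, soft_label
```

In this sketch, a sample with the lowest loss in the batch is left untouched, while the sample with the highest loss has its most activated region removed and its label smoothed the most, so the input and the label are modulated together.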