AutoMix: Unveiling the Power of Mixup for Stronger Classifiers

Abstract

Mixup-based data augmentation has achieved great success as a regularizer for deep neural networks. However, existing methods rely on deliberately handcrafted mixup policies that either ignore or overemphasize the semantic matching between mixed samples and their labels. Driven by such prior assumptions, early methods attempt to smooth decision boundaries through random linear interpolation, while later ones focus on maximizing class-related information via offline saliency optimization. As a result, the label mismatch problem remains poorly addressed and, in turn, destabilizes the optimization of mixup training. To address these challenges, we first reformulate mixup for supervised classification as two sub-tasks, mixup sample generation and classification, and then propose Automatic Mixup (AutoMix), a framework in which the mixup policy is learned rather than handcrafted. Specifically, a learnable, lightweight Mix Block (MB) with a cross-attention mechanism generates each mixed sample by modeling a fair relationship between the pair of samples under the direct supervision of the corresponding mixed label. Moreover, the proposed Momentum Pipeline (MP) trains the Mix Block fully end-to-end while improving training stability and accelerating convergence. Extensive experiments on five popular classification benchmarks show that the proposed approach consistently outperforms leading methods by a large margin.

Comment: The second version of AutoMix. 12 pages, 7 figures.
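For context, the "random linear interpolation" baseline that the abstract contrasts against mixes both inputs and one-hot labels with a shared ratio drawn from a Beta distribution; the label mismatch arises because this ratio need not reflect the semantic content of the mixed image. The following is a minimal NumPy sketch of that vanilla mixup baseline, not of AutoMix itself; the function name, the alpha value, and the batch-wise mixing ratio are illustrative assumptions.

```python
# Minimal sketch of vanilla mixup (random linear interpolation of inputs and
# one-hot labels). Illustrative only; not the AutoMix Mix Block.
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Linearly interpolate a batch of samples and their one-hot labels.

    x        : (N, ...) array of inputs
    y_onehot : (N, C) array of one-hot labels
    alpha    : Beta-distribution parameter controlling interpolation strength
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)               # mixing ratio shared by the batch
    perm = rng.permutation(len(x))             # random pairing of samples
    x_mix = lam * x + (1.0 - lam) * x[perm]    # mixed inputs
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]  # mixed (soft) labels
    return x_mix, y_mix, lam
```

Because the label weight `lam` is fixed before the images are combined, the soft label can disagree with what the mixed image actually shows; AutoMix instead learns the sample-generation step under the supervision of the mixed label.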
