On the Generalization Effects of Linear Transformations in Data Augmentation
Data augmentation is a powerful technique for improving performance in
applications such as image and text classification. Yet, there is little
rigorous understanding of why and how various augmentations work. In this work,
we consider a family of linear transformations and study their effects on the
ridge estimator in an over-parametrized linear regression setting. First, we
show that transformations which preserve the labels of the data can improve
estimation by enlarging the span of the training data. Second, we show that
transformations which mix data can improve estimation by acting as a form of
regularization. Finally, we validate our theoretical insights on MNIST. Based
on these insights, we propose an augmentation scheme that searches over the
space of transformations according to how uncertain the model is about the
transformed data. We validate our proposed scheme on image and text datasets.
For example,
our method outperforms RandAugment by 1.24% on CIFAR-100 using
Wide-ResNet-28-10. Furthermore, we achieve accuracy comparable to the SoTA
Adversarial AutoAugment on the CIFAR datasets.
Comment: International Conference on Machine Learning (ICML) 2020. Added experimental results on ImageNet.
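As a rough illustration of the proposed search, the sketch below picks, from a pool of candidate label-preserving transforms, the augmented batch on which the model is most uncertain (using the loss as the uncertainty proxy). The names `model` and `candidate_transforms` are assumptions for the example, not the authors' released API.

```python
# Illustrative sketch only: `model` is any PyTorch classifier and
# `candidate_transforms` is a pool of label-preserving transforms.
import torch
import torch.nn.functional as F

def select_most_uncertain(model, x, y, candidate_transforms):
    """Apply each candidate transform to the batch (x, y) and return the
    augmented batch on which the model's loss, used here as an
    uncertainty proxy, is highest."""
    model.eval()
    best_loss, best_x = float("-inf"), x
    with torch.no_grad():
        for t in candidate_transforms:
            x_aug = t(x)                                # label-preserving
            loss = F.cross_entropy(model(x_aug), y).item()
            if loss > best_loss:
                best_loss, best_x = loss, x_aug
    return best_x  # train the next step on the most uncertain view
```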
DFM-X: Augmentation by Leveraging Prior Knowledge of Shortcut Learning
Neural networks are prone to learn easy solutions from superficial statistics
in the data, namely shortcut learning, which impairs generalization and
robustness of models. We propose a data augmentation strategy, named DFM-X,
that leverages knowledge about frequency shortcuts, encoded in Dominant
Frequencies Maps computed for image classification models. We randomly select
X% training images of certain classes for augmentation, and process them by
retaining the frequencies included in the DFMs of other classes. This strategy
compels the models to leverage a broader range of frequencies for
classification, rather than relying on specific frequency sets. Thus, the
models learn deeper, more task-related semantics than their counterparts
trained with standard setups. Unlike other commonly used augmentation
techniques, which focus on increasing the visual variation of the training
data, our method aims to exploit the original data more efficiently, by
distilling prior knowledge about the detrimental learning behavior of models
from the data. Our
experimental results demonstrate that DFM-X improves robustness against common
corruptions and adversarial attacks. It can be seamlessly integrated with other
augmentation techniques to further enhance the robustness of models.
Comment: Accepted at ICCVW 2023.
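As a loose sketch of the frequency-retention step, assuming images are float arrays of shape (H, W, C) and `dfm_other` is a precomputed binary mask over the shifted 2-D spectrum marking the dominant frequencies of a different class (computing the DFMs themselves follows the paper and is outside this snippet):

```python
import numpy as np

def dfm_x_augment(image, dfm_other):
    """Keep only those frequency components of `image` that lie inside the
    dominant-frequency map of another class, then invert the transform."""
    # Forward FFT per channel, centered so the mask indexes frequency bands.
    spectrum = np.fft.fftshift(np.fft.fft2(image, axes=(0, 1)), axes=(0, 1))
    filtered = spectrum * dfm_other[..., None]      # broadcast over channels
    out = np.fft.ifft2(np.fft.ifftshift(filtered, axes=(0, 1)), axes=(0, 1))
    return np.real(out)
```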
Efficient and Effective Augmentation Strategy for Adversarial Training
Adversarial training of Deep Neural Networks is known to be significantly
more data-hungry when compared to standard training. Furthermore, complex data
augmentations such as AutoAugment, which have led to substantial gains in
standard training of image classifiers, have not been successful with
Adversarial Training. We first explain this contrasting behavior by viewing
augmentation during training as a problem of domain generalization, and further
propose Diverse Augmentation-based Joint Adversarial Training (DAJAT) to use
data augmentations effectively in adversarial training. We aim to handle the
conflicting goals of enhancing the diversity of the training dataset and
training with data that is close to the test distribution by using a
combination of simple and complex augmentations with separate batch
normalization layers during training. We further utilize the popular
Jensen-Shannon divergence loss to encourage the joint learning of the diverse
augmentations, thereby allowing simple augmentations to guide the learning of
complex ones. Lastly, to improve the computational efficiency of the proposed
method, we propose and utilize a two-step defense, Ascending Constraint
Adversarial Training (ACAT), that uses an increasing epsilon schedule and
weight-space smoothing to prevent gradient masking. The proposed method,
DAJAT, achieves a substantially better robustness-accuracy trade-off than
existing methods on the RobustBench leaderboard for ResNet-18 and
WideResNet-34-10. The code for implementing DAJAT is available here:
https://github.com/val-iisc/DAJAT.
Comment: NeurIPS 2022.
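For reference, a minimal sketch of a Jensen-Shannon consistency term across the predictions on several augmented views of the same batch; this is a paraphrase of the standard JS formulation, not code from the linked repository:

```python
import torch
import torch.nn.functional as F

def js_consistency(logits_list):
    """Jensen-Shannon divergence across the softmax outputs of several
    augmented views of one batch; a low value means the model predicts
    consistently, letting simple views guide the learning of complex ones."""
    probs = [F.softmax(logits, dim=1) for logits in logits_list]
    mean = torch.stack(probs, dim=0).mean(dim=0).clamp(min=1e-7)
    # JS = average of KL(p_i || mean); F.kl_div takes log-probs first.
    return sum(F.kl_div(mean.log(), p, reduction="batchmean")
               for p in probs) / len(probs)
```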