Smart Augmentation - Learning an Optimal Data Augmentation Strategy
A recurring problem in training deep neural networks (DNNs) is that there is
typically not enough data to maximize their generalization capability. There
are many techniques to address this, including data augmentation, dropout, and
transfer learning. In this paper, we introduce an additional method, which we
call Smart Augmentation, and we show how to use it to increase accuracy and
reduce overfitting on a target network. Smart Augmentation works by creating a
network that learns how to generate augmented data during the training process
of a target network, in a way that reduces that network's loss. This allows us
to learn augmentations that minimize the error of the target network.
Smart Augmentation has shown the potential to increase accuracy by
demonstrably significant margins on all datasets tested. In addition, it has
shown the potential to achieve similar or improved performance with
significantly smaller networks in a number of the tested cases.
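For intuition, here is a minimal sketch of that training loop in PyTorch, assuming single-channel 28x28 images and toy stand-ins for both networks; the architectures, the channel-wise concatenation of the sample pair, and the loss weighting are illustrative assumptions, not the paper's exact design:

```python
# Sketch of the Smart Augmentation idea: an augmenter network merges two
# same-class samples into a new training example, and is updated through
# the target network's loss, so it learns augmentations that reduce it.
import torch
import torch.nn as nn

augmenter = nn.Sequential(              # toy stand-in for the augmenter network
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),  # outputs one blended image
)
target = nn.Sequential(                 # toy stand-in for the target network
    nn.Flatten(), nn.Linear(28 * 28, 10),
)
opt_a = torch.optim.Adam(augmenter.parameters(), lr=1e-3)
opt_b = torch.optim.Adam(target.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

def train_step(x1, x2, y):
    """x1, x2: two batches of same-class images (N, 1, 28, 28); y: labels (N,)."""
    x_aug = augmenter(torch.cat([x1, x2], dim=1))    # generate augmented samples
    loss = ce(target(x_aug), y) + ce(target(x1), y)  # one loss drives both networks
    opt_a.zero_grad(); opt_b.zero_grad()
    loss.backward()                                  # gradients flow into the augmenter
    opt_a.step(); opt_b.step()
    return loss.item()
```

Because the augmenter's update comes from the target network's own loss, the augmentations it converges to are, by construction, ones that help that particular target.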
MixUp as Locally Linear Out-Of-Manifold Regularization
MixUp is a recently proposed data-augmentation scheme, which linearly
interpolates a random pair of training examples and correspondingly the one-hot
representations of their labels. Training deep neural networks with such
additional data has been shown to significantly improve the predictive
accuracy of current state-of-the-art models. The power of MixUp, however, has
primarily been established empirically, and its workings and effectiveness
have not been explained in any depth. In this paper, we develop an
understanding of MixUp as
a form of "out-of-manifold regularization", which imposes certain "local
linearity" constraints on the model's input space beyond the data manifold.
This analysis enables us to identify a limitation of MixUp, which we call
"manifold intrusion". In a nutshell, manifold intrusion in MixUp is a form of
under-fitting resulting from conflicts between the synthetic labels of the
mixed-up examples and the labels of original training data. Such a phenomenon
usually happens when the parameters controlling the generation of mixing
policies are not sufficiently fine-tuned on the training data. To address this
issue, we propose a novel adaptive version of MixUp, where the mixing policies
are automatically learned from the data using an additional network and
objective function designed to avoid manifold intrusion. The proposed
regularizer, AdaMixUp, is empirically evaluated on several benchmark datasets.
Extensive experiments demonstrate that AdaMixUp improves upon MixUp when
applied to current state-of-the-art deep classification models.
Comment: Accepted by AAAI 2019.
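For reference, the MixUp operation this abstract builds on is just the convex combination it describes, with the mixing coefficient drawn from a Beta(alpha, alpha) distribution; a minimal NumPy sketch follows (AdaMixUp itself would replace the fixed Beta prior with a learned mixing policy, which is not shown here):

```python
# Minimal MixUp: interpolate a random pair of examples and their one-hot
# labels with a coefficient lambda ~ Beta(alpha, alpha).
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """x: (N, ...) inputs; y_onehot: (N, C) one-hot labels."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # mixing coefficient
    perm = rng.permutation(len(x))          # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix
```

Manifold intrusion, in these terms, is the case where x_mix lands near real data of a third class, so the interpolated soft label y_mix contradicts the label a genuine example at that point would carry.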
Further advantages of data augmentation on convolutional neural networks
Data augmentation is a popular technique widely used to enhance the training
of convolutional neural networks. Although many of its benefits are well known
by deep learning researchers and practitioners, its implicit regularization
effects, as compared to popular explicit regularization techniques, such as
weight decay and dropout, remain largely unstudied. As a matter of fact,
convolutional neural networks for image object classification are typically
trained with both data augmentation and explicit regularization, assuming the
benefits of all techniques are complementary. In this paper, we systematically
analyze these techniques through ablation studies of different network
architectures trained with different amounts of training data. Our results
unveil a largely ignored advantage of data augmentation: networks trained with
just data augmentation adapt more easily to different architectures and
amounts of training data, whereas weight decay and dropout require specific
fine-tuning of their hyperparameters.
Comment: Preprint of the manuscript accepted for presentation at the
International Conference on Artificial Neural Networks (ICANN) 2018. Best
Paper Award.
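The ablation described here boils down to toggling three independent knobs on an otherwise fixed training setup. A hedged sketch of those knobs in PyTorch/torchvision, where the toy model, the specific transforms, and the coefficients are illustrative assumptions rather than the paper's exact protocol:

```python
# Sketch of the ablation: the same model trained with each combination of
# data augmentation (implicit) and weight decay / dropout (explicit).
import torch
import torch.nn as nn
from torchvision import transforms

def make_config(augment: bool, weight_decay: float, dropout_p: float):
    train_tf = transforms.Compose(          # data augmentation knob
        ([transforms.RandomCrop(32, padding=4),
          transforms.RandomHorizontalFlip()] if augment else [])
        + [transforms.ToTensor()]
    )
    model = nn.Sequential(                  # toy stand-in for the real CNNs
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Flatten(),
        nn.Dropout(p=dropout_p),            # explicit regularizer: dropout
        nn.Linear(32 * 32 * 32, 10),
    )
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                          weight_decay=weight_decay)  # explicit: weight decay
    return train_tf, model, opt

# e.g. the "augmentation only" arm of the ablation:
tf, model, opt = make_config(augment=True, weight_decay=0.0, dropout_p=0.0)
```

Training the same architecture across all eight on/off combinations, and across several data fractions, is what isolates the implicit regularization effect of augmentation from the explicit regularizers.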