MixUp as Locally Linear Out-Of-Manifold Regularization
MixUp is a recently proposed data-augmentation scheme, which linearly
interpolates a random pair of training examples and correspondingly the one-hot
representations of their labels. Training deep neural networks with such
additional data has been shown to significantly improve the predictive
accuracy of the current art. The power of MixUp, however, is primarily
established empirically, and its workings and effectiveness have not been
explained in any depth. In this paper, we develop an understanding of MixUp as
a form of "out-of-manifold regularization", which imposes certain "local
linearity" constraints on the model's input space beyond the data manifold.
This analysis enables us to identify a limitation of MixUp, which we call
"manifold intrusion". In a nutshell, manifold intrusion in MixUp is a form of
under-fitting resulting from conflicts between the synthetic labels of the
mixed-up examples and the labels of original training data. Such a phenomenon
usually happens when the parameters controlling the generation of mixing
policies are not sufficiently fine-tuned on the training data. To address this
issue, we propose a novel adaptive version of MixUp, where the mixing policies
are automatically learned from the data using an additional network and
objective function designed to avoid manifold intrusion. The proposed
regularizer, AdaMixUp, is empirically evaluated on several benchmark datasets.
Extensive experiments demonstrate that AdaMixUp improves upon MixUp when
applied to the current art of deep classification models.
Comment: Accepted by AAAI201
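The basic MixUp interpolation described in this abstract can be illustrated with a minimal NumPy sketch. It covers only plain MixUp, not AdaMixUp. The function name `mixup_batch`, the `alpha` parameter, and the choice to draw the mixing coefficient from a Beta(alpha, alpha) distribution are assumptions made for illustration (the Beta draw is a common convention, but the abstract does not specify it).

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Mix each example with a randomly chosen partner from the same batch.

    x        : array of shape (batch, ...) holding the inputs
    y_onehot : array of shape (batch, num_classes) holding one-hot labels
    alpha    : hypothetical Beta(alpha, alpha) parameter for the mixing weight
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)      # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))    # random partner for every example
    # Interpolate inputs and labels with the same coefficient.
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam
```

Because both label vectors sum to one, each mixed label is still a valid probability vector. When the two partners belong to different classes, the mixed label is a soft label that can disagree with the label of a real training point located near the mixed input, which is the conflict the abstract calls manifold intrusion.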
On the Generalization Effects of Linear Transformations in Data Augmentation
Data augmentation is a powerful technique to improve performance in
applications such as image and text classification tasks. Yet, there is little
rigorous understanding of why and how various augmentations work. In this work,
we consider a family of linear transformations and study their effects on the
ridge estimator in an over-parametrized linear regression setting. First, we
show that transformations which preserve the labels of the data can improve
estimation by enlarging the span of the training data. Second, we show that
transformations which mix data can improve estimation by inducing a
regularization effect. Finally, we validate our theoretical insights on MNIST.
Based on the insights, we propose an augmentation scheme that searches over the
space of transformations by how uncertain the model is about the transformed
data. We validate our proposed scheme on image and text datasets. For example,
our method outperforms RandAugment by 1.24% on CIFAR-100 using
Wide-ResNet-28-10. Furthermore, we achieve comparable accuracy to the SoTA
Adversarial AutoAugment on CIFAR datasets.
Comment: International Conference on Machine Learning (ICML) 2020. Added
experimental results on ImageNe
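The claim that label-preserving linear transformations enlarge the span of the training data can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the paper's procedure: the labels are noiseless, the transformation `F` is built from the ground-truth vector `beta` (available only in a simulation), and the sizes `n`, `d` and the ridge penalty `lam` are arbitrary illustrative choices.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimator: (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
n, d = 5, 20                        # over-parametrized: fewer samples than features
beta = rng.normal(size=d)           # ground-truth parameter (simulation only)
X = rng.normal(size=(n, d))
y = X @ beta                        # noiseless labels (assumption)

# Build a linear map F with F^T beta = beta, so (F x)^T beta = x^T beta:
# F = I + P_perp @ A, where P_perp projects onto the complement of beta.
P_perp = np.eye(d) - np.outer(beta, beta) / (beta @ beta)
F = np.eye(d) + P_perp @ rng.normal(size=(d, d))

X_aug = np.vstack([X, X @ F.T])     # original rows plus transformed rows
y_aug = np.concatenate([y, y])      # transformed points keep their labels

rank_before = np.linalg.matrix_rank(X)
rank_after = np.linalg.matrix_rank(X_aug)

lam = 1e-3                          # hypothetical ridge penalty
err_before = np.linalg.norm(ridge(X, y, lam) - beta)
err_after = np.linalg.norm(ridge(X_aug, y_aug, lam) - beta)
print(rank_before, rank_after, err_before, err_after)
```

The transformed rows add new directions to the row space of the design matrix while leaving every label unchanged, so the ridge estimator sees a larger subspace of the parameter space.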
Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy
Regularization plays a crucial role in machine learning models, especially
for deep neural networks. The existing regularization techniques mainly rely
on the i.i.d. assumption and only employ the information of the current sample,
without leveraging neighboring information between samples. In this work,
we propose a general regularizer called Patch-level Neighborhood
Interpolation (Pani) that fully exploits the relationship between
samples. Furthermore, by explicitly constructing a patch-level graph in the
different network layers and interpolating the neighborhood features to refine
the representation of the current sample, our Patch-level Neighborhood
Interpolation can then be applied to enhance two popular regularization
strategies, namely Virtual Adversarial Training (VAT) and MixUp, yielding their
neighborhood versions. The first derived method, Pani VAT, presents a novel way
to construct non-local adversarial smoothness by incorporating patch-level
interpolated perturbations. In addition, the Pani MixUp method extends
the original MixUp regularization to the patch level and can then be extended
to MixMatch, achieving state-of-the-art performance. Finally, extensive
experiments are conducted to verify the effectiveness of the Patch-level
Neighborhood Interpolation in both supervised and semi-supervised settings.
- …