34,264 research outputs found
Adaptive data augmentation for image classification
Data augmentation is the process of generating samples by transforming training data, with the goal of improving the accuracy and robustness of classifiers. In this paper, we propose a new automatic and adaptive algorithm for choosing the transformations of the samples used in data augmentation. Specifically, for each sample, our main idea is to seek a small transformation that yields maximal classification loss on the transformed sample. We employ a trust-region optimization strategy, which consists of solving a sequence of linear programs. Our data augmentation scheme is then integrated into a Stochastic Gradient Descent algorithm for training deep neural networks. We perform experiments on two datasets, and show that the proposed scheme outperforms random data augmentation algorithms in terms of accuracy and robustness, while yielding comparable or superior results with respect to existing selective sampling approaches.
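A rough, hypothetical sketch of the core idea: for each batch, find a small perturbation that approximately maximizes the classification loss, then train on the perturbed samples. The paper's trust-region strategy based on linear programs is replaced here by a few projected gradient-ascent steps, and the model, data shapes, and epsilon are purely illustrative.

```python
# Hypothetical sketch: worst-case ("adaptive") augmentation inside an SGD loop.
import torch
import torch.nn as nn
import torch.nn.functional as F

def worst_case_augment(model, x, y, epsilon=0.05, steps=3, step_size=0.02):
    """Find a small perturbation of x that approximately maximizes the loss.

    The paper solves this inner problem with a trust-region method based on
    linear programs; here it is approximated by projected gradient ascent.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()      # ascend on the loss
            delta.clamp_(-epsilon, epsilon)       # keep the transformation "small"
    return (x + delta).detach()

# Toy SGD loop on random data (shapes are illustrative only).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))

for _ in range(5):
    x_aug = worst_case_augment(model, x, y)       # adaptive, per-batch augmentation
    loss = F.cross_entropy(model(x_aug), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```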
Learning-Based Biharmonic Augmentation for Point Cloud Classification
Point cloud datasets often suffer from inadequate sample sizes in comparison to image datasets, making data augmentation challenging. While traditional methods, such as rigid transformations and scaling, have limited potential for increasing dataset diversity because they barely alter the shape of individual samples, we introduce the Biharmonic Augmentation (BA) method. BA is a novel and efficient data augmentation technique that diversifies point cloud data by imposing smooth non-rigid deformations on existing 3D structures. The approach computes biharmonic coordinates for the deformation function and learns diverse deformation prototypes. Using a CoefNet, our method predicts coefficients to combine these prototypes, ensuring comprehensive deformation. Moreover, we present AdvTune, an advanced online augmentation system that integrates adversarial training. This system jointly refines the CoefNet and the classification network, enabling the automated creation of adaptive shape deformations conditioned on the state of the learner. Comprehensive experimental analysis validates the superiority of Biharmonic Augmentation, showing notable performance improvements over prevailing point cloud augmentation techniques across varied network designs.
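A minimal, hypothetical PyTorch sketch of the prototype-blending idea described in the abstract. The biharmonic-coordinate parameterization and the adversarial AdvTune schedule are omitted; prototypes are represented directly as per-point displacement fields, and CoefNet is reduced to a toy MLP, so this is an illustration of the structure rather than the authors' implementation.

```python
# Hypothetical sketch: prototype-blended non-rigid point cloud deformation.
import torch
import torch.nn as nn

class CoefNet(nn.Module):
    """Predicts blending coefficients for K deformation prototypes from a point cloud."""
    def __init__(self, num_prototypes: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, num_prototypes))

    def forward(self, points):                                   # points: (B, N, 3)
        global_feat = self.mlp(points).max(dim=1).values         # max-pooled global feature, (B, K)
        return torch.softmax(global_feat, dim=-1)                # coefficients sum to 1

class PrototypeDeformer(nn.Module):
    """Deforms a cloud by a coefficient-weighted sum of learned displacement prototypes.

    The paper parameterizes the deformation with biharmonic coordinates, which
    enforces smoothness; here each prototype is simply a per-point displacement.
    """
    def __init__(self, num_points: int = 1024, num_prototypes: int = 8):
        super().__init__()
        self.prototypes = nn.Parameter(0.01 * torch.randn(num_prototypes, num_points, 3))
        self.coef_net = CoefNet(num_prototypes)

    def forward(self, points):                                   # points: (B, N, 3)
        coefs = self.coef_net(points)                            # (B, K)
        displacement = torch.einsum("bk,knd->bnd", coefs, self.prototypes)
        return points + displacement

deformer = PrototypeDeformer()
cloud = torch.randn(4, 1024, 3)
augmented = deformer(cloud)                                      # (4, 1024, 3) deformed clouds
```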
Universal Adaptive Data Augmentation
Existing automatic data augmentation (DA) methods either ignore updating DA's parameters according to the target model's state during training or adopt update strategies that are not effective enough. In this work, we design a novel data augmentation strategy called "Universal Adaptive Data Augmentation" (UADA). Unlike existing methods, UADA adaptively updates DA's parameters according to the target model's gradient information during training: given a pre-defined set of DA operations, we randomly decide the types and magnitudes of DA operations for every data batch during training, and adaptively update DA's parameters along the gradient direction of the loss with respect to DA's parameters. In this way, UADA increases the training loss of the target networks, and the target networks learn features from harder samples to improve generalization. Moreover, UADA is very general and can be applied to numerous tasks, e.g., image classification, semantic segmentation, and object detection. Extensive experiments with various models are conducted on CIFAR-10, CIFAR-100, ImageNet, tiny-ImageNet, Cityscapes, and VOC07+12 to demonstrate the significant performance improvements brought by our proposed adaptive augmentation.
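A toy, hypothetical sketch of the update rule described above: a single differentiable augmentation magnitude is stepped along the gradient of the training loss with respect to it, so the augmented batch becomes harder for the current model. The actual UADA operation set, random sampling of operation types and magnitudes, and update schedule differ; all names and values here are illustrative.

```python
# Hypothetical sketch: stepping a DA parameter along the loss gradient.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

brightness = torch.tensor(0.1, requires_grad=True)   # one differentiable DA parameter
da_lr = 0.01                                          # step size for the DA update

x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))

for _ in range(5):
    x_aug = x + brightness                            # a simple differentiable DA op
    loss = F.cross_entropy(model(x_aug), y)

    # Gradient of the loss w.r.t. the DA parameter: move the parameter in the
    # ascent direction so the next augmented batch increases the training loss.
    da_grad, = torch.autograd.grad(loss, brightness, retain_graph=True)
    with torch.no_grad():
        brightness += da_lr * da_grad.sign()

    opt.zero_grad()
    loss.backward()
    opt.step()
```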
Improved Noisy Student Training for Automatic Speech Recognition
Recently, a semi-supervised learning method known as "noisy student training" has been shown to significantly improve the image classification performance of deep networks. Noisy student training is an iterative self-training method that leverages augmentation to improve network performance. In this work, we adapt and improve noisy student training for automatic speech recognition, employing (adaptive) SpecAugment as the augmentation method. We find effective methods to filter, balance, and augment the data generated between self-training iterations. By doing so, we obtain word error rates (WERs) of 4.2%/8.6% on the clean/noisy LibriSpeech test sets while using only the clean 100h subset of LibriSpeech as the supervised set and the rest (860h) as the unlabeled set. Furthermore, we achieve WERs of 1.7%/3.4% on the clean/noisy LibriSpeech test sets by using the unlab-60k subset of LibriLight as the unlabeled set for LibriSpeech 960h. We thus improve upon the previous state-of-the-art clean/noisy test WERs achieved on LibriSpeech 100h (4.74%/12.20%) and LibriSpeech (1.9%/4.1%).
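A hypothetical sketch of one self-training generation with SpecAugment-style masking, roughly following the loop the abstract describes (teacher labels unlabeled audio, pseudo-labels are filtered and mixed with supervised data, the batch is augmented, and a new student is trained). The filtering, balancing, and ASR model/training code are reduced to placeholder callables and do not reflect the paper's exact recipe.

```python
# Hypothetical sketch: SpecAugment-style masking plus one noisy-student generation.
import torch

def spec_augment(spec, freq_mask=15, time_mask=40, num_masks=2):
    """Apply simple frequency and time masking to a (freq, time) log-mel spectrogram."""
    spec = spec.clone()
    n_freq, n_time = spec.shape
    for _ in range(num_masks):
        f = int(torch.randint(0, freq_mask + 1, (1,)))
        f0 = int(torch.randint(0, max(1, n_freq - f), (1,)))
        spec[f0:f0 + f, :] = 0.0                                 # frequency mask
        t = int(torch.randint(0, time_mask + 1, (1,)))
        t0 = int(torch.randint(0, max(1, n_time - t), (1,)))
        spec[:, t0:t0 + t] = 0.0                                 # time mask
    return spec

def noisy_student_round(teacher, student, labeled, unlabeled, train_fn, score_fn):
    """One self-training generation; `train_fn` and `score_fn` are placeholders."""
    pseudo = [(x, teacher(x)) for x in unlabeled]                # teacher labels unlabeled audio
    pseudo = [p for p in pseudo if score_fn(p) > 0.5]            # filter low-confidence utterances
    mixed = labeled + pseudo                                     # balance/mix supervised + pseudo data
    augmented = [(spec_augment(x), y) for x, y in mixed]         # augment with SpecAugment
    train_fn(student, augmented)                                 # train the next-generation model
    return student                                               # student becomes the next teacher

# Tiny demo with dummy spectrograms and placeholder "models"/"training".
labeled = [(torch.randn(80, 300), "hello world")]
unlabeled = [torch.randn(80, 300)]
teacher = lambda spec: "pseudo transcript"                       # placeholder teacher decoder
student = object()                                               # placeholder student model
score_fn = lambda pair: 0.9                                      # placeholder confidence score
train_fn = lambda model, data: None                              # placeholder training step
next_teacher = noisy_student_round(teacher, student, labeled, unlabeled, train_fn, score_fn)
```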