4 research outputs found
Improving Resnet-9 Generalization Trained on Small Datasets
This paper presents our proposed approach that won the first prize at the
ICLR competition on Hardware Aware Efficient Training. The challenge is to
achieve the highest possible accuracy in an image classification task in less
than 10 minutes. The training is done on a small dataset of 5000 images picked
randomly from the CIFAR-10 dataset. The evaluation is performed by the competition
organizers on a secret dataset with 1000 images of the same size. Our approach
applies a series of techniques for improving the generalization of
ResNet-9, including sharpness-aware optimization, label smoothing, gradient
centralization, input patch whitening, and metalearning-based training.
Our experiments show that ResNet-9 can achieve 88% accuracy while trained
on only a 10% subset of the CIFAR-10 dataset in less than 10 minutes.
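Of the techniques listed above, label smoothing is easy to sketch in isolation. The snippet below is a generic illustration, not the paper's implementation; the smoothing factor eps=0.1 and the helper name smooth_labels are assumptions:

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """Convert integer class labels to smoothed one-hot targets.

    A common formulation: keep probability 1 - eps on the true class
    and spread eps uniformly over all classes. The eps value here is
    illustrative, not the competition entry's setting.
    """
    one_hot = np.eye(num_classes)[labels]
    return one_hot * (1.0 - eps) + eps / num_classes

# Two samples with true classes 0 and 2, out of 4 classes.
targets = smooth_labels(np.array([0, 2]), num_classes=4, eps=0.1)
```

Training against these soft targets, instead of hard one-hot labels, discourages overconfident predictions, which is one reason it helps generalization on small datasets.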
SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling
In this paper, we present SwiftLearn, a data-efficient approach to accelerate
training of deep learning models using a subset of data samples selected during
the warm-up stages of training. This subset is selected based on an importance
criterion measured over the entire dataset during the warm-up stages, aiming to
preserve model performance with fewer examples during the rest of training.
The proposed importance measure can be updated periodically during training,
so that any data sample can return to the training loop if its importance
rises. The model architecture
is unchanged, but since the number of data samples determines the number of
forward and backward passes, reducing the number of samples used in each
epoch directly reduces the training time.
Experimental results on a variety of CV and NLP models during both pretraining
and finetuning show that the model performance could be preserved while
achieving a significant speed-up during training. More specifically, BERT
finetuning on the GLUE benchmark shows that almost 90% of the data can be
dropped, achieving an end-to-end average speedup of 3.36x while keeping the
average accuracy drop below 0.92%.
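The warm-up selection step can be sketched generically. The function below, its loss-based importance score, and the keep_fraction value are illustrative assumptions; SwiftLearn's exact importance criterion may differ:

```python
import numpy as np

def select_important_subset(losses, keep_fraction=0.1):
    """Return indices of the highest-scoring samples.

    `losses` stands in for a per-sample importance score measured over
    the full dataset during warm-up; here, plain loss magnitude is used
    as the importance criterion (an assumption, not the paper's measure).
    """
    k = max(1, int(len(losses) * keep_fraction))
    # argsort ascending, take the top-k, reverse to descending importance.
    return np.argsort(losses)[-k:][::-1]

scores = np.array([0.2, 0.9, 0.1, 0.7, 0.4])
subset = select_important_subset(scores, keep_fraction=0.4)
```

Re-running this selection every few epochs over the full dataset matches the paper's idea of letting dropped samples re-enter training if their importance rises.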
cGAN-Based High Dimensional IMU Sensor Data Generation for Therapeutic Activities
Human activity recognition is a core technology for applications such as
rehabilitation, ambient health monitoring, and human-computer interactions.
Wearable devices, particularly IMU sensors, can help us collect rich features
of human movements that can be leveraged in activity recognition. Developing a
robust classifier for activity recognition has always been of interest to
researchers. One major problem is that there is usually a deficit of training
data for some activities, making it difficult and sometimes impossible to
develop a classifier. In this work, a novel GAN network called TheraGAN was
developed to generate realistic IMU signals associated with a particular
activity. The generated signal is a 6-channel IMU signal, i.e., angular
velocities and linear accelerations. By introducing simple activities, which are
meaningful subparts of a complex full-length activity, the generation process
was facilitated for activities of arbitrary length. To evaluate the
generated signals, besides perceptual similarity metrics, they were used
alongside real data to improve classifier accuracy. The results show that
the largest gain in F1-score, a 13.27% rise, was achieved by the LSTM
classifier when generated data were added. This demonstrates the validity
of the generated data and shows that TheraGAN can serve as a tool for
building more robust classifiers in the case of imbalanced data.
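The conditional-generation interface can be illustrated with a stub. Everything below is hypothetical: generate_activity_signal and its sinusoidal placeholder body are not TheraGAN (which is a trained neural network); only the conditional label input and the 6-channel output shape follow the abstract:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_activity_signal(label, length, noise_dim=16):
    """Hypothetical stand-in for a conditional generator.

    Maps a noise vector plus an activity label to a (length, 6) IMU
    sequence: 3 angular-velocity and 3 linear-acceleration channels.
    The sinusoidal body is a placeholder; a real cGAN generator would
    be a learned network conditioned on the activity label.
    """
    z = rng.standard_normal(noise_dim)
    t = np.arange(length)[:, None] / length          # (length, 1) time axis
    phases = label + np.arange(6)[None, :]           # label-dependent offsets
    return np.sin(2 * np.pi * (t + 0.1 * phases)) + 0.05 * z[:6][None, :]

signal = generate_activity_signal(label=2, length=128)
```

Generated sequences of this shape could then be mixed with real recordings when training a classifier, mirroring the evaluation protocol described in the abstract.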