
4 research outputs found

Improving Resnet-9 Generalization Trained on Small Datasets

Authors: Ahmed Walid; Awad Omar Mohamed; Deng Gordon; Gosal Gurpreet; Hajimolahoseini Habib; Lim Michael; Liu Yang
Publication date: 07/09/2023

This paper presents our proposed approach that won first prize at the ICLR competition on Hardware Aware Efficient Training. The challenge is to achieve the highest possible accuracy in an image classification task in less than 10 minutes. The training is done on a small dataset of 5000 images picked randomly from the CIFAR-10 dataset. The evaluation is performed by the competition organizers on a secret dataset with 1000 images of the same size. Our approach applies a series of techniques for improving the generalization of ResNet-9, including sharpness-aware optimization, label smoothing, gradient centralization, input patch whitening, and meta-learning-based training. Our experiments show that ResNet-9 can achieve an accuracy of 88% while trained on only a 10% subset of the CIFAR-10 dataset in less than 10 minutes.

arXiv.org e-Print Archive
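
Two of the techniques named in the ResNet-9 abstract above, label smoothing and gradient centralization, are simple enough to sketch in a few lines. The following is a minimal illustration in plain Python, not the authors' implementation: the function names, the smoothing factor eps=0.1, and the restriction to a 2-D weight gradient are assumptions made for this example, since the abstract gives no such details.

```python
import math

def smooth_labels(label, num_classes, eps=0.1):
    """Smoothed target: (1 - eps) on the true class plus eps / K spread uniformly.

    eps=0.1 is an illustrative default, not a value taken from the paper.
    """
    uniform = eps / num_classes
    return [(1.0 - eps) + uniform if k == label else uniform
            for k in range(num_classes)]

def cross_entropy(logits, target):
    """Cross-entropy between softmax(logits) and a target distribution."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, target))

def centralize_gradient(grad):
    """Gradient centralization for a 2-D weight gradient (assumed layout:
    one row per output unit): subtract each row's mean so the row sums to zero."""
    return [[g - sum(row) / len(row) for g in row] for row in grad]

if __name__ == "__main__":
    target = smooth_labels(label=1, num_classes=4)
    print(target)
    print(cross_entropy([2.0, 0.5, -1.0, 0.0], target))
    print(centralize_gradient([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]))
```

In a real training loop, the centralized gradient would replace the raw gradient before the optimizer step; that integration is left out here.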

SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling

Authors: Ahmadi Mehdi; Ahmed Walid; Asani Saina; Ataiefard Foozhan; Awad Omar Mohamed; Hajimolahoseini Habib; Hassanpour Mohammad; Javadi Farnoosh; Liu Kangling; Liu Yang; Wen Austin
Publication date: 25/11/2023

In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training. This subset is selected based on an importance criterion measured over the entire dataset during the warm-up stages, aiming to preserve model performance with fewer examples during the rest of training. The importance measure we propose can be updated periodically during training, so that every data sample has a chance to return to the training loop if it shows a higher importance. The model architecture is unchanged, but since the number of data samples controls the number of forward and backward passes during training, we can reduce the training time by reducing the number of training samples used in each epoch. Experimental results on a variety of CV and NLP models during both pretraining and finetuning show that model performance can be preserved while achieving a significant speed-up during training. More specifically, BERT finetuning on the GLUE benchmark shows that almost 90% of the data can be dropped, achieving an end-to-end average speedup of 3.36x while keeping the average accuracy drop below 0.92%.

arXiv.org e-Print Archive
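
The subset-selection idea in the SwiftLearn abstract above can be sketched roughly as follows. This is a hedged illustration in plain Python, not the published method: the abstract does not define the importance measure or the refresh schedule, so the per-sample scores, the top-k selection rule, and the names select_subset, epoch_plan, warmup_epochs, and refresh_every are all hypothetical.

```python
def select_subset(importance, keep_fraction):
    """Return the indices of the highest-importance samples.

    `importance` is a list of per-sample scores (assumed here to be something
    like a per-sample loss measured over the whole dataset); `keep_fraction`
    is the share of samples to keep.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    k = max(1, round(len(importance) * keep_fraction))
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:k])

def epoch_plan(num_epochs, warmup_epochs, refresh_every):
    """Label each epoch 'full' or 'subset' (hypothetical schedule).

    Warm-up epochs use the full dataset and measure importance. Afterwards,
    every `refresh_every`-th epoch is again a full pass that re-measures
    importance, so dropped samples can return to the training loop.
    """
    plan = []
    for epoch in range(num_epochs):
        since_warmup = epoch - warmup_epochs
        if since_warmup < 0:
            plan.append("full")
        elif since_warmup > 0 and since_warmup % refresh_every == 0:
            plan.append("full")
        else:
            plan.append("subset")
    return plan

if __name__ == "__main__":
    scores = [0.1, 0.9, 0.5, 0.7, 0.2]
    print(select_subset(scores, keep_fraction=0.4))
    print(epoch_plan(num_epochs=6, warmup_epochs=2, refresh_every=3))
```

Because each 'subset' epoch runs forward and backward passes only on the kept samples, its cost scales with keep_fraction, which is the source of the speed-up the abstract describes.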

cGAN-Based High Dimensional IMU Sensor Data Generation for Therapeutic Activities

Authors: Behzadipour Saeed; Ghadami Ali; Mohammadzadeh Mohammad; Taheri Alireza
Publication date: 15/02/2023

Human activity recognition is a core technology for applications such as rehabilitation, ambient health monitoring, and human-computer interaction. Wearable devices, particularly IMU sensors, can help us collect rich features of human movements that can be leveraged in activity recognition. Developing a robust classifier for activity recognition has always been of interest to researchers. One major problem is that there is usually a deficit of training data for some activities, making it difficult and sometimes impossible to develop a classifier. In this work, a novel GAN called TheraGAN was developed to generate realistic IMU signals associated with a particular activity. The generated signal is that of a 6-channel IMU, i.e., angular velocities and linear accelerations. Also, by introducing simple activities, which are meaningful subparts of a complex full-length activity, the generation process was facilitated for activities of arbitrary length. To evaluate the generated signals, besides perceptual similarity metrics, they were used along with real data to improve the accuracy of classifiers. The results show that the largest increase in F1-score, a 13.27% rise, was obtained by the LSTM classifier when generated data were added. This shows the validity of the generated data, as well as of TheraGAN as a tool for building more robust classifiers when data are imbalanced.

arXiv.org e-Print Archive

Advances in artificial intelligence: 32nd Canadian conference on artificial intelligence, Canadian AI 2019, Kingston, ON, Canada, May 28-31, 2019, proceedings

Authors: Meurs Marie-Jean; Rudzicz Frank
Publication venue: Springer International Publishing AG
Publication date: 01/01/2019

CERN Document Server