Understanding BatchNorm in Ternary Training
Neural networks comprise two components: weights and activation functions. Ternary weight neural networks (TNNs) achieve good performance and offer up to a 16x compression ratio. TNNs are difficult to train without BatchNorm, and there has been no study clarifying the role of BatchNorm in a ternary network. Benefiting from a study of binary networks, we show how BatchNorm helps resolve the exploding gradients issue.
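The effect described above can be illustrated with a minimal sketch: a threshold-ternarized linear layer followed by BatchNorm, written in PyTorch. This is an illustration under assumptions, not the paper's code; the TernaryLinear module, the 0.7 threshold factor, and the straight-through estimator are common choices from the ternary-network literature, not details taken from this abstract.

# Minimal sketch (illustrative only): ternary weights with BatchNorm after the layer.
import torch
import torch.nn as nn


class TernaryLinear(nn.Module):
    """Linear layer whose weights are quantized to {-1, 0, +1} in the forward pass."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):
        # Threshold-based ternarization (hypothetical 0.7 factor, a common heuristic).
        delta = 0.7 * self.weight.abs().mean()
        w_ternary = torch.where(self.weight.abs() > delta,
                                torch.sign(self.weight),
                                torch.zeros_like(self.weight))
        # Straight-through estimator: forward uses ternary weights, backward uses full-precision.
        w = self.weight + (w_ternary - self.weight).detach()
        return nn.functional.linear(x, w)


# BatchNorm rescales the activations of the ternary layer, which keeps gradient
# magnitudes bounded (the exploding-gradients effect the abstract refers to).
block = nn.Sequential(TernaryLinear(128, 64), nn.BatchNorm1d(64), nn.ReLU())
out = block(torch.randn(32, 128))
print(out.shape)  # torch.Size([32, 64])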
Efficient Training Under Limited Resources
Training time budget and dataset size are among the factors affecting the
performance of a Deep Neural Network (DNN). This paper shows that Neural
Architecture Search (NAS), Hyperparameter Optimization (HPO), and Data
Augmentation help DNNs perform much better when these two factors are limited.
However, searching for an optimal architecture and the best hyperparameter
values, along with a good combination of data augmentation techniques, under low
resources requires many experiments. We present our approach to achieving this
goal in three steps: reducing training epoch time by compressing the model
while maintaining performance comparable to the original model, preventing
model overfitting when the dataset is small, and performing hyperparameter
tuning. We used NOMAD, a blackbox optimization software package based on a
derivative-free algorithm, to perform NAS and HPO. Our work achieved an accuracy of
86.0% on a tiny subset of Mini-ImageNet at the ICLR 2021 Hardware Aware
Efficient Training (HAET) Challenge and won second place in the competition.
The competition results can be found at haet2021.github.io/challenge and our
source code at github.com/DouniaLakhmiri/ICLR_HAET2021.
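The hyperparameter tuning step treats training as a blackbox objective that a derivative-free optimizer queries repeatedly. The sketch below shows that loop with a simple random search in place of NOMAD (whose interface is not reproduced here); train_and_evaluate and the search space are hypothetical placeholders, not details from the paper.

# Minimal sketch of derivative-free hyperparameter search (illustrative only).
import random


def train_and_evaluate(lr, weight_decay, aug_strength):
    # Hypothetical blackbox: train the compressed model under the time budget
    # with these hyperparameters and return validation accuracy.
    return random.random()


search_space = {
    "lr": (1e-4, 1e-1),
    "weight_decay": (1e-6, 1e-2),
    "aug_strength": (0.0, 1.0),
}

best_acc, best_cfg = -1.0, None
for _ in range(20):  # each trial is one blackbox evaluation (one training run)
    cfg = {name: random.uniform(lo, hi) for name, (lo, hi) in search_space.items()}
    acc = train_and_evaluate(**cfg)
    if acc > best_acc:
        best_acc, best_cfg = acc, cfg

print(best_cfg, best_acc)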