Sharpness-aware Quantization for Deep Neural Networks
Network quantization is an effective compression method to reduce the model
size and computational cost. Despite the high compression ratio, training a
low-precision model is difficult due to the discrete and non-differentiable
nature of quantization, resulting in considerable performance degradation.
Recently, Sharpness-Aware Minimization (SAM) has been proposed to improve the
generalization performance of the models by simultaneously minimizing the loss
value and the loss curvature. However, SAM cannot be directly applied to
quantized models due to the discretization process in network quantization. In
this paper, we devise a Sharpness-Aware Quantization (SAQ) method to train
quantized models, leading to better generalization performance. Moreover, since
each layer contributes differently to the loss value and the loss sharpness of
a network, we further devise an effective method that learns a configuration
generator to automatically determine the bitwidth configurations of each layer,
assigning lower bitwidths to layers lying in flat loss regions and higher
bitwidths to those in sharp ones, while simultaneously promoting the flatness
of minima to enable more aggressive
quantization. Extensive experiments on CIFAR-100 and ImageNet show the superior
performance of the proposed methods. For example, our quantized ResNet-18 with
53.7x Bit-Operation (BOP) reduction even outperforms the full-precision one by
0.7% in terms of the Top-1 accuracy. Code is available at
https://github.com/zip-group/SAQ.
Comment: Tech report
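To make the idea concrete, below is a minimal, hypothetical sketch of how a SAM-style two-step update can be combined with a straight-through fake-quantizer in PyTorch. It is not the authors' released implementation; the names FakeQuantLinear, sam_step, rho, and num_bits are illustrative assumptions, and the actual SAQ formulation may differ (see the repository above).

```python
# Sketch only (not the SAQ authors' code): SAM-style update on a model whose
# weights pass through a straight-through fake-quantizer.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FakeQuantLinear(nn.Linear):
    """Linear layer with uniformly quantized weights in the forward pass;
    gradients reach the latent full-precision weights via the
    straight-through estimator (STE)."""

    def __init__(self, in_features, out_features, num_bits=4):
        super().__init__(in_features, out_features)
        self.num_bits = num_bits

    def forward(self, x):
        w = self.weight
        qmax = 2 ** (self.num_bits - 1) - 1
        scale = w.abs().max() / qmax + 1e-8
        w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
        # STE: quantized weights forward, latent weights backward.
        w_ste = w + (w_q - w).detach()
        return F.linear(x, w_ste, self.bias)


def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    """One SAM-style update: (1) gradient at the current latent weights,
    (2) perturb weights within an L2 ball of radius rho toward higher loss,
    (3) gradient at the perturbed point, (4) restore and apply the base
    optimizer with that sharpness-aware gradient."""
    loss = loss_fn(model(x), y)
    loss.backward()

    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12

    # Ascent step: probe the sharpness of the local loss landscape.
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / grad_norm
            p.add_(e)
            eps.append(e)
    model.zero_grad()

    # Gradient at the worst-case (perturbed) point.
    loss_fn(model(x), y).backward()

    # Restore weights, then update with the sharpness-aware gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    base_opt.step()
    base_opt.zero_grad()
    return loss.item()
```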
Does the way in which a firm interacts with its network partners influence its formulation of product innovation strategies?
Peer reviewed. Postprint.
Why do mainland Chinese firms succeed in some sectors and fail in others? A critical view of the Chinese system of innovation
- …