Understanding BatchNorm in Ternary Training

Abstract

Neural networks are composed of two components: weights and activation functions. Ternary weight neural networks (TNNs) achieve good performance and offer a compression ratio of up to 16x. TNNs are difficult to train without BatchNorm, and there has been no study clarifying the role of BatchNorm in a ternary network. Building on a study of binary networks, we show how BatchNorm helps resolve the exploding-gradients issue.
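To make the setting concrete, the sketch below shows a ternary linear layer followed by BatchNorm in PyTorch. The threshold-based quantizer (delta = 0.7 * mean(|W|)) and the straight-through estimator are assumptions borrowed from the common ternary-weight-network recipe, not necessarily the exact scheme studied in this paper; they are included only to illustrate where BatchNorm sits in a ternary block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    """Linear layer with ternary weights {-a, 0, +a}.

    The threshold delta = 0.7 * mean(|W|) and the straight-through
    estimator are assumed here (a common TWN-style recipe), not
    necessarily the paper's exact scheme.
    """
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):
        w = self.weight
        delta = 0.7 * w.abs().mean()             # ternarization threshold
        mask = (w.abs() > delta).float()         # which weights stay nonzero
        alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1)  # scale factor
        w_t = alpha * torch.sign(w) * mask       # ternary weights {-a, 0, +a}
        # Straight-through estimator: forward uses w_t, gradients flow to w.
        w_ste = w + (w_t - w).detach()
        return F.linear(x, w_ste)

# A ternary block: without the BatchNorm, pre-activation scale can drift
# from layer to layer, which is the exploding-gradients failure mode.
block = nn.Sequential(TernaryLinear(256, 256), nn.BatchNorm1d(256), nn.ReLU())
x = torch.randn(32, 256)
print(block(x).shape)  # torch.Size([32, 256])
```

Storing each weight in 2 bits instead of a 32-bit float is what yields the up-to-16x compression ratio mentioned above.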
