1 research outputs found

    Rate-Accuracy Optimization of Deep Convolutional Neural Network Models

    No full text
    Recently, deep learning has enjoyed a great deal of success for computer vision problems due to its capability to model highly complex tasks, such as image classification, object detection, face recognition, among many others. Although these neural networks are nowadays very powerful, there is a huge amount of parameters (i.e. the model) that need to be learned and require considerable storage space and bandwidth during transmission. This paper addresses the problems of storage and transmission of large deep learning models by proposing a compression solution that is independent of the model being trained as well as the data used for training. An efficient compression framework for the parameters of a neural network, more precisely the weights that interconnect the different neurons, which consume a significant amount of resources (memory, storage and bandwidth) is proposed. Several quantization strategies are considered as well as a statistical models for the different layers of a neural network, which are exploited by an arithmetic coding engine. Experimental results show that up to 92% bitrate savings can be obtained with minimal impact in terms of image classification accuracy
    corecore