Universal and Succinct Source Coding of Deep Neural Networks
Deep neural networks have shown incredible performance for inference tasks in
a variety of domains. Unfortunately, most current deep networks are enormous
cloud-based structures that require significant storage space, which limits
scaling of deep learning as a service (DLaaS) and use for on-device
intelligence. This paper is concerned with finding universal lossless
compressed representations of deep feedforward networks with synaptic weights
drawn from discrete sets, and directly performing inference without full
decompression. The basic insight that allows a lower rate than naive approaches is
recognizing that the bipartite graph layers of feedforward networks have a kind
of permutation invariance to the labeling of nodes, in terms of inferential
operation. We provide efficient algorithms to dissipate this irrelevant
uncertainty and then use arithmetic coding to nearly achieve the entropy bound
in a universal manner. We also provide experimental results of our approach on
several standard datasets.
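The permutation-invariance insight can be checked directly: relabeling the hidden nodes of a layer permutes the rows of its incoming weight matrix and, consistently, the columns of the next layer's, yet leaves every output unchanged. A minimal NumPy sketch of this idea, with all variable names illustrative rather than taken from the paper:

import numpy as np

# Hypothetical two-layer feedforward network x -> relu(W1 x) -> W2 h,
# with synaptic weights drawn from a discrete set, as in the paper's setting.
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 5, 3
W1 = rng.choice([-1, 0, 1], size=(n_hid, n_in))
W2 = rng.choice([-1, 0, 1], size=(n_out, n_hid))

def forward(W1, W2, x):
    h = np.maximum(W1 @ x, 0.0)  # elementwise ReLU commutes with relabeling
    return W2 @ h

x = rng.standard_normal(n_in)

# Relabel the hidden nodes: permute the rows of W1 and the columns of W2
# by the same permutation.
perm = rng.permutation(n_hid)
W1_p = W1[perm, :]
W2_p = W2[:, perm]

# Inference is identical under the relabeling, so the labeling carries no
# information for inference; with n_hid interchangeable nodes there are
# n_hid! equivalent networks, i.e. roughly log2(n_hid!) redundant bits a
# compressor can avoid spending.
assert np.allclose(forward(W1, W2, x), forward(W1_p, W2_p, x))
print("outputs identical under hidden-node relabeling")

The assertion passing for every input x is exactly the "irrelevant uncertainty" the abstract refers to: a code that fixes a canonical node ordering per layer need not describe which of the equivalent labelings was stored.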