Private Model Compression via Knowledge Distillation
The soaring demand for intelligent mobile applications calls for deploying
powerful deep neural networks (DNNs) on mobile devices. However, the
outstanding performance of DNNs notoriously relies on increasingly complex
models, whose computational expense far surpasses the capacity of mobile
devices. Worse still, app service providers must collect and utilize large
volumes of users' data, which contain sensitive information, to build these
sophisticated DNN models. Directly deploying such models on public mobile
devices presents a prohibitive privacy risk. To benefit from on-device deep
learning without the capacity and privacy concerns, we design a private model
compression framework, RONA.
Following the knowledge distillation paradigm, we jointly use hint learning,
distillation learning, and self learning to train a compact and fast neural
network. The knowledge distilled from the cumbersome model is adaptively
bounded and carefully perturbed to enforce differential privacy. We further
propose an elegant query sample selection method to reduce the number of
queries and control the privacy loss. A series of empirical evaluations as well
as the implementation on an Android mobile device show that RONA can not only
compress cumbersome models efficiently but also provide a strong privacy
guarantee. For example, on SVHN, when meaningful (ε, δ)-differential privacy
is guaranteed, the compact model trained by RONA can obtain a 20× compression
ratio and a 19× speed-up with merely 0.97% accuracy loss.
Comment: Conference version accepted by AAAI'19
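To make the bound-then-perturb step concrete, here is a minimal PyTorch sketch (not RONA's actual implementation): teacher logits are clipped to a fixed L2 norm and perturbed with Gaussian noise before the student distills from them. The names and values `clip_bound`, `noise_scale`, and the temperature `T` are illustrative assumptions, not the paper's calibrated privacy parameters.

```python
import torch
import torch.nn.functional as F

def private_distillation_loss(student_logits, teacher_logits,
                              clip_bound=5.0, noise_scale=1.0, T=4.0):
    # Bound the L2 norm of each teacher logit vector so the noise scale
    # corresponds to a well-defined sensitivity (values are hypothetical).
    norms = teacher_logits.norm(dim=1, keepdim=True).clamp(min=1e-12)
    bounded = teacher_logits * (clip_bound / norms).clamp(max=1.0)
    # Perturb the bounded knowledge with Gaussian noise before release.
    noisy = bounded + noise_scale * torch.randn_like(bounded)
    # Temperature-scaled distillation against the noisy teacher output.
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(noisy / T, dim=1),
                    reduction="batchmean") * (T * T)
```

Clipping first is what makes the noise meaningful: once the teacher's per-sample output is bounded, a fixed noise scale can be mapped to a differential-privacy budget.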
Knowledge Distillation with Adversarial Samples Supporting Decision Boundary
Many recent works on knowledge distillation have provided ways to transfer
the knowledge of a trained network for improving the learning process of a new
one, but finding a good technique for knowledge distillation is still an open
problem. In this paper, we provide a new perspective based on the decision
boundary, which is one of the most important components of a classifier. The
generalization performance of a classifier is closely related to the adequacy
of its decision boundary, so a good classifier bears a good decision boundary.
Therefore, transferring information closely related to the decision boundary
can be a good attempt for knowledge distillation. To realize this goal, we
utilize an adversarial attack to discover samples supporting a decision
boundary. Based on this idea, to transfer more accurate information about the
decision boundary, the proposed algorithm trains a student classifier based on
the adversarial samples supporting the decision boundary. Experiments show that
the proposed method indeed improves knowledge distillation and achieves
state-of-the-art performance.
Comment: Accepted to AAAI 2019
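As a rough illustration of this idea, the sketch below uses a generic iterative gradient-sign attack to push inputs toward the teacher's decision boundary and then trains the student on those samples with a standard distillation loss. The attack form and the parameters `eps`, `steps`, and `T` are assumptions standing in for the paper's boundary-supporting-sample search, not its exact algorithm.

```python
import torch
import torch.nn.functional as F

def boundary_supporting_samples(teacher, x, eps=0.03, steps=5):
    # Fix the teacher's current predictions as the labels to attack.
    with torch.no_grad():
        labels = teacher(x).argmax(dim=1)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(teacher(x_adv), labels)
        grad, = torch.autograd.grad(loss, x_adv)
        # Step along the gradient sign to drift toward the boundary.
        x_adv = (x_adv + eps * grad.sign()).detach()
    return x_adv

def distill_on_boundary_batch(student, teacher, x, optimizer, T=4.0):
    # Distill the teacher's soft outputs on boundary-supporting samples.
    x_bs = boundary_supporting_samples(teacher, x)
    with torch.no_grad():
        teacher_logits = teacher(x_bs)
    loss = F.kl_div(F.log_softmax(student(x_bs) / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```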
- …