Group channel pruning and spatial attention distilling for object detection
Due to the over-parameterization of neural networks, many model compression
methods based on pruning and quantization have emerged. They are effective at
reducing the size, parameter count, and computational complexity of a model.
However, most of the models compressed by such methods need the support of
special hardware and software, which increases the deployment cost. Moreover,
these methods are mainly used in classification tasks, and rarely directly used
in detection tasks. To address these issues, we introduce a three-stage model
compression method for object detection networks: dynamic sparse training,
group channel pruning, and spatial attention distilling. First, to identify
the unimportant channels in the network while maintaining a good balance
between sparsity and accuracy, we propose a dynamic sparse training method
that introduces a variable sparsity rate, which changes as training
progresses. Second, to reduce the effect of pruning on
network accuracy, we propose a novel pruning method called group channel
pruning. In particular, we divide the network into multiple groups according to
the scales of the feature layer and the similarity of module structure in the
network, and then we use different pruning thresholds to prune the channels in
each group. Finally, to recover the accuracy of the pruned network, we apply
an improved knowledge distillation method to it. Specifically, we extract
spatial attention information from the feature maps at specific scales in
each group as the knowledge for distillation. In the experiments, we use YOLOv4
as the object detection network and PASCAL VOC as the training dataset. Our
method reduces the model's parameters by 64.7% and its computation by
34.9%. Comment: Appl Intell
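
The abstract describes each stage concretely enough to sketch. Below is a minimal, hypothetical PyTorch sketch of (a) a sparsity-regularization rate that varies with training progress, (b) group-wise channel pruning with one threshold per group of BatchNorm scaling factors, and (c) a spatial attention map extracted from a feature map as distillation knowledge. The function names, the cosine schedule, and the BN-scaling importance criterion are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the three stages; names and schedules are assumed.
import math
import torch
import torch.nn as nn

def sparse_rate_at(step: int, total_steps: int,
                   start: float = 1e-5, end: float = 1e-3) -> float:
    """Variable sparsity-regularization rate that changes as training
    progresses (a cosine ramp-up here; the paper's schedule is unspecified)."""
    t = step / max(1, total_steps)
    return start + (end - start) * 0.5 * (1.0 - math.cos(math.pi * t))

def add_sparsity_grad(bn_layers, rate: float) -> None:
    """L1 subgradient on BN scale factors (call after loss.backward());
    pushes unimportant channels toward zero during sparse training."""
    for bn in bn_layers:
        bn.weight.grad.add_(rate * torch.sign(bn.weight.detach()))

def group_prune_masks(groups: dict[str, list[nn.BatchNorm2d]],
                      keep_ratio: dict[str, float]) -> dict[str, torch.Tensor]:
    """Group channel pruning: one threshold per group (e.g. per feature-map
    scale); channels whose |gamma| falls below the group quantile are cut."""
    masks = {}
    for name, bns in groups.items():
        gammas = torch.cat([bn.weight.detach().abs() for bn in bns])
        thresh = torch.quantile(gammas, 1.0 - keep_ratio[name])
        masks[name] = gammas > thresh
    return masks

def spatial_attention(feat: torch.Tensor) -> torch.Tensor:
    """Spatial attention map from a feature map (N, C, H, W): squared
    activations summed over channels, L2-normalized per sample; a common
    form of attention knowledge for distillation."""
    att = feat.pow(2).sum(dim=1)  # (N, H, W)
    return att / (att.flatten(1).norm(dim=1).view(-1, 1, 1) + 1e-12)
```

One plausible reason for per-group thresholds is that a single global threshold tends to prune some feature-map scales far more aggressively than others; grouping by scale and module structure keeps the pruning ratio balanced across the network.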
Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data
On-device machine learning (ML) enables the training process to exploit a
massive amount of user-generated private data samples. To enjoy this benefit,
inter-device communication overhead should be minimized. To this end, we
propose federated distillation (FD), a distributed model training algorithm
whose communication payload size is much smaller than a benchmark scheme,
federated learning (FL), particularly when the model size is large. Moreover,
user-generated data samples are likely to be non-IID across devices, which
commonly degrades performance compared to training on an IID dataset. To
cope with this, we propose federated augmentation (FAug), in which the
devices collectively train a generative model that each device then uses to
augment its local data toward an IID dataset. Empirical studies demonstrate that FD with
FAug yields around 26x less communication overhead while achieving 95-98% test
accuracy compared to FL. Comment: presented at the 32nd Conference on Neural Information Processing
Systems (NIPS 2018), 2nd Workshop on Machine Learning on the Phone and other
Consumer Devices (MLPCD 2), Montréal, Canada
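
The communication saving described above is easy to illustrate: in FD, devices exchange per-label average model outputs (logits) rather than model weights, so the payload scales with the number of classes instead of the number of parameters. Below is a hypothetical NumPy sketch; the helper names, the temperature value, and the aggregation details are assumptions, not the authors' code.

```python
# Hypothetical federated-distillation sketch: payload is a (C, C) matrix of
# per-label mean logits (C = number of classes), independent of model size.
import numpy as np

NUM_CLASSES = 10

def local_mean_logits(logits: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Average this device's logits per ground-truth label -> (C, C)."""
    out = np.zeros((NUM_CLASSES, NUM_CLASSES))
    for c in range(NUM_CLASSES):
        mask = labels == c
        if mask.any():
            out[c] = logits[mask].mean(axis=0)
    return out

def aggregate(device_matrices: list) -> np.ndarray:
    """Server side: average the per-label logit matrices across devices."""
    return np.mean(np.stack(device_matrices), axis=0)

def distill_loss(student_logits: np.ndarray, labels: np.ndarray,
                 global_logits: np.ndarray, temperature: float = 3.0) -> float:
    """Soft cross-entropy between the student's softened outputs and the
    globally averaged logits for each sample's label (distillation target)."""
    def soft(z):
        e = np.exp(z / temperature - (z / temperature).max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    teacher = soft(global_logits[labels])   # (N, C) targets
    student = soft(student_logits)          # (N, C) predictions
    return float(-(teacher * np.log(student + 1e-12)).sum(axis=-1).mean())
```

For a model with millions of parameters, the (C, C) logit matrix is far smaller than a full weight update, which is the source of the communication savings reported above.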