Asymptotic Soft Filter Pruning for Deep Convolutional Neural Networks
Deeper and wider Convolutional Neural Networks (CNNs) achieve superior
performance but bring expensive computation cost. Accelerating such
over-parameterized neural networks has received increased attention. A typical
pruning algorithm is a three-stage pipeline, i.e., training, pruning, and
retraining. Prevailing approaches fix the pruned filters to zero during
retraining, and thus significantly reduce the optimization space. Besides, they
directly prune a large number of filters at first, which would cause
unrecoverable information loss. To solve these problems, we propose an
Asymptotic Soft Filter Pruning (ASFP) method to accelerate the inference
procedure of the deep neural networks. First, we update the pruned filters
during the retraining stage. As a result, the optimization space of the pruned
model would not be reduced but be the same as that of the original model. In
this way, the model has enough capacity to learn from the training data.
Second, we prune the network asymptotically. We prune a few filters at first and
asymptotically prune more filters during the training procedure. With
asymptotic pruning, the information of the training set would be gradually
concentrated in the remaining filters, so the subsequent training and pruning
process would be stable. Experiments show the effectiveness of our ASFP on
image classification benchmarks. Notably, on ILSVRC-2012, our ASFP reduces more
than 40% FLOPs on ResNet-50 with only 0.14% top-5 accuracy degradation, which
is higher than the soft filter pruning (SFP) by 8%.
Comment: Extended Journal Version of arXiv:1808.0686
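The two ideas in this abstract (pruned filters stay trainable, and the pruning rate grows over training) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear ramp in `pruning_rate`, the L2-norm ranking, and the toy filter values are all assumptions made for illustration.

```python
import math

def pruning_rate(epoch, total_epochs, target_rate):
    """Hypothetical schedule: linear ramp from 0 to target_rate (assumed shape)."""
    if total_epochs <= 1:
        return target_rate
    return target_rate * min(1.0, epoch / (total_epochs - 1))

def soft_prune(filters, rate):
    """Zero the lowest-L2-norm filters in place and return their indices.

    The zeroed filters remain in the layer, so a later training step
    could update them again ("soft" pruning).
    """
    num_pruned = int(len(filters) * rate)
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    order = sorted(range(len(filters)), key=lambda i: norms[i])
    pruned = order[:num_pruned]
    for i in pruned:
        filters[i] = [0.0] * len(filters[i])
    return sorted(pruned)

# Toy layer: 4 filters with 3 weights each (illustrative values).
filters = [[0.1, 0.0, 0.1], [1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [0.2, 0.1, 0.0]]
total_epochs, target_rate = 5, 0.5
history = []
for epoch in range(total_epochs):
    # A real training epoch would update every filter here,
    # including the ones zeroed in earlier epochs.
    rate = pruning_rate(epoch, total_epochs, target_rate)
    history.append((rate, soft_prune(filters, rate)))
```

In this toy run no filter is zeroed in the first epochs, and the number of zeroed filters only reaches the target in the last epoch.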
LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch
Structured pruning is a commonly used convolutional neural network (CNN)
compression approach. Pruning rate setting is a fundamental problem in
structured pruning. Most existing works introduce too many additional learnable
parameters to assign different pruning rates across different layers in CNN or
cannot control the compression rate explicitly. Since an overly narrow network blocks
information flow during training, automatic pruning rate setting cannot explore a
high pruning rate for a specific layer. To overcome these limitations, we
propose a novel framework named Layer Adaptive Progressive Pruning (LAPP),
which gradually compresses the network during initial training of a few epochs
from scratch. In particular, LAPP designs an effective and efficient pruning
strategy that introduces a learnable threshold for each layer and FLOPs
constraints for the network. Guided by both task loss and FLOPs constraints, the
learnable thresholds are dynamically and gradually updated to accommodate
changes of importance scores during training. Therefore the pruning strategy
can gradually prune the network and automatically determine the appropriate
pruning rates for each layer. What's more, in order to maintain the expressive
power of the pruned layer, before training starts, we introduce an additional
lightweight bypass for each convolutional layer to be pruned, which only adds
relatively little additional overhead. Our method demonstrates superior performance
gains over previous compression methods on various datasets and backbone
architectures. For example, on CIFAR-10, our method compresses ResNet-20 to
40.3% without accuracy drop. 55.6% of FLOPs of ResNet-18 are reduced with 0.21%
top-1 accuracy increase and 0.40% top-5 accuracy increase on ImageNet.
Comment: 12 pages, 8 tables, 3 figures
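As a rough illustration of threshold-based channel masking under a FLOPs budget, the sketch below keeps a channel only when its importance score exceeds the layer's threshold and raises the thresholds until the network fits the budget. It is a simplified stand-in, not LAPP itself: the abstract describes thresholds learned from the task loss and the FLOPs constraint, whereas this sketch uses a hand-written uniform step, and the layer shapes, scores, and budget are made-up values.

```python
def conv_flops(c_in, c_out, k, h, w):
    """Multiply-accumulate count of a k x k convolution with an h x w output map."""
    return c_in * c_out * k * k * h * w

def kept_channels(scores, threshold):
    """Number of output channels whose importance score exceeds the threshold."""
    return sum(1 for s in scores if s > threshold)

def network_flops(layers, thresholds, in_channels=3):
    """Total FLOPs when each layer keeps only its above-threshold channels.

    The input width of a layer is the kept output width of the previous one.
    """
    total, c_in = 0, in_channels
    for layer, t in zip(layers, thresholds):
        c_out = kept_channels(layer["scores"], t)
        total += conv_flops(c_in, c_out, layer["k"], layer["h"], layer["w"])
        c_in = c_out
    return total

# Hypothetical two-layer network with per-filter importance scores.
layers = [
    {"scores": [0.91, 0.12, 0.53, 0.34], "k": 3, "h": 8, "w": 8},
    {"scores": [0.82, 0.23, 0.61, 0.44, 0.07, 0.72], "k": 3, "h": 4, "w": 4},
]
budget = 0.5   # assumed target: at most 50% of the unpruned FLOPs
step = 0.05    # assumed hand-written threshold increment (not learned)

full = network_flops(layers, [0.0] * len(layers))
thresholds = [0.0] * len(layers)
for n in range(1, 20):  # bounded loop so the sketch always terminates
    if network_flops(layers, thresholds) <= budget * full:
        break
    thresholds = [n * step for _ in layers]

final = network_flops(layers, thresholds)
kept = [kept_channels(l["scores"], t) for l, t in zip(layers, thresholds)]
```

Because the step is uniform, both layers end with the same threshold here; a learned per-layer threshold would let the layers settle at different pruning rates.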
Mathematical Optimization Algorithms for Model Compression and Adversarial Learning in Deep Neural Networks
Large-scale deep neural networks (DNNs) have made breakthroughs in a variety of tasks, such as image recognition, speech recognition and self-driving cars. However, their large model size and computational requirements add a significant burden to state-of-the-art computing systems. Weight pruning is an effective approach to reduce the model size and computational requirements of DNNs. However, prior works in this area are mainly heuristic methods. As a result, the performance of a DNN cannot be maintained at a high weight pruning ratio. To mitigate this limitation, we propose a systematic weight pruning framework for DNNs based on mathematical optimization. We first formulate weight pruning for DNNs as a non-convex optimization problem, and then systematically solve it using the alternating direction method of multipliers (ADMM). Our work achieves a higher weight pruning ratio on DNNs without accuracy loss and a higher acceleration of DNN inference on CPU and GPU platforms compared with prior works.
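A minimal sketch of how ADMM can handle a sparsity constraint, under assumptions of our own: the loss is a toy quadratic 0.5 * ||W - target||^2 (so the W-update has a closed form), the constraint is "at most k nonzero weights", and `target`, `k`, and `rho` are illustrative values. A real DNN would replace the closed-form W-update with gradient steps on the training loss.

```python
def project_topk(v, k):
    """Euclidean projection onto vectors with at most k nonzeros:
    keep the k largest-magnitude entries and zero the rest."""
    keep = set(sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]

target = [3.0, -0.2, 0.1, -2.0, 0.05]  # toy "dense" weights (assumed)
k, rho = 2, 1.0                        # sparsity level and penalty (assumed)

W = list(target)            # primal variable (the weights)
Z = project_topk(W, k)      # auxiliary sparse copy of the weights
U = [0.0] * len(W)          # scaled dual variable

for _ in range(50):
    # W-update: argmin 0.5*||W - target||^2 + (rho/2)*||W - Z + U||^2
    W = [(t + rho * (z - u)) / (1.0 + rho) for t, z, u in zip(target, Z, U)]
    # Z-update: project W + U onto the sparsity constraint set
    Z = project_topk([w + u for w, u in zip(W, U)], k)
    # Dual update: accumulate the remaining disagreement between W and Z
    U = [u + w - z for u, w, z in zip(U, W, Z)]
```

After the loop, `Z` holds the two largest-magnitude weights and `W` has been pulled to agree with it, which is the sense in which the constraint is enforced gradually rather than by a single hard cut.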
Besides the issue of model size, DNNs are also sensitive to adversarial attacks: a small, imperceptible perturbation of the input data can fully mislead a DNN. Research on the robustness of DNNs generally follows two directions. The first is to enhance the robustness of DNNs, which makes it harder for adversarial attacks to fool them. The second is to design adversarial attack methods to test the robustness of DNNs. These two aspects reciprocally benefit each other towards hardening DNNs. In our work, we propose to generate adversarial attacks with low distortion via convex optimization, which achieves a 100% attack success rate with lower distortion compared with prior works. We also propose a unified min-max optimization framework for adversarial attack and defense on DNNs over multiple domains. Our proposed method performs better than prior works, which use average-based strategies to solve the problems over multiple domains.
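The contrast between an average-based strategy and a min-max strategy over multiple domains can be seen on a toy problem. The sketch below is only an illustration of that contrast, not the framework from the abstract: each "domain" is a hypothetical quadratic loss (x - a)^2, the inner maximization is solved exactly by picking the worst domain, and the step-size schedule is an assumption.

```python
def domain_losses(x, centers):
    """One quadratic loss per domain; the centers are illustrative values."""
    return [(x - a) ** 2 for a in centers]

def solve(centers, worst_case, steps=2000):
    """Minimize either the worst-domain loss (min-max) or the mean loss
    with (sub)gradient descent and a diminishing step size."""
    x = 0.0
    for t in range(steps):
        lr = 0.5 / (t + 1)
        losses = domain_losses(x, centers)
        if worst_case:
            # Inner max: all weight goes to the currently worst domain.
            i = max(range(len(centers)), key=lambda j: losses[j])
            grad = 2 * (x - centers[i])
        else:
            # Average-based strategy: equal weight on every domain.
            grad = sum(2 * (x - a) for a in centers) / len(centers)
        x -= lr * grad
    return x

centers = [0.0, 1.0, 4.0]
x_minmax = solve(centers, worst_case=True)
x_average = solve(centers, worst_case=False)
worst_minmax = max(domain_losses(x_minmax, centers))
worst_average = max(domain_losses(x_average, centers))
```

On this toy problem the averaged solution settles at the mean of the centers, while the min-max solution settles near the midpoint of the two extreme centers and therefore has a smaller worst-domain loss.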
- …