Structured channel pruning has been shown to significantly accelerate
inference time for convolutional neural networks (CNNs) on modern hardware, with
a relatively minor loss of network accuracy. Recent works permanently zero
these channels during training, which we observe to significantly hamper final
accuracy, particularly as the fraction of the network being pruned increases.
We propose Soft Masking for cost-constrained Channel Pruning (SMCP) to allow
pruned channels to adaptively return to the network while simultaneously
pruning towards a target cost constraint. By adding a soft mask
re-parameterization of the weights and treating channel pruning as the removal
of input channels, we allow gradient updates to previously pruned channels and
give those channels the opportunity to later return to the network.
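As a rough illustration of this idea (not the authors' implementation; the module, the straight-through gradient, and the mask-update details below are assumptions), a soft-masked convolution whose pruned input channels still receive gradient updates might look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftMaskedConv2d(nn.Module):
    """Conv layer whose *input* channels are gated by a re-parameterized mask.

    Illustrative sketch only: a straight-through-style gradient is assumed so
    that weights of masked (pruned) channels keep receiving updates and can
    re-enter the network when the pruning schedule flips their gate back to 1.
    """

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              stride=stride, padding=padding, bias=False)
        # One binary gate per input channel, set externally by the pruning schedule.
        self.register_buffer("mask", torch.ones(in_channels))

    def forward(self, x):
        w = self.conv.weight                      # (out_ch, in_ch, kH, kW)
        m = self.mask.view(1, -1, 1, 1)
        # Forward uses the masked weights; the .detach() trick keeps the backward
        # pass dense, so previously pruned channels still accumulate gradients.
        w_masked = w + (w * m - w).detach()
        return F.conv2d(x, w_masked, stride=self.conv.stride, padding=self.conv.padding)
```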
We then formulate input channel pruning as a global resource allocation
problem. Our method outperforms prior works on both the ImageNet classification
and PASCAL VOC detection datasets.
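The global resource allocation step above can be pictured as a knapsack-style selection of input channels under a shared cost budget. The sketch below is a hypothetical greedy illustration; the names, the importance and cost estimates, and the solver are assumptions, not the paper's exact formulation:

```python
from dataclasses import dataclass

@dataclass
class Channel:
    layer: str          # which layer the input channel belongs to
    index: int          # channel index within that layer
    importance: float   # estimated accuracy contribution if kept
    cost: float         # estimated latency/FLOP contribution if kept

def allocate_channels(channels, budget):
    """Keep the channels with the best importance-per-cost ratio across all
    layers until the global cost budget is exhausted (greedy knapsack)."""
    kept, spent = set(), 0.0
    for ch in sorted(channels, key=lambda c: c.importance / max(c.cost, 1e-12),
                     reverse=True):
        if spent + ch.cost <= budget:
            kept.add((ch.layer, ch.index))
            spent += ch.cost
    # Channels not selected here would have their soft mask set to 0 for the
    # next training interval, but may be re-selected at a later step.
    return kept
```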