A central challenge of channel pruning is designing efficient and effective criteria
for selecting channels to prune. A widely used criterion is minimal performance
degradation. Accurately evaluating the true performance degradation, however,
requires retraining the surviving weights to convergence, which is prohibitively
slow. Hence existing pruning methods evaluate the performance degradation using the
previous weights (without retraining). However, we observe that the loss changes
with and without retraining differ significantly. This motivates us to develop a
technique that evaluates true loss changes without retraining, with which channels
to prune can be selected more reliably and confidently. We first derive a
closed-form estimator of the true loss change per pruning-mask change, using
influence functions, without retraining. The influence function, a tool from
robust statistics, reveals the impact of a training sample on the model's
predictions; we repurpose it to assess impacts on true loss changes. We then show
how to assess the importance of all channels simultaneously and accordingly
develop a novel global channel pruning algorithm. We conduct extensive experiments
to verify the effectiveness of the proposed algorithm. To the best of our
knowledge, we are the first to show that evaluating true loss changes for pruning
without retraining is possible. This finding opens up opportunities for a series
of new pruning paradigms that differ from existing methods.
The code is available at https://github.com/hrcheng1066/IFSO.
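As a toy illustration of the general idea (not the paper's actual estimator), the loss change caused by zeroing a subset of weights can be predicted from the trained weights alone via a second-order Taylor expansion, with no retraining. The quadratic loss and all names below are made up for the sketch; for a quadratic loss the second-order prediction is exact:

```python
import numpy as np

# Toy quadratic loss L(w) = 0.5 * w^T A w - b^T w, a stand-in for a network loss.
A = np.array([[3.0, 0.5], [0.5, 2.0]])  # positive-definite "Hessian"
b = np.array([1.0, 1.0])

def loss(w):
    return 0.5 * w @ A @ w - b @ w

w = np.linalg.solve(A, b)   # "trained" weights (the minimizer of the loss)
grad = A @ w - b            # gradient at w (zero at the minimum)
hess = A                    # Hessian of the quadratic loss

def predicted_delta_loss(mask):
    """Second-order Taylor estimate of the loss change when pruning
    (zeroing) the masked coordinates, using only the trained weights."""
    dw = -w * mask          # pruning perturbation: w -> w + dw zeroes masked entries
    return grad @ dw + 0.5 * dw @ hess @ dw

mask = np.array([1.0, 0.0])              # prune the first coordinate
predicted = predicted_delta_loss(mask)   # estimate without "retraining"
actual = loss(w * (1 - mask)) - loss(w)  # ground-truth loss change
print(predicted, actual)                 # identical for a quadratic loss
```

For a real network the Hessian is not available in closed form, which is where the influence-function machinery of the paper comes in; the sketch only shows why a local expansion around the trained weights can predict the effect of a mask change.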