12,607 research outputs found
Data-free parameter pruning for Deep Neural Networks
Deep Neural nets (NNs) with millions of parameters are at the heart of many
state-of-the-art computer vision systems today. However, recent works have
shown that much smaller models can achieve similar levels of performance. In
this work, we address the problem of pruning parameters in a trained NN model.
Instead of removing individual weights one at a time as done in previous works,
we remove one neuron at a time. We show how similar neurons are redundant, and
propose a systematic way to remove them. Our experiments in pruning the densely
connected layers show that we can remove up to 85% of the total parameters in
an MNIST-trained network, and about 35% for AlexNet, without significantly
affecting performance. Our method can be applied on top of most networks with a
fully connected layer to give a smaller network.
Comment: BMVC 2015
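A minimal sketch of the redundancy idea for one pair of dense layers, in NumPy: two neurons whose incoming weights (and biases) nearly coincide produce nearly the same activation on every input, so one can be removed after folding its outgoing weights into the other's. The plain distance threshold below stands in for the paper's actual saliency criterion, and all names are illustrative.

    import numpy as np

    def merge_similar_neurons(W1, b1, W2, tol=0.1):
        # W1: (hidden, in) incoming weights, b1: (hidden,) biases,
        # W2: (out, hidden) outgoing weights of the following layer.
        keep = list(range(W1.shape[0]))
        i = 0
        while i < len(keep):
            j = i + 1
            while j < len(keep):
                a, b = keep[i], keep[j]
                # Near-identical incoming weights imply near-identical
                # activations, so neuron b is redundant given neuron a.
                if (np.linalg.norm(W1[a] - W1[b]) < tol
                        and abs(b1[a] - b1[b]) < tol):
                    W2[:, a] += W2[:, b]  # fold b's contribution into a
                    keep.pop(j)           # drop the redundant neuron
                else:
                    j += 1
            i += 1
        return W1[keep], b1[keep], W2[:, keep]

No data is needed at any point: the decision is made purely from the weights, which is what makes the method data-free.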
Data-Free Backbone Fine-Tuning for Pruned Neural Networks
Model compression techniques reduce the computational load and memory
consumption of deep neural networks. After the compression operation, e.g.
parameter pruning, the model is normally fine-tuned on the original training
dataset to recover from the performance drop caused by compression. However,
the training data is not always available due to privacy issues or other
factors. In this work, we present a data-free fine-tuning approach for pruning
the backbone of deep neural networks. In particular, the pruned network
backbone is trained on synthetically generated images with our proposed
intermediate supervision, which drives it to mimic the unpruned backbone's
output feature map.
Afterwards, the pruned backbone can be combined with the original network head
to make predictions. We generate synthetic images by back-propagating gradients
to noise images while relying on L1-pruning for the backbone pruning. In our
experiments, we show that our approach is task-independent due to pruning only
the backbone. By evaluating our approach on 2D human pose estimation, object
detection, and image classification, we demonstrate promising performance
compared to the unpruned model. Our code is available at
https://github.com/holzbock/dfbf.
Comment: Accepted for presentation at the 31st European Signal Processing
Conference (EUSIPCO) 2023, September 4-8, 2023, Helsinki, Finland
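A rough sketch of the two stages, assuming PyTorch and torchvision; the image-synthesis objective and the stand-in models below are illustrative placeholders rather than the authors' exact recipe (see the linked repository for that).

    import torch
    import torch.nn.functional as F
    import torchvision.models as models

    teacher = models.resnet18(weights=None).eval()  # stand-in: unpruned backbone
    student = models.resnet18(weights=None)         # stand-in: L1-pruned backbone
    for p in teacher.parameters():
        p.requires_grad_(False)

    # 1) Synthesize images by back-propagating gradients into noise.
    x = torch.randn(8, 3, 224, 224, requires_grad=True)
    opt_img = torch.optim.Adam([x], lr=0.05)
    for _ in range(100):
        opt_img.zero_grad()
        logits = teacher(x)
        # Illustrative objective: make the teacher confident on x.
        F.cross_entropy(logits, logits.argmax(dim=1)).backward()
        opt_img.step()

    # 2) Intermediate supervision: match the unpruned output feature map.
    feat_t = torch.nn.Sequential(*list(teacher.children())[:-1])  # drop head
    feat_s = torch.nn.Sequential(*list(student.children())[:-1])
    opt_net = torch.optim.SGD(student.parameters(), lr=1e-3)
    for _ in range(10):
        opt_net.zero_grad()
        target = feat_t(x.detach())
        F.mse_loss(feat_s(x.detach()), target).backward()
        opt_net.step()

Since only the backbone is trained to match feature maps, the original task head can be reattached afterwards, which is what makes the approach task-independent.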
PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
This paper presents a method for adding multiple tasks to a single deep
neural network while avoiding catastrophic forgetting. Inspired by network
pruning techniques, we exploit redundancies in large deep networks to free up
parameters that can then be employed to learn new tasks. By performing
iterative pruning and network re-training, we are able to sequentially "pack"
multiple tasks into a single network while ensuring minimal drop in performance
and minimal storage overhead. Unlike prior work that uses proxy losses to
maintain accuracy on older tasks, we always optimize for the task at hand. We
perform extensive experiments on a variety of network architectures and
large-scale datasets, and observe much better robustness against catastrophic
forgetting than prior work. In particular, we are able to add three
fine-grained classification tasks to a single ImageNet-trained VGG-16 network
and achieve accuracies close to those of separately trained networks for each
task. Code available at https://github.com/arunmallya/packnet
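The packing step can be sketched compactly, assuming PyTorch and a single weight tensor; the real implementation linked above additionally handles per-layer masks, biases, and the retraining schedule. After training a task on the currently free weights, the smallest-magnitude free weights are zeroed and released for future tasks, while the survivors are frozen into the task's mask.

    import torch

    def pack_task(weight, free_mask, prune_frac=0.5):
        # weight: a layer's weight tensor; free_mask: bool tensor marking
        # entries not yet claimed by any earlier task.
        free_vals = weight[free_mask].abs()
        k = int(free_vals.numel() * prune_frac)    # how many to release
        if k > 0:
            thresh = free_vals.kthvalue(k).values  # k-th smallest magnitude
            released = free_mask & (weight.abs() <= thresh)
        else:
            released = torch.zeros_like(free_mask)
        task_mask = free_mask & ~released          # weights this task keeps
        with torch.no_grad():
            weight[released] = 0.0                 # freed for later tasks
        return task_mask, released                 # `released` is the new free_mask

At inference for task t, the weight tensor is masked with the union of the masks for tasks 1 through t, so later tasks reuse earlier weights without ever disturbing them.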
Deep Anchored Convolutional Neural Networks
Convolutional Neural Networks (CNNs) have proven extremely successful at
solving computer vision tasks. State-of-the-art methods favor such deep
network architectures for their accuracy, at the cost of a massive number of
parameters and high weight redundancy. Previous works have studied how to
prune the weights of such CNNs. In this paper, we go to the other extreme and
analyze the performance of a network stacked with a single convolution kernel
across layers, as well as other weight-sharing techniques.
We name it Deep Anchored Convolutional Neural Network (DACNN). Sharing the same
kernel weights across layers reduces the model size tremendously; more
precisely, the network is compressed in memory by a factor of L, where L is the
desired depth of the network, disregarding the fully connected layer for
prediction. The number of parameters in DACNN barely increases as the network
grows deeper, which allows us to build deep DACNNs without any concern about
memory costs. We also introduce a partially shared weights network (DACNN-mix) as
well as an easy plug-in module, coined regulators, to boost the performance of
our architecture. We validated our idea on 3 datasets: CIFAR-10, CIFAR-100 and
SVHN. Our results show that we can save massive amounts of memory with our
model, while maintaining high accuracy.
Comment: This paper is accepted to the 2019 IEEE/CVF Conference on Computer
Vision and Pattern Recognition Workshops (CVPRW)
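A toy PyTorch sketch of the anchoring idea, reusing one shared 3x3 kernel for `depth` layers; the regulators and the partially shared DACNN-mix variant are omitted, and all shapes are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AnchoredConvNet(nn.Module):
        def __init__(self, channels=64, depth=10, num_classes=10):
            super().__init__()
            self.stem = nn.Conv2d(3, channels, 3, padding=1)
            # One anchored kernel shared by every subsequent layer.
            self.shared = nn.Parameter(torch.empty(channels, channels, 3, 3))
            nn.init.kaiming_normal_(self.shared)
            self.depth = depth
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x):
            x = F.relu(self.stem(x))
            for _ in range(self.depth):   # same weights applied L times
                x = F.relu(F.conv2d(x, self.shared, padding=1))
            x = x.mean(dim=(2, 3))        # global average pooling
            return self.head(x)

Because the loop reuses `self.shared`, the parameter count is essentially independent of `depth`, which is the factor-of-L memory compression claimed above.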
Fine-Pruning: Joint Fine-Tuning and Compression of a Convolutional Network with Bayesian Optimization
When approaching a novel visual recognition problem in a specialized image
domain, a common strategy is to start with a pre-trained deep neural network
and fine-tune it to the specialized domain. If the target domain covers a
smaller visual space than the source domain used for pre-training (e.g.
ImageNet), the fine-tuned network is likely to be over-parameterized. However,
applying network pruning as a post-processing step to reduce the memory
requirements has drawbacks: fine-tuning and pruning are performed
independently; pruning parameters are set once and cannot adapt over time; and
the highly parameterized nature of state-of-the-art pruning methods makes it
prohibitive to manually search the pruning parameter space for deep networks,
leading to coarse approximations. We propose a principled method for jointly
fine-tuning and compressing a pre-trained convolutional network that overcomes
these limitations. Experiments on two specialized image domains (remote sensing
images and describable textures) demonstrate the validity of the proposed
approach.
Comment: BMVC 2017 oral
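Schematically, the method alternates a pruning step, whose parameters a Bayesian optimizer proposes from validation feedback, with fine-tuning. In the sketch below (PyTorch), uniform random sampling stands in for the Bayesian optimization, and `finetune`/`evaluate` are caller-supplied training and validation routines.

    import random
    import torch.nn.utils.prune as prune

    def fine_prune(model, prunable_layers, finetune, evaluate, steps=10):
        # finetune(model): a few epochs of ordinary training;
        # evaluate(model): returns validation accuracy.
        best = evaluate(model)
        for _ in range(steps):
            # A Bayesian optimizer would propose `rate` from past scores;
            # blind sampling here is only a stand-in for that step.
            rate = random.uniform(0.05, 0.3)
            for layer in prunable_layers:
                prune.l1_unstructured(layer, name="weight", amount=rate)
            finetune(model)                  # recover from the pruning step
            best = max(best, evaluate(model))
        return model, best

Coupling the two steps this way is what lets the pruning parameters adapt over time instead of being set once, which is the limitation the abstract points out.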