Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima
Recently, a race towards the simplification of deep networks has begun,
showing that it is effectively possible to reduce the size of these models with
minimal or no performance loss. However, there is a general lack of
understanding of why these pruning strategies are effective. In this work, we
compare and analyze pruned solutions obtained with two different pruning
approaches, one-shot and gradual, showing the higher effectiveness of the
latter. In particular, we find that gradual pruning allows access to narrow,
well-generalizing minima, which are typically missed by one-shot approaches.
We also propose PSP-entropy, a measure of how strongly a given neuron
correlates with specific learned classes. Interestingly, we observe that the
features extracted by iteratively pruned models are less correlated with
specific classes, potentially making these models a better fit for transfer
learning.
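The contrast between the two pruning schedules can be sketched as follows. This is a minimal illustration of one-shot versus gradual magnitude pruning on a NumPy weight matrix; the function names and the retraining-free setup are our own simplification, not the paper's exact procedure (in practice the network is fine-tuned between gradual steps):

```python
import numpy as np

def one_shot_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights in a single step."""
    w = weights.copy()
    k = int(sparsity * w.size)
    if k > 0:
        threshold = np.sort(np.abs(w), axis=None)[k - 1]
        w[np.abs(w) <= threshold] = 0.0
    return w

def gradual_prune(weights, sparsity, steps=5):
    """Reach the same target sparsity over several smaller steps.
    Retraining of the surviving weights between steps is omitted
    here to keep the sketch self-contained."""
    w = weights.copy()
    for step in range(1, steps + 1):
        current = sparsity * step / steps   # sparsity ramps up gradually
        k = int(current * w.size)
        if k > 0:
            threshold = np.sort(np.abs(w), axis=None)[k - 1]
            w[np.abs(w) <= threshold] = 0.0
        # ... fine-tuning of the remaining weights would happen here ...
    return w
```

Without the interleaved retraining both schedules remove the same weights; the abstract's point is that the retraining between gradual steps lets the optimizer settle into the narrow minima that a single one-shot cut skips over.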
Efficient Structure Slimming for Spiking Neural Networks
Spiking neural networks (SNNs) are deeply inspired by biological neural information systems. Compared to convolutional neural networks (CNNs), SNNs have low power consumption because of their spike-based information processing mechanism. However, most current SNN structures are fully connected or converted from deep CNNs, which introduces redundant connections, whereas the structure and topology of the human brain are sparse and efficient. This paper aims to take full advantage of the sparse structure and low power consumption found in the human brain and proposes efficient structure slimming methods. Inspired by the development of biological neural network structures, we design several structure slimming methods, including neuron pruning and channel pruning. Beyond pruning, we also consider the growth and development of the nervous system. Through iterative application of the proposed neural pruning and rewiring algorithms, experimental evaluations on the CIFAR-10, CIFAR-100, and DVS-Gesture datasets demonstrate the effectiveness of the structure slimming methods: when the parameter count is reduced to only about 10% of the original, the performance decreases by less than 1%.
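The prune-and-rewire idea can be illustrated with a minimal sketch in the spirit of sparse rewiring schemes: drop the weakest active connections, then regrow the same number elsewhere so overall sparsity is preserved. The function name and the drop/regrow policy below are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def prune_and_rewire(weights, mask, drop_frac=0.1, rng=None):
    """One prune/rewire step on a binary connectivity mask:
    remove the weakest active connections, then regrow the same
    number at random inactive positions (sparsity stays fixed)."""
    rng = np.random.default_rng() if rng is None else rng
    w = weights * mask
    active = np.flatnonzero(mask)
    n_drop = max(1, int(drop_frac * active.size))
    # prune: drop the active connections with the smallest magnitude
    order = np.argsort(np.abs(w.ravel()[active]))
    dropped = active[order[:n_drop]]
    new_mask = mask.ravel().copy()
    new_mask[dropped] = 0
    # rewire: regrow the same number of connections at empty sites
    empty = np.flatnonzero(new_mask == 0)
    grown = rng.choice(empty, size=n_drop, replace=False)
    new_mask[grown] = 1
    return new_mask.reshape(mask.shape)
```

Iterating this step mimics the abstract's combination of pruning with nervous-system-like growth: connectivity is continually reshaped rather than only removed.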
1xN Pattern for Pruning Convolutional Neural Networks
Though network pruning has gained popularity for reducing the complexity of
convolutional neural networks (CNNs), it remains an open issue to
simultaneously maintain model accuracy and achieve significant speedups on
general CPUs. In this paper, we propose a novel 1xN pruning pattern to break
this limitation.
limitation. In particular, consecutive N output kernels with the same input
channel index are grouped into one block, which serves as a basic pruning
granularity of our pruning pattern. Our 1xN pattern prunes these blocks
considered unimportant. We also provide a workflow of filter rearrangement that
first rearranges the weight matrix in the output channel dimension to derive
more influential blocks for accuracy improvements and then applies similar
rearrangement to the next-layer weights in the input channel dimension to
ensure correct convolutional operations. Moreover, the output computation after
our 1xN pruning can be realized via a parallelized block-wise vectorized
operation, leading to significant speedups on general CPUs. The efficacy of our
pruning pattern is demonstrated by experiments on ILSVRC-2012. For example,
given a pruning rate of 50% and N=4, our pattern obtains about 3.0% improvement
over filter pruning in the top-1 accuracy of MobileNet-V2. Meanwhile, it saves
56.04 ms of inference time on a Cortex-A7 CPU compared with weight pruning. Our
project is made available at https://github.com/lmbxmu/1xN.
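The block grouping can be sketched on a 2-D weight matrix (out_channels x in_channels), ignoring the spatial kernel dimensions and the filter-rearrangement workflow: N consecutive output weights that share one input channel form a block, and whole blocks with the smallest L1 norm are zeroed. The function name and the L1 block score are our illustrative assumptions, not necessarily the paper's exact criterion:

```python
import numpy as np

def prune_1xN(weight, n=4, prune_rate=0.5):
    """Zero out the 1xN blocks (N consecutive output weights per
    input channel) with the smallest L1 norm."""
    out_ch, in_ch = weight.shape
    assert out_ch % n == 0, "output channels must divide into 1xN blocks"
    w = weight.copy()
    # view as (groups, in_ch, n): the last axis is one 1xN block
    blocks = w.reshape(out_ch // n, n, in_ch).transpose(0, 2, 1)
    scores = np.abs(blocks).sum(axis=2)        # L1 norm per block
    k = int(prune_rate * scores.size)
    if k > 0:
        threshold = np.sort(scores, axis=None)[k - 1]
        blocks[scores <= threshold] = 0.0      # zero entire blocks
    return w
```

Because each zeroed region is a contiguous run of N weights sharing one input channel, the surviving computation can be expressed as dense Nx1 blocks, which is what enables the vectorized block-wise kernels and the CPU speedups the abstract reports.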