34 research outputs found

    Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima

    Full text link
    Recently, a race toward the simplification of deep networks has begun, showing that it is indeed possible to reduce the size of these models with minimal or no performance loss. However, there is a general lack of understanding of why these pruning strategies are effective. In this work, we compare and analyze pruned solutions obtained with two different pruning approaches, one-shot and gradual, showing the higher effectiveness of the latter. In particular, we find that gradual pruning allows access to narrow, well-generalizing minima, which are typically ignored when using one-shot approaches. We also propose PSP-entropy, a measure of how strongly a given neuron correlates with specific learned classes. Interestingly, we observe that the features extracted by iteratively pruned models are less correlated with specific classes, potentially making these models a better fit for transfer learning approaches.
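
    The contrast between the two schedules can be sketched with simple magnitude-based masking. The sketch below is not the paper's code; train_one_epoch is a hypothetical stand-in for the user's training loop, and the quantile threshold and schedule lengths are illustrative assumptions.

    import torch

    def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
        """Keep the largest-magnitude weights, zero out the rest."""
        threshold = torch.quantile(weight.abs().flatten(), sparsity)
        return (weight.abs() > threshold).float()

    def one_shot_prune(model, sparsity, train_one_epoch, epochs=10):
        # One-shot: prune straight to the target sparsity, then fine-tune.
        masks = {n: magnitude_mask(p.data, sparsity)
                 for n, p in model.named_parameters() if p.dim() > 1}
        for _ in range(epochs):
            train_one_epoch(model)
            for n, p in model.named_parameters():
                if n in masks:
                    p.data.mul_(masks[n])  # keep pruned weights at zero

    def gradual_prune(model, final_sparsity, train_one_epoch, steps=5, epochs_per_step=2):
        # Gradual: raise sparsity in small increments, retraining after each increase.
        for step in range(1, steps + 1):
            sparsity = final_sparsity * step / steps
            masks = {n: magnitude_mask(p.data, sparsity)
                     for n, p in model.named_parameters() if p.dim() > 1}
            for _ in range(epochs_per_step):
                train_one_epoch(model)
                for n, p in model.named_parameters():
                    if n in masks:
                        p.data.mul_(masks[n])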

    Efficient Structure Slimming for Spiking Neural Networks

    Get PDF
    Spiking neural networks (SNNs) are deeply inspired by biological neural information processing systems. Compared to convolutional neural networks (CNNs), SNNs consume little power because of their spike-based information processing mechanism. However, most current SNN structures are fully connected or converted from deep CNNs, which introduces redundant connections, whereas the structure and topology of the human brain are sparse and efficient. This paper aims to take full advantage of the sparse structure and low power consumption found in the human brain and proposes efficient structure slimming methods. Inspired by the development of biological neural network structures, it designs several structure slimming methods, including neuron pruning and channel pruning. In addition to pruning, it also considers the growth and development of the nervous system. Experimental evaluations on CIFAR-10, CIFAR-100, and DVS-Gesture show that iteratively applying the proposed neuron pruning and rewiring algorithms is effective: when the parameter count is reduced to only about 10% of the original, performance decreases by less than 1%.
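
    A minimal sketch of one prune-and-rewire step over a channel mask is given below; the L1-norm importance score and the random regrowth rule are assumptions for illustration, not necessarily the paper's exact criteria.

    import torch

    def prune_and_rewire(conv_weight: torch.Tensor,
                         channel_mask: torch.Tensor,
                         prune_frac: float = 0.1,
                         regrow_frac: float = 0.05) -> torch.Tensor:
        """conv_weight: (out_channels, in_channels, k, k); channel_mask: (out_channels,)."""
        # Assumed importance: L1 norm of each output channel, masked by current activity.
        importance = conv_weight.abs().sum(dim=(1, 2, 3)) * channel_mask
        active = channel_mask.nonzero(as_tuple=True)[0]
        # Prune: deactivate the least-important fraction of currently active channels.
        n_prune = int(prune_frac * active.numel())
        if n_prune > 0:
            weakest = active[importance[active].argsort()[:n_prune]]
            channel_mask[weakest] = 0.0
        # Rewire ("growth"): randomly re-activate a fraction of dormant channels.
        dormant = (channel_mask == 0).nonzero(as_tuple=True)[0]
        n_grow = int(regrow_frac * dormant.numel())
        if n_grow > 0:
            reborn = dormant[torch.randperm(dormant.numel())[:n_grow]]
            channel_mask[reborn] = 1.0
        return channel_mask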

    1xN Pattern for Pruning Convolutional Neural Networks

    Full text link
    Though network pruning has become popular for reducing the complexity of convolutional neural networks (CNNs), it remains an open issue to maintain model accuracy while achieving significant speedups on general CPUs. In this paper, we propose a novel 1xN pruning pattern to break this limitation. In particular, consecutive N output kernels with the same input channel index are grouped into one block, which serves as the basic pruning granularity of our pattern; blocks considered unimportant are pruned. We also provide a filter-rearrangement workflow that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for better accuracy, and then applies a similar rearrangement to the next layer's weights in the input channel dimension to ensure correct convolutional operations. Moreover, the output computation after 1xN pruning can be realized via a parallelized block-wise vectorized operation, leading to significant speedups on general CPUs. The efficacy of our pruning pattern is demonstrated with experiments on ILSVRC-2012. For example, given a pruning rate of 50% and N=4, our pattern obtains about a 3.0% top-1 accuracy improvement over filter pruning on MobileNet-V2, while saving 56.04 ms of inference time on a Cortex-A7 CPU compared to weight pruning. Our project is made available at https://github.com/lmbxmu/1xN
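
    A minimal sketch of the 1xN grouping described in the abstract, assuming an L1-norm block importance and omitting the filter-rearrangement step; see the linked repository for the authors' implementation.

    import torch

    def one_by_n_prune(weight: torch.Tensor, n: int, prune_rate: float) -> torch.Tensor:
        """weight: (C_out, C_in, k, k) with C_out divisible by n."""
        c_out, c_in, kh, kw = weight.shape
        # Group consecutive n output kernels that share an input channel into one block.
        blocks = weight.reshape(c_out // n, n, c_in, kh, kw)
        # One importance score per (output-block, input-channel) pair (assumed: L1 norm).
        scores = blocks.abs().sum(dim=(1, 3, 4))          # (C_out // n, C_in)
        k = int(prune_rate * scores.numel())
        threshold = scores.flatten().kthvalue(k).values if k > 0 else scores.min() - 1
        keep = (scores > threshold).float()               # (C_out // n, C_in)
        # Broadcast the block mask back to the full weight shape and zero pruned blocks.
        mask = keep[:, None, :, None, None].expand_as(blocks).reshape_as(weight)
        return weight * mask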