3,292 research outputs found

    Implicit Filter Sparsification In Convolutional Neural Networks

    Get PDF
    We show implicit filter level sparsity manifests in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. Through an extensive empirical study (Mehta et al., 2019) we hypothesize the mechanism behind the sparsification process, and find surprising links to certain filter sparsification heuristics proposed in literature. Emergence of, and the subsequent pruning of selective features is observed to be one of the contributing mechanisms, leading to feature sparsity at par or better than certain explicit sparsification / pruning approaches. In this workshop article we summarize our findings, and point out corollaries of selective-featurepenalization which could also be employed as heuristics for filter pruningComment: ODML-CDNNR 2019 (ICML'19 workshop) extended abstract of the CVPR 2019 paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks, Mehta et al." (arXiv:1811.12495

    Implicit Filter Sparsification In Convolutional Neural Networks

    No full text
    We show implicit filter level sparsity manifests in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. Through an extensive empirical study (Mehta et al., 2019) we hypothesize the mechanism behind the sparsification process, and find surprising links to certain filter sparsification heuristics proposed in literature. Emergence of, and the subsequent pruning of selective features is observed to be one of the contributing mechanisms, leading to feature sparsity at par or better than certain explicit sparsification / pruning approaches. In this workshop article we summarize our findings, and point out corollaries of selective-featurepenalization which could also be employed as heuristics for filter prunin

    CondenseNet: An Efficient DenseNet using Learned Group Convolutions

    Full text link
    Deep neural networks are increasingly used on mobile devices, where computational resources are limited. In this paper we develop CondenseNet, a novel network architecture with unprecedented efficiency. It combines dense connectivity with a novel module called learned group convolution. The dense connectivity facilitates feature re-use in the network, whereas learned group convolutions remove connections between layers for which this feature re-use is superfluous. At test time, our model can be implemented using standard group convolutions, allowing for efficient computation in practice. Our experiments show that CondenseNets are far more efficient than state-of-the-art compact convolutional networks such as MobileNets and ShuffleNets

    Implicit Discourse Relation Classification via Multi-Task Neural Networks

    Full text link
    Without discourse connectives, classifying implicit discourse relations is a challenging task and a bottleneck for building a practical discourse parser. Previous research usually makes use of one kind of discourse framework such as PDTB or RST to improve the classification performance on discourse relations. Actually, under different discourse annotation frameworks, there exist multiple corpora which have internal connections. To exploit the combination of different discourse corpora, we design related discourse classification tasks specific to a corpus, and propose a novel Convolutional Neural Network embedded multi-task learning system to synthesize these tasks by learning both unique and shared representations for each task. The experimental results on the PDTB implicit discourse relation classification task demonstrate that our model achieves significant gains over baseline systems.Comment: This is the pre-print version of a paper accepted by AAAI-1

    Weightless: Lossy Weight Encoding For Deep Neural Network Compression

    Get PDF
    The large memory requirements of deep neural networks limit their deployment and adoption on many devices. Model compression methods effectively reduce the memory requirements of these models, usually through applying transformations such as weight pruning or quantization. In this paper, we present a novel scheme for lossy weight encoding which complements conventional compression techniques. The encoding is based on the Bloomier filter, a probabilistic data structure that can save space at the cost of introducing random errors. Leveraging the ability of neural networks to tolerate these imperfections and by re-training around the errors, the proposed technique, Weightless, can compress DNN weights by up to 496x with the same model accuracy. This results in up to a 1.51x improvement over the state-of-the-art
    • …
    corecore