Implicit Filter Sparsification In Convolutional Neural Networks
We show that implicit filter-level sparsity manifests in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. Through an extensive empirical study (Mehta et al., 2019) we hypothesize the mechanism behind the sparsification process, and find surprising links to certain filter sparsification heuristics proposed in the literature. The emergence, and subsequent pruning, of selective features is observed to be one of the contributing mechanisms, leading to feature sparsity on par with or better than certain explicit sparsification / pruning approaches. In this workshop article we summarize our findings, and point out corollaries of selective-feature penalization which could also be employed as heuristics for filter pruning.
Comment: ODML-CDNNR 2019 (ICML'19 workshop) extended abstract of the CVPR 2019 paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks, Mehta et al." (arXiv:1811.12495)
CondenseNet: An Efficient DenseNet using Learned Group Convolutions
Deep neural networks are increasingly used on mobile devices, where
computational resources are limited. In this paper we develop CondenseNet, a
novel network architecture with unprecedented efficiency. It combines dense
connectivity with a novel module called learned group convolution. The dense
connectivity facilitates feature re-use in the network, whereas learned group
convolutions remove connections between layers for which this feature re-use is
superfluous. At test time, our model can be implemented using standard group
convolutions, allowing for efficient computation in practice. Our experiments
show that CondenseNets are far more efficient than state-of-the-art compact
convolutional networks such as MobileNets and ShuffleNets.
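The efficiency claim at test time rests on reducing the learned grouping to a standard group convolution. The following sketch is an assumed, simplified illustration of that reduction (the class name, the 1x1 kernel, and the given permutation are all illustrative, not CondenseNet's actual implementation): the channels each group kept during training are gathered into contiguous blocks, after which an ordinary grouped convolution does the work.

```python
# Illustrative sketch (not the CondenseNet code): at test time a learned
# group convolution can be realized as an index-select over input channels
# followed by a standard group convolution, which runs efficiently.
import torch
import torch.nn as nn

class TestTimeGroupConv(nn.Module):
    def __init__(self, in_ch, out_ch, groups, perm):
        super().__init__()
        # perm: channel permutation assumed to come from training, gathering
        # the channels each group retained into contiguous blocks.
        self.register_buffer("perm", perm)
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=1,
                              groups=groups, bias=False)

    def forward(self, x):
        x = x.index_select(1, self.perm)  # reorder channels per learned grouping
        return self.conv(x)

x = torch.randn(2, 8, 16, 16)
layer = TestTimeGroupConv(8, 16, groups=4, perm=torch.randperm(8))
print(layer(x).shape)  # torch.Size([2, 16, 16, 16])
```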
Implicit Discourse Relation Classification via Multi-Task Neural Networks
Without discourse connectives, classifying implicit discourse relations is a
challenging task and a bottleneck for building a practical discourse parser.
Previous research usually makes use of one kind of discourse framework such as
PDTB or RST to improve the classification performance on discourse relations.
In fact, multiple corpora with internal connections exist under different
discourse annotation frameworks. To exploit the combination of
different discourse corpora, we design related discourse classification tasks
specific to a corpus, and propose a novel Convolutional Neural Network embedded
multi-task learning system to synthesize these tasks by learning both unique
and shared representations for each task. The experimental results on the PDTB
implicit discourse relation classification task demonstrate that our model
achieves significant gains over baseline systems.
Comment: This is the pre-print version of a paper accepted by AAAI-16
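The key architectural idea in this abstract is the combination of shared and task-unique representations. The sketch below is an assumed, stripped-down rendering of that idea, not the authors' exact model: every corpus-specific task shares one convolutional encoder, keeps its own private encoder, and concatenates the two before its classifier head. All sizes and names are illustrative.

```python
# Assumed sketch of a shared + unique multi-task CNN (not the paper's model):
# one shared conv encoder for all tasks, plus a private encoder and
# classifier head per discourse corpus (e.g., PDTB, RST).
import torch
import torch.nn as nn

class MultiTaskDiscourseCNN(nn.Module):
    def __init__(self, vocab, emb=100, filters=128, n_classes=(4, 16)):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        enc = lambda: nn.Sequential(nn.Conv1d(emb, filters, 3, padding=1),
                                    nn.ReLU(), nn.AdaptiveMaxPool1d(1))
        self.shared = enc()                                   # used by every task
        self.unique = nn.ModuleList(enc() for _ in n_classes)  # one per task
        self.heads = nn.ModuleList(nn.Linear(2 * filters, c) for c in n_classes)

    def forward(self, tokens, task: int):
        h = self.embed(tokens).transpose(1, 2)  # (B, emb, T)
        feat = torch.cat([self.shared(h), self.unique[task](h)], dim=1).squeeze(-1)
        return self.heads[task](feat)            # logits for the chosen task

model = MultiTaskDiscourseCNN(vocab=10000)
logits = model(torch.randint(0, 10000, (8, 50)), task=0)  # a PDTB-style batch
print(logits.shape)  # torch.Size([8, 4])
```

Training would alternate batches across tasks so gradients from every corpus shape the shared encoder while each head only sees its own labels.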
Weightless: Lossy Weight Encoding For Deep Neural Network Compression
The large memory requirements of deep neural networks limit their deployment
and adoption on many devices. Model compression methods effectively reduce the
memory requirements of these models, usually through applying transformations
such as weight pruning or quantization. In this paper, we present a novel
scheme for lossy weight encoding which complements conventional compression
techniques. The encoding is based on the Bloomier filter, a probabilistic data
structure that can save space at the cost of introducing random errors.
Leveraging the ability of neural networks to tolerate these imperfections and
by re-training around the errors, the proposed technique, Weightless, can
compress DNN weights by up to 496x with the same model accuracy. This results
in up to a 1.51x improvement over the state-of-the-art.
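The premise the abstract leans on is that networks tolerate random weight errors and can be re-trained around them. The toy sketch below illustrates only that premise; it substitutes simple random corruption for the actual Bloomier-filter encoding, and the error rate and training setup are assumptions for demonstration.

```python
# Toy sketch of the error-tolerance premise behind Weightless (this is NOT
# the Bloomier-filter encoder): corrupt a small fraction of weights with
# random errors, then fine-tune the remaining weights around the frozen errors.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(32, 2)

# 1) Inject random errors into ~5% of weights (stand-in for lossy encoding).
with torch.no_grad():
    mask = torch.rand_like(net.weight) < 0.05
    net.weight[mask] = torch.randn(int(mask.sum()))

# 2) Re-train around the errors: the corrupted entries stay fixed, and the
#    rest of the network adapts to compensate for them.
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x, y = torch.randn(256, 32), torch.randint(0, 2, (256,))
for _ in range(100):
    opt.zero_grad()
    nn.functional.cross_entropy(net(x), y).backward()
    net.weight.grad[mask] = 0.0  # freeze the erroneous entries
    opt.step()
```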