Implicit Filter Sparsification In Convolutional Neural Networks
We show that implicit filter-level sparsity manifests in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. Through an extensive empirical study (Mehta et al., 2019) we hypothesize the mechanism behind the sparsification process, and find surprising links to certain filter sparsification heuristics proposed in the literature. The emergence, and subsequent pruning, of selective features is observed to be one of the contributing mechanisms, leading to feature sparsity on par with or better than certain explicit sparsification / pruning approaches. In this workshop article we summarize our findings, and point out corollaries of selective-feature penalization which could also be employed as heuristics for filter pruning.
Comment: ODML-CDNNR 2019 (ICML'19 workshop) extended abstract of the CVPR 2019 paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks, Mehta et al." (arXiv:1811.12495)
CondenseNet: An Efficient DenseNet using Learned Group Convolutions
Deep neural networks are increasingly used on mobile devices, where
computational resources are limited. In this paper we develop CondenseNet, a
novel network architecture with unprecedented efficiency. It combines dense
connectivity with a novel module called learned group convolution. The dense
connectivity facilitates feature re-use in the network, whereas learned group
convolutions remove connections between layers for which this feature re-use is
superfluous. At test time, our model can be implemented using standard group
convolutions, allowing for efficient computation in practice. Our experiments
show that CondenseNets are far more efficient than state-of-the-art compact
convolutional networks such as MobileNets and ShuffleNets.
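The efficiency claim at test time rests on reducing the learned grouping to a standard group convolution. The following sketch is an assumed, simplified illustration of that reduction (the class name, the 1x1 kernel, and the given permutation are all illustrative, not CondenseNet's actual implementation): the channels each group kept during training are gathered into contiguous blocks, after which an ordinary grouped convolution does the work.

```python
# Illustrative sketch (not the CondenseNet code): at test time a learned
# group convolution can be realized as an index-select over input channels
# followed by a standard group convolution, which runs efficiently.
import torch
import torch.nn as nn

class TestTimeGroupConv(nn.Module):
    def __init__(self, in_ch, out_ch, groups, perm):
        super().__init__()
        # perm: channel permutation assumed to come from training, gathering
        # the channels each group retained into contiguous blocks.
        self.register_buffer("perm", perm)
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=1,
                              groups=groups, bias=False)

    def forward(self, x):
        x = x.index_select(1, self.perm)  # reorder channels per learned grouping
        return self.conv(x)

x = torch.randn(2, 8, 16, 16)
layer = TestTimeGroupConv(8, 16, groups=4, perm=torch.randperm(8))
print(layer(x).shape)  # torch.Size([2, 16, 16, 16])
```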
Implicit Discourse Relation Classification via Multi-Task Neural Networks
Without discourse connectives, classifying implicit discourse relations is a
challenging task and a bottleneck for building a practical discourse parser.
Previous research usually makes use of one kind of discourse framework such as
PDTB or RST to improve the classification performance on discourse relations.
In fact, multiple corpora with internal connections exist under different
discourse annotation frameworks. To exploit the combination of
different discourse corpora, we design related discourse classification tasks
specific to a corpus, and propose a novel Convolutional Neural Network embedded
multi-task learning system to synthesize these tasks by learning both unique
and shared representations for each task. The experimental results on the PDTB
implicit discourse relation classification task demonstrate that our model
achieves significant gains over baseline systems.
Comment: This is the pre-print version of a paper accepted by AAAI-16
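The key architectural idea in this abstract is the combination of shared and task-unique representations. The sketch below is an assumed, stripped-down rendering of that idea, not the authors' exact model: every corpus-specific task shares one convolutional encoder, keeps its own private encoder, and concatenates the two before its classifier head. All sizes and names are illustrative.

```python
# Assumed sketch of a shared + unique multi-task CNN (not the paper's model):
# one shared conv encoder for all tasks, plus a private encoder and
# classifier head per discourse corpus (e.g., PDTB, RST).
import torch
import torch.nn as nn

class MultiTaskDiscourseCNN(nn.Module):
    def __init__(self, vocab, emb=100, filters=128, n_classes=(4, 16)):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        enc = lambda: nn.Sequential(nn.Conv1d(emb, filters, 3, padding=1),
                                    nn.ReLU(), nn.AdaptiveMaxPool1d(1))
        self.shared = enc()                                   # used by every task
        self.unique = nn.ModuleList(enc() for _ in n_classes)  # one per task
        self.heads = nn.ModuleList(nn.Linear(2 * filters, c) for c in n_classes)

    def forward(self, tokens, task: int):
        h = self.embed(tokens).transpose(1, 2)  # (B, emb, T)
        feat = torch.cat([self.shared(h), self.unique[task](h)], dim=1).squeeze(-1)
        return self.heads[task](feat)            # logits for the chosen task

model = MultiTaskDiscourseCNN(vocab=10000)
logits = model(torch.randint(0, 10000, (8, 50)), task=0)  # a PDTB-style batch
print(logits.shape)  # torch.Size([8, 4])
```

Training would alternate batches across tasks so gradients from every corpus shape the shared encoder while each head only sees its own labels.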
Weightless: Lossy Weight Encoding For Deep Neural Network Compression
The large memory requirements of deep neural networks limit their deployment
and adoption on many devices. Model compression methods effectively reduce the
memory requirements of these models, usually through applying transformations
such as weight pruning or quantization. In this paper, we present a novel
scheme for lossy weight encoding which complements conventional compression
techniques. The encoding is based on the Bloomier filter, a probabilistic data
structure that can save space at the cost of introducing random errors.
Leveraging the ability of neural networks to tolerate these imperfections and
by re-training around the errors, the proposed technique, Weightless, can
compress DNN weights by up to 496x with the same model accuracy. This results
in up to a 1.51x improvement over the state-of-the-art.
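The premise the abstract leans on is that networks tolerate random weight errors and can be re-trained around them. The toy sketch below illustrates only that premise; it substitutes simple random corruption for the actual Bloomier-filter encoding, and the error rate and training setup are assumptions for demonstration.

```python
# Toy sketch of the error-tolerance premise behind Weightless (this is NOT
# the Bloomier-filter encoder): corrupt a small fraction of weights with
# random errors, then fine-tune the remaining weights around the frozen errors.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(32, 2)

# 1) Inject random errors into ~5% of weights (stand-in for lossy encoding).
with torch.no_grad():
    mask = torch.rand_like(net.weight) < 0.05
    net.weight[mask] = torch.randn(int(mask.sum()))

# 2) Re-train around the errors: the corrupted entries stay fixed, and the
#    rest of the network adapts to compensate for them.
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x, y = torch.randn(256, 32), torch.randint(0, 2, (256,))
for _ in range(100):
    opt.zero_grad()
    nn.functional.cross_entropy(net(x), y).backward()
    net.weight.grad[mask] = 0.0  # freeze the erroneous entries
    opt.step()
```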