Generalized Zero-Shot Learning via Synthesized Examples
We present a generative framework for generalized zero-shot learning where
the training and test classes are not necessarily disjoint. Built upon a
variational autoencoder based architecture, consisting of a probabilistic
encoder and a probabilistic conditional decoder, our model can generate novel
exemplars from seen/unseen classes, given their respective class attributes.
These exemplars can subsequently be used to train any off-the-shelf
classification model. One of the key aspects of our encoder-decoder
architecture is a feedback-driven mechanism in which a discriminator (a
multivariate regressor) learns to map the generated exemplars to the
corresponding class attribute vectors, leading to an improved generator. Our
model's ability to generate and leverage examples from unseen classes to train
the classification model naturally helps to mitigate the bias towards
predicting seen classes in generalized zero-shot learning settings. Through a
comprehensive set of experiments, we show that our model outperforms several
state-of-the-art methods, on several benchmark datasets, for both standard as
well as generalized zero-shot learning.
Comment: Accepted in CVPR'18
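To make the encoder-decoder idea concrete, here is a minimal PyTorch sketch of a conditional VAE with an attribute-regressor feedback head. All layer sizes and dimensions are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal conditional-VAE sketch (illustrative dims, not the paper's exact model).
import torch
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self, feat_dim=2048, attr_dim=85, z_dim=64):
        super().__init__()
        # Probabilistic encoder q(z | x, a)
        self.enc = nn.Sequential(nn.Linear(feat_dim + attr_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, z_dim)
        self.logvar = nn.Linear(512, z_dim)
        # Probabilistic conditional decoder p(x | z, a)
        self.dec = nn.Sequential(nn.Linear(z_dim + attr_dim, 512), nn.ReLU(),
                                 nn.Linear(512, feat_dim))
        # Feedback discriminator: regress class attributes from generated exemplars
        self.reg = nn.Linear(feat_dim, attr_dim)

    def forward(self, x, a):
        h = self.enc(torch.cat([x, a], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        x_hat = self.dec(torch.cat([z, a], dim=1))
        return x_hat, self.reg(x_hat), mu, logvar

# For an unseen class, sample z ~ N(0, I) and decode with that class's attribute
# vector; the synthesized exemplars then train any off-the-shelf classifier.
```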
HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs
We present a novel deep learning architecture in which the convolution
operation leverages heterogeneous kernels. The proposed HetConv (Heterogeneous
Kernel-Based Convolution) reduces the computation (FLOPs) and the number of
parameters compared to the standard convolution operation while still
maintaining representational efficiency. To show the effectiveness of our
proposed convolution, we present extensive experimental results on the standard
convolutional neural network (CNN) architectures such as VGG and ResNet. We
find that after replacing the standard
convolutional filters in these architectures with our proposed HetConv filters,
we achieve a 3X to 8X reduction in FLOPs, with a corresponding improvement in
speed, while still maintaining (and sometimes improving) the accuracy. We also
compare our proposed convolution with group-wise and depth-wise convolutions
and show that it achieves a greater FLOPs reduction with significantly higher
accuracy.
Comment: Accepted in CVPR 2019
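As a rough sketch of the heterogeneous-kernel idea: each output filter applies 3x3 kernels to a 1/P fraction of the input channels and cheap 1x1 kernels to the rest. The paper shifts the 3x3 subset across filters; the sketch below uses one fixed channel split for brevity, so it approximates rather than reproduces the exact HetConv filter.

```python
import torch
import torch.nn as nn

class HetConvSketch(nn.Module):
    """Approximate HetConv filter: 3x3 kernels on in_ch // p input channels,
    1x1 kernels on the remainder, summed per output filter."""
    def __init__(self, in_ch, out_ch, p=4):
        super().__init__()
        self.k3 = in_ch // p
        self.conv3 = nn.Conv2d(self.k3, out_ch, 3, padding=1, bias=False)
        self.conv1 = nn.Conv2d(in_ch - self.k3, out_ch, 1, bias=False)

    def forward(self, x):
        # Costly 3x3 part on a channel subset + cheap 1x1 part on the rest
        return self.conv3(x[:, :self.k3]) + self.conv1(x[:, self.k3:])

x = torch.randn(1, 64, 32, 32)
print(HetConvSketch(64, 128, p=4)(x).shape)  # torch.Size([1, 128, 32, 32])
```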
Play and Prune: Adaptive Filter Pruning for Deep Model Compression
While convolutional neural networks (CNN) have achieved impressive
performance on various classification/recognition tasks, they typically consist
of a massive number of parameters. This results in significant memory
requirement as well as computational overheads. Consequently, there is a
growing need for filter-level pruning approaches for compressing CNN based
models that not only reduce the total number of parameters but also reduce the
overall computation. We present a new min-max framework for
filter-level pruning of CNNs. Our framework, called Play and Prune (PP),
jointly prunes and fine-tunes CNN model parameters, with an adaptive pruning
rate, while maintaining the model's predictive performance. Our framework
consists of two modules: (1) An adaptive filter pruning (AFP) module, which
minimizes the number of filters in the model; and (2) A pruning rate controller
(PRC) module, which maximizes the accuracy during pruning. Moreover, unlike
most previous approaches, our approach allows directly specifying the desired
error tolerance instead of the pruning level. Our compressed models can be deployed
at run-time, without requiring any special libraries or hardware. Our approach
reduces the number of parameters of VGG-16 by an impressive factor of 17.5X,
and the number of FLOPs by 6.43X, with no loss of accuracy, significantly
outperforming other state-of-the-art filter pruning methods.
Comment: International Joint Conference on Artificial Intelligence (IJCAI-2019)
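A schematic of the two modules might look as follows: `afp_step` soft-prunes (zeroes) the lowest-L1-norm filters, and `prc_update` adapts the pruning rate from the observed accuracy drop against the user's error tolerance. The masking-based pruning, the thresholds, and the rate-update rule are all simplifying assumptions, not the paper's exact min-max optimization.

```python
import torch
import torch.nn as nn

def afp_step(model, rate):
    """AFP sketch: zero out the `rate` fraction of conv filters with the
    smallest L1 norms (soft pruning via masking)."""
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            norms = m.weight.detach().abs().sum(dim=(1, 2, 3))  # per-filter L1
            k = int(rate * norms.numel())
            if k > 0:
                with torch.no_grad():
                    m.weight[norms.argsort()[:k]] = 0.0

def prc_update(rate, acc_drop, tolerance):
    """PRC sketch: adapt the pruning rate from the accuracy drop vs. the
    user-specified error tolerance (the update rule here is an assumption)."""
    if acc_drop > tolerance:
        return 0.5 * rate              # over budget: back off
    if acc_drop < 0.5 * tolerance:
        return min(2.0 * rate, 0.2)    # comfortably within budget: prune faster
    return rate
```

In use, one would alternate `afp_step`, a few fine-tuning epochs, and `prc_update` until the accuracy drop approaches the specified tolerance.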
A "network pruning network" Approach to deep model compression
We present a filter pruning approach for deep model compression, using a
multitask network. Our approach is based on learning a pruner network to
prune a pre-trained target network. The pruner is essentially a multitask deep
neural network with binary outputs that help identify the filters from each
layer of the original network that do not have any significant contribution to
the model and can therefore be pruned. The pruner network has the same
architecture as the original network except that it has a
multitask/multi-output last layer containing binary-valued outputs (one per
filter), which indicate which filters have to be pruned. The pruner's goal is
to minimize the number of filters from the original network by assigning zero
weights to the corresponding output feature-maps. In contrast to most of the
existing methods, instead of relying on iterative pruning, our approach can
prune the original network in one go and, moreover, does not require
specifying the degree of pruning for each layer (it learns this instead). The
compressed model produced by our approach is generic and does not need any
special hardware/software support. Moreover, augmenting our approach with
other methods such as knowledge distillation, quantization, and connection
pruning can further increase the degree of compression. We show the efficacy of
our proposed approach for classification and object detection tasks.
Comment: Accepted in WACV'20
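The paper's pruner mirrors the target network's architecture; the sketch below isolates just the gating idea: binary per-filter outputs that zero whole output feature maps. The straight-through estimator used here to train the binary gates is our assumption, not necessarily the paper's exact training mechanism.

```python
import torch
import torch.nn as nn

class FilterGate(nn.Module):
    """Per-filter binary gate: a stand-in for one pruner output head that
    zeroes the feature maps of filters selected for removal."""
    def __init__(self, n_filters):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_filters))

    def forward(self, x):                       # x: (N, C, H, W)
        soft = torch.sigmoid(self.logits)
        hard = (soft > 0.5).float()             # binary prune/keep decision
        gate = hard + soft - soft.detach()      # straight-through gradients
        return x * gate.view(1, -1, 1, 1)       # zero pruned feature maps
```

One such gate would sit after each convolutional layer; filters whose gates settle at zero can then be physically removed in one go.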
Leveraging Filter Correlations for Deep Model Compression
We present a filter correlation based model compression approach for deep
convolutional neural networks. Our approach iteratively identifies pairs of
filters with the largest pairwise correlations and drops one filter from each
such pair. However, instead of discarding a filter naïvely, the model is first
re-optimized to make the filters in each pair maximally correlated, so that
dropping one of them results in minimal information loss. Moreover, after discarding the
filters in each round, we further finetune the model to recover from the
potential small loss incurred by the compression. We evaluate our proposed
approach using a comprehensive set of experiments and ablation studies. Our
compression method yields state-of-the-art FLOPs compression rates on various
benchmarks, such as LeNet-5, VGG-16, ResNet-50, and ResNet-56, while still
achieving excellent predictive performance for tasks such as object detection
on benchmark datasets.
Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 2020
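For the selection step, here is a minimal sketch of finding the most-correlated filter pair in a convolutional layer; the re-optimization that maximizes the pair's correlation before discarding, and the subsequent fine-tuning, are omitted.

```python
import torch
import torch.nn as nn

def most_correlated_pair(conv: nn.Conv2d):
    """Return indices (i, j) of the two filters with the largest absolute
    pairwise correlation; one of the two would then be dropped."""
    w = conv.weight.detach().flatten(1)   # (out_channels, in_ch * k * k)
    c = torch.corrcoef(w).abs()           # pairwise filter correlations
    c.fill_diagonal_(0.0)                 # ignore self-correlation
    flat = int(torch.argmax(c))
    return divmod(flat, c.shape[1])

conv = nn.Conv2d(16, 32, 3)
print(most_correlated_pair(conv))         # e.g. (7, 21)
```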