Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM
Weight pruning and weight quantization are two important categories of DNN
model compression. Prior work on these techniques is mainly based on
heuristics. A recent work developed a systematic framework for DNN weight
pruning using the advanced optimization technique ADMM (Alternating Direction
Method of Multipliers), achieving some of the state-of-the-art weight pruning
results. In this work, we first extend this one-shot ADMM-based framework to
guarantee solution feasibility and provide a fast convergence rate, and
generalize it to weight quantization as well. We have further developed a
multi-step, progressive DNN weight pruning and quantization framework, with
dual benefits of (i) achieving further weight pruning/quantization thanks to
the special property of ADMM regularization, and (ii) reducing the search space
within each step. Extensive experimental results demonstrate the superior
performance compared with prior work. Some highlights: (i) we achieve 246x, 36x,
and 8x weight pruning on LeNet-5, AlexNet, and ResNet-50 models, respectively,
with (almost) zero accuracy loss; (ii) even a significant 61x weight pruning in
AlexNet (ImageNet) results in only minor degradation in actual accuracy
compared with prior work; (iii) we are among the first to derive notable weight
pruning results for ResNet and MobileNet models; (iv) we derive the first
lossless, fully binarized (for all layers) LeNet-5 for MNIST and VGG-16 for
CIFAR-10; and (v) we derive the first fully binarized (for all layers) ResNet
for ImageNet with reasonable accuracy loss.
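The ADMM-based pruning idea summarized above can be illustrated on a toy problem. The following is a minimal sketch, not the paper's implementation: it assumes a least-squares loss in place of a DNN training loss, and the names `project_topk` and `admm_prune` and the parameters `rho` and `iters` are hypothetical. The weights are split into a free variable `w` and a sparse copy `z`, and the two are pulled together by the dual variable `u`.

```python
import numpy as np

def project_topk(w, k):
    """Euclidean projection onto vectors with at most k nonzeros:
    keep the k largest-magnitude entries and zero the rest."""
    z = np.zeros_like(w)
    if k > 0:
        keep = np.argsort(np.abs(w))[-k:]
        z[keep] = w[keep]
    return z

def admm_prune(A, b, k, rho=1.0, iters=200):
    """Toy ADMM for: minimize 0.5*||A w - b||^2 subject to w having <= k nonzeros.
    Assumption: a quadratic loss stands in for the network loss, so the
    w-update is a linear solve instead of several SGD steps."""
    n = A.shape[1]
    z = np.zeros(n)  # sparse copy of the weights
    u = np.zeros(n)  # scaled dual variable
    lhs = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        # w-update: loss plus quadratic penalty pulling w toward z - u
        w = np.linalg.solve(lhs, Atb + rho * (z - u))
        # z-update: project onto the sparsity constraint
        z = project_topk(w + u, k)
        # dual update: accumulate the remaining disagreement
        u = u + w - z
    # Returning z (rather than w) means the result always satisfies the constraint.
    return z
```

Because the sparsity constraint is non-convex, this sketch carries no optimality guarantee; it only shows the alternating structure of the updates.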
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers
This paper presents a novel end-to-end methodology for enabling the
deployment of low-error deep networks on microcontrollers. To fit the memory
and computational limitations of resource-constrained edge-devices, we exploit
mixed low-bitwidth compression, featuring 8, 4 or 2-bit uniform quantization,
and we model the inference graph with integer-only operations. Our approach
aims at determining the minimum bit precision of every activation and weight
tensor given the memory constraints of a device. This is achieved through a
rule-based iterative procedure, which cuts the number of bits of the most
memory-demanding layers, aiming at meeting the memory constraints. After a
quantization-aware retraining step, the fake-quantized graph is converted into
an inference integer-only model by inserting the Integer Channel-Normalization
(ICN) layers, which introduce a negligible loss as demonstrated on INT4
MobilenetV1 models. We report the latency-accuracy evaluation of
mixed-precision MobilenetV1 family networks on a STM32H7 microcontroller. Our
experimental results demonstrate an end-to-end deployment of an integer-only
Mobilenet network with Top1 accuracy of 68% on a device with only 2MB of FLASH
memory and 512kB of RAM, improving by 8% the Top1 accuracy with respect to
previously published 8-bit implementations for microcontrollers.
Comment: Submitted to NeurIPS 201
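The rule-based iterative procedure described in this abstract can be sketched as a greedy loop. This is a simplified illustration under stated assumptions, not the authors' algorithm: it considers weight tensors only (the paper also handles activations), and the function name `assign_bits`, its parameters, and the layer sizes in the usage note are hypothetical.

```python
def assign_bits(layer_sizes, budget_bytes, levels=(8, 4, 2)):
    """Greedy bit-width assignment: start every layer at the highest precision,
    then repeatedly cut the precision of the layer with the largest memory
    footprint until the total fits in the budget.

    layer_sizes: number of weights per layer (hypothetical inputs)
    budget_bytes: available memory for weights
    levels: allowed bit widths, from highest to lowest
    """
    bits = [levels[0]] * len(layer_sizes)

    def footprint(i):
        return layer_sizes[i] * bits[i] / 8  # bytes used by layer i

    def total():
        return sum(footprint(i) for i in range(len(layer_sizes)))

    while total() > budget_bytes:
        # Only layers above the lowest allowed precision can still be cut.
        candidates = [i for i in range(len(layer_sizes)) if bits[i] > levels[-1]]
        if not candidates:
            raise ValueError("budget cannot be met even at the lowest precision")
        worst = max(candidates, key=footprint)
        bits[worst] = levels[levels.index(bits[worst]) + 1]
    return bits
```

For example, `assign_bits([1000, 4000, 2500], budget_bytes=4500)` first cuts the 4000-weight layer to 4 bits, then the 2500-weight layer to 4 bits, at which point the total of 4250 bytes fits the budget.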
Postprocessing of Compressed Images via Sequential Denoising
In this work we propose a novel postprocessing technique for
compression-artifact reduction. Our approach is based on posing this task as an
inverse problem, with a regularization that leverages existing
state-of-the-art image denoising algorithms. We rely on the recently proposed
Plug-and-Play Prior framework, suggesting the solution of general inverse
problems via Alternating Direction Method of Multipliers (ADMM), leading to a
sequence of Gaussian denoising steps. A key feature in our scheme is a
linearization of the compression-decompression process, so as to get a
formulation that can be optimized. In addition, we supply a thorough analysis
of this linear approximation for several basic compression procedures. The
proposed method is suitable for diverse compression techniques that rely on
transform coding. Specifically, we demonstrate impressive gains in image
quality for several leading compression methods: JPEG, JPEG2000, and HEVC.
Comment: Submitted to IEEE Transactions on Image Processing
Deep AutoEncoder-based Lossy Geometry Compression for Point Clouds
The point cloud is a fundamental 3D representation that is widely used in
real-world applications such as autonomous driving. As a newly developed media
format characterized by complexity and irregularity, the point cloud
creates a need for compression algorithms which are more flexible than existing
codecs. Recently, autoencoders (AEs) have shown their effectiveness in many
visual analysis tasks as well as image compression, which inspires us to employ
them in point cloud compression. In this paper, we propose a general
autoencoder-based architecture for lossy geometry point cloud compression. To
the best of our knowledge, it is the first autoencoder-based geometry
compression codec that directly takes point clouds as input rather than voxel
grids or collections of images. Compared with handcrafted codecs, this approach
adapts much more quickly to previously unseen media contents and media formats,
while achieving competitive performance. Our architecture consists of a
PointNet-based encoder, a uniform quantizer, an entropy estimation block and a
nonlinear synthesis transformation module. In lossy geometry compression of
point clouds, results show that the proposed method outperforms the test model
for categories 1 and 3 (TMC13) published by the MPEG-3DG group at the 125th
meeting, and on average a 73.15% BD-rate gain is achieved.
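Two of the components named in this abstract, the uniform quantizer and the entropy estimation block, can be illustrated in a few lines. This is a rough sketch under assumptions: the paper's entropy estimator is part of a trained model, whereas the stand-in below simply measures the empirical entropy of the quantized symbols, and the names `uniform_quantize`, `empirical_entropy_bits`, and `step` are hypothetical.

```python
import numpy as np

def uniform_quantize(latent, step=1.0):
    """Map each latent value to the index of the nearest multiple of `step`."""
    return np.round(np.asarray(latent) / step).astype(np.int64)

def empirical_entropy_bits(symbols):
    """Average bits per symbol under the empirical symbol distribution.
    Assumption: a histogram estimate stands in for a learned entropy model."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())
```

A larger `step` merges more latent values into the same symbol, which lowers the estimated bit rate at the cost of higher reconstruction error; this is the rate-distortion trade-off that a BD-rate figure summarizes.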
PruneNet: Channel Pruning via Global Importance
Channel pruning is one of the predominant approaches for accelerating deep
neural networks. Most existing pruning methods either train from scratch with a
sparsity inducing term such as group lasso, or prune redundant channels in a
pretrained network and then fine tune the network. Both strategies suffer from
some limitations: the use of group lasso is computationally expensive,
difficult to converge and often suffers from worse behavior due to the
regularization bias. The methods that start with a pretrained network either
prune channels uniformly across the layers or prune channels based on the basic
statistics of the network parameters. These approaches either ignore the fact
that some CNN layers are more redundant than others or fail to adequately
identify the level of redundancy in different layers. In this work, we
investigate a simple-yet-effective method for pruning channels based on a
computationally light-weight yet effective data driven optimization step that
discovers the necessary width per layer. Experiments conducted on ILSVRC-
confirm the effectiveness of our approach. With non-uniform pruning across the
layers on ResNet-, we are able to match the FLOP reduction of
state-of-the-art channel pruning results while achieving a higher
accuracy. Further, we show that our pruned ResNet- network outperforms
ResNet- and ResNet- networks, and that our pruned ResNet-
outperforms ResNet-.
Comment: 12 pages, 3 figures, published in ICLR 2020 NAS Workshop
Lossy Image Compression with Compressive Autoencoders
We propose a new approach to the problem of optimizing autoencoders for lossy
image compression. New media formats, changing hardware technology, as well as
diverse requirements and content types create a need for compression algorithms
which are more flexible than existing codecs. Autoencoders have the potential
to address this need, but are difficult to optimize directly due to the
inherent non-differentiability of the compression loss. We here show that
minimal changes to the loss are sufficient to train deep autoencoders
competitive with JPEG 2000 and outperforming recently proposed approaches based
on RNNs. Our network is furthermore computationally efficient thanks to a
sub-pixel architecture, which makes it suitable for high-resolution images.
This is in contrast to previous work on autoencoders for compression using
coarser approximations, shallower architectures, computationally expensive
methods, or focusing on small images.
Image Compression Based on Compressive Sensing: End-to-End Comparison with JPEG
We present an end-to-end image compression system based on compressive
sensing. The presented system integrates the conventional scheme of compressive
sampling and reconstruction with quantization and entropy coding. The
compression performance, in terms of decoded image quality versus data rate, is
shown to be comparable with JPEG and significantly better at the low rate
range. We study the parameters that influence the system performance, including
(i) the choice of sensing matrix, (ii) the trade-off between quantization and
compression ratio, and (iii) the reconstruction algorithms. We propose an
effective method to jointly control the quantization step and compression ratio
in order to achieve near optimal quality at any given bit rate. Furthermore,
our proposed image compression system can be directly used in the compressive
sensing camera, e.g. the single pixel camera, to construct a hardware
compressive sampling system.
Comment: 17 pages, 13 figures
An End-to-End Compression Framework Based on Convolutional Neural Networks
Deep learning, e.g., convolutional neural networks (CNNs), has achieved great
success in image processing and computer vision especially in high level vision
applications such as recognition and understanding. However, it is rarely used
to solve low-level vision problems such as image compression studied in this
paper. Here, we move forward a step and propose a novel compression framework
based on CNNs. To achieve high-quality image compression at low bit rates, two
CNNs are seamlessly integrated into an end-to-end compression framework. The
first CNN, named compact convolutional neural network (ComCNN), learns an
optimal compact representation from an input image, which preserves the
structural information and is then encoded using an image codec (e.g., JPEG,
JPEG2000 or BPG). The second CNN, named reconstruction convolutional neural
network (RecCNN), is used to reconstruct the decoded image with high quality at
the decoding end. To make the two CNNs collaborate effectively, we develop a
unified end-to-end learning algorithm to simultaneously learn ComCNN and
RecCNN, which facilitates the accurate reconstruction of the decoded image
using RecCNN. Such a design also makes the proposed compression framework
compatible with existing image coding standards. Experimental results validate
that the proposed compression framework greatly outperforms several compression
frameworks that use existing image coding standards with state-of-the-art
deblocking or denoising post-processing methods.
Comment: Submitted to IEEE Transactions on Circuits and Systems for Video
Technology
MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning
In this paper, we propose a novel meta learning approach for automatic
channel pruning of very deep neural networks. We first train a PruningNet, a
kind of meta network, which is able to generate weight parameters for any
pruned structure given the target network. We use a simple stochastic structure
sampling method for training the PruningNet. Then, we apply an evolutionary
procedure to search for good-performing pruned networks. The search is highly
efficient because the weights are directly generated by the trained PruningNet
and we do not need any finetuning at search time. With a single PruningNet
trained for the target network, we can search for various Pruned Networks under
different constraints with little human participation. Compared to the
state-of-the-art pruning methods, we have demonstrated superior performances on
MobileNet V1/V2 and ResNet. Code is available at
https://github.com/liuzechun/MetaPruning.
Comment: ICCV 2019 camera-ready version.
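The evolutionary search stage described in this abstract can be sketched in isolation. This is an illustrative toy under explicit assumptions, not the released MetaPruning code: in the paper, candidate fitness comes from evaluating a network whose weights are generated by the trained PruningNet, while here `fitness` is any caller-supplied function. The names `evolve_widths`, `toy_flops`, and all parameters are hypothetical.

```python
import random

def toy_flops(widths):
    """Hypothetical cost model: sum of products of adjacent layer widths."""
    return sum(a * b for a, b in zip(widths, widths[1:]))

def evolve_widths(max_widths, flops_budget, fitness, flops=toy_flops,
                  generations=20, pop_size=16, seed=0):
    """Search per-layer channel counts that maximize `fitness` subject to
    flops(widths) <= flops_budget. Assumes at least one feasible candidate."""
    rng = random.Random(seed)

    def sample():
        # Rejection-sample a random feasible candidate.
        while True:
            cand = tuple(rng.randint(1, m) for m in max_widths)
            if flops(cand) <= flops_budget:
                return cand

    def mutate(parent):
        # Resample the width of one randomly chosen layer.
        child = list(parent)
        i = rng.randrange(len(child))
        child[i] = rng.randint(1, max_widths[i])
        return tuple(child)

    population = [sample() for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half, refill with feasible mutations of survivors.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            child = mutate(rng.choice(parents))
            if flops(child) <= flops_budget:
                children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

Infeasible candidates are rejected before they enter the population, so every candidate whose fitness is evaluated already satisfies the constraint. No fine-tuning appears in the loop, which mirrors the abstract's point that the search needs none.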
Proceedings of Workshop AEW10: Concepts in Information Theory and Communications
The 10th Asia-Europe workshop in "Concepts in Information Theory and
Communications" AEW10 was held in Boppard, Germany on June 21-23, 2017. It is
based on a longstanding cooperation between Asian and European scientists. The
first workshop was held in Eindhoven, the Netherlands in 1989. The idea of the
workshop is threefold: 1) to improve the communication between scientists in
different parts of the world; 2) to exchange knowledge and ideas; and 3) to
pay tribute to a well-respected and special scientist.
Comment: 44 pages, editors for the proceedings: Yanling Chen and A. J. Han
Vinc