Learning Convolutional Networks for Content-weighted Image Compression
Lossy image compression is generally formulated as a joint rate-distortion
optimization to learn encoder, quantizer, and decoder. However, the quantizer
is non-differentiable, and discrete entropy estimation usually is required for
rate control. These make it very challenging to develop a convolutional network
(CNN)-based image compression system. In this paper, motivated by the
observation that local information content is spatially variant in an image,
we suggest that the bit rate of the different parts of the image should be
adapted to local content, with the content-aware bit rate allocated under the
guidance of a content-weighted importance map. Thus, the sum of the importance
map can serve as a continuous alternative to discrete entropy estimation for
controlling the compression rate. A binarizer is adopted to quantize the output
of the encoder, since the binarization scheme is also directly defined by the
importance map. Furthermore, a proxy function is introduced for the binary
operation in backward
propagation to make it differentiable. Therefore, the encoder, decoder,
binarizer and importance map can be jointly optimized in an end-to-end manner
by using a subset of the ImageNet database. In low bit rate image compression,
experiments show that our system significantly outperforms JPEG and JPEG 2000
in terms of the structural similarity (SSIM) index, and produces much better
visual results with sharp edges, rich textures, and fewer artifacts.
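
The abstract above names three mechanisms: a binarizer with a proxy function in the backward pass, an importance map that decides how many bits each location keeps, and the sum of that map as a continuous stand-in for the rate. The following is a minimal plain-Python sketch of how these could fit together. The 0.5 threshold, the clipped-identity proxy gradient, the level-based quantization of the importance value, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math

def binarize(e):
    """Forward pass: hard threshold of an encoder output e (assumed in [0, 1])."""
    return 1 if e > 0.5 else 0

def binarize_proxy_grad(e):
    """Backward pass (assumed proxy): gradient of a clipped identity,
    i.e. 1 inside [0, 1] and 0 outside, instead of the true zero gradient."""
    return 1.0 if 0.0 <= e <= 1.0 else 0.0

def quantize_importance(p, levels):
    """Map an importance value p in [0, 1] to an integer level in 0..levels."""
    return min(levels, math.floor(p * levels))

def channel_mask(p, n_channels, levels):
    """Keep the first n_channels * level / levels channels at this location."""
    kept = n_channels * quantize_importance(p, levels) // levels
    return [1 if k < kept else 0 for k in range(n_channels)]

def rate_proxy(importance_map):
    """Continuous rate surrogate: the sum of the importance map."""
    return sum(sum(row) for row in importance_map)

def encode_location(features, p, levels):
    """Binarize one location's features, then zero the masked-out channels."""
    mask = channel_mask(p, len(features), levels)
    return [binarize(e) * m for e, m in zip(features, mask)]

# Hypothetical 2x2 importance map and one location's 8-channel encoder output.
importance_map = [[0.1, 0.9], [0.5, 1.0]]
features = [0.9, 0.2, 0.7, 0.6, 0.1, 0.8, 0.95, 0.4]
code = encode_location(features, importance_map[0][1], levels=4)
```

In this sketch a location with importance 0.9 keeps 6 of its 8 channels while one with importance 0.1 keeps none, so lowering the summed importance lowers the number of transmitted bits.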
Full Resolution Image Compression with Recurrent Neural Networks
This paper presents a set of full-resolution lossy image compression methods
based on neural networks. Each of the architectures we describe can provide
variable compression rates during deployment without requiring retraining of
the network: each network need only be trained once. All of our architectures
consist of a recurrent neural network (RNN)-based encoder and decoder, a
binarizer, and a neural network for entropy coding. We compare RNN types (LSTM,
associative LSTM) and introduce a new hybrid of GRU and ResNet. We also study
"one-shot" versus additive reconstruction architectures and introduce a new
scaled-additive framework. We compare to previous work, showing improvements of
4.3%-8.8% AUC (area under the rate-distortion curve), depending on the
perceptual metric used. As far as we know, this is the first neural network
architecture that is able to outperform JPEG at image compression across most
bitrates on the rate-distortion curve on the Kodak dataset images, with and
without the aid of entropy coding.
Comment: Updated with content for CVPR and moved supplemental material to an
external link due to size limitations
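
The variable-rate property described in the abstract above comes from coding progressively: each iteration emits more bits and refines the reconstruction, so the rate is set by how many iterations are run. The sketch below illustrates additive reconstruction with a toy codec, not the paper's RNN: the "encoder" sends one sign bit per sample and the "decoder" adds a step that halves every iteration. The trapezoid-rule helper for the area under a curve is likewise an assumed way of summarizing a rate-quality curve. All names here are hypothetical.

```python
def encode_residual(residual):
    """Toy stand-in for encoder + binarizer: one sign bit per sample."""
    return [1 if r >= 0 else 0 for r in residual]

def decode_bits(bits, step):
    """Toy stand-in for the decoder: map each bit to +step or -step."""
    return [step if b else -step for b in bits]

def progressive_code(signal, iterations, first_step=0.5):
    """Additive reconstruction: each iteration codes the current residual
    and adds the decoded update to the running reconstruction."""
    recon = [0.0] * len(signal)
    residual = list(signal)
    curve = []  # (bits per sample, mean squared error) after each iteration
    step = first_step
    for t in range(1, iterations + 1):
        bits = encode_residual(residual)
        update = decode_bits(bits, step)
        recon = [x + u for x, u in zip(recon, update)]
        residual = [s - x for s, x in zip(signal, recon)]
        mse = sum(r * r for r in residual) / len(signal)
        curve.append((t, mse))
        step /= 2
    return recon, curve

def area_under_curve(points):
    """Trapezoid-rule area under a list of (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Samples are assumed to lie in [-1, 1].
signal = [0.3, -0.8, 0.05, 0.6]
recon, curve = progressive_code(signal, iterations=6)
```

For inputs in [-1, 1], the per-sample error after t iterations of this toy scheme is at most 0.5 to the power t, so stopping earlier gives a lower rate with a coarser reconstruction and no retraining is involved.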