Non-local Attention Optimized Deep Image Compression
This paper proposes a novel Non-Local Attention Optimized Deep Image
Compression (NLAIC) framework, built on top of the popular variational
auto-encoder (VAE) structure. Our NLAIC framework embeds non-local operations
in the encoders and decoders for both the image and the latent-feature
probability information (known as the hyperprior) to capture both local and
global correlations, and applies an attention mechanism to generate masks that
weigh the features of the image and the hyperprior, implicitly adapting the bit
allocation across features according to their importance. Furthermore, both the
hyperpriors and the spatial-channel neighbors of the latent features are used
to improve entropy coding. The proposed model outperforms existing methods on
the Kodak dataset, including learned (e.g., Balle2019, Balle2018) and
conventional (e.g., BPG, JPEG2000, JPEG) image compression methods, in both
PSNR and MS-SSIM distortion metrics.
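The masking idea above can be illustrated with a minimal sketch. This is not the paper's network; it is a toy numpy version, assuming dot-product similarity as the non-local operation and a sigmoid as the mask: each position aggregates information from all positions, and the resulting mask re-weights features, which is how the bit allocation is implicitly adapted.

```python
import numpy as np

def nonlocal_attention_mask(features):
    """Toy non-local attention masking: each position attends to all
    others, and a sigmoid of the aggregated response yields a (0, 1)
    mask that re-weights features (implicit bit allocation)."""
    sim = features @ features.T                 # pairwise similarity across all positions
    attn = np.exp(sim - sim.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)     # softmax: non-local attention weights
    response = attn @ features                  # globally aggregated response
    mask = 1.0 / (1.0 + np.exp(-response))      # sigmoid mask in (0, 1)
    return features * mask                      # element-wise re-weighting
```

Because the mask lies strictly in (0, 1), the operation can only attenuate feature magnitudes; unimportant features are driven toward zero and thus cost fewer bits after quantization and entropy coding.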
Learning Accurate Entropy Model with Global Reference for Image Compression
In recent deep image compression neural networks, the entropy model plays a
critical role in estimating the prior distribution of deep image encodings.
Existing methods combine the hyperprior with local context in the entropy
estimation function, which greatly limits their performance due to the absence
of a global vision. In this work, we propose a novel Global Reference Model for
image compression that effectively leverages both local and global context
information, leading to an enhanced compression rate. The proposed method scans
the decoded latents and finds the most relevant latent to assist the
distribution estimation of the current latent. A by-product of this work is a
novel mean-shifting GDN module that further improves performance. Experimental
results demonstrate that the proposed model outperforms the rate-distortion
performance of most state-of-the-art methods in the industry.
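The reference-scanning step can be sketched in a few lines. This is an assumption-laden simplification (the abstract does not specify the similarity measure; cosine similarity is used here for illustration): among the already-decoded latents, pick the one most similar to the current latent to serve as the global reference for its distribution estimate.

```python
import numpy as np

def most_relevant_latent(decoded, current):
    """Scan already-decoded latent vectors and return the index of the
    most similar one (cosine similarity, a hypothetical choice), to act
    as a global reference when estimating the current latent's
    distribution."""
    sims = decoded @ current / (
        np.linalg.norm(decoded, axis=1) * np.linalg.norm(current) + 1e-12
    )                                   # cosine similarity to each decoded latent
    return int(np.argmax(sims))         # index of the best global reference
```

Only already-decoded latents are scanned so that the decoder can reproduce the same choice without side information.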
Selective compression learning of latent representations for variable-rate image compression
Recently, many neural network-based image compression methods have shown
promising results, outperforming existing tool-based conventional codecs.
However, most of them are trained as separate models for different target bit
rates, which increases model complexity. Several studies have therefore
investigated learned compression that supports variable rates with a single
model, but they require additional network modules, layers, or inputs that
often incur a complexity overhead, or they do not provide sufficient coding
efficiency. In this paper, we propose a selective compression method that
partially encodes the latent representations in a fully generalized manner for
deep learning-based variable-rate image compression. The proposed method
adaptively determines the essential representation elements for compression at
different target quality levels. For this, we first generate a 3D importance
map reflecting the nature of the input content, representing the underlying
importance of the representation elements. The 3D importance map is then
adjusted for different target quality levels using importance adjustment
curves. The adjusted 3D importance map is finally converted into a 3D binary
mask that determines the essential representation elements for compression. The
proposed method can be easily integrated into existing compression models with
a negligible overhead. Our method also enables continuously variable-rate
compression via simple interpolation of the importance adjustment curves
between quality levels. Extensive experimental results show that the proposed
method achieves compression efficiency comparable to that of separately trained
reference compression models and can reduce decoding time owing to the
selective compression. The sample codes are publicly available at
https://github.com/JooyoungLeeETRI/SCR. Comment: Accepted as a NeurIPS 2022
paper.
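The importance-map pipeline (adjust for a quality level, binarize, select) can be sketched as follows. The power-law adjustment curve here is a hypothetical stand-in for the paper's learned importance adjustment curves, and the 0.5 threshold is likewise an illustrative choice.

```python
import numpy as np

def selective_mask(importance, quality):
    """Adjust an importance map (values in [0, 1]) for a target quality
    level and binarize it. A power-law curve stands in for the learned
    importance adjustment curves: a higher quality level lifts the map
    values, so more elements pass the threshold."""
    adjusted = importance ** (1.0 / quality)   # adjustment curve for this quality level
    return adjusted >= 0.5                     # binary mask of essential elements

def selective_compress(latent, importance, quality):
    """Keep only the representation elements selected by the mask."""
    mask = selective_mask(importance, quality)
    return latent * mask, mask
```

Because the mask is monotone in the quality level, interpolating between adjustment curves yields intermediate rates, which is the mechanism behind the continuously variable-rate behavior described above.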
DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression
We propose a new architecture for distributed image compression from a group
of distributed data sources. The work is motivated by practical needs in
data-driven codec design: low power consumption, robustness, and data privacy.
The proposed architecture, which we refer to as the Distributed Recurrent
Autoencoder for Scalable Image Compression (DRASIC), is able to train
distributed encoders and one joint decoder on correlated data sources. Its
compression capability is much better than that of training codecs separately.
Meanwhile, the performance of our distributed system with 10 distributed
sources is within 2 dB peak signal-to-noise ratio (PSNR) of that of a single
codec trained on all data sources. We experiment with distributed sources of
different correlations and show how well our data-driven methodology matches
the Slepian-Wolf Theorem in Distributed Source Coding (DSC). To the best of
our knowledge, this is the first data-driven DSC framework for general
distributed code design with deep learning.
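The encoder/decoder topology and the PSNR comparison can be illustrated with a minimal sketch. This is not the recurrent architecture of the paper; it uses hypothetical linear encoders purely to show the structure of many per-source encoders feeding one shared decoder, plus the standard PSNR definition used for the 2 dB comparison.

```python
import numpy as np

def psnr(x, x_hat, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((x - x_hat) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical topology: K per-source (linear) encoders, one joint decoder.
rng = np.random.default_rng(1)
K, d, code_dim = 3, 8, 4
encoders = [rng.standard_normal((code_dim, d)) for _ in range(K)]  # one encoder per source
decoder = rng.standard_normal((d, code_dim))                       # single shared decoder
codes = [E @ rng.standard_normal(d) for E in encoders]             # each source encodes locally
recons = [decoder @ c for c in codes]                              # joint decoder serves all sources
```

Each source only runs its own lightweight encoder (low power, and raw data never leaves the source), while the joint decoder exploits the correlation across sources, which is the Slepian-Wolf setting the experiments probe.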