Non-local Attention Optimized Deep Image Compression
This paper proposes a novel Non-Local Attention Optimized Deep Image
Compression (NLAIC) framework, which is built on top of the popular variational
auto-encoder (VAE) structure. Our NLAIC framework embeds non-local operations
in the encoders and decoders for both the image and the latent feature
probability information (known as the hyperprior) to capture both local and
global correlations, and applies an attention mechanism to generate masks that
weigh the features of the image and hyperprior, implicitly adapting the bit
allocation across features according to their importance. Furthermore, both
the hyperpriors and the spatial-channel neighbors of the latent features are
used to improve entropy coding. The proposed model outperforms existing
methods on the Kodak dataset, including learned (e.g., Balle2019, Balle2018)
and conventional (e.g., BPG, JPEG2000, JPEG) image compression methods, under
both PSNR and MS-SSIM distortion metrics.
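As a rough illustration of the attention-weighting idea described above (not the authors' actual architecture; all function and variable names here are hypothetical), a minimal NumPy sketch of a non-local operation whose sigmoid response masks the features, so that salient features keep large magnitudes (and so receive more bits) while unimportant ones are attenuated:

```python
import numpy as np

def nonlocal_attention_mask(features):
    """Toy non-local attention weighting over flattened latent features.

    features: (N, C) array, one row per spatial position.
    Pairwise affinities over all positions give a global response, which a
    sigmoid squashes into a mask in (0, 1); multiplying the mask with the
    features implicitly adapts bit allocation by feature importance.
    """
    # pairwise affinities between all positions (the non-local operation)
    affinity = features @ features.T                      # (N, N)
    weights = np.exp(affinity - affinity.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # softmax over positions
    response = weights @ features                         # globally aggregated context
    mask = 1.0 / (1.0 + np.exp(-response))                # sigmoid mask in (0, 1)
    return mask * features, mask
```

Because the mask lies strictly in (0, 1), every weighted feature is no larger in magnitude than the original, which is what lets the downstream quantizer spend fewer bits on attenuated features.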
Scene Matters: Model-based Deep Video Compression
Video compression has always been a popular research area, where many
traditional and deep video compression methods have been proposed. These
methods typically rely on signal prediction theory to enhance compression
performance by designing highly efficient intra- and inter-prediction strategies
and compressing video frames one by one. In this paper, we propose a novel
model-based video compression (MVC) framework that regards scenes as the
fundamental units for video sequences. Our proposed MVC directly models the
intensity variation of the entire video sequence in one scene, seeking
non-redundant representations instead of reducing redundancy through
spatio-temporal predictions. To achieve this, we employ implicit neural
representation as our basic modeling architecture. To improve the efficiency of
video modeling, we first propose a context-related spatial positional embedding
and frequency-domain supervision for spatial context enhancement. To capture
temporal correlation, we design a scene flow constraint mechanism and a
temporal contrastive loss. Extensive experimental results demonstrate that our
method achieves up to a 20% bitrate reduction compared to the latest video
coding standard H.266 and is more efficient in decoding than existing video
coding strategies.
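The core idea of using an implicit neural representation as the modeling architecture can be sketched as follows: instead of predicting frames from one another, the whole scene is treated as one continuous function f(x, y, t) → intensity, and only the network parameters need to be transmitted. This is a minimal NumPy sketch under that assumption (the SIREN-style sinusoidal activation and all names are illustrative, not the paper's exact design):

```python
import numpy as np

def inr_forward(coords, params):
    """Toy implicit-neural-representation decoder for a video scene.

    coords: (N, 3) array of normalized (x, y, t) coordinates.
    params: weights of a tiny two-layer MLP; the entire scene's intensity
    variation is modeled by this single function, so no frame-by-frame
    spatio-temporal prediction is needed.
    """
    W1, b1, W2, b2 = params
    h = np.sin(coords @ W1 + b1)   # sinusoidal activation (SIREN-style)
    return h @ W2 + b2             # predicted pixel intensities, shape (N, 1)

# randomly initialized parameters of the toy MLP (3 -> 16 -> 1)
rng = np.random.default_rng(0)
params = (rng.normal(size=(3, 16)), np.zeros(16),
          rng.normal(size=(16, 1)), np.zeros(1))
```

In an actual codec of this kind, `params` would be what gets compressed and transmitted; decoding is then a single forward pass over the desired coordinates, which is why decoding can be cheaper than conventional prediction loops.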