Non-local Attention Optimized Deep Image Compression
This paper proposes a novel Non-Local Attention Optimized Deep Image
Compression (NLAIC) framework, built on top of the popular variational
auto-encoder (VAE) structure. Our NLAIC framework embeds non-local operations
in the encoders and decoders for both the image and the latent feature
probability information (known as the hyperprior) to capture both local and
global correlations. It also applies an attention mechanism to generate masks
that weigh the features of the image and the hyperprior, implicitly adapting
bit allocation across features according to their importance. Furthermore,
both hyperpriors and spatial-channel neighbors of the latent features are used
to improve entropy coding. The proposed model outperforms existing methods on
the Kodak dataset, including learned (e.g., Balle2019, Balle2018) and
conventional (e.g., BPG, JPEG2000, JPEG) image compression methods, under both
PSNR and MS-SSIM distortion metrics.
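The core idea of using non-local correlations to produce attention masks can be sketched as follows. This is a hypothetical simplification over flattened spatial positions; the actual NLAIC model uses learned 1x1-convolution embeddings inside residual attention blocks, and the function and variable names below are illustrative only.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def non_local_attention(x):
    """Toy non-local block: every position attends to all others, and a
    sigmoid turns the aggregated response into a (0, 1) mask that reweights
    the features (hypothetical simplification of NLAIC's attention masks)."""
    affinity = softmax(x @ x.T)             # (n, n) global pairwise similarity
    response = affinity @ x                 # globally aggregated features
    mask = 1.0 / (1.0 + np.exp(-response))  # attention mask in (0, 1)
    return x * mask, mask                   # masked features steer bit allocation

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 8))        # 16 spatial positions, 8 channels
weighted, mask = non_local_attention(feats)
```

Because the mask lies strictly in (0, 1), down-weighted positions carry smaller latent magnitudes and thus implicitly receive fewer bits after quantization.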
G-VAE: A Continuously Variable Rate Deep Image Compression Framework
Rate adaptation of deep image compression within a single model will be one of
the decisive factors in competing with classical image compression codecs.
Until now, however, no solution has achieved variable rate without either
increasing computation or degrading compression performance. In this paper, we
propose a novel image compression framework, G-VAE (Gained Variational
Autoencoder), which achieves a continuously variable rate in a single model.
Unlike previous solutions that encode progressively or change internal units
of the network, G-VAE only adds a pair of gain units at the output of the
encoder and the input of the decoder. This design is so concise that G-VAE can
be applied to almost any image compression method, achieving a continuously
variable rate with negligible additional parameters and computation. We also
propose a new deep image compression framework that outperforms all published
results on the Kodak dataset in both PSNR and MS-SSIM metrics. Experimental
results show that adding a pair of gain units does not affect the performance
of the base models while endowing them with a continuously variable rate.
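The gain-unit mechanism can be sketched in a few lines: a per-channel gain vector scales the latent before quantization, a matching inverse gain rescales it at the decoder, and interpolating between trained gain vectors yields rates in between. The exponential interpolation below is a hypothetical simplification, and all names are illustrative, not G-VAE's actual API.

```python
import numpy as np

def apply_gain(latent, gain):
    """Channel-wise gain at the encoder output (before quantization):
    larger gain => finer effective quantization => higher rate."""
    return latent * gain

def apply_inverse_gain(latent_q, inverse_gain):
    """Matching inverse gain at the decoder input."""
    return latent_q * inverse_gain

def interpolated_gain(g_low, g_high, t):
    """Exponential interpolation between two trained gain vectors for
    t in [0, 1], giving a continuum of rate points (illustrative rule)."""
    return g_low ** (1.0 - t) * g_high ** t

y = np.array([1.0, 2.0, 3.0, 4.0])      # toy latent, 4 channels
g = np.array([2.0, 1.5, 1.0, 0.5])      # per-channel gain vector
y_scaled = apply_gain(y, g)
y_restored = apply_inverse_gain(y_scaled, 1.0 / g)
```

Since the gain pair is a single element-wise multiplication on each side, it adds essentially no parameters or computation to the base codec, which is what makes the approach portable across models.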
Adaptation and Attention for Neural Video Coding
Neural image coding now represents the state-of-the-art image compression approach. However, much work remains to be done in the video domain. In this work, we propose an end-to-end learned video codec that introduces several architectural and training novelties revolving around the concepts of adaptation and attention. Our codec is organized as an intra-frame codec paired with an inter-frame codec. As one architectural novelty, we propose to train the inter-frame codec model to adapt the motion estimation process to the resolution of the input video. A second architectural novelty is a new neural block that combines concepts from split-attention neural networks and from DenseNets. Finally, we propose to overfit a set of decoder-side multiplicative parameters at inference time. Through ablation studies and comparisons to prior art, we show the benefits of our proposed techniques in terms of coding gains. We compare our codec to VVC/H.266 and RLVC, which represent the state-of-the-art traditional and end-to-end learned codecs, respectively, and to the top-performing end-to-end learned approach in the 2021 CLIC competition, E2E_T_OL. Our codec clearly outperforms E2E_T_OL and compares favorably to VVC and RLVC in some settings.
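The idea of overfitting decoder-side multiplicative parameters can be sketched with plain gradient descent: per-channel scales are fitted to minimize reconstruction error against the original content, then signalled to the decoder. This is a hypothetical sketch, not the paper's actual procedure; the fitting happens at encoding time, where the target is available, and the names below are illustrative.

```python
import numpy as np

def overfit_scales(decoded, target, steps=500, lr=0.05):
    """Fit per-channel multiplicative scales s minimizing the squared error
    ||decoded * s - target||^2 by gradient descent (illustrative sketch;
    in practice the fitted scales would be transmitted in the bitstream)."""
    s = np.ones(decoded.shape[-1])
    for _ in range(steps):
        err = decoded * s - target                  # (n, c) residual
        grad = 2.0 * (err * decoded).mean(axis=0)   # per-channel gradient
        s -= lr * grad
    return s

rng = np.random.default_rng(1)
dec = rng.standard_normal((64, 3))        # toy decoder output, 3 channels
true_s = np.array([1.2, 0.8, 1.05])       # scales we hope to recover
scales = overfit_scales(dec, dec * true_s)
```

The overhead is tiny (a handful of floats per frame or sequence), which is why content-adaptive decoder-side parameters can pay for themselves in rate-distortion terms.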
Learned Point Cloud Geometry Compression
This paper presents a novel end-to-end Learned Point Cloud Geometry
Compression (a.k.a. Learned-PCGC) framework that efficiently compresses point
cloud geometry (PCG) using deep neural network (DNN) based variational
autoencoders (VAE). In our approach, the PCG is first voxelized, scaled, and
partitioned into non-overlapping 3D cubes, which are then fed into stacked 3D
convolutions to generate compact latent features and hyperpriors. Hyperpriors
are used to improve the conditional probability modeling of the latent
features. A weighted binary cross-entropy (WBCE) loss is applied during
training, while adaptive thresholding is used at inference to remove
unnecessary voxels and reduce distortion. Objectively, our method exceeds the
geometry-based point cloud compression (G-PCC) algorithm standardized by the
Moving Picture Experts Group (MPEG) by a significant margin, e.g., at least
60% BD-Rate (Bjontegaard Delta Rate) gains on common test datasets.
Subjectively, our method presents better visual quality, with smoother surface
reconstruction and appealing details, compared to all existing MPEG-standard-
compliant PCC methods. Our method requires only about 2.5 MB of parameters in
total, a fairly small size for practical implementation, even on embedded
platforms. Additional ablation studies analyze a variety of aspects (e.g.,
cube size, kernels) to explore the application potential of our Learned-PCGC.
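The training/inference asymmetry described above (WBCE loss in training, thresholding of occupancy probabilities at inference) can be sketched as follows. Both functions are hypothetical simplifications: the positive-class weight is an illustrative value, and the top-k rule stands in for the paper's adaptive thresholding, which tunes the cut-off to minimize distortion.

```python
import numpy as np

def wbce_loss(pred, occupancy, w_pos=3.0):
    """Weighted binary cross-entropy over predicted voxel occupancy
    probabilities; w_pos upweights occupied voxels, which are sparse
    (w_pos=3.0 is an illustrative choice)."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(w_pos * occupancy * np.log(pred)
                    + (1.0 - occupancy) * np.log(1.0 - pred))

def top_k_threshold(pred, k):
    """Keep the k most probable voxels as occupied (stand-in for the
    adaptive thresholding used at inference)."""
    thr = np.sort(pred.ravel())[-k]
    return (pred >= thr).astype(np.uint8)

# Toy 4x4x4 cube with two occupied voxels and a near-perfect prediction.
occ = np.zeros((4, 4, 4))
occ[0, 0, 0] = 1.0
occ[1, 2, 3] = 1.0
pred = np.where(occ == 1.0, 0.99, 0.01)
loss = wbce_loss(pred, occ)
kept = top_k_threshold(pred, 2)
```

Weighting the positive class counters the extreme sparsity of occupied voxels, which would otherwise let the network minimize the loss by predicting "empty" everywhere.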