Over the last few years, neural image compression has gained wide attention
from both research and industry, yielding promising end-to-end deep neural
codecs that outperform their conventional counterparts in rate-distortion
performance.
Despite significant advances, current methods, including attention-based
transform coding, still leave room for improvement in reducing the coding rate
while preserving reconstruction fidelity, especially in non-homogeneous
textured image areas. These models also require many parameters and long
decoding times. To tackle these challenges, we propose ConvNeXt-ChARM, an
efficient ConvNeXt-based transform coding framework paired with a
compute-efficient channel-wise auto-regressive prior that captures both global
and local contexts from the hyper and quantized latent representations. The
proposed architecture
can be optimized end-to-end to fully exploit the context information and
extract compact latent representation while reconstructing higher-quality
images. Experimental results on four widely used datasets show that
ConvNeXt-ChARM brings consistent and significant BD-rate (PSNR) reductions
estimated on average to 5.24% and 1.22% over the versatile video coding (VVC)
reference encoder (VTM-18.0) and the state-of-the-art learned image compression
method SwinT-ChARM, respectively. Moreover, we provide model scaling studies
to verify the computational efficiency of our approach and conduct several
objective and subjective analyses to highlight the performance gap
between the next-generation ConvNet, namely ConvNeXt, and the Swin Transformer.
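The channel-wise auto-regressive prior described above can be illustrated with a minimal NumPy sketch: the latent is split into channel slices, and each slice's entropy parameters are predicted from the hyper-decoder output together with all previously decoded slices. This is an illustrative assumption-laden toy, not the paper's implementation; the random 1x1 projection stands in for the learned parameter networks, and the slice count and mean-scale quantization scheme are assumptions.

```python
import numpy as np

def charm_sketch(y, hyper_feat, num_slices=4, seed=0):
    """Toy channel-wise auto-regressive decode over latent y (C, H, W).

    hyper_feat: (Ch, H, W) features, standing in for the hyper-decoder output.
    """
    rng = np.random.default_rng(seed)
    slices = np.array_split(y, num_slices, axis=0)  # split channels into slices
    decoded = []
    for s in slices:
        # Context = hyper features plus all previously decoded slices.
        ctx = np.concatenate([hyper_feat] + decoded, axis=0)
        # Stand-in for the learned parameter network: a fixed random 1x1
        # "convolution" mapping context channels to (mu, sigma) per channel.
        W = rng.standard_normal((2 * s.shape[0], ctx.shape[0])) / ctx.shape[0]
        params = np.einsum('oc,chw->ohw', W, ctx)
        mu, _sigma = np.split(params, 2, axis=0)
        # Mean-conditioned rounding, as in mean-scale hyperprior models:
        # quantize the residual around the predicted mean.
        s_hat = np.round(s - mu) + mu
        decoded.append(s_hat)
    return np.concatenate(decoded, axis=0)
```

Because each slice is quantized around its predicted mean, the reconstruction error per element never exceeds half a quantization step, while later slices benefit from progressively richer context.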