Multi-Context Dual Hyper-Prior Neural Image Compression
Transform and entropy models are the two core components of deep neural image
compression networks. Most existing learning-based image compression methods
use convolution-based transforms, which struggle to model long-range
dependencies because of the limited receptive field of the convolution
operation. To address this limitation, we propose a
Transformer-based nonlinear transform. This transform has the remarkable
ability to efficiently capture both local and global information from the input
image, leading to a more decorrelated latent representation. In addition, we
introduce a novel entropy model that incorporates two different hyperpriors to
model cross-channel and spatial dependencies of the latent representation. To
further improve the entropy model, we add a global context that leverages
distant relationships to predict the current latent more accurately. This
global context employs a causal attention mechanism to extract long-range
information in a content-dependent manner. Our experiments show that our
proposed framework performs better than the state-of-the-art methods in terms
of rate-distortion performance.

Comment: Accepted to the IEEE 22nd International Conference on Machine Learning
and Applications 2023 (ICMLA) - Selected for Oral Presentation
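The global context described above relies on causal attention: each latent element may only attend to elements that have already been decoded, so the same prediction is reproducible on the decoder side. A minimal sketch of such a causal attention step over a flattened latent sequence (the projection matrices and head structure of the actual model are omitted; identity projections are an assumption for brevity):

```python
import numpy as np

def causal_attention(latent):
    """Causal self-attention over a flattened latent sequence.

    latent: array of shape (batch, seq_len, dim). Position i may only
    attend to positions <= i, so the context for each latent element
    depends solely on already-decoded elements, as required for
    autoregressive entropy modeling.
    """
    b, n, d = latent.shape
    q = k = v = latent  # identity projections keep the sketch minimal
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)          # (b, n, n)
    # Mask out future positions (strictly upper-triangular entries).
    mask = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax along the last axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v                                       # (b, n, d)

x = np.random.default_rng(0).normal(size=(2, 16, 8))
out = causal_attention(x)
```

Because the first position can attend only to itself, its output equals its input under identity value projections; later positions mix in progressively more of the decoded history, which is what lets the model exploit distant, content-dependent relationships.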
Leveraging progressive model and overfitting for efficient learned image compression
Deep learning has been overwhelmingly dominant in computer vision and
image/video processing for the last decade. For image and video compression,
however, it still lags behind traditional techniques based on the discrete
cosine transform (DCT) and linear filters. Built on top of an autoencoder
architecture, learned image compression (LIC) systems have drawn enormous
attention in recent years. Nevertheless, the proposed LIC systems are still
inferior to the state-of-the-art traditional techniques, for example, the
Versatile Video Coding (VVC/H.266) standard, due to either their compression
performance or decoding complexity. Although claimed to outperform
VVC/H.266 over a limited bit rate range, some proposed LIC systems take over
40 seconds to decode a 2K image on a GPU. In this paper, we introduce a
powerful and flexible LIC framework with multi-scale progressive (MSP)
probability model and latent representation overfitting (LOF) technique. With
different predefined profiles, the proposed framework can achieve various
balance points between compression efficiency and computational complexity.
Experiments show that the proposed framework achieves 2.5%, 1.0%, and 1.3%
Bjontegaard delta bit rate (BD-rate) reduction over the VVC/H.266 standard on
three benchmark datasets on a wide bit rate range. More importantly, the
decoding complexity is reduced from O(n) to O(1) compared to many other LIC
systems, resulting in an over 20-fold speedup when decoding 2K images.
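Latent representation overfitting refines the latent of each individual image at encoding time while keeping the decoder frozen, so decoder-side complexity is unchanged. A toy sketch of the idea under stated assumptions: the decoder is a fixed linear map `D(y) = y @ W` (a hypothetical stand-in for the paper's learned decoder), and an L2 penalty on the latent serves as a crude proxy for the bit rate:

```python
import numpy as np

def overfit_latent(x, W, y0, lam=0.01, lr=0.1, steps=200):
    """Per-image latent refinement (LOF-style sketch, hypothetical setup).

    The decoder D(y) = y @ W stays frozen; only the latent y is updated
    by gradient descent on the rate-distortion objective
        ||y @ W - x||^2 + lam * ||y||^2,
    where the L2 term is a simple proxy for the true bit rate.
    """
    y = y0.copy()
    for _ in range(steps):
        # Analytic gradient of the quadratic objective w.r.t. y.
        grad = 2.0 * (y @ W - x) @ W.T + 2.0 * lam * y
        y = y - lr * grad
    return y

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))                 # target "image"
W = rng.normal(size=(4, 8))
W = W / np.linalg.norm(W)                   # keep the step size stable
y0 = rng.normal(size=(1, 4))                # initial latent from the encoder
y = overfit_latent(x, W, y0)
```

The refined latent attains a strictly lower rate-distortion objective than the encoder's initial guess; in a real LIC system the same per-image optimization is run through the actual (nonlinear) decoder with a learned entropy model supplying the rate term.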