JPEG Quantized Coefficient Recovery via DCT Domain Spatial-Frequential Transformer
JPEG compression adopts the quantization of Discrete Cosine Transform (DCT)
coefficients for effective bit-rate reduction, whilst the quantization could
lead to a significant loss of important image details. Recovering compressed
JPEG images in the frequency domain has attracted more and more attention
recently, in addition to numerous restoration approaches developed in the pixel
domain. However, the current DCT domain methods typically suffer from limited
effectiveness in handling a wide range of compression quality factors, or fall
short in recovering sparse quantized coefficients and the components across
different color spaces. To address these challenges, we propose a DCT domain
spatial-frequential Transformer, named DCTransformer. Specifically, a
dual-branch architecture is designed to capture both spatial and frequential
correlations within the collocated DCT coefficients. Moreover, we incorporate
the operation of quantization matrix embedding, which effectively allows our
single model to handle a wide range of quality factors, and a
luminance-chrominance alignment head that produces a unified feature map to
align different-sized luminance and chrominance components. Our proposed
DCTransformer outperforms the current state-of-the-art JPEG artifact removal
techniques, as demonstrated by our extensive experiments.
Comment: 13 pages, 8 figure
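The quantization step that the abstract above refers to can be sketched as follows. This is a minimal illustration of baseline JPEG-style quantization on one 8x8 luminance block, not the DCTransformer method itself. The base table is the example luminance table from the JPEG standard, and the quality scaling follows the convention used by the IJG libjpeg library; the function names (`dct_matrix`, `quant_table`, `quantize_block`, `dequantize_block`) are hypothetical and chosen only for this sketch.

```python
import numpy as np

# Example luminance quantization table from the JPEG standard (Annex K).
BASE_LUMA_Q = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale
    return c


def quant_table(quality):
    """Scale the base table by a quality factor in [1, 100] (IJG convention)."""
    quality = min(max(int(quality), 1), 100)
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    table = (BASE_LUMA_Q * scale + 50) // 100
    return np.clip(table, 1, 255)


def quantize_block(block, quality):
    """Return the integer quantization levels and the table used."""
    c = dct_matrix(8)
    coeffs = c @ (block - 128.0) @ c.T        # level shift, then 2-D DCT
    q = quant_table(quality)
    levels = np.round(coeffs / q).astype(int)  # the lossy step
    return levels, q


def dequantize_block(levels, q):
    """Reconstruct pixel values from quantization levels."""
    c = dct_matrix(8)
    return c.T @ (levels * q) @ c + 128.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.integers(0, 256, size=(8, 8)).astype(float)
    for quality in (10, 50, 90):
        levels, q = quantize_block(block, quality)
        zeros = int(np.count_nonzero(levels == 0))
        print(f"quality={quality}: {zeros} of 64 coefficients quantized to zero")
```

Lower quality factors give larger table entries, so more coefficients round to zero; these are the sparse quantized coefficients that a DCT domain recovery method has to restore. The table returned by `quant_table` is the kind of information a quantization matrix embedding could take as input, although the abstract does not describe how that embedding is built.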
Deep Markov Random Field for Image Modeling
Markov Random Fields (MRFs), a formulation widely used in generative image
modeling, have long been plagued by the lack of expressive power. This issue is
primarily due to the fact that conventional MRF formulations tend to use
simplistic factors to capture local patterns. In this paper, we move beyond
such limitations, and propose a novel MRF model that uses fully-connected
neurons to express the complex interactions among pixels. Through theoretical
analysis, we reveal an inherent connection between this model and recurrent
neural networks, and thereon derive an approximated feed-forward network that
couples multiple RNNs along opposite directions. This formulation combines the
expressive power of deep neural networks and the cyclic dependency structure of
MRF in a unified model, bringing the modeling capability to a new level. The
feed-forward approximation also allows it to be efficiently learned from data.
Experimental results on a variety of low-level vision tasks show notable
improvement over the state of the art.
Comment: Accepted at ECCV 201
Protection of medical images and patient related information in healthcare: Using an intelligent and reversible watermarking technique
This work presents an intelligent technique based on reversible watermarking for protecting patient- and medical-related information. In the proposed technique, ‘IRW-Med’, a companding function is used to reduce embedding distortion, while the Integer Wavelet Transform (IWT) serves as the embedding domain to achieve reversibility. Histogram processing is employed to avoid underflow/overflow. In addition, the learning capabilities of Genetic Programming (GP) are exploited for intelligent wavelet coefficient selection. In this context, GP is used to evolve models that not only make an optimal tradeoff between the imperceptibility and capacity of the watermark, but also exploit the hidden dependencies among wavelet coefficients and information about the type of sub-band. The novelty of the proposed IRW-Med technique lies in its ability to generate a model that can find optimal wavelet coefficients for embedding and that also acts as a companding factor for watermark embedding. IRW-Med is thus able to embed the watermark with low distortion, extract the hidden information, and recover the original image. Its effectiveness with respect to capacity and imperceptibility is demonstrated through experimental comparisons with existing techniques, using standard images as well as a publicly available medical image dataset.
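The reversibility that an integer wavelet transform provides can be illustrated with a short sketch. The abstract does not say which wavelet or lifting scheme IRW-Med uses, so the example below assumes the simplest case, a one-level integer Haar transform (the S-transform), and the function names `iwt_haar_1d` and `iiwt_haar_1d` are hypothetical. It shows only that the transform maps integers to integers and inverts exactly; it does not implement companding, histogram processing, or the GP-based coefficient selection.

```python
import numpy as np


def iwt_haar_1d(x):
    """One-level integer Haar transform (S-transform) of an even-length signal.

    Returns (s, d): integer approximation and detail coefficients.
    """
    x = np.asarray(x, dtype=np.int64)
    if x.size % 2 != 0:
        raise ValueError("signal length must be even")
    a, b = x[0::2], x[1::2]
    d = a - b            # detail: difference of each sample pair
    s = b + d // 2       # approximation: floor of the pair average
    return s, d


def iiwt_haar_1d(s, d):
    """Exact inverse of iwt_haar_1d."""
    s = np.asarray(s, dtype=np.int64)
    d = np.asarray(d, dtype=np.int64)
    b = s - d // 2       # undo the update step
    a = d + b            # undo the predict step
    x = np.empty(2 * s.size, dtype=np.int64)
    x[0::2] = a
    x[1::2] = b
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    row = rng.integers(0, 256, size=16)   # e.g. one row of 8-bit pixels
    s, d = iwt_haar_1d(row)
    restored = iiwt_haar_1d(s, d)
    print("perfect reconstruction:", bool(np.array_equal(row, restored)))
```

Because every step uses integer arithmetic and each lifting step is undone exactly, no rounding error accumulates. That property is what lets a reversible scheme recover the original image after the watermark has been extracted, provided the embedding itself is invertible and pixel values stay within range.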
Deep Multi-modality Soft-decoding of Very Low Bit-rate Face Videos
We propose a novel deep multi-modality neural network for restoring very low
bit rate videos of talking heads. Such video contents are very common in social
media, teleconferencing, distance education, tele-medicine, etc., and often
need to be transmitted with limited bandwidth. The proposed CNN method exploits
the correlations among three modalities (video, audio, and the emotional state of
the speaker) to remove the video compression artifacts caused by spatial
downsampling and quantization. The deep learning approach turns out to be ideally
suited for the video restoration task, as the complex non-linear cross-modality
correlations are very difficult to model analytically and explicitly. The new
method is a video post processor that can significantly boost the perceptual
quality of aggressively compressed talking head videos, while being fully
compatible with all existing video compression standards.
Comment: Accepted by Proceedings of the 28th ACM International Conference on
Multimedia (ACM MM), 202