4,551 research outputs found
Compression Artifacts Reduction by a Deep Convolutional Network
Lossy compression introduces complex compression artifacts, particularly
blocking artifacts, ringing effects, and blurring. Existing algorithms either
focus on removing blocking artifacts and produce blurred output, or restore
sharpened images accompanied by ringing effects. Inspired by the success of
deep convolutional networks (DCN) on super-resolution, we formulate a compact
and efficient network for seamless attenuation of different compression
artifacts. We also demonstrate that a deeper model can be effectively trained
with the features learned in a shallow network. Following a similar "easy to
hard" idea, we systematically investigate several practical transfer settings
and show the effectiveness of transfer learning in low-level vision problems.
Our method shows superior performance to the state of the art on both
benchmark datasets and a real-world use case (i.e., Twitter). In addition, we
show that our method can be applied as a pre-processing step to facilitate other
low-level vision routines when they take compressed images as input.
Comment: 9 pages, 12 figures, conference
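The compact network described above can be sketched, in structure only, as a three-stage convolutional pipeline (feature extraction, nonlinear mapping, reconstruction). The layer sizes and the `arcnn_like` helper below are illustrative assumptions with untrained random weights, not the paper's trained model:

```python
import numpy as np

def conv2d(x, w, b):
    # naive "same" 2-D convolution over a stack of feature maps;
    # x: (cin, h, w), w: (cout, cin, kh, kw), b: (cout,)
    cin, kh, kw = w.shape[1], w.shape[2], w.shape[3]
    pad = kh // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1], x.shape[2]
    out = np.zeros((w.shape[0], h, wd))
    for o in range(w.shape[0]):
        for i in range(cin):
            for dy in range(kh):
                for dx in range(kw):
                    out[o] += w[o, i, dy, dx] * xp[i, dy:dy+h, dx:dx+wd]
        out[o] += b[o]
    return out

def arcnn_like(y, rng):
    # hypothetical layer shapes following the three-stage idea:
    # feature extraction -> nonlinear mapping -> reconstruction
    w1, b1 = rng.normal(0, 0.01, (8, 1, 9, 9)), np.zeros(8)
    w2, b2 = rng.normal(0, 0.01, (8, 8, 1, 1)), np.zeros(8)
    w3, b3 = rng.normal(0, 0.01, (1, 8, 5, 5)), np.zeros(1)
    f1 = np.maximum(conv2d(y[None], w1, b1), 0)   # ReLU features
    f2 = np.maximum(conv2d(f1, w2, b2), 0)        # 1x1 nonlinear mapping
    return conv2d(f2, w3, b3)[0]                  # restored estimate
```

The "easy to hard" transfer idea then amounts to initializing the filters of a deeper variant from the shallow network's trained filters rather than from random draws.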
Quality Adaptive Low-Rank Based JPEG Decoding with Applications
Small compression noises, despite being transparent to human eyes, can
adversely affect the results of many image restoration processes if left
unaccounted for. In particular, compression noise is highly detrimental to
inverse operators of a high-boosting (sharpening) nature, such as deblurring and
super-resolution against a convolution kernel. By incorporating the non-linear
DCT quantization mechanism into the formulation for image restoration, we
propose a new sparsity-based convex programming approach for joint compression
noise removal and image restoration. Experimental results demonstrate
significant performance gains of the new approach over existing image
restoration methods.
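The claim that compression noise is detrimental to high-boosting operators can be checked numerically: a sharpening kernel amplifies even small, visually transparent noise. The kernel and noise level below are arbitrary choices for illustration, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
# small, visually transparent "compression noise" (well under 1 gray level)
noise = rng.uniform(-0.5, 0.5, (64, 64))

# a standard 3x3 high-boost (sharpening) kernel
k = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)

def filt(img, kern):
    # "same" convolution with edge padding
    p = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out += kern[dy, dx] * p[dy:dy+64, dx:dx+64]
    return out

amplified = filt(noise, k)
print(noise.std(), amplified.std())  # the sharpener amplifies the noise
```

For roughly independent noise, the output standard deviation scales with the root of the kernel's squared-coefficient sum (here sqrt(29), about 5.4x), which is why leaving compression noise unaccounted for hurts sharpening-type inverse operators so much.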
DPW-SDNet: Dual Pixel-Wavelet Domain Deep CNNs for Soft Decoding of JPEG-Compressed Images
JPEG is one of the most widely used lossy compression methods. JPEG-compressed
images usually suffer from compression artifacts, including blocking and
blurring, especially at low bit-rates. Soft decoding is an effective solution
to improve the quality of compressed images without changing codec or
introducing extra coding bits. Inspired by the excellent performance of the
deep convolutional neural networks (CNNs) on both low-level and high-level
computer vision problems, we develop a dual pixel-wavelet domain deep
CNN-based soft decoding network for JPEG-compressed images, namely DPW-SDNet.
The pixel domain deep network takes the four downsampled versions of the
compressed image to form a 4-channel input and outputs a pixel domain
prediction, while the wavelet domain deep network uses the 1-level discrete
wavelet transformation (DWT) coefficients to form a 4-channel input to produce
a DWT domain prediction. The pixel domain and wavelet domain estimates are
combined to generate the final soft decoded result. Experimental results
demonstrate the superiority of the proposed DPW-SDNet over several
state-of-the-art compression artifact reduction algorithms.
Comment: CVPRW 201
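The two 4-channel inputs described above can be sketched in a few lines: a polyphase (downsample-by-2) split for the pixel branch, and a 1-level Haar DWT for the wavelet branch. This is a minimal sketch; the paper does not specify the wavelet normalization, so the averaging Haar used below is an assumption:

```python
import numpy as np

def space_to_depth(img):
    # four downsampled-by-2 versions of the image -> 4-channel input
    return np.stack([img[0::2, 0::2], img[0::2, 1::2],
                     img[1::2, 0::2], img[1::2, 1::2]])

def haar_dwt_level1(img):
    # 1-level 2-D Haar DWT -> LL, LH, HL, HH subbands as 4 channels
    a, b = img[0::2, :], img[1::2, :]
    lo, hi = (a + b) / 2.0, (a - b) / 2.0          # rows
    ll = (lo[:, 0::2] + lo[:, 1::2]) / 2.0          # columns
    lh = (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    hl = (hi[:, 0::2] + hi[:, 1::2]) / 2.0
    hh = (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return np.stack([ll, lh, hl, hh])
```

Both transforms halve the spatial resolution while preserving all pixel information across the four channels, so each branch's network can predict at reduced resolution and be inverted losslessly.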
Random Walk Graph Laplacian based Smoothness Prior for Soft Decoding of JPEG Images
Given the prevalence of JPEG compressed images, optimizing image
reconstruction from the compressed format remains an important problem. Instead
of simply reconstructing a pixel block from the centers of indexed DCT
coefficient quantization bins (hard decoding), soft decoding reconstructs a
block by selecting appropriate coefficient values within the indexed bins with
the help of signal priors. The challenge thus lies in how to define suitable
priors and apply them effectively.
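The hard/soft distinction can be made concrete for a single DCT coefficient: hard decoding snaps to the center of the indexed quantization bin, while soft decoding may pick any value inside that bin. The quantization step, coefficient value, and toy shrinkage prior below are illustrative assumptions, not the paper's actual priors:

```python
import numpy as np

q = 16.0                          # quantization step for one DCT coefficient
c = 37.3                          # true (unobserved) coefficient value
idx = np.round(c / q)             # bin index stored in the JPEG bitstream

hard = idx * q                    # hard decoding: the bin center, 32.0
lo, hi = (idx - 0.5) * q, (idx + 0.5) * q   # feasible bin: [24.0, 40.0)

# soft decoding: choose a value inside the bin that best fits a signal
# prior; here a toy prior simply shrinks the coefficient toward zero
soft = np.clip(0.8 * hard, lo, hi)
```

The quantization bin thus acts as a hard feasibility constraint, and the whole design question in soft decoding is which prior drives the selection within it.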
In this paper, we combine three image priors (a Laplacian prior for DCT
coefficients, and sparsity and graph-signal smoothness priors for image
patches) to construct an efficient JPEG soft decoding algorithm. Specifically,
we first use the Laplacian prior to compute a minimum mean square error (MMSE)
initial solution for each code block. Next, we show that while the sparsity
prior can reduce block artifacts, limiting the size of the over-complete
dictionary (to lower computation) would lead to poor recovery of high DCT
frequencies. To alleviate this problem, we design a new graph-signal smoothness
prior (desired signal has mainly low graph frequencies) based on the left
eigenvectors of the random walk graph Laplacian matrix (LERaG). Compared to
previous graph-signal smoothness priors, LERaG has desirable image filtering
properties with low computation overhead. We demonstrate how LERaG can
facilitate recovery of high DCT frequencies of a piecewise smooth (PWS) signal
via an interpretation of low graph frequency components as relaxed solutions to
normalized cut in spectral clustering. Finally, we construct a soft decoding
algorithm using the three signal priors with appropriate prior weights.
Experimental results show that our proposal noticeably outperforms
state-of-the-art soft decoding algorithms in both objective and subjective evaluations.
On learning optimized reaction diffusion processes for effective image restoration
Image restoration has remained an active research topic in low-level computer
vision for several decades, and hence new approaches are constantly emerging.
However, many recently proposed algorithms achieve state-of-the-art performance
only at the expense of very high computation time, which clearly limits their
practical relevance. In this work, we propose a simple but effective approach
with both high computational efficiency and high restoration quality. We extend
conventional nonlinear reaction diffusion models by several parametrized linear
filters as well as several parametrized influence functions. We propose to
train the parameters of the filters and the influence functions through a
loss-based approach. Experiments show that our trained nonlinear reaction diffusion
models largely benefit from the training of the parameters and finally lead to
the best reported performance on common test datasets for image restoration.
Due to their structural simplicity, our trained models are highly efficient and
are also well-suited for parallel computation on GPUs.
Comment: 9 pages, 3 figures, 3 tables. CVPR 2015 oral presentation, together
with the supplemental material of 13 pages and 8 pages of notes on diffusion
networks
Visual Data Deblocking using Structural Layer Priors
The blocking artifact frequently appears in compressed real-world images or
video sequences, especially coded at low bit rates, which is visually annoying
and likely hurts the performance of many computer vision algorithms. A
compressed frame can be viewed as the superimposition of an intrinsic layer and
an artifact one. Recovering the two layers from such frames seems to be a
severely ill-posed problem, since the number of unknowns to recover is twice
the number of given measurements. In this paper, we propose a simple and robust
method to separate these two layers, which exploits structural layer priors
including the gradient sparsity of the intrinsic layer, and the independence of
the gradient fields of the two layers. A novel Augmented Lagrangian Multiplier
based algorithm is designed to efficiently and effectively solve the recovery
problem. Extensive experimental results demonstrate the superior performance of
our method over the state of the art, in terms of visual quality and
simplicity.
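The gradient-sparsity term in an augmented-Lagrangian solver of this kind is typically handled by a soft-thresholding (shrinkage) step. The following is a generic sketch of that operator, not the paper's exact update:

```python
import numpy as np

def shrink(x, tau):
    # soft-thresholding: the proximal operator of the l1 (sparsity) term,
    # the per-variable update inside augmented-Lagrangian / ADMM solvers
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

g = np.array([-3.0, -0.2, 0.0, 0.5, 4.0])   # toy gradient values
sparse_g = shrink(g, 1.0)                    # small gradients snap to zero
```

Small gradient values (noise and compression artifacts) are zeroed while large ones (true edges of the intrinsic layer) survive with reduced magnitude, which is exactly the behavior a gradient-sparsity prior asks for.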
Deep Learning-Based Video Coding: A Review and A Case Study
The past decade has witnessed great success of deep learning technology in
many disciplines, especially in computer vision and image processing. However,
deep learning-based video coding remains in its infancy. This paper reviews
representative works on using deep learning for image/video coding, which
has been an actively developing research area since 2015. We divide
the related works into two categories: new coding schemes that are built
primarily upon deep networks (deep schemes), and deep network-based coding
tools (deep tools) that can be used within traditional coding schemes or
together with traditional coding tools. For deep schemes, pixel probability
modeling and auto-encoders are the two main approaches, which can be viewed as
predictive coding and transform coding schemes, respectively. For deep
tools, there have been several proposed techniques using deep learning to
perform intra-picture prediction, inter-picture prediction, cross-channel
prediction, probability distribution prediction, transform, post- or in-loop
filtering, down- and up-sampling, as well as encoding optimizations. In the
hope of advocating the research of deep learning-based video coding, we present
a case study of our developed prototype video codec, namely Deep Learning Video
Coding (DLVC). DLVC features two deep tools that are both based on
convolutional neural network (CNN), namely CNN-based in-loop filter (CNN-ILF)
and CNN-based block adaptive resolution coding (CNN-BARC). Both tools help
improve the compression efficiency by a significant margin. With the two deep
tools as well as other non-deep coding tools, DLVC achieves on average 39.6%
and 33.0% bit savings over HEVC, under random-access and low-delay
configurations, respectively. The source code of DLVC has been released for
future research.
Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image
Although deep convolutional neural networks have been shown to efficiently
eliminate coding artifacts caused by the coarse quantization of traditional
codecs, it is difficult to train a neural network placed in front of the
encoder, because the codec blocks gradient back-propagation. In this paper, we
propose an end-to-end image
compression framework based on convolutional neural network to resolve the
problem of non-differentiability of the quantization function in the standard
codec. First, the feature description neural network is used to get a valid
description in the low-dimension space with respect to the ground-truth image
so that the amount of image data is greatly reduced for storage or
transmission. Given this compact description, a standard image codec such as
JPEG is leveraged to further compress the image, which introduces severe
distortion and compression artifacts, especially blocking, loss of detail,
blurring, and ringing. Then, we use a post-processing neural
network to remove these artifacts. Due to the challenge of directly learning a
non-linear function for a standard codec based on convolutional neural network,
we propose to learn a virtual codec neural network to approximate the
projection from the valid description image to the post-processed compressed
image, so that the gradient could be efficiently back-propagated from the
post-processing neural network to the feature description neural network during
training. Meanwhile, an advanced learning algorithm is proposed to train our
deep neural networks for compression. Notably, the proposed method is
compatible with existing standard codecs, and our learning strategy can easily
be extended to these codecs. Experimental results demonstrate the advantages of
the proposed method compared to several state-of-the-art approaches, especially
at very low bit-rates.
Comment: 11 pages, 7 figures
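The non-differentiability the virtual codec addresses can be seen on the quantizer itself: rounding has zero gradient almost everywhere. The paper learns a network to bridge this gap; as a simpler, purely illustrative surrogate (not the paper's method), rounding can be relaxed with a smooth step, where the sharpness parameter `t` is an arbitrary choice:

```python
import numpy as np

def hard_quantize(x, q):
    # piecewise constant: its gradient is 0 almost everywhere, so no
    # useful signal can be back-propagated through it
    return q * np.round(x / q)

def soft_quantize(x, q, t=8.0):
    # a smooth, differentiable surrogate: the jump at each bin boundary
    # is replaced by a steep (but finite-slope) tanh transition
    k = np.floor(x / q)
    frac = x / q - k
    return q * (k + 0.5 + 0.5 * np.tanh(t * (frac - 0.5)) / np.tanh(t / 2))
```

Away from bin boundaries the surrogate closely tracks the hard quantizer while keeping a nonzero gradient, which is the same role the learned virtual codec network plays for the full JPEG pipeline.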
Trainable Nonlinear Reaction Diffusion: A Flexible Framework for Fast and Effective Image Restoration
Image restoration is a long-standing problem in low-level computer vision
with many interesting applications. We describe a flexible learning framework
based on the concept of nonlinear reaction diffusion models for various image
restoration problems. By embodying recent improvements in nonlinear diffusion
models, we propose a dynamic nonlinear reaction diffusion model with
time-dependent parameters (i.e., linear filters and influence functions). In
contrast to previous nonlinear diffusion models, all the parameters, including
the filters and the influence functions, are simultaneously learned from
training data through a loss-based approach. We call this approach TNRD:
Trainable Nonlinear Reaction Diffusion. The TNRD approach is applicable to a
variety of image restoration tasks by incorporating an appropriate reaction
force. We demonstrate its capabilities with three representative applications:
Gaussian image denoising, single-image super-resolution, and JPEG deblocking.
Experiments show that our trained nonlinear
diffusion models largely benefit from the training of the parameters and
finally lead to the best reported performance on common test datasets for the
tested applications. Our trained models preserve the structural simplicity of
diffusion models and take only a small number of diffusion steps, thus are
highly efficient. Moreover, they are also well-suited for parallel computation
on GPUs, which makes the inference procedure extremely fast.
Comment: 14 pages, 13 figures, to appear in IEEE Transactions on Pattern
Analysis and Machine Intelligence (TPAMI)
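One time step of such a diffusion model can be sketched with a single hand-set filter and influence function. The trained model uses many learned filter/influence pairs, so the Laplacian kernel, step size, and Perona-Malik-style function below are illustrative assumptions:

```python
import numpy as np

def conv_same(img, k):
    # "same" 2-D convolution with edge padding (3x3 kernels)
    p = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * p[dy:dy+h, dx:dx+w]
    return out

def tnrd_step(u, f, lam=0.1):
    # one diffusion step: u <- u - step * (kbar * phi(k * u)) - lam * (u - f)
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)  # Laplacian
    phi = lambda z: z / (1.0 + z * z)          # influence function
    diffusion = conv_same(phi(conv_same(u, k)), k[::-1, ::-1])  # kbar = rotated k
    return u - 0.1 * diffusion - lam * (u - f)  # reaction term pulls toward data f
```

Because each step is just a few convolutions and a pointwise nonlinearity, a small fixed number of steps is cheap and maps directly onto GPU parallelism, which is the efficiency argument the abstract makes.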
Gradient Distribution Priors for Biomedical Image Processing
Ill-posed inverse problems are commonplace in biomedical image processing.
Their solution typically requires imposing prior knowledge about the latent
ground truth. While this regularizes the problem to an extent where it can be
solved, it also biases the result toward the expected. With inappropriate
priors harming more than they help, it remains unclear which prior to use for a
given practical problem. Priors are hence mostly chosen in an ad hoc or
empirical fashion. We argue here that the gradient distribution of
natural-scene images may provide a versatile and well-founded prior for
biomedical images. We provide motivation for this choice from different points
of view, and we fully validate the resulting prior for use on biomedical images
by showing its stability and correlation with image quality. We then provide a
set of simple parametric models for the resulting prior, leading to
straightforward (quasi-)convex optimization problems for which we provide
efficient solver algorithms. We illustrate the use of the present models and
solvers in a variety of common image-processing tasks, including contrast
enhancement, noise level estimation, denoising, blind deconvolution,
zooming/up-sampling, and dehazing. In all cases we show that the present method
leads to results that are comparable to or better than the state of the art,
always using the same simple prior. We conclude by discussing the limitations
and possible interpretations of the prior.
Comment: submitted to journal
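The heavy-tailed gradient statistics motivating this prior are easy to observe: gradients of a piecewise-constant signal are sparse, with excess kurtosis far above a Gaussian's zero. The toy signal below is a crude stand-in for natural-scene statistics, not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# piecewise-constant toy "image" row: mostly flat with a few sharp jumps
img = np.repeat(rng.integers(0, 255, 32), 8).astype(float)
grad = np.diff(img)   # mostly zeros, occasional large edge responses

# excess kurtosis: 0 for a Gaussian, large and positive for sparse,
# heavy-tailed distributions like natural-image gradients
m = grad - grad.mean()
kurt = (m**4).mean() / (m**2).mean()**2 - 3.0
print(kurt)
```

Parametric fits to such heavy-tailed histograms (e.g., hyper-Laplacian-type models) are what turn this empirical observation into the tractable (quasi-)convex regularizers the paper solves for.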