Compression Artifacts Reduction by a Deep Convolutional Network
Lossy compression introduces complex compression artifacts, particularly the
blocking artifacts, ringing effects and blurring. Existing algorithms either
focus on removing blocking artifacts and produce blurred output, or restore
sharpened images that are accompanied by ringing effects. Inspired by the
deep convolutional networks (DCN) on super-resolution, we formulate a compact
and efficient network for seamless attenuation of different compression
artifacts. We also demonstrate that a deeper model can be effectively trained
with the features learned in a shallow network. Following a similar "easy to
hard" idea, we systematically investigate several practical transfer settings
and show the effectiveness of transfer learning in low-level vision problems.
Our method outperforms the state of the art both on the
benchmark datasets and in a real-world use case (i.e. Twitter). In addition, we
show that our method can be applied as pre-processing to facilitate other
low-level vision routines when they take compressed images as input.
Comment: 9 pages, 12 figures, conferenc
DPW-SDNet: Dual Pixel-Wavelet Domain Deep CNNs for Soft Decoding of JPEG-Compressed Images
JPEG is one of the most widely used lossy compression methods. JPEG-compressed
images usually suffer from compression artifacts including blocking and
blurring, especially at low bit-rates. Soft decoding is an effective solution
to improve the quality of compressed images without changing codec or
introducing extra coding bits. Inspired by the excellent performance of the
deep convolutional neural networks (CNNs) on both low-level and high-level
computer vision problems, we develop a dual pixel-wavelet domain deep
CNNs-based soft decoding network for JPEG-compressed images, namely DPW-SDNet.
The pixel domain deep network takes the four downsampled versions of the
compressed image to form a 4-channel input and outputs a pixel domain
prediction, while the wavelet domain deep network uses the 1-level discrete
wavelet transformation (DWT) coefficients to form a 4-channel input to produce
a DWT domain prediction. The pixel domain and wavelet domain estimates are
combined to generate the final soft decoded result. Experimental results
demonstrate the superiority of the proposed DPW-SDNet over several
state-of-the-art compression artifacts reduction algorithms.
Comment: CVPRW 201
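The two 4-channel inputs described in the DPW-SDNet abstract can be illustrated with a minimal sketch. It assumes the "four downsampled versions" are the four polyphase (2x2 interleaved) sub-images, and it uses an orthonormal Haar wavelet for the 1-level DWT; the abstract confirms neither choice, and the function names `polyphase_split` and `haar_dwt2` are hypothetical.

```python
def polyphase_split(img):
    """Split an H x W image (even H and W) into four half-resolution
    sub-images by taking every second pixel at each 2x2 offset."""
    return [
        [row[dx::2] for row in img[dy::2]]
        for dy in (0, 1)
        for dx in (0, 1)
    ]


def haar_dwt2(img):
    """One-level 2-D orthonormal Haar transform of an H x W image
    (even H and W). Returns four half-resolution sub-bands:
    approximation, row difference, column difference, diagonal."""
    approx, row_diff, col_diff, diag = [], [], [], []
    for y in range(0, len(img), 2):
        r_a, r_r, r_c, r_d = [], [], [], []
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            r_a.append((a + b + c + d) / 2)  # local average (scaled)
            r_r.append((a + b - c - d) / 2)  # top row minus bottom row
            r_c.append((a - b + c - d) / 2)  # left column minus right column
            r_d.append((a - b - c + d) / 2)  # diagonal detail
        approx.append(r_a)
        row_diff.append(r_r)
        col_diff.append(r_c)
        diag.append(r_d)
    return approx, row_diff, col_diff, diag


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(polyphase_split(image))  # four 2x2 pixel-domain channels
    print(haar_dwt2(image))        # four 2x2 wavelet-domain channels
```

Both functions turn one H x W image into four H/2 x W/2 channels, so the two branches would receive inputs of the same shape under these assumptions.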
S-Net: A Scalable Convolutional Neural Network for JPEG Compression Artifact Reduction
Recent studies have used deep residual convolutional neural networks (CNNs)
for JPEG compression artifact reduction. This study proposes a scalable CNN
called S-Net. Our approach effectively adjusts the network scale dynamically in
a multitask system for real-time operation with little performance loss. It
offers a simple and direct technique to evaluate the performance gains obtained
with increasing network depth, and it is helpful for removing redundant network
layers to maximize the network efficiency. We implement our architecture using
the Keras framework with the TensorFlow backend on an NVIDIA K80 GPU server. We
train our models on the DIV2K dataset and evaluate their performance on public
benchmark datasets. To validate the generality and universality of the proposed
method, we created and utilized a new dataset, called WIN143, for
evaluating over-processed images. Experimental results indicate that our
proposed approach outperforms other CNN-based methods and achieves
state-of-the-art performance.
Comment: accepted by Journal of Electronic Imagin
Deep Generative Adversarial Compression Artifact Removal
Compression artifacts arise in images whenever a lossy compression algorithm
is applied. These artifacts eliminate details present in the original image, or
add noise and small structures; because of these effects they make images less
pleasant for the human eye, and may also lead to decreased performance of
computer vision algorithms such as object detectors. To eliminate such
artifacts, when decompressing an image, it is required to recover the original
image from a disturbed version. To this end, we present a feed-forward fully
convolutional residual network model trained using a generative adversarial
framework. To provide a baseline, we show that our model can also be trained
by optimizing the Structural Similarity (SSIM), which is a better loss than
the simpler Mean Squared Error (MSE). Our GAN is able to produce
images with more photorealistic details than MSE or SSIM based networks.
Moreover, we show that our approach can be used as a pre-processing step for
object detection when images are degraded by compression to the point that
state-of-the-art detectors fail. In this task, our GAN method obtains better
performance than MSE or SSIM trained networks.
Comment: ICCV 2017 Camera Ready + Acknowledgement
Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image
Although deep convolutional neural networks have been shown to efficiently
eliminate coding artifacts caused by the coarse quantization of traditional
codecs, it is difficult to train a neural network placed in front of the encoder,
because gradients cannot be back-propagated through it. In this paper, we propose an end-to-end image
compression framework based on convolutional neural network to resolve the
problem of non-differentiability of the quantization function in the standard
codec. First, the feature description neural network is used to get a valid
description in the low-dimension space with respect to the ground-truth image
so that the amount of image data is greatly reduced for storage or
transmission. After the valid description is obtained, a standard image codec such as
JPEG is used to further compress the image, which introduces severe
distortion and compression artifacts, especially blocking artifacts, missing
detail, blurring, and ringing artifacts. Then, we use a post-processing neural
network to remove these artifacts. Due to the challenge of directly learning a
non-linear function for a standard codec based on convolutional neural network,
we propose to learn a virtual codec neural network to approximate the
projection from the valid description image to the post-processed compressed
image, so that the gradient could be efficiently back-propagated from the
post-processing neural network to the feature description neural network during
training. Meanwhile, an advanced learning algorithm is proposed to train our
deep neural networks for compression. An advantage of the proposed
method is that it is compatible with existing standard codecs, and our
convolutional neural network-based learning strategy can be easily extended to these codecs.
Experimental results demonstrate the advantages of the proposed method
compared to several state-of-the-art approaches, especially at very low
bit-rates.
Comment: 11 pages, 7 figure
Implicit Dual-domain Convolutional Network for Robust Color Image Compression Artifact Reduction
Several dual-domain convolutional neural network-based methods show
outstanding performance in reducing image compression artifacts. However, they
struggle to handle color images because the compression processes for
gray-scale and color images are completely different. Moreover, these methods
train a specific model for each compression quality and require multiple models
to handle different compression qualities. To address these problems, we
proposed an implicit dual-domain convolutional network (IDCN) with the pixel
position labeling map and the quantization tables as inputs. Specifically, we
proposed an extractor-corrector framework-based dual-domain correction unit
(DCU) as the basic component to formulate the IDCN. A dense block was
introduced to improve the performance of the extractor in the DCU. The implicit
dual-domain translation allows the IDCN to handle color images with the
discrete cosine transform (DCT)-domain priors. A flexible version of IDCN
(IDCN-f) was developed to handle a wide range of compression qualities.
Experiments for both objective and subjective evaluations on benchmark datasets
show that IDCN is superior to the state-of-the-art methods and IDCN-f exhibits
excellent abilities to handle a wide range of compression qualities with little
performance sacrifice and demonstrates great potential for practical
applications.
Comment: accepted by IEEE Transactions on Circuits and Systems for Video
Technology (T-CSVT
Machine Learning based Post Processing Artifact Reduction in HEVC Intra Coding
Lossy compression techniques produce various artifacts such as blurring,
distortion at block boundaries, ringing and contouring effects in their outputs,
especially at low bit rates. To reduce these compression artifacts, various
Convolutional Neural Network (CNN) based post-processing techniques have been
explored in recent years. The latest video coding standard, HEVC, adopts
two post-processing filtering operations, namely a de-blocking filter (DBF)
followed by sample adaptive offset (SAO). These operations consume extra
signaling bits and become an overhead for the network. In this paper, we propose a
new deep learning based algorithm for the SAO filtering operation. We designed a
variable filter size Sub-layered Deeper CNN (SDCNN) architecture to improve the
filtering operation and incorporated large-stride convolutional and deconvolution
layers for further speed-up. We also demonstrate that a deeper architecture
can be trained effectively with the features learnt in a shallow network
using data augmentation and transfer learning based techniques. Experimental
results show that our proposed network outperforms other networks in terms of
PSNR and SSIM measurements on widely available benchmark video sequences and
also achieves an average bit rate reduction of 4.1% compared to the HEVC
baseline.
Deep Reference Generation with Multi-Domain Hierarchical Constraints for Inter Prediction
Inter prediction is an important module in video coding for temporal
redundancy removal, where similar reference blocks are searched from previously
coded frames and employed to predict the block to be coded. Although
traditional video codecs can estimate and compensate for block-level motions,
their inter prediction performance is still heavily affected by the remaining
inconsistent pixel-wise displacement caused by irregular rotation and
deformation. In this paper, we address the problem by proposing a deep frame
interpolation network to generate additional reference frames in coding
scenarios. First, we summarize the previous adaptive convolutions used for
frame interpolation and propose a factorized kernel convolutional network to
improve the modeling capacity and simultaneously keep its compact form. Second,
to better train this network, multi-domain hierarchical constraints are
introduced to regularize the training of our factorized kernel convolutional
network. For the spatial domain, we use a gradually down-sampled and up-sampled
auto-encoder to generate the factorized kernels for frame interpolation at
different scales. For the quality domain, considering the inconsistent quality of
the input frames, the factorized kernel convolution is modulated with
quality-related features to learn to exploit more information from high quality
frames. For the frequency domain, a sum of absolute transformed difference loss
that performs frequency transformation is utilized to facilitate network
optimization from the view of coding performance. With the well-designed frame
interpolation network regularized by multi-domain hierarchical constraints, our
method surpasses HEVC with an average BD-rate saving of 6.1% and up to 11.0% BD-rate
saving for the luma component under the random access configuration.
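The "sum of absolute transformed difference" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a 4x4 block and an unnormalized Hadamard transform as the frequency transformation; the abstract does not state the block size, the transform, or any scaling, and the names `H4`, `matmul` and `satd4x4` are hypothetical.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def satd4x4(block_a, block_b):
    """Sum of absolute values of the Hadamard-transformed difference
    between two 4x4 blocks (no normalization applied)."""
    diff = [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(block_a, block_b)]
    transformed = matmul(matmul(H4, diff), H4)  # H4 * diff * H4^T
    return sum(abs(v) for row in transformed for v in row)


if __name__ == "__main__":
    zeros = [[0] * 4 for _ in range(4)]
    spike = [[1 if (y, x) == (0, 0) else 0 for x in range(4)] for y in range(4)]
    print(satd4x4(spike, zeros))
```

Unlike a plain sum of absolute differences, this measure spreads a single-pixel error across all transform coefficients, which ties it more closely to the cost of coding the residual in a transform-based codec.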
Spatial-Temporal Residue Network Based In-Loop Filter for Video Coding
Deep learning has demonstrated tremendous breakthroughs in the area of
image/video processing. In this paper, a spatial-temporal residue network
(STResNet) based in-loop filter is proposed to suppress visual artifacts such
as blocking and ringing in video coding. Specifically, the spatial and temporal
information is jointly exploited by taking both the current block and the co-located
block in the reference frame into consideration during in-loop
filtering. The architecture of STResNet consists of only four convolution layers,
which keeps memory and coding complexity low. Moreover, to fully
adapt to the input content and improve the performance of the proposed in-loop
filter, a coding tree unit (CTU) level control flag is applied in the sense of
rate-distortion optimization. Extensive experimental results show that our
scheme provides up to 5.1% bit-rate reduction compared to the state-of-the-art
video coding standard.
Comment: 4 pages, 2 figures, accepted by VCIP201
Learned Compression Artifact Removal by Deep Residual Networks
We propose a method for learned compression artifact removal by
post-processing of BPG compressed images. We trained three networks of
different sizes. We encoded input images using BPG with different QP values. We
submitted the best combination of test images, encoded with different QP and
post-processed by one of three networks, which satisfies the file size and decode
time constraints imposed by the Challenge. The selection of the best
combination is posed as an integer programming problem. Although the visual
improvements in image quality are impressive, the average PSNR improvement for
the results is about 0.5 dB.
Comment: Accepted for publication in the CVPR 2018, Challenge on Learned Image
Compression (CLIC), Salt Lake City, Utah, USA, 18 June 2018 and appears in
compression.c
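The selection step described in the abstract above (one QP and network per image, under a budget) can be illustrated with a minimal sketch. It is a brute-force search rather than an integer programming solver, it models only a total file-size budget, and the function name `select_best` and all numbers are hypothetical; none of this is taken from the paper.

```python
from itertools import product


def select_best(options, size_budget):
    """options[i] lists the (file_size, psnr) candidates for image i, one per
    (QP, network) pairing. Pick exactly one candidate per image so that the
    total size stays within size_budget and the total PSNR is maximal.
    Returns (chosen indices, total PSNR), or (None, None) if nothing fits."""
    best_choice, best_psnr = None, None
    for choice in product(*(range(len(cands)) for cands in options)):
        total_size = sum(options[i][c][0] for i, c in enumerate(choice))
        if total_size > size_budget:
            continue  # violates the file-size constraint
        total_psnr = sum(options[i][c][1] for i, c in enumerate(choice))
        if best_psnr is None or total_psnr > best_psnr:
            best_choice, best_psnr = choice, total_psnr
    return best_choice, best_psnr


if __name__ == "__main__":
    # Two images, two made-up candidates each: (size in KB, PSNR in dB).
    candidates = [
        [(10, 30.0), (20, 33.0)],
        [(10, 28.0), (25, 34.0)],
    ]
    print(select_best(candidates, size_budget=35))
```

The exhaustive search grows exponentially with the number of images, which is why a formulation as an integer program is the practical route for a full test set; a decode-time limit could be added as a second constraint in the same way.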