Learning a Dilated Residual Network for SAR Image Despeckling
In this paper, to break the limit of the traditional linear models for
synthetic aperture radar (SAR) image despeckling, we propose a novel deep
learning approach by learning a non-linear end-to-end mapping between the noisy
and clean SAR images with a dilated residual network (SAR-DRN). SAR-DRN is
based on dilated convolutions, which can both enlarge the receptive field and
maintain the filter size and layer depth with a lightweight structure. In
addition, skip connections and a residual learning strategy are added to the
despeckling model to maintain the image details and reduce the vanishing
gradient problem. The proposed method outperforms both traditional and
state-of-the-art despeckling methods on quantitative and visual assessments,
especially for strong speckle noise.
Comment: 18 pages, 13 figures, 7 tables
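The claim that dilated convolutions enlarge the receptive field without changing filter size or depth can be checked with simple arithmetic. The sketch below assumes stride-1 layers with 3x3 filters; the seven-layer dilation schedule is an illustrative choice, not a configuration stated in the abstract, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Seven ordinary 3x3 layers versus seven dilated 3x3 layers.
# The dilation schedule is an assumption made for illustration.
plain = receptive_field(3, [1] * 7)
dilated = receptive_field(3, [1, 2, 3, 4, 3, 2, 1])
print(plain, dilated)  # 15 33
```

Both stacks have the same number of layers and the same number of weights per layer; only the spacing between filter taps differs.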
Understanding Convolution for Semantic Segmentation
Recent advances in deep learning, especially deep convolutional neural
networks (CNNs), have led to significant improvement over previous semantic
segmentation systems. Here we show how to improve pixel-wise semantic
segmentation by manipulating convolution-related operations that are of both
theoretical and practical value. First, we design dense upsampling convolution
(DUC) to generate pixel-level prediction, which is able to capture and decode
more detailed information that is generally missing in bilinear upsampling.
Second, we propose a hybrid dilated convolution (HDC) framework in the encoding
phase. This framework 1) effectively enlarges the receptive fields (RF) of the
network to aggregate global information; 2) alleviates what we call the
"gridding issue" caused by the standard dilated convolution operation. We
evaluate our approaches thoroughly on the Cityscapes dataset, and achieve a
state-of-the-art result of 80.1% mIoU on the test set at the time of submission. We
have also achieved state-of-the-art overall on the KITTI road estimation
benchmark and the PASCAL VOC2012 segmentation task. Our source code can be
found at https://github.com/TuSimple/TuSimple-DUC.
Comment: WACV 2018. Updated acknowledgements. Source code:
https://github.com/TuSimple/TuSimple-DU
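The "gridding issue" can be illustrated in one dimension by tracking which input offsets can reach a single output position after several stacked dilated layers. This is a minimal sketch under assumed settings (3-tap filters, stride 1); the dilation rates and the helper names `touched_offsets` and `coverage` are illustrative, not taken from the abstract.

```python
def touched_offsets(dilations, kernel_size=3):
    """Input offsets that can influence one output position (1-D, stride 1)."""
    half = kernel_size // 2
    offsets = {0}
    for d in dilations:
        offsets = {o + t * d for o in offsets for t in range(-half, half + 1)}
    return offsets


def coverage(dilations, kernel_size=3):
    """Fraction of positions inside the receptive field that are actually used."""
    offsets = touched_offsets(dilations, kernel_size)
    span = max(offsets) - min(offsets) + 1
    return len(offsets) / span


# Same rate in every layer: only even offsets contribute, leaving holes.
print(sorted(touched_offsets([2, 2, 2])))  # [-6, -4, -2, 0, 2, 4, 6]
# Mixed rates: every offset inside the same 13-wide field contributes.
print(coverage([2, 2, 2]), coverage([1, 2, 3]))
```

With a repeated rate the receptive field is wide but sparse, which is the gridding pattern; mixing rates fills the holes while keeping the same overall width.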
Learning Deep CNN Denoiser Prior for Image Restoration
Model-based optimization methods and discriminative learning methods have
been the two dominant strategies for solving various inverse problems in
low-level vision. Typically, those two kinds of methods have their respective
merits and drawbacks, e.g., model-based optimization methods are flexible for
handling different inverse problems but are usually time-consuming, since they need
sophisticated priors to perform well; meanwhile,
discriminative learning methods have fast testing speed but their application
range is greatly restricted by the specialized task. Recent works have revealed
that, with the aid of variable splitting techniques, a denoiser prior can be
plugged in as a modular part of model-based optimization methods to solve other
inverse problems (e.g., deblurring). Such an integration induces considerable
advantage when the denoiser is obtained via discriminative learning. However,
the study of integration with a fast discriminative denoiser prior is still
lacking. To this end, this paper aims to train a set of fast and effective CNN
(convolutional neural network) denoisers and integrate them into a model-based
optimization method to solve other inverse problems. Experimental results
demonstrate that the learned set of denoisers not only achieves promising
Gaussian denoising results but can also be used as a prior to deliver good
performance for various low-level vision applications.
Comment: Accepted to CVPR 2017. Code: https://github.com/cszn/ircn
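The plug-in idea can be sketched with half-quadratic splitting, one common variable splitting scheme: alternate a closed-form data-fidelity step with a call to any denoiser. Everything below is an assumption made for illustration: the task is 1-D inpainting, the denoiser is a 3-tap moving average standing in for a trained CNN, the penalty `mu` is held fixed, and the function names are hypothetical.

```python
def smooth(signal):
    """Stand-in denoiser: 3-tap moving average with replicated edges.

    A trained CNN denoiser would be plugged in here instead.
    """
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + signal[i] + right) / 3.0)
    return out


def plug_and_play_inpaint(observed, mask, denoiser=smooth, mu=1.0, iterations=50):
    """Alternate a data step and a denoising step (half-quadratic splitting).

    observed: samples, with arbitrary values where mask is 0
    mask:     1 where a sample was measured, 0 where it is missing
    """
    z = list(observed)
    x = list(observed)
    for _ in range(iterations):
        # Data step: per-sample minimiser of
        # 0.5 * m * (y - x)^2 + 0.5 * mu * (x - z)^2
        x = [(m * y + mu * zi) / (m + mu) for y, m, zi in zip(observed, mask, z)]
        # Prior step: the denoiser acts as the (implicit) image prior.
        z = denoiser(x)
    return x


restored = plug_and_play_inpaint([1.0, 1.0, 0.0, 1.0, 1.0], [1, 1, 0, 1, 1])
print(restored)
```

Swapping the data step changes the inverse problem being solved, while the denoiser is reused unchanged, which is the modularity the abstract describes.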
Learned Perceptual Image Enhancement
Learning a typical image enhancement pipeline involves minimization of a loss
function between enhanced and reference images. While L1 and L2 losses are
perhaps the most widely used functions for this purpose, they do not
necessarily lead to perceptually compelling results. In this paper, we show
that adding a learned no-reference image quality metric to the loss can
significantly improve enhancement operators. This metric is implemented using a
CNN (convolutional neural network) trained on a large-scale dataset labelled
with aesthetic preferences of human raters. This loss allows us to conveniently
perform back-propagation in our learning framework to simultaneously optimize
for similarity to a given ground truth reference and perceptual quality. This
perceptual loss is only used to train parameters of image processing operators,
and does not impose any extra complexity at inference time. Our experiments
demonstrate that this loss can be effective for tuning a variety of operators
such as local tone mapping and dehazing.
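The structure of such a combined objective can be sketched as a fidelity term plus a weighted penalty from a no-reference quality score. This is only an illustration under stated assumptions: `toy_quality` is a hypothetical hand-written stand-in for the learned CNN metric, pixel values are assumed to lie in [0, 1], and the weighting scheme is not taken from the abstract.

```python
def mean_squared_error(enhanced, reference):
    """L2 fidelity term between the enhanced and the reference image."""
    return sum((e - r) ** 2 for e, r in zip(enhanced, reference)) / len(reference)


def combined_loss(enhanced, reference, quality_score, weight=0.1, max_score=1.0):
    """Fidelity loss plus a penalty that shrinks as the quality score rises."""
    fidelity = mean_squared_error(enhanced, reference)
    perceptual = max_score - quality_score(enhanced)
    return fidelity + weight * perceptual


def toy_quality(pixels):
    """Hypothetical scorer in [0, 1] that rewards dynamic range.

    It replaces the learned quality metric purely for illustration.
    """
    return max(pixels) - min(pixels)


reference = [0.0, 0.5, 1.0]
flat = [0.5, 0.5, 0.5]
print(combined_loss(reference, reference, toy_quality))
print(combined_loss(flat, reference, toy_quality))
```

Because the quality term is evaluated only on the enhanced output during training, it adds nothing to the cost of running the enhancement operator afterwards.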
- …