Unified Depth Prediction and Intrinsic Image Decomposition from a Single Image via Joint Convolutional Neural Fields
We present a method for jointly predicting a depth map and intrinsic images
from single-image input. The two tasks are formulated in a synergistic manner
through a joint conditional random field (CRF) that is solved using a novel
convolutional neural network (CNN) architecture, called the joint convolutional
neural field (JCNF) model. Tailored to our joint estimation problem, JCNF
differs from previous CNNs in its sharing of convolutional activations and
layers between networks for each task, its inference in the gradient domain
where there exists greater correlation between depth and intrinsic images, and
the incorporation of a gradient scale network that learns the confidence of
estimated gradients in order to effectively balance them in the solution. This
approach is shown to surpass state-of-the-art methods both on single-image
depth estimation and on intrinsic image decomposition.
Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning
Single image super resolution (SR), which refers to reconstructing a
higher-resolution (HR) image from the observed low-resolution (LR) image, has
received substantial attention due to its tremendous application potential.
Despite the breakthroughs of recently proposed SR methods using convolutional
neural networks (CNNs), their generated results usually fail to preserve
structural (high-frequency) details. In this paper, regarding global boundary
context and residual context as complementary information for enhancing
structural details in image restoration, we develop a contextualized multi-task
learning framework to address the SR problem. Specifically, our method first
extracts convolutional features from the input LR image and applies one
deconvolutional module to interpolate the LR feature maps in a content-adaptive
way. Then, the resulting feature maps are fed into two branched sub-networks.
During the neural network training, one sub-network outputs salient image
boundaries and the HR image, and the other sub-network outputs the local
residual map, i.e., the residual difference between the generated HR image and
ground-truth image. On several standard benchmarks (i.e., Set5, Set14 and
BSD200), our extensive evaluations demonstrate the effectiveness of our SR
method in achieving both higher restoration quality and computational
efficiency compared with several state-of-the-art SR approaches. The source
code and some SR results can be found at:
http://hcp.sysu.edu.cn/structure-preserving-image-super-resolution/
Comment: To appear in Transactions on Multimedia 201
Deep Likelihood Network for Image Restoration with Multiple Degradation Levels
Convolutional neural networks have been proven effective in a variety of
image restoration tasks. Most state-of-the-art solutions, however, are trained
using images with a single particular degradation level, and their performance
deteriorates drastically when applied to other degradation settings. In this
paper, we propose the deep likelihood network (DL-Net), which aims to generalize
off-the-shelf image restoration networks to succeed over a spectrum of
degradation levels. We slightly modify an off-the-shelf network by appending a
simple recursive module, which is derived from a fidelity term, for
disentangling the computation for multiple degradation levels. Extensive
experimental results on image inpainting, interpolation, and super-resolution
show the effectiveness of our DL-Net.
Comment: Accepted by IEEE Transactions on Image Processing; 13 pages, 6
figure
A Cascaded Convolutional Neural Network for Single Image Dehazing
Images captured in outdoor scenes usually suffer from low contrast and
limited visibility due to suspended atmospheric particles, which directly
affects the quality of photos. Although numerous image dehazing methods have
been proposed, effective hazy image restoration remains a challenging problem.
Existing learning-based methods usually predict the medium transmission by
Convolutional Neural Networks (CNNs), but ignore the key global atmospheric
light. Different from previous learning-based methods, we propose a flexible
cascaded CNN for single hazy image restoration, which considers the medium
transmission and global atmospheric light jointly by two task-driven
subnetworks. Specifically, the medium transmission estimation subnetwork is
inspired by the densely connected CNN while the global atmospheric light
estimation subnetwork is a light-weight CNN. In addition, these two subnetworks are
cascaded by sharing common features. Finally, with the estimated model
parameters, the haze-free image is obtained by the atmospheric scattering model
inversion, which achieves more accurate and effective restoration performance.
Qualitative and quantitative experimental results on synthetic and
real-world hazy images demonstrate that the proposed method effectively removes
haze from such images, and outperforms several state-of-the-art dehazing
methods.
Comment: This manuscript is accepted by IEEE ACCES
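The final step described in this abstract, inverting the atmospheric scattering model, can be illustrated with a small sketch. It assumes the commonly used form I = J*t + A*(1 - t), where I is the hazy image, J the haze-free image, t the medium transmission and A the global atmospheric light. The function names `add_haze` and `dehaze`, the lower bound `t_min`, and the single-channel list-of-lists image format are illustrative assumptions, not part of the paper; in the paper, t and A come from the two subnetworks, whereas here they are simply given.

```python
def add_haze(clear, transmission, airlight):
    """Synthesize a hazy image with I = J*t + A*(1 - t) (single channel)."""
    return [
        [j * t + airlight * (1.0 - t) for j, t in zip(row_j, row_t)]
        for row_j, row_t in zip(clear, transmission)
    ]


def dehaze(hazy, transmission, airlight, t_min=0.1):
    """Invert the scattering model: J = (I - A) / max(t, t_min) + A.

    t_min is a hypothetical safeguard against division by a tiny transmission.
    """
    restored = []
    for row_i, row_t in zip(hazy, transmission):
        out_row = []
        for i, t in zip(row_i, row_t):
            t = max(t, t_min)
            j = (i - airlight) / t + airlight
            # Clamp to the valid intensity range [0, 1].
            out_row.append(min(max(j, 0.0), 1.0))
        restored.append(out_row)
    return restored


# Toy 2x2 example with a known transmission map and atmospheric light.
clear = [[0.2, 0.5], [0.8, 0.1]]
transmission = [[0.9, 0.6], [0.4, 0.7]]
airlight = 0.95

hazy = add_haze(clear, transmission, airlight)
restored = dehaze(hazy, transmission, airlight)
```

With exact t and A the inversion recovers the clear image up to floating-point error; in practice the quality of the result depends on how well both quantities are estimated.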
Super-Resolution via Deep Learning
The recent phenomenal interest in convolutional neural networks (CNNs) must
have made it inevitable for the super-resolution (SR) community to explore their
potential. The response has been immense, and in the last three years, since the
advent of the pioneering work, too many works have appeared not to warrant a
comprehensive survey. This paper surveys the SR literature in the context of
deep learning. We focus on the three important aspects of multimedia - namely
image, video and multi-dimensions, especially depth maps. In each case, first
relevant benchmarks are introduced in the form of datasets and state of the art
SR methods, excluding deep learning. Next is a detailed analysis of the
individual works, each including a short description of the method and a
critique of the results with special reference to the benchmarking done. This
is followed by minimum overall benchmarking in the form of comparison on some
common dataset, while relying on the results reported in various works.
Joint Image Filtering with Deep Convolutional Networks
Joint image filters leverage the guidance image as a prior and transfer the
structural details from the guidance image to the target image for suppressing
noise or enhancing spatial resolution. Existing methods either rely on various
explicit filter constructions or hand-designed objective functions, thereby
making it difficult to understand, improve, and accelerate these filters in a
coherent framework. In this paper, we propose a learning-based approach for
constructing joint filters based on Convolutional Neural Networks. In contrast
to existing methods that consider only the guidance image, the proposed
algorithm can selectively transfer salient structures that are consistent with
both guidance and target images. We show that the model trained on a certain
type of data, e.g., RGB and depth images, generalizes well to other modalities,
e.g., flash/non-flash and RGB/NIR images. We validate the effectiveness of the
proposed joint filter through extensive experimental evaluations with
state-of-the-art methods.
Comment: Accepted by TPAM
Missing Data Reconstruction in Remote Sensing image with a Unified Spatial-Temporal-Spectral Deep Convolutional Neural Network
Because of the internal malfunction of satellite sensors and poor atmospheric
conditions such as thick cloud, the acquired remote sensing data often suffer
from missing information, which greatly reduces data usability. In this
paper, a novel method of missing information reconstruction in remote sensing
images is proposed. The unified spatial-temporal-spectral framework based on a
deep convolutional neural network (STS-CNN) employs a unified deep
convolutional neural network combined with spatial-temporal-spectral
supplementary information. In addition, to address the fact that most methods
can only deal with a single missing information reconstruction task, the
proposed approach can solve three typical missing information reconstruction
tasks: 1) dead lines in Aqua MODIS band 6; 2) the Landsat ETM+ Scan Line
Corrector (SLC)-off problem; and 3) thick cloud removal. It should be noted
that the proposed model can use multi-source data (spatial, spectral, and
temporal) as the input of the unified framework. The results of both simulated
and real-data experiments demonstrate that the proposed model exhibits high
effectiveness in the three missing information reconstruction tasks listed
above.
Comment: To be published in IEEE Transactions on Geoscience and Remote Sensin
Towards Real Scene Super-Resolution with Raw Images
Most existing super-resolution methods do not perform well in real scenarios
due to lack of realistic training data and information loss of the model input.
To solve the first problem, we propose a new pipeline to generate realistic
training data by simulating the imaging process of digital cameras. And to
remedy the information loss of the input, we develop a dual convolutional
neural network to exploit the originally captured radiance information in raw
images. In addition, we propose to learn a spatially-variant color
transformation which helps achieve more effective color correction. Extensive
experiments demonstrate that super-resolution with raw data helps recover fine
details and clear structures, and more importantly, the proposed network and
data generation pipeline achieve superior results for single image
super-resolution in real scenarios.
Comment: Accepted in CVPR 2019, project page:
https://sites.google.com/view/xiangyuxu/rawsr_cvpr1
Bayesian Convolutional Neural Networks for Compressed Sensing Restoration
Deep Neural Networks (DNNs) have attracted great attention in Compressed
Sensing (CS) restoration. However, the working mechanism of DNNs is not
explainable, so it is unclear how to design an optimal DNN for CS
restoration. In this paper, we propose a novel statistical framework to explain
DNNs, which proves that the hidden layers of DNNs are equivalent to Gibbs
distributions and interprets DNNs as a Bayesian hierarchical model. The
framework provides a Bayesian perspective to explain the working mechanism of
DNNs, namely some hidden layers learn a prior distribution and other layers
learn a likelihood distribution. Moreover, the framework provides insights into
DNNs and reveals two inherent limitations of DNNs for CS restoration. In
contrast to most previous works that design an end-to-end DNN for CS
restoration, we propose a novel DNN that models a prior distribution only, which
can circumvent the limitations of DNNs. Given the prior distribution generated
from the DNN, we design a Bayesian inference algorithm to realize CS
restoration in the framework of Bayesian Compressed Sensing. Finally, extensive
simulations validate the proposed theory of DNNs and demonstrate that the
proposed algorithm outperforms the state-of-the-art CS restoration methods.
Deformable kernel networks for guided depth map upsampling
We address the problem of upsampling a low-resolution (LR) depth map using a
registered high-resolution (HR) color image of the same scene. Previous methods
based on convolutional neural networks (CNNs) combine nonlinear activations of
spatially-invariant kernels to estimate structural details from LR depth and HR
color images, and regress upsampling results directly from the networks. In
this paper, we revisit the weighted averaging process that has been widely used
to transfer structural details from hand-crafted visual features to LR depth
maps. We instead learn explicitly sparse and spatially-variant kernels for this
task. To this end, we propose a CNN architecture and its efficient
implementation, called the deformable kernel network (DKN), that outputs sparse
sets of neighbors and the corresponding weights adaptively for each pixel. We
also propose a fast version of DKN (FDKN) that runs about 17 times faster (0.01
seconds for an HR image of size 640 x 480). Experimental results on standard
benchmarks demonstrate the effectiveness of our approach. In particular, we
show that the weighted averaging process with 3 x 3 kernels (i.e., aggregating
9 samples sparsely chosen) outperforms the state of the art by a significant
margin.
Comment: conference submissio
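The weighted averaging with sparse, spatially-variant kernels mentioned in this abstract can be pictured with a small sketch. It is not the DKN implementation: the function name `sparse_weighted_average`, the integer offsets, the border clamping and the hand-picked weights are all assumptions made for illustration, whereas in the paper a CNN outputs the neighbors and weights for each pixel.

```python
def sparse_weighted_average(depth, offsets, weights):
    """Average a few chosen neighbors per pixel with per-pixel weights.

    depth:   2D list of floats.
    offsets: offsets[y][x] is a list of (dy, dx) neighbor offsets for pixel (y, x).
    weights: weights[y][x] is a list of floats, one per offset.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            norm = 0.0
            for (dy, dx), wk in zip(offsets[y][x], weights[y][x]):
                # Clamp sampling positions to the image border (an assumption).
                ny = min(max(y + dy, 0), h - 1)
                nx = min(max(x + dx, 0), w - 1)
                total += wk * depth[ny][nx]
                norm += wk
            # Fall back to the input value if no weight mass was given.
            out[y][x] = total / norm if norm else depth[y][x]
    return out


# Toy 2x2 depth map; only pixel (0, 0) uses a non-trivial sparse kernel.
depth = [[1.0, 3.0], [2.0, 4.0]]
identity = [(0, 0)]
offsets = [[[(0, 0), (1, 1)], identity], [identity, identity]]
weights = [[[0.25, 0.75], [1.0]], [[1.0], [1.0]]]

result = sparse_weighted_average(depth, offsets, weights)
```

Because both the sampled positions and the weights differ from pixel to pixel, the kernel is spatially variant even though each pixel aggregates only a handful of samples.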