Deep SR-ITM: Joint Learning of Super-Resolution and Inverse Tone-Mapping for 4K UHD HDR Applications
Modern displays can now render high dynamic range (HDR), high
resolution (HR) videos of up to 8K UHD (Ultra High Definition). Consequently,
UHD HDR broadcasting and streaming have emerged as high quality premium
services. However, due to the lack of original UHD HDR video content,
appropriate conversion technologies are urgently needed to transform the legacy
low resolution (LR) standard dynamic range (SDR) videos into UHD HDR versions.
In this paper, we propose a joint super-resolution (SR) and inverse
tone-mapping (ITM) framework, called Deep SR-ITM, which learns the direct
mapping from LR SDR video to its HR HDR version. Joint SR and ITM is an
intricate task in which high-frequency details must be restored for SR jointly
with local contrast for ITM. Our network is able to restore fine details
by decomposing the input image and focusing on the separate base (low
frequency) and detail (high frequency) layers. Moreover, the proposed
modulation blocks apply location-variant operations to enhance local contrast.
The Deep SR-ITM shows good subjective quality with increased contrast and
details, outperforming the previous joint SR-ITM method.
Comment: Accepted at ICCV 2019 (Oral)
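The base/detail decomposition mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a simple box filter for the base layer and an additive residual for the detail layer; the filter choice, the additive split, and the function names `box_blur` and `decompose` are illustrative assumptions, not the decomposition used in Deep SR-ITM.

```python
def box_blur(img, radius=1):
    """Mean filter that averages only over in-bounds neighbours.

    img is a list of rows of floats (a grayscale image).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


def decompose(img, radius=1):
    """Split img into a low-frequency base and a high-frequency detail layer.

    Assumption: detail = img - base (additive residual), so base + detail
    reconstructs the input exactly up to floating-point error.
    """
    base = box_blur(img, radius)
    detail = [[p - b for p, b in zip(row_i, row_b)]
              for row_i, row_b in zip(img, base)]
    return base, detail


if __name__ == "__main__":
    image = [[0.0, 0.0, 0.0],
             [0.0, 9.0, 0.0],
             [0.0, 0.0, 0.0]]
    base, detail = decompose(image)
    print(base[1][1], detail[1][1])
```

A network that processes the two layers separately can dedicate one branch to smooth intensity changes and another to edges and texture, which is the motivation the abstract gives for the split.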
Learnable Exposure Fusion for Dynamic Scenes
In this paper, we focus on Exposure Fusion (EF) [ExposFusi2] for dynamic
scenes. The task is to fuse multiple images obtained by exposure bracketing to
create an image which comprises a high level of detail. Typically, such images
are not possible to obtain directly from a camera due to hardware limitations,
e.g., a limited dynamic range of the sensor. A major problem of such tasks is
that the images may not be spatially aligned due to scene motion or camera
motion. The required alignment by image registration is known to be an
ill-posed problem. In this case, the images to be aligned vary in their intensity
range, which makes the problem even more difficult.
To address these problems, we propose an end-to-end
Convolutional Neural Network (CNN) based approach that learns to estimate
exposure fusion from Low Dynamic Range (LDR) images depicting
different scene contents. To the best of our knowledge, no efficient and robust
CNN-based end-to-end approach can be found in the literature for this kind of
problem. The idea is to create a dataset with perfectly aligned LDR images to
obtain ground-truth exposure fusion images. At the same time, we obtain
additional LDR images with some motion, having the same exposure fusion
ground-truth as the perfectly aligned LDR images. This way, we can train an
end-to-end CNN having misaligned LDR input images, but with a proper ground
truth exposure fusion image. We propose a specific CNN architecture to solve
this problem. In various experiments, we show that the proposed approach yields
excellent results.
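For readers unfamiliar with exposure fusion, the following minimal sketch shows a classical hand-crafted variant for perfectly aligned grayscale images: each pixel is a weighted average of the bracketed exposures, with weights favouring mid-range intensities. This is not the CNN proposed in the abstract above; the Gaussian "well-exposedness" weight, the value `sigma=0.2`, and all function names are assumptions made for illustration.

```python
import math


def well_exposedness(p, sigma=0.2):
    """Weight that peaks at mid-gray (0.5) and falls off toward 0 and 1."""
    return math.exp(-((p - 0.5) ** 2) / (2.0 * sigma ** 2))


def fuse_pixel(values, sigma=0.2):
    """Weighted average of one pixel across several exposures."""
    weights = [well_exposedness(v, sigma) for v in values]
    total = sum(weights)  # strictly positive because exp() > 0
    return sum(w * v for w, v in zip(weights, values)) / total


def fuse_images(exposures, sigma=0.2):
    """Fuse aligned grayscale images (lists of rows, values in [0, 1])."""
    h, w = len(exposures[0]), len(exposures[0][0])
    return [[fuse_pixel([img[y][x] for img in exposures], sigma)
             for x in range(w)]
            for y in range(h)]


if __name__ == "__main__":
    under = [[0.0, 0.1]]
    over = [[0.5, 0.9]]
    print(fuse_images([under, over]))
```

The sketch presumes pixel-wise alignment; with scene or camera motion the per-pixel average mixes unrelated content, which is the failure case the abstract sets out to handle.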
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we propose
"DeepSurvey" as a mechanism embodying the entire process, from reading
through all the papers and generating ideas to writing the paper.
Comment: Survey Paper
Deep Retinex Decomposition for Low-Light Enhancement
The Retinex model is an effective tool for low-light image enhancement. It
assumes that observed images can be decomposed into reflectance and
illumination. Most existing Retinex-based methods have carefully designed
hand-crafted constraints and parameters for this highly ill-posed
decomposition, which may be limited by model capacity when applied in various
scenes. In this paper, we collect a LOw-Light dataset (LOL) containing
low/normal-light image pairs and propose a deep Retinex-Net learned on this
dataset, including a Decom-Net for decomposition and an Enhance-Net for
illumination adjustment. In the training process for Decom-Net, there is no
ground truth of decomposed reflectance and illumination. The network is learned
with only key constraints including the consistent reflectance shared by paired
low/normal-light images, and the smoothness of illumination. Based on the
decomposition, subsequent lightness enhancement is conducted on illumination by
an enhancement network called Enhance-Net, and for joint denoising there is a
denoising operation on reflectance. The Retinex-Net is end-to-end trainable, so
that the learned decomposition is by nature good for lightness adjustment.
Extensive experiments demonstrate that our method not only achieves visually
pleasing quality for low-light enhancement but also provides a good
representation of image decomposition.
Comment: BMVC 2018 (Oral). Dataset and Project page:
https://daooshee.github.io/BMVC2018website
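The Retinex assumption in the abstract above, that an observed image is the element-wise product of reflectance and illumination, can be sketched per pixel as follows. The sketch uses a hand-crafted illumination estimate (the maximum colour channel) and a gamma curve for brightening; both are stand-in assumptions, not the learned Decom-Net and Enhance-Net, and the function names are hypothetical.

```python
def decompose_pixel(rgb, eps=1e-6):
    """Split an RGB pixel (values in [0, 1]) into reflectance and illumination.

    Assumption: illumination is the maximum channel, floored at eps to
    avoid division by zero, so that rgb = reflectance * illumination.
    """
    illum = max(max(rgb), eps)
    refl = tuple(c / illum for c in rgb)
    return refl, illum


def enhance_pixel(rgb, gamma=0.5):
    """Brighten a pixel by adjusting illumination only.

    For illumination in (0, 1], raising it to a power gamma < 1 increases
    it; reflectance (the colour ratios) is left unchanged.
    """
    refl, illum = decompose_pixel(rgb)
    brighter = illum ** gamma
    return tuple(r * brighter for r in refl)


if __name__ == "__main__":
    dark = (0.04, 0.02, 0.01)
    print(decompose_pixel(dark))
    print(enhance_pixel(dark))
```

Because only the illumination term is modified, the ratios between colour channels are preserved, which reflects the idea that reflectance is shared between the low-light and normal-light versions of a scene.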
UG Track 2: A Collective Benchmark Effort for Evaluating and Advancing Image Understanding in Poor Visibility Environments
The UG challenge in IEEE CVPR 2019 aims to evoke a comprehensive
discussion and exploration about how low-level vision techniques can benefit
the high-level automatic visual recognition in various scenarios. In its second
track, we focus on object or face detection in poor visibility environments
caused by bad weather (haze, rain) and low light conditions. While existing
enhancement methods are empirically expected to help the high-level end task,
that is observed to not always be the case in practice. To provide a more
thorough examination and fair comparison, we introduce three benchmark sets
collected in real-world hazy, rainy, and low-light conditions, respectively,
with objects/faces annotated. To the best of our knowledge, this is the first
and currently largest effort of its kind. Baseline results by cascading
existing enhancement and detection models are reported, indicating the highly
challenging nature of our new data as well as the large room for further
technical innovations. We expect broad participation from the research
community to address these challenges together.
Comment: A summary paper on datasets, fact sheets, baseline results, challenge
results, and winning methods in UG Challenge (Track 2). More materials
are provided in http://www.ug2challenge.org/index.htm
MANTIS: Model-Augmented Neural neTwork with Incoherent k-space Sampling for efficient MR T2 mapping
Quantitative mapping of magnetic resonance (MR) parameters has been shown to be a
valuable method for improved assessment of a range of diseases. Due to the
need to image an anatomic structure multiple times, parameter mapping usually
requires long scan times compared to conventional static imaging. Therefore,
accelerated parameter mapping is highly desirable and remains a topic of great
interest in the MR research community. While many recent deep learning methods
have focused on highly efficient image reconstruction for conventional static
MR imaging, applications of deep learning for dynamic imaging and in particular
accelerated parameter mapping have been limited. The purpose of this work was
to develop and evaluate a novel deep learning-based reconstruction framework
called Model-Augmented Neural neTwork with Incoherent k-space Sampling (MANTIS)
for efficient MR parameter mapping. Our approach combines end-to-end CNN
mapping with k-space consistency using the concept of cyclic loss to further
enforce data and model fidelity. Incoherent k-space sampling is used to improve
reconstruction performance. A physical model is incorporated into the proposed
framework, so that the parameter maps can be efficiently estimated directly
from undersampled images. The performance of MANTIS was demonstrated for the
spin-spin relaxation time (T2) mapping of the knee joint. Compared to
conventional reconstruction approaches that exploited image sparsity, MANTIS
yielded lower errors and higher similarity with respect to the reference in the
T2 estimation. Our study demonstrated that the proposed MANTIS framework, with
a combination of end-to-end CNN mapping, signal model-augmented data
consistency, and incoherent k-space sampling, represents a promising approach
for efficient MR parameter mapping. MANTIS can potentially be extended to other
types of parameter mapping with appropriate models.
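As background for the physical model mentioned in the abstract above, the sketch below fits T2 for a single voxel from fully sampled multi-echo magnitudes. It assumes the standard mono-exponential decay S(TE) = S0 * exp(-TE / T2) and a log-linear least-squares fit; this is a conventional per-voxel baseline, not the MANTIS network, and the function name `fit_t2` is hypothetical.

```python
import math


def fit_t2(echo_times, signals):
    """Estimate (S0, T2) from signals measured at several echo times.

    Assumes S(TE) = S0 * exp(-TE / T2), so log S is linear in TE with
    slope -1/T2 and intercept log S0. Signals must be strictly positive.
    """
    ys = [math.log(s) for s in signals]
    n = len(echo_times)
    mean_t = sum(echo_times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in echo_times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(echo_times, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


if __name__ == "__main__":
    tes = [10.0, 20.0, 40.0, 80.0]  # echo times in ms (illustrative values)
    true_s0, true_t2 = 100.0, 40.0
    sig = [true_s0 * math.exp(-te / true_t2) for te in tes]
    print(fit_t2(tes, sig))
```

A fit of this kind needs one reconstructed image per echo time, which is why parameter mapping is slow and why estimating the maps directly from undersampled data is attractive.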
Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral Imagery
Recently, the single gray/RGB image super-resolution reconstruction task has been
extensively studied, and significant progress has been made by leveraging advanced
machine learning techniques based on deep convolutional neural networks
(DCNNs). However, there has been limited technical development focusing on
single hyperspectral image super-resolution due to the high-dimensional and
complex spectral patterns in hyperspectral images. In this paper, we make a step
forward by investigating how to adapt state-of-the-art residual learning based
single gray/RGB image super-resolution approaches for computationally efficient
single hyperspectral image super-resolution, referred to as SSPSR. Specifically,
we introduce a spatial-spectral prior network (SSPN) to fully exploit the
spatial information and the correlation between the spectra of the
hyperspectral data. Considering that the hyperspectral training samples are
scarce and the spectral dimension of hyperspectral image data is very high, it
is nontrivial to train a stable and effective deep network. Therefore, a group
convolution (with shared network parameters) and progressive upsampling
framework is proposed. This will not only alleviate the difficulty in feature
extraction due to the high dimensionality of the hyperspectral data, but also make the
training process more stable. To exploit the spatial and spectral prior, we
design a spatial-spectral block (SSB), which consists of a spatial residual
module and a spectral attention residual module. Experimental results on some
hyperspectral images demonstrate that the proposed SSPSR method enhances the
details of the recovered high-resolution hyperspectral images, and outperforms
state-of-the-art methods. The source code is available at
https://github.com/junjun-jiang/SSPSR
Comment: Accepted for publication at IEEE Transactions on Computational
Imaging
Recurrent Generative Adversarial Networks for Proximal Learning and Automated Compressive Image Recovery
Recovering images from undersampled linear measurements typically leads to an
ill-posed linear inverse problem that calls for proper statistical priors.
Building effective priors is, however, challenged by the low training and test
overhead dictated by real-time tasks and by the need for retrieving visually
"plausible" and physically "feasible" images with minimal hallucination. To
cope with these challenges, we design a cascaded network architecture that
unrolls the proximal gradient iterations by permeating benefits from generative
residual networks (ResNet) to modeling the proximal operator. A mixture of
pixel-wise and perceptual costs is then deployed to train proximals. The
overall architecture resembles back-and-forth projection onto the intersection
of feasible and plausible images. Extensive computational experiments are
conducted for a global task of reconstructing MR images of pediatric patients
and a more local task of superresolving CelebA faces, yielding insights for
designing efficient architectures. Our observations indicate that for MRI
reconstruction, a recurrent ResNet with a single residual block effectively
learns the proximal. This simple architecture appears to significantly
outperform the alternative deep ResNet architecture by 2 dB SNR, and the
conventional compressed-sensing MRI by 4 dB SNR with 100x faster inference. For
image superresolution, our preliminary results indicate that modeling the
denoising proximal demands deep ResNets.
Comment: 11 pages, 11 figures
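The proximal gradient iteration that the abstract above unrolls alternates a gradient step on the data-fidelity term with a proximal step that encodes the prior. The sketch below uses soft-thresholding (the proximal operator of an L1 penalty) as a hand-crafted stand-in for the learned ResNet proximal; the objective 0.5 * ||y - A x||^2 + lam * ||x||_1, the step size, and all function names are assumptions made for illustration.

```python
import math


def matvec(A, x):
    """Compute A x for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def unrolled_proximal_gradient(A, y, step, lam, n_iters):
    """Run a fixed number of proximal gradient iterations from x = 0.

    Each iteration: x <- prox(x + step * A^T (y - A x)).
    In a learned variant, soft_threshold would be replaced by a network.
    """
    x = [0.0] * len(A[0])
    for _ in range(n_iters):
        residual = [yi - ai for yi, ai in zip(y, matvec(A, x))]
        moved = [xi + step * g for xi, g in zip(x, rmatvec(A, residual))]
        x = soft_threshold(moved, step * lam)
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(unrolled_proximal_gradient(identity, [3.0, 0.5, -2.0],
                                     step=1.0, lam=1.0, n_iters=5))
```

Fixing the number of iterations turns the loop into a feed-forward cascade, so the proximal step can be trained end to end; sharing its weights across iterations gives the recurrent form the abstract describes.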
Deep Learning Techniques for Inverse Problems in Imaging
Recent work in machine learning shows that deep neural networks can be used
to solve a wide variety of inverse problems arising in computational imaging.
We explore the central prevailing themes of this emerging area and present a
taxonomy that can be used to categorize different problems and reconstruction
methods. Our taxonomy is organized along two central axes: (1) whether or not a
forward model is known and to what extent it is used in training and testing,
and (2) whether or not the learning is supervised or unsupervised, i.e.,
whether or not the training relies on access to matched ground truth image and
measurement pairs. We also discuss the trade-offs associated with these
different reconstruction approaches, caveats and common failure modes, plus
open problems and avenues for future work.
Deep learning for fast MR imaging: a review for learning reconstruction from incomplete k-space data
Magnetic resonance imaging is a powerful imaging modality that can provide
versatile information, but it has a bottleneck: slow imaging speed.
Reducing the scanned measurements can accelerate MR imaging with the aid of
powerful reconstruction methods, which have evolved from linear analytic models
to nonlinear iterative ones. The emerging trend in this area is replacing
human-defined signal models with those learned from data. Specifically, since
2016, deep learning has been incorporated into the fast MR imaging task, which
draws valuable prior knowledge from big datasets to facilitate accurate MR
image reconstruction from limited measurements. This survey aims to review deep
learning based MR image reconstruction works from 2016 to June 2020 and will
discuss merits, limitations and challenges associated with such methods. Last
but not least, this paper will provide a starting point for researchers
interested in contributing to this field by pointing out good tutorial
resources, state-of-the-art open-source codes and meaningful data sources.
Comment: Invited review submitted to Biomedical signal processing and control
in Jan 202