Light Weight Color Image Warping with Inter-Channel Information
Image warping is a necessary step in many multimedia applications, such as
texture mapping, image-based rendering, panorama stitching, image resizing,
and optical flow computation. Traditionally, the interpolation underlying
color image warping is performed in each color channel independently. In this
paper, we show that
the warping quality can be significantly enhanced by exploiting the
cross-channel correlation. We design a warping scheme that integrates
intra-channel interpolation with cross-channel variation at very low
computational cost, which is required for interactive multimedia applications
on mobile devices. The effectiveness and efficiency of our method are validated
by extensive experiments.
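The abstract leaves the scheme itself to the paper; as a rough illustration of why inter-channel information helps, the sketch below (a toy approach, not the authors' algorithm) warps the green channel directly and interpolates the smoother R-G and B-G difference planes, exploiting the cross-channel correlation the abstract describes. All function names and the choice of green as the guide channel are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_channel(ch, coords):
    # Bilinear sampling of one channel at the warped (row, col) positions.
    return map_coordinates(ch, coords, order=1, mode='nearest')

def warp_color_cross_channel(img, coords):
    """Toy cross-channel warp: interpolate the green channel directly and
    the smoother R-G and B-G difference planes, then recombine.
    img: (H, W, 3) float array; coords: (2, H, W) source sampling grid."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    g_w = warp_channel(g, coords)
    r_w = warp_channel(r - g, coords) + g_w  # inter-channel difference plane
    b_w = warp_channel(b - g, coords) + g_w
    return np.stack([r_w, g_w, b_w], axis=-1)
```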
A Unified Framework for Multi-Sensor HDR Video Reconstruction
One of the most successful approaches to modern high quality HDR-video
capture is to use camera setups with multiple sensors imaging the scene through
a common optical system. However, such systems pose several challenges for HDR
reconstruction algorithms. Previous reconstruction techniques have considered
debayering, denoising, resampling (alignment) and exposure fusion as separate
problems. In contrast, in this paper we present a unifying approach, performing
HDR assembly directly from raw sensor data. Our framework includes a camera
noise model adapted to HDR video and an algorithm for spatially adaptive HDR
reconstruction based on fitting of local polynomial approximations to observed
sensor data. The method is easy to implement and allows reconstruction to an
arbitrary resolution and output mapping. We present an implementation in CUDA
and show real-time performance for an experimental 4 Mpixel multi-sensor HDR
video system. We further show that our algorithm has clear advantages over
existing methods in terms of both flexibility and reconstruction quality.
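A minimal sketch of the core reconstruction step as described above: fit a local first-order polynomial to exposure-normalized raw samples by weighted least squares and evaluate it at the output pixel. The function name, argument layout, and noise weighting are assumptions, not the paper's implementation.

```python
import numpy as np

def hdr_local_poly(samples, xy, exposures, weights):
    """Spatially adaptive HDR assembly sketch: fit f(x, y) = c0 + c1*x + c2*y
    to exposure-normalized raw samples and return its value at the window
    center. samples: raw values (N,); xy: (N, 2) offsets from the output
    pixel; exposures: per-sample exposure times; weights: noise-model
    confidences (e.g. down-weighting saturated or near-dark samples)."""
    radiance = samples / exposures               # normalize to scene radiance
    A = np.column_stack([np.ones(len(xy)), xy])  # design matrix [1, x, y]
    W = np.sqrt(weights)
    c, *_ = np.linalg.lstsq(A * W[:, None], radiance * W, rcond=None)
    return c[0]                                  # polynomial value at (0, 0)
```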
Pyramid Attention Networks for Image Restoration
Self-similarity is an image prior widely used in image restoration
algorithms: small but similar patterns tend to recur at different locations
and scales. However, recent deep convolutional neural network based methods
for image restoration fail to take full advantage of self-similarity, as they
rely on self-attention modules that only process information at a single
scale. To solve this problem, we present a novel Pyramid Attention
module for image restoration, which captures long-range feature correspondences
from a multi-scale feature pyramid. Inspired by the fact that corruptions, such
as noise or compression artifacts, drop drastically at coarser image scales,
our attention module is designed to borrow clean signals from
correspondences at coarser levels. The proposed pyramid attention
module is a generic building block that can be flexibly integrated into various
neural architectures. Its effectiveness is validated through extensive
experiments on multiple image restoration tasks: image denoising, demosaicing,
compression artifact reduction, and super resolution. Without any bells and
whistles, our PANet (pyramid attention module with simple network backbones)
can produce state-of-the-art results with superior accuracy and visual quality.
Our code will be available at
https://github.com/SHI-Labs/Pyramid-Attention-Network
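To make the mechanism concrete, here is a minimal per-pixel sketch of cross-scale attention over a feature pyramid. The real PANet module matches patches and is more elaborate; treat this as an illustration of the cross-scale search, not the released code, and the scale factors are assumptions.

```python
import torch
import torch.nn.functional as F

def pyramid_attention(feat, scales=(1.0, 0.75, 0.5)):
    """Each spatial position attends over keys/values drawn from a feature
    pyramid built by bicubic downscaling, so matches at coarser (cleaner)
    scales can be borrowed. feat: (B, C, H, W)."""
    b, c, h, w = feat.shape
    q = feat.flatten(2).transpose(1, 2)              # (B, HW, C) queries
    keys = []
    for s in scales:
        f_s = feat if s == 1.0 else F.interpolate(
            feat, scale_factor=s, mode='bicubic', align_corners=False)
        keys.append(f_s.flatten(2).transpose(1, 2))  # (B, hs*ws, C)
    kv = torch.cat(keys, dim=1)                      # pool all pyramid levels
    attn = torch.softmax(q @ kv.transpose(1, 2) / c ** 0.5, dim=-1)
    out = attn @ kv                                  # (B, HW, C)
    return out.transpose(1, 2).reshape(b, c, h, w)
```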
Three-dimensional virtual refocusing of fluorescence microscopy images using deep learning
Three-dimensional (3D) fluorescence microscopy in general requires axial
scanning to capture images of a sample at different planes. Here we demonstrate
that a deep convolutional neural network can be trained to virtually refocus a
2D fluorescence image onto user-defined 3D surfaces within the sample volume.
With this data-driven computational microscopy framework, we imaged the neuron
activity of a Caenorhabditis elegans worm in 3D using a time-sequence of
fluorescence images acquired at a single focal plane, digitally increasing the
depth of field of the microscope by 20-fold without any axial scanning,
additional hardware, or a trade-off in imaging resolution or speed.
Furthermore, we demonstrate that this learning-based approach can correct for
sample drift, tilt, and other image aberrations, all digitally performed after
the acquisition of a single fluorescence image. This unique framework also
cross-connects different imaging modalities to each other, enabling 3D
refocusing of a single wide-field fluorescence image to match confocal
microscopy images acquired at different sample planes. This deep learning-based
3D image refocusing method might be transformative for imaging and tracking of
3D biological samples, especially over extended periods of time, mitigating
photo-toxicity, sample drift, aberration and defocusing related challenges
associated with standard 3D fluorescence microscopy techniques.
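The abstract does not detail the conditioning mechanism. One plausible reading, sketched below under that assumption, is a network that takes the captured image together with a per-pixel map of the desired refocusing distance, which is what would make user-defined 3D surfaces possible. The class name and layer choices are illustrative stand-ins, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RefocusNet(nn.Module):
    """Hypothetical conditioning interface: the 2D fluorescence image plus a
    per-pixel target axial distance map form a two-channel input, and the
    network predicts the virtually refocused image. A small conv stack
    stands in for the paper's trained generator."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, image, depth_map):
        # image, depth_map: (B, 1, H, W). A spatially varying depth_map is
        # what allows refocusing onto tilted or curved surfaces.
        return self.net(torch.cat([image, depth_map], dim=1))
```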
NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image
This paper reviews the second challenge on spectral reconstruction from RGB
images, i.e., the recovery of whole-scene hyperspectral (HS) information from a
3-channel RGB image. As in the previous challenge, two tracks were provided:
(i) a "Clean" track, where HS images are estimated from noise-free RGB
images that are themselves calculated numerically from the ground-truth HS
images and supplied spectral sensitivity functions; and (ii) a "Real World"
track, simulating capture by an uncalibrated and unknown camera, where the HS
images are recovered from noisy JPEG-compressed RGB images. A new,
larger-than-ever,
natural hyperspectral image data set is presented, containing a total of 510 HS
images. The Clean and Real World tracks had 103 and 78 registered participants
respectively, with 14 teams competing in the final testing phase. A description
of the proposed methods, alongside their challenge scores, and an extensive
evaluation of the top-performing methods are also provided, gauging the state
of the art in spectral reconstruction from an RGB image.
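For the "Clean" track, the forward model stated above is simple enough to write down: each RGB channel is the hyperspectral cube integrated against a supplied sensitivity function, RGB_c(x) = sum over lambda of S_c(lambda) * H(x, lambda). A minimal sketch, with array shapes assumed:

```python
import numpy as np

def hs_to_rgb(hs_cube, sensitivities):
    """Clean-track forward model: integrate a hyperspectral cube against
    camera spectral sensitivity functions to synthesize the RGB image that
    challenge entries must invert. hs_cube: (H, W, L) with L spectral bands;
    sensitivities: (L, 3) response of each RGB channel per band."""
    return hs_cube @ sensitivities  # (H, W, 3)
```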
Sub-Pixel Registration of Wavelet-Encoded Images
Sub-pixel registration is a crucial step for applications such as
super-resolution in remote sensing, motion compensation in magnetic resonance
imaging, and non-destructive testing in manufacturing, to name a few. Recently,
these technologies have been trending towards wavelet encoded imaging and
sparse/compressive sensing. The former plays a crucial role in reducing imaging
artifacts, while the latter significantly increases the acquisition speed. In
view of these new emerging needs for applications of wavelet encoded imaging,
we propose a sub-pixel registration method that can achieve direct wavelet
domain registration from a sparse set of coefficients. We make the following
contributions: (i) we devise a method of decoupling scale, rotation, and
translation parameters in the Haar wavelet domain; (ii) we derive explicit
mathematical expressions that define in-band sub-pixel registration in terms
of wavelet coefficients; (iii) using the derived expressions, we propose an
approach to achieve in-band sub-pixel registration, avoiding back-and-forth
transformations; and (iv) our solution remains highly accurate even when a
sparse set of coefficients is used, owing to the localization of signals in a
sparse set of wavelet coefficients. We demonstrate the accuracy of our method,
and show that it outperforms the state-of-the-art on simulated and real data,
even when the data is sparse.
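The paper's contribution is the closed-form in-band expressions themselves. As a lightweight stand-in that only illustrates registering from wavelet coefficients rather than pixels, the sketch below estimates a sub-pixel shift by phase correlation on the one-level Haar approximation band (even image dimensions assumed); it is not the authors' method.

```python
import numpy as np
from skimage.registration import phase_cross_correlation

def haar_ll(img):
    """One-level 2D Haar approximation (LL) band: average of 2x2 blocks."""
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def register_in_wavelet_domain(ref, moving):
    # Estimate the shift from low-pass Haar coefficients only. The LL band
    # is half-resolution, so the estimate is doubled to recover the
    # full-resolution sub-pixel displacement.
    shift, _, _ = phase_cross_correlation(haar_ll(ref), haar_ll(moving),
                                          upsample_factor=100)
    return 2.0 * shift
```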
Finding Correspondences for Optical Flow and Disparity Estimations using a Sub-pixel Convolution-based Encoder-Decoder Network
Deep convolutional neural networks (DCNNs) have recently shown promising
results in low-level computer vision problems such as optical flow and
disparity estimation, but still have much room to improve their performance.
In this paper, we propose a novel sub-pixel convolution-based encoder-decoder
network for optical flow and disparity estimations, which extends FlowNetS
and DispNet by replacing the deconvolution layers with sub-pixel convolution
blocks. By using sub-pixel refinement and estimation on
the decoder stages instead of deconvolution, we can significantly improve the
estimation accuracy for optical flow and disparity, even with reduced numbers
of parameters. We show a supervised end-to-end training of our proposed
networks for optical flow and disparity estimations, and an unsupervised
end-to-end training for monocular depth and pose estimations. In order to
verify the effectiveness of our proposed networks, we perform intensive
experiments for (i) optical flow and disparity estimations, and (ii) monocular
depth and pose estimations. Throughout these extensive experiments, our
proposed networks outperform baselines such as FlowNetS and DispNet in terms
of both estimation accuracy and training time.
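The replacement the abstract describes is easy to show concretely: a sub-pixel convolution block produces scale^2 times the output channels with an ordinary convolution and rearranges them spatially with PixelShuffle. A minimal PyTorch sketch, with kernel sizes assumed:

```python
import torch.nn as nn

def subpixel_up(in_ch, out_ch, scale=2):
    """Sub-pixel convolution upsampling block: a standard convolution emits
    out_ch * scale**2 feature maps, which PixelShuffle rearranges into an
    (H*scale, W*scale) grid. This replaces a strided deconvolution on the
    decoder stages."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch * scale ** 2, kernel_size=3, padding=1),
        nn.PixelShuffle(scale))

# Deconvolution baseline it replaces (FlowNetS/DispNet style):
# nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)
```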
Image quality assessment for determining efficacy and limitations of Super-Resolution Convolutional Neural Network (SRCNN)
Traditional metrics for evaluating the efficacy of image processing
techniques do not lend themselves to understanding the capabilities and
limitations of modern image processing methods - particularly those enabled by
deep learning. When applying image processing in engineering solutions, a
scientist or engineer has a need to justify their design decisions with clear
metrics. By applying the blind/referenceless image spatial quality evaluator
(BRISQUE), the structural similarity (SSIM) index, and the peak
signal-to-noise ratio (PSNR) to images before and after processing, we can
quantify quality improvements in a meaningful way and determine the lowest
recoverable image quality for a given method.
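A minimal sketch of the before/after measurement the abstract proposes, using the two full-reference metrics; BRISQUE is no-reference and requires a trained model (e.g. OpenCV contrib's quality module), so it is only noted in a comment. Float images scaled to [0, 1] are assumed.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def quality_delta(reference, degraded, restored, data_range=1.0):
    """Quantify a restoration method's gain with full-reference metrics,
    assuming float images in [0, 1]. BRISQUE would be computed on each
    image alone, without a reference."""
    return {
        'psnr_before': peak_signal_noise_ratio(reference, degraded,
                                               data_range=data_range),
        'psnr_after': peak_signal_noise_ratio(reference, restored,
                                              data_range=data_range),
        'ssim_before': structural_similarity(reference, degraded,
                                             data_range=data_range),
        'ssim_after': structural_similarity(reference, restored,
                                            data_range=data_range),
    }
```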
Neural Imaging Pipelines - the Scourge or Hope of Forensics?
Forensic analysis of digital photographs relies on intrinsic statistical
traces introduced at the time of their acquisition or subsequent editing. Such
traces are often removed by post-processing (e.g., down-sampling and
re-compression applied upon distribution on the Web), which inhibits reliable
provenance analysis. Increasing adoption of computational methods within
digital cameras further complicates the process and renders explicit
mathematical modeling infeasible. While this trend challenges forensic analysis
even in near-acquisition conditions, it also creates new opportunities. This
paper explores end-to-end optimization of the entire image acquisition and
distribution workflow to facilitate reliable forensic analysis at the end of
the distribution channel, where state-of-the-art forensic techniques fail. We
demonstrate that a neural network can be trained to replace the entire photo
development pipeline, and jointly optimized for high-fidelity photo rendering
and reliable provenance analysis. Such an optimized neural imaging pipeline
allowed us to increase image manipulation detection accuracy from approx. 45%
to over 90%. The network learns to introduce carefully crafted artifacts, akin
to digital watermarks, which facilitate subsequent manipulation detection.
Analysis of performance trade-offs indicates that most of the gains can be
obtained with only minor distortion. The findings encourage further research
towards building more reliable imaging pipelines with explicit
provenance-guaranteeing properties.
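The joint optimization can be summarized as a two-term objective. The sketch below assumes a simple MSE fidelity term and a cross-entropy manipulation-detection term with illustrative weights, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def joint_pipeline_loss(rendered, target_photo, manip_logits, manip_labels,
                        fidelity_weight=1.0, forensic_weight=0.1):
    # Rendering fidelity: keep the learned development pipeline's output
    # close to a reference development of the same raw capture.
    fidelity = torch.mean((rendered - target_photo) ** 2)
    # Forensic utility: the downstream manipulation classifier should
    # succeed on images produced by this pipeline.
    forensic = F.cross_entropy(manip_logits, manip_labels)
    return fidelity_weight * fidelity + forensic_weight * forensic
```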
Aerial Spectral Super-Resolution using Conditional Adversarial Networks
Inferring spectral signatures from ground-based natural images has attracted
considerable interest in applied deep learning. In contrast to the spectra of
ground-based images, aerial spectral images have low spatial resolution and
suffer
from higher noise interference. In this paper, we train a conditional
adversarial network to learn an inverse mapping from a trichromatic space to 31
spectral bands within 400 to 700 nm. The network is trained on AeroCampus, a
first-of-its-kind aerial hyperspectral dataset. AeroCampus consists of high
spatial resolution color images and low spatial resolution hyperspectral images
(HSI). Color images synthesized from 31 spectral bands are used to train our
network. With a baseline root mean square error of 2.48 on the synthesized RGB
test data, we show that it is possible to generate spectral signatures in
aerial imagery.
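The abstract points to a conditional adversarial (pix2pix-style) setup. A hedged sketch of such a generator objective for the RGB-to-31-band mapping, with an assumed PatchGAN-style discriminator over (RGB, HS) pairs and the usual conditional-GAN loss weighting, which may differ from the paper's:

```python
import torch
import torch.nn.functional as F

def spectral_generator_loss(disc, rgb, pred_hs, true_hs, l1_weight=100.0):
    """Generator objective: the discriminator scores concatenated
    (rgb, hyperspectral) pairs, while an L1 term keeps the predicted
    31-band spectra close to ground truth."""
    fake_score = disc(torch.cat([rgb, pred_hs], dim=1))
    adv = F.binary_cross_entropy_with_logits(
        fake_score, torch.ones_like(fake_score))  # fool the discriminator
    return adv + l1_weight * F.l1_loss(pred_hs, true_hs)
```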