Video Frame Interpolation for High Dynamic Range Sequences Captured with Dual-exposure Sensors
Video frame interpolation (VFI) enables many important applications that might involve the temporal domain, such as slow motion playback, or the spatial domain, such as stop motion sequences. We are focusing on the former task, where one of the key challenges is handling high dynamic range (HDR) scenes in the presence of complex motion. To this end, we explore possible advantages of dual-exposure sensors that readily provide sharp short and blurry long exposures that are spatially registered and whose ends are temporally aligned. This way, motion blur registers temporally continuous information on the scene motion that, combined with the sharp reference, enables more precise motion sampling within a single camera shot. We demonstrate that this facilitates a more complex motion reconstruction in the VFI task, as well as HDR frame reconstruction that so far has been considered only for the originally captured frames, not in-between interpolated frames. We design a neural network trained in these tasks that clearly outperforms existing solutions. We also propose a metric for scene motion complexity that provides important insights into the performance of VFI methods at the test time.
HDR Video Reconstruction with a Large Dynamic Dataset in Raw and sRGB Domains
High dynamic range (HDR) video reconstruction is attracting more and more
attention due to its superior visual quality compared with that of low dynamic
range (LDR) videos. The availability of LDR-HDR training pairs is essential for
the HDR reconstruction quality. However, there are still no real LDR-HDR pairs
for dynamic scenes due to the difficulty in capturing LDR-HDR frames
simultaneously. In this work, we propose to utilize a staggered sensor to
capture two alternate exposure images simultaneously, which are then fused into
an HDR frame in both raw and sRGB domains. In this way, we build a large scale
LDR-HDR video dataset with 85 scenes and each scene contains 60 frames. Based
on this dataset, we further propose a Raw-HDRNet, which utilizes the raw LDR
frames as inputs. We propose a pyramid flow-guided deformation convolution to
align neighboring frames. Experimental results demonstrate that 1) the proposed
dataset can improve the HDR reconstruction performance on real scenes for three
benchmark networks; 2) compared with sRGB inputs, utilizing raw inputs can
further improve the reconstruction quality and our proposed Raw-HDRNet is a
strong baseline for raw HDR reconstruction. Our dataset and code will be
released after the acceptance of this paper.
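The fusion of two exposures into one HDR frame mentioned above can be illustrated with a minimal sketch. This is not the paper's pipeline: the function name `merge_two_exposures`, the saturation threshold `sat`, and the exposure-time weighting are all assumptions chosen for illustration, and pixel values are assumed to be linear (raw-like) and normalized to [0, 1].

```python
def merge_two_exposures(short_px, long_px, t_short, t_long, sat=0.95):
    """Fuse two linear exposures of the same pixels into radiance estimates.

    Hypothetical helper: short_px and long_px are lists of normalized
    linear pixel values; t_short and t_long are the exposure times.
    """
    merged = []
    for s, l in zip(short_px, long_px):
        if l >= sat:
            # The long exposure is clipped, so only the short one is trusted.
            merged.append(s / t_short)
        else:
            # Both samples are valid: average the per-exposure radiance
            # estimates (s / t_short and l / t_long), weighting each by its
            # exposure time. This simplifies to (s + l) / (t_short + t_long).
            merged.append((s + l) / (t_short + t_long))
    return merged


# Two pixels seen with a 0.1 s and a 0.4 s exposure; the second pixel
# saturates in the long exposure.
radiance = merge_two_exposures([0.2, 0.5], [0.8, 1.0], t_short=0.1, t_long=0.4)
```

Weighting by exposure time favors the longer, less noisy sample wherever it is not clipped; learned approaches such as the one described in the abstract replace this fixed rule with a network.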
Convolutional sparse coding for high dynamic range imaging
Current HDR acquisition techniques are based on either (i) fusing multi-bracketed, low dynamic range (LDR) images, (ii) modifying existing hardware and capturing different exposures simultaneously with multiple sensors, or (iii) reconstructing a single image with spatially-varying pixel exposures. In this paper, we propose a novel algorithm to recover high-quality HDR images from a single, coded exposure. The proposed reconstruction method builds on recently introduced ideas of convolutional sparse coding (CSC); this paper demonstrates how to make CSC practical for HDR imaging. We demonstrate that the proposed algorithm achieves higher-quality reconstructions than alternative methods. We also evaluate optical coding schemes, analyze algorithmic parameters, and build a prototype coded HDR camera that demonstrates the utility of convolutional sparse HDRI coding with a custom hardware platform.
Contrastive Learning Based Recursive Dynamic Multi-Scale Network for Image Deraining
Rain streaks significantly decrease the visibility of captured images and are
also a stumbling block that restricts the performance of subsequent computer
vision applications. The existing deep learning-based image deraining methods
employ manually crafted networks and learn a straightforward projection from
rainy images to clear images. In pursuit of better deraining performance, they
focus on elaborating a more complicated architecture rather than exploiting the
intrinsic properties of the positive and negative information. In this paper,
we propose a contrastive learning-based image deraining method that
investigates the correlation between rainy and clear images and leverages a
contrastive prior to optimize the mutual information of the rainy and restored
counterparts. Given the complex and varied real-world rain patterns, we develop
a recursive mechanism. It involves multi-scale feature extraction and dynamic
cross-level information recruitment modules. The former advances the portrayal
of diverse rain patterns more precisely, while the latter can selectively
compensate high-level features for shallow-level information. We term the
proposed recursive dynamic multi-scale network with a contrastive prior, RDMC.
Extensive experiments on synthetic benchmarks and real-world images demonstrate
that the proposed RDMC delivers strong performance on the depiction of rain
streaks and outperforms the state-of-the-art methods. Moreover, a practical
evaluation of object detection and semantic segmentation shows the
effectiveness of the proposed method. Comment: 13 pages, 16 figures
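As a rough illustration of what a contrastive prior can look like, the sketch below computes a generic InfoNCE-style loss on feature vectors. It is not the RDMC loss from the paper: the function names, the choice of the clear image as the positive and the rainy image as the negative, the cosine similarity, and the temperature value are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical helper).

    anchor:    features of the restored image
    positive:  features of the clear image (assumed positive sample)
    negatives: list of features of rainy images (assumed negative samples)
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # The loss shrinks as the anchor moves toward the positive and away
    # from the negatives in feature space.
    return -math.log(pos / (pos + neg))


clear, rainy = [1.0, 0.0], [0.0, 1.0]
loss_good = info_nce([0.9, 0.1], clear, [rainy])  # restored output near clear
loss_bad = info_nce([0.1, 0.9], clear, [rainy])   # restored output near rainy
```

In this toy setup `loss_good` is smaller than `loss_bad`, which is the behavior a contrastive term is meant to reward during training.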
Towards Scalable Multi-View Reconstruction of Geometry and Materials
In this paper, we propose a novel method for joint recovery of camera pose,
object geometry and spatially-varying Bidirectional Reflectance Distribution
Function (svBRDF) of 3D scenes that exceed object-scale and hence cannot be
captured with stationary light stages. The inputs are high-resolution RGB-D
images captured by a mobile, hand-held capture system with point lights for
active illumination. Compared to previous works that jointly estimate geometry
and materials from a hand-held scanner, we formulate this problem using a
single objective function that can be minimized using off-the-shelf
gradient-based solvers. To facilitate scalability to large numbers of
observation views and optimization variables, we introduce a distributed
optimization algorithm that reconstructs 2.5D keyframe-based representations of
the scene. A novel multi-view consistency regularizer effectively synchronizes
neighboring keyframes such that the local optimization results allow for
seamless integration into a globally consistent 3D model. We provide a study on
the importance of each component in our formulation and show that our method
compares favorably to baselines. We further demonstrate that our method
accurately reconstructs various objects and materials and allows for expansion
to spatially larger scenes. We believe that this work represents a significant
step towards making geometry and material estimation from hand-held scanners
scalable.