794 research outputs found
Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs
Human visual system relies on both binocular stereo cues and monocular
focusness cues to gain effective 3D perception. In computer vision, the two
problems are traditionally solved in separate tracks. In this paper, we present
a unified learning-based technique that simultaneously uses both types of cues
for depth inference. Specifically, we use a pair of focal stacks as input to
emulate human perception. We first construct a comprehensive focal stack
training dataset synthesized by depth-guided light field rendering. We then
construct three individual networks: a Focus-Net to extract depth from a single
focal stack, a EDoF-Net to obtain the extended depth of field (EDoF) image from
the focal stack, and a Stereo-Net to conduct stereo matching. We show how to
integrate them into a unified BDfF-Net to obtain high-quality depth maps.
Comprehensive experiments show that our approach outperforms the
state-of-the-art in both accuracy and speed and effectively emulates human
vision systems
Depth Estimation and Image Restoration by Deep Learning from Defocused Images
Monocular depth estimation and image deblurring are two fundamental tasks in
computer vision, given their crucial role in understanding 3D scenes.
Performing any of them by relying on a single image is an ill-posed problem.
The recent advances in the field of Deep Convolutional Neural Networks (DNNs)
have revolutionized many tasks in computer vision, including depth estimation
and image deblurring. When it comes to using defocused images, the depth
estimation and the recovery of the All-in-Focus (Aif) image become related
problems due to defocus physics. Despite this, most of the existing models
treat them separately. There are, however, recent models that solve these
problems simultaneously by concatenating two networks in a sequence to first
estimate the depth or defocus map and then reconstruct the focused image based
on it. We propose a DNN that solves the depth estimation and image deblurring
in parallel. Our Two-headed Depth Estimation and Deblurring Network (2HDED:NET)
extends a conventional Depth from Defocus (DFD) networks with a deblurring
branch that shares the same encoder as the depth branch. The proposed method
has been successfully tested on two benchmarks, one for indoor and the other
for outdoor scenes: NYU-v2 and Make3D. Extensive experiments with 2HDED:NET on
these benchmarks have demonstrated superior or close performances to those of
the state-of-the-art models for depth estimation and image deblurring
Learning Depth from Focus in the Wild
For better photography, most recent commercial cameras including smartphones
have either adopted large-aperture lens to collect more light or used a burst
mode to take multiple images within short times. These interesting features
lead us to examine depth from focus/defocus.
In this work, we present a convolutional neural network-based depth
estimation from single focal stacks. Our method differs from relevant
state-of-the-art works with three unique features. First, our method allows
depth maps to be inferred in an end-to-end manner even with image alignment.
Second, we propose a sharp region detection module to reduce blur ambiguities
in subtle focus changes and weakly texture-less regions. Third, we design an
effective downsampling module to ease flows of focal information in feature
extractions. In addition, for the generalization of the proposed network, we
develop a simulator to realistically reproduce the features of commercial
cameras, such as changes in field of view, focal length and principal points.
By effectively incorporating these three unique features, our network
achieves the top rank in the DDFF 12-Scene benchmark on most metrics. We also
demonstrate the effectiveness of the proposed method on various quantitative
evaluations and real-world images taken from various off-the-shelf cameras
compared with state-of-the-art methods. Our source code is publicly available
at https://github.com/wcy199705/DfFintheWild
Learnable Blur Kernel for Single-Image Defocus Deblurring in the Wild
Recent research showed that the dual-pixel sensor has made great progress in
defocus map estimation and image defocus deblurring. However, extracting
real-time dual-pixel views is troublesome and complex in algorithm deployment.
Moreover, the deblurred image generated by the defocus deblurring network lacks
high-frequency details, which is unsatisfactory in human perception. To
overcome this issue, we propose a novel defocus deblurring method that uses the
guidance of the defocus map to implement image deblurring. The proposed method
consists of a learnable blur kernel to estimate the defocus map, which is an
unsupervised method, and a single-image defocus deblurring generative
adversarial network (DefocusGAN) for the first time. The proposed network can
learn the deblurring of different regions and recover realistic details. We
propose a defocus adversarial loss to guide this training process. Competitive
experimental results confirm that with a learnable blur kernel, the generated
defocus map can achieve results comparable to supervised methods. In the
single-image defocus deblurring task, the proposed method achieves
state-of-the-art results, especially significant improvements in perceptual
quality, where PSNR reaches 25.56 dB and LPIPS reaches 0.111.Comment: 9 pages, 7 figure
- …