813 research outputs found
3D Face Reconstruction from Light Field Images: A Model-free Approach
Reconstructing 3D facial geometry from a single RGB image has recently
instigated wide research interest. However, it is still an ill-posed problem
and most methods rely on prior models hence undermining the accuracy of the
recovered 3D faces. In this paper, we exploit the Epipolar Plane Images (EPI)
obtained from light field cameras and learn CNN models that recover horizontal
and vertical 3D facial curves from the respective horizontal and vertical EPIs.
Our 3D face reconstruction network (FaceLFnet) comprises a densely connected
architecture to learn accurate 3D facial curves from low resolution EPIs. To
train the proposed FaceLFnets from scratch, we synthesize photo-realistic light
field images from 3D facial scans. The curve by curve 3D face estimation
approach allows the networks to learn from only 14K images of 80 identities,
which still comprises over 11 Million EPIs/curves. The estimated facial curves
are merged into a single pointcloud to which a surface is fitted to get the
final 3D face. Our method is model-free, requires only a few training samples
to learn FaceLFnet and can reconstruct 3D faces with high accuracy from single
light field images under varying poses, expressions and lighting conditions.
Comparison on the BU-3DFE and BU-4DFE datasets show that our method reduces
reconstruction errors by over 20% compared to recent state of the art
Light Field Denoising via Anisotropic Parallax Analysis in a CNN Framework
Light field (LF) cameras provide perspective information of scenes by taking
directional measurements of the focusing light rays. The raw outputs are
usually dark with additive camera noise, which impedes subsequent processing
and applications. We propose a novel LF denoising framework based on
anisotropic parallax analysis (APA). Two convolutional neural networks are
jointly designed for the task: first, the structural parallax synthesis network
predicts the parallax details for the entire LF based on a set of anisotropic
parallax features. These novel features can efficiently capture the high
frequency perspective components of a LF from noisy observations. Second, the
view-dependent detail compensation network restores non-Lambertian variation to
each LF view by involving view-specific spatial energies. Extensive experiments
show that the proposed APA LF denoiser provides a much better denoising
performance than state-of-the-art methods in terms of visual quality and in
preservation of parallax details
Learning based Deep Disentangling Light Field Reconstruction and Disparity Estimation Application
Light field cameras have a wide range of uses due to their ability to
simultaneously record light intensity and direction. The angular resolution of
light fields is important for downstream tasks such as depth estimation, yet is
often difficult to improve due to hardware limitations. Conventional methods
tend to perform poorly against the challenge of large disparity in sparse light
fields, while general CNNs have difficulty extracting spatial and angular
features coupled together in 4D light fields. The light field disentangling
mechanism transforms the 4D light field into 2D image format, which is more
favorable for CNN for feature extraction. In this paper, we propose a Deep
Disentangling Mechanism, which inherits the principle of the light field
disentangling mechanism and further develops the design of the feature
extractor and adds advanced network structure. We design a light-field
reconstruction network (i.e., DDASR) on the basis of the Deep Disentangling
Mechanism, and achieve SOTA performance in the experiments. In addition, we
design a Block Traversal Angular Super-Resolution Strategy for the practical
application of depth estimation enhancement where the input views is often
higher than 2x2 in the experiments resulting in a high memory usage, which can
reduce the memory usage while having a better reconstruction performance
Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs
Human visual system relies on both binocular stereo cues and monocular
focusness cues to gain effective 3D perception. In computer vision, the two
problems are traditionally solved in separate tracks. In this paper, we present
a unified learning-based technique that simultaneously uses both types of cues
for depth inference. Specifically, we use a pair of focal stacks as input to
emulate human perception. We first construct a comprehensive focal stack
training dataset synthesized by depth-guided light field rendering. We then
construct three individual networks: a Focus-Net to extract depth from a single
focal stack, a EDoF-Net to obtain the extended depth of field (EDoF) image from
the focal stack, and a Stereo-Net to conduct stereo matching. We show how to
integrate them into a unified BDfF-Net to obtain high-quality depth maps.
Comprehensive experiments show that our approach outperforms the
state-of-the-art in both accuracy and speed and effectively emulates human
vision systems
Probabilistic-based Feature Embedding of 4-D Light Fields for Compressive Imaging and Denoising
The high-dimensional nature of the 4-D light field (LF) poses great
challenges in achieving efficient and effective feature embedding, that
severely impacts the performance of downstream tasks. To tackle this crucial
issue, in contrast to existing methods with empirically-designed architectures,
we propose a probabilistic-based feature embedding (PFE), which learns a
feature embedding architecture by assembling various low-dimensional
convolution patterns in a probability space for fully capturing spatial-angular
information. Building upon the proposed PFE, we then leverage the intrinsic
linear imaging model of the coded aperture camera to construct a
cycle-consistent 4-D LF reconstruction network from coded measurements.
Moreover, we incorporate PFE into an iterative optimization framework for 4-D
LF denoising. Our extensive experiments demonstrate the significant superiority
of our methods on both real-world and synthetic 4-D LF images, both
quantitatively and qualitatively, when compared with state-of-the-art methods.
The source code will be publicly available at
https://github.com/lyuxianqiang/LFCA-CR-NET
Accurate Light Field Depth Estimation with Superpixel Regularization over Partially Occluded Regions
Depth estimation is a fundamental problem for light field photography
applications. Numerous methods have been proposed in recent years, which either
focus on crafting cost terms for more robust matching, or on analyzing the
geometry of scene structures embedded in the epipolar-plane images. Significant
improvements have been made in terms of overall depth estimation error;
however, current state-of-the-art methods still show limitations in handling
intricate occluding structures and complex scenes with multiple occlusions. To
address these challenging issues, we propose a very effective depth estimation
framework which focuses on regularizing the initial label confidence map and
edge strength weights. Specifically, we first detect partially occluded
boundary regions (POBR) via superpixel based regularization. Series of
shrinkage/reinforcement operations are then applied on the label confidence map
and edge strength weights over the POBR. We show that after weight
manipulations, even a low-complexity weighted least squares model can produce
much better depth estimation than state-of-the-art methods in terms of average
disparity error rate, occlusion boundary precision-recall rate, and the
preservation of intricate visual features
- …