Spatial and Angular Resolution Enhancement of Light Fields Using Convolutional Neural Networks
Light field imaging extends traditional photography by capturing both the
spatial and angular distribution of light, which enables new capabilities,
including post-capture refocusing, post-capture aperture control, and depth
estimation from a single shot. Micro-lens array (MLA) based light field cameras
offer a cost-effective approach to capturing light fields. A major drawback of
MLA based light field cameras is low spatial resolution, because a single image
sensor is shared to capture both spatial and angular information. In this
paper, we present a learning-based light field enhancement approach in which
both the spatial and angular resolution of the captured light field are
enhanced using convolutional neural networks. The proposed method is tested
on real light field data captured with a Lytro light field camera, clearly
demonstrating spatial and angular resolution improvement.
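To make the two enhancement directions concrete, the following is a minimal PyTorch sketch, not the authors' exact architecture: an SRCNN-style residual network applied per sub-aperture view for spatial super-resolution, plus a second CNN that synthesizes a novel view from four stacked corner views for angular enhancement. All layer widths, kernel sizes, and the four-corner input layout are illustrative assumptions.

```python
# Hypothetical sketch of CNN-based spatial and angular light field
# enhancement; layer sizes and the view layout are assumptions, not
# the paper's specification.
import torch
import torch.nn as nn

class SpatialSR(nn.Module):
    """Refine one sub-aperture image after bicubic upsampling."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 9, padding=4), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 5, padding=2),
        )

    def forward(self, x):          # x: (B, 1, H, W), already upsampled
        return x + self.body(x)    # residual refinement

class AngularSR(nn.Module):
    """Synthesize a novel in-between view from four corner views."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(4, 64, 5, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, corners):    # corners: (B, 4, H, W)
        return self.body(corners)

# Usage: spatially enhance each view, then interpolate a new view.
views = torch.rand(1, 4, 64, 64)  # four corner views (toy data)
sr = SpatialSR()
enhanced = torch.stack([sr(views[:, i:i + 1]) for i in range(4)], dim=1)
novel = AngularSR()(enhanced.squeeze(2))  # estimated center view
```

Training such a pair of networks would require low/high-resolution view pairs; the abstract does not specify the loss, so a standard pixel-wise objective is the natural assumption.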
PlaneDepth: Plane-Based Self-Supervised Monocular Depth Estimation
Self-supervised monocular depth estimation refers to training a monocular
depth estimation (MDE) network using only RGB images to overcome the difficulty
of collecting dense ground truth depth. Many previous works addressed this
problem using depth classification or depth regression. However, depth
classification tends to fall into local minima due to the bilinear
interpolation search on the target view. Depth classification overcomes this
problem using pre-divided depth bins, but those depth candidates lead to
discontinuities in the final depth result, and using the same probability for
weighted summation of color and depth is ambiguous. To overcome these
limitations, we use some predefined planes that are parallel to the ground,
allowing us to automatically segment the ground and predict continuous depth
for it. We further model depth as a mixture of Laplace distributions, which
provides a more certain objective for optimization. Previous works have shown
that MDE networks use only the vertical image position of objects to estimate
depth and ignore relative sizes. We address this problem for the first time in
both stereo and monocular training using resize cropping data augmentation.
Based on our analysis of resize cropping, we combine it with our plane
definition and improve our training strategy so that the network can learn
the relationship between depth and both the vertical image position and
relative size of objects. We further combine the self-distillation stage with
post-processing to provide more accurate supervision and save the extra time
of post-processing. We conduct extensive experiments to demonstrate the
effectiveness of our analysis and improvements.
Comment: 12 pages, 7 figures
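The mixture-of-Laplace depth model can be made concrete with a short sketch. The following is a minimal, hypothetical PyTorch version, assuming a fixed stack of predefined plane depths as the mixture means, per-pixel softmax mixture weights predicted by the network, and a shared constant scale parameter; the paper's exact parameterization and loss may differ.

```python
# Illustrative negative log-likelihood of depth under a Laplace mixture
# over predefined plane depths; the constant scale and the choice of
# supervision signal are assumptions for this sketch.
import torch

def laplace_mixture_nll(depth, plane_depths, weights, scale=0.1):
    """NLL of a depth map under a Laplace mixture.

    depth:        (B, 1, H, W) target depth (e.g. self-distilled)
    plane_depths: (K,) depths of the predefined planes (mixture means)
    weights:      (B, K, H, W) softmax mixture weights from the network
    scale:        shared Laplace scale parameter b (assumed constant)
    """
    mu = plane_depths.view(1, -1, 1, 1)  # broadcast to (B, K, H, W)
    # Per-component log Laplace density: -|d - mu| / b - log(2b)
    log_comp = -(depth - mu).abs() / scale - torch.log(
        torch.tensor(2.0 * scale))
    # Log of the weighted mixture via log-sum-exp over components
    log_mix = torch.logsumexp(
        weights.clamp_min(1e-8).log() + log_comp, dim=1)
    return -log_mix.mean()

# Example: 8 predefined planes, random network outputs.
planes = torch.linspace(1.0, 20.0, 8)
w = torch.softmax(torch.randn(2, 8, 32, 32), dim=1)
d = torch.rand(2, 1, 32, 32) * 19.0 + 1.0
loss = laplace_mixture_nll(d, planes, w)
```

A mixture likelihood of this form gives a sharper, better-conditioned objective than directly regressing a single depth value, which is consistent with the abstract's claim that it "provides a more certain objective for optimization."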