EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth from Light Field Images
Light field cameras capture both the spatial and the angular properties of
light rays in space. Because of this property, one can compute the depth from light
fields in uncontrolled lighting environments, which is a big advantage over
active sensing devices. Depth computed from light fields can be used for many
applications including 3D modelling and refocusing. However, light field images
from hand-held cameras have very narrow baselines with noise, making the depth
estimation difficult. Many approaches have been proposed to overcome these
limitations for the light field depth estimation, but there is a clear
trade-off between the accuracy and the speed in these methods. In this paper,
we introduce a fast and accurate light field depth estimation method based on a
fully-convolutional neural network. Our network is designed by considering the
light field geometry and we also overcome the lack of training data by
proposing light field specific data augmentation methods. We achieved the top
rank in the HCI 4D Light Field Benchmark on most metrics, and we also
demonstrate the effectiveness of the proposed method on real-world light-field
images.
Comment: Accepted to CVPR 2018, 10 pages in total
Light Field Depth Estimation Based on Stitched-EPI
Depth estimation is one of the most essential problems for light field
applications. In EPI-based methods, the slope computation usually suffers from low
accuracy due to the discretization error and low angular resolution. In
addition, recent methods work well in most regions but often struggle with
blurry edges over occluded regions and ambiguity over texture-less regions. To
address these challenging issues, we first propose the stitched-EPI and
half-stitched-EPI algorithms for non-occluded and occluded regions,
respectively. The algorithms improve slope computation by shifting and
concatenating lines that lie in different EPIs but correspond to the same point
in the 3D scene, while the half-stitched-EPI uses only the non-occluded part of
the lines. Combined with our proposed joint photo-consistency cost, a more
accurate and robust depth map can be obtained in both occluded and non-occluded
regions. Furthermore, to improve the depth estimation in texture-less regions,
we propose a depth propagation strategy that determines their depth from the
edges to the interior, from accurate regions to coarse regions. Experimental and
ablation results demonstrate that the proposed method achieves accurate and
robust depth maps in all regions effectively.
Comment: 15 pages
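As a rough illustration of the slope computation that EPI-based methods rely on, the sketch below builds a synthetic epipolar plane image (EPI) and picks the disparity whose line through the EPI has the lowest intensity variance. It is a generic photo-consistency search, not the stitched-EPI or half-stitched-EPI algorithm described in the abstract. The function names (`make_epi`, `line_variance`, `estimate_disparity`), the integer disparities, and the border clamping are all assumptions made for this example.

```python
def make_epi(row, num_views, disparity):
    """Build a synthetic EPI (hypothetical helper): view u is the centre
    row shifted by disparity * (u - centre) pixels."""
    centre = num_views // 2
    width = len(row)
    epi = []
    for u in range(num_views):
        shift = disparity * (u - centre)
        # Clamp at the borders so every view keeps the same width.
        epi.append([row[min(max(x - shift, 0), width - 1)] for x in range(width)])
    return epi


def line_variance(epi, x, disparity):
    """Intensity variance along the EPI line through centre-view pixel x
    whose slope corresponds to the candidate disparity."""
    centre = len(epi) // 2
    width = len(epi[0])
    samples = []
    for u, view in enumerate(epi):
        xs = x + disparity * (u - centre)
        if 0 <= xs < width:  # skip samples that fall outside the image
            samples.append(view[xs])
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def estimate_disparity(epi, x, candidates):
    """Return the candidate disparity with the most consistent line."""
    return min(candidates, key=lambda d: line_variance(epi, x, d))


# Toy data: a textured scanline seen from 5 views with a true disparity of 1.
row = [0, 3, 1, 7, 2, 9, 4, 8, 5, 6, 1, 0, 7, 3, 9, 2]
epi = make_epi(row, num_views=5, disparity=1)
best = estimate_disparity(epi, x=8, candidates=[-2, -1, 0, 1, 2])
print("estimated disparity at x=8:", best)
```

Restricting candidates to integer disparities is exactly the kind of discretization that limits slope accuracy; a practical method would interpolate between pixels and would treat occluded views separately, as the abstract describes.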
VommaNet: an End-to-End Network for Disparity Estimation from Reflective and Texture-less Light Field Images
The precise combination of image sensor and micro-lens array enables lenslet
light field cameras to record both angular and spatial information of incoming
light; therefore, one can calculate disparity and depth from light field
images. In turn, 3D models of the recorded objects can be recovered, which is a
great advantage over other imaging systems. However, reflective and texture-less
areas in light field images have complicated conditions, making it hard to
correctly calculate disparity with existing algorithms. To tackle this problem,
we introduce a novel end-to-end network VommaNet to retrieve multi-scale
features from reflective and texture-less regions for accurate disparity
estimation. Meanwhile, our network has achieved similar or better performance
in other regions for both synthetic light field images and real-world data
compared to the state-of-the-art algorithms. Currently, we achieve the best
score for mean squared error (MSE) on the HCI 4D Light Field Benchmark.
Light field image processing: an overview
Light field imaging has emerged as a technology that captures richer visual information from our world. As opposed to traditional photography, which captures a 2D projection of the light in the scene by integrating over the angular domain, light fields collect radiance from rays in all directions, demultiplexing the angular information lost in conventional photography. On the one hand, this higher-dimensional representation of visual data offers powerful capabilities for scene understanding and substantially improves performance on traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization, and material classification. On the other hand, the high dimensionality of light fields also brings new challenges in terms of data capture, data compression, content editing, and display. Taking these two elements together, research in light field image processing has become increasingly popular in the computer vision, computer graphics, and signal processing communities. In this paper, we present a comprehensive overview and discussion of research in this field over the past 20 years. We focus on all aspects of light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data.
Light field reconstruction from multi-view images
Kang Han studied recovering the 3D world from multi-view images. He proposed several algorithms to deal with occlusions in depth estimation and effective representations in view rendering. The proposed algorithms can be used for many innovative applications based on machine intelligence, such as autonomous driving and the Metaverse.
Unsupervised Light Field Depth Estimation via Multi-view Feature Matching with Occlusion Prediction
Depth estimation from light field (LF) images is a fundamental step for some
applications. Recently, learning-based methods have achieved higher accuracy
and efficiency than the traditional methods. However, it is costly to obtain
sufficient depth labels for supervised training. In this paper, we propose an
unsupervised framework to estimate depth from LF images. First, we design a
disparity estimation network (DispNet) with a coarse-to-fine structure to
predict disparity maps from different view combinations by performing
multi-view feature matching to learn the correspondences more effectively. As
occlusions may cause the violation of photo-consistency, we design an occlusion
prediction network (OccNet) to predict the occlusion maps, which are used as
the element-wise weights of photometric loss to solve the occlusion issue and
assist the disparity learning. With the disparity maps estimated by multiple
input combinations, we propose a disparity fusion strategy based on the
estimated errors with effective occlusion handling to obtain the final
disparity map. Experimental results demonstrate that our method achieves
superior performance on both dense and sparse LF images and also
generalizes better to real-world LF images.
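The idea of using an occlusion map as element-wise weights on a photometric loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's DispNet/OccNet training code: the function name `weighted_photometric_loss`, the L1 difference, and the normalization by the sum of weights are choices made for this example, and the warped view is assumed to be given rather than produced by a disparity network.

```python
def weighted_photometric_loss(centre, warped, weights, eps=1e-8):
    """Occlusion-weighted L1 photometric loss (hypothetical sketch).

    centre  : 2D list, centre-view intensities
    warped  : 2D list, another view warped to the centre view
    weights : 2D list in [0, 1]; low values mark pixels believed occluded
    """
    weighted_error = 0.0
    total_weight = 0.0
    for c_row, w_row, m_row in zip(centre, warped, weights):
        for c, w, m in zip(c_row, w_row, m_row):
            weighted_error += m * abs(c - w)  # occluded pixels contribute less
            total_weight += m
    # eps guards against division by zero when every pixel is masked out.
    return weighted_error / (total_weight + eps)


# Toy 2x2 example: only the bottom-right pixel disagrees between views.
centre = [[1.0, 2.0], [3.0, 4.0]]
warped = [[1.0, 2.0], [3.0, 9.0]]
no_occlusion = [[1.0, 1.0], [1.0, 1.0]]
occluded_corner = [[1.0, 1.0], [1.0, 0.0]]

print(weighted_photometric_loss(centre, warped, no_occlusion))
print(weighted_photometric_loss(centre, warped, occluded_corner))
```

With uniform weights the mismatched pixel dominates the loss; once it is marked as occluded, it no longer penalizes the disparity estimate, which is the behavior the abstract attributes to the occlusion weights.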