755 research outputs found
Deep Depth From Focus
Depth from focus (DFF) is one of the classical ill-posed inverse problems in
computer vision. Most approaches recover the depth at each pixel based on the
focal setting which exhibits maximal sharpness. Yet, it is not obvious how to
reliably estimate the sharpness level, particularly in low-textured areas. In
this paper, we propose `Deep Depth From Focus (DDFF)' as the first end-to-end
learning approach to this problem. One of the main challenges we face is the
hunger for data of deep neural networks. In order to obtain a significant
amount of focal stacks with corresponding groundtruth depth, we propose to
leverage a light-field camera with a co-calibrated RGB-D sensor. This allows us
to digitally create focal stacks of varying sizes. Compared to existing
benchmarks our dataset is 25 times larger, enabling the use of machine learning
for this inverse problem. We compare our results with state-of-the-art DFF
methods and we also analyze the effect of several key deep architectural
components. These experiments show that our proposed method `DDFFNet' achieves
state-of-the-art performance in all scenes, reducing depth error by more than
75% compared to the classical DFF methods.Comment: accepted to Asian Conference on Computer Vision (ACCV) 201
A 4D Light-Field Dataset and CNN Architectures for Material Recognition
We introduce a new light-field dataset of materials, and take advantage of
the recent success of deep learning to perform material recognition on the 4D
light-field. Our dataset contains 12 material categories, each with 100 images
taken with a Lytro Illum, from which we extract about 30,000 patches in total.
To the best of our knowledge, this is the first mid-size dataset for
light-field images. Our main goal is to investigate whether the additional
information in a light-field (such as multiple sub-aperture views and
view-dependent reflectance effects) can aid material recognition. Since
recognition networks have not been trained on 4D images before, we propose and
compare several novel CNN architectures to train on light-field images. In our
experiments, the best performing CNN architecture achieves a 7% boost compared
with 2D image classification (70% to 77%). These results constitute important
baselines that can spur further research in the use of CNNs for light-field
applications. Upon publication, our dataset also enables other novel
applications of light-fields, including object detection, image segmentation
and view interpolation.Comment: European Conference on Computer Vision (ECCV) 201
Variational Disparity Estimation Framework for Plenoptic Image
This paper presents a computational framework for accurately estimating the
disparity map of plenoptic images. The proposed framework is based on the
variational principle and provides intrinsic sub-pixel precision. The
light-field motion tensor introduced in the framework allows us to combine
advanced robust data terms as well as provides explicit treatments for
different color channels. A warping strategy is embedded in our framework for
tackling the large displacement problem. We also show that by applying a simple
regularization term and a guided median filtering, the accuracy of displacement
field at occluded area could be greatly enhanced. We demonstrate the excellent
performance of the proposed framework by intensive comparisons with the Lytro
software and contemporary approaches on both synthetic and real-world datasets
Hyperpixels: Flexible 4D over-segmentation for dense and sparse light fields
4D Light Field (LF) imaging, since it conveys both spatial and angular scene information, can facilitate computer vision tasks and generate immersive experiences for end-users. A key challenge in 4D LF imaging is to flexibly and adaptively represent the included spatio-angular information to facilitate subsequent computer vision applications. Recently, image over-segmentation into homogenous regions with perceptually meaningful information has been exploited to represent 4D LFs. However, existing methods assume densely sampled LFs and do not adequately deal with sparse LFs with large occlusions. Furthermore, the spatio-angular LF cues are not fully exploited in the existing methods. In this paper, the concept of hyperpixels is defined and a flexible, automatic, and adaptive representation
for both dense and sparse 4D LFs is proposed. Initially, disparity maps are estimated for all views to enhance over-segmentation accuracy and consistency. Afterwards, a modified weighted K-means clustering using robust spatio-angular features is performed in 4D Euclidean space. Experimental results on several dense and sparse 4D LF datasets show competitive and outperforming performance in terms of over-segmentation accuracy, shape regularity and view consistency against state-of-the-art methods.info:eu-repo/semantics/publishedVersio
- …