Light field reconstruction from multi-view images
Kang Han studied recovering the 3D world from multi-view images. He proposed several algorithms to handle occlusions in depth estimation and to build effective representations for view rendering. The proposed algorithms can support many innovative applications based on machine intelligence, such as autonomous driving and the Metaverse.
Hyperpixels: Flexible 4D over-segmentation for dense and sparse light fields
4D Light Field (LF) imaging, since it conveys both spatial and angular scene information, can facilitate computer vision tasks and generate immersive experiences for end-users. A key challenge in 4D LF imaging is to flexibly and adaptively represent the included spatio-angular information to facilitate subsequent computer vision applications. Recently, image over-segmentation into homogeneous regions with perceptually meaningful information has been exploited to represent 4D LFs. However, existing methods assume densely sampled LFs and do not adequately deal with sparse LFs with large occlusions. Furthermore, the spatio-angular LF cues are not fully exploited by existing methods. In this paper, the concept of hyperpixels is defined and a flexible, automatic, and adaptive representation for both dense and sparse 4D LFs is proposed. Initially, disparity maps are estimated for all views to enhance over-segmentation accuracy and consistency. Afterwards, a modified weighted K-means clustering using robust spatio-angular features is performed in 4D Euclidean space. Experimental results on several dense and sparse 4D LF datasets show that the proposed approach matches or outperforms state-of-the-art methods in terms of over-segmentation accuracy, shape regularity and view consistency.
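The weighted K-means step described above can be illustrated with a minimal sketch. The feature layout (angular coordinates u, v; spatial coordinates x, y; color channels) and the per-dimension weights are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def kmeans_step(features, centers, weights):
    """One weighted K-means iteration in feature space.

    features: (N, D) samples, e.g. [u, v, x, y, L, a, b] per LF sample
    centers:  (K, D) current cluster centers
    weights:  (D,)   per-dimension weights balancing angular/spatial/color terms
    """
    # Weighted Euclidean distance from every sample to every center.
    diff = features[:, None, :] - centers[None, :, :]    # (N, K, D)
    dist = np.sqrt(((weights * diff) ** 2).sum(axis=2))  # (N, K)
    labels = dist.argmin(axis=1)                         # hard assignment
    # Move each center to the mean of its assigned samples.
    new_centers = centers.copy()
    for k in range(centers.shape[0]):
        mask = labels == k
        if mask.any():
            new_centers[k] = features[mask].mean(axis=0)
    return labels, new_centers

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 7))                      # toy 4D LF samples
ctrs = feats[rng.choice(200, size=4, replace=False)]   # seed 4 clusters
w = np.array([2.0, 2.0, 1.0, 1.0, 0.5, 0.5, 0.5])      # emphasize angular dims (assumption)
labels, ctrs = kmeans_step(feats, ctrs, w)
```

Iterating this assignment/update pair until the labels stabilize yields the over-segmentation; the weight vector controls the trade-off between angular consistency and spatial/color compactness.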
Building with Drones: Accurate 3D Facade Reconstruction using MAVs
Automatic reconstruction of 3D models from images using multi-view
Structure-from-Motion methods has been one of the most fruitful outcomes of
computer vision. These advances, combined with the growing popularity of Micro
Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools
ubiquitous for a large number of Architecture, Engineering and Construction
applications among audiences mostly unskilled in computer vision. However, to
obtain high-resolution and accurate reconstructions from a large-scale object
using SfM, there are many critical constraints on the quality of image data,
which often become sources of inaccuracy, as current 3D reconstruction
pipelines do not help users assess the fidelity of the input data
during image acquisition. In this paper, we present and advocate a
closed-loop interactive approach that performs incremental reconstruction in
real-time and gives users online feedback on quality parameters such as
Ground Sampling Distance (GSD) and image redundancy on a surface mesh. We
also propose a novel multi-scale camera network design to prevent scene drift
caused by incremental map building, and release the first multi-scale image
sequence dataset as a benchmark. Further, we evaluate our system on real
outdoor scenes, and show that our interactive pipeline combined with a
multi-scale camera network approach provides compelling accuracy in multi-view
reconstruction tasks when compared against the state-of-the-art methods.

Comment: 8 pages, 2015 IEEE International Conference on Robotics and Automation (ICRA '15), Seattle, WA, USA
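The GSD feedback mentioned in the abstract can be computed at nadir from standard camera parameters. This is the textbook formula, not necessarily the paper's exact implementation:

```python
def ground_sampling_distance(sensor_width_mm, focal_mm, altitude_m, image_width_px):
    """GSD in cm/pixel: the ground footprint of one pixel at nadir.

    Derived from similar triangles: ground_footprint / altitude
    equals sensor_width / focal_length; dividing by the pixel count
    and converting meters to centimeters gives cm per pixel.
    """
    return (sensor_width_mm * altitude_m * 100.0) / (focal_mm * image_width_px)

# Example values (illustrative, not from the paper): a 13.2 mm-wide sensor,
# 8.8 mm lens, 4000 px image width, flying at 60 m altitude:
gsd = ground_sampling_distance(13.2, 8.8, 60.0, 4000)  # → 2.25 cm/pixel
```

Displaying this value per mesh face during acquisition lets the operator adjust altitude or lens choice on the fly to hit a target resolution.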
Light Field Super-Resolution Via Graph-Based Regularization
Light field cameras capture the 3D information in a scene with a single
exposure. This special feature makes light field cameras very appealing for a
variety of applications: from post-capture refocus, to depth estimation and
image-based rendering. However, light field cameras suffer by design from
strong limitations in their spatial resolution, which should therefore be
augmented by computational methods. On the one hand, off-the-shelf single-frame
and multi-frame super-resolution algorithms are not ideal for light field data,
as they do not consider its particular structure. On the other hand, the few
super-resolution algorithms explicitly tailored for light field data exhibit
significant limitations, such as the need to estimate an explicit disparity map
at each view. In this work we propose a new light field super-resolution
algorithm meant to address these limitations. We adopt an approach akin to
multi-frame super-resolution, where the complementary information in the different
light field views is used to augment the spatial resolution of the whole light
field. We show that coupling the multi-frame approach with a graph regularizer
that enforces the light field structure via nonlocal self-similarities makes it
possible to avoid the costly and challenging disparity estimation step for all
the views. Extensive experiments show that the new algorithm compares favorably to
the other state-of-the-art methods for light field super-resolution, both in
terms of PSNR and visual quality.

Comment: This new version includes more material. In particular, we added a
new section on the computational complexity of the proposed algorithm,
experimental comparisons with a CNN-based super-resolution algorithm, and new
experiments on a third dataset.
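The graph-regularized formulation behind this family of methods can be sketched as a least-squares problem: minimize ||Dx − y||² + λ·xᵀLx, where D is the downsampling operator and L is the Laplacian of a similarity graph. The toy chain graph and sizes below are stand-ins for the paper's nonlocal self-similarity graph:

```python
import numpy as np

n, m = 8, 4  # high-res and low-res signal lengths (toy 1D example)

# D: downsampling operator that keeps every 2nd sample.
D = np.zeros((m, n))
D[np.arange(m), np.arange(m) * 2] = 1.0

# L: chain-graph Laplacian, a stand-in for the nonlocal similarity graph.
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0

y = np.array([1.0, 2.0, 3.0, 4.0])  # low-res observation
lam = 0.5                           # regularization strength

# Normal equations of ||Dx - y||^2 + lam * x^T L x:
x = np.linalg.solve(D.T @ D + lam * L, D.T @ y)
```

The regularizer fills in the unobserved samples by encouraging smoothness along graph edges, which is what lets the method sidestep explicit per-view disparity estimation.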
Unsupervised Light Field Depth Estimation via Multi-view Feature Matching with Occlusion Prediction
Depth estimation from light field (LF) images is a fundamental step for many
applications. Recently, learning-based methods have achieved higher accuracy
and efficiency than the traditional methods. However, it is costly to obtain
sufficient depth labels for supervised training. In this paper, we propose an
unsupervised framework to estimate depth from LF images. First, we design a
disparity estimation network (DispNet) with a coarse-to-fine structure to
predict disparity maps from different view combinations by performing
multi-view feature matching to learn the correspondences more effectively. As
occlusions may cause the violation of photo-consistency, we design an occlusion
prediction network (OccNet) to predict the occlusion maps, which are used as
the element-wise weights of the photometric loss to handle occlusions and
assist disparity learning. With the disparity maps estimated by multiple
input combinations, we propose a disparity fusion strategy based on the
estimated errors with effective occlusion handling to obtain the final
disparity map. Experimental results demonstrate that our method achieves
superior performance on both dense and sparse LF images and generalizes
better to real-world LF images.
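The occlusion-weighted photometric loss described above can be sketched as follows. The function name and the L1 penalty are assumptions for illustration; the paper's exact loss may differ:

```python
import numpy as np

def photometric_loss(warped, target, occ_weight):
    """Per-pixel L1 photometric loss, down-weighted where occlusion is predicted.

    warped:     view warped to the target via the estimated disparity
    target:     reference view
    occ_weight: map in [0, 1]; ~0 at predicted occlusions (where
                photo-consistency is violated), ~1 elsewhere
    """
    weighted = occ_weight * np.abs(warped - target)
    # Normalize by total weight so masked-out pixels do not shrink the loss.
    return float(weighted.sum() / (occ_weight.sum() + 1e-8))
```

Down-weighting occluded pixels keeps the photo-consistency assumption from penalizing correct disparities at occlusion boundaries, which is exactly where naive photometric losses break down.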
Deep Learning for Single Image Super-Resolution: A Brief Review
Single image super-resolution (SISR) is a notoriously challenging ill-posed
problem, which aims to obtain a high-resolution (HR) output from one of its
low-resolution (LR) versions. To solve the SISR problem, powerful deep
learning algorithms have recently been employed and have achieved state-of-the-art
performance. In this survey, we review representative deep learning-based SISR
methods, and group them into two categories according to their major
contributions to two essential aspects of SISR: the exploration of efficient
neural network architectures for SISR, and the development of effective
optimization objectives for deep SISR learning. For each category, a baseline
is first established and several critical limitations of the baseline are
summarized. Then representative works on overcoming these limitations are
presented based on their original contents as well as our critical
understandings and analyses, and relevant comparisons are conducted from a
variety of perspectives. Finally, we conclude this review with some vital
current challenges and future trends in SISR leveraging deep learning
algorithms.

Comment: Accepted by IEEE Transactions on Multimedia (TMM).
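The ill-posedness of SISR comes from its observation model: the LR image is a blurred, downsampled version of the HR image, so many HR images map to the same LR input. A minimal sketch of that forward model (average-pool blur, no noise, an assumed simplification of the usual blur-kernel formulation):

```python
import numpy as np

def degrade(hr, scale=2):
    """Toy LR observation model: crop to a multiple of `scale`,
    then average each scale x scale block (blur + downsample)."""
    h, w = hr.shape
    h, w = h - h % scale, w - w % scale
    hr = hr[:h, :w]
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

hr = np.arange(16, dtype=float).reshape(4, 4)
lr = degrade(hr, scale=2)  # → 2x2 array [[2.5, 4.5], [10.5, 12.5]]
```

A deep SISR model learns the inverse of this many-to-one mapping, which is why both the network architecture and the optimization objective (the two axes of the survey's taxonomy) matter so much.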