16,946 research outputs found
3D Capturing with Monoscopic Camera
This article presents a new concept of using the auto-focus function of the monoscopic camera sensor to estimate depth map information, which avoids not only using auxiliary equipment or human interaction, but also the introduced computational complexity of SfM or depth analysis. The system architecture that supports both stereo image and video data capturing, processing and display is discussed. A novel stereo image pair generation algorithm by using Z-buffer-based 3D surface recovery is proposed. Based on the depth map, we are able to calculate the disparity map (the distance in pixels between the image points in both views) for the image. The presented algorithm uses a single image with depth information (e.g. z-buffer) as an input and produces two images for left and right eye
Unsupervised Learning of Depth and Ego-Motion from Video
We present an unsupervised learning framework for the task of monocular depth
and camera motion estimation from unstructured video sequences. We achieve this
by simultaneously training depth and camera pose estimation networks using the
task of view synthesis as the supervisory signal. The networks are thus coupled
via the view synthesis objective during training, but can be applied
independently at test time. Empirical evaluation on the KITTI dataset
demonstrates the effectiveness of our approach: 1) monocular depth performing
comparably with supervised methods that use either ground-truth pose or depth
for training, and 2) pose estimation performing favorably with established SLAM
systems under comparable input settings.Comment: Accepted to CVPR 2017. Project webpage:
https://people.eecs.berkeley.edu/~tinghuiz/projects/SfMLearner
Learning to Synthesize a 4D RGBD Light Field from a Single Image
We present a machine learning algorithm that takes as input a 2D RGB image
and synthesizes a 4D RGBD light field (color and depth of the scene in each ray
direction). For training, we introduce the largest public light field dataset,
consisting of over 3300 plenoptic camera light fields of scenes containing
flowers and plants. Our synthesis pipeline consists of a convolutional neural
network (CNN) that estimates scene geometry, a stage that renders a Lambertian
light field using that geometry, and a second CNN that predicts occluded rays
and non-Lambertian effects. Our algorithm builds on recent view synthesis
methods, but is unique in predicting RGBD for each light field ray and
improving unsupervised single image depth estimation by enforcing consistency
of ray depths that should intersect the same scene point. Please see our
supplementary video at https://youtu.be/yLCvWoQLnmsComment: International Conference on Computer Vision (ICCV) 201
Generalized Video Deblurring for Dynamic Scenes
Several state-of-the-art video deblurring methods are based on a strong
assumption that the captured scenes are static. These methods fail to deblur
blurry videos in dynamic scenes. We propose a video deblurring method to deal
with general blurs inherent in dynamic scenes, contrary to other methods. To
handle locally varying and general blurs caused by various sources, such as
camera shake, moving objects, and depth variation in a scene, we approximate
pixel-wise kernel with bidirectional optical flows. Therefore, we propose a
single energy model that simultaneously estimates optical flows and latent
frames to solve our deblurring problem. We also provide a framework and
efficient solvers to optimize the energy model. By minimizing the proposed
energy function, we achieve significant improvements in removing blurs and
estimating accurate optical flows in blurry frames. Extensive experimental
results demonstrate the superiority of the proposed method in real and
challenging videos that state-of-the-art methods fail in either deblurring or
optical flow estimation.Comment: CVPR 2015 ora
- …