70,267 research outputs found
Depth Assisted Full Resolution Network for Single Image-based View Synthesis
Researches in novel viewpoint synthesis majorly focus on interpolation from
multi-view input images. In this paper, we focus on a more challenging and
ill-posed problem that is to synthesize novel viewpoints from one single input
image. To achieve this goal, we propose a novel deep learning-based technique.
We design a full resolution network that extracts local image features with the
same resolution of the input, which contributes to derive high resolution and
prevent blurry artifacts in the final synthesized images. We also involve a
pre-trained depth estimation network into our system, and thus 3D information
is able to be utilized to infer the flow field between the input and the target
image. Since the depth network is trained by depth order information between
arbitrary pairs of points in the scene, global image features are also involved
into our system. Finally, a synthesis layer is used to not only warp the
observed pixels to the desired positions but also hallucinate the missing
pixels with recorded pixels. Experiments show that our technique performs well
on images of various scenes, and outperforms the state-of-the-art techniques
Single-image Tomography: 3D Volumes from 2D Cranial X-Rays
As many different 3D volumes could produce the same 2D x-ray image, inverting
this process is challenging. We show that recent deep learning-based
convolutional neural networks can solve this task. As the main challenge in
learning is the sheer amount of data created when extending the 2D image into a
3D volume, we suggest firstly to learn a coarse, fixed-resolution volume which
is then fused in a second step with the input x-ray into a high-resolution
volume. To train and validate our approach we introduce a new dataset that
comprises of close to half a million computer-simulated 2D x-ray images of 3D
volumes scanned from 175 mammalian species. Applications of our approach
include stereoscopic rendering of legacy x-ray images, re-rendering of x-rays
including changes of illumination, view pose or geometry. Our evaluation
includes comparison to previous tomography work, previous learning methods
using our data, a user study and application to a set of real x-rays
Learning Shape Priors for Single-View 3D Completion and Reconstruction
The problem of single-view 3D shape completion or reconstruction is
challenging, because among the many possible shapes that explain an
observation, most are implausible and do not correspond to natural objects.
Recent research in the field has tackled this problem by exploiting the
expressiveness of deep convolutional networks. In fact, there is another level
of ambiguity that is often overlooked: among plausible shapes, there are still
multiple shapes that fit the 2D image equally well; i.e., the ground truth
shape is non-deterministic given a single-view input. Existing fully supervised
approaches fail to address this issue, and often produce blurry mean shapes
with smooth surfaces but no fine details.
In this paper, we propose ShapeHD, pushing the limit of single-view shape
completion and reconstruction by integrating deep generative models with
adversarially learned shape priors. The learned priors serve as a regularizer,
penalizing the model only if its output is unrealistic, not if it deviates from
the ground truth. Our design thus overcomes both levels of ambiguity
aforementioned. Experiments demonstrate that ShapeHD outperforms state of the
art by a large margin in both shape completion and shape reconstruction on
multiple real datasets.Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://shapehd.csail.mit.edu
Unsupervised Monocular Depth Estimation with Left-Right Consistency
Learning based methods have shown very promising results for the task of
depth estimation in single images. However, most existing approaches treat
depth prediction as a supervised regression problem and as a result, require
vast quantities of corresponding ground truth depth data for training. Just
recording quality depth data in a range of environments is a challenging
problem. In this paper, we innovate beyond existing approaches, replacing the
use of explicit depth data during training with easier-to-obtain binocular
stereo footage.
We propose a novel training objective that enables our convolutional neural
network to learn to perform single image depth estimation, despite the absence
of ground truth depth data. Exploiting epipolar geometry constraints, we
generate disparity images by training our network with an image reconstruction
loss. We show that solving for image reconstruction alone results in poor
quality depth images. To overcome this problem, we propose a novel training
loss that enforces consistency between the disparities produced relative to
both the left and right images, leading to improved performance and robustness
compared to existing approaches. Our method produces state of the art results
for monocular depth estimation on the KITTI driving dataset, even outperforming
supervised methods that have been trained with ground truth depth.Comment: CVPR 2017 ora
Dense 3D Object Reconstruction from a Single Depth View
In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs
the complete 3D structure of a given object from a single arbitrary depth view
using generative adversarial networks. Unlike existing work which typically
requires multiple views of the same object or class labels to recover the full
3D geometry, the proposed 3D-RecGAN++ only takes the voxel grid representation
of a depth view of the object as input, and is able to generate the complete 3D
occupancy grid with a high resolution of 256^3 by recovering the
occluded/missing regions. The key idea is to combine the generative
capabilities of autoencoders and the conditional Generative Adversarial
Networks (GAN) framework, to infer accurate and fine-grained 3D structures of
objects in high-dimensional voxel space. Extensive experiments on large
synthetic datasets and real-world Kinect datasets show that the proposed
3D-RecGAN++ significantly outperforms the state of the art in single view 3D
object reconstruction, and is able to reconstruct unseen types of objects.Comment: TPAMI 2018. Code and data are available at:
https://github.com/Yang7879/3D-RecGAN-extended. This article extends from
arXiv:1708.0796
- …