Depth Assisted Full Resolution Network for Single Image-based View Synthesis
Research in novel viewpoint synthesis mostly focuses on interpolation from
multi-view input images. In this paper, we focus on a more challenging and
ill-posed problem: synthesizing novel viewpoints from a single input image. To
achieve this goal, we propose a novel deep learning-based technique. We design
a full resolution network that extracts local image features at the same
resolution as the input, which helps produce high-resolution results and
prevents blurry artifacts in the final synthesized images. We also incorporate
a pre-trained depth estimation network into our system, so that 3D information
can be used to infer the flow field between the input and the target image.
Since the depth network is trained on depth order information between
arbitrary pairs of points in the scene, global image features are also
incorporated into our system. Finally, a synthesis layer is used not only to
warp the observed pixels to the desired positions but also to hallucinate the
missing pixels from recorded pixels. Experiments show that our technique
performs well on images of various scenes and outperforms the state-of-the-art
techniques.
Single-image Tomography: 3D Volumes from 2D Cranial X-Rays
As many different 3D volumes could produce the same 2D x-ray image, inverting
this process is challenging. We show that recent deep learning-based
convolutional neural networks can solve this task. As the main challenge in
learning is the sheer amount of data created when extending the 2D image into a
3D volume, we suggest first learning a coarse, fixed-resolution volume, which
is then fused in a second step with the input x-ray into a high-resolution
volume. To train and validate our approach, we introduce a new dataset that
comprises close to half a million computer-simulated 2D x-ray images of 3D
volumes scanned from 175 mammalian species. Applications of our approach
include stereoscopic rendering of legacy x-ray images and re-rendering of
x-rays with changes of illumination, view pose or geometry. Our evaluation
includes comparison to previous tomography work, previous learning methods
using our data, a user study and application to a set of real x-rays.
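The ambiguity named at the start of this abstract (many 3D volumes yield the same 2D x-ray) can be illustrated with a toy projection. The sketch below is not the paper's simulator: it assumes an idealized parallel-beam model in which each pixel is the plain sum of voxel densities along the viewing axis, and the function name `project` and the tiny volumes are hypothetical.

```python
def project(volume):
    """Sum a volume indexed [z][y][x] along z (idealized parallel-beam x-ray)."""
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [sum(volume[z][y][x] for z in range(depth)) for x in range(width)]
        for y in range(height)
    ]

# Two different 2x2x2 volumes: the single dense voxel sits at a different depth.
volume_a = [[[1, 0], [0, 0]], [[0, 0], [0, 0]]]
volume_b = [[[0, 0], [0, 0]], [[1, 0], [0, 0]]]

image_a = project(volume_a)
image_b = project(volume_b)

# The volumes differ, yet their projections are identical.
print(volume_a != volume_b, image_a == image_b)
```

Because depth along the ray is lost in the sum, a single image cannot distinguish the two volumes, which is why any inversion has to rely on learned or assumed prior knowledge.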
Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency
In this paper, we introduce a novel unsupervised domain adaptation technique
for the task of 3D keypoint prediction from a single depth scan or image. Our
key idea is to utilize the fact that predictions from different views of the
same or similar objects should be consistent with each other. Such view
consistency can provide effective regularization for keypoint prediction on
unlabeled instances. In addition, we introduce a geometric alignment term to
regularize predictions in the target domain. The resulting loss function can be
effectively optimized via alternating minimization. We demonstrate the
effectiveness of our approach on real datasets and present experimental results
showing that our approach is superior to state-of-the-art general-purpose
domain adaptation techniques.
Comment: ECCV 201
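One way to read the view-consistency idea is as a penalty on disagreement between per-view keypoint predictions once they are mapped into a shared frame. The sketch below is an illustration only, not the loss used in the paper: it assumes each view differs from a canonical frame by a known rotation about the z-axis, and the names `rotate_z` and `view_consistency_loss` are hypothetical.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def view_consistency_loss(predictions, view_angles):
    """Mean squared deviation of per-view keypoints from their cross-view mean."""
    # Undo each view's (assumed known) rotation so all predictions share one frame.
    aligned = [
        [rotate_z(p, -angle) for p in preds]
        for preds, angle in zip(predictions, view_angles)
    ]
    n_views, n_points = len(aligned), len(aligned[0])
    total = 0.0
    for k in range(n_points):
        mean = [sum(aligned[v][k][i] for v in range(n_views)) / n_views
                for i in range(3)]
        total += sum((aligned[v][k][i] - mean[i]) ** 2
                     for v in range(n_views) for i in range(3))
    return total / (n_views * n_points)

# Perfectly consistent predictions: the same keypoints seen from two views.
canonical = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
angles = [0.0, math.pi / 2]
predictions = [[rotate_z(p, a) for p in canonical] for a in angles]
loss = view_consistency_loss(predictions, angles)
```

The loss needs no keypoint labels, only the relation between views, which is what makes a term of this kind usable on unlabeled target-domain instances.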
Learning Shape Priors for Single-View 3D Completion and Reconstruction
The problem of single-view 3D shape completion or reconstruction is
challenging, because among the many possible shapes that explain an
observation, most are implausible and do not correspond to natural objects.
Recent research in the field has tackled this problem by exploiting the
expressiveness of deep convolutional networks. In fact, there is another level
of ambiguity that is often overlooked: among plausible shapes, there are still
multiple shapes that fit the 2D image equally well; i.e., the ground truth
shape is non-deterministic given a single-view input. Existing fully supervised
approaches fail to address this issue, and often produce blurry mean shapes
with smooth surfaces but no fine details.
In this paper, we propose ShapeHD, pushing the limit of single-view shape
completion and reconstruction by integrating deep generative models with
adversarially learned shape priors. The learned priors serve as a regularizer,
penalizing the model only if its output is unrealistic, not if it deviates from
the ground truth. Our design thus overcomes both of the aforementioned levels
of ambiguity. Experiments demonstrate that ShapeHD outperforms the state of
the art by a large margin in both shape completion and shape reconstruction on
multiple real datasets.
Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://shapehd.csail.mit.edu
Visual Feature Attribution using Wasserstein GANs
Attributing the pixels of an input image to a certain category is an
important and well-studied problem in computer vision, with applications
ranging from weakly supervised localisation to understanding hidden effects in
the data. In recent years, approaches based on interpreting a previously
trained neural network classifier have become the de facto state-of-the-art and
are commonly used on medical as well as natural image datasets. In this paper,
we discuss a limitation of these approaches which may lead to only a subset of
the category-specific features being detected. To address this problem, we
develop a novel feature attribution technique based on Wasserstein Generative
Adversarial Networks (WGAN), which does not suffer from this limitation. We
show that our proposed method performs substantially better than the
state-of-the-art for visual attribution on a synthetic dataset and on real 3D
neuroimaging data from patients with mild cognitive impairment (MCI) and
Alzheimer's disease (AD). For AD patients the method produces compellingly
realistic disease effect maps which are very close to the observed effects.
Comment: Accepted to CVPR 201
Dense 3D Object Reconstruction from a Single Depth View
In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs
the complete 3D structure of a given object from a single arbitrary depth view
using generative adversarial networks. Unlike existing work which typically
requires multiple views of the same object or class labels to recover the full
3D geometry, the proposed 3D-RecGAN++ only takes the voxel grid representation
of a depth view of the object as input, and is able to generate the complete 3D
occupancy grid with a high resolution of 256^3 by recovering the
occluded/missing regions. The key idea is to combine the generative
capabilities of autoencoders and the conditional Generative Adversarial
Networks (GAN) framework, to infer accurate and fine-grained 3D structures of
objects in high-dimensional voxel space. Extensive experiments on large
synthetic datasets and real-world Kinect datasets show that the proposed
3D-RecGAN++ significantly outperforms the state of the art in single view 3D
object reconstruction, and is able to reconstruct unseen types of objects.
Comment: TPAMI 2018. Code and data are available at:
https://github.com/Yang7879/3D-RecGAN-extended. This article extends
arXiv:1708.0796
PDE Face: A Novel 3D Face Model
We introduce a novel approach to face models, which
exploits the use of Partial Differential Equations (PDE) to
generate the 3D face. This addresses some common
problems of existing face models. The PDE face benefits
from seamless merging of surface patches by using only a
relatively small number of parameters based on boundary
curves. The PDE face also provides users with a great
degree of freedom to individualise the 3D face by
adjusting a set of facial boundary curves. Furthermore, we
introduce a uv-mesh texture mapping method. By
associating the texels of the texture map with the vertices
of the uv mesh in the PDE face, the new texture mapping
method eliminates the 3D-to-2D association routine in
texture mapping. Any specific PDE face can be textured
without the need for the facial expression in the texture
map to match exactly that of the 3D face model.