Unsupervised Adversarial Depth Estimation using Cycled Generative Networks
While recent deep monocular depth estimation approaches based on supervised
regression have achieved remarkable performance, costly ground truth
annotations are required during training. To cope with this issue, in this
paper we present a novel unsupervised deep learning approach for predicting
depth maps and show that the depth estimation task can be effectively tackled
within an adversarial learning framework. Specifically, we propose a deep
generative network that learns to predict the correspondence field, i.e. the
disparity map, between two image views in a calibrated stereo camera setting.
The proposed architecture consists of two generative sub-networks jointly
trained with adversarial learning for reconstructing the disparity map and
organized in a cycle so as to provide mutual constraints and supervision to
each other. Extensive experiments on the publicly available datasets KITTI and
Cityscapes demonstrate the effectiveness of the proposed model, which achieves
results competitive with state-of-the-art methods. The code and trained model
are available at https://github.com/andrea-pilzer/unsup-stereo-depthGAN.
Comment: To appear in 3DV 2018. Code is available on GitHub.
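The cycle structure described in the abstract can be made concrete with a short sketch. The following is a minimal, illustrative PyTorch fragment, not the authors' released code: a toy DispNet stands in for each generative sub-network, a differentiable horizontal warp synthesizes one view from the other, and the half-cycle and full-cycle reconstruction terms supply the mutual supervision. The adversarial discriminators on the synthesized views are omitted for brevity, and all names are assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp(image, disparity):
    """Differentiably warp an image horizontally by a per-pixel disparity
    map, using bilinear sampling (F.grid_sample)."""
    n, _, h, w = image.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=image.device),
        torch.linspace(-1, 1, w, device=image.device),
        indexing="ij",
    )
    # Shift the normalized x-coordinates by the predicted disparity.
    xs = xs.unsqueeze(0) + 2.0 * disparity.squeeze(1) / w
    grid = torch.stack((xs, ys.unsqueeze(0).expand_as(xs)), dim=-1)
    return F.grid_sample(image, grid, align_corners=True)

class DispNet(nn.Module):
    """Toy convolutional stand-in for a generative sub-network that maps
    an RGB view to a one-channel disparity map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

# Two generative sub-networks arranged in a cycle: one predicts disparity
# from the left view to synthesize the right view, the other cycles back.
g_l2r, g_r2l = DispNet(), DispNet()

left = torch.rand(2, 3, 64, 128)     # dummy calibrated stereo pair
right = torch.rand(2, 3, 64, 128)

disp_r = g_l2r(left)                 # half cycle: left -> synthesized right
fake_right = warp(left, disp_r)
disp_l = g_r2l(fake_right)           # full cycle: back to a reconstructed left
recon_left = warp(fake_right, -disp_l)

# Reconstruction terms of the cycle; the adversarial terms (discriminators
# judging the synthesized views) are omitted for brevity.
loss = F.l1_loss(fake_right, right) + F.l1_loss(recon_left, left)
loss.backward()
print(float(loss))
```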
DeepVoxels: Learning Persistent 3D Feature Embeddings
In this work, we address the lack of 3D understanding of generative neural
networks by introducing a persistent 3D feature embedding for view synthesis.
To this end, we propose DeepVoxels, a learned representation that encodes the
view-dependent appearance of a 3D scene without having to explicitly model its
geometry. At its core, our approach is based on a Cartesian 3D grid of
persistent embedded features that learn to make use of the underlying 3D scene
structure. Our approach combines insights from 3D geometric computer vision
with recent advances in learning image-to-image mappings based on adversarial
loss functions. DeepVoxels is supervised, without requiring a 3D reconstruction
of the scene, using a 2D re-rendering loss, and enforces perspective and
multi-view geometry in a principled manner. We apply our persistent 3D scene
representation to the problem of novel view synthesis, demonstrating
high-quality results for a variety of challenging scenes.
Comment: Video: https://www.youtube.com/watch?v=HM_WsZhoGXw
Supplemental material: https://drive.google.com/file/d/1BnZRyNcVUty6-LxAstN83H79ktUq8Cjp/view?usp=sharing
Code: https://github.com/vsitzmann/deepvoxels
Project page: https://vsitzmann.github.io/deepvoxels
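To illustrate the core idea of a persistent, learnable Cartesian grid of features trained only with a 2D re-rendering loss, here is a heavily simplified PyTorch sketch. It replaces the paper's differentiable perspective projection and occlusion-aware rendering with a rotation of the grid followed by an orthographic collapse along depth; that substitution, and every name below, is an assumption made purely to keep the example short.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class PersistentVoxelFeatures(nn.Module):
    """Minimal stand-in for the DeepVoxels idea: a persistent, learnable
    3D grid of features decoded into 2D views by a rendering network."""
    def __init__(self, channels=16, grid_size=32, image_size=64):
        super().__init__()
        # Persistent 3D feature embedding, shared across all views of the scene.
        self.voxels = nn.Parameter(
            torch.randn(1, channels, grid_size, grid_size, grid_size) * 0.01
        )
        # Small 2D rendering network that maps projected features to RGB.
        self.render = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )
        self.image_size = image_size

    def forward(self, angle):
        """Render the scene from a camera rotated by `angle` (radians)
        about the vertical axis."""
        c, s = math.cos(angle), math.sin(angle)
        # 3x4 affine that rotates the voxel grid about the y-axis.
        theta = torch.tensor([[[c, 0.0, s, 0.0],
                               [0.0, 1.0, 0.0, 0.0],
                               [-s, 0.0, c, 0.0]]])
        grid = F.affine_grid(theta, self.voxels.shape, align_corners=False)
        rotated = F.grid_sample(self.voxels, grid, align_corners=False)
        # Orthographic "projection": collapse features along the depth axis.
        projected = rotated.mean(dim=2)
        projected = F.interpolate(projected, size=self.image_size,
                                  mode="bilinear", align_corners=False)
        return self.render(projected)

# 2D re-rendering loss: supervision comes only from posed 2D images,
# never from an explicit 3D reconstruction of the scene.
model = PersistentVoxelFeatures()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

target_view = torch.rand(1, 3, 64, 64)   # dummy ground-truth image at this pose
pred = model(0.3)
loss = F.l1_loss(pred, target_view)
loss.backward()
optimizer.step()
print(float(loss))
```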
Hierarchy Composition GAN for High-fidelity Image Synthesis
Despite the rapid progress of generative adversarial networks (GANs) in image
synthesis in recent years, existing image synthesis approaches work in either
the geometry domain or the appearance domain alone, which often introduces
various synthesis artifacts. This paper presents an innovative Hierarchical
Composition GAN (HIC-GAN) that incorporates image synthesis in geometry and
appearance domains into an end-to-end trainable network and achieves superior
synthesis realism in both domains simultaneously. We design an innovative
hierarchical composition mechanism that is capable of learning realistic
composition geometry and handling occlusions when multiple foreground objects
are involved in image composition. In addition, we introduce a novel attention
mask mechanism that guides the adaptation of the appearance of foreground
objects, which also helps to provide a better training reference for learning
in the geometry domain. Extensive experiments on scene text image synthesis,
portrait editing
and indoor rendering tasks show that the proposed HIC-GAN achieves superior
synthesis performance both qualitatively and quantitatively.
Comment: 11 pages, 8 figures
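The two-domain composition can be sketched as follows, assuming a spatial-transformer-style geometry stage and a sigmoid attention mask for the appearance stage. This is an illustrative PyTorch fragment, not the HIC-GAN architecture itself: the hierarchical handling of multiple foregrounds and the discriminators that drive adversarial training are omitted, and all names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometryStage(nn.Module):
    """Predicts an affine placement for the foreground: a spatial-transformer
    stand-in for the geometry-domain synthesis."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(6, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 6),
        )
        # Initialize to the identity transform so training starts stably.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, fg, bg):
        theta = self.loc(torch.cat([fg, bg], dim=1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, fg.shape, align_corners=False)
        return F.grid_sample(fg, grid, align_corners=False)

class AppearanceStage(nn.Module):
    """Predicts a soft attention mask that adapts how the warped foreground
    is blended into the background: the appearance-domain synthesis."""
    def __init__(self):
        super().__init__()
        self.mask = nn.Sequential(
            nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, warped_fg, bg):
        m = self.mask(torch.cat([warped_fg, bg], dim=1))
        return m * warped_fg + (1 - m) * bg, m

# End-to-end composition of a single foreground onto a background; in the
# paper both stages are trained jointly against discriminators.
geometry, appearance = GeometryStage(), AppearanceStage()
fg = torch.rand(1, 3, 64, 64)    # foreground object (e.g. a text instance)
bg = torch.rand(1, 3, 64, 64)    # background scene
warped = geometry(fg, bg)
composite, mask = appearance(warped, bg)
print(composite.shape, mask.shape)
```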