9,119 research outputs found
SfSNet: Learning Shape, Reflectance and Illuminance of Faces in the Wild
We present SfSNet, an end-to-end learning framework for producing an accurate
decomposition of an unconstrained human face image into shape, reflectance and
illuminance. SfSNet is designed to reflect a physical lambertian rendering
model. SfSNet learns from a mixture of labeled synthetic and unlabeled real
world images. This allows the network to capture low frequency variations from
synthetic and high frequency details from real images through the photometric
reconstruction loss. SfSNet consists of a new decomposition architecture with
residual blocks that learns a complete separation of albedo and normal. This is
used along with the original image to predict lighting. SfSNet produces
significantly better quantitative and qualitative results than state-of-the-art
methods for inverse rendering and independent normal and illumination
estimation.Comment: Accepted to CVPR 2018 (Spotlight
Augmented Reality Meets Computer Vision : Efficient Data Generation for Urban Driving Scenes
The success of deep learning in computer vision is based on availability of
large annotated datasets. To lower the need for hand labeled images, virtually
rendered 3D worlds have recently gained popularity. Creating realistic 3D
content is challenging on its own and requires significant human effort. In
this work, we propose an alternative paradigm which combines real and synthetic
data for learning semantic instance segmentation and object detection models.
Exploiting the fact that not all aspects of the scene are equally important for
this task, we propose to augment real-world imagery with virtual objects of the
target category. Capturing real-world images at large scale is easy and cheap,
and directly provides real background appearances without the need for creating
complex 3D models of the environment. We present an efficient procedure to
augment real images with virtual objects. This allows us to create realistic
composite images which exhibit both realistic background appearance and a large
number of complex object arrangements. In contrast to modeling complete 3D
environments, our augmentation approach requires only a few user interactions
in combination with 3D shapes of the target object. Through extensive
experimentation, we conclude the right set of parameters to produce augmented
data which can maximally enhance the performance of instance segmentation
models. Further, we demonstrate the utility of our approach on training
standard deep models for semantic instance segmentation and object detection of
cars in outdoor driving scenes. We test the models trained on our augmented
data on the KITTI 2015 dataset, which we have annotated with pixel-accurate
ground truth, and on Cityscapes dataset. Our experiments demonstrate that
models trained on augmented imagery generalize better than those trained on
synthetic data or models trained on limited amount of annotated real data
A Generative Model of People in Clothing
We present the first image-based generative model of people in clothing for
the full body. We sidestep the commonly used complex graphics rendering
pipeline and the need for high-quality 3D scans of dressed people. Instead, we
learn generative models from a large image database. The main challenge is to
cope with the high variance in human pose, shape and appearance. For this
reason, pure image-based approaches have not been considered so far. We show
that this challenge can be overcome by splitting the generating process in two
parts. First, we learn to generate a semantic segmentation of the body and
clothing. Second, we learn a conditional model on the resulting segments that
creates realistic images. The full model is differentiable and can be
conditioned on pose, shape or color. The result are samples of people in
different clothing items and styles. The proposed model can generate entirely
new people with realistic clothing. In several experiments we present
encouraging results that suggest an entirely data-driven approach to people
generation is possible
- …