4,140 research outputs found
Unsupervised Texture Transfer from Images to Model Collections
Large 3D model repositories of common objects are now ubiquitous and are increasingly being used in computer graphics and computer vision for both analysis and synthesis tasks. However, images of objects in the real world have a richness of appearance that these repositories do not capture, largely because most existing 3D models are untextured. In this work we develop an automated pipeline capable of transporting texture information from images of real objects to 3D models of similar objects. This is a challenging problem, as an object's texture as seen in a photograph is distorted by many factors, including pose, geometry, and illumination. These geometric and photometric distortions must be undone in order to transfer the pure underlying texture to a new object --- the 3D model. Instead of using problematic dense correspondences, we factorize the problem into the reconstruction of a set of base textures (materials) and an illumination model for the object in the image. By exploiting the geometry of the similar 3D model, we reconstruct certain reliable texture regions and correct for the illumination, from which a full texture map can be recovered and applied to the model. Our method allows for large-scale unsupervised production of richly textured 3D models directly from image data, providing high quality virtual objects for 3D scene design or photo editing applications, as well as a wealth of data for training machine learning algorithms for various inference tasks in graphics and vision
Unsupervised Learning of Artistic Styles with Archetypal Style Analysis
In this paper, we introduce an unsupervised learning approach to
automatically discover, summarize, and manipulate artistic styles from large
collections of paintings. Our method is based on archetypal analysis, which is
an unsupervised learning technique akin to sparse coding with a geometric
interpretation. When applied to deep image representations from a collection of
artworks, it learns a dictionary of archetypal styles, which can be easily
visualized. After training the model, the style of a new image, which is
characterized by local statistics of deep visual features, is approximated by a
sparse convex combination of archetypes. This enables us to interpret which
archetypal styles are present in the input image, and in which proportion.
Finally, our approach allows us to manipulate the coefficients of the latent
archetypal decomposition, and achieve various special effects such as style
enhancement, transfer, and interpolation between multiple archetypes.Comment: Accepted at NIPS 2018, Montr\'eal, Canad
Visual Object Networks: Image Generation with Disentangled 3D Representation
Recent progress in deep generative models has led to tremendous breakthroughs
in image generation. However, while existing models can synthesize
photorealistic images, they lack an understanding of our underlying 3D world.
We present a new generative model, Visual Object Networks (VON), synthesizing
natural images of objects with a disentangled 3D representation. Inspired by
classic graphics rendering pipelines, we unravel our image formation process
into three conditionally independent factors---shape, viewpoint, and
texture---and present an end-to-end adversarial learning framework that jointly
models 3D shapes and 2D images. Our model first learns to synthesize 3D shapes
that are indistinguishable from real shapes. It then renders the object's 2.5D
sketches (i.e., silhouette and depth map) from its shape under a sampled
viewpoint. Finally, it learns to add realistic texture to these 2.5D sketches
to generate natural images. The VON not only generates images that are more
realistic than state-of-the-art 2D image synthesis methods, but also enables
many 3D operations such as changing the viewpoint of a generated image, editing
of shape and texture, linear interpolation in texture and shape space, and
transferring appearance across different objects and viewpoints.Comment: NeurIPS 2018. Code: https://github.com/junyanz/VON Website:
http://von.csail.mit.edu
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.Comment: 10 pages, 19 figure
MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction
In this work we propose a novel model-based deep convolutional autoencoder
that addresses the highly challenging problem of reconstructing a 3D human face
from a single in-the-wild color image. To this end, we combine a convolutional
encoder network with an expert-designed generative model that serves as
decoder. The core innovation is our new differentiable parametric decoder that
encapsulates image formation analytically based on a generative model. Our
decoder takes as input a code vector with exactly defined semantic meaning that
encodes detailed face pose, shape, expression, skin reflectance and scene
illumination. Due to this new way of combining CNN-based with model-based face
reconstruction, the CNN-based encoder learns to extract semantically meaningful
parameters from a single monocular input image. For the first time, a CNN
encoder and an expert-designed generative model can be trained end-to-end in an
unsupervised manner, which renders training on very large (unlabeled) real
world data feasible. The obtained reconstructions compare favorably to current
state-of-the-art approaches in terms of quality and richness of representation.Comment: International Conference on Computer Vision (ICCV) 2017 (Oral), 13
page
- …