Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis
We introduce a data-driven approach to complete partial 3D shapes through a
combination of volumetric deep neural networks and 3D shape synthesis. From a
partially-scanned input shape, our method first infers a low-resolution -- but
complete -- output. To this end, we introduce a 3D-Encoder-Predictor Network
(3D-EPN) which is composed of 3D convolutional layers. The network is trained
to predict and fill in missing data, and operates on an implicit surface
representation that encodes both known and unknown space. This allows us to
predict global structure in unknown areas at high accuracy. We then correlate
these intermediary results with 3D geometry from a shape database at test time.
In a final pass, we propose a patch-based 3D shape synthesis method that
imposes the 3D geometry from these retrieved shapes as constraints on the
coarsely-completed mesh. This synthesis process enables us to reconstruct
fine-scale detail and generate high-resolution output while respecting the
global mesh structure obtained by the 3D-EPN. Although our 3D-EPN outperforms
state-of-the-art completion methods, the main contribution of our work lies in
the combination of a data-driven shape predictor and analytic 3D shape
synthesis. In our results, we show extensive evaluations on a newly-introduced
shape completion benchmark for both real-world and synthetic data.
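To make the idea of an input that "encodes both known and unknown space" concrete, here is a minimal sketch, not the 3D-EPN implementation. It assumes a binary occupancy grid scanned by a single sensor looking along the +z axis; the function name `known_unknown_channels` and the two-channel layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def known_unknown_channels(occupancy: np.ndarray) -> np.ndarray:
    """Simulate one scan along +z and label every voxel (illustrative only).

    Returns an array of shape (2, X, Y, Z):
      channel 0: observed occupancy (0 wherever the voxel is unknown)
      channel 1: known mask (1 = observed by the sensor, 0 = occluded)
    """
    occ = occupancy.astype(np.int64)
    # Count occupied voxels met so far along each ray, including the current one.
    hits_inclusive = np.cumsum(occ, axis=2)
    # Occupied voxels strictly in front of the current voxel.
    hits_before = hits_inclusive - occ
    # A voxel is known if nothing occupied lies in front of it on its ray.
    known = hits_before == 0
    observed = (occ == 1) & known
    return np.stack([observed.astype(np.float32), known.astype(np.float32)])

# One ray: free space, the first surface, then occluded voxels behind it.
ray = np.array([0, 1, 1, 0]).reshape(1, 1, 4)
channels = known_unknown_channels(ray)
print(channels[0, 0, 0])  # observed occupancy along the ray
print(channels[1, 0, 0])  # known mask along the ray
```

A completion network would then be asked to predict the voxels where the known mask is 0, while the observed channel constrains the rest.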
GRASS: Generative Recursive Autoencoders for Shape Structures
We introduce a novel neural network architecture for encoding and synthesis
of 3D shapes, particularly their structures. Our key insight is that 3D shapes
are effectively characterized by their hierarchical organization of parts,
which reflects fundamental intra-shape relationships such as adjacency and
symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a
flat, unlabeled, arbitrary part layout to a compact code. The code effectively
captures hierarchical structures of man-made 3D objects of varying structural
complexities despite being fixed-dimensional: an associated decoder maps a code
back to a full hierarchy. The learned bidirectional mapping is further tuned
using an adversarial setup to yield a generative model of plausible structures,
from which novel structures can be sampled. Finally, our structure synthesis
framework is augmented by a second trained module that produces fine-grained
part geometry, conditioned on global and local structural context, leading to a
full generative pipeline for 3D shapes. We demonstrate that without
supervision, our network learns meaningful structural hierarchies adhering to
perceptual grouping principles, produces compact codes which enable
applications such as shape classification and partial matching, and supports
shape synthesis and interpolation with significant variations in topology and
geometry. Comment: Corresponding author: Kai Xu
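As a rough illustration of how a recursive encoder can map hierarchies of different sizes to a fixed-dimensional code, consider the sketch below. It is not the GRASS architecture: it uses a single generic merge function with untrained random weights, and `CODE_DIM`, `W`, `b`, and `encode` are hypothetical names chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
CODE_DIM = 8  # hypothetical code size

# Hypothetical, untrained weights of one shared merge function.
W = rng.normal(scale=0.1, size=(CODE_DIM, 2 * CODE_DIM))
b = np.zeros(CODE_DIM)

def encode(node):
    """Encode a binary part hierarchy into one vector of length CODE_DIM.

    A node is either a leaf code (1D array of length CODE_DIM)
    or a (left, right) tuple of child nodes.
    """
    if isinstance(node, tuple):
        left, right = node
        children = np.concatenate([encode(left), encode(right)])
        # Merge the two child codes into a parent code of the same size.
        return np.tanh(W @ children + b)
    return np.asarray(node, dtype=np.float64)

def random_leaf():
    # Stand-in for a per-part feature vector.
    return rng.normal(size=CODE_DIM)

small = (random_leaf(), random_leaf())
large = ((random_leaf(), random_leaf()),
         (random_leaf(), (random_leaf(), random_leaf())))

print(encode(small).shape, encode(large).shape)
```

Because every merge returns a vector the same size as its inputs, the root code has the same length for a two-part and a five-part hierarchy.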
Survey on Controlable Image Synthesis with Deep Learning
Image synthesis has attracted growing research interest in both academia and
industry. Deep learning technologies, especially generative models, have
greatly inspired controllable image synthesis approaches and applications,
which aim to generate particular visual content with latent prompts. To
further investigate the low-level controllable image synthesis problem, which
is crucial for fine image rendering and editing tasks, we present a survey of
recent work on 3D controllable image synthesis using deep
learning. We first introduce the datasets and evaluation indicators for 3D
controllable image synthesis. Then, we review the state-of-the-art research for
geometrically controllable image synthesis in two aspects: 1)
Viewpoint/pose-controllable image synthesis; 2) Structure/shape-controllable
image synthesis. Furthermore, photometrically controllable image synthesis
approaches are also reviewed for 3D relighting research. While the emphasis
is on 3D controllable image synthesis algorithms, the related applications,
products and resources are also briefly summarized for practitioners. Comment: 19 pages, 17 figures
TextCraft: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Text
Language is one of the primary means by which we describe the 3D world around
us. While rapid progress has been made in text-to-2D-image synthesis, similar
progress in text-to-3D-shape synthesis has been hindered by the lack of paired
(text, shape) data. Moreover, extant methods for text-to-shape generation have
limited shape diversity and fidelity. We introduce TextCraft, a method to
address these limitations by producing high-fidelity and diverse 3D shapes
without the need for (text, shape) pairs for training. TextCraft achieves this
by using CLIP and a multi-resolution approach that first generates in a
low-dimensional latent space and then upscales to a higher resolution,
improving the fidelity of the generated shape. To improve shape diversity, we
use a discrete latent space which is modelled using a bidirectional transformer
conditioned on the interchangeable image-text embedding space induced by CLIP.
Moreover, we present a novel variant of classifier-free guidance, which further
improves the accuracy-diversity trade-off. Finally, we perform extensive
experiments that demonstrate that TextCraft outperforms state-of-the-art
baselines.
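For readers unfamiliar with classifier-free guidance, the sketch below shows the standard formulation applied to token logits, not the novel variant the abstract refers to, whose details are not given here. The function names `guided_logits` and `softmax` and the example numbers are assumptions made for illustration.

```python
import numpy as np

def guided_logits(cond_logits, uncond_logits, scale):
    """Standard classifier-free guidance on logits.

    scale = 0 gives the unconditional prediction, scale = 1 the
    conditional one, and scale > 1 pushes further toward the condition.
    """
    cond = np.asarray(cond_logits, dtype=np.float64)
    uncond = np.asarray(uncond_logits, dtype=np.float64)
    return uncond + scale * (cond - uncond)

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Made-up logits over three discrete latent tokens.
cond = [2.0, 0.5, 0.0]
uncond = [1.0, 1.0, 1.0]
for scale in (0.0, 1.0, 3.0):
    print(scale, softmax(guided_logits(cond, uncond, scale)))
```

Larger scales concentrate probability on the tokens the condition favors, which is the usual source of the trade-off between accuracy and diversity.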
Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields
Capitalizing on the recent advances in image generation models, existing
controllable face image synthesis methods are able to generate high-fidelity
images with some levels of controllability, e.g., controlling the shapes,
expressions, textures, and poses of the generated face images. However, these
methods focus on 2D image generative models, which are prone to producing
inconsistent face images under large expression and pose changes. In this
paper, we propose a new NeRF-based conditional 3D face synthesis framework,
which enables 3D controllability over the generated face images by imposing
explicit 3D conditions from 3D face priors. At its core is a conditional
Generative Occupancy Field (cGOF) that effectively enforces the shape of the
generated face to commit to a given 3D Morphable Model (3DMM) mesh. To achieve
accurate control over fine-grained 3D face shapes of the synthesized image, we
additionally incorporate a 3D landmark loss as well as a volume warping loss
into our synthesis algorithm. Experiments validate the effectiveness of the
proposed method, which is able to generate high-fidelity face images and shows
more precise 3D controllability than state-of-the-art 2D-based controllable
face synthesis methods. Find code and demo at
https://keqiangsun.github.io/projects/cgof