3,470 research outputs found
Efficient modeling of entangled details for natural scenes
Proceedings of Pacific Graphics 2016 (Okinawa)International audienceDigital landscape realism often comes from the multitude of details that are hard to model such as fallen leaves, rock piles orentangled fallen branches. In this article, we present a method for augmenting natural scenes with a huge amount of details suchas grass tufts, stones, leaves or twigs. Our approach takes advantage of the observation that those details can be approximatedby replications of a few similar objects and therefore relies on mass-instancing. We propose an original structure, the GhostTile, that stores a huge number of overlapping candidate objects in a tile, along with a pre-computed collision graph. Detailsare created by traversing the scene with the Ghost Tile and generating instances according to user-defined density fields thatallow to sculpt layers and piles of entangled objects while providing control over their density and distribution
SEGCloud: Semantic Segmentation of 3D Point Clouds
3D semantic scene labeling is fundamental to agents operating in the real
world. In particular, labeling raw 3D point sets from sensors provides
fine-grained semantics. Recent works leverage the capabilities of Neural
Networks (NNs), but are limited to coarse voxel predictions and do not
explicitly enforce global consistency. We present SEGCloud, an end-to-end
framework to obtain 3D point-level segmentation that combines the advantages of
NNs, trilinear interpolation(TI) and fully connected Conditional Random Fields
(FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are
transferred back to the raw 3D points via trilinear interpolation. Then the
FC-CRF enforces global consistency and provides fine-grained semantics on the
points. We implement the latter as a differentiable Recurrent NN to allow joint
optimization. We evaluate the framework on two indoor and two outdoor 3D
datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance
comparable or superior to the state-of-the-art on all datasets.Comment: Accepted as a spotlight at the International Conference of 3D Vision
(3DV 2017
FaceLit: Neural 3D Relightable Faces
We propose a generative framework, FaceLit, capable of generating a 3D face
that can be rendered at various user-defined lighting conditions and views,
learned purely from 2D images in-the-wild without any manual annotation. Unlike
existing works that require careful capture setup or human labor, we rely on
off-the-shelf pose and illumination estimators. With these estimates, we
incorporate the Phong reflectance model in the neural volume rendering
framework. Our model learns to generate shape and material properties of a face
such that, when rendered according to the natural statistics of pose and
illumination, produces photorealistic face images with multiview 3D and
illumination consistency. Our method enables photorealistic generation of faces
with explicit illumination and view controls on multiple datasets - FFHQ,
MetFaces and CelebA-HQ. We show state-of-the-art photorealism among 3D aware
GANs on FFHQ dataset achieving an FID score of 3.5.Comment: CVPR 202
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
While dynamic Neural Radiance Fields (NeRF) have shown success in
high-fidelity 3D modeling of talking portraits, the slow training and inference
speed severely obstruct their potential usage. In this paper, we propose an
efficient NeRF-based framework that enables real-time synthesizing of talking
portraits and faster convergence by leveraging the recent success of grid-based
NeRF. Our key insight is to decompose the inherently high-dimensional talking
portrait representation into three low-dimensional feature grids. Specifically,
a Decomposed Audio-spatial Encoding Module models the dynamic head with a 3D
spatial grid and a 2D audio grid. The torso is handled with another 2D grid in
a lightweight Pseudo-3D Deformable Module. Both modules focus on efficiency
under the premise of good rendering quality. Extensive experiments demonstrate
that our method can generate realistic and audio-lips synchronized talking
portrait videos, while also being highly efficient compared to previous
methods.Comment: Project page: https://me.kiui.moe/radnerf
BLiRF: Bandlimited Radiance Fields for Dynamic Scene Modeling
Reasoning the 3D structure of a non-rigid dynamic scene from a single moving
camera is an under-constrained problem. Inspired by the remarkable progress of
neural radiance fields (NeRFs) in photo-realistic novel view synthesis of
static scenes, extensions have been proposed for dynamic settings. These
methods heavily rely on neural priors in order to regularize the problem. In
this work, we take a step back and reinvestigate how current implementations
may entail deleterious effects, including limited expressiveness, entanglement
of light and density fields, and sub-optimal motion localization. As a remedy,
we advocate for a bridge between classic non-rigid-structure-from-motion
(\nrsfm) and NeRF, enabling the well-studied priors of the former to constrain
the latter. To this end, we propose a framework that factorizes time and space
by formulating a scene as a composition of bandlimited, high-dimensional
signals. We demonstrate compelling results across complex dynamic scenes that
involve changes in lighting, texture and long-range dynamics
- …