D-NeRF: Neural Radiance Fields for Dynamic Scenes
Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), held virtually from Nashville, TN (USA), June 20-25, 2021.

Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images. Among these, Neural Radiance Fields (NeRF) stands out: it trains a deep network to map 5D input coordinates (representing spatial location and viewing direction) into a volume density and view-dependent emitted radiance. However, despite achieving an unprecedented level of photorealism in the generated images, NeRF is only applicable to static scenes, where the same spatial location can be queried from different images. In this paper we introduce D-NeRF, a method that extends neural radiance fields to the dynamic domain, allowing us to reconstruct and render novel images of objects under rigid and non-rigid motions. For this purpose we consider time as an additional input to the system, and split the learning process into two main stages: one that encodes the scene into a canonical space and another that maps this canonical representation into the deformed scene at a particular time. Both mappings are learned using fully-connected networks. Once the networks are trained, D-NeRF can render novel images, controlling both the camera view and the time variable, and thus the object movement. We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions.
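The two-stage design described above can be sketched compactly. Below is a minimal, illustrative PyTorch sketch (the class names, layer sizes, and the omission of positional encoding are our assumptions, not the authors' released code): a deformation MLP maps a point and a time to an offset into the canonical space, and a canonical MLP plays the role of a static NeRF.

    # Illustrative sketch of the D-NeRF two-network design (not the authors'
    # code): deform a query point into the canonical space, then evaluate a
    # static radiance field there. Positional encoding is omitted for brevity.
    import torch
    import torch.nn as nn

    class DeformationNet(nn.Module):
        """Maps (x, t) -> Delta x, the offset into the canonical space."""
        def __init__(self, hidden=128):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(4, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3),
            )

        def forward(self, x, t):
            return self.mlp(torch.cat([x, t], dim=-1))

    class CanonicalNeRF(nn.Module):
        """Maps (x_canonical, d) -> (sigma, rgb), as in a static NeRF."""
        def __init__(self, hidden=128):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(6, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 4),  # 1 density + 3 color channels
            )

        def forward(self, x, d):
            out = self.mlp(torch.cat([x, d], dim=-1))
            sigma = torch.relu(out[..., :1])   # density is non-negative
            rgb = torch.sigmoid(out[..., 1:])  # color in [0, 1]
            return sigma, rgb

    # Query the dynamic scene at time t: deform, then evaluate canonically.
    deform, canonical = DeformationNet(), CanonicalNeRF()
    x = torch.rand(1024, 3)         # sample points along camera rays
    d = torch.rand(1024, 3)         # viewing directions
    t = torch.full((1024, 1), 0.5)  # normalized time of the target frame
    sigma, rgb = canonical(x + deform(x, t), d)

Rendering then proceeds as in standard NeRF, volume-rendering sigma and rgb along each camera ray; only the time input and the deformation stage distinguish the dynamic case.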
Neural 3D Video Synthesis
We propose a novel approach for 3D video synthesis that is able to represent
multi-view video recordings of a dynamic real-world scene in a compact, yet
expressive representation that enables high-quality view synthesis and motion
interpolation. Our approach takes the high quality and compactness of static
neural radiance fields in a new direction: to a model-free, dynamic setting. At
the core of our approach is a novel time-conditioned neural radiance field
that represents scene dynamics using a set of compact latent codes. To exploit
the fact that changes between adjacent frames of a video are typically small
and locally consistent, we propose two novel strategies for efficient training
of our neural network: 1) An efficient hierarchical training scheme, and 2) an
importance sampling strategy that selects the next rays for training based on
the temporal variation of the input videos. In combination, these two
strategies significantly boost the training speed, lead to fast convergence of
the training process, and enable high quality results. Our learned
representation is highly compact and able to represent a 10 second, 30 FPS
multi-view video recording from 18 cameras with a model size of just 28 MB. We
demonstrate that our method can render high-fidelity wide-angle novel views at
over 1K resolution, even for highly complex and dynamic scenes. We perform an
extensive qualitative and quantitative evaluation that shows that our approach
outperforms the current state of the art. We include additional video and
information at our project website: https://neural-3d-video.github.io/
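To make the latent-code conditioning concrete, here is a minimal PyTorch sketch (the dimensions, names, and simple concatenation scheme are assumptions for illustration, not the paper's implementation): each video frame owns a small learnable embedding that is fed to the radiance-field MLP alongside position and view direction.

    # Sketch of a time-conditioned radiance field with per-frame latent codes
    # (illustrative assumptions, not the paper's code).
    import torch
    import torch.nn as nn

    class TimeConditionedNeRF(nn.Module):
        def __init__(self, num_frames, latent_dim=32, hidden=128):
            super().__init__()
            # One small learnable code per time step; the codes, not a raw
            # time scalar, carry the scene dynamics.
            self.codes = nn.Embedding(num_frames, latent_dim)
            self.mlp = nn.Sequential(
                nn.Linear(6 + latent_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 4),  # density + rgb
            )

        def forward(self, x, d, frame_idx):
            z = self.codes(frame_idx)                    # (N, latent_dim)
            out = self.mlp(torch.cat([x, d, z], dim=-1))
            return torch.relu(out[..., :1]), torch.sigmoid(out[..., 1:])

    model = TimeConditionedNeRF(num_frames=300)  # e.g. 10 s at 30 FPS
    x, d = torch.rand(1024, 3), torch.rand(1024, 3)
    idx = torch.randint(0, 300, (1024,))
    sigma, rgb = model(x, d, idx)

Because time enters only through the learned codes, intermediate moments can be rendered by interpolating between neighboring codes, which is one way to realize the motion interpolation the abstract mentions.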
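The temporal-variation importance sampling strategy can likewise be sketched. The version below is a hedged paraphrase of the idea (the weighting function and the floor term are our assumptions): pixels that change strongly between adjacent frames are sampled as training rays more often, while a small floor keeps static regions from being starved.

    # Sketch of temporal-variation importance sampling for training rays
    # (a paraphrase of the idea, not the authors' implementation).
    import torch

    def ray_sampling_probs(frames, floor=0.01):
        """frames: (T, H, W, 3) video tensor. Returns per-pixel probs."""
        # Temporal variation: mean absolute difference of adjacent frames.
        diff = (frames[1:] - frames[:-1]).abs().mean(dim=(0, 3))  # (H, W)
        weights = diff + floor  # floor so static rays still get trained
        return (weights / weights.sum()).flatten()

    frames = torch.rand(30, 64, 64, 3)  # toy video clip
    probs = ray_sampling_probs(frames)
    ray_ids = torch.multinomial(probs, num_samples=1024, replacement=True)

Biasing ray selection this way concentrates gradient updates on dynamic content, which is consistent with the faster convergence the abstract reports.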