We present 3DiM, a diffusion model for 3D novel view synthesis that translates
a single input view into consistent and sharp completions across
many views. The core component of 3DiM is a pose-conditional image-to-image
diffusion model, which takes a source view and its pose as inputs, and
generates a novel view for a target pose as output. 3DiM can generate multiple
views that are 3D consistent using a novel technique called stochastic
conditioning. The output views are generated autoregressively, and during the
generation of each novel view, a conditioning view is selected at random from
the set of available views at each denoising step. We demonstrate that
stochastic conditioning significantly improves 3D consistency over a naive
sampler for an image-to-image diffusion model that conditions on a single
fixed view.
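To make the sampler concrete, below is a minimal NumPy sketch of stochastic
conditioning. The `denoise_step` function and its linear blending schedule are
hypothetical stand-ins for one reverse-diffusion update of the pose-conditional
model; only the control flow, i.e. the autoregressive outer loop and the
conditioning view resampled at every denoising step, follows the description
above.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise_step(x_t, cond_view, cond_pose, target_pose, t, num_steps):
    # Hypothetical stand-in for one reverse-diffusion update of the
    # pose-conditional image-to-image model. A real implementation would
    # run the trained denoiser on (x_t, cond_view) with both poses and
    # the timestep; here we just blend toward the conditioning view so
    # the sketch runs end to end.
    alpha = 1.0 / (num_steps - t)
    return (1.0 - alpha) * x_t + alpha * cond_view

def sample_views(input_view, input_pose, target_poses, num_steps=256):
    views, poses = [input_view], [input_pose]    # pool of available views
    for target_pose in target_poses:             # autoregressive over poses
        x_t = rng.standard_normal(input_view.shape)  # start from pure noise
        for t in range(num_steps):
            # Stochastic conditioning: at every denoising step, pick the
            # conditioning view uniformly at random from all views
            # generated so far, instead of fixing a single input view.
            i = rng.integers(len(views))
            x_t = denoise_step(x_t, views[i], poses[i],
                               target_pose, t, num_steps)
        views.append(x_t)                        # finished view joins the pool
        poses.append(target_pose)
    return views[1:]
```

Because the conditioning view is redrawn at every step, each generated view is
denoised against many of the previously generated views, which is what
encourages the outputs to agree with one another.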
We compare 3DiM to prior work on the SRN ShapeNet dataset, demonstrating that
completions generated by 3DiM from a single view achieve much higher fidelity
while remaining approximately 3D consistent. We also introduce a new
evaluation methodology, 3D consistency scoring, which measures the 3D
consistency of a generated object by training a neural field on the model's
output views.
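The scoring procedure lends itself to a short sketch as well: the harness
below holds out a subset of the generated views, fits a field on the rest, and
reports mean PSNR on the held-out views, so that outputs explainable by a
single field score higher. Here `fit_neural_field` is a hypothetical stand-in
that memorizes the nearest training pose so the script runs; a real
implementation would train a NeRF-style neural field, and the exact split and
metric are assumptions rather than the paper's precise protocol.

```python
import numpy as np

def psnr(a, b):
    # Peak signal-to-noise ratio for images scaled to [0, 1].
    mse = float(np.mean((a - b) ** 2))
    return 10.0 * np.log10(1.0 / max(mse, 1e-10))

def fit_neural_field(train_views, train_poses):
    # Hypothetical stand-in for training a neural field (e.g. a NeRF-style
    # model) on the generated views. This dummy returns the view whose pose
    # is closest to the query pose, just so the sketch runs end to end.
    def render(pose):
        dists = [np.linalg.norm(pose - p) for p in train_poses]
        return train_views[int(np.argmin(dists))]
    return render

def consistency_score(views, poses, holdout_frac=0.25, seed=0):
    # Hold out some output views, fit the field on the rest, and measure
    # how well it re-renders the held-out views: a single field can only
    # explain all views well if the views are mutually 3D consistent.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(views))
    n_hold = max(1, int(holdout_frac * len(views)))
    hold, train = idx[:n_hold], idx[n_hold:]
    render = fit_neural_field([views[i] for i in train],
                              [poses[i] for i in train])
    return float(np.mean([psnr(render(poses[i]), views[i]) for i in hold]))
```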
3DiM is geometry-free, does not rely on hyper-networks or test-time
optimization for novel view synthesis, and allows a single model to easily
scale to a large number of scenes.