While current multi-frame restoration methods combine information from
multiple input images using 2D alignment techniques, recent advances in novel
view synthesis are paving the way for a new paradigm relying on volumetric
scene representations. In this work, we introduce the first 3D-based
multi-frame denoising method that significantly outperforms its 2D-based
counterparts while requiring less computation. Our method extends the
multiplane image (MPI) framework for novel view synthesis by introducing a
learnable encoder-renderer pair manipulating multiplane representations in
feature space. The encoder fuses information across views and operates in a
depth-wise manner, while the renderer fuses information across depths and
operates in a view-wise manner. The two modules are trained end-to-end and
learn to separate depths in an unsupervised way, giving rise to Multiplane
Feature (MPF) representations. Experiments on the Spaces and Real
Forward-Facing datasets as well as on raw burst data validate our approach for
view synthesis, multi-frame denoising, and view synthesis under noisy
conditions.
Comment: Accepted at CVPR 202
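
The division of labor between the encoder and the renderer described in the abstract can be illustrated with a minimal shape-level sketch. This is not the method itself: the learned modules are replaced by hand-written stand-ins (a mean over views for the encoder, a softmax-weighted sum over depths for the renderer), the warping of features onto depth planes is assumed to have been done already, and the names `encode_depthwise`, `render_viewwise`, and the projection vector `w` are hypothetical.

```python
import numpy as np

def encode_depthwise(psv):
    """Stand-in encoder (assumption: a plain mean, not a learned network).

    psv: array of shape (V, D, H, W, C) holding features from V input
    views, already warped onto D depth planes.
    Each depth plane is processed independently and fused across views.
    Returns multiplane features of shape (D, H, W, C).
    """
    return psv.mean(axis=0)

def render_viewwise(mpf_warped, w):
    """Stand-in renderer (assumption: softmax blending, not a learned network).

    mpf_warped: array of shape (D, H, W, C), multiplane features warped
    to a single target view.
    w: array of shape (C,), a hypothetical projection giving one score
    per depth plane and pixel.
    Fuses across depths for this one view and returns shape (H, W, C).
    """
    scores = mpf_warped @ w                              # (D, H, W)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    return (weights[..., None] * mpf_warped).sum(axis=0)

# Example: 3 views, 4 depth planes, 5x6 pixels, 2 feature channels.
psv = np.ones((3, 4, 5, 6, 2))
mpf = encode_depthwise(psv)
out = render_viewwise(mpf, np.array([0.5, -0.25]))
```

In this sketch the encoder never mixes information between depth planes and the renderer never mixes information between views, which mirrors the depth-wise and view-wise split stated above. Averaging across views is only a simple proxy for the fusion a trained encoder would perform.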