Temporal interpolation often plays a crucial role in learning meaningful
representations of dynamic scenes. In this paper, we propose a novel method to
train spatiotemporal neural radiance fields of dynamic scenes based on temporal
interpolation of feature vectors. We suggest two feature interpolation methods,
one for each of the two underlying representations: neural networks and grids.
In the neural representation, we extract features from space-time inputs via
multiple neural network modules and interpolate them based on time frames. The
proposed multi-level feature interpolation network effectively captures
features over both short-term and long-term time ranges.
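To make the interpolation concrete, below is a minimal PyTorch sketch of keyframe-based multi-level temporal feature interpolation. The module layout, feature dimensions, and keyframe counts per level are illustrative assumptions on our part, not the exact architecture of the paper.

```python
import torch
import torch.nn as nn

class TemporalFeatureInterp(nn.Module):
    """One small MLP per keyframe; features at time t are a linear blend
    of the outputs of the two nearest keyframe MLPs (illustrative sketch)."""
    def __init__(self, num_keyframes, in_dim=3, feat_dim=32):
        super().__init__()
        self.nets = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))
            for _ in range(num_keyframes)
        )
        self.num_keyframes = num_keyframes

    def forward(self, x, t):
        # x: (N, in_dim) points, t: (N,) times normalized to [0, 1]
        pos = t * (self.num_keyframes - 1)              # position on keyframe axis
        lo = pos.floor().long().clamp(max=self.num_keyframes - 2)
        w = (pos - lo.float()).unsqueeze(-1)            # blend weight in [0, 1]
        feats = torch.stack([net(x) for net in self.nets], dim=1)  # (N, K, F)
        rows = torch.arange(x.shape[0], device=x.device)
        return (1 - w) * feats[rows, lo] + w * feats[rows, lo + 1]

class MultiLevelInterp(nn.Module):
    """Coarse levels capture long-term changes, fine levels short-term detail."""
    def __init__(self, keyframes_per_level=(2, 5, 17)):
        super().__init__()
        self.levels = nn.ModuleList(
            TemporalFeatureInterp(k) for k in keyframes_per_level
        )

    def forward(self, x, t):
        # Concatenate interpolated features from every temporal level.
        return torch.cat([level(x, t) for level in self.levels], dim=-1)

feats = MultiLevelInterp()(torch.rand(1024, 3), torch.rand(1024))  # (1024, 96)
```

Evaluating every keyframe MLP per query, as above, keeps the sketch short but is wasteful; a practical implementation would run only the two neighboring modules for each sample.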
In the grid representation, space-time features are learned via
four-dimensional hash grids, which substantially reduces training time. The
grid representation trains more than 100 times faster than previous
neural-network-based methods while maintaining rendering quality.
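For the grid branch, the following is a minimal single-level sketch of a four-dimensional hash grid with quadrilinear interpolation, in the spirit of Instant-NGP-style hash encodings. The primes, table size, resolution, and feature width are illustrative assumptions; the paper's implementation may differ and would combine multiple resolution levels.

```python
import torch
import torch.nn as nn

PRIMES = (1, 2654435761, 805459861, 3674653429)  # common spatial-hash primes

class HashGrid4D(nn.Module):
    """Single-level 4D (x, y, z, t) hash grid with quadrilinear interpolation."""
    def __init__(self, resolution=64, table_size=2**19, feat_dim=2):
        super().__init__()
        self.res, self.table_size = resolution, table_size
        self.table = nn.Parameter(
            torch.empty(table_size, feat_dim).uniform_(-1e-4, 1e-4)
        )

    def hash(self, idx):
        # idx: (N, 4) integer corner coordinates -> (N,) table indices
        h = idx[..., 0] * PRIMES[0]
        for d in range(1, 4):
            h = h ^ (idx[..., d] * PRIMES[d])
        return h % self.table_size

    def forward(self, p):
        # p: (N, 4) normalized (x, y, z, t) coordinates in [0, 1]
        g = p * (self.res - 1)
        lo, w = g.floor().long(), g - g.floor()
        feat = 0.0
        for corner in range(16):  # blend the 16 corners of the 4D cell
            offs = torch.tensor([(corner >> d) & 1 for d in range(4)],
                                device=p.device)
            cw = torch.where(offs.bool(), w, 1 - w).prod(-1, keepdim=True)
            feat = feat + cw * self.table[self.hash(lo + offs)]
        return feat

feat = HashGrid4D()(torch.rand(1024, 4))  # (1024, 2)
```

In practice, features from several such levels at increasing resolutions are concatenated and decoded by a small MLP, which is where the large training-speed advantage over purely neural representations comes from.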
Concatenating static and dynamic features and adding a simple smoothness term
further improve the performance of our proposed models.
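The abstract does not specify the form of the smoothness term; one plausible form (an assumption on our part, not necessarily the paper's regularizer) penalizes the change of interpolated features across a small time offset:

```python
def temporal_smoothness(feature_fn, x, t, eps=1e-2):
    # feature_fn: any (x, t) -> features module, e.g. MultiLevelInterp above.
    # Hypothetical regularizer: squared feature difference between t and t + eps.
    return ((feature_fn(x, t) - feature_fn(x, (t + eps).clamp(max=1.0))) ** 2).mean()
```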
Despite the simplicity of the model architectures, our method achieves
state-of-the-art performance both in rendering quality for the neural
representation and in training speed for the grid representation.

Comment: CVPR 2023. Project page: https://sungheonpark.github.io/tempinterpner