In this paper, we tackle the problem of active robotic 3D reconstruction of
an object. In particular, we study how a mobile robot with an arm-held camera
can select a favorable number of views to recover an object's 3D shape
efficiently. Unlike existing solutions to this problem, we leverage the
popular neural radiance field-based object representation, which has recently
shown impressive results for various computer vision tasks. However, it is not
straightforward to directly reason about an object's explicit 3D geometric
details using such a representation, making the next-best-view selection
problem for dense 3D reconstruction challenging. This paper introduces a
ray-based volumetric uncertainty estimator, which computes the entropy of the
weight distribution of the color samples along each ray of the object's
implicit neural representation. We show that it is possible to infer the
uncertainty of the underlying 3D geometry given a novel view with the proposed
estimator. We then present a next-best-view selection policy guided by the
ray-based volumetric uncertainty in neural radiance fields-based
representations. Encouraging experimental results on synthetic and real-world
data suggest that the approach presented in this paper can enable a new
research direction of using an implicit 3D object representation for the
next-best-view problem in robot vision applications, distinguishing our
approach from the existing approaches that rely on explicit 3D geometric
modeling.

Comment: 8 pages, 9 figures; Accepted for publication at IEEE Robotics and
Automation Letters (RA-L) 202
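
As an illustration of the ray-based uncertainty idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the standard NeRF volume-rendering weights w_i = T_i (1 - exp(-sigma_i * delta_i)), normalizes them along each ray into a distribution, and takes the Shannon entropy. All function names, the treatment of empty rays, and the rule of scoring a candidate view by its mean ray entropy are assumptions made here for illustration; the paper's actual selection policy may differ.

```python
import math


def ray_weights(densities, deltas):
    """Volume-rendering weights for the samples along one ray.

    Assumes the standard NeRF formulation:
    alpha_i = 1 - exp(-sigma_i * delta_i), w_i = T_i * alpha_i,
    where T_i is the transmittance accumulated before sample i.
    """
    weights = []
    transmittance = 1.0
    for sigma, delta in zip(densities, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def ray_entropy(weights, eps=1e-10):
    """Shannon entropy (in nats) of the normalized weight distribution."""
    total = sum(weights)
    if total <= eps:
        # Assumption of this sketch: a ray with no mass is scored as 0.
        return 0.0
    entropy = 0.0
    for w in weights:
        p = w / total
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy


def view_uncertainty(rays):
    """Mean ray entropy over a view; `rays` is a list of (densities, deltas)."""
    return sum(ray_entropy(ray_weights(s, d)) for s, d in rays) / len(rays)


def next_best_view(candidates):
    """Pick the candidate view name with the highest mean ray entropy.

    `candidates` maps a hypothetical view name to its list of rays.
    """
    return max(candidates, key=lambda name: view_uncertainty(candidates[name]))


if __name__ == "__main__":
    deltas = [0.1] * 4
    peaked = ([0.0, 0.0, 50.0, 0.0], deltas)   # one confident surface hit
    diffuse = ([1.0, 1.0, 1.0, 1.0], deltas)   # mass spread along the ray
    views = {"view_a": [peaked], "view_b": [diffuse]}
    print(next_best_view(views))
```

In this sketch, a ray whose weights concentrate on a single sample has entropy near zero, which is read as a confidently located surface, while weights spread across many samples give high entropy, which is read as uncertain geometry. The candidate view with the highest score is therefore the one the current model is least sure about.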