We present a system for collision-free control of a robot manipulator that
uses only RGB views of the world. Perceptual input of a tabletop scene is
provided by multiple images from an RGB camera (without depth) that is either
handheld or mounted on the robot end effector. A NeRF-like process is used to
reconstruct the 3D geometry of the scene, from which the Euclidean signed
distance function (ESDF) is computed. A model predictive control algorithm is
then used to control the manipulator to reach a desired pose while avoiding
obstacles in the ESDF. We show results on a real dataset collected and
annotated in our lab.

Comment: ICRA 2023. Project page at https://ngp-mpc.github.io
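As a rough illustration of the pipeline's final stage, the sketch below shows how a precomputed ESDF could supply a collision penalty inside a sampling-based MPC step. Everything here is an assumption for illustration: the dense voxel-grid ESDF, the VoxelESDF class, the hinge-style collision_cost, and the random-shooting rollout are hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

class VoxelESDF:
    """Hypothetical dense voxel-grid ESDF (signed distance in meters),
    assumed to be exported by the NeRF-like reconstruction stage."""

    def __init__(self, grid, origin, voxel_size):
        self.grid = grid                      # (Nx, Ny, Nz) float array
        self.origin = np.asarray(origin)      # world coords of voxel (0, 0, 0)
        self.voxel_size = voxel_size          # edge length of one voxel

    def query(self, points):
        """Nearest-voxel lookup of signed distance at world points (N, 3)."""
        idx = np.round((points - self.origin) / self.voxel_size).astype(int)
        idx = np.clip(idx, 0, np.array(self.grid.shape) - 1)
        return self.grid[idx[:, 0], idx[:, 1], idx[:, 2]]

def collision_cost(esdf, body_points, margin=0.05):
    """Hinge penalty: nonzero only when a sampled point on the robot body
    comes within `margin` meters of the reconstructed surface."""
    d = esdf.query(body_points)
    return float(np.sum(np.maximum(margin - d, 0.0) ** 2))

def mpc_step(esdf, forward_kinematics, q, q_goal, n_samples=64, horizon=10):
    """One random-shooting MPC step: sample joint-velocity sequences,
    roll them out, and return the first action of the cheapest rollout."""
    rng = np.random.default_rng(0)
    best_cost, best_action = np.inf, np.zeros_like(q)
    for _ in range(n_samples):
        dq_seq = rng.normal(0.0, 0.05, size=(horizon, q.size))
        q_t, cost = q.copy(), 0.0
        for dq in dq_seq:
            q_t = q_t + dq
            # forward_kinematics maps joints -> (N, 3) body surface points
            cost += collision_cost(esdf, forward_kinematics(q_t))
        cost += np.linalg.norm(q_t - q_goal)  # terminal goal-reaching cost
        if cost < best_cost:
            best_cost, best_action = cost, dq_seq[0]
    return best_action
```

A real controller would additionally enforce joint limits, use an actual dynamics model, and query the ESDF with trilinear interpolation rather than nearest-voxel lookup; the sketch only conveys the shape of an ESDF-in-the-loop collision cost.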