We propose a method for editing NeRF scenes with text instructions. Given a
NeRF of a scene and the collection of images used to reconstruct it, our method
uses an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit
the input images while optimizing the underlying scene, resulting in an
optimized 3D scene that respects the edit instruction. We demonstrate that our
proposed method can edit large-scale, real-world scenes and accomplishes more
realistic, targeted edits than prior work.

Project website: https://instruct-nerf2nerf.github.io
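The high-level loop described above — render a training view, edit it with InstructPix2Pix under the text instruction, swap the edited image back into the training set, and continue optimizing the NeRF — can be sketched roughly as follows. This is a minimal illustration under assumptions, not the paper's exact procedure: the NeRF trainer interface (`render_view`, `camera`, `replace_image`, `train_steps`) is hypothetical, and the edit step uses the off-the-shelf diffusers InstructPix2Pix pipeline rather than the authors' conditioning scheme.

```python
# Sketch of an iterative "edit an image, then keep training" loop.
# The NeRF/dataset interface below is hypothetical; only the diffusers
# InstructPix2Pix pipeline call reflects a real public API.
import random
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

def edit_nerf(nerf, dataset, instruction, rounds=10, steps_per_round=1000):
    for _ in range(rounds):
        # Pick one training view and render it from the current NeRF
        # (assumed to return a PIL image here).
        idx = random.randrange(len(dataset))
        render = nerf.render_view(dataset.camera(idx))

        # Edit the render with InstructPix2Pix, conditioned on the instruction.
        edited = pipe(
            prompt=instruction,
            image=render,
            guidance_scale=7.5,        # text guidance weight
            image_guidance_scale=1.5,  # image guidance weight
        ).images[0]

        # Replace the training image and continue optimizing the scene,
        # so the 3D reconstruction gradually absorbs the 2D edits.
        dataset.replace_image(idx, edited)
        nerf.train_steps(dataset, steps_per_round)
    return nerf
```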