Compact and accurate representations of 3D shapes are central to many
perception and robotics tasks. State-of-the-art learning-based methods can
reconstruct single objects but scale poorly to large datasets. We present a
novel recursive implicit representation to efficiently and accurately encode
large datasets of complex 3D shapes by recursively traversing an implicit
octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD)
learns a hierarchically structured latent space enabling state-of-the-art
reconstruction results at a compression ratio above 99%. We also propose an
efficient curriculum learning scheme that naturally exploits the coarse-to-fine
properties of the underlying octree spatial representation. We explore the
scaling law relating latent space dimension, dataset size, and reconstruction
accuracy, showing that increasing the latent space dimension is enough to scale
to large shape datasets. Finally, we show that our learned latent space encodes
a coarse-to-fine hierarchical structure yielding reusable latents across
different levels of detail, and we provide qualitative evidence of
generalization to novel shapes outside the training set.

Comment: Accepted to Conference on Robot Learning (CoRL), 202