The development of algorithms that learn behavioral driving models using
human demonstrations has led to increasingly realistic simulations. In general,
such models learn to jointly predict trajectories for all controlled agents by
exploiting road context information such as drivable lanes obtained from
manually annotated high-definition (HD) maps. Recent studies show that these
models can greatly benefit from increasing the amount of human data available
for training. However, the manual annotation of HD maps, which is required for
every new location, bottlenecks the efficient scaling of human traffic
datasets. We propose a drone birdview image-based map (DBM) representation that
requires minimal annotation and provides rich road context information. We
evaluate multi-agent trajectory prediction using the DBM by incorporating it
into a differentiable driving simulator as an image-texture-based
differentiable rendering module. Our results demonstrate competitive
multi-agent trajectory prediction performance when using our DBM representation
as compared to models trained with rasterized HD maps
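The abstract does not detail the rendering module, but the core operation that makes an image-based map differentiable with respect to agent positions is continuous texture sampling. The following is a minimal illustrative sketch (not the paper's implementation; all names are hypothetical) of bilinear sampling from a birdview texture, whose output varies smoothly with the query position so gradients are well defined almost everywhere:

```python
# Minimal sketch (assumed, not the paper's code): bilinear sampling of a
# single-channel birdview texture at a continuous (x, y) query point.

def bilinear_sample(texture, x, y):
    """Interpolate a 2D texture at continuous coordinates.

    texture: H x W list of rows of float intensities.
    (x, y): continuous position; x indexes columns, y indexes rows.
    Returns a value that changes smoothly with (x, y), which is what
    allows gradients to flow from rendered map context to agent poses.
    """
    h, w = len(texture), len(texture[0])
    x0, y0 = int(x), int(y)                       # top-left integer corner
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0                       # fractional offsets in [0, 1)
    top = (1 - fx) * texture[y0][x0] + fx * texture[y0][x1]
    bot = (1 - fx) * texture[y1][x0] + fx * texture[y1][x1]
    return (1 - fy) * top + fy * bot

# Toy 2x2 "drivable-area" texture: 1.0 = on road, 0.0 = off road.
tex = [[0.0, 1.0],
       [0.0, 1.0]]
print(bilinear_sample(tex, 0.5, 0.5))  # halfway between off-road and road: 0.5
```

In a full simulator this lookup would be batched over all agents and channels (e.g. with `grid_sample`-style operators in an autodiff framework), but the piecewise-linear interpolation above is the essential differentiable primitive.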