Optical flow estimation is very challenging in the presence of transparent or
occluded objects. In this work, we address these challenges at the task level
by introducing Amodal Optical Flow, which integrates optical flow with amodal
perception. Instead of only representing the visible regions, we define amodal
optical flow as a multi-layered pixel-level motion field that encompasses both
visible and occluded regions of the scene. To facilitate research on this new
task, we extend the AmodalSynthDrive dataset to include pixel-level labels for
amodal optical flow estimation. We present several strong baselines, along with
the Amodal Flow Quality metric to quantify performance in an interpretable
manner. Furthermore, we propose the novel AmodalFlowNet as an initial step
toward addressing this task. AmodalFlowNet consists of a transformer-based
cost-volume encoder paired with a recurrent transformer decoder that
facilitates recurrent hierarchical feature propagation and amodal semantic
grounding. We demonstrate the tractability of amodal optical flow in extensive
experiments and show its utility for downstream tasks such as panoptic
tracking. We make the dataset, code, and trained models publicly available at
http://amodal-flow.cs.uni-freiburg.de.