Monocular Depth Estimation (MDE) is a fundamental problem in computer vision
with numerous applications. Recently, LiDAR-supervised methods have achieved
remarkable per-pixel depth accuracy in outdoor scenes. However, significant
errors are typically found in the proximity of depth discontinuities, i.e.,
depth edges, which often hinder the performance of depth-dependent applications
that are sensitive to such inaccuracies, e.g., novel view synthesis and
augmented reality. Since direct supervision for the location of depth edges is
typically unavailable in sparse LiDAR-based scenes, encouraging the MDE model
to produce correct depth edges is not straightforward. In this work we propose
to learn to detect the location of depth edges from densely-supervised
synthetic data, and use it to generate supervision for the depth edges in the
MDE training. Despite the ``domain gap'' between synthetic and real data, we
show that depth edges estimated directly in this manner are significantly more
accurate than those that emerge indirectly from the MDE training. To
quantitatively evaluate our approach, and to address the lack of depth-edge
ground truth in LiDAR-based scenes, we manually annotated subsets of the KITTI
and DDAD datasets with depth-edge ground truth. We demonstrate significant gains
in depth-edge accuracy, while maintaining comparable per-pixel depth accuracy,
on several challenging datasets.