Most previous work on outdoor instance segmentation for images uses only
color information. We explore a novel direction of sensor fusion to exploit
stereo cameras. Geometric information from disparities helps separate
overlapping objects of the same or different classes. Moreover, geometric
information penalizes region proposals with unlikely 3D shapes, thus suppressing
false positive detections. Mask regression is based on 2D, 2.5D, and 3D ROIs
using pseudo-lidar and image-based representations. These mask predictions
are fused by a mask scoring process. However, public datasets only adopt stereo
systems with shorter baselines and focal lengths, which limits the measuring
range of stereo cameras. We collect and utilize the High-Quality Driving Stereo
(HQDS) dataset, using a much longer baseline and focal length with higher resolution.
Our method attains state-of-the-art performance. Please refer to our project page. The
full paper is available here.
Comment: CVPR 2020 Workshop on Scalability in Autonomous Driving (WSAD).
Please refer to the WSAD site for details.
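
The link between baseline, focal length, and measuring range mentioned in the abstract can be illustrated with the standard pinhole stereo relation Z = f * B / d. The following is a minimal sketch under that textbook model, not the authors' implementation; the function names, the principal point arguments (cx, cy), and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth from disparity under the pinhole stereo model: Z = f * B / d.

    Non-positive disparities carry no depth information and map to infinity.
    """
    if disparity_px <= 0:
        return math.inf
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_err_px=1.0):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd.

    A longer baseline or focal length shrinks the error at a given depth,
    which is why such a rig can measure farther.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


def pseudo_lidar(disparity_map, focal_px, baseline_m, cx, cy):
    """Back-project a disparity map (list of rows) to 3D camera-frame points.

    Assumes a rectified pair with square pixels; (cx, cy) is the assumed
    principal point in pixels. Pixels without valid disparity are skipped.
    """
    points = []
    for v, row in enumerate(disparity_map):
        for u, d in enumerate(row):
            z = disparity_to_depth(d, focal_px, baseline_m)
            if math.isinf(z):
                continue
            x = (u - cx) * z / focal_px
            y = (v - cy) * z / focal_px
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Hypothetical rigs: doubling both focal length and baseline.
    short_rig = depth_error(50.0, focal_px=1000.0, baseline_m=0.5)
    long_rig = depth_error(50.0, focal_px=2000.0, baseline_m=1.0)
    print(short_rig, long_rig)
```

Under these assumed numbers, doubling both the focal length and the baseline reduces the first-order depth error at a fixed distance by a factor of four, which is the sense in which a longer baseline and focal length extend the usable range of a stereo camera.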