Multi-camera systems are an important sensor platform for intelligent systems
such as self-driving cars. Pattern-based calibration techniques can be used to
calibrate the intrinsics of the cameras individually. However, extrinsic
calibration of systems with little to no visual overlap between the cameras is
a challenge. Given the camera intrinsics, infrastructure-based calibration
techniques are able to estimate the extrinsics using 3D maps pre-built via SLAM
or Structure-from-Motion. In this paper, we propose to fully calibrate a
multi-camera system from scratch using an infrastructure-based approach.
Assuming that the distortion is mainly radial, we introduce a two-stage
approach. We first estimate the camera-rig extrinsics up to a single unknown
translation component per camera. Next, we solve for both the intrinsic
parameters and the missing translation components. Extensive experiments on
multiple indoor and outdoor scenes with several multi-camera systems show that
our calibration method achieves high accuracy and robustness. In particular,
our approach is more robust than the naive approach of first estimating
intrinsic parameters and pose per camera before refining the extrinsic
parameters of the system. The implementation is available at
https://github.com/youkely/InfrasCal.
Comment: ECCV 202
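
One way to see why a radial-distortion assumption leaves exactly one translation component per camera undetermined is sketched below. This is an illustration under stated assumptions, not the released implementation. It assumes the principal point is at the image origin and uses a hypothetical one-parameter distortion model. Under purely radial distortion, an image point is a scaled copy of the first two camera-frame coordinates of the 3D point, so the 2D cross product between them vanishes whatever the focal length, the distortion coefficient, or the forward translation t_z. All function names are invented for this example.

```python
import numpy as np

def rotation_from_axis_angle(axis, angle):
    """Rodrigues' formula: rotation matrix for a rotation of `angle` about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def project_radial(X, R, t, f, k1):
    """Pinhole projection with one radial distortion term (assumed model).

    The principal point is assumed to be at the image origin.
    """
    Xc = X @ R.T + t                      # points in the camera frame
    xn = Xc[:, :2] / Xc[:, 2:3]           # normalized image coordinates
    r2 = np.sum(xn ** 2, axis=1, keepdims=True)
    return f * (1.0 + k1 * r2) * xn       # radial scaling only

def radial_residual(x, X, R, t_xy):
    """2D cross product of each image point with (R[:2] X + t_xy).

    Uses only the first two rows of R and the first two translation
    components; f, k1 and t_z do not appear.
    """
    d = X @ R[:2].T + t_xy
    return x[:, 0] * d[:, 1] - x[:, 1] * d[:, 0]

# Synthetic check: the residual stays near zero as f, k1 and t_z vary.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 3)) + np.array([0.0, 0.0, 5.0])
R = rotation_from_axis_angle([0.3, 1.0, 0.2], 0.2)
for f, k1, tz in [(500.0, 0.0, 0.0), (800.0, -0.1, 1.0), (1200.0, 0.05, 2.0)]:
    t = np.array([0.1, -0.2, tz])
    x = project_radial(X, R, t, f, k1)
    print(f, k1, tz, np.max(np.abs(radial_residual(x, X, R, t[:2]))))
```

Because the constraint involves only the rotation and two translation components, a pose estimated from it alone is blind to t_z. This matches the abstract's description of a second stage that solves for the intrinsics together with the missing translation components.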