We propose an accurate and robust initialization approach for stereo
visual-inertial SLAM systems. The current state-of-the-art method relies
heavily on the accuracy of a pure visual SLAM system to estimate inertial
variables and does not update camera poses, which can compromise accuracy and
robustness. We observe that precise gyroscope bias estimation is crucial to
rotation accuracy, which in turn affects trajectory accuracy through the
accumulation of translation errors. To address this, we first independently
estimate the
gyroscope bias and use it to formulate a maximum a posteriori problem for
further refinement. We then update the rotation estimates by integrating the
gyroscope measurements with the refined bias removed, and leverage the
resulting robust and accurate rotation estimates to improve translation
estimation via 3-DoF bundle adjustment. Moreover, we
introduce a novel approach for determining the success of the initialization by
evaluating the residual of the normal epipolar constraint. Extensive
evaluations on the EuRoC dataset show that our method is both accurate
and robust. It outperforms ORB-SLAM3, the current leading stereo
visual-inertial initialization method, in terms of absolute trajectory error
and relative rotation error, while maintaining competitive computational speed.
Notably, even when initialized with only 5 keyframes, our method consistently
surpasses the state-of-the-art approach using 10 keyframes in rotation
accuracy.
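
The abstract gives no implementation details, so the following is only a minimal sketch of two ingredients it mentions: integrating gyroscope measurements with the bias removed, and evaluating a normal epipolar constraint residual. It assumes a constant bias over the integration window, simple first-order integration on SO(3), and the common formulation in which each epipolar-plane normal f0 x (R f1) should be orthogonal to the translation direction. All function names (`so3_exp`, `integrate_rotation`, `nec_residuals`) and the frame convention are illustrative assumptions, not the authors' code.

```python
import math

def mat_mul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def so3_exp(phi):
    """Rodrigues' formula: rotation vector -> 3x3 rotation matrix."""
    theta = math.sqrt(sum(p * p for p in phi))
    K = [[0.0, -phi[2], phi[1]],
         [phi[2], 0.0, -phi[0]],
         [-phi[1], phi[0], 0.0]]
    if theta < 1e-8:
        a, b = 1.0, 0.5  # Taylor limits of the coefficients for tiny angles
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    K2 = mat_mul(K, K)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def integrate_rotation(gyro, dt, gyro_bias):
    """Relative rotation from gyro samples (rad/s), with an assumed
    constant bias subtracted from every sample before integration."""
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for w in gyro:
        step = [(w[i] - gyro_bias[i]) * dt for i in range(3)]
        R = mat_mul(R, so3_exp(step))
    return R

def nec_residuals(f0, f1, R01, t01):
    """Residuals t^T (f0_i x R01 f1_i) for matched unit bearing vectors.
    Assumed convention: p0 = R01 p1 + t01. Only the direction of t01
    is used; values near zero indicate a consistent relative pose."""
    norm_t = math.sqrt(sum(c * c for c in t01))
    t_hat = [c / norm_t for c in t01]
    residuals = []
    for a, b in zip(f0, f1):
        n = cross(a, mat_vec(R01, b))  # normal of the epipolar plane
        residuals.append(sum(n[i] * t_hat[i] for i in range(3)))
    return residuals
```

In such a sketch, an initialization check could compare a summary of these residuals (for example their mean absolute value) against a threshold; the abstract does not state the criterion or threshold actually used.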