DFVS: Deep Flow Guided Scene Agnostic Image Based Visual Servoing
Existing deep learning based visual servoing approaches regress the relative
camera pose between a pair of images. Therefore, they require a huge amount of
training data and sometimes fine-tuning for adaptation to a novel scene.
Furthermore, current approaches do not consider underlying geometry of the
scene and rely on direct estimation of camera pose. Thus, inaccuracies in
prediction of the camera pose, especially for distant goals, lead to a
degradation in the servoing performance. In this paper, we propose a two-fold
solution: (i) We consider optical flow as our visual features, which are
predicted using a deep neural network. (ii) These flow features are then
systematically integrated with depth estimates provided by another neural
network using the interaction matrix. We further present an extensive benchmark in
a photo-realistic 3D simulation across diverse scenes to study the convergence
and generalisation of visual servoing approaches. We show convergence for over
3m and 40 degrees while maintaining precise positioning of under 2cm and 1
degree on our challenging benchmark, where existing approaches are
unable to converge in the majority of scenarios beyond 1.5m and 20 degrees.
Furthermore, we also evaluate our approach for a real scenario on an aerial
robot. Our approach generalizes to novel scenarios producing precise and robust
servoing performance for 6 degrees of freedom positioning tasks, even with large
camera transformations without any retraining or fine-tuning.
Comment: Accepted in International Conference on Robotics and Automation
(ICRA) 2020, IEE
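
The abstract's step (ii), combining flow features with depth estimates through the interaction matrix, can be illustrated with the classical image based visual servoing control law. The sketch below is not the authors' implementation: it uses the textbook point-feature interaction matrix, and it assumes that the flow is the displacement from the current image to the goal image in normalised image coordinates, so that the feature error is e = s - s* = -flow. The function names `interaction_matrix` and `servo_velocity` and the `gain` parameter are hypothetical, and plain arrays stand in for the outputs of the flow and depth networks.

```python
import numpy as np


def interaction_matrix(x, y, Z):
    """Stack the classical 2x6 point-feature interaction matrix for N points.

    x, y: normalised image coordinates; Z: per-point depth (assumed known).
    Rows are interleaved as [x0, y0, x1, y1, ...].
    """
    x, y, Z = (np.asarray(a, dtype=float).ravel() for a in (x, y, Z))
    zeros = np.zeros_like(x)
    L = np.empty((2 * x.size, 6))
    L[0::2] = np.stack([-1 / Z, zeros, x / Z, x * y, -(1 + x**2), y], axis=1)
    L[1::2] = np.stack([zeros, -1 / Z, y / Z, 1 + y**2, -x * y, -x], axis=1)
    return L


def servo_velocity(x, y, Z, flow, gain=1.0):
    """Return a 6-DoF camera velocity (vx, vy, vz, wx, wy, wz).

    flow: (N, 2) array, assumed to be the current-to-goal displacement of each
    point in normalised coordinates, so the feature error is e = -flow.
    """
    L = interaction_matrix(x, y, Z)
    error = -np.asarray(flow, dtype=float).reshape(-1)
    # Classical IBVS law: v = -gain * pinv(L) @ e
    return -gain * np.linalg.pinv(L) @ error
```

In this form every pixel with a flow vector and a depth value contributes two rows to the stacked matrix, and the pseudo-inverse yields a least-squares velocity, so errors in individual flow or depth predictions are averaged over the image rather than tied to a few tracked keypoints.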