We present a deep neural network-based method to perform high-precision,
robust and real-time 6 DOF visual servoing. The paper describes how to create a
dataset simulating various perturbations (occlusions and lighting conditions)
from a single real-world image of the scene. A convolutional neural network is
fine-tuned using this dataset to estimate the relative pose between two images
of the same scene. The output of the network is then employed in a visual
servoing control scheme. The method converges robustly even in difficult
real-world settings with strong lighting variations and occlusions. A
positioning error of less than one millimeter is obtained in experiments with a
6 DOF robot.
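
As an illustration of how an estimated relative pose can drive a visual servoing loop, the following is a minimal sketch of a classical pose-based proportional control law. It is not taken from the paper: the function names, the gain value, and the pose convention (translation `t` and rotation `R` of the current camera frame expressed in the desired camera frame) are assumptions made for this example, and the network's pose estimate is replaced by a hand-written pose.

```python
import numpy as np


def rotation_to_axis_angle(R):
    """Convert a 3x3 rotation matrix to the axis-angle vector theta * u.

    Valid for rotation angles away from pi, which is enough for a sketch.
    """
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-9:
        # No rotation: the axis is undefined, the error is zero.
        return np.zeros(3)
    axis = np.array([
        R[2, 1] - R[1, 2],
        R[0, 2] - R[2, 0],
        R[1, 0] - R[0, 1],
    ]) / (2.0 * np.sin(theta))
    return theta * axis


def pbvs_velocity(t, R, gain=0.5):
    """Hypothetical helper: 6 DOF camera velocity from a relative pose.

    t    : translation of the current camera frame, expressed in the
           desired camera frame (assumed convention).
    R    : rotation of the current camera frame relative to the desired one.
    gain : proportional gain (assumed value, to be tuned on a real robot).

    Returns [vx, vy, vz, wx, wy, wz] in the current camera frame.
    """
    v = -gain * (R.T @ t)                     # translational velocity
    w = -gain * rotation_to_axis_angle(R)     # rotational velocity
    return np.concatenate([v, w])


# Example: camera is 10 cm off along x and rotated 0.2 rad about z.
c, s = np.cos(0.2), np.sin(0.2)
R_example = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_example = np.array([0.10, 0.0, 0.0])
velocity = pbvs_velocity(t_example, R_example)
print(velocity)
```

In a closed loop, the pose would be re-estimated from each new image and the velocity command recomputed, so the error decays toward zero as the camera approaches the desired pose.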