Sim2Real View Invariant Visual Servoing by Recurrent Control
Humans are remarkably proficient at controlling their limbs and tools from a
wide range of viewpoints and angles, even in the presence of optical
distortions. In robotics, this ability is referred to as visual servoing:
moving a tool or end-point to a desired location using primarily visual
feedback. In this paper, we study how viewpoint-invariant visual servoing
skills can be learned automatically in a robotic manipulation scenario. To this
end, we train a deep recurrent controller that can automatically determine
which actions move the end-point of a robotic arm to a desired object. The
problem that must be solved by this controller is fundamentally ambiguous:
under severe variation in viewpoint, it may be impossible to determine the
actions in a single feedforward operation. Instead, our visual servoing system
must use its memory of past movements to understand how the actions affect the
robot motion from the current viewpoint, correcting mistakes and gradually
moving closer to the target. This ability is in stark contrast to most visual
servoing methods, which either assume known dynamics or require a calibration
phase. We show how we can learn this recurrent controller using simulated data
and a reinforcement learning objective. We then describe how the resulting
model can be transferred to a real-world robot by disentangling perception from
control and only adapting the visual layers. The adapted model can servo to
previously unseen objects from novel viewpoints on a real-world Kuka IIWA
robotic arm. For supplementary videos, see:
https://fsadeghi.github.io/Sim2RealViewInvariantServo
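The controller described above pairs convolutional perception with a recurrent core so that memory of past actions can disambiguate how commands map to motion under an unknown viewpoint. Below is a minimal PyTorch sketch of such an architecture; the class name, layer sizes, LSTM core, and action-scoring head are illustrative assumptions, not the authors' exact network. The sim-to-real adaptation mentioned in the abstract would touch only the convolutional encoder.

```python
import torch
import torch.nn as nn

class RecurrentServoPolicy(nn.Module):
    """Sketch of a viewpoint-invariant servoing policy: a CNN encodes the
    current image, an LSTM integrates observations and past actions over
    time, and a small head scores candidate end-effector actions."""

    def __init__(self, action_dim=3, hidden_dim=256):
        super().__init__()
        # Convolutional perception layers (the part one would adapt when
        # transferring from simulation to the real robot).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.proj = nn.LazyLinear(hidden_dim)
        # Recurrent core: memory of past movements disambiguates how
        # actions affect robot motion from the current (unknown) viewpoint.
        self.rnn = nn.LSTMCell(hidden_dim + action_dim, hidden_dim)
        # Head scoring how promising a candidate action is at this step.
        self.action_value = nn.Linear(hidden_dim + action_dim, 1)

    def step(self, image, prev_action, state=None):
        """Consume one observation and the previously executed action."""
        feat = self.proj(self.encoder(image))
        h, c = self.rnn(torch.cat([feat, prev_action], dim=-1), state)
        return (h, c)

    def score(self, state, candidate_action):
        """Score a candidate end-effector action given the recurrent state."""
        h, _ = state
        return self.action_value(torch.cat([h, candidate_action], dim=-1))
```

In use, such a policy would be rolled forward over an episode, scoring a set of candidate end-effector displacements at each step and executing the highest-scoring one.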
DFVS: Deep Flow Guided Scene Agnostic Image Based Visual Servoing
Existing deep learning based visual servoing approaches regress the relative
camera pose between a pair of images. Therefore, they require a huge amount of
training data and sometimes fine-tuning for adaptation to a novel scene.
Furthermore, current approaches do not consider the underlying geometry of the
scene and rely on direct estimation of the camera pose. Consequently, inaccuracies
in the predicted pose, especially for distant goals, degrade the servoing
performance. In this paper, we propose a two-fold
solution: (i) We consider optical flow as our visual features, which are
predicted using a deep neural network. (ii) These flow features are then
systematically integrated with depth estimates provided by another neural
network using an interaction matrix. We further present an extensive benchmark in
a photo-realistic 3D simulation across diverse scenes to study the convergence
and generalisation of visual servoing approaches. On this challenging benchmark,
our approach converges for initial offsets of over 3 m and 40 degrees while
maintaining positioning precision of under 2 cm and 1 degree, whereas existing
approaches fail to converge in the majority of scenarios beyond 1.5 m and 20
degrees.
Furthermore, we evaluate our approach in a real scenario on an aerial robot. Our
approach generalizes to novel scenarios, producing precise and robust servoing
performance for 6-degrees-of-freedom positioning tasks even with large camera
transformations, without any retraining or fine-tuning.
Comment: Accepted at the International Conference on Robotics and Automation
(ICRA) 2020, IEEE
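The control law implied by point (ii) is the classical image-based visual servoing update, with predicted optical flow acting as the feature error and network-estimated depth filling the interaction matrix. A minimal NumPy sketch is shown below; the point sampling, gain value, and function names are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction matrix of one normalized image point (x, y) at depth Z,
    mapping the 6-DoF camera velocity (vx, vy, vz, wx, wy, wz) to image motion."""
    return np.array([
        [-1.0 / Z, 0.0,      x / Z, x * y,       -(1.0 + x * x),  y],
        [0.0,      -1.0 / Z, y / Z, 1.0 + y * y, -x * y,         -x],
    ])

def servo_velocity(points, error, depth, gain=0.5):
    """Camera velocity command from flow-based feature errors.

    points: (N, 2) normalized image coordinates of sampled points
    error:  (N, 2) per-point feature error s - s*, obtained here from the
            optical flow predicted between the current and desired images
    depth:  (N,)   per-point depth estimates from the depth network
    """
    # Stack the per-point 2x6 interaction matrices into a 2N x 6 matrix.
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(points, depth)])
    # Classical IBVS law: v = -lambda * L^+ * e.
    return -gain * np.linalg.pinv(L) @ error.reshape(-1)
```

A typical loop would sample points on a grid, query the flow and depth networks at those points, and apply the returned 6-vector as the camera velocity until the error norm falls below a threshold.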
Deep Drone Racing: From Simulation to Reality with Domain Randomization
Dynamically changing environments, unreliable state estimation, and operation
under severe resource constraints are fundamental challenges that limit the
deployment of small autonomous drones. We address these challenges in the
context of autonomous, vision-based drone racing in dynamic environments. A
racing drone must traverse a track with possibly moving gates at high speed. We
enable this functionality by combining the performance of a state-of-the-art
planning and control system with the perceptual awareness of a convolutional
neural network (CNN). The resulting modular system is both platform- and
domain-independent: it is trained in simulation and deployed on a physical
quadrotor without any fine-tuning. The abundance of simulated data, generated
via domain randomization, makes our system robust to changes of illumination
and gate appearance. To the best of our knowledge, our approach is the first to
demonstrate zero-shot sim-to-real transfer on the task of agile drone flight.
We extensively test the precision and robustness of our system, both in
simulation and on a physical platform, and show significant improvements over
the state of the art.
Comment: Accepted as a Regular Paper to the IEEE Transactions on Robotics
Journal. arXiv admin note: substantial text overlap with arXiv:1806.0854
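The robustness to illumination and gate-appearance changes comes from domain randomization: each simulated training episode re-samples visual properties of the scene so the perception CNN cannot overfit to a single rendering. The sketch below illustrates the idea; the parameter names, ranges, and texture choices are hypothetical, not the paper's configuration.

```python
import random
from dataclasses import dataclass

@dataclass
class SceneRandomization:
    """Per-episode scene parameters sampled for domain randomization."""
    light_intensity: float      # overall illumination level
    light_direction: tuple      # direction of the dominant light source
    gate_texture: str           # appearance of the racing gates
    background_texture: str     # floor and wall textures
    gate_jitter_m: float        # random displacement applied to gate poses

def sample_randomization(rng=random):
    """Draw one randomized scene configuration for a simulated training episode."""
    return SceneRandomization(
        light_intensity=rng.uniform(0.3, 1.5),
        light_direction=(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(0.2, 1)),
        gate_texture=rng.choice(["orange", "checkerboard", "wood", "metal"]),
        background_texture=rng.choice(["asphalt", "grass", "carpet", "concrete"]),
        gate_jitter_m=rng.uniform(0.0, 1.0),
    )
```

A new configuration would typically be drawn at the start of each simulated rollout, before rendering the training images for the CNN.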