Towards Monocular Vision based Obstacle Avoidance through Deep Reinforcement Learning
Obstacle avoidance is a fundamental requirement for autonomous robots that operate in, and interact with, the real world. When perception is limited to monocular vision, avoiding collisions becomes significantly more challenging due to the lack of 3D information. Conventional path planners for obstacle avoidance require tuning a number of parameters and cannot directly benefit from large datasets and continuous use. In this paper, a dueling-architecture-based deep double-Q network (D3QN) is proposed for obstacle avoidance, using only monocular RGB vision. Based on the dueling and double-Q mechanisms, D3QN can efficiently learn how to avoid obstacles in a simulator even with very noisy depth information predicted from RGB images. Extensive experiments show that D3QN achieves a twofold speed-up in learning compared with a standard deep Q-network, and that models trained solely in virtual environments can be transferred directly to real robots, generalizing well to various new environments with previously unseen dynamic objects.
Comment: Accepted by RSS 2017 workshop New Frontiers for Deep Learning in Robotics
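The dueling architecture and double-Q mechanisms mentioned above are standard DQN extensions rather than specifics of this paper. As a minimal sketch (not the authors' code), assuming illustrative layer sizes and a small discrete action set, both mechanisms look as follows in PyTorch:

    import torch
    import torch.nn as nn

    class DuelingQNet(nn.Module):
        """Dueling head: Q(s, a) is composed from V(s) and A(s, a)."""
        def __init__(self, in_dim=512, n_actions=4):
            super().__init__()
            self.feature = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
            self.value = nn.Linear(256, 1)              # state value V(s)
            self.advantage = nn.Linear(256, n_actions)  # advantage A(s, a)

        def forward(self, x):
            h = self.feature(x)
            v, a = self.value(h), self.advantage(h)
            # Subtracting the mean advantage keeps V and A identifiable.
            return v + a - a.mean(dim=1, keepdim=True)

    def double_q_target(online, target, reward, next_obs, done, gamma=0.99):
        # Double-Q: the online net selects the next action, the target net
        # evaluates it, reducing the over-estimation bias of vanilla DQN.
        with torch.no_grad():
            best_a = online(next_obs).argmax(dim=1, keepdim=True)
            next_q = target(next_obs).gather(1, best_a).squeeze(1)
            return reward + gamma * (1.0 - done) * next_q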
Increasing the Efficiency of 6-DoF Visual Localization Using Multi-Modal Sensory Data
Localization is a key requirement for mobile robot autonomy and human-robot interaction. Vision-based localization is accurate and flexible; however, it incurs a high computational burden, which limits its application on many resource-constrained platforms. In this paper, we address the problem of performing real-time localization in large-scale 3D point cloud maps of ever-growing size. While most systems using multi-modal information reduce localization time by employing side-channel information in a coarse manner (e.g. WiFi for a rough prior position estimate), we propose to inter-weave the map with rich sensory data. This multi-modal approach achieves two key goals simultaneously. First, it enables us to harness additional sensory data to localize in real time against a map covering a vast area; second, it allows us to roughly localize devices that are not equipped with a camera. The key to our approach is a localization policy based on a sequential Monte Carlo estimator. The localizer uses this policy to attempt point-matching only in nodes where it is likely to succeed, significantly increasing the efficiency of the localization process. The proposed multi-modal localization system is evaluated extensively in a large museum building. The results show that our multi-modal approach not only increases localization accuracy but also significantly reduces computation time.
Comment: Presented at IEEE-RAS International Conference on Humanoid Robots (Humanoids) 201
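To make the gated point-matching policy concrete, here is a minimal sketch of a sequential Monte Carlo (particle filter) estimator that attempts the expensive matching step only where probability mass has accumulated. The interfaces `map_nodes`, `side_channel_likelihood` and `visual_match` are hypothetical names invented for illustration, not the paper's API:

    import numpy as np

    rng = np.random.default_rng(0)

    def localize_step(particles, weights, map_nodes,
                      side_channel_likelihood, visual_match,
                      match_threshold=0.3, motion_noise=0.5):
        # 1. Motion update: diffuse particles with a random-walk model.
        particles = particles + rng.normal(0.0, motion_noise, particles.shape)

        # 2. Cheap update: weight particles by side-channel data (e.g. WiFi).
        weights = weights * side_channel_likelihood(particles)
        weights = weights / weights.sum()

        # 3. Gate the expensive step: attempt visual point-matching only in
        #    map nodes holding enough probability mass to likely succeed.
        for node in map_nodes:
            mass = weights[node.contains(particles)].sum()
            if mass > match_threshold:
                pose = visual_match(node)   # expensive 2D-3D matching
                if pose is not None:
                    return pose, particles, weights

        # 4. Resample to avoid degeneracy, then report the mean estimate.
        n = len(particles)
        idx = rng.choice(n, size=n, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)
        return weights @ particles, particles, weights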
Robot-assisted discovery of evacuation routes in emergency scenarios
When an emergency occurs within a building, it is crucial to guide victims towards emergency exits, or human responders towards the locations of victims and hazards. The objective of this work is thus to devise distributed algorithms that allow agents to dynamically discover and maintain short evacuation routes connecting emergency exits to critical cells in the area. We propose two Evacuation Route Discovery mechanisms, Agent2Tag-ERD and Tag2Tag-ERD, and show how they can be seamlessly integrated with existing exploration algorithms such as Ants, MDFS and Brick&Mortar. We then examine the interplay between the tasks of area exploration and evacuation route discovery; our goal is to assess whether the exploration algorithm influences the length of evacuation paths and the time at which they are first discovered. Finally, we perform extensive simulations to assess the impact of the area topology on the quality of the discovered evacuation paths.
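Agent2Tag-ERD and Tag2Tag-ERD are distributed mechanisms; purely to illustrate the underlying idea of tag-based route discovery, the sketch below computes hop-distance tags to the nearest exit centrally and recovers a route by greedy descent on them. All names here are illustrative assumptions, not the paper's algorithms:

    from collections import deque

    def discover_routes(grid, exits):
        """grid: set of free (x, y) cells; exits: iterable of exit cells.
        Returns a dict mapping each reachable cell to its hop distance to
        the nearest exit (the 'tags'), computed by breadth-first search."""
        dist = {e: 0 for e in exits}
        frontier = deque(exits)
        while frontier:
            x, y = frontier.popleft()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in grid and nb not in dist:
                    dist[nb] = dist[(x, y)] + 1
                    frontier.append(nb)
        return dist

    def evacuation_path(dist, start):
        # Greedy descent on the tag values reconstructs a shortest route.
        path = [start]
        while dist[path[-1]] > 0:
            x, y = path[-1]
            path.append(min(((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)),
                            key=lambda c: dist.get(c, float('inf'))))
        return path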
Learning with Training Wheels: Speeding up Training with a Simple Controller for Deep Reinforcement Learning
Deep Reinforcement Learning (DRL) has been applied successfully to many robotic applications. However, the large number of trials needed for training is a key issue. Most existing techniques developed to improve training efficiency (e.g. imitation learning) target general tasks rather than being tailored to robot applications, which have a specific context that can be exploited. We propose a novel framework, Assisted Reinforcement Learning, in which a classical controller (e.g. a PID controller) is used as an alternative, switchable policy to speed up the training of DRL for local planning and navigation problems. The core idea is that the simple control law allows the robot to rapidly learn sensible primitives, such as driving in a straight line, instead of relying on random exploration. As the actor network becomes more capable, it can then take over to perform more complex actions, such as obstacle avoidance. Eventually, the simple controller can be discarded entirely. We show that this technique not only trains faster but is also less sensitive to the structure of the DRL network, and it consistently outperforms a standard Deep Deterministic Policy Gradient network. We demonstrate the results in both simulation and real-world experiments.
Comment: Published in ICRA 2018. The code is now available at https://github.com/xie9187/AsDDP
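A minimal sketch of the switchable-policy idea follows. The annealed hand-over rule and all names are assumptions for illustration only; the paper's actual switching criterion may differ:

    import random

    class PController:
        """Stand-in for the classical controller: steer towards the goal."""
        def __init__(self, gain=1.0):
            self.gain = gain

        def act(self, obs):
            heading_error = obs["goal_bearing"]
            return {"linear": 0.5, "angular": self.gain * heading_error}

    class AssistedPolicy:
        """Switch between a simple controller and a DRL actor."""
        def __init__(self, controller, actor, anneal_steps=50_000):
            self.controller, self.actor = controller, actor
            self.anneal_steps, self.step = anneal_steps, 0

        def act(self, obs):
            # Early on, the controller mostly acts, providing sensible
            # primitives (e.g. driving straight) instead of random
            # exploration; the chance of using the actor anneals to 1,
            # after which the controller is effectively discarded.
            self.step += 1
            p_actor = min(1.0, self.step / self.anneal_steps)
            policy = self.actor if random.random() < p_actor else self.controller
            return policy.act(obs)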
Dense 3D Object Reconstruction from a Single Depth View
In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. Unlike existing work, which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN++ takes only the voxel-grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid at a high resolution of 256^3 by recovering the occluded/missing regions. The key idea is to combine the generative capabilities of autoencoders and the conditional Generative Adversarial Network (GAN) framework to infer accurate and fine-grained 3D structures of objects in high-dimensional voxel space. Extensive experiments on large synthetic datasets and real-world Kinect datasets show that the proposed 3D-RecGAN++ significantly outperforms the state of the art in single-view 3D object reconstruction, and is able to reconstruct unseen types of objects.
Comment: TPAMI 2018. Code and data are available at: https://github.com/Yang7879/3D-RecGAN-extended. This article extends from arXiv:1708.0796
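As a rough illustration of the combination described above, the sketch below pairs a small 3D encoder-decoder generator with a discriminator conditioned on the input view. Layer sizes and the 32^3 resolution are assumptions chosen for brevity; the paper's network outputs 256^3 grids:

    import torch
    import torch.nn as nn

    class VoxelGenerator(nn.Module):
        """Autoencoder-style generator: partial grid -> completed grid."""
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(                  # 1 x 32^3 -> 64 x 8^3
                nn.Conv3d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.ReLU())
            self.dec = nn.Sequential(                  # 64 x 8^3 -> 1 x 32^3
                nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1), nn.Sigmoid())

        def forward(self, partial):                    # occupancy in [0, 1]
            return self.dec(self.enc(partial))

    class VoxelDiscriminator(nn.Module):
        """Conditional GAN critic: sees the partial view and a completion,
        concatenated on the channel axis."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv3d(2, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Flatten(), nn.Linear(64 * 8 ** 3, 1))

        def forward(self, partial, full):
            return self.net(torch.cat([partial, full], dim=1))

    g, d = VoxelGenerator(), VoxelDiscriminator()
    x = torch.rand(2, 1, 32, 32, 32)          # a batch of partial views
    print(g(x).shape, d(x, g(x)).shape)       # (2, 1, 32, 32, 32), (2, 1)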
3D-PhysNet: Learning the Intuitive Physics of Non-Rigid Object Deformations
The ability to interact with and understand the environment is a fundamental prerequisite for a wide range of applications, from robotics to augmented reality. In particular, predicting how deformable objects will react to applied forces in real time is a significant challenge. This is further complicated by the fact that shape information about encountered objects in the real world is often impaired by occlusions, noise and missing regions: a robot manipulating an object will only be able to observe a partial view of the entire solid. In this work we present 3D-PhysNet, a framework able to predict how a three-dimensional solid will deform under an applied force, using intuitive physics modelling. In particular, we propose a new method to encode the physical properties of the material and the applied force, enabling generalisation over materials. The key is to combine deep variational autoencoders with adversarial training, conditioned on the applied force and the material properties. We further propose a cascaded architecture that takes a single 2.5D depth view of the object and predicts its deformation. Training data is provided by a physics simulator. The network is fast enough to be used in real-time applications from partial views. Experimental results show the viability and the generalisation properties of the proposed architecture.
Comment: in IJCAI 2018
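The conditioning idea, a variational autoencoder whose decoder also receives a force/material condition vector, can be sketched as follows. The flattened 2.5D view, the 4-dimensional condition and all sizes are illustrative assumptions, and the paper's adversarial critic is omitted:

    import torch
    import torch.nn as nn

    class Conditional3DVAE(nn.Module):
        def __init__(self, view_dim=1024, cond_dim=4, z_dim=64):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(view_dim, 256), nn.ReLU())
            self.mu = nn.Linear(256, z_dim)
            self.logvar = nn.Linear(256, z_dim)
            # The decoder sees the latent code AND the force/material
            # condition, which is what enables generalisation over materials.
            self.dec = nn.Sequential(
                nn.Linear(z_dim + cond_dim, 256), nn.ReLU(),
                nn.Linear(256, view_dim), nn.Sigmoid())

        def forward(self, view, cond):
            h = self.enc(view)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam.
            deformed = self.dec(torch.cat([z, cond], dim=1))
            return deformed, mu, logvar

    # cond = e.g. [force magnitude, force direction (2D), stiffness]: an
    # illustrative 4-vector, not the paper's exact encoding.
    model = Conditional3DVAE()
    view, cond = torch.rand(8, 1024), torch.rand(8, 4)
    deformed, mu, logvar = model(view, cond)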