Towards Visual Ego-motion Learning in Robots
Many model-based Visual Odometry (VO) algorithms have been proposed in the
past decade, often restricted to a particular type of camera optics or to the
underlying motion manifold observed. We envision robots that can learn and perform
these tasks, in a minimally supervised setting, as they gain more experience.
To this end, we propose a fully trainable solution to visual ego-motion
estimation for varied camera optics. We propose a visual ego-motion learning
architecture that maps observed optical flow vectors to an ego-motion density
estimate via a Mixture Density Network (MDN). By modeling the architecture as a
Conditional Variational Autoencoder (C-VAE), our model is able to provide
introspective reasoning and prediction for ego-motion induced scene-flow.
Additionally, our proposed model is especially amenable to bootstrapped
ego-motion learning in robots where the supervision in ego-motion estimation
for a particular camera sensor can be obtained from standard navigation-based
sensor fusion strategies (GPS/INS and wheel-odometry fusion). Through
experiments, we show the utility of our proposed approach in enabling the
concept of self-supervised learning for visual ego-motion estimation in
autonomous robots.
Comment: Conference paper; Submitted to IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS) 2017, Vancouver CA; 8 pages, 8 figures,
2 tables
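The mapping from flow features to an ego-motion density can be illustrated with the loss a Mixture Density Network is typically trained on. The sketch below is a minimal, one-dimensional illustration and not the paper's implementation: the function name, the scalar target, and the hand-picked mixture parameters are all assumptions. In practice a network would predict the weights, means, and standard deviations from the optical-flow input.

```python
import math


def mdn_negative_log_likelihood(weights, means, sigmas, target):
    """Negative log-likelihood of a scalar target under a 1-D Gaussian mixture.

    weights, means, sigmas: per-component mixture parameters (hypothetical
    stand-ins for what an MDN head would output); weights must sum to 1.
    """
    log_terms = []
    for w, mu, s in zip(weights, means, sigmas):
        # Log-density of a univariate Gaussian evaluated at the target.
        log_gauss = (
            -0.5 * ((target - mu) / s) ** 2
            - math.log(s)
            - 0.5 * math.log(2.0 * math.pi)
        )
        log_terms.append(math.log(w) + log_gauss)

    # Log-sum-exp over components for numerical stability.
    m = max(log_terms)
    log_likelihood = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return -log_likelihood


if __name__ == "__main__":
    # Two-component mixture; the target sits near the first component.
    nll = mdn_negative_log_likelihood(
        weights=[0.7, 0.3], means=[0.0, 2.0], sigmas=[0.5, 1.0], target=0.1
    )
    print(f"NLL: {nll:.4f}")
```

Minimizing this quantity over training pairs pushes probability mass toward the observed motion while keeping a full density, rather than a single point estimate, at the output.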
A Framework for Interactive Teaching of Virtual Borders to Mobile Robots
The increasing number of robots in home environments leads to an emerging
coexistence between humans and robots. Robots undertake common tasks and
support the residents in their everyday life. People appreciate the presence of
robots in their environment as long as they retain control over them. One
important aspect is the control of a robot's workspace. Therefore, we introduce
virtual borders to precisely and flexibly define the workspace of mobile
robots. First, we propose a novel framework that allows a person to
interactively restrict a mobile robot's workspace. To show the validity of this
framework, we present a concrete implementation based on visual markers.
Afterwards, the mobile robot is capable of performing its tasks while
respecting the new virtual borders. The approach is accurate, flexible and less
time-consuming than explicit robot programming. Hence, even non-experts are
able to teach virtual borders to their robots, which is especially interesting
in domains like vacuuming or service robots in home environments.
Comment: 7 pages, 6 figures
GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
In the last decade, supervised deep learning approaches have been extensively
employed in visual odometry (VO) applications, but they are not feasible in
environments where labelled data is scarce. On the other hand,
unsupervised deep learning approaches for localization and mapping in unknown
environments from unlabelled data have received comparatively less attention in
VO research. In this study, we propose a generative unsupervised learning
framework that predicts the 6-DoF camera pose and a monocular depth map of the
scene from unlabelled RGB image sequences, using deep convolutional Generative
Adversarial Networks (GANs). We create a supervisory signal by warping view
sequences and using minimization of the re-projection error as the objective
loss function for both the multi-view pose estimation and single-view depth
generation networks. Detailed quantitative and qualitative evaluations of the
proposed framework on the KITTI and Cityscapes datasets show that the proposed
method outperforms both existing traditional and unsupervised deep VO methods,
providing better results for both pose estimation and depth recovery.
Comment: ICRA 2019 - accepted
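A warping-based supervisory signal of this kind is commonly written as a view-synthesis relation plus a photometric error. The abstract does not state that GANVO uses exactly this form, so the block below is a sketch under assumed notation: $K$ is the camera intrinsics matrix, $D_t$ the predicted depth map of the target frame, $T_{t \to s}$ the predicted relative pose from target to source view, $p_t$ a homogeneous pixel coordinate in the target frame, and $\hat{I}_s$ the source image warped into the target view.

```latex
% Projection of a target pixel into the source view (assumed notation)
p_s \sim K \, T_{t \to s} \, D_t(p_t) \, K^{-1} p_t

% Photometric re-projection error summed over all target pixels
\mathcal{L}_{\mathrm{reproj}} = \sum_{p_t} \left| I_t(p_t) - \hat{I}_s(p_t) \right|
```

Because both the depth and the pose enter the projection, minimizing the photometric error constrains the two predictions jointly without requiring labelled poses or depth maps.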
Deep Lidar CNN to Understand the Dynamics of Moving Vehicles
Perception technologies in Autonomous Driving are experiencing their golden
age due to the advances in Deep Learning. Yet, most of these systems rely on
the semantically rich information of RGB images. Deep Learning solutions
applied to the data of other sensors typically mounted on autonomous cars (e.g.
lidars or radars) remain comparatively unexplored. In this paper we propose a novel
solution to understand the dynamics of moving vehicles in the scene from only
lidar information. The main challenge of this problem stems from the fact that
we need to disambiguate the proprio-motion of the 'observer' vehicle from that
of the external 'observed' vehicles. For this purpose, we devise a CNN
architecture which at testing time is fed with pairs of consecutive lidar
scans. However, in order to properly learn the parameters of this network,
during training we introduce a series of so-called pretext tasks that also
leverage image data. These tasks include semantic information about
vehicleness and a novel lidar-flow feature which combines standard image-based
optical flow with lidar scans. We obtain very promising results and show that
including distilled image information only during training improves
the inference results of the network at test time, even when image data is no
longer used.
Comment: Presented at IEEE ICRA 2018. IEEE Copyrights: Personal use of this
material is permitted. Permission from IEEE must be obtained for all other
uses. (V2 just corrected comments on arxiv submission)
- …