Search CORE

9,924 research outputs found

Towards Visual Ego-motion Learning in Robots

Author: Leonard John J.
Pillai Sudeep
Publication venue
Publication date: 29/05/2017
Field of study

Many model-based Visual Odometry (VO) algorithms have been proposed in the past decade, often restricted to the type of camera optics, or the underlying motion manifold observed. We envision robots to be able to learn and perform these tasks, in a minimally supervised setting, as they gain more experience. To this end, we propose a fully trainable solution to visual ego-motion estimation for varied camera optics. We propose a visual ego-motion learning architecture that maps observed optical flow vectors to an ego-motion density estimate via a Mixture Density Network (MDN). By modeling the architecture as a Conditional Variational Autoencoder (C-VAE), our model is able to provide introspective reasoning and prediction for ego-motion induced scene-flow. Additionally, our proposed model is especially amenable to bootstrapped ego-motion learning in robots where the supervision in ego-motion estimation for a particular camera sensor can be obtained from standard navigation-based sensor fusion strategies (GPS/INS and wheel-odometry fusion). Through experiments, we show the utility of our proposed approach in enabling the concept of self-supervised learning for visual ego-motion estimation in autonomous robots.Comment: Conference paper; Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017, Vancouver CA; 8 pages, 8 figures, 2 table

arXiv.org e-Print Archive

Motion from Fixation

Author: Perona Pietro
Soatto Stefano
Publication venue: 'California Institute of Technology Library'
Publication date: 20/02/1995
Field of study

We study the problem of estimating rigid motion from a sequence of monocular perspective images obtained by navigating around an object while fixating a particular feature point. The motivation comes from the mechanics of the buman eye, which either pursuits smoothly some fixation point in the scene, or "saccades" between different fixation points. In particular, we are interested in understanding whether fixation helps the process of estimating motion in the sense that it makes it more robust, better conditioned or simpler to solve. We cast the problem in the framework of "dynamic epipolar geometry", and propose an implicit dynamical model for recursively estimating motion from fixation. This allows us to compare directly the quality of the estimates of motion obtained by imposing the fixation constraint, or by assuming a general rigid motion, simply by changing the geometry of the parameter space while maintaining the same structure of the recursive estimator. We also present a closed-form static solution from two views, and a recursive estimator of the absolute attitude between the viewer and the scene. One important issue is how do the estimates degrade in presence of disturbances in the tracking procedure. We describe a simple fixation control that converges exponentially, which is complemented by a image shift-registration for achieving sub-pixel accuracy, and assess how small deviations from perfect tracking affect the estimates of motion

CiteSeerX

Caltech Authors

Calibration by correlation using metric embedding from non-metric similarities

Author: Censi Andrea
Scaramuzza Davide
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

This paper presents a new intrinsic calibration method that allows us to calibrate a generic single-view point camera just by waving it around. From the video sequence obtained while the camera undergoes random motion, we compute the pairwise time correlation of the luminance signal for a subset of the pixels. We show that, if the camera undergoes a random uniform motion, then the pairwise correlation of any pixels pair is a function of the distance between the pixel directions on the visual sphere. This leads to formalizing calibration as a problem of metric embedding from non-metric measurements: we want to find the disposition of pixels on the visual sphere from similarities that are an unknown function of the distances. This problem is a generalization of multidimensional scaling (MDS) that has so far resisted a comprehensive observability analysis (can we reconstruct a metrically accurate embedding?) and a solid generic solution (how to do so?). We show that the observability depends both on the local geometric properties (curvature) as well as on the global topological properties (connectedness) of the target manifold. We show that, in contrast to the Euclidean case, on the sphere we can recover the scale of the points distribution, therefore obtaining a metrically accurate solution from non-metric measurements. We describe an algorithm that is robust across manifolds and can recover a metrically accurate solution when the metric information is observable. We demonstrate the performance of the algorithm for several cameras (pin-hole, fish-eye, omnidirectional), and we obtain results comparable to calibration using classical methods. Additional synthetic benchmarks show that the algorithm performs as theoretically predicted for all corner cases of the observability analysis

Caltech Authors

Alternating Back-Propagation for Generator Network

Author: Han Tian
Lu Yang
Wu Ying Nian
Zhu Song-Chun
Publication venue
Publication date: 05/12/2016
Field of study

This paper proposes an alternating back-propagation algorithm for learning the generator network model. The model is a non-linear generalization of factor analysis. In this model, the mapping from the continuous latent factors to the observed signal is parametrized by a convolutional neural network. The alternating back-propagation algorithm iterates the following two steps: (1) Inferential back-propagation, which infers the latent factors by Langevin dynamics or gradient descent. (2) Learning back-propagation, which updates the parameters given the inferred latent factors by gradient descent. The gradient computations in both steps are powered by back-propagation, and they share most of their code in common. We show that the alternating back-propagation algorithm can learn realistic generator models of natural images, video sequences, and sounds. Moreover, it can also be used to learn from incomplete or indirect training data

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications