
    Learning General Optical Flow Subspaces for Egomotion Estimation and Detection of Motion Anomalies

    Presented at the 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 20-25 June 2009, Miami, FL. DOI: 10.1109/CVPR.2009.5206538

    This paper deals with the estimation of dense optical flow and ego-motion in a generalized imaging system by exploiting probabilistic linear subspace constraints on the flow. We deal with the extended motion of the imaging system through an environment that we assume to have some degree of statistical regularity. For example, in autonomous ground vehicles the structure of the environment around the vehicle is far from arbitrary, and the depth at each pixel is often approximately constant. The subspace constraints hold not only for perspective cameras but for a very general class of imaging systems, including catadioptric and multiple-view systems. Using minimal assumptions about the imaging system, we learn a probabilistic subspace constraint that captures the statistical regularity of the scene geometry relative to an imaging system. We propose an extension to probabilistic PCA (Tipping and Bishop, 1999) as a way to robustly learn this subspace from recorded imagery, and demonstrate its use in conjunction with a sparse optical flow algorithm. To deal with the sparseness of the input flow, we use a generative model to estimate the subspace using only the observed flow measurements. Additionally, to identify and cope with image regions that violate the subspace constraints, such as moving objects, objects that violate the depth regularity, or gross flow estimation errors, we employ a per-pixel Gaussian mixture outlier process. We demonstrate results of finding the optical flow subspaces and employing them to estimate dense flow and to recover camera motion for a variety of imaging systems in several different environments.
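
    As a rough sketch of the machinery this abstract describes (the paper's robust extension and the Gaussian-mixture outlier process are omitted, and all names here are hypothetical), the closed-form PPCA fit of Tipping and Bishop plus the posterior inference step that reconstructs dense flow from sparse measurements might look like this in numpy:

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA (Tipping & Bishop, 1999).
    X: (N, D) matrix of N flattened flow fields; q: subspace dimension."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # Rows of Vt are eigenvectors of the sample covariance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    lam = s ** 2 / len(X)                                # covariance eigenvalues
    sigma2 = lam[q:].mean() if lam.size > q else 1e-8    # noise = discarded variance
    W = Vt[:q].T * np.sqrt(np.maximum(lam[:q] - sigma2, 0.0))
    return mu, W, sigma2

def infer_dense_flow(obs_vals, obs_idx, mu, W, sigma2):
    """Posterior-mean latent coefficients from sparse flow measurements,
    then a dense reconstruction through the generative model."""
    Wo = W[obs_idx]                              # basis rows at observed pixels
    M = sigma2 * np.eye(W.shape[1]) + Wo.T @ Wo
    z = np.linalg.solve(M, Wo.T @ (obs_vals - mu[obs_idx]))
    return mu + W @ z                            # dense flow estimate
```

    Restricting the inference to the observed rows of W is what lets the generative model cope with sparse input flow.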

    Optical flow templates for mobile robot environment understanding

    In this work we develop optical flow templates, a practical tool for inferring robot egomotion and semantic superpixel labeling using optical flow in imaging systems with arbitrary optics. In doing so, we offer the robotics and computer vision communities an understanding of the geometric relationships and mathematical methods useful in interpreting optical flow. This work is motivated by what we perceive as directions for advancing the current state of the art in obstacle detection and scene understanding for mobile robots. Specifically, many existing methods build 3D point clouds, which are not directly useful for autonomous navigation and require further processing; both the step of building the point clouds and the later processing steps are challenging and computationally intensive. Additionally, many current methods require a calibrated camera, which introduces calibration challenges and places limitations on the types of camera optics that may be used: wide-angle lenses, systems with mirrors, and multiple cameras all require different calibration models and can be difficult or impossible to calibrate. Finally, current pixel and superpixel obstacle labeling algorithms typically rely on image appearance. While image appearance is informative, image motion is a direct effect of the scene structure that determines whether a region of the environment is an obstacle. The egomotion estimation and obstacle labeling methods we develop here, based on optical flow templates, require very little computation per frame and do not require building point clouds. Additionally, they require neither a specific type of camera optics nor a calibrated camera. Finally, they label obstacles using optical flow alone, without image appearance. In this thesis we start with optical flow subspaces for egomotion estimation and detection of “motion anomalies”. We then extend this to multiple subspaces and develop mathematical reasoning to select between them, comprising optical flow templates. Using these we classify environment shapes and label superpixels. Finally, we show how performing all learning and inference directly from image spatio-temporal gradients greatly improves computation time and accuracy.

    Ph.D. thesis.
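
    The template-selection and superpixel-labeling steps can be illustrated with a small, hypothetical sketch: each template is a basis matrix for flattened flow, the best template is the one with the smallest fitting residual, and superpixels whose residual stays high under the best template are flagged as anomalous (e.g. obstacles). This illustrates the idea rather than the thesis' actual algorithm:

```python
import numpy as np

def template_residual(flow, B):
    """Least-squares fit of a flattened flow field (D,) to a template
    basis B (D, q); returns the residual norm and the coefficients."""
    coeffs, *_ = np.linalg.lstsq(B, flow, rcond=None)
    return np.linalg.norm(flow - B @ coeffs), coeffs

def classify_and_label(flow, templates, superpixels, thresh):
    """Pick the best-fitting template (environment shape), then flag
    superpixels whose mean per-pixel residual exceeds thresh.
    superpixels: dict mapping superpixel id -> pixel index array."""
    fits = [template_residual(flow, B) for B in templates]
    best = int(np.argmin([r for r, _ in fits]))
    resid = np.abs(flow - templates[best] @ fits[best][1])
    labels = {sp: resid[idx].mean() > thresh
              for sp, idx in superpixels.items()}
    return best, labels
```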

    Towards Visual Ego-motion Learning in Robots

    Many model-based Visual Odometry (VO) algorithms have been proposed in the past decade, often restricted to a particular type of camera optics or to the underlying motion manifold observed. We envision robots that are able to learn and perform these tasks in a minimally supervised setting as they gain more experience. To this end, we propose a fully trainable solution to visual ego-motion estimation for varied camera optics. We propose a visual ego-motion learning architecture that maps observed optical flow vectors to an ego-motion density estimate via a Mixture Density Network (MDN). By modeling the architecture as a Conditional Variational Autoencoder (C-VAE), our model is able to provide introspective reasoning and prediction for ego-motion-induced scene flow. Additionally, our proposed model is especially amenable to bootstrapped ego-motion learning in robots, where the supervision in ego-motion estimation for a particular camera sensor can be obtained from standard navigation-based sensor fusion strategies (GPS/INS and wheel-odometry fusion). Through experiments, we show the utility of our proposed approach in enabling self-supervised learning for visual ego-motion estimation in autonomous robots.

    Comment: Conference paper; submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017, Vancouver, CA; 8 pages, 8 figures, 2 tables.
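
    A minimal PyTorch sketch of the MDN component, assuming flattened flow features as input and a diagonal-covariance Gaussian mixture over a 6-DoF pose (the C-VAE wrapper and scene-flow prediction are not shown; all layer sizes are made up):

```python
import torch
import torch.nn as nn

class FlowToEgomotionMDN(nn.Module):
    """Maps a flattened optical-flow feature to a K-component Gaussian
    mixture over 6-DoF ego-motion; a sketch of the MDN idea only."""
    def __init__(self, in_dim, K=5, pose_dim=6, hidden=128):
        super().__init__()
        self.K, self.pose_dim = K, pose_dim
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.pi = nn.Linear(hidden, K)                    # mixture weights
        self.mu = nn.Linear(hidden, K * pose_dim)         # component means
        self.log_sigma = nn.Linear(hidden, K * pose_dim)  # diagonal stddevs

    def forward(self, x):
        h = self.trunk(x)
        pi = torch.log_softmax(self.pi(h), dim=-1)
        mu = self.mu(h).view(-1, self.K, self.pose_dim)
        sigma = self.log_sigma(h).view(-1, self.K, self.pose_dim).exp()
        return pi, mu, sigma

def mdn_nll(pi, mu, sigma, y):
    """Negative log-likelihood of pose y under the predicted mixture."""
    comp = torch.distributions.Normal(mu, sigma)
    log_prob = comp.log_prob(y.unsqueeze(1)).sum(-1)      # (B, K)
    return -torch.logsumexp(pi + log_prob, dim=-1).mean()
```

    Training would minimize mdn_nll against ego-motion supervision from, e.g., fused GPS/INS odometry, matching the bootstrapped setting the abstract describes.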

    Independent Motion Detection with Event-driven Cameras

    Unlike standard cameras, which send intensity images at a constant frame rate, event-driven cameras asynchronously report pixel-level brightness changes, offering low latency and high temporal resolution (both on the order of microseconds). As such, they have great potential for fast and low-power vision algorithms for robots. Visual tracking, for example, is easily achieved even for very fast stimuli, as only moving objects cause brightness changes. However, when the camera is mounted on a moving robot, the same tracking problem becomes confounded by background clutter events due to the robot's ego-motion. In this paper, we propose a method for segmenting the motion of an independently moving object for event-driven cameras. Our method detects and tracks corners in the event stream and learns the statistics of their motion as a function of the robot's joint velocities when no independently moving objects are present. During robot operation, independently moving objects are identified by discrepancies between the corner velocities predicted from ego-motion and the measured corner velocities. We validate the algorithm on data collected from the neuromorphic iCub robot. We achieve a precision of ~90% and show that the method is robust to changes in speed of both the head and the target.

    Comment: 7 pages, 6 figures.
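
    A toy version of the detection logic, assuming (in place of the paper's learned motion statistics) a simple linear map from joint velocities to expected corner image velocities; all names are hypothetical:

```python
import numpy as np

def learn_ego_model(joint_vels, corner_vels):
    """Least-squares map from robot joint velocities to expected corner
    image velocities, fit on data with no independently moving objects.
    joint_vels: (N, J); corner_vels: (N, 2)."""
    A, *_ = np.linalg.lstsq(joint_vels, corner_vels, rcond=None)
    resid = corner_vels - joint_vels @ A
    return A, resid.std(axis=0)          # model and residual spread

def is_independent(joint_vel, measured_vel, A, sigma, k=3.0):
    """Flag a tracked corner whose measured velocity deviates from the
    ego-motion prediction by more than k standard deviations."""
    predicted = joint_vel @ A
    return np.any(np.abs(measured_vel - predicted) > k * sigma)
```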

    Learning Rank Reduced Interpolation with Principal Component Analysis

    In computer vision, most iterative optimization algorithms, both sparse and dense, rely on a coarse but reliable dense initialization to bootstrap their optimization procedure. For example, dense optical flow algorithms profit massively in speed and robustness if they are initialized well within the basin of convergence of the loss function used. The same holds true for methods such as sparse feature tracking, when initial flow or depth information is needed for new features at arbitrary positions. This makes it extremely important to have techniques at hand that can obtain, from only very few available measurements, a dense but still approximate sketch of a desired 2D structure (e.g. depth maps, optical flow, disparity maps, etc.). The 2D map is regarded as a sample from a 2D random process. The method presented here exploits the complete information given by the principal component analysis (PCA) of that process: the principal basis and its prior distribution. The method is able to determine a dense reconstruction from sparse measurements. When facing situations with only very sparse measurements, the number of principal components is typically reduced further, which results in a loss of expressiveness of the basis. We overcome this problem by injecting prior knowledge in a maximum a posteriori (MAP) approach. We test our approach on the KITTI and virtual KITTI datasets, focusing on the interpolation of depth maps for driving scenes. The results show good agreement with the ground truth and are clearly better than interpolation by the nearest-neighbor method, which disregards statistical information.

    Comment: Accepted at the Intelligent Vehicles Symposium (IV), Los Angeles, USA, June 2017.
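
    The MAP interpolation step has a compact closed form: with a principal basis B, coefficient prior variances lam (the PCA eigenvalues) and measurement noise sigma2, the posterior-mode coefficients solve a small regularized normal system. A hypothetical numpy sketch:

```python
import numpy as np

def map_interpolate(obs_vals, obs_idx, mu, B, lam, sigma2=1.0):
    """MAP reconstruction of a dense 2D map from sparse samples.
    B: (D, q) principal basis; lam: (q,) prior variances of the
    coefficients (PCA eigenvalues); sigma2: measurement noise."""
    Bo = B[obs_idx]                          # basis rows at observed pixels
    A = Bo.T @ Bo + sigma2 * np.diag(1.0 / lam)
    c = np.linalg.solve(A, Bo.T @ (obs_vals - mu[obs_idx]))
    return mu + B @ c                        # dense reconstruction
```

    The sigma2 * diag(1/lam) term is what keeps the reconstruction well behaved when the measurements are too sparse to constrain every component on their own.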

    Learning monocular visual odometry with dense 3D mapping from dense 3D flow

    This paper introduces a fully deep learning approach to monocular SLAM, which performs simultaneous localization, using a neural network for learning visual odometry (L-VO), and dense 3D mapping. Dense 2D flow and a depth image are generated from monocular images by sub-networks, which are then used by a 3D flow association layer in the L-VO network to generate dense 3D flow. Given this 3D flow, the dual-stream L-VO network can predict the 6-DOF relative pose and, furthermore, reconstruct the vehicle trajectory. To learn the correlation between motion directions, bivariate Gaussian modelling is employed in the loss function. The L-VO network achieves an overall performance of 2.68% average translational error and 0.0143 deg/m average rotational error on the KITTI odometry benchmark. Moreover, the learned depth is fully leveraged to generate a dense 3D map. As a result, an entire visual SLAM system is achieved: learned monocular odometry combined with dense 3D mapping.

    Comment: International Conference on Intelligent Robots and Systems (IROS 2018).
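
    The bivariate Gaussian loss can be written down directly; a hedged PyTorch sketch for two correlated motion components (the paper's exact parameterization may differ):

```python
import torch

def bivariate_gaussian_nll(pred, target):
    """NLL of a 2-D target under a predicted bivariate Gaussian with
    correlation, as a sketch of a loss that couples two motion axes.
    pred: (B, 5) = [mu_x, mu_y, log_sx, log_sy, rho_raw]; target: (B, 2).
    The constant log(2*pi) term is dropped."""
    mu, log_s, rho = pred[:, :2], pred[:, 2:4], torch.tanh(pred[:, 4])
    s = log_s.exp()
    z = (target - mu) / s                            # standardized residuals
    one_m_r2 = (1 - rho ** 2).clamp_min(1e-6)
    quad = (z[:, 0]**2 - 2*rho*z[:, 0]*z[:, 1] + z[:, 1]**2) / one_m_r2
    log_norm = log_s.sum(1) + 0.5 * torch.log(one_m_r2)
    return (0.5 * quad + log_norm).mean()
```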

    A group-theoretic approach to formalizing bootstrapping problems

    The bootstrapping problem consists of designing agents that learn a model of themselves and the world, and use it to achieve useful tasks. It differs from other learning problems in that the agent starts with uninterpreted observations and commands, and with minimal prior information about the world. In this paper, we give a mathematical formalization of this aspect of the problem. We argue that the vague constraint of having "no prior information" can be recast as a precise algebraic condition on the agent: that its behavior is invariant to particular classes of nuisances acting on the world, which we show can be well represented by actions of groups (diffeomorphisms, permutations, linear transformations) on observations and commands. We then introduce the class of bilinear gradient dynamics sensors (BGDS) as a candidate for learning generic robotic sensorimotor cascades. We show how framing the problem as rejection of group nuisances allows a compact and modular analysis of typical preprocessing stages, such as learning the topology of the sensors. We demonstrate learning and using such models on real-world range-finder and camera data from publicly available datasets.
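
    A rough one-dimensional illustration of the BGDS idea (dynamics bilinear in the spatial gradient of the observations and the commands), fit by per-sensel least squares on logged data; this is a sketch under simplifying assumptions, not the paper's estimator:

```python
import numpy as np

def learn_bgds(Y, U, dt):
    """Fit  y_dot[s] ~= sum_i u[i] * G[i, s] * grad_y[s]  from logged
    sensor values Y (T, S) and commands U (T, K), sampled every dt."""
    ydot = np.gradient(Y, dt, axis=0)        # temporal derivative
    grad = np.gradient(Y, axis=1)            # 1-D spatial gradient
    K, S = U.shape[1], Y.shape[1]
    G = np.zeros((K, S))
    for s in range(S):
        F = U * grad[:, s:s+1]               # (T, K) bilinear features
        G[:, s], *_ = np.linalg.lstsq(F, ydot[:, s], rcond=None)
    return G

def predict_ydot(grad_y, u, G):
    """Predict the sensor derivative for one frame given command u."""
    return (G * u[:, None]).sum(0) * grad_y
```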

    Application of computer vision for roller operation management

    Compaction is the last and possibly the most important phase in the construction of asphalt concrete (AC) pavements. Compaction densifies the loose AC mat, producing a stable surface with low permeability, and the process strongly affects the AC performance properties. Too much compaction may cause aggregate degradation and low air void content, facilitating bleeding and rutting. On the other hand, too little compaction may result in higher air void content, facilitating oxidation and water permeability issues, rutting due to further densification by traffic, and reduced fatigue life. Compaction is therefore a critical issue in AC pavement construction.

    The common practice for compacting a mat is to establish a roller pattern that determines the number of passes and coverages needed to achieve the desired density. Once the pattern is established, the roller's operator must maintain it uniformly over the entire mat. Despite the importance of uniform compaction to achieve the expected durability and performance of AC pavements, having the roller operator as the only means of managing the operation invites human error.

    With the advancement of technology in recent years, the concept of intelligent compaction (IC) was developed to assist roller operators and improve construction quality. Commercial IC packages for construction rollers are available from different manufacturers; they provide precise mapping of a roller's location and give the operator feedback during the compaction process. Although the IC packages can track roller passes with impressive results, there are major hindrances: the high cost of acquisition and the potential negative impact on productivity have inhibited the implementation of IC.

    This study applied computer vision technology to build a versatile and affordable system to count and map roller passes. An infrared camera is mounted on top of the roller to capture the operator's view. Then, in a near real-time process, image features are extracted and tracked to estimate the incremental rotation and translation of the roller. Image features are categorized into near and distant features based on a user-defined horizon. Optical flow is estimated for near features located in the region below the horizon, while the change in the roller's heading is continuously estimated from the distant features located in the sky region. Using the roller's rotation angle, the incremental translation between two frames is calculated from the optical flow. The roller's incremental rotation and translation are combined to develop a tracking map.

    During system development, it was noted that in environments with thermal uniformity, the background of the IR images exhibits fewer features than images captured with optical cameras, which are insensitive to temperature. This issue is more significant overnight, since natural elements cannot reflect heat energy from the sun. Therefore, to improve the roller's heading estimation when few features are available in the sky region, a unique methodology was developed for this research that detects heading from the edges of the hot asphalt mat. The heading measurements based on the slope of the asphalt's hot edges are added to the pool of headings measured from the sky region, and the median of all heading measurements is used as the roller's incremental rotation in the tracking analysis.

    The record of tracking data is used for QC/QA purposes and for verifying proper implementation of the roller pattern throughout a job constructed under roller pass specifications. The system developed during this research was successful in mapping roller location for the few projects tested; however, the system should be independently validated.
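
    The heading-fusion and dead-reckoning steps described above reduce to a few lines; a hypothetical sketch:

```python
import numpy as np

def fuse_heading(sky_headings, edge_headings):
    """Median of all incremental-heading measurements pooled from
    distant (sky) features and hot asphalt mat edges."""
    pool = np.concatenate([sky_headings, edge_headings])
    return np.median(pool)

def integrate_pose(pose, dtheta, translation):
    """Chain an incremental rotation plus a body-frame translation onto
    the roller's planar pose (x, y, theta) for the tracking map."""
    x, y, th = pose
    th_new = th + dtheta
    dx, dy = translation
    c, s = np.cos(th_new), np.sin(th_new)
    return (x + c * dx - s * dy, y + s * dx + c * dy, th_new)
```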