
    Exploiting Points and Lines in Regression Forests for RGB-D Camera Relocalization

    Camera relocalization plays a vital role in many robotics and computer vision tasks, such as global localization, recovery from tracking failure, and loop closure detection. Recent random-forest-based methods exploit randomly sampled pixel comparison features to predict 3D world locations for 2D image locations to guide the camera pose optimization. However, these image features are only sampled randomly in the images, without considering spatial structure or geometric information, leading to large errors or failures in poorly textured areas or under motion blur. Line segment features are more robust in these environments. In this work, we propose to jointly exploit points and lines within the framework of uncertainty-driven regression forests. The proposed approach is thoroughly evaluated on three publicly available datasets against several strong state-of-the-art baselines in terms of several different error metrics. Experimental results demonstrate the efficacy of our method, showing performance superior to or on par with the state of the art. Comment: published as a conference paper at the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

    Robust Visual Self-localization and Navigation in Outdoor Environments Using Slow Feature Analysis

    Metka B. Robust Visual Self-localization and Navigation in Outdoor Environments Using Slow Feature Analysis. Bielefeld: Universität Bielefeld; 2019. Self-localization and navigation in outdoor environments are fundamental problems a mobile robot has to solve in order to autonomously execute tasks in a spatial environment. Techniques based on the Global Positioning System (GPS) or laser-range finders are well established but suffer from the drawbacks of limited satellite availability or high hardware effort and costs. Vision-based methods can provide an interesting alternative, but are still a field of active research due to the challenges of visual perception, such as illumination and weather changes or long-term seasonal effects. This thesis approaches the problem of robust visual self-localization and navigation using a biologically motivated model based on unsupervised Slow Feature Analysis (SFA). It is inspired by the discovery of neurons in a rat’s brain that form a neural representation of the animal’s spatial attributes. A similar hierarchical SFA network has been shown to learn representations of either the position or the orientation directly from the visual input of a virtual rat, depending on the movement statistics during training. An extension to the hierarchical SFA network is introduced that allows learning an orientation-invariant representation of the position by manipulating the perceived image statistics, exploiting the properties of panoramic vision. The model is applied to a mobile robot in real-world open-field experiments, obtaining localization accuracies comparable to state-of-the-art approaches. The self-localization performance can be further improved by incorporating wheel odometry into the purely vision-based approach. To achieve this, a method for the unsupervised learning of a mapping from slow feature space to metric space is developed. Robustness w.r.t. short- and long-term appearance changes is tackled by restructuring the temporal order of the training image sequence based on the identification of crossings in the training trajectory. Re-inserting images of the same place in different conditions into the training sequence increases the temporal variation of environmental effects and thereby improves invariance due to the slowness objective of SFA. Finally, a straightforward method for navigation in slow feature space is presented. Navigation can be performed efficiently by following the SFA gradient, approximated from distance measurements between the slow feature values at the target and the current location. It is shown that the properties of the learned representations enable complex navigation behaviors without explicit trajectory planning

    Semantic Localization and Mapping in Robot Vision

    Integration of human semantics plays an increasing role in robotics tasks such as mapping, localization and detection. Increased use of semantics serves multiple purposes, including giving computers the ability to process and present data containing human-meaningful concepts, and allowing computers to employ human reasoning to accomplish tasks. This dissertation presents three solutions which incorporate semantics into visual data in order to address these problems. The first addresses the problem of constructing topological maps from sequences of images. The proposed solution includes a novel image similarity score which uses dynamic programming to match images using both appearance and relative positions of local features simultaneously. An MRF is constructed to model the probability of loop closures, and a locally optimal labeling is found using Loopy-BP. The recovered loop closures are then used to generate a topological map. Results are presented on four urban sequences and one indoor sequence. The second system uses video and annotated maps to solve localization. Data association is achieved through detection of object classes, annotated in prior maps, rather than through detection of visual features. To avoid the caveats of object recognition, a new representation of query images is introduced, consisting of a vector of detection scores for each object class. Using soft object detections, hypotheses about pose are refined through particle filtering. Experiments include both small office spaces and a large open urban rail station with semantically ambiguous places. This approach showcases a representation that is robust and can exploit the plethora of existing prior maps for GPS-denied environments, while avoiding the data association problems encountered when matching point clouds or visual features. Finally, a purely vision-based approach is presented for constructing semantic maps given camera pose and simple object exemplar images. Object response heatmaps are combined with known pose to back-project detection information onto the world. These update the world model, integrating information over time as the camera moves. The approach avoids making hard decisions on object recognition, and aggregates evidence about objects in the world coordinate system. These solutions simultaneously showcase the contribution of semantics in robotics and provide state-of-the-art solutions to these fundamental problems

    Enhancing 3D Visual Odometry with Single-Camera Stereo Omnidirectional Systems

    We explore low-cost solutions for efficiently improving the 3D pose estimation problem of a single camera moving in an unfamiliar environment. The visual odometry (VO) task -- as it is called when using computer vision to estimate egomotion -- is of particular interest to mobile robots as well as humans with visual impairments. The payload capacity of small robots like micro-aerial vehicles (drones) requires the use of portable perception equipment, which is constrained by size, weight, energy consumption, and processing power. Using a single camera as the passive sensor for the VO task satisfies these requirements, and it motivates the proposed solutions presented in this thesis. To deliver the portability goal with a single off-the-shelf camera, we have taken two approaches: The first one, and the most extensively studied here, revolves around an unorthodox camera-mirrors configuration (catadioptrics) achieving a stereo omnidirectional system (SOS). The second approach relies on expanding the visual features from the scene into higher dimensionalities to track the pose of a conventional camera in a photogrammetric fashion. The first goal has many interdependent challenges, which we address as part of this thesis: SOS design, projection model, adequate calibration procedure, and application to VO. We show several practical advantages for the single-camera SOS due to its complete 360-degree stereo views, that other conventional 3D sensors lack due to their limited field of view. Since our omnidirectional stereo (omnistereo) views are captured by a single camera, a truly instantaneous pair of panoramic images is possible for 3D perception tasks. Finally, we address the VO problem as a direct multichannel tracking approach, which increases the pose estimation accuracy of the baseline method (i.e., using only grayscale or color information) under the photometric error minimization as the heart of the “direct” tracking algorithm. Currently, this solution has been tested on standard monocular cameras, but it could also be applied to an SOS. We believe the challenges that we attempted to solve have not been considered previously with the level of detail needed for successfully performing VO with a single camera as the ultimate goal in both real-life and simulated scenes

    Improving Mixed Reality through Scene Understanding Using Multi-Task Learning and Data Augmentation

    Waseda University degree record number: Shin 9140 (新9140), Waseda University

    WSR: A WiFi Sensor for Collaborative Robotics

    In this paper we derive a new capability for robots to measure relative direction, or Angle-of-Arrival (AOA), to other robots operating in non-line-of-sight and unmapped environments with occlusions, without requiring external infrastructure. We do so by capturing all of the paths that a WiFi signal traverses as it travels from a transmitting to a receiving robot, which we term an AOA profile. The key intuition is to "emulate antenna arrays in the air" as the robots move in 3D space, a method akin to Synthetic Aperture Radar (SAR). The main contributions include development of i) a framework to accommodate arbitrary 3D trajectories, as well as continuous mobility of all robots, while computing AOA profiles and ii) an accompanying analysis that provides a lower bound on the variance of AOA estimation as a function of robot trajectory geometry, based on the Cramér-Rao bound. This is a critical distinction from previous work on SAR that restricts robot mobility to prescribed motion patterns, does not generalize to 3D space, and/or requires transmitting robots to be static during data acquisition periods. Our method results in more accurate AOA profiles and thus better AOA estimation, and formally characterizes this observation as the informativeness of the trajectory, a computable quantity for which we derive a closed form. All theoretical developments are substantiated by extensive simulation and hardware experiments. We also show that our formulation can be used with an off-the-shelf trajectory estimation sensor. Finally, we demonstrate the performance of our system on a multi-robot dynamic rendezvous task. Comment: 28 pages, 25 figures, *co-primary author

    Image-Based Rendering Of Real Environments For Virtual Reality


    Distributed scene reconstruction from multiple mobile platforms

    Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensor embedded in these devices motivates the application of 3D vision techniques to them for navigation and mapping purposes. In addition, distributed cheap sensing systems acting as a unitary entity have recently been identified as an efficient alternative to expensive mobile equipment. In this work we present an implementation of a visual reconstruction method, structure from motion (SfM), on a low-budget, omnidirectional mobile platform, and extend this method to distributed 3D scene reconstruction with several instances of such a platform. Our approach overcomes the challenges posed by the platform. The unprecedented levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure suitable feature matching populations for epipolar geometry estimation by means of a strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to deal with ill-conditioned inter-image configurations provoked by the omnidirectional motion. The feature tracking system developed efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than what is usually assumed for performing stable 3D reconstructions. The distributed reconstruction from multiple instances of SfM is attained by applying loop-closing techniques. Our multiple reconstruction system merges individual 3D structures and resolves the global scale problem with minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping stretches of sequences. The performance of this system is demonstrated in the 2-session case. The management of noise, the stability against ill-conditioned configurations and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed