Pushbroom Stereo for High-Speed Navigation in Cluttered Environments
We present a novel stereo vision algorithm that is capable of obstacle
detection on a mobile-CPU processor at 120 frames per second. Our system
performs a subset of standard block-matching stereo processing, searching only
for obstacles at a single depth. By using an onboard IMU and state-estimator,
we can recover the position of obstacles at all other depths, building and
updating a full depth-map at framerate.
Here, we describe both the algorithm and our implementation on a high-speed,
small UAV, flying at over 20 MPH (9 m/s) close to obstacles. The system
requires no external sensing or computation and is, to the best of our
knowledge, the first high-framerate stereo detection system running onboard a
small UAV.
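The single-depth search described above can be sketched in a few lines: instead of scanning every disparity, block-matching is evaluated at one fixed disparity, and a low matching cost flags an obstacle at that depth. The function names, threshold, and block size below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def box_sum(a, k):
    """Sum of `a` over every k-by-k window (integral-image trick)."""
    c = np.pad(np.cumsum(np.cumsum(a, axis=0), axis=1), ((1, 0), (1, 0)))
    return c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]

def obstacles_at_single_depth(left, right, d, block=5, thresh=50):
    """Flag blocks that match well at one fixed disparity d.

    A minimal sketch of the single-depth search: rather than scanning all
    disparities, only disparity d is tested, so a low block SAD means
    'something sits at this depth'.  Hypothetical names and parameters.
    """
    h, w = left.shape
    shifted = np.full_like(right, 255)        # out-of-view columns never match
    shifted[:, d:] = right[:, :w - d]         # align right image at disparity d
    sad = box_sum(np.abs(left.astype(np.int32) - shifted.astype(np.int32)), block)
    return sad < thresh                       # True where an obstacle is at depth d
```

Because only one disparity is tested per frame, the per-frame cost drops by the full disparity-range factor, which is what makes mobile-CPU frame rates plausible; the IMU-based propagation of past detections to other depths is not shown here.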
Incremental disparity space image computation for automotive applications
Generating a depth map from a pair of stereo images is a challenging task, often further complicated by the restrictions imposed by the target application; in the automotive field, for example, real-time environment reconstruction is essential for safety and autonomous-navigation systems, requiring reduced processing times, often at the expense of a somewhat limited degree of accuracy in the results. Nevertheless, a priori knowledge of the algorithm's intended use can also be exploited to improve its performance, both in precision and in computational burden. This paper presents three different approaches to incremental Disparity Space Image (DSI) computation, which leverage the properties of a stereo-vision system installed on a vehicle to produce accurate depth maps at sustained frame rates on commodity hardware.
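A Disparity Space Image stores the matching cost C(y, x, d) for every pixel and candidate disparity. One classic incremental scheme, sketched below under stated assumptions (the paper describes three variants, none of which is reproduced here), obtains the windowed cost at column x+1 from the cost at x by adding the entering column of absolute differences and dropping the leaving one, so each cell costs O(1) instead of O(window):

```python
import numpy as np

def dsi_incremental_rows(left, right, dmax, win=5):
    """DSI with incremental horizontal cost aggregation.

    For each disparity slice, the running sum over `win` columns is the
    vectorised form of 'add the entering column, subtract the leaving one'.
    Layout and names are illustrative, not the paper's implementation.
    """
    h, w = left.shape
    L = left.astype(np.int32)
    R = right.astype(np.int32)
    dsi = np.zeros((h, w, dmax + 1), dtype=np.int32)
    for d in range(dmax + 1):
        ad = np.full((h, w), 255, dtype=np.int32)     # invalid border: high cost
        ad[:, d:] = np.abs(L[:, d:] - R[:, :w - d])   # per-pixel cost at disparity d
        run = np.cumsum(ad, axis=1)                   # prefix sums along the row
        run[:, win:] = run[:, win:] - run[:, :-win]   # sliding-window sum in O(1)/cell
        dsi[:, :, d] = run
    return dsi
```

A winner-take-all depth map is then `dsi.argmin(axis=2)`; vertical aggregation could be made incremental the same way.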
Semi-Global Stereo Matching with Surface Orientation Priors
Semi-Global Matching (SGM) is a widely-used efficient stereo matching
technique. It works well for textured scenes, but fails on untextured slanted
surfaces due to its fronto-parallel smoothness assumption. To remedy this
problem, we propose a simple extension, termed SGM-P, to utilize precomputed
surface orientation priors. Such priors favor different surface slants in
different 2D image regions or 3D scene regions and can be derived in various
ways. In this paper we evaluate plane orientation priors derived from stereo
matching at a coarser resolution and show that such priors can yield
significant performance gains for difficult weakly-textured scenes. We also
explore surface normal priors derived from Manhattan-world assumptions, and we
analyze the potential performance gains using oracle priors derived from
ground-truth data. SGM-P only adds a minor computational overhead to SGM and is
an attractive alternative to more complex methods employing higher-order
smoothness terms.
Comment: extended draft of 3DV 2017 (spotlight) paper
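One illustrative reading of the SGM-P idea, sketched below for a single left-to-right scanline pass: standard SGM charges P1/P2 for any disparity change between neighbouring pixels, which penalises slanted surfaces; here the zero-penalty transition is shifted by a per-pixel prior disparity gradient, so the favoured slant follows the prior. This is a simplified sketch under stated assumptions, not the authors' implementation.

```python
import numpy as np

def sgm_1d_with_prior(cost, slant, P1=1.0, P2=8.0):
    """Left-to-right SGM aggregation on one scanline with a slant prior.

    cost:  (W, D) matching costs for W pixels and D disparities.
    slant: (W,) integer prior disparity gradient (0 = fronto-parallel SGM).
    The previous pixel's aggregated costs are shifted by slant[x] before
    the usual P1/P2 penalties are applied.
    """
    W, D = cost.shape
    agg = np.zeros_like(cost, dtype=float)
    agg[0] = cost[0]
    for x in range(1, W):
        s = int(slant[x])                       # prior-expected disparity change
        shifted = np.full(D, np.inf)
        lo, hi = max(0, s), min(D, D + s)
        shifted[lo:hi] = agg[x - 1][lo - s:hi - s]
        plus = np.full(D, np.inf); plus[1:] = shifted[:-1]    # +1 deviation
        minus = np.full(D, np.inf); minus[:-1] = shifted[1:]  # -1 deviation
        best = shifted[np.isfinite(shifted)].min()
        agg[x] = cost[x] + np.minimum.reduce(
            [shifted, plus + P1, minus + P1, np.full(D, best + P2)]) - best
    return agg
```

With `slant` all zeros this reduces to ordinary 1D SGM aggregation; a nonzero prior lets a perfectly slanted surface accumulate zero smoothness penalty.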
Computer vision as a navigation aid for visually impaired
Accomplishing many everyday activities presents a significant challenge to visually impaired people. One such challenge is navigation in an unsupervised environment, which might be well addressed with technological solutions. Recent technological advances have made computational devices with significant processing power a widely affordable commodity. In parallel, advances in computer-vision-based systems allow scene reconstruction in real time, paving the way for realistic obstacle-detection systems. As the field of robotics is quickly advancing with the help of such technologies, we can use it to help visually impaired people. In this work we propose a computer-vision-based system for navigation of the visually impaired in an unsupervised environment. The system is composed of an environment-analysis sub-system and a navigation sub-system. A direct monocular SLAM is used to estimate a model of the user's environment. A new algorithm for analysis of this model and detection of potential obstacles is proposed. Based on the obstacle-detection result, the system selects the most appropriate direction of motion to minimize the probability of collision. This information is provided to the user by a sound-based interface: a stereo headset. The direction of motion is conveyed by placing sound sources in the virtual three-dimensional space that the user perceives through the headset. The system was evaluated in a real environment. The results show that subjects wearing the system were six times less likely to collide with obstacles in their environment than subjects not wearing it. Albeit basic, the experiments indicate the great potential of the proposed system for navigation in realistic environments.
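The sound-based interface can be illustrated with a minimal stereo cue: the abstract describes true virtual 3D sound sources, but the simplest stand-in, sketched here as an assumption rather than the paper's method, maps the chosen walking direction to left/right gains with a constant-power pan:

```python
import math

def stereo_pan(azimuth_deg):
    """Constant-power pan: map a direction to (left_gain, right_gain).

    azimuth_deg: -90 = hard left, 0 = straight ahead, +90 = hard right.
    A simplified stand-in for the paper's virtual 3D sound placement.
    """
    theta = (azimuth_deg + 90.0) / 180.0 * math.pi / 2.0   # map to [0, pi/2]
    return math.cos(theta), math.sin(theta)                 # gains sum to unit power
```

The constant-power law (cos²+sin² = 1) keeps perceived loudness stable as the cue sweeps across directions, so only the direction, not the volume, changes as the system steers the user.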
Learning Deployable Navigation Policies at Kilometer Scale from a Single Traversal
Model-free reinforcement learning has recently been shown to be effective at
learning navigation policies from complex image input. However, these
algorithms tend to require large amounts of interaction with the environment,
which can be prohibitively costly to obtain on robots in the real world. We
present an approach for efficiently learning goal-directed navigation policies
on a mobile robot, from only a single coverage traversal of recorded data. The
navigation agent learns an effective policy over a diverse action space in a
large heterogeneous environment consisting of more than 2km of travel, through
buildings and outdoor regions that collectively exhibit large variations in
visual appearance, self-similarity, and connectivity. We compare pretrained
visual encoders that enable precomputation of visual embeddings to achieve a
throughput of tens of thousands of transitions per second at training time on a
commodity desktop computer, allowing agents to learn from millions of
trajectories of experience in a matter of hours. We propose multiple forms of
computationally efficient stochastic augmentation to enable the learned policy
to generalise beyond these precomputed embeddings, and demonstrate successful
deployment of the learned policy on the real robot without fine tuning, despite
environmental appearance differences at test time. The dataset and code
required to reproduce these results and apply the technique to other datasets
and robots are made publicly available at rl-navigation.github.io/deployable.