13,262 research outputs found
Pushbroom Stereo for High-Speed Navigation in Cluttered Environments
We present a novel stereo vision algorithm that is capable of obstacle
detection on a mobile-CPU processor at 120 frames per second. Our system
performs a subset of standard block-matching stereo processing, searching only
for obstacles at a single depth. By using an onboard IMU and state-estimator,
we can recover the position of obstacles at all other depths, building and
updating a full depth-map at framerate.
Here, we describe both the algorithm and our implementation on a high-speed,
small UAV, flying at over 20 MPH (9 m/s) close to obstacles. The system
requires no external sensing or computation and is, to the best of our
knowledge, the first high-framerate stereo detection system running onboard a
small UAV
Improved depth recovery in consumer depth cameras via disparity space fusion within cross-spectral stereo.
We address the issue of improving depth coverage in consumer depth cameras based on the combined use of cross-spectral stereo and near infra-red structured light sensing. Specifically we show that fusion of disparity over these modalities, within the disparity space image, prior to disparity optimization facilitates the recovery of scene depth information in regions where structured light sensing fails. We show that this joint approach, leveraging disparity information from both structured light and cross-spectral sensing, facilitates the joint recovery of global scene depth comprising both texture-less object depth, where conventional stereo otherwise fails, and highly reflective object depth, where structured light (and similar) active sensing commonly fails. The proposed solution is illustrated using dense gradient feature matching and shown to outperform prior approaches that use late-stage fused cross-spectral stereo depth as a facet of improved sensing for consumer depth cameras
Cross-Scale Cost Aggregation for Stereo Matching
Human beings process stereoscopic correspondence across multiple scales.
However, this bio-inspiration is ignored by state-of-the-art cost aggregation
methods for dense stereo correspondence. In this paper, a generic cross-scale
cost aggregation framework is proposed to allow multi-scale interaction in cost
aggregation. We firstly reformulate cost aggregation from a unified
optimization perspective and show that different cost aggregation methods
essentially differ in the choices of similarity kernels. Then, an inter-scale
regularizer is introduced into optimization and solving this new optimization
problem leads to the proposed framework. Since the regularization term is
independent of the similarity kernel, various cost aggregation methods can be
integrated into the proposed general framework. We show that the cross-scale
framework is important as it effectively and efficiently expands
state-of-the-art cost aggregation methods and leads to significant
improvements, when evaluated on Middlebury, KITTI and New Tsukuba datasets.Comment: To Appear in 2013 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). 2014 (poster, 29.88%
A Framework for SAR-Optical Stereogrammetry over Urban Areas
Currently, numerous remote sensing satellites provide a huge volume of
diverse earth observation data. As these data show different features regarding
resolution, accuracy, coverage, and spectral imaging ability, fusion techniques
are required to integrate the different properties of each sensor and produce
useful information. For example, synthetic aperture radar (SAR) data can be
fused with optical imagery to produce 3D information using stereogrammetric
methods. The main focus of this study is to investigate the possibility of
applying a stereogrammetry pipeline to very-high-resolution (VHR) SAR-optical
image pairs. For this purpose, the applicability of semi-global matching is
investigated in this unconventional multi-sensor setting. To support the image
matching by reducing the search space and accelerating the identification of
correct, reliable matches, the possibility of establishing an epipolarity
constraint for VHR SAR-optical image pairs is investigated as well. In
addition, it is shown that the absolute geolocation accuracy of VHR optical
imagery with respect to VHR SAR imagery such as provided by TerraSAR-X can be
improved by a multi-sensor block adjustment formulation based on rational
polynomial coefficients. Finally, the feasibility of generating point clouds
with a median accuracy of about 2m is demonstrated and confirms the potential
of 3D reconstruction from SAR-optical image pairs over urban areas.Comment: This is the pre-acceptance version, to read the final version, please
go to ISPRS Journal of Photogrammetry and Remote Sensing on ScienceDirec
Learned Multi-Patch Similarity
Estimating a depth map from multiple views of a scene is a fundamental task
in computer vision. As soon as more than two viewpoints are available, one
faces the very basic question how to measure similarity across >2 image
patches. Surprisingly, no direct solution exists, instead it is common to fall
back to more or less robust averaging of two-view similarities. Encouraged by
the success of machine learning, and in particular convolutional neural
networks, we propose to learn a matching function which directly maps multiple
image patches to a scalar similarity score. Experiments on several multi-view
datasets demonstrate that this approach has advantages over methods based on
pairwise patch similarity.Comment: 10 pages, 7 figures, Accepted at ICCV 201
- …