RSGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems
Stereo depth estimation is used for many computer vision applications. Though
many popular methods strive solely for depth quality, for real-time mobile
applications (e.g. prosthetic glasses or micro-UAVs), speed and power
efficiency are equally, if not more, important. Many real-world systems rely on
Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but
power efficiency is hard to achieve with conventional hardware, making the use
of embedded devices such as FPGAs attractive for low-power applications.
However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so
most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA
context, the accuracy of SGM has been improved by More Global Matching (MGM),
which also helps tackle the streaking artifacts that afflict SGM. In this
paper, we propose a novel, resource-efficient method that is inspired by MGM's
techniques for improving depth quality, but which can be implemented to run in
real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI
and Middlebury), we show that in comparison to other real-time capable stereo
approaches, we can achieve a state-of-the-art balance between accuracy, power
efficiency and speed, making our approach highly desirable for use in real-time
systems with limited power.
Comment: Accepted in FPT 2018 as Oral presentation; 8 pages, 6 figures, 4 tables
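At the core of SGM is a one-dimensional cost-aggregation recurrence applied along several path directions; MGM differs in how neighbouring paths share information. The following is a minimal, hypothetical sketch of the single-direction recurrence in pure Python, with illustrative penalty values — it is not the paper's FPGA design:

```python
# Hypothetical sketch of SGM cost aggregation along one scan direction
# (left-to-right). Real SGM sums this over 8 path directions; MGM-style
# variants additionally mix information from neighbouring scanlines.

def aggregate_row(cost_row, P1=1.0, P2=8.0):
    """cost_row: list of per-pixel matching-cost lists, one entry per disparity.
    Returns the path-aggregated costs L_r for a left-to-right pass.
    P1 penalises small (+/-1) disparity changes, P2 penalises large jumps."""
    D = len(cost_row[0])
    agg = [list(cost_row[0])]                 # first pixel has no predecessor
    for x in range(1, len(cost_row)):
        prev = agg[-1]
        prev_min = min(prev)
        cur = []
        for d in range(D):
            # smoothness term: same disparity, +/-1 with penalty P1, jump with P2
            best = min(
                prev[d],
                (prev[d - 1] + P1) if d > 0 else float("inf"),
                (prev[d + 1] + P1) if d < D - 1 else float("inf"),
                prev_min + P2,
            )
            # subtracting prev_min keeps the accumulated values bounded
            cur.append(cost_row[x][d] + best - prev_min)
        agg.append(cur)
    return agg
```

The `prev_min` subtraction is what keeps the accumulator word width small, which is one reason the recurrence maps well onto fixed-point FPGA pipelines.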
Layered Interpretation of Street View Images
We propose a layered street view model to encode both depth and semantic
information on street view images for autonomous driving. Recently, stixels,
stix-mantics, and tiered scene labeling methods have been proposed to model
street view images. We propose a 4-layer street view model, a compact
representation over the recently proposed stix-mantics model. Our layers encode
semantic classes like ground, pedestrians, vehicles, buildings, and sky in
addition to the depths. The only input to our algorithm is a pair of stereo
images. We use a deep neural network to extract the appearance features for
semantic classes. We use a simple and efficient inference algorithm to
jointly estimate both semantic classes and layered depth values. Our method
outperforms other competing approaches in Daimler urban scene segmentation
dataset. Our algorithm is massively parallelizable, allowing a GPU
implementation with a processing speed of about 9 fps.
Comment: The paper will be presented at the 2015 Robotics: Science and Systems Conference (RSS)
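The 4-layer column representation can be pictured with a small illustrative data structure — the class names follow the abstract, but the types and values below are assumptions, not taken from the paper:

```python
# Hypothetical sketch of the 4-layer street view model: each image column
# is a short stack of layers, each carrying a semantic label and a depth.
from dataclasses import dataclass

@dataclass
class Layer:
    label: str      # e.g. "ground", "pedestrian", "vehicle", "building", "sky"
    depth_m: float  # representative depth of the layer, in metres

# One example column, ordered from the bottom of the image upwards.
column = [
    Layer("ground", 5.0),
    Layer("vehicle", 12.0),
    Layer("building", 30.0),
    Layer("sky", float("inf")),
]
```

Because every column is such a small, independent stack, per-column inference over layers is what makes the method massively parallelizable on a GPU.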
Live Demonstration: On the distance estimation of moving targets with a Stereo-Vision AER system
Distance estimation is one of the most important goals of a digital
stereoscopic vision system. It is equally important in an AER system, but
there it cannot be computed as accurately as we would like. This
demonstration shows a first approximation in this field, using a disparity
algorithm between the two retinas. The system produces a distance estimate
for a moving object; more specifically, a qualitative estimation. Taking
into account the features of the stereo vision system, the prior positioning
of the retinas, and the essential Hold&Fire building block, we are able to
correlate the spike rate of the disparity with the distance.
Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
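The spike-rate-to-distance correlation only needs to be qualitative, so it can be sketched as a simple binning of the disparity spike rate. The thresholds and the assumption that nearer objects produce higher disparity activity are illustrative, not values from the demonstration:

```python
# Hypothetical sketch: qualitative distance from the disparity spike rate,
# assuming higher disparity spike activity for nearer moving objects.
# Threshold values are illustrative placeholders.

def qualitative_distance(spike_rate_hz, near_thresh=800.0, far_thresh=200.0):
    """Map a disparity spike rate (Hz) to a coarse distance label."""
    if spike_rate_hz >= near_thresh:
        return "near"
    if spike_rate_hz <= far_thresh:
        return "far"
    return "mid"
```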
High-Performance and Tunable Stereo Reconstruction
Traditional stereo algorithms have focused their efforts on reconstruction
quality and have largely avoided prioritizing run-time performance. Robots,
on the other hand, require quick maneuverability and effective computation to
observe their immediate environment and perform tasks within it. In this work, we
propose a high-performance and tunable stereo disparity estimation method, with
a peak frame-rate of 120 Hz (VGA resolution, on a single CPU thread), that can
potentially enable robots to quickly reconstruct their immediate surroundings
and maneuver at high-speeds. Our key contribution is a disparity estimation
algorithm that iteratively approximates the scene depth via a piece-wise planar
mesh from stereo imagery, with a fast depth validation step for semi-dense
reconstruction. The mesh is initially seeded with sparsely matched keypoints,
and is recursively tessellated and refined as needed (via a resampling stage),
to provide the desired stereo disparity accuracy. The inherent simplicity and
speed of our approach, with the ability to tune it to a desired reconstruction
quality and runtime performance makes it a compelling solution for applications
in high-speed vehicles.
Comment: Accepted to the International Conference on Robotics and Automation (ICRA) 2016; 8 pages, 5 figures
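The piecewise-planar assumption — disparity varies linearly inside each mesh triangle — can be sketched as barycentric interpolation over a triangle of matched keypoints. This is a simplified illustration of the idea, not the authors' implementation:

```python
# Hypothetical sketch: inside a triangle of sparsely matched keypoints,
# disparity is interpolated as a plane via barycentric weights.

def barycentric(p, a, b, c):
    """Barycentric weights of point p in triangle (a, b, c); 2-D tuples."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w1 = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w2 = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w1, w2, 1.0 - w1 - w2

def planar_disparity(p, tri_pts, tri_disp):
    """Interpolate disparity at p from the three vertex disparities."""
    weights = barycentric(p, *tri_pts)
    return sum(w * d for w, d in zip(weights, tri_disp))
```

Where the planar prediction disagrees with the measured matching cost, the method would re-seed and re-tessellate that region — the "depth validation" and resampling stage the abstract describes.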
Real-time FPGA implementation of the Semi-Global Matching stereo vision algorithm for a 4K/UHD video stream
In this paper, we propose a real-time FPGA implementation of the Semi-Global
Matching (SGM) stereo vision algorithm. The designed module supports a 4K/Ultra
HD (3840 x 2160 pixels @ 30 frames per second) video stream in a 4 pixel per
clock (ppc) format and a 64-pixel disparity range. The baseline SGM
implementation had to be modified to process pixels in the 4 ppc format and meet
the timing constraints; however, our version provides results comparable to the
original design. The solution has been successfully evaluated on the Xilinx VC707
development board with a Virtex-7 FPGA device.
Comment: Paper accepted for the DASIP 2023 workshop in conjunction with HiPEAC
202
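The 4 ppc design point follows from simple arithmetic on the figures in the abstract: 3840 × 2160 pixels at 30 fps is about 248.8 Mpixel/s, so a datapath consuming 4 pixels per clock needs only about 62.2 MHz (active pixels only; blanking intervals are ignored in this back-of-envelope check):

```python
# Back-of-envelope check of the 4 ppc design point for 4K/UHD @ 30 fps.
pixels_per_frame = 3840 * 2160            # 8,294,400 active pixels
pixel_rate = pixels_per_frame * 30        # active pixels per second
clock_hz = pixel_rate / 4                 # 4 pixels consumed per clock cycle
```

At roughly 62 MHz, the required clock is comfortably within reach of a Virtex-7 pipeline, which is what makes the 4 ppc reformulation of SGM worthwhile despite the modifications it forces.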