    R^3SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

    Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is hard to achieve with conventional hardware, making the use of embedded devices such as FPGAs attractive for low-power applications. However, the full SGM algorithm is ill-suited to deployment on FPGAs, so most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA context, the accuracy of SGM has been improved by More Global Matching (MGM), which also helps tackle the streaking artifacts that afflict SGM. In this paper, we propose a novel, resource-efficient method that is inspired by MGM's techniques for improving depth quality, but which can be implemented to run in real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI and Middlebury), we show that in comparison to other real-time capable stereo approaches, we can achieve a state-of-the-art balance between accuracy, power efficiency and speed, making our approach highly desirable for use in real-time systems with limited power. Comment: Accepted at FPT 2018 as an oral presentation; 8 pages, 6 figures, 4 tables.
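
    For readers unfamiliar with SGM's cost aggregation, the sketch below shows a single left-to-right aggregation pass over a matching cost volume with the usual P1/P2 smoothness penalties. It is a minimal NumPy illustration of generic single-path SGM, not the paper's raster-respecting FPGA design; the function name and default penalty values are assumptions.

```python
import numpy as np

def aggregate_path_left_to_right(cost, P1=10, P2=120):
    """Minimal single-direction SGM cost aggregation (left-to-right).

    cost: (H, W, D) per-pixel matching cost volume.
    Returns the aggregated cost along the horizontal path. Illustrative
    only; a full SGM aggregates several path directions and sums them.
    """
    H, W, D = cost.shape
    agg = np.empty_like(cost, dtype=np.float64)
    agg[:, 0, :] = cost[:, 0, :]
    for x in range(1, W):
        prev = agg[:, x - 1, :]                      # (H, D)
        prev_min = prev.min(axis=1, keepdims=True)   # (H, 1)
        # Candidate transitions: same disparity, +/-1 disparity (P1), or a jump (P2).
        same = prev
        up = np.roll(prev, 1, axis=1) + P1
        up[:, 0] = np.inf                            # no d-1 neighbour at d = 0
        down = np.roll(prev, -1, axis=1) + P1
        down[:, -1] = np.inf                         # no d+1 neighbour at d = D-1
        jump = prev_min + P2
        best = np.minimum(np.minimum(same, up), np.minimum(down, jump))
        # Subtracting prev_min is the standard normalisation that keeps values bounded.
        agg[:, x, :] = cost[:, x, :] + best - prev_min
    return agg
```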

    Layered Interpretation of Street View Images

    We propose a layered street view model to encode both depth and semantic information in street view images for autonomous driving. Recently, stixels, stix-mantics, and tiered scene labeling methods have been proposed to model street view images. We propose a 4-layer street view model, a compact representation compared with the recently proposed stix-mantics model. Our layers encode semantic classes such as ground, pedestrians, vehicles, buildings, and sky, in addition to depth. The only input to our algorithm is a pair of stereo images. We use a deep neural network to extract appearance features for the semantic classes, and a simple, efficient inference algorithm to jointly estimate the semantic classes and the layered depth values. Our method outperforms other competing approaches on the Daimler urban scene segmentation dataset. Our algorithm is massively parallelizable, allowing a GPU implementation with a processing speed of about 9 fps. Comment: The paper will be presented at the 2015 Robotics: Science and Systems Conference (RSS).
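
    To illustrate the tiered/layered representation the abstract refers to, the toy sketch below picks, for each image column, the three boundary rows that split the column into sky, building, object and ground segments with maximal per-pixel class scores. It is a deliberately brute-force illustration of column-wise tiered labeling under an assumed channel ordering, not the paper's joint depth and semantic inference; all names here are hypothetical.

```python
import numpy as np

def tiered_column_labels(scores):
    """Toy column-wise tiered labeling with a fixed vertical order
    (sky, building, object, ground from top to bottom).

    scores: (H, W, 4) per-pixel class scores, channels assumed to be
    [ground, object, building, sky], where "object" stands in for
    pedestrians/vehicles. Brute-force for clarity, not efficiency.
    """
    H, W, _ = scores.shape
    # Prefix sums over rows so any vertical segment score is O(1).
    prefix = np.concatenate([np.zeros((1, W, 4)), np.cumsum(scores, axis=0)], axis=0)

    def seg(cls, top, bottom, col):      # score of rows [top, bottom) for class cls
        return prefix[bottom, col, cls] - prefix[top, col, cls]

    boundaries = np.zeros((W, 3), dtype=int)
    for col in range(W):
        best, best_b = -np.inf, (0, 0, 0)
        for b1 in range(H + 1):                  # sky / building boundary
            for b2 in range(b1, H + 1):          # building / object boundary
                for b3 in range(b2, H + 1):      # object / ground boundary
                    s = (seg(3, 0, b1, col) + seg(2, b1, b2, col)
                         + seg(1, b2, b3, col) + seg(0, b3, H, col))
                    if s > best:
                        best, best_b = s, (b1, b2, b3)
        boundaries[col] = best_b
    return boundaries
```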

    Live Demonstration: On the distance estimation of moving targets with a Stereo-Vision AER system

    Distance calculation is one of the most important goals in a digital stereoscopic vision system. It remains important in an AER (Address-Event Representation) system, but there the distance cannot be computed as accurately as we would like. This demonstration shows a first approximation in this field, using a disparity algorithm between the two retinas. The system can produce a distance estimate for a moving object; more specifically, a qualitative estimation. Taking into account the features of the stereo vision system, the prior positioning of the retinas and the key Hold&Fire building block, we are able to correlate the spike rate of the disparity with the distance. Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
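
    For context, the relation that ties disparity to distance in any calibrated stereo rig is the standard triangulation formula Z = f·B/d; the demo's correlation between disparity spike rate and distance rests on this relation. The snippet below is a minimal sketch with illustrative parameter names, not code from the AER system.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Standard stereo triangulation: Z = f * B / d.

    The AER demo only produces a qualitative estimate from the disparity
    spike rate; this just shows the underlying disparity-to-distance
    relation. Parameter names are illustrative, not from the paper.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example: 500 px focal length, 10 cm baseline, 25 px disparity -> 2 m.
print(depth_from_disparity(25, 500, 0.10))
```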

    High-Performance and Tunable Stereo Reconstruction

    Traditional stereo algorithms have focused their efforts on reconstruction quality and have largely avoided prioritizing run-time performance. Robots, on the other hand, require quick maneuverability and efficient computation to observe their immediate environment and perform tasks within it. In this work, we propose a high-performance and tunable stereo disparity estimation method, with a peak frame rate of 120 Hz (VGA resolution, on a single CPU thread), that can potentially enable robots to quickly reconstruct their immediate surroundings and maneuver at high speeds. Our key contribution is a disparity estimation algorithm that iteratively approximates the scene depth via a piecewise planar mesh from stereo imagery, with a fast depth validation step for semi-dense reconstruction. The mesh is initially seeded with sparsely matched keypoints, and is recursively tessellated and refined as needed (via a resampling stage) to provide the desired stereo disparity accuracy. The inherent simplicity and speed of our approach, together with the ability to tune it to a desired reconstruction quality and runtime performance, make it a compelling solution for applications in high-speed vehicles. Comment: Accepted to the International Conference on Robotics and Automation (ICRA) 2016; 8 pages, 5 figures.
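
    To make the piecewise planar seeding step concrete, the sketch below interpolates a dense disparity map from sparse keypoint matches over a Delaunay triangulation (via SciPy's linear interpolator). It shows only the initial mesh/interpolation idea under assumed names; the paper's method additionally validates the interpolated depths and recursively refines the mesh where the planar assumption fails.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

def piecewise_planar_disparity(keypoints_xy, keypoint_disp, height, width):
    """Interpolate a dense disparity map from sparse keypoint matches by
    treating the scene as piecewise planar over a triangulation.

    keypoints_xy: (N, 2) matched keypoint locations in (x, y) pixels.
    keypoint_disp: (N,) disparities at those keypoints.
    Sketch of the seeding/interpolation step only; names are illustrative.
    """
    interp = LinearNDInterpolator(keypoints_xy, keypoint_disp)  # Delaunay-based
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    disp = interp(np.column_stack([xs.ravel(), ys.ravel()]))
    return disp.reshape(height, width)  # NaN outside the triangulated hull
```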

    Real-time FPGA implementation of the Semi-Global Matching stereo vision algorithm for a 4K/UHD video stream

    In this paper, we propose a real-time FPGA implementation of the Semi-Global Matching (SGM) stereo vision algorithm. The designed module supports a 4K/Ultra HD (3840 x 2160 pixels @ 30 frames per second) video stream in a 4 pixels per clock (ppc) format and a 64-pixel disparity range. The baseline SGM implementation had to be modified to process pixels in the 4 ppc format and meet the timing constraints; however, our version provides results comparable to the original design. The solution has been positively evaluated on the Xilinx VC707 development board with a Virtex-7 FPGA device. Comment: Paper accepted for the DASIP 2023 workshop in conjunction with HiPEAC 2023.
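
    A quick back-of-the-envelope check shows why a multi-pixel-per-clock format is needed for 4K at 30 fps on an FPGA; the clock frequency below is an assumed figure for illustration, not one reported in the paper.

```python
# Throughput check for processing a 4K/UHD stream at 30 fps.
# The ~150 MHz clock figure is an assumption for illustration only.
width, height, fps = 3840, 2160, 30
pixels_per_second = width * height * fps          # ~248.8 Mpixel/s
clock_hz = 150e6                                   # assumed FPGA clock
ppc_required = pixels_per_second / clock_hz        # ~1.66 before blanking
print(f"{pixels_per_second / 1e6:.1f} Mpixel/s -> "
      f"{ppc_required:.2f} pixels/clock at 150 MHz")
# With video blanking and design margin, processing 4 pixels per clock
# keeps a 64-disparity SGM pipeline comfortably within real time.
```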