2,396 research outputs found

    Real-Time Dense Stereo Matching With ELAS on FPGA Accelerated Embedded Devices

    Full text link
    For many applications in low-power real-time robotics, stereo cameras are the sensors of choice for depth perception as they are typically cheaper and more versatile than their active counterparts. Their biggest drawback, however, is that they do not directly sense depth maps; instead, these must be estimated through data-intensive processes. Therefore, appropriate algorithm selection plays an important role in achieving the desired performance characteristics. Motivated by applications in space and mobile robotics, we implement and evaluate a FPGA-accelerated adaptation of the ELAS algorithm. Despite offering one of the best trade-offs between efficiency and accuracy, ELAS has only been shown to run at 1.5-3 fps on a high-end CPU. Our system preserves all intriguing properties of the original algorithm, such as the slanted plane priors, but can achieve a frame rate of 47fps whilst consuming under 4W of power. Unlike previous FPGA based designs, we take advantage of both components on the CPU/FPGA System-on-Chip to showcase the strategy necessary to accelerate more complex and computationally diverse algorithms for such low power, real-time systems.Comment: 8 pages, 7 figures, 2 table

    Stereo Vision System Module for Low-Cost FPGAs for Autonomous Mobile Robots

    Get PDF
    Stereo vision uses two adjacent cameras to create a 3D image of the world. A depth map can be created by comparing the offset of the corresponding pixels from the two cameras. However, for real-time stereo vision, the image data needs to be processed at a reasonable frame rate. Real-time stereo vision allows for mobile robots to more easily navigate terrain and interact with objects by providing both the images from the cameras and the depth of the objects. Fortunately, the image processing can be parallelized in order to increase the processing speed. Field-programmable gate arrays (FPGAs) are highly parallelizable and lend themselves well to this problem. This thesis presents a stereo vision module which uses the Sum of Absolute Differences (SAD) algorithm. The SAD algorithm uses regions of pixels called windows to compare pixels to find matching pairs for determining depth. Two implementations are presented that utilize the SAD algorithm differently. The first implementation uses a 9x9 window for comparison and is able to process 4 pixels simultaneously. The second implementation uses a 7x7 window and processes 2 pixels simultaneously, but parallelizes each SAD algorithm for faster processing. The 9x9 implementation creates a better depth image with less noise, but the 7x7 implementation processes images at a higher frame rate. It has been shown through simulation that the 9x9 and 7x7 are able to process an image size of 640x480 at a frame rate of 15.73 and 29.32, respectively

    Implementation of a motion estimation algorithm for Intel FPGAs using OpenCL

    Get PDF
    Producción CientíficaMotion Estimation is one of the main tasks behind any video encoder. It is a compu- tationally costly task; therefore, it is usually delegated to specific or reconfigurable hardware, such as FPGAs. Over the years, multiple FPGA implementations have been developed, mainly using hardware description languages such as Verilog or VHDL. Since programming using hardware description languages is a complex task, it is desirable to use higher-level languages to develop FPGA applications.The aim of this work is to evaluate OpenCL, in terms of expressiveness, as a tool for devel- oping this kind of FPGA applications. To do so, we present and evaluate a parallel implementation of the Block Matching Motion Estimation process using OpenCL for Intel FPGAs, usable and tested on an Intel Stratix 10 FPGA. The implementa- tion efficiently processes Full HD frames completely inside the FPGA. In this work, we show the resource utilization when synthesizing the code on an Intel Stratix 10 FPGA, as well as a performance comparison with multiple CPU implementations with varying levels of optimization and vectorization capabilities. We also compare the proposed OpenCL implementation, in terms of resource utilization and perfor- mance, with estimations obtained from an equivalent VHDL implementation.Junta de Castilla y León - Consejería de Educación de la Proyecto PROPHET-2 (VA226P20)Ministerio de Economía, Industria y Competitividad: (PID2019- 104834 GB-I00) and European Regional Development Fund (ERDF) program: Project PCAS (TIN2017-88614-R)Ministerio de Ciencia e Innovación (PID2019-104184RB-I00 / AEI / 10.13039/501100011033)Xunta de Galicia y fondos FEDER de la UE (Centro de Investigación de Galicia acreditación 2019-2022, ref. ED431G 2019/01; Consolidation Program of Competitive Reference Groups, ref. ED431C 2021/30Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación y “European Union NextGenerationEU/PRTR” : (MCIN/ AEI/10.13039/501100011033) - grant TED2021-130367B-I00Publicación en abierto financiada por el Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), con cargo al Programa Operativo 2014ES16RFOP009 FEDER 2014-2020 DE CASTILLA Y LEÓN, Actuación:20007-CL - Apoyo Consorcio BUCL

    High-performance hardware accelerators for image processing in space applications

    Get PDF
    Mars is a hard place to reach. While there have been many notable success stories in getting probes to the Red Planet, the historical record is full of bad news. The success rate for actually landing on the Martian surface is even worse, roughly 30%. This low success rate must be mainly credited to the Mars environment characteristics. In the Mars atmosphere strong winds frequently breath. This phenomena usually modifies the lander descending trajectory diverging it from the target one. Moreover, the Mars surface is not the best place where performing a safe land. It is pitched by many and close craters and huge stones, and characterized by huge mountains and hills (e.g., Olympus Mons is 648 km in diameter and 27 km tall). For these reasons a mission failure due to a landing in huge craters, on big stones or on part of the surface characterized by a high slope is highly probable. In the last years, all space agencies have increased their research efforts in order to enhance the success rate of Mars missions. In particular, the two hottest research topics are: the active debris removal and the guided landing on Mars. The former aims at finding new methods to remove space debris exploiting unmanned spacecrafts. These must be able to autonomously: detect a debris, analyses it, in order to extract its characteristics in terms of weight, speed and dimension, and, eventually, rendezvous with it. In order to perform these tasks, the spacecraft must have high vision capabilities. In other words, it must be able to take pictures and process them with very complex image processing algorithms in order to detect, track and analyse the debris. The latter aims at increasing the landing point precision (i.e., landing ellipse) on Mars. Future space-missions will increasingly adopt Video Based Navigation systems to assist the entry, descent and landing (EDL) phase of space modules (e.g., spacecrafts), enhancing the precision of automatic EDL navigation systems. For instance, recent space exploration missions, e.g., Spirity, Oppurtunity, and Curiosity, made use of an EDL procedure aiming at following a fixed and precomputed descending trajectory to reach a precise landing point. This approach guarantees a maximum landing point precision of 20 km. By comparing this data with the Mars environment characteristics, it is possible to understand how the mission failure probability still remains really high. A very challenging problem is to design an autonomous-guided EDL system able to even more reduce the landing ellipse, guaranteeing to avoid the landing in dangerous area of Mars surface (e.g., huge craters or big stones) that could lead to the mission failure. The autonomous behaviour of the system is mandatory since a manual driven approach is not feasible due to the distance between Earth and Mars. Since this distance varies from 56 to 100 million of km approximately due to the orbit eccentricity, even if a signal transmission at the light speed could be possible, in the best case the transmission time would be around 31 minutes, exceeding so the overall duration of the EDL phase. In both applications, algorithms must guarantee self-adaptability to the environmental conditions. Since the Mars (and in general the space) harsh conditions are difficult to be predicted at design time, these algorithms must be able to automatically tune the internal parameters depending on the current conditions. Moreover, real-time performances are another key factor. Since a software implementation of these computational intensive tasks cannot reach the required performances, these algorithms must be accelerated via hardware. For this reasons, this thesis presents my research work done on advanced image processing algorithms for space applications and the associated hardware accelerators. My research activity has been focused on both the algorithm and their hardware implementations. Concerning the first aspect, I mainly focused my research effort to integrate self-adaptability features in the existing algorithms. While concerning the second, I studied and validated a methodology to efficiently develop, verify and validate hardware components aimed at accelerating video-based applications. This approach allowed me to develop and test high performance hardware accelerators that strongly overcome the performances of the actual state-of-the-art implementations. The thesis is organized in four main chapters. Chapter 2 starts with a brief introduction about the story of digital image processing. The main content of this chapter is the description of space missions in which digital image processing has a key role. A major effort has been spent on the missions in which my research activity has a substantial impact. In particular, for these missions, this chapter deeply analizes and evaluates the state-of-the-art approaches and algorithms. Chapter 3 analyzes and compares the two technologies used to implement high performances hardware accelerators, i.e., Application Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs). Thanks to this information the reader may understand the main reasons behind the decision of space agencies to exploit FPGAs instead of ASICs for high-performance hardware accelerators in space missions, even if FPGAs are more sensible to Single Event Upsets (i.e., transient error induced on hardware component by alpha particles and solar radiation in space). Moreover, this chapter deeply describes the three available space-grade FPGA technologies (i.e., One-time Programmable, Flash-based, and SRAM-based), and the main fault-mitigation techniques against SEUs that are mandatory for employing space-grade FPGAs in actual missions. Chapter 4 describes one of the main contribution of my research work: a library of high-performance hardware accelerators for image processing in space applications. The basic idea behind this library is to offer to designers a set of validated hardware components able to strongly speed up the basic image processing operations commonly used in an image processing chain. In other words, these components can be directly used as elementary building blocks to easily create a complex image processing system, without wasting time in the debug and validation phase. This library groups the proposed hardware accelerators in IP-core families. The components contained in a same family share the same provided functionality and input/output interface. This harmonization in the I/O interface enables to substitute, inside a complex image processing system, components of the same family without requiring modifications to the system communication infrastructure. In addition to the analysis of the internal architecture of the proposed components, another important aspect of this chapter is the methodology used to develop, verify and validate the proposed high performance image processing hardware accelerators. This methodology involves the usage of different programming and hardware description languages in order to support the designer from the algorithm modelling up to the hardware implementation and validation. Chapter 5 presents the proposed complex image processing systems. In particular, it exploits a set of actual case studies, associated with the most recent space agency needs, to show how the hardware accelerator components can be assembled to build a complex image processing system. In addition to the hardware accelerators contained in the library, the described complex system embeds innovative ad-hoc hardware components and software routines able to provide high performance and self-adaptable image processing functionalities. To prove the benefits of the proposed methodology, each case study is concluded with a comparison with the current state-of-the-art implementations, highlighting the benefits in terms of performances and self-adaptability to the environmental conditions
    corecore