4 research outputs found
Acceleration of stereo-matching on multi-core CPU and GPU
This paper presents an accelerated version of a
dense stereo-correspondence algorithm for two different parallelism-enabled
architectures, multi-core CPU and GPU. The
algorithm is part of the vision system developed for a binocular
robot-head in the context of the CloPeMa research project.
This research project focuses on the conception of a new clothes-folding
robot with real-time and high-resolution requirements
for the vision system. The performance analysis shows that
the parallelised stereo-matching algorithm has been significantly
accelerated, achieving 12x and 176x speed-ups respectively
for multi-core CPU and GPU, compared with a non-SIMD single-thread
CPU implementation. To analyse the origin of the speed-up and gain
a deeper understanding of the choice of the optimal hardware,
the algorithm was broken into key sub-tasks and the performance
was tested on four different hardware architectures.
A pilot study on aeronautical surveillance system for drone delivery using heterogeneous software defined radio framework.
This paper presents a heterogeneous computing framework to interface single-board computers (SBCs) to (i) distinct types of computing nodes, (ii) distinct operating systems, and (iii) distinct software applications for an aeronautical surveillance system for drone delivery. The implementation platform selected is the BeagleBone Black (BBB) running the Ubuntu 14 Linux operating system (OS). The computing nodes the BBB interfaces to are: (i) a personal laptop (MacBook Pro), (ii) a virtual machine, and (iii) two servers with distinct OSs. The software applications the BBB interfaces to are: (i) Gqrx, (ii) GNURadio, (iii) Google Earth, (iv) Systems Tool Kit (STK), and (v) Matlab. This heterogeneous computing framework, with the potential for incorporating specialized processing and networking capabilities, allows scalability for system integration with existing surveillance systems for manned aircraft. The proposed system successfully decodes the location of aircraft in real time.
Implementation of a motion estimation algorithm for Intel FPGAs using OpenCL
Motion Estimation is one of the main tasks behind any video encoder. It is a
computationally costly task; therefore, it is usually delegated to specific or
reconfigurable hardware, such as FPGAs. Over the years, multiple FPGA
implementations have been developed, mainly using hardware description languages
such as Verilog or VHDL. Since programming using hardware description languages
is a complex task, it is desirable to use higher-level languages to develop FPGA
applications. The aim of this work is to evaluate OpenCL, in terms of
expressiveness, as a tool for developing this kind of FPGA application. To do so,
we present and evaluate a parallel implementation of the Block Matching Motion
Estimation process using OpenCL for Intel FPGAs, usable and tested on an Intel
Stratix 10 FPGA. The implementation efficiently processes Full HD frames
completely inside the FPGA. In this work, we show the resource utilization when
synthesizing the code on an Intel Stratix 10 FPGA, as well as a performance
comparison with multiple CPU implementations with varying levels of optimization
and vectorization capabilities. We also compare the proposed OpenCL
implementation, in terms of resource utilization and performance, with
estimations obtained from an equivalent VHDL implementation.
Funding: Junta de Castilla y León, Consejería de Educación: Project PROPHET-2 (VA226P20). Ministerio de Economía, Industria y Competitividad (PID2019- 104834 GB-I00) and European Regional Development Fund (ERDF) program: Project PCAS (TIN2017-88614-R). Ministerio de Ciencia e Innovación (PID2019-104184RB-I00 / AEI / 10.13039/501100011033). Xunta de Galicia and EU FEDER funds (Centro de Investigación de Galicia accreditation 2019-2022, ref. ED431G 2019/01; Consolidation Program of Competitive Reference Groups, ref. ED431C 2021/30). Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación and "European Union NextGenerationEU/PRTR" (MCIN/ AEI/10.13039/501100011033), grant TED2021-130367B-I00. Open-access publication funded by the Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), under the Operational Programme 2014ES16RFOP009 FEDER 2014-2020 DE CASTILLA Y LEÓN, Action: 20007-CL - Apoyo Consorcio BUCL
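For readers unfamiliar with the Block Matching Motion Estimation process named in the abstract above, the following is a minimal Python sketch of one common variant: an exhaustive (full) search that minimises the sum of absolute differences (SAD). It is an illustration only and is not taken from the paper. The function names `sad` and `full_search`, the block size, the search radius, and the synthetic 8x8 frames are all assumptions chosen for the example; the paper's OpenCL kernel for Intel FPGAs may use a different cost function or search strategy.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, radius):
    """Return (dx, dy, cost) of the lowest-SAD match within +/- radius pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Hypothetical test data: an 8x8 reference frame and a current frame whose
# content is the reference shifted by 2 pixels in x and 1 pixel in y.
SIZE = 8
ref = [[8 * y + x for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[y + 1][x + 2] if y + 1 < SIZE and x + 2 < SIZE else 0
        for x in range(SIZE)] for y in range(SIZE)]

best = full_search(cur, ref, bx=2, by=2, n=2, radius=2)
print(best)  # -> (2, 1, 0)
```

Each candidate displacement is evaluated independently of the others, which is what makes this kind of search a natural fit for parallel hardware such as the FPGAs discussed in the abstract.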