
66 research outputs found

Workload distribution and balancing in FPGAs and CPUs with OpenCL and TBB

Author: Asenjo Rafael
Navarro Angeles
Nunez-Yanez Jose
Rodriguez Andres
Publication venue: 'IOS Press'
Publication date: 01/09/2015

Explore Bristol Research

Workload Partitioning Strategy for Improved Parallelism on FPGA-CPU Heterogeneous Chips

Author: Amiri Sam
Asenjo Rafael
Hosseinabady Mohammad
Navarro Angeles
Nunez-Yanez Jose
Rodríguez Andrés
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2018

Crossref

Coventry University Pure Portal

High-Performance Simultaneous Multiprocessing for Heterogeneous System-on-Chip

Author: Asenjo Rafael
Hosseinabady Mohammad
Navarro Angeles
Nikov Kris
Nunez-Yanez Jose
Rodríguez Andrés
Publication date: 20/01/2020

This paper presents a methodology for simultaneous heterogeneous computing, named ENEAC, where a quad-core ARM Cortex-A53 CPU works in tandem with a preprogrammed on-board FPGA accelerator. A heterogeneous scheduler distributes the tasks optimally among all the resources and all compute units run asynchronously, which allows for improved performance for irregular workloads. ENEAC achieves up to 17% performance improvement when using all platform resources compared to just using the FPGA accelerators and up to 865% performance increase compared to using just the CPU. The workflow uses existing commercial tools and C/C++ as a single programming language for both accelerator design and CPU programming for improved productivity and ease of verification. Comment: 7 pages, 5 figures, 1 table. Presented at the 13th International Workshop on Programmability and Architectures for Heterogeneous Multicores, 2020 (arXiv:2005.07619)

arXiv.org e-Print Archive

Explore Bristol Research

Parallel Multiprocessing and Scheduling on the Heterogeneous Xeon+FPGA Platform

Author: Asenjo Rafael
Corbera Francisco
Gran Ruben
Navarro Angeles
Nunez-Yanez Jose
Rodríguez Andrés
Suarez Dario
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/06/2019

Explore Bristol Research

Exploring heterogeneous scheduling for edge computing with CPU and FPGA MPSoCs

Author: Asenjo Rafael
Corbera Francisco
Gran Rubén
Navarro Angeles
Nunez-Yanez Jose
Rodríguez Andrés
Suárez Darío
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

This paper presents a framework targeted to low-cost and low-power heterogeneous MultiProcessors that exploits FPGAs and multicore CPUs, with the overarching goal of providing developers with a productive programming model and runtime support to fully use all the processing resources available. FPGA productivity is achieved using a high-level programming model based on OpenCL, the standard for cross-platform parallel heterogeneous programming. In this work, we focus on the parallel for pattern, and as part of the runtime support for this pattern, we leverage a new scheduler that strives to maximize the number of iterations per joule by dynamically and adaptively partitioning the iteration space between the multicore and the accelerator when working simultaneously. A total of 7 benchmarks are ported and optimized for a low-cost DE1 board. The results show that the heterogeneous solution can improve performance up to 2.9x and increase energy efficiency up to 2.7x compared to the traditional approach of keeping all the CPU cores idle while the accelerator computes the workload. Our results also demonstrate two interesting insights: first, an adaptive scheduler able to find at runtime the right chunk size for each type of application and device configuration is an essential component for these kinds of heterogeneous platforms, and second, device configurations that provide higher throughput do not always achieve better energy efficiency when only the running power (excluding the idle power component) is considered.

Repositorio Universidad de Zaragoza

Explore Bristol Research

Correction to: Simultaneous multiprocessing in a software-defined heterogeneous FPGA

Author: Amiri Sam
Asenjo Rafael
Gran Ruben
Hosseinabady Mohammad
Navarro Angeles
Nunez-Yanez Jose
Rodríguez Andrés
Suarez Dario
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/05/2018

Crossref

Coventry University Pure Portal

Explore Bristol Research

Simultaneous Multiprocessing on a FPGA+CPU Heterogeneous System-On-Chip

Author: Asenjo Rafael
Gran-Tejero Rubén
Hosseinabady Mohammad
Navarro Angeles
Nunez-Yanez Jose
Rodríguez Andrés
Suárez-Gracia Darío
Publication venue: 'IOS Press'
Publication date: 07/03/2018

Explore Bristol Research

Lightweight asynchronous scheduling in heterogeneous reconfigurable systems

Author: Asenjo Rafael
Gonzalez-Navarro Maria Angeles
Gran-Tejero Rubén
Nikov Kris
Nunez-Yanez José
Rodriguez-Moreno Andres
Suárez Gracia Darío
Publication venue: 'Elsevier BV'
Publication date: 01/01/2022

The trend for heterogeneous embedded systems is the integration of accelerators and general-purpose CPU cores on the same die. In these integrated architectures, like the Zynq UltraScale+ board (CPU+FPGA) that we target in this work, hardware support for shared memory and low-overhead synchronization between the accelerator and the CPU cores makes the case for exploring strategies that exploit a tight collaboration between the CPUs and the accelerator. In this paper we propose a novel lightweight scheduling strategy, FastFit, targeted to FPGA accelerators, and a new scheduler based on it, named MultiFastFit, which asynchronously tackles heterogeneous systems comprised of a variety of CPU cores and FPGA IPs. Our strategy significantly reduces the overhead to automatically compute the near-optimal chunksizes when compared to a previous state-of-the-art auto-tuned approach, which makes our approach more suitable for fine-grained applications. Additionally, our scheduler MultiFastFit has been designed to enable the efficient co-execution of work among compute devices in such a way that all the devices are busy while minimizing the load imbalance. Our approaches have been evaluated using four benchmarks carefully tuned for the low-power UltraScale+ platform. Our experiments demonstrate that the FastFit strategy always finds the near-optimal FPGA chunksize for any device configuration at a reasonable cost, even for fine-grained and irregular applications, and that heterogeneous CPU+FPGA co-executions that exploit all the compute devices are usually faster and more energy efficient than the CPU-only and FPGA-only executions.
We have also compared MultiFastFit with other state-of-the-art scheduling strategies, finding that it outperforms other auto-tuned approaches by up to 2x and achieves similar results to manually-tuned schedulers without requiring an offline search of the ideal CPU-FPGA partition or FPGA chunk granularity. This work was partially supported by the Spanish projects PID2019-105396RB-I00, UMA18-FEDERJA-108, and UK EPSRC projects ENEAC (EP/N002539/1), HOPWARE (EP/V040863/1) and RS MINET (INF\R2\192044). Funding for open access charge: Universidad de Málaga / CBUA.

Repositorio Universidad de Zaragoza

Repositorio Institucional Universidad de Málaga

Explore Bristol Research

Drug-Resistant Temporal Lobe Epilepsy Alters the Expression and Functional Coupling to Gαi/o Proteins of CB1 and CB2 Receptors in the Microvasculature of the Human Brain

Author: Castaneda-Cabral Jose Luis
Deli Mária Anna
Nunez-Lumbreras Maria de los Angeles
Orozco-Suárez Sandra
Sánchez-Vall Vicente
Valle-Dorado María Guadalupe
Walter Fruzsina
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2021

Repository of the Academy's Library