    Adaptive Voltage Scaling with In-Situ Detectors in Commercial FPGAs

    Energy proportional computing with OpenCL on a FPGA-based overlay architecture

    Extending the PCIe Interface with Parallel Compression/Decompression Hardware for Energy and Performance Optimization

    PCIe is a high-performance interface used to move data from a central host PC to an accelerator such as a Field-Programmable Gate Array (FPGA). This interface allows a system to perform fast data transfers in High-Performance Computing (HPC) and provides a performance boost. However, HPC systems normally operate on large datasets, and in these situations PCIe can become a bottleneck. To address this issue, we propose an open-source hardware compression/decompression system that adapts to continuously streamed data with low latency and high throughput. We implement compressor and decompressor engines on an FPGA, scale up with multiple engines working in parallel, and evaluate the energy reduction and performance for different numbers of engines. To alleviate the performance bottleneck in the processor acting as a controller, we propose a hardware scheduler that fairly distributes the datasets among the engines. Our design reduces the transmission time over PCIe, and the results show an energy reduction of up to 48% in the PCIe transfers, thanks to the decrease in the number of bits that have to be transmitted. The latency overhead is kept to a minimum and is user-selectable, depending on the tolerances of the intended application.
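    The scheduler's role is easiest to see in software form. Below is a minimal Python sketch of fair work distribution across parallel compression engines, with zlib standing in for the FPGA compressor hardware; the shared-queue policy, the function names, and the engine count are illustrative assumptions, not the paper's actual hardware design.

```python
from queue import Queue
from threading import Thread
import zlib  # software stand-in for the FPGA compressor engines


def engine_worker(jobs, results):
    # Each worker models one hardware compressor engine: it pulls
    # a data chunk from the shared queue and compresses it.
    while True:
        item = jobs.get()
        if item is None:  # poison pill: shut this engine down
            break
        idx, chunk = item
        results[idx] = zlib.compress(chunk)
        jobs.task_done()


def compress_parallel(chunks, num_engines=4):
    # A shared work queue gives a simple fair distribution:
    # whichever engine is free takes the next chunk, so no
    # single engine becomes a bottleneck.
    jobs = Queue()
    results = [None] * len(chunks)
    workers = [Thread(target=engine_worker, args=(jobs, results))
               for _ in range(num_engines)]
    for w in workers:
        w.start()
    for idx, chunk in enumerate(chunks):
        jobs.put((idx, chunk))
    jobs.join()  # wait until every chunk has been compressed
    for _ in workers:
        jobs.put(None)
    for w in workers:
        w.join()
    return results


# Example: compress 16 chunks of repetitive data with 4 engines.
data = [bytes(1024) for _ in range(16)]
out = compress_parallel(data, num_engines=4)
print(sum(len(c) for c in out), "bytes after compression")
```

    With a shared queue, whichever engine finishes first picks up the next chunk, so the load balances itself without the controller tracking per-engine state; this is one simple way to approximate the fair distribution the abstract describes.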

    Run-time power gating in hybrid ARM-FPGA devices

    Entropy-Based Early-Exit in a FPGA-Based Low-Precision Neural Network

    In this paper, we investigate the application of early-exit strategies to fully quantized neural networks mapped to low-complexity FPGA SoC devices. The challenge of the accuracy drop caused by quantizing the first convolutional layer and the fully connected layers to low bitwidths has been resolved. We apply an early-exit strategy to a network model that combines extremely low-bitwidth weights and activations with binary arithmetic precision, evaluated on the ImageNet dataset. We use entropy calculations to decide which branch of the early-exit network to take. The experiments show an improvement in inference speed of 1.52× using the early-exit system, compared with using a single primary neural network, with a slight accuracy decrease of 1.64%.
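    The entropy-based exit rule can be sketched compactly: compute the entropy of the early branch's softmax output and accept its prediction only when that entropy falls below a threshold. The Python sketch below assumes softmax outputs and a hypothetical threshold value; the callables, function names, and threshold are illustrative, not taken from the paper.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the class logits.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()


def entropy(probs, eps=1e-12):
    # Shannon entropy of the predicted class distribution;
    # low entropy means the early branch is confident.
    return float(-np.sum(probs * np.log(probs + eps)))


def early_exit_infer(x, early_branch, main_network, threshold=0.5):
    # Run the cheap early branch first and accept its answer
    # only when its output entropy is below the threshold.
    p = softmax(early_branch(x))
    if entropy(p) < threshold:
        return int(np.argmax(p)), "early"
    # Otherwise fall through to the full primary network.
    return int(np.argmax(softmax(main_network(x)))), "main"


# Toy usage with stand-in models (random logits over 10 classes):
rng = np.random.default_rng(0)
early = lambda x: rng.normal(size=10)
main = lambda x: rng.normal(size=10)
print(early_exit_infer(None, early, main, threshold=1.0))
```

    A lower threshold makes the exit more conservative: fewer samples leave early, trading back some of the speedup for accuracy.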