
3 research outputs found

Mapping for Maximum Performance on FPGA DSP Blocks

Authors: Bajaj Ronak, Suhaib A. Fahmy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

In recent years, the field of Deep Learning has seen many disruptive and impactful advancements. Given the increasing complexity of deep neural networks, the need for efficient hardware accelerators in the design of heterogeneous HPC platforms has become increasingly pressing. The design of Deep Learning accelerators requires a multidisciplinary approach, combining expertise from several areas, spanning from computer architecture to approximate computing, computational models, and machine learning algorithms. Several methodologies and tools have been proposed to design accelerators for Deep Learning, including hardware-software co-design approaches, high-level synthesis methods, specific customized compilers, and methodologies for design space exploration, modeling, and simulation. These methodologies aim to maximize the exploitable parallelism and minimize data movement to achieve high performance and energy efficiency. This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years to implement Deep Learning accelerators, offering the reader a wide perspective on this rapidly evolving field. In particular, this work complements the previous survey by the same authors [203], which focuses on Deep Learning hardware accelerators for heterogeneous HPC platforms.

arXiv.org e-Print Archive

Diseño e implementación de una Unidad Aritmética de Coma Flotante (FPU) genérica y flexible [Design and implementation of a generic and flexible floating-point arithmetic unit (FPU)]

Author: Sequeira-Rojas Jorge Esteban
Publication venue: 'Instituto Tecnologico de Costa Rica'
Publication date: 01/01/2016

Graduation project (Licentiate in Electronics Engineering), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería Electrónica, 2016. A methodology that measures the DSP performance of an ASP with low-power target applications is presented. Additionally, timing, area, and power synthesis results are presented for adder, multiplier, and CORDIC floating-point units on the Artix-7 FPGA family and in 0.13 µm technology, for single and double precision at various system frequencies. The pipelined adder achieves a maximum frequency of 350 MHz, the multiplier (with a simple Karatsuba significand multiplication) reaches 243 MHz, and the standalone CORDIC floating-point operator reaches 537 MHz.

Repositorio Institucional del Instituto Tecnologico de Costa Rica