3 research outputs found
A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures
In recent years, the field of Deep Learning has seen many disruptive and
impactful advancements. Given the increasing complexity of deep neural
networks, the need for efficient hardware accelerators in the design of
heterogeneous HPC platforms has become increasingly pressing. The design of Deep Learning
accelerators requires a multidisciplinary approach, combining expertise from
several areas, spanning from computer architecture to approximate computing,
computational models, and machine learning algorithms. Several methodologies
and tools have been proposed to design accelerators for Deep Learning,
including hardware-software co-design approaches, high-level synthesis methods,
specific customized compilers, and methodologies for design space exploration,
modeling, and simulation. These methodologies aim to maximize the exploitable
parallelism and minimize data movement to achieve high performance and energy
efficiency. This survey provides a holistic review of the most influential
design methodologies and EDA tools proposed in recent years to implement Deep
Learning accelerators, offering the reader a wide perspective in this rapidly
evolving field. In particular, this work complements the previous survey
proposed by the same authors in [203], which focuses on Deep Learning hardware
accelerators for heterogeneous HPC platforms.
Design and Implementation of a Generic and Flexible Floating-Point Arithmetic Unit (FPU)
Graduation Project (Licentiate in Electronics Engineering), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería Electrónica, 2016.
A methodology that measures the DSP performance of an ASP with low-power target applications is presented.
Additionally, timing, area, and power synthesis results are presented for an adder, a multiplier, and
CORDIC floating-point units, targeting the Artix-7 FPGA family and a 0.13 µm technology, for single and
double precision at various system frequencies.
The pipelined adder achieves a maximum frequency of 350 MHz, the multiplier (with a simple
Karatsuba significand multiplication) reaches 243 MHz, and lastly, the standalone CORDIC
floating-point operator reaches 537 MHz.
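The abstract mentions a "simple Karatsuba significand multiplication" without giving details. As an illustration only, the sketch below shows a one-level Karatsuba split applied to single-precision significands in software. The single split, the helper names `float32_significand` and `karatsuba_one_level`, and the 24-bit default width are assumptions made for this example; they are not taken from the thesis and do not describe its hardware design.

```python
import struct


def float32_significand(x: float) -> int:
    """Return the 24-bit significand of x as an IEEE 754 single (hypothetical helper).

    The hidden leading 1 is included for normal numbers; subnormals and zero
    return the raw 23-bit fraction.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return (fraction | (1 << 23)) if exponent != 0 else fraction


def karatsuba_one_level(a: int, b: int, width: int = 24) -> int:
    """Multiply two unsigned significands using a single Karatsuba split.

    Three narrower multiplications replace the four of the schoolbook method,
    at the cost of extra additions and subtractions.
    """
    half = width // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask

    hi = a_hi * b_hi                               # product of the upper halves
    lo = a_lo * b_lo                               # product of the lower halves
    mid = (a_hi + a_lo) * (b_hi + b_lo) - hi - lo  # cross terms from one multiply

    return (hi << (2 * half)) + (mid << half) + lo


if __name__ == "__main__":
    sa = float32_significand(1.5)
    sb = float32_significand(3.25)
    # The Karatsuba result must match the direct integer product.
    print(karatsuba_one_level(sa, sb) == sa * sb)
```

In a floating-point multiplier, the double-width product would still need normalization and rounding, and the exponents and signs are handled separately; those steps are omitted here.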