Search CORE

1,068 research outputs found

Software-defined Design Space Exploration for an Efficient DNN Accelerator Architecture

Author: Che Shuai
Jha Niraj K.
Li Yingmin
Yu Ye
Zhang Weifeng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/01/2020
Field of study

Deep neural networks (DNNs) have been shown to outperform conventional machine learning algorithms across a wide range of applications, e.g., image recognition, object detection, robotics, and natural language processing. However, the high computational complexity of DNNs often necessitates extremely fast and efficient hardware. The problem gets worse as the size of neural networks grows exponentially. As a result, customized hardware accelerators have been developed to accelerate DNN processing without sacrificing model accuracy. However, previous accelerator design studies have not fully considered the characteristics of the target applications, which may lead to sub-optimal architecture designs. On the other hand, new DNN models have been developed for better accuracy, but their compatibility with the underlying hardware accelerator is often overlooked. In this article, we propose an application-driven framework for architectural design space exploration of DNN accelerators. This framework is based on a hardware analytical model of individual DNN operations. It models the accelerator design task as a multi-dimensional optimization problem. We demonstrate that it can be efficaciously used in application-driven accelerator architecture design. Given a target DNN, the framework can generate efficient accelerator design solutions with optimized performance and area. Furthermore, we explore the opportunity to use the framework for accelerator configuration optimization under simultaneous diverse DNN applications. The framework is also capable of improving neural network models to best fit the underlying hardware resources

arXiv.org e-Print Archive

Princeton University Open Access Repository

Harnessing Reconfigurable Hardware to Design Heterogeneous Systems

Author: Iordanou Konstantinos
Publication venue
Publication date: 01/08/2023
Field of study

The University of Manchester - Institutional Repository

An Extended Model for Multi-Criteria Software Component Allocation on a Heterogeneous Embedded Platform

Author: Ivan Švogor
Ivica Crnkovic
Neven Vrcek
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2013
Field of study

A recent development of heterogeneous platforms (i.e. those containing different types of computational units such as multicore CPUs, GPUs, and FPGAs) has enabled significant improvements in performance for real-time data processing. This potential, however, is still not fully utilized due to the lack of methods for optimal configuration of software; the allocation of different software components to different computational unit types is crucial for getting the maximal utilization of the platform, but for more complex systems it is difficult to find ad-hoc a good enough or the best configuration. With respect to system and user defined constraints, in this paper we are applying analytical hierarchical process and a genetic algorithm to find feasible, locally optimal solution for allocating software components to computational units

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Optimization of Discrete-parameter Multiprocessor Systems using a Novel Ergodic Interpolation Technique

Author: Desai Madhav P.
Karanjkar Neha V.
Publication venue
Publication date: 14/07/2015
Field of study

Modern multi-core systems have a large number of design parameters, most of which are discrete-valued, and this number is likely to keep increasing as chip complexity rises. Further, the accurate evaluation of a potential design choice is computationally expensive because it requires detailed cycle-accurate system simulation. If the discrete parameter space can be embedded into a larger continuous parameter space, then continuous space techniques can, in principle, be applied to the system optimization problem. Such continuous space techniques often scale well with the number of parameters. We propose a novel technique for embedding the discrete parameter space into an extended continuous space so that continuous space techniques can be applied to the embedded problem using cycle accurate simulation for evaluating the objective function. This embedding is implemented using simulation-based ergodic interpolation, which, unlike spatial interpolation, produces the interpolated value within a single simulation run irrespective of the number of parameters. We have implemented this interpolation scheme in a cycle-based system simulator. In a characterization study, we observe that the interpolated performance curves are continuous, piece-wise smooth, and have low statistical error. We use the ergodic interpolation-based approach to solve a large multi-core design optimization problem with 31 design parameters. Our results indicate that continuous space optimization using ergodic interpolation-based embedding can be a viable approach for large multi-core design optimization problems.Comment: A short version of this paper will be published in the proceedings of IEEE MASCOTS 2015 conferenc

arXiv.org e-Print Archive

CiteSeerX

Automatic Nested Loop Acceleration on FPGAs Using Soft CGRA Overlay

Author: Liu C
Ng HC
So HKH
Publication venue: United Kingdom
Publication date: 01/01/2015
Field of study

Session 1: HLS Toolingpostprin

HKU Scholars Hub

High-Level Synthesis Hardware Design for FPGA-Based Accelerators: Models, Methodologies, and Frameworks

Author: Crespo Maria Liz
Gil-Costa Veronica
Molina Romina Soledad
Ramponi Giovanni
Publication venue
Publication date: 01/01/2022
Field of study

Hardware accelerators based on field programmable gate array (FPGA) and system on chip (SoC) devices have gained attention in recent years. One of the main reasons is that these devices contain reconfigurable logic, which makes them feasible for boosting the performance of applications. High-level synthesis (HLS) tools facilitate the creation of FPGA code from a high level of abstraction using different directives to obtain an optimized hardware design based on performance metrics. However, the complexity of the design space depends on different factors such as the number of directives used in the source code, the available resources in the device, and the clock frequency. Design space exploration (DSE) techniques comprise the evaluation of multiple implementations with different combinations of directives to obtain a design with a good compromise between different metrics. This paper presents a survey of models, methodologies, and frameworks proposed for metric estimation, FPGA-based DSE, and power consumption estimation on FPGA/SoC. The main features, limitations, and trade-offs of these approaches are described. We also present the integration of existing models and frameworks in diverse research areas and identify the different challenges to be addressed

Archivio istituzionale della ricerca - Università di Trieste