Test exploration and validation using transaction level models

Author: Di Carlo Stefano
Imhof M.E
Khaligh R.S
Kochte M.A
Prinetto Paolo Ernesto
Radetzki M.
Wunderlich H.-J
Zollen C.G
Publication venue: IEEE Computer Society
Publication date: 01/01/2009

The complexity of the test infrastructure and test strategies in systems-on-chip approaches the complexity of the functional design space. This paper presents test design space exploration and validation of test strategies and schedules using transaction level models (TLMs). Since many aspects of testing involve the transfer of a significant amount of test stimuli and responses, the communication-centric view of TLMs suits this purpose exceptionally well.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Efficiency analysis methodology of FPGAs based on lost frequencies, area and cycles

Author: Braeken An
Cornelis Jan G.
da Silva Bruno
Lemeire Jan
Touhafi Abdellah
Publication venue: Elsevier BV
Publication date: 01/01/2018

We propose a methodology to study and quantify efficiency and the impact of overheads on runtime performance. Most work on High-Performance Computing (HPC) for FPGAs only studies runtime performance or cost, while we are interested in how far we are from peak performance and, more importantly, why. The efficiency of runtime performance is defined with respect to the ideal computational runtime in the absence of inefficiencies. The analysis of the difference between actual and ideal runtime reveals the overheads and bottlenecks. A formal approach is proposed to decompose the efficiency into three components: frequency, area and cycles. After the efficiencies have been quantified, a detailed analysis must reveal the reasons for the lost frequencies, lost area and lost cycles. We propose a taxonomy of possible causes and practical methods to identify and quantify the overheads. The proposed methodology is applied to a number of use cases. We show the interaction between the three components of efficiency and show how bottlenecks are revealed.
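
To make the three-way decomposition concrete, the sketch below assumes the components multiply into the overall efficiency, with the ideal runtime taken as one operation per compute unit per cycle at peak frequency. The abstract does not give the paper's formal definitions, so the `Design` fields and the formulas here are illustrative assumptions rather than the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Design:
    """Hypothetical measurements for one FPGA design (all names are illustrative)."""
    work_ops: int        # total operations the kernel must perform
    units_peak: int      # compute units that would fit if no area were lost
    units_actual: int    # compute units actually instantiated
    f_peak_hz: float     # peak clock frequency of the device
    f_actual_hz: float   # clock frequency achieved by the design
    cycles_actual: int   # measured cycle count


def efficiencies(d: Design) -> dict:
    # Assumed ideal: every peak unit completes one operation per cycle at peak frequency.
    ideal_runtime = d.work_ops / (d.units_peak * d.f_peak_hz)
    actual_runtime = d.cycles_actual / d.f_actual_hz

    e_freq = d.f_actual_hz / d.f_peak_hz
    e_area = d.units_actual / d.units_peak
    # Cycles the instantiated units would need without stalls, versus measured cycles.
    e_cycles = (d.work_ops / d.units_actual) / d.cycles_actual

    return {
        "frequency": e_freq,
        "area": e_area,
        "cycles": e_cycles,
        "overall": ideal_runtime / actual_runtime,
    }


if __name__ == "__main__":
    design = Design(1_000_000, 100, 50, 400e6, 200e6, 40_000)
    for name, value in efficiencies(design).items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the product of the three components equals the overall efficiency, so a low overall figure can be traced to whichever factor is smallest.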

Ghent University Academic Bibliography

A toolset for the analysis and optimization of motion estimation algorithms and processors

Author: Nunez-Yanez JL
Spiteri T
Vafiadis G
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/08/2009

Crossref

Explore Bristol Research

Register-transfer-level power profiling for system-on-chip power distribution network design and signoff

Author: Hämäläinen J. (Joona)
Publication venue: University of Oulu
Publication date: 10/05/2019

This thesis studies how register-transfer-level (RTL) power profiling can help the design and signoff of the power distribution network in digital integrated circuits. RTL power profiling is a method that collects RTL power estimation results into a single power profile, which can then be analysed to find time windows of interest for specifying power distribution network design and signoff. The thesis starts with a theory part. Complementary metal-oxide-semiconductor (CMOS) inverter power dissipation is studied first. Next, power distribution network structure and voltage drop problems are introduced. Voltage drop is demonstrated using power distribution network impedance figures. A common on-chip power distribution network structure is introduced, and the power distribution network design flow is outlined. Finally, the function of decoupling capacitors and their impact on power distribution network impedance are explained in detail. The practical part of the thesis contains RTL power profiling flow details and power profiling flow results for one simulation case in one design block. Some methods of improving RTL power estimation accuracy are also discussed, and calibration with extracted parasitics is then used to obtain a new set of power profiling time windows. After the results are presented, overall RTL power estimation accuracy is analysed and the resulting time windows are compared to reference gate-level time windows. The analysis shows that the resulting time windows match the theory and that RTL power profiling seems to be a promising method for finding time windows for power distribution network design and signoff.
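
As a rough illustration of what finding time windows in a power profile can mean, the sketch below scans a list of per-interval power samples for the window with the highest average power and for the largest rise between consecutive samples. The abstract does not state which selection criteria the thesis uses, so both criteria and the function names are assumptions chosen for illustration.

```python
def peak_average_window(profile, width):
    """Return (start_index, average_power) of the window with the highest mean."""
    if width <= 0 or width > len(profile):
        raise ValueError("width must be between 1 and len(profile)")
    window_sum = sum(profile[:width])
    best_start, best_sum = 0, window_sum
    # Slide the window one sample at a time, updating the running sum.
    for start in range(1, len(profile) - width + 1):
        window_sum += profile[start + width - 1] - profile[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / width


def largest_step(profile):
    """Return (index, delta) of the largest rise between consecutive samples."""
    if len(profile) < 2:
        raise ValueError("profile needs at least two samples")
    deltas = [b - a for a, b in zip(profile, profile[1:])]
    index = max(range(len(deltas)), key=deltas.__getitem__)
    return index + 1, deltas[index]


if __name__ == "__main__":
    samples_mw = [1, 2, 9, 8, 7, 2, 1]  # made-up power samples in mW
    print(peak_average_window(samples_mw, 3))
    print(largest_step(samples_mw))
```

A sustained-power window is typically of interest for static voltage drop, while a sharp rise is of interest for dynamic voltage drop; which windows matter in practice depends on the signoff flow.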

University of Oulu Repository - Jultika

Efficient Neural Network Implementations on Parallel Embedded Platforms Applied to Real-Time Torque-Vectoring Optimization Using Predictions for Multi-Motor Electric Vehicles

Author: Cosco Francesco
Dendaluce Jahnke Martin
Gomez-Garay Vicente
Novickis Rihards
Pérez Rastelli Joshué
Publication venue: MDPI AG
Publication date: 01/01/2019

The combination of machine learning and heterogeneous embedded platforms enables new potential for developing sophisticated control concepts which are applicable to the field of vehicle dynamics and ADAS. This interdisciplinary work provides enabler solutions, ultimately implementing fast predictions using neural networks (NNs) on field-programmable gate arrays (FPGAs) and graphics processing units (GPUs), while applying them to a challenging application: torque vectoring on a multi-electric-motor vehicle for enhanced vehicle dynamics. The foundation motivating this work is provided by discussing multiple domains of the technological context as well as the constraints related to the automotive field, which contrast with the attractiveness of exploiting the capabilities of new embedded platforms to apply advanced control algorithms for complex control problems. In this particular case we target enhanced vehicle dynamics on a multi-motor electric vehicle benefiting from the greater degrees of freedom and controllability offered by such powertrains. Considering the constraints of the application and the implications of the selected multivariable optimization challenge, we propose an NN to provide batch predictions for real-time optimization. This leads to the major contribution of this work: efficient NN implementations on two intrinsically parallel embedded platforms, a GPU and an FPGA, following an analysis of theoretical and practical implications of their different operating paradigms, in order to efficiently harness their computing potential while gaining insight into their peculiarities. The achieved results exceed expectations and additionally provide a representative illustration of the strengths and weaknesses of each kind of platform. Consequently, having shown the applicability of the proposed solutions, this work also contributes valuable enablers for further developments following similar fundamental principles.
Some of the results presented in this work are related to activities within the 3Ccar project, which has received funding from ECSEL Joint Undertaking under grant agreement No. 662192. This Joint Undertaking received support from the European Union’s Horizon 2020 research and innovation programme and Germany, Austria, Czech Republic, Romania, Belgium, United Kingdom, France, Netherlands, Latvia, Finland, Spain, Italy, Lithuania. This work was also partly supported by the project ENABLES3, which received funding from ECSEL Joint Undertaking under grant agreement No. 692455-2.

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

TECNALIA Publications

NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

Author: Aimar Alessandro
Calabrese Enrico
Corradi Federico
Delbruck Tobi
Linares-Barranco Alejandro
Liu Shih-Chii
Lungu Iulia-Alexandra
Milde Moritz B.
Mostafa Hesham
Rios-Navarro Antonio
Tapiador-Morales Ricardo
Publication date: 01/01/2017

Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though Graphical Processing Units (GPUs) are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture called NullHop that implements SOA CNNs useful for low-power and low-latency application scenarios. NullHop exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across kernel sizes ranging from 1x1 to 7x7. NullHop can process up to 128 input and 128 output feature maps per layer in a single pass. We implemented the proposed architecture on a Xilinx Zynq FPGA platform and present results showing how our implementation reduces external memory transfers and compute time in five different CNNs ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. Post-synthesis simulations using Mentor Modelsim in a 28 nm process with a clock frequency of 500 MHz show that the VGG19 network achieves over 450 GOp/s. By exploiting sparsity, NullHop achieves an efficiency of 368%, maintains over 98% utilization of the MAC units, and achieves a power efficiency of over 3 TOp/s/W in a core area of 6.3 mm². As further proof of NullHop's usability, we interfaced its FPGA implementation with a neuromorphic event camera for real-time interactive demonstrations.
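
To illustrate how activation sparsity can reduce both storage and computation, the sketch below encodes a flat list of activations as a 0/1 mask plus a list of nonzero values, then computes a dot product that only multiplies the nonzero entries. This is one generic sparse encoding chosen for illustration; the abstract does not describe NullHop's actual data format, and all function names are hypothetical.

```python
def encode(feature_map):
    """Split a flat list of activations into a 0/1 mask and the nonzero values."""
    mask = [1 if v != 0 else 0 for v in feature_map]
    values = [v for v in feature_map if v != 0]
    return mask, values


def decode(mask, values):
    """Rebuild the dense list from the mask and the nonzero values."""
    nonzero = iter(values)
    return [next(nonzero) if bit else 0 for bit in mask]


def sparse_dot(mask, values, weights):
    """Dot product that skips zero activations; also returns the MAC count used."""
    nonzero = iter(values)
    total, macs = 0, 0
    for bit, weight in zip(mask, weights):
        if bit:
            total += next(nonzero) * weight
            macs += 1
    return total, macs


if __name__ == "__main__":
    activations = [0, 3, 0, 0, 5, 0, 2, 0]  # made-up post-ReLU activations
    mask, values = encode(activations)
    print(mask, values)
    print(sparse_dot(mask, values, [1, 2, 3, 4, 5, 6, 7, 8]))
```

In this example only three of eight multiply-accumulate operations are performed, which is the kind of saving that lets a sparsity-aware design exceed the throughput of a dense one with the same number of MAC units.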

arXiv.org e-Print Archive

ZORA

Western Sydney ResearchDirect

idUS. Depósito de Investigación Universidad de Sevilla

Low Power Design Methodology

Author: Nagarajan Ashok Kumar
Natarajan Vithyalakshmi
Pandian Nagarajan
Savithri Vinoth Gopi
Publication venue: IntechOpen
Publication date: 16/02/2018

Due to the widespread application of portable electronic devices and the evolution of microelectronic technology, power dissipation has become a critical parameter in low-power VLSI circuit design. In emerging VLSI technology, circuit complexity and high speed imply a significant increase in power consumption. In low-power CMOS VLSI circuits, energy is dissipated by the charging and discharging of internal node capacitances due to transition activity, which is one of the major factors affecting dynamic power dissipation. Reducing power and area while improving speed requires optimization at all levels of the design procedure. Various design methodologies for achieving low-power designs are discussed.
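
The dependence of dynamic power on transition activity and node capacitance is commonly approximated by the first-order model P = alpha * C * Vdd^2 * f. The abstract does not state this formula, so the sketch below is a standard textbook approximation with made-up parameter values, not the chapter's own model.

```python
def dynamic_power(alpha, c_farads, vdd_volts, f_hz):
    """First-order CMOS switching power in watts: alpha * C * Vdd^2 * f.

    alpha is the average fraction of the capacitance switched per clock cycle.
    """
    return alpha * c_farads * vdd_volts ** 2 * f_hz


if __name__ == "__main__":
    # Illustrative values only: 1 nF switched capacitance, 1 GHz clock.
    baseline = dynamic_power(alpha=0.2, c_farads=1e-9, vdd_volts=1.0, f_hz=1e9)
    low_vdd = dynamic_power(alpha=0.2, c_farads=1e-9, vdd_volts=0.5, f_hz=1e9)
    low_activity = dynamic_power(alpha=0.1, c_farads=1e-9, vdd_volts=1.0, f_hz=1e9)
    print(f"baseline:        {baseline:.3f} W")
    print(f"half Vdd:        {low_vdd:.3f} W")
    print(f"half activity:   {low_activity:.3f} W")
```

Because the supply voltage enters quadratically, halving it cuts this term to a quarter, whereas halving the activity factor or the frequency only halves it.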

IntechOpen

Crossref