446 research outputs found

Energy Scalability and the RESUME Scalable Video Codec

Author: BENINI L
CHANG N
CHRISTIAENS M
Devos Harald
EECKHAUT H
KREMER U
PROBST C
Stroobandt Dirk
Publication venue: IBFI, Schloss Dagstuhl, Germany
Publication date: 01/01/2007

Reduction of co-simulation runtime through parallel processing

Author: Coutu Jason Dean
Publication venue: 'University of Saskatchewan Library'

During the design phase of modern digital and mixed-signal devices, simulations are run to determine the fitness of the proposed design. Some of these simulations can take large amounts of time, delaying the manufacture of the system prototype. One typical simulation is an integration simulation, which simulates the hardware and software at the same time. Most simulators used for this task are monolithic. Some simulators can interface with external libraries and simulators, but the setup can be tedious. This thesis proposes, implements and evaluates a distributed simulator called PDQScS, which speeds up the simulation to reduce this bottleneck in the design cycle without tedious separation and linking by the user. Using multiple processes and SMP machines, a reduction in simulation run time was found.

eCommons@USASK

University of Saskatchewan Research Archive

SystemC Model of Power Side-Channel Attacks Against AI Accelerators: Superstition or not?

Author: Berekovic Mladen
Buchty Rainer
Eisenbarth Thomas
Mulhem Saleh
Nešković Andrija
Treff Alexander
Publication date: 22/11/2023

As training artificial intelligence (AI) models is a lengthy and hence costly process, leakage of such a model's internal parameters is highly undesirable. In the case of AI accelerators, side-channel information leakage opens up the threat scenario of extracting the internal secrets of pre-trained models. Therefore, sufficiently elaborate methods for design verification as well as fault and security evaluation at the electronic system level are in demand. In this paper, we propose estimating information leakage from the early design steps of AI accelerators to aid in a more robust architectural design. We first introduce the threat scenario before diving into SystemC as a standard method for early design evaluation and how this can be applied to threat modeling. We present two successful side-channel attack methods executed via SystemC-based power modeling: correlation power analysis and template attack, both leading to total information leakage. The presented models are verified against an industry-standard netlist-level power estimation to prove general feasibility and determine accuracy. Consequently, we explore the impact of additive noise in our simulation to establish indicators for early threat evaluation. The presented approach is again validated via a model-vs-netlist comparison, showing high accuracy of the achieved results. This work hence is a solid step towards fast attack deployment and, subsequently, the design of attack-resilient AI accelerators

arXiv.org e-Print Archive
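
The correlation power analysis named in the abstract above can be illustrated with a toy example. This is a minimal sketch, not the authors' SystemC model: it assumes a hypothetical 8-bit device whose power sample equals the Hamming weight of `input XOR secret`, plus optional Gaussian noise. The function names and the leakage model are assumptions made for illustration.

```python
import random


def hamming_weight(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / (vx * vy) ** 0.5


def simulate_traces(secret, inputs, noise_sigma=0.0, seed=0):
    """Assumed leakage model: one power sample per input, HW(input ^ secret) + noise."""
    rng = random.Random(seed)
    return [hamming_weight(p ^ secret) + rng.gauss(0.0, noise_sigma) for p in inputs]


def cpa_recover(inputs, traces):
    """Return the 8-bit guess whose predicted leakage correlates best with the traces."""
    best_guess, best_corr = 0, -2.0
    for guess in range(256):
        predicted = [hamming_weight(p ^ guess) for p in inputs]
        corr = pearson(predicted, traces)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess, best_corr


if __name__ == "__main__":
    inputs = list(range(256))
    traces = simulate_traces(0x3C, inputs)
    print(cpa_recover(inputs, traces))
```

Raising `noise_sigma` lowers the winning correlation, which loosely mirrors the abstract's interest in how additive noise affects the attack.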

Translating Timing into an Architecture: The Synergy of COTSon and HLS (Domain Expertise: Designing a Computer Architecture via HLS)

Author: Giorgi Roberto
KHALILI MAYBODI Farnam
Procaccini Marco
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2019

Translating a system requirement into a low-level representation (e.g., register transfer level or RTL) is the typical goal of the design of FPGA-based systems. However, the Design Space Exploration (DSE) needed to identify the final architecture may be time consuming, even when using high-level synthesis (HLS) tools. In this article, we illustrate our hybrid methodology, which uses a frontend for HLS so that the DSE is performed more rapidly by using a higher level abstraction, but without losing accuracy, thanks to the HP-Labs COTSon simulation infrastructure in combination with our DSE tools (MYDSE tools). In particular, this proposed methodology proved useful to achieve an appropriate design of a whole system in a shorter time than trying to design everything directly in HLS. Our motivating problem was to deploy a novel execution model called data-flow threads (DF-Threads) running on yet-to-be-designed hardware. For that goal, directly using the HLS was premature in the design cycle. Therefore, a key point of our methodology consists of defining the first prototype in our simulation framework and gradually migrating the design into the Xilinx HLS after validating the key performance metrics of our novel system in the simulator. To explain this workflow, we first use a simple driving example consisting of the modelling of a two-way associative cache. Then, we explain how we generalized this methodology and describe the types of results that we were able to analyze in the AXIOM project, which helped us reduce the development time from months/weeks to days/hours.

Archivio della Ricerca - Università degli Studi di Siena
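
The abstract above uses a two-way associative cache as its driving example. The following is a minimal behavioural sketch of such a cache, assuming an LRU replacement policy and hypothetical parameters (`num_sets`, `line_size`). It is not the COTSon or HLS model from the article, only an illustration of what a high-level model of this structure might look like.

```python
from collections import OrderedDict


class TwoWayCache:
    """Behavioural model of a two-way set-associative cache with LRU replacement."""

    WAYS = 2

    def __init__(self, num_sets=4, line_size=16):
        self.num_sets = num_sets
        self.line_size = line_size
        # One OrderedDict per set; keys are tags, ordered from LRU to MRU.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Look up a byte address; return True on a hit, False on a miss."""
        block = address // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        if len(ways) >= self.WAYS:
            ways.popitem(last=False)  # evict the least recently used line
        ways[tag] = True
        self.misses += 1
        return False


if __name__ == "__main__":
    cache = TwoWayCache()
    # Addresses 0, 64 and 128 all map to set 0 with the default parameters.
    print([cache.access(a) for a in (0, 64, 0, 128, 0, 64)])
```

Counting hits and misses at this level of abstraction is the kind of metric one might validate in a simulator before committing to a hardware description.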

RITSim: distributed systemC simulation

Author: Cox David Richard
Publication venue: RIT Scholar Works
Publication date: 07/09/2005

Parallel or distributed simulation is becoming more than a novel way to speed up design evaluation; it is becoming necessary for simulating modern processors in a reasonable timeframe. As architectural features become faster, smaller, and more complex, designers are interested in obtaining detailed and accurate performance and power estimations. Uniprocessor simulators may not be able to meet such demands. The RITSim project uses SystemC to model a processor microarchitecture and memory subsystem in great detail. SystemC is a C++ library built on a discrete-event simulation kernel. Many projects have successfully implemented parallel discrete-event simulation (PDES) frameworks to distribute simulation among several hosts. The field promises significant simulation speedup, possibly leading to faster turnaround time in design space exploration and commercial production. However, parallel implementation of such simulators is not an easy task. It requires modification of the simulation kernel for effective partitioning and synchronization. This thesis explores PDES techniques and presents a distributed version of the SystemC simulation environment. With minimal user interaction, SystemC models can be executed on a cluster of workstations using a message-passing library such as the Message Passing Interface (MPI). The implementation is designed for transparency; distribution and synchronization happen with little intervention by the model author. Modification of SystemC is fashioned to promote maintainability with future releases. Furthermore, only freely available libraries are used for maximum flexibility and portability.

RIT Scholar Works

A Power-Efficient Methodology for Mapping Applications on Multi-Processor System-on-Chip Architectures

Author: C. Silvano
D. Sciuto
G. Beltrame
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

This work introduces an application mapping methodology and case study for multi-processor on-chip architectures. Starting from the description of an application in standard sequential code (e.g. in C), first the application is profiled, parallelized when possible, then its components are moved to hardware implementation when necessary to satisfy performance and power constraints. After mapping, with the use of hardware objects to handle concurrency, the application power consumption can be further optimized by a task-based scheduler for the remaining software part, without the need for operating system support. The key contributions of this work are: a methodology for high-level hardware/software partitioning that allows the designer to use the same code for both hardware and software models for simulation, providing nevertheless preliminary estimations for timing and power consumption; and a task-based scheduling algorithm that does not require operating system support. The methodology has been applied to the co-exploration of an industrial case study: an MPEG4 VGA real-time encoder

Archivio istituzionale della ricerca - Politecnico di Milano

Archexplorer for automatic design space exploration

Author: Desmet V.
Girbal Sylvain
Ramírez Bellido Alejandro
Temam Olivier
Vega Augusto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Growing architectural complexity and stringent time-to-market constraints suggest the need to move architecture design beyond parametric exploration to structural exploration. ArchExplorer is a Web-based permanent and open design-space exploration framework that lets researchers compare their designs against others. The authors demonstrate their approach by exploring the design space of an on-chip memory subsystem and a multicore processor.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

Ghent University Academic Bibliography

Battery-aware electric truck delivery route exploration

Author: Baek Donkyu
Chang Naehyuck
Chen Yukai
Macii Enrico
Poncino Massimo
Publication venue: 'MDPI AG'
Publication date: 24/04/2020

The energy-optimal routing of Electric Vehicles (EVs) in the context of parcel delivery is more complicated than for conventional Internal Combustion Engine (ICE) vehicles, in which the total travel distance is the most critical metric. The total energy consumption of EV delivery strongly depends on the order of delivery because the transported parcel weight changes over time, which directly affects the battery efficiency. Therefore, it is not suitable to find an optimal routing solution with traditional routing algorithms such as those for the Traveling Salesman Problem (TSP), which use a static quantity (e.g., distance) as a metric. In this paper, we explore appropriate metrics considering the varying total weight of transported parcels and achieve a solution for the least-energy delivery problem using EVs. We implement an electric truck simulator based on the EV powertrain model and a nonlinear battery model. We evaluate different metrics to assess their quality on small instances for which the optimal solution can be computed exhaustively. A greedy algorithm using the empirically best metric (namely, distance × residual weight) provides significant reductions (up to 33%) with respect to a common-sense heaviest-first package delivery route determined using a metric suggested by the battery properties. This algorithm also outperforms the state-of-the-art TSP heuristic algorithms, which consume up to 12.46% more energy and 8.6 times more runtime. As a case study, we also evaluate how well the proposed algorithms work on real roads interconnecting cities located at different altitudes.

Multidisciplinary Digital Publishing Institute

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
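
The greedy selection by distance × residual weight mentioned in the electric truck abstract above could look roughly like the sketch below. It rests on one possible reading of the metric, which the abstract does not define: "residual weight" is assumed to be the parcel load still on the truck after the candidate stop has been served. The truck's own mass, the battery model, and road altitude are ignored, and the function and parameter names are hypothetical.

```python
import math


def greedy_route(depot, stops):
    """Order stops greedily by distance x residual weight (assumed reading).

    depot: (x, y) start position.
    stops: dict mapping stop name -> ((x, y), parcel_weight).
    """
    remaining = dict(stops)
    position = depot
    load = sum(weight for _, weight in stops.values())
    route = []
    while remaining:
        def cost(name):
            point, weight = remaining[name]
            # Distance to the candidate times the load left after its drop-off.
            return math.dist(position, point) * (load - weight)

        # Sorting the names first makes tie-breaking deterministic.
        nxt = min(sorted(remaining), key=cost)
        point, weight = remaining.pop(nxt)
        route.append(nxt)
        position = point
        load -= weight
    return route


if __name__ == "__main__":
    stops = {
        "A": ((1.0, 0.0), 1.0),
        "B": ((2.0, 0.0), 10.0),
        "C": ((0.0, 5.0), 2.0),
    }
    print(greedy_route((0.0, 0.0), stops))
```

In this toy instance a nearest-neighbour rule would visit A first, whereas the weighted metric drops the heavy parcel at B first, so less weight is carried over the remaining distance.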