Search CORE

414 research outputs found

Modeling DVFS and Power-Gating Actuators for Cycle-Accurate NoC-Based Simulators

Author: FORNACIARI WILLIAM
ZONI DAVIDE
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015
Field of study

Networks-on-chip (NoCs) are a widely recognized viable interconnection paradigm to support the multi-core revolution. One of the major design issues of multicore architectures is still the power, which can no longer be considered mainly due to the cores, since the NoC contribution to the overall energy budget is relevant. To face both static and dynamic power while balancing NoC performance, different actuators have been exploited in literature, mainly dynamic voltage frequency scaling (DVFS) and power gating. Typically, simulation-based tools are employed to explore the huge design space by adopting simplified models of the components. As a consequence, the majority of state-of-the-art on NoC power-performance optimization do not accurately consider timing and power overheads of actuators, or (even worse) do not consider them at all, with the risk of overestimating the benefits of the proposed methodologies. This article presents a simulation framework for power-performance analysis of multicore architectures with specific focus on the NoC. It integrates accurate power gating and DVFS models encompassing also their timing and power overheads. The value added of our proposal is manyfold: (i) DVFS and power gating actuators are modeled starting from SPICE-level simulations; (ii) such models have been integrated in the simulation environment; (iii) policy analysis support is plugged into the framework to enable assessment of different policies; (iv) a flexible GALS (globally asynchronous locally synchronous) support is provided, covering both handshake and FIFO re-synchronization schemas. To demonstrate both the flexibility and extensibility of our proposal, two simple policies exploiting the modeled actuators are discussed in the article

Archivio istituzionale della ricerca - Politecnico di Milano

Weighted Round Robin Configuration for Worst-Case Delay Optimization in Network-on-Chip

Author: Jafari F.
Jafari F.
Jantsch Axel
Jantsch Axel
Lu Zhonghai
Lu Zhonghai
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

We propose an approach for computing the end-to-end delay bound of individual variable bit-rate flows in a FIFO multiplexer with aggregate scheduling under Weighted Round Robin (WRR) policy. To this end, we use network calculus to derive per-flow end-to-end equivalent service curves employed for computing Least Upper Delay Bounds (LUDBs) of individual flows. Since real time applications are going to meet guaranteed services with lower delay bounds, we optimize weights in WRR policy to minimize LUDBs while satisfying performance constraints. We formulate two constrained delay optimization problems, namely, Minimize-Delay and Multiobjective optimization. Multi-objective optimization has both total delay bounds and their variance as minimization objectives. The proposed optimizations are solved using a genetic algorithm. A Video Object Plane Decoder (VOPD) case study exhibits 15.4% reduction of total worst-case delays and 40.3% reduction on the variance of delays when compared with round robin policy. The optimization algorithm has low run-time complexity, enabling quick exploration of large design spaces. We conclude that an appropriate weight allocation can be a valuable instrument for delay optimization in on-chip network designs

UEL Research Repository at University of East London

Crossref

Hope's Institutional Research Archive

Performance Aspects of Synthesizable Computing Systems

Author: Schleuniger Pascal
Publication venue: Technical University of Denmark
Publication date: 01/01/2014
Field of study

Online Research Database In Technology

Homogeneous and heterogeneous MPSoC architectures with network-on-chip connectivity for low-power and real-time multimedia signal processing

Author: Luca Fanucci
Sergio Saponara
Publication venue
Publication date: 01/01/2012
Field of study

Two multiprocessor system-on-chip (MPSoC) architectures are proposed and compared in the paper with reference to audio and video processing applications. One architecture exploits a homogeneous topology; it consists of 8 identical tiles, each made of a 32-bit RISC core enhanced by a 64-bit DSP coprocessor with local memory. The other MPSoC architecture exploits a heterogeneous-tile topology with on-chip distributed memory resources; the tiles act as application specific processors supporting a different class of algorithms. In both architectures, the multiple tiles are interconnected by a network-on-chip (NoC) infrastructure, through network interfaces and routers, which allows parallel operations of the multiple tiles. The functional performances and the implementation complexity of the NoC-based MPSoC architectures are assessed by synthesis results in submicron CMOS technology. Among the large set of supported algorithms, two case studies are considered: the real-time implementation of an H.264/MPEG AVC video codec and of a low-distortion digital audio amplifier. The heterogeneous architecture ensures a higher power efficiency and a smaller area occupation and is more suited for low-power multimedia processing, such as in mobile devices. The homogeneous scheme allows for a higher flexibility and easier system scalability and is more suited for general-purpose DSP tasks in power-supplied devices

Directory of Open Access Journals

Archivio della Ricerca - Università di Pisa

Open Access Repository

Adaptive Multiclient Network-on-Chip Memory Core : Hardware Architecture, Software Abstraction Layer, and Application Exploration

Author: Becker Jürgen
Göhringer Diana
Hübner Michael
Meder Lukas
Oey Oliver
Werner Stephan
Publication venue: Hindawi
Publication date: 01/01/2012
Field of study

This paper presents the hardware architecture and the software abstraction layer of an adaptive multiclient Network-on-Chip (NoC) memory core. The memory core supports the flexibility of a heterogeneous FPGA-based runtime adaptive multiprocessor system called RAMPSoC. The processing elements, also called clients, can access the memory core via the Network-on-Chip (NoC). The memory core supports a dynamic mapping of an address space for the different clients as well as different data transfer modes, such as variable burst sizes. Therefore, two main limitations of FPGA-based multiprocessor systems, the restricted on-chip memory resources and that usually only one physical channel to an off-chip memory exists, are leveraged. Furthermore, a software abstraction layer is introduced, which hides the complexity of the memory core architecture and which provides an easy to use interface for the application programmer. Finally, the advantages of the novel memory core in terms of performance, flexibility, and user friendliness are shown using a real-world image processing application

Crossref

KITopen

Directory of Open Access Journals

Adaptive Multiclient Network-on-Chip Memory Core: Hardware Architecture, Software Abstraction Layer, and Application Exploration

Author: Diana Göhringer
Jürgen Becker
Lukas Meder
Michael Hübner
Oliver Oey
Stephan Werner
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2012
Field of study

Crossref

Directory of Open Access Journals

NoC Design Flow for TDMA and QoS Management in a GALS Context

Author: Diguet Jean-Philippe
Evain Samuel
Houzet Dominique
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2006
Field of study

International audienceThis paper proposes a new approach dealing with the tedious problem of NoC guaranteed traffics according to GALS constraints impelled by the upcoming large System-on-Chips with multiclock domains. Our solution has been designed to adjust a tradeoff between synchronous and clockless asynchronous techniques. By means of smart interfaces between synchronous sub-NoCs, Quality-of-Service (QoS) for guaranteed traffic is assured over the entire chip despite clock heterogeneity. This methodology can be easily integrated in the usual NoC design flow as an extension to traditional NoC synchronous design flows. We present real implementation obtained with our tool for a 4G telecom scheme

HAL-CentraleSupelec

Springer - Publisher Connector

Directory of Open Access Journals

HAL-Université de Bretagne Occidentale

HAL-Rennes 1