98 research outputs found

An MDE Approach for Energy Consumption Estimation in MPSoC Design

Author: Ben Atitallah Rabie
Dekeyser Jean Luc
Jemai Abderrazak
Meftali Samy
Niar Smail
Trabelsi Chiraz
Publication venue: HAL CCSD
Publication date: 24/05/2010

Energy consumption is a leading criterion to take into account in the design of multiprocessor systems on chip (MPSoC). In this paper, we present a solution to estimate the energy consumption early in MPSoC design in order to find a good performance/energy trade-off in the design flow. This solution is based on the injection of consumption estimators between the hardware components during the co-simulation of a system at the CABA (Cycle Accurate Bit Accurate) level. These estimators are designed using a design framework and the corresponding SystemC code is automatically generated thanks to a model driven approach. Our solution offers an energy estimation framework without changing the IP (Intellectual Property) source codes, using standalone estimation modules, which allows their reuse. The accuracy of this approach is checked by integrating the consumption estimation in the simulation of significant applications.

HAL - Lille 3

INRIA a CCSD electronic archive server

Cycle Accurate Simulation Model Generation for SoC Prototyping

Author: Fraboulet Antoine
Risset Tanguy
Scherrer Antoine
Publication venue: HAL CCSD
Publication date: 01/05/2004

RR 2004-18, ENS-Lyon, 24 pages. We present new results concerning the integration of high level designed IPs into a complete System on Chip. We first introduce a new computation model that can be used for cycle accurate simulation of register transfer level synthesized hardware. Then we provide simulation of a SoC integrating a data-flow IP synthesized with MMAlpha and the SocLib cycle accurate simulation environment. This integration also validates an efficient generic interface mechanism for data-flow IPs.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Transaction-level power analysis of VLSI digital systems

Author: Conti M.
Orcioni S.
Vece G.B.
Publication date: 01/01/2015

IRIS Università Politecnica delle Marche

Open Access Repository

A Study of Hardware Performance Counters Selection for Cross Architectural GPU Power Modeling

Author: De Giusti Armando Eduardo
De Giusti Laura Cristina
Naiouf Marcelo
Pi Puig Martín
Publication date: 16/03/2020

In the exascale race, where huge corporations are spending billions of dollars on designing highly efficient heterogeneous supercomputers, the need to reduce power envelopes forces current technologies to face crucial challenges and demands that the scientific community evaluate and optimize the performance-power ratio. While energy consumption continues to climb, the viability of these massive systems becomes a growing concern. In this context, power-related research becomes a priority. We therefore develop an exhaustive step-by-step process for selecting a comprehensive set of hardware performance counters to serve as input to an eventual GPU cross-architectural power consumption model. Our experiments show a high power-performance correlation between shared GPU events. We also present a set of events that delivers exclusive performance information in order to accurately predict GPU power fluctuations. XX Workshop Procesamiento Distribuido y Paralelo. Red de Universidades con Carreras en Informática.

A Study of Hardware Performance Counters Selection for Cross Architectural GPU Power Modeling

Author: De Giusti Armando Eduardo
De Giusti Laura Cristina
Naiouf Marcelo
Pi Puig Martín
Publication date: 01/10/2019

Servicio de Difusión de la Creación Intelectual

FPGA accelerator for gradient boosting decision trees

Author: Alcolea A.
Resano J.
Publication venue: MDPI AG
Publication date: 01/01/2021

Decision trees are a well-known machine learning technique. Recently their popularity has increased due to the powerful Gradient Boosting ensemble method, which gradually increases accuracy at the cost of executing a large number of decision trees. In this paper we present an accelerator designed to optimize the execution of these trees while reducing energy consumption. We have implemented it in an FPGA for embedded systems, and we have tested it with a relevant case study: pixel classification of hyperspectral images. In our experiments with different images, our accelerator can process the hyperspectral images at the same speed at which they are generated by the hyperspectral sensors. Compared to a high-performance processor running optimized software, on average our design is twice as fast and consumes 72 times less energy. Compared to an embedded processor, it is 30 times faster and consumes 23 times less energy.

Repositorio Universidad de Zaragoza

Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

Author: Becker Jürgen
Hübner Michael
Lagadec Loïc
Sander Oliver
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2010

ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.

KITopen

Proceedings of the 7th Junior Researcher Workshop on Real-Time Computing: JRWRTC 2013: Sophia Antipolis, France, October 16-18, 2013

Author: Altmeyer S.
Publication venue: Faculty of Science, University of Amsterdam
Publication date: 01/01/2013

International Migration, Integration and Social Cohesion online publications

Smart hardware designs for probabilistically-analyzable processor architectures

Author: Benedicte Illescas Pedro
Publication venue: Universitat Politècnica de Catalunya
Publication date: 07/04/2022

Future Critical Real-Time Embedded Systems (CRTES), like those in planes, cars or trains, require more and more guaranteed performance in order to satisfy the increasing performance demands of advanced complex software features. While increased performance can be achieved by deploying processor techniques currently used in High-Performance Computing (HPC) and mainstream domains, their use challenges the software timing analysis, a necessary step in CRTES verification and validation. Cache memories are known to have a high impact on performance, and in fact, current CRTES include multicores, usually with several levels of cache. In this line, this Thesis aims at increasing the guaranteed performance of CRTES by using techniques for caches building upon time randomization and providing probabilistic guarantees of tasks' execution time. In this Thesis, we first focus on improving cache placement and replacement to improve guaranteed performance. For placement, different existing policies are explored in a multi-level cache setup, and a solution is reached in which several of those policies are combined. For cache replacement, we analyze a pathological scenario that no cache policy so far accounts for and propose several policies that fix this pathological scenario. For shared caches in multicores, we observe that contention is mainly caused by private writes that go through to the shared cache, yet using a pure write-back policy also has its drawbacks. We propose a hybrid approach to mitigate this contention. Building on this solution, the next contribution tackles a problem caused by the need for some reliability mechanisms in CRTES. Implementing reliability close to the processor's core has a significant impact on performance. A look-ahead error detection solution is proposed to greatly mitigate the performance impact. The next contribution proposes the first hardware prefetcher for CRTES with arbitrary cache hierarchies.
Given their speculative nature, prefetchers that have a guaranteed positive impact on performance are difficult to design. We present a framework that provides execution time guarantees and obtains a performance benefit. Finally, we focus on the impact of timing anomalies in CRTES with caches. For the first time, a definition and taxonomy of timing anomalies is given for Measurement-Based Timing Analysis. Then, we focus on a specific timing anomaly that can happen with caches and provide a solution to account for it in the execution time estimates.

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa