7 research outputs found

    Identification of dynamic circuit specialization opportunities in RTL code

    Get PDF
    Dynamic Circuit Specialization (DCS) optimizes a Field-Programmable Gate Array (FPGA) design by assuming a set of its input signals are constant for a reasonable amount of time, leading to a smaller and faster FPGA circuit. When the signals actually change, a new circuit is loaded into the FPGA through runtime reconfiguration. The signals the design is specialized for are called parameters. For certain designs, parameters can be selected so the DCS implementation is both smaller and faster than the original implementation. However, DCS also introduces an overhead that is difficult for the designer to take into account, making it hard to determine whether a design is improved by DCS or not. This article presents extensive results on a profiling methodology that analyses Register-Transfer Level (RTL) implementations of applications to check if DCS would be beneficial. It proposes to use the functional density as a measure for the area efficiency of an implementation, as this measure contains both the overhead and the gains of a DCS implementation. The first step of the methodology is to analyse the dynamic behaviour of signals in the design, to find good parameter candidates. The overhead of DCS is highly dependent on this dynamic behaviour. A second stage calculates the functional density for each candidate and compares it to the functional density of the original design. The profiling methodology resulted in three implementations of a profiling tool, the DCS-RTL profiler. The execution time, accuracy, and the quality of each implementation is assessed based on data from 10 RTL designs. All designs, except for the two 16-bit adaptable Finite Impulse Response (FIR) filters, are analysed in 1 hour or less

    CAMP: A technique to estimate per-structure power at run-time using a few simple parameters

    Full text link
    Microprocessor power has become a first-order constraint at run-time. Designers must employ aggressive power-management techniques at run-time to keep a processor’s ballooning power requirements under control. Effective power management benefits from knowledge of run-time microprocessor power consumption in both the core and individual microarchitectural structures, such as caches, queues, and execution units. Increasingly feasible per-structure power-control techniques, such as fine-grain clock gat-ing, power gating, and dynamic voltage/frequency scaling (DVFS), become more effective from run-time estimates of per-structure power. However, run-time computation of per-structure power esti-mates based on utilization requires daunting numbers of input sta-tistics, which makes per-structure monitoring of run-time power a challenging problem

    Caractérisation en puissance des instructions d'un processeur multicoeur asynchrone

    Get PDF
    Ce travail porte sur une technique d’estimation de consommation de puissance et d’énergie pour des programmes logiciels exécutés par un processeur multicoeur asynchrone en traitement parallèle. Cette technique se veut d’abord et avant tout simple et de première estimation. L’application de cette technique permet dans un premier temps de caractériser chaque instruction en assembleur en termes de fréquence d’exécution propre et de temps d’exécution distinct. Ces temps d’exécution sont ensuite utilisés dans le calcul théorique d’un modèle de consommation de puissance et d’énergie représentant un programme logiciel exécutant ces mêmes instructions. Les estimations théoriques des consommations de puissance et d’énergie sont validées en les comparant aux mesures expérimentales de consommation effectuées. Des erreurs relatives de moins de 1% sont obtenues pour des programmes simples et des erreurs relatives variant de 6.1% à 9.6% pour les programmes plus complexes. Cette technique permet de connaître très tôt dans le processus de conception la consommation énergétique d’un programme logiciel spécifique exécuté avec un processeur asynchrone. Ultimement, cette technique simple pourrait aider à réduire la consommation énergétique d’un processeur asynchrone en guidant le concepteur dans ses choix algorithmiques

    Architectural Power Estimation Based on Behavior Level Profiling

    No full text
    High level synthesis is the process of generating register transfer (RT) level designs from behavioral specifications. High level synthesis systems have traditionally taken into account such constraints as area, clock period and throughput time. Many high level synthesis systems [1] permit generation of many alternative RT level designs meeting these constraints in a relatively short time. If it is possible to accurately estimate the power consumption of RT level designs, then a low power design from among these alternatives can be selected.In this paper, we present an accurate power estimation technique for register transfer level designs generated by high level synthesis systems. The technique has four main aspects: (1) Each RT level component used in high level synthesis is characterized for average switched capacitance per input vector. This data is stored in the RT level component library. (2) Using user-specified stimuli, the given behavioral description is simulated and event activities of various operators and carriers are measured. Then, the behavioral specification is submitted to the synthesis system and a number of alternative RTL designs meeting speed, space and throughput rate constraints are generated. (3) Event activity of each component in an RT level design is estimated using the event activities measured at the time of behavior level profiling and the structure of the RTL design itself. (4) The event activities so obtained are then used to modulate the average switched capacitances of the respective RT level components to obtain an estimate the total switched capacitance of each component.Detailed power estimation procedures for the three different parts of RTL designs, namely, data path, controller and interconnect are presented. Experimental results obtained from a variety of designs show that the power estimates are within 3%–10% of the actual power measured by simulating the transistor level designs extracted from mask layouts

    Architectural Power Estimation Based on Behavior Level Profiling

    No full text
    High level synthesis is the process of generating register transfer (RT) level designs from behavioral specifications. High level synthesis systems have traditionally taken into account such constraints as area, clock period and throughput time. Many high level synthesis systems [1] permit generation of many alternative RT level designs meeting these constraints in a relatively short time. If it is possible to accurately estimate the power consumption of RT level designs, then a low power design from among these alternatives can be selected.In this paper, we present an accurate power estimation technique for register transfer level designs generated by high level synthesis systems. The technique has four main aspects: (1) Each RT level component used in high level synthesis is characterized for average switched capacitance per input vector. This data is stored in the RT level component library. (2) Using user-specified stimuli, the given behavioral description is simulated and event activities of various operators and carriers are measured. Then, the behavioral specification is submitted to the synthesis system and a number of alternative RTL designs meeting speed, space and throughput rate constraints are generated. (3) Event activity of each component in an RT level design is estimated using the event activities measured at the time of behavior level profiling and the structure of the RTL design itself. (4) The event activities so obtained are then used to modulate the average switched capacitances of the respective RT level components to obtain an estimate the total switched capacitance of each component.Detailed power estimation procedures for the three different parts of RTL designs, namely, data path, controller and interconnect are presented. Experimental results obtained from a variety of designs show that the power estimates are within 3%–10% of the actual power measured by simulating the transistor level designs extracted from mask layouts
    corecore