    Power estimation on functional level for programmable processors

    This contribution presents several approaches to power estimation for programmable processors and evaluates how well they transfer to modern processor architectures such as Very Long Instruction Word (VLIW) architectures. Special emphasis is placed on Functional-Level Power Analysis (FLPA). This approach divides the processor architecture into functional blocks such as the processing unit, the clock network, the internal memory, and others. The power consumption of each block is described by parameter-dependent arithmetic model functions. A parser automatically analyzes the assembler code of the system under estimation to obtain the input parameters of these functions, such as the achieved degree of parallelism or the kind and number of memory accesses. The approach is demonstrated and evaluated on two modern digital signal processors using a variety of basic digital signal processing algorithms. The estimates for these algorithms are compared with physically measured values, giving a maximum estimation error of 3%. Correspondence to: H. Blume ([email protected])
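
The FLPA flow described above (parse the assembler code, derive parameters, evaluate one model function per functional block) might be sketched as follows. This is only an illustration under stated assumptions: the assembly format, the issue width, and every coefficient are hypothetical and are not taken from the paper, whose actual model functions are fitted to measurements.

```python
# Hypothetical FLPA-style sketch: the assembly format and all coefficients
# below are invented for illustration, not taken from the paper.

ISSUE_WIDTH = 4  # assumed number of issue slots per cycle

# Assumed format: one execute packet per line, parallel instructions
# separated by '||', memory operations named LD or ST.
EXAMPLE_ASM = [
    "ADD a, b || MUL c, d || LD e, f || SUB g, h",
    "ADD a, b || ST c, d",
    "MUL a, b",
    "ADD c, d",
]


def extract_parameters(asm_lines, issue_width=ISSUE_WIDTH):
    """Return (degree of parallelism, memory accesses per packet)."""
    packets = [
        [instr.strip() for instr in line.split("||") if instr.strip()]
        for line in asm_lines
        if line.strip()
    ]
    if not packets:
        raise ValueError("no instructions found")
    instructions = [instr for packet in packets for instr in packet]
    memory_ops = sum(
        1 for instr in instructions if instr.split()[0] in ("LD", "ST")
    )
    # Fraction of issue slots actually used across all packets.
    parallelism = len(instructions) / (len(packets) * issue_width)
    memory_rate = memory_ops / len(packets)
    return parallelism, memory_rate


def processing_unit_mw(parallelism):
    """Hypothetical linear model for the processing unit."""
    return 50.0 + 100.0 * parallelism


def clock_network_mw(freq_mhz):
    """Hypothetical model: clock power proportional to frequency."""
    return 0.2 * freq_mhz


def internal_memory_mw(memory_rate):
    """Hypothetical linear model for the internal memory."""
    return 10.0 + 80.0 * memory_rate


def estimate_power_mw(asm_lines, freq_mhz):
    """Sum the per-block model functions into a total power estimate."""
    parallelism, memory_rate = extract_parameters(asm_lines)
    return (
        processing_unit_mw(parallelism)
        + clock_network_mw(freq_mhz)
        + internal_memory_mw(memory_rate)
    )
```

Calling `estimate_power_mw(EXAMPLE_ASM, 200.0)` sums the three block models for the example program. In a real FLPA setup the coefficients would come from measurements on the target processor.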

    Optimizing the flash-RAM energy trade-off in deeply embedded systems

    Deeply embedded systems often have the tightest constraints on energy consumption, requiring that they consume tiny amounts of current and run on batteries for years. However, they typically execute code directly from flash, instead of the more energy-efficient RAM. We implement a novel compiler optimization that exploits the relative efficiency of RAM by statically moving carefully selected basic blocks from flash to RAM. Our technique uses integer linear programming, with an energy cost model, to select a good set of basic blocks to place into RAM without impacting stack or data storage. We evaluate our optimization on a common ARM microcontroller and succeed in reducing the average power consumption by up to 41% and the energy consumption by up to 22%, while increasing execution time. A case study is presented in which an application executes code and then sleeps for a period of time. For this example we show that our optimization could allow the application to run on battery for up to 32% longer. We also show that for this scenario the total application energy can be reduced even if the optimization increases the execution time of the code.
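
The block-selection problem in the abstract above can be illustrated with a small sketch. The paper uses integer linear programming; the code below swaps in an exhaustive search over a handful of blocks, which solves the same 0/1 selection problem at toy scale. The block names, sizes, execution counts, and energy figures are hypothetical, and the costs of branching between flash and RAM are ignored.

```python
from itertools import combinations

# Hypothetical basic blocks:
# name -> (size in bytes, execution count,
#          energy per execution from flash in nJ, from RAM in nJ)
BLOCKS = {
    "loop_body": (40, 10_000, 12.0, 7.0),
    "init": (120, 1, 30.0, 18.0),
    "filter": (64, 2_500, 20.0, 12.0),
    "error_path": (80, 3, 25.0, 15.0),
}


def saving_nj(block):
    """Energy saved by running this block from RAM instead of flash."""
    _size, count, e_flash, e_ram = block
    return count * (e_flash - e_ram)


def select_blocks(blocks, ram_budget):
    """Pick the subset of blocks that fits in RAM and saves the most energy.

    Exhaustive search stands in for the ILP solver; it is only practical
    for a few blocks.
    """
    best, best_saving = (), 0.0
    names = sorted(blocks)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            size = sum(blocks[name][0] for name in subset)
            if size > ram_budget:
                continue  # violates the RAM size constraint
            total = sum(saving_nj(blocks[name]) for name in subset)
            if total > best_saving:
                best, best_saving = subset, total
    return set(best), best_saving


chosen, saved = select_blocks(BLOCKS, ram_budget=110)
```

With a 110-byte budget the search keeps the two hot blocks and leaves the rarely executed ones in flash. That is the behaviour one would expect from any cost model weighted by execution count.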

    Static analysis of energy consumption for LLVM IR programs

    Energy models can be constructed by characterizing the energy consumed by executing each instruction in a processor's instruction set. Such a model can be used to determine how much energy is required to execute a sequence of assembly instructions, without the need to instrument or measure hardware. However, statically analyzing low-level program structures is hard, and the gap between the high-level program structure and the low-level energy models needs to be bridged. We have developed techniques for performing a static analysis on the intermediate compiler representations of a program. Specifically, we target LLVM IR, a representation used by modern compilers, including Clang. Using these techniques we can automatically infer an estimate of the energy consumed when running a function on different platforms, using different compilers. One of the challenges in doing so is determining the energy cost of executing LLVM IR program segments, for which we have developed two different approaches. When this information is used in conjunction with our analysis, we are able to infer energy formulae that characterize the energy consumption of a particular program. This approach can be applied to any language targeting the LLVM toolchain, including C and XC, and to architectures such as ARM Cortex-M or XMOS xCORE, with a focus on embedded platforms. Our techniques are validated on these platforms by comparing the static analysis results with physical measurements taken from the hardware. Static energy consumption estimation enables energy-aware software development without requiring hardware knowledge.
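
A minimal sketch of the idea of an instruction-level energy model, and of an energy formula parameterized by input size, is given below. The opcode names and per-instruction costs are hypothetical, and the code works on plain opcode lists; it does not reproduce the paper's analysis of LLVM IR.

```python
from collections import Counter

# Hypothetical per-instruction energy costs in nanojoules.
ENERGY_NJ = {"add": 1.0, "mul": 3.0, "load": 4.0, "store": 4.5, "branch": 2.0}


def sequence_energy(instructions, costs=ENERGY_NJ):
    """Energy of a straight-line instruction sequence: sum of per-opcode costs."""
    counts = Counter(instructions)
    # An opcode missing from the cost table raises KeyError on purpose.
    return sum(costs[op] * n for op, n in counts.items())


def loop_energy(n, setup, body, costs=ENERGY_NJ):
    """Energy formula E(n) = E(setup) + n * E(body) for a loop run n times."""
    return sequence_energy(setup, costs) + n * sequence_energy(body, costs)


body = ["load", "mul", "add", "store", "branch"]
energy_for_10 = loop_energy(10, setup=["load"], body=body)
```

Here `loop_energy` plays the role of an inferred energy formula: once the setup and body costs are known, the estimate for any iteration count `n` follows without running the program.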

    Identifying Compiler Options to Minimise Energy Consumption for Embedded Platforms

    This paper presents an analysis of the energy consumption of an extensive number of the optimisations a modern compiler can perform. Using GCC as a test case, we evaluate a set of ten carefully selected benchmarks for five different embedded platforms. A fractional factorial design is used to systematically explore the large optimisation space (2^82 possible combinations), whilst still accurately determining the effects of optimisations and optimisation combinations. Hardware power measurements on each platform are taken to ensure all architectural effects on the energy consumption are captured. We show that fractional factorial design can find more optimal combinations than relying on built-in compiler settings. We explore the relationship between run-time and energy consumption, and identify scenarios where they are and are not correlated. A further conclusion of this study is that the structure of the benchmark has a larger effect than the hardware architecture on whether an optimisation will be effective, and that no single optimisation is universally beneficial for execution time or energy consumption. Comment: 14 pages, 7 figures
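
To illustrate how a fractional factorial design estimates the effect of individual options without testing every combination, the sketch below builds a half-fraction two-level design and computes main effects. It is a toy with three hypothetical compiler flags and made-up energy readings; the study itself covers 82 options and a far larger design.

```python
from itertools import product


def half_fraction(k):
    """Two-level half-fraction design with 2**(k-1) runs.

    Levels are -1 (flag off) and +1 (flag on). The last factor is set to
    the product of the others, so main effects are aliased with
    interactions of the remaining factors.
    """
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


def main_effects(runs, responses):
    """Mean response with a flag on minus mean response with it off."""
    effects = []
    for j in range(len(runs[0])):
        on = [y for run, y in zip(runs, responses) if run[j] == 1]
        off = [y for run, y in zip(runs, responses) if run[j] == -1]
        effects.append(sum(on) / len(on) - sum(off) / len(off))
    return effects


# Four runs instead of eight for three hypothetical flags.
runs = half_fraction(3)
# Made-up energy readings (mJ): flag 0 helps, flag 1 is neutral, flag 2 hurts.
energy_mj = [10 - 2 * a + 0 * b + 1 * c for a, b, c in runs]
effects = main_effects(runs, energy_mj)
```

A negative effect means enabling the flag lowered the measured energy. Because of the aliasing noted in the docstring, a real experiment would choose a design resolution that keeps the effects of interest separable.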

    Caractérisation automatisée de la consommation de puissance des processeurs pour l'estimation au niveau système

    Nowadays, power consumption is a key constraint and an essential performance metric in digital system design. Excessive heat dissipation in integrated circuits degrades their performance. More than ever, the battery life of new portable electronics also needs to increase. With classical design techniques at the Register Transfer Level (RTL), precise power estimation is possible only in the final stages of the development process. To address this problem, the literature has recently proposed raising the abstraction level of embedded systems design using the Electronic System Level (ESL) methodology. In this context, this work proposes a methodology that automatically characterizes the power consumption of configurable soft processors and generates an efficient model for estimating energy consumption at the system level. Using this model, a comparative study of three estimation techniques is presented. Results from five benchmark programs show power estimation eight thousand times faster than conventional estimation techniques, with an average error of only ±3.98% for the LEON3 processor and ±10.70% for the Microblaze processor.

    System-Level Power Estimation Methodology for MPSoC based Platforms

    With the rise of new submicron silicon integration technologies, power consumption in multiprocessor systems on chip (MPSoC) has become a primary factor in the design flow. Taking this key factor into account from the earliest design phases is essential, since it improves component reliability and shortens the time to market of the final product. Shifting the design entry point up to the system level is the most important countermeasure adopted to manage the increasing complexity of MPSoC designs. The reason is that decisions taken at this level, early in the design cycle, have the greatest impact on the final design in terms of power and energy efficiency. However, taking decisions at this level is very difficult, since the design space is extremely wide and exploring it has so far been mostly a manual activity. Efficient system-level power estimation tools are therefore necessary to enable proper Design Space Exploration (DSE) based on power/energy and timing.

    Function-Level Power Estimation Methodology for Microprocessors

    We have developed a function-level power estimation methodology for predicting the power dissipation of embedded software. For a given microprocessor core, we empirically build the “power data bank”, which stores the power information of the built-in library functions and basic instructions. To estimate the average power of a piece of embedded software on this core, we first get the execution information of the target software from program profiling/tracing tools. Then we evaluate the total energy consumption and execution time based on the “power data bank”, and take their ratio as the average power. High efficiency is achieved because no power simulator is used once the “power data bank” is built. We apply this method to a commercial microprocessor core and get power estimates with an average error of 3%. With this method, microprocessor vendors can provide users with the “power data bank” without releasing details of the core, helping users get early power estimates and eventually guiding power optimization.
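
The estimation step described above (look up each profiled function or instruction in the data bank, accumulate energy and time, take the ratio) might be sketched as follows. The data bank entries and the profile are hypothetical values chosen for illustration; a real data bank would be built empirically for a specific core.

```python
# Hypothetical "power data bank":
# entry -> (energy per call in nJ, execution time per call in ns)
POWER_DATA_BANK = {
    "memcpy": (500.0, 2000.0),
    "sqrt": (120.0, 400.0),
    "add": (3.0, 20.0),
}


def average_power_w(profile, bank=POWER_DATA_BANK):
    """Average power = total energy / total execution time.

    `profile` maps each function or instruction name to its call count,
    as a profiling or tracing tool might report it. Energy in nJ divided
    by time in ns gives watts directly.
    """
    total_energy_nj = sum(bank[name][0] * count for name, count in profile.items())
    total_time_ns = sum(bank[name][1] * count for name, count in profile.items())
    if total_time_ns == 0:
        raise ValueError("profile contains no executed entries")
    return total_energy_nj / total_time_ns


# Hypothetical profile of one program run.
profile = {"memcpy": 10, "sqrt": 100, "add": 1000}
power_w = average_power_w(profile)
```

No simulator is involved at estimation time: the cost is two table lookups per profiled entry, which reflects the efficiency argument made in the abstract.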