Search CORE

3,400 research outputs found

Static analysis of energy consumption for LLVM IR programs

Author: Eder Kerstin
Georgiou Kyriakos
Grech Neville
Kerrison Steve
Morse Jeremy
Pallister James
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/06/2015
Field of study

Energy models can be constructed by characterizing the energy consumed by executing each instruction in a processor's instruction set. This can be used to determine how much energy is required to execute a sequence of assembly instructions, without the need to instrument or measure hardware. However, statically analyzing low-level program structures is hard, and the gap between the high-level program structure and the low-level energy models needs to be bridged. We have developed techniques for performing a static analysis on the intermediate compiler representations of a program. Specifically, we target LLVM IR, a representation used by modern compilers, including Clang. Using these techniques we can automatically infer an estimate of the energy consumed when running a function under different platforms, using different compilers. One of the challenges in doing so is that of determining an energy cost of executing LLVM IR program segments, for which we have developed two different approaches. When this information is used in conjunction with our analysis, we are able to infer energy formulae that characterize the energy consumption for a particular program. This approach can be applied to any languages targeting the LLVM toolchain, including C and XC or architectures such as ARM Cortex-M or XMOS xCORE, with a focus towards embedded platforms. Our techniques are validated on these platforms by comparing the static analysis results to the physical measurements taken from the hardware. Static energy consumption estimation enables energy-aware software development, without requiring hardware knowledge

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks

Author: Amarasinghe Saman
Carbin Michael
Mendis Charith
Renda Alex
Publication venue
Publication date: 30/05/2019
Field of study

Predicting the number of clock cycles a processor takes to execute a block of assembly instructions in steady state (the throughput) is important for both compiler designers and performance engineers. Building an analytical model to do so is especially complicated in modern x86-64 Complex Instruction Set Computer (CISC) machines with sophisticated processor microarchitectures in that it is tedious, error prone, and must be performed from scratch for each processor generation. In this paper we present Ithemal, the first tool which learns to predict the throughput of a set of instructions. Ithemal uses a hierarchical LSTM--based approach to predict throughput based on the opcodes and operands of instructions in a basic block. We show that Ithemal is more accurate than state-of-the-art hand-written tools currently used in compiler backends and static machine code analyzers. In particular, our model has less than half the error of state-of-the-art analytical models (LLVM's llvm-mca and Intel's IACA). Ithemal is also able to predict these throughput values just as fast as the aforementioned tools, and is easily ported across a variety of processor microarchitectures with minimal developer effort.Comment: Published at 36th International Conference on Machine Learning (ICML) 201

arXiv.org e-Print Archive

DSpace@MIT

Accelerating host-compiled simulation by modifying IR code: industrial application in the spatial domain

Author: Posadas Cobo Héctor
Sánchez Renedo Manuel
Villar Bonet Eugenio, 1957-
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

Space applications rely on long and complex design processes, as they must deal with strict non-functional requirements such as criticality, timeliness, reliability and safety. The huge number of analysis and evaluations performed requires powerful simulations technologies combining high simulation speed and accuracy. Host-compiled simulation is a powerful approach to achieve fast, timed simulation of software running in complex embedded systems. However, in the general term, there is still the need of improving the speed and accuracy of these solutions, and there is a lack of host-compiled approaches oriented to space applications. To solve the first point, this paper presents an alternative that modifies the standard solution of adding the modeling of the cross-compiled control flow in the host computer by modifying the compiler's intermediate representation. That way, the host binary naturally follows the cross-compiled binary flow, avoiding a separate modeling, and improving simulation speed while maintaining accuracy. Additionally, the paper focuses on LEON processor, commonly used by the European Space Agency (ESA).This work has been funded by FEDER/Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación/ TEC2017-86722-C4-3-R and the EC through the FP7-JTI 621429 EMC2 project

Crossref

UCrea

Performance modeling of embedded applications with zero architectural knowledge

Author: F. Ferrandi
M. Lattuada
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010
Field of study

Performance estimation is a key step in the development of an embedded system. Normally, the performance evaluation is performed using a simulator or a performance mathematical model of the target architecture. However, both these approaches are usually based on the knowledge of the architectural details of the target. In this paper we present a methodology for automatically building an analytical model to estimate the performance of an application on a generic processor without requiring any information about the processor architecture but the one provided by the GNU GCC Intermediate Representation. The proposed methodology exploits the linear regression technique based on an application analysis performed on the Register Transfer Level internal representation of the GNU GCC compiler. The benefits of working with this type of model and with this intermediate representation are three: we take into account most of the compiler optimizations, we implicitly consider some architectural characteristics of the target processor and we can easily estimate the performance of portions of the specification. We validate our approach by evaluating with cross-validation technique the accuracy and the generality of the performance models built for the ARM926EJ-S and the LEON3 processor

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Chapter SW Annotation Techniques and RTOS Modelling for Native Simulation of Heterogeneous Embedded Systems

Author: Díaz Álvaro
Posadas Hector
Villar Eugenio
Publication venue: 'IntechOpen'
Publication date: 02/06/2021
Field of study

Space scienc

Directory of Open Access Books (DOAB)

SW Annotation Techniques and RTOS Modelling for Native Simulation of Heterogeneous Embedded Systems

Author: Eugenio Villar
Héctor Posadas
Álvaro Díaz
Publication venue: 'IntechOpen'
Publication date: 02/03/2012
Field of study

IntechOpen

Combining Target-independent Analysis with Dynamic Profiling to Build the Performance Model of a DSP

Author: F. Ferrandi
M. Lattuada
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

Fast and accurate performance estimation is a key aspect of heterogeneous embedded systems design flow, since cycle-accurate simulators, when exist, are usually too slow to be used during design space exploration. Performance estimation techniques are usually based on combination of estimation of the single processing elements which compose the system. Architectural characteristics of Digital Signal Processors (DSP), such as the presence of Single Instruction Multiple Data operations or of special hardware units to control loop executions, introduce peculiar aspects in the performance estimation problem. In this paper we present a methodology to estimate the performance of a function on a given dataset on a DSP. Estimation is performed combining the host profiling data with the function GNU GCC GIMPLE representation. Starting from the results of this analysis, we build a performance model of a DSP by exploiting the Linear Regression Technique. Use of GIMPLE representation allows to take directly into account the target-independent optimizations performed by the DSP compiler. We validate our approach by building a performance model of the MagicV DSP and by testing the model on a set of significative benchmarks

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref