Inferring Energy Bounds via Static Program Analysis and Evolutionary Modeling of Basic Blocks
The ever increasing number and complexity of energy-bound devices (such as
the ones used in Internet of Things applications, smart phones, and mission
critical systems) pose an important challenge for techniques to optimize their
energy consumption and to verify that they will perform their function within
the available energy budget. In this work we address this challenge from the
software point of view and propose a novel parametric approach to estimating
tight bounds on the energy consumed by program executions, practical enough
to be applied to energy verification and optimization. Our approach
divides a program into basic (branchless) blocks and estimates the maximal and
minimal energy consumption for each block using an evolutionary algorithm. Then
it combines the obtained values according to the program control flow, using
static analysis, to infer functions that give both upper and lower bounds on
the energy consumption of the whole program and its procedures as functions of
input data sizes. We have tested our approach on (C-like) embedded programs
running on the XMOS hardware platform. However, our method is general enough to
be applied to other microprocessor architectures and programming languages. The
bounds obtained by our prototype implementation can be tight while remaining on
the safe side of budgets in practice, as shown by our experimental evaluation.

Comment: Pre-proceedings paper presented at the 27th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur, Belgium, 10-12 October 2017 (arXiv:1708.07854). Improved version of the one presented at the HIP3ES 2016 workshop (v1): more experimental results (added benchmark to Table 1, added figure for new benchmark, added Table 3), improved Fig. 1, added Fig.
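The composition step can be illustrated with a small sketch. The following Python fragment (an illustration only, not the authors' static analysis) combines assumed per-block energy intervals along sequence, branch, and loop structure into bounds that are functions of an input size n; every number in it is a made-up placeholder.

```python
# Sketch: compose per-basic-block energy intervals into whole-program
# bounds along simplified control-flow structure. All block intervals
# and the loop structure are illustrative placeholders.

def sequence(*intervals):
    """Interval for blocks executed one after another: bounds add up."""
    return (sum(i[0] for i in intervals), sum(i[1] for i in intervals))

def branch(a, b):
    """Interval when either of two branches may be taken."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def loop(body, iterations):
    """Interval for a loop whose body runs `iterations` times."""
    return (body[0] * iterations, body[1] * iterations)

# Hypothetical [min, max] energy per block in microjoules, as an
# evolutionary search over operand values might estimate them.
B_INIT, B_THEN, B_ELSE, B_EXIT = (1.0, 1.2), (2.0, 2.6), (1.5, 3.0), (0.5, 0.6)

def program_bounds(n):
    """Whole-program energy bounds as functions of input size n."""
    return sequence(B_INIT, loop(branch(B_THEN, B_ELSE), n), B_EXIT)

for n in (10, 100, 1000):
    lo, hi = program_bounds(n)
    print(f"n={n:5d}: {lo:8.1f} .. {hi:8.1f} uJ")
```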
Improving early design stage timing modeling in multicore based real-time systems
This paper presents a modeling approach for the timing behavior of real-time embedded systems (RTES) in early design phases. The model focuses on multicore processors - accepted as the next computing platform for RTES - and in particular it predicts the contention that tasks suffer when accessing on-chip shared resources. The model has the key properties of not requiring the application's source code or binary, and of offering high accuracy at low overhead. The former is of paramount importance in those common scenarios in which several software suppliers work in parallel implementing different applications for a system integrator, subject to different intellectual property (IP) constraints. Our model helps reduce the risk of exceeding the assigned budgets for each application in late design stages, and its associated costs.

This work has received funding from the European Space Agency under Project Reference AO=17722=13=NL=LvH, and has also been supported by the Spanish Ministry of Science and Innovation grant TIN2015-65316-P. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.
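To give a flavor of such an early-stage contention model, here is a hypothetical sketch: given per-task access counts to shared resources (obtainable from a trace rather than from source code) and an assumed worst-case delay per access, it bounds the contention delay a task may suffer. The resource names, counts, and latencies are invented; the paper's actual model is not reproduced here.

```python
# Sketch: a simple upper-bounding contention model in the spirit of
# early-design-stage timing analysis. Per-access worst-case delays and
# the trace-derived access counts are illustrative assumptions.

UCD = {"bus": 4, "shared_L2": 10}  # worst-case delay one access can suffer (cycles)

def contention_bound(accesses, n_contenders):
    """Upper-bound the contention delay of a task when n_contenders
    other cores compete for the same shared resources."""
    return sum(count * UCD[res] * n_contenders
               for res, count in accesses.items())

task = {"bus": 20_000, "shared_L2": 5_000}  # access counts from a trace
solo_cycles = 1_000_000                     # execution time in isolation

for cores in (1, 2, 4):
    bound = solo_cycles + contention_bound(task, cores - 1)
    print(f"{cores} cores: <= {bound:,} cycles")
```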
Data dependent energy modelling for worst case energy consumption analysis
Safely meeting Worst Case Energy Consumption (WCEC) criteria requires
accurate energy modeling of software. We investigate the impact of instruction
operand values upon energy consumption in cacheless embedded processors.
Existing instruction-level energy models typically use measurements from random
input data, providing estimates unsuitable for safe WCEC analysis.
We examine probabilistic energy distributions of instructions and propose a
model for composing instruction sequences using distributions, enabling WCEC
analysis on program basic blocks. The worst case is predicted with statistical
analysis. Further, we verify that the energy of embedded benchmarks can be
characterised as a distribution, and compare our proposed technique with other
methods of estimating energy consumption.
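For intuition, the sketch below composes two hypothetical per-instruction energy histograms by convolution (energies of independently distributed instructions add) and reads a high quantile of the resulting basic-block distribution as a statistical worst case. The distributions, bin width, and quantile are placeholders, not measured data.

```python
# Sketch: build a basic-block energy distribution from per-instruction
# distributions and take a high quantile as a statistical worst case.
# All histograms and constants are made-up placeholders.

import numpy as np

def convolve(p, q):
    """Distribution of the sum of two independent discrete energies."""
    return np.convolve(p, q)

def quantile(pmf, step, q=0.9999):
    """Smallest energy whose cumulative probability reaches q."""
    cdf = np.cumsum(pmf)
    return np.searchsorted(cdf, q) * step

STEP = 0.1  # energy resolution in nanojoules per histogram bin

# Hypothetical operand-dependent energy histograms for two instructions.
add_pmf = np.array([0.2, 0.5, 0.3])            # spans 0.0-0.2 nJ
mul_pmf = np.array([0.0, 0.1, 0.3, 0.4, 0.2])  # spans 0.0-0.4 nJ

block = convolve(add_pmf, mul_pmf)
print("99.99th-percentile block energy:",
      round(quantile(block, STEP), 2), "nJ")
```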
Estimation of WCET using a little language to describe microcontroller and DSP architectures
A method for analysing and predicting the timing properties of a program fragment is described. First, a little language implemented to describe a processor's architecture is presented, followed by a new static WCET estimation method. The timing analysis starts by compiling a processor's architecture program, followed by disassembly of the program fragment. After sectioning the assembler program into basic blocks, call graphs are generated, and these data are later used to evaluate the pipeline hazards and cache misses that penalize real-time performance. Some experimental results of using the developed tool to predict the WCET of code segments on an Intel microcontroller are presented, and conclusions and future work are discussed.
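As a rough illustration of the basic-block stage, the sketch below computes a longest-path WCET over a small acyclic CFG, charging each block its cycle cost plus a cache-miss penalty. The CFG, costs, and penalty are invented; in the described tool they would come from the disassembled binary and the compiled architecture program.

```python
# Sketch: WCET as the longest path over basic blocks of an acyclic CFG,
# with a per-block cache-miss penalty. All numbers are illustrative.

from functools import lru_cache

# CFG as block -> list of successors; "exit" has none.
CFG = {"entry": ["test"], "test": ["then", "else"],
       "then": ["exit"], "else": ["exit"], "exit": []}

CYCLES = {"entry": 12, "test": 4, "then": 30, "else": 22, "exit": 6}
MISSES = {"entry": 1, "test": 0, "then": 2, "else": 1, "exit": 0}
MISS_PENALTY = 10  # cycles per cache miss, a made-up figure

@lru_cache(maxsize=None)
def wcet(block):
    """Longest cost from `block` to program exit. The CFG must be
    acyclic; loops would first be bounded and virtually unrolled."""
    cost = CYCLES[block] + MISSES[block] * MISS_PENALTY
    return cost + max((wcet(s) for s in CFG[block]), default=0)

print("WCET estimate:", wcet("entry"), "cycles")
```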
A machine-independent WCET predictor for microcontrollers and DSPs
This paper describes a method for analyzing and predicting the timing properties of a program fragment. It first presents a little language implemented to describe a processor's architecture, and then a static WCET estimation method. The timing analysis starts by compiling a processor's architecture program, followed by the disassembling of the program fragment. The assembler program is then decomposed into basic blocks and a call graph is generated. These data are later used to evaluate the pipeline hazards and cache misses that penalize real-time performance. Finally, some experimental results of using the developed tool to predict the WCET of code segments on an Intel microcontroller are presented.
...execution, the desired time will be found by averaging. Even with this approach, obtaining an accurate measurement requires solving a number of complications, such as compiler optimizations and operating-system distortions. Nevertheless, these approaches are unrealistic, since they ignore system interference and the effects of the cache and pipeline, two very important features of some processors that can be used in our hardware architecture. Shaw [1], Puschner [2], and Mok [3] developed very elaborate methodologies for WCET estimation, but none of them takes into account the effects of the cache and pipeline.
An Efficient Monte Carlo-based Probabilistic Time-Dependent Routing Calculation Targeting a Server-Side Car Navigation System
Incorporating speed probability distributions into the computation of route
planning in car navigation systems guarantees more accurate and precise
responses. In this paper, we propose a novel approach for dynamically selecting
the number of samples used for the Monte Carlo simulation to solve the
Probabilistic Time-Dependent Routing (PTDR) problem, thus improving the
computation efficiency. The proposed method determines, in a proactive
manner, the number of simulations to be done to extract the travel-time
estimate for each specific request, while respecting an error threshold on
output quality. The methodology requires a reduced effort on the
application development side. We adopted an aspect-oriented programming
language (LARA) together with a flexible dynamic autotuning library (mARGOt)
to instrument the code and to take tuning decisions on the number of samples,
respectively, improving the execution efficiency. Experimental results demonstrate
that the proposed adaptive approach saves a large fraction of simulations
(between 36% and 81%) with respect to a static approach while considering
different traffic situations, paths and error requirements. Given the
negligible runtime overhead of the proposed approach, it results in an
execution-time speedup between 1.5x and 5.1x. This speedup is reflected at
the infrastructure level in a reduction of around 36% of the computing
resources needed to support the whole navigation pipeline.
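For intuition only, the sketch below adapts the Monte Carlo sample count with a simple sequential stopping rule (keep sampling until the relative standard error of the mean drops below a threshold), which differs from the paper's proactive prediction of the sample count; the lognormal segment-speed model and all thresholds are stand-ins.

```python
# Sketch: adaptive Monte Carlo sampling for a travel-time estimate,
# stopping once the estimate is precise enough. The traversal model
# and every constant here are illustrative placeholders.

import random
import statistics

def sample_travel_time():
    """One simulated route traversal; a lognormal stands in for the
    real per-segment speed distributions."""
    return sum(random.lognormvariate(1.0, 0.4) for _ in range(20))

def estimate(rel_err=0.01, batch=100, max_samples=100_000):
    samples = []
    while len(samples) < max_samples:
        samples += [sample_travel_time() for _ in range(batch)]
        mean = statistics.fmean(samples)
        sem = statistics.stdev(samples) / len(samples) ** 0.5
        if sem / mean < rel_err:  # stop when the estimate is precise enough
            break
    return mean, len(samples)

mean, n = estimate()
print(f"travel time ~ {mean:.1f} (used {n} samples)")
```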
High-Integrity Performance Monitoring Units in Automotive Chips for Reliable Timing V&V
As software continues to control more system-critical functions in cars, its timing is becoming an integral element in functional safety. Timing validation and verification (V&V) assesses software's end-to-end timing measurements against given budgets. The advent of multicore processors with massive resource sharing reduces the significance of end-to-end execution times for timing V&V and requires reasoning on (worst-case) access delays on contention-prone hardware resources. While Performance Monitoring Units (PMUs) support this finer-grained reasoning, their design has never been a prime consideration in high-performance processors - from which automotive-chip PMU implementations descend - since the PMU does not directly affect performance or reliability. To match the PMU's instrumental importance for timing V&V, we advocate for PMUs in automotive chips that explicitly track activities related to worst-case (rather than average) software behavior, are recognized as an ISO 26262 mandatory high-integrity hardware service, and are accompanied by detailed documentation that enables their effective use to derive reliable timing estimates.

This work has also been partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Jaume Abella has been partially supported by the MINECO under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717. Enrico Mezzetti has been partially supported by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva-Incorporación postdoctoral fellowship number IJCI-2016-27396.
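As a hypothetical illustration of how such counters could feed timing V&V, the sketch below converts per-event contention counts into a delay bound. The event names and per-event worst-case latencies are invented and do not correspond to any real automotive PMU.

```python
# Sketch: turn PMU event counts into a contention-delay bound, assuming
# counters that track worst-case-relevant events (e.g. cycles a request
# stalls behind another core). Events and latencies are hypothetical.

WORST_CASE_LATENCY = {          # cycles charged per contention event
    "BUS_ACCESS_CONTENDED": 8,
    "L2_MISS_CONTENDED": 28,
}

def delay_bound(pmu_counts):
    """Upper bound on contention delay from per-event counts."""
    return sum(count * WORST_CASE_LATENCY[event]
               for event, count in pmu_counts.items())

# Counter snapshot as a V&V test run might collect it.
run = {"BUS_ACCESS_CONTENDED": 1_500, "L2_MISS_CONTENDED": 420}
print("contention delay <=", delay_bound(run), "cycles")
```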
Integrated Worst-Case Execution Time Estimation of Multicore Applications
Worst-case execution time (WCET) analysis has reached a high level of precision for sequential programs executing on single-cores. In this paper we extend a state-of-the-art WCET analysis technique to compute tight WCET estimates for parallel applications running on multicores. The proposed technique is termed integrated because it jointly considers the sequential code regions running on the cores and the communications between them. This makes it possible to capture the hardware effects across code regions assigned to the same core, which significantly improves analysis precision. We demonstrate that our analysis produces tighter execution-time bounds than classical techniques, which first determine the WCET of sequential code regions and then compute the global response time by integrating communication costs. The comparison is done on two embedded control applications, where the gain is 21% on average.
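A toy version of the response-time computation such an analysis performs: model the parallel application as a graph of sequential code regions and communications, and bound the end-to-end time by the longest path. All region WCETs and communication costs are illustrative, and the joint modeling of hardware effects across regions (the paper's key contribution) is not captured here.

```python
# Sketch: end-to-end bound over a graph of sequential code regions and
# communications on two cores. All costs are illustrative numbers.

from functools import lru_cache

# region -> (WCET in cycles, list of (successor, communication cost))
GRAPH = {
    "A0": (400, [("A1", 50), ("B0", 120)]),  # core-A work also feeds core B
    "B0": (300, [("B1", 60)]),
    "A1": (250, [("join", 0)]),
    "B1": (500, [("join", 0)]),
    "join": (80, []),
}

@lru_cache(maxsize=None)
def response(region):
    """Longest cost from `region` to the end of the task graph."""
    wcet, succs = GRAPH[region]
    return wcet + max((comm + response(s) for s, comm in succs),
                      default=0)

print("end-to-end bound:", response("A0"), "cycles")
```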