41,699 research outputs found
Inferring Energy Bounds via Static Program Analysis and Evolutionary Modeling of Basic Blocks
The ever increasing number and complexity of energy-bound devices (such as
the ones used in Internet of Things applications, smart phones, and mission
critical systems) pose an important challenge on techniques to optimize their
energy consumption and to verify that they will perform their function within
the available energy budget. In this work we address this challenge from the
software point of view and propose a novel parametric approach to estimating
tight bounds on the energy consumed by program executions that are practical
for their application to energy verification and optimization. Our approach
divides a program into basic (branchless) blocks and estimates the maximal and
minimal energy consumption for each block using an evolutionary algorithm. Then
it combines the obtained values according to the program control flow, using
static analysis, to infer functions that give both upper and lower bounds on
the energy consumption of the whole program and its procedures as functions on
input data sizes. We have tested our approach on (C-like) embedded programs
running on the XMOS hardware platform. However, our method is general enough to
be applied to other microprocessor architectures and programming languages. The
bounds obtained by our prototype implementation can be tight while remaining on
the safe side of budgets in practice, as shown by our experimental evaluation.Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854). Improved version of the one
presented at the HIP3ES 2016 workshop (v1): more experimental results (added
benchmark to Table 1, added figure for new benchmark, added Table 3),
improved Fig. 1, added Fig.
Data dependent energy modelling for worst case energy consumption analysis
Safely meeting Worst Case Energy Consumption (WCEC) criteria requires
accurate energy modeling of software. We investigate the impact of instruction
operand values upon energy consumption in cacheless embedded processors.
Existing instruction-level energy models typically use measurements from random
input data, providing estimates unsuitable for safe WCEC analysis.
We examine probabilistic energy distributions of instructions and propose a
model for composing instruction sequences using distributions, enabling WCEC
analysis on program basic blocks. The worst case is predicted with statistical
analysis. Further, we verify that the energy of embedded benchmarks can be
characterised as a distribution, and compare our proposed technique with other
methods of estimating energy consumption
An Approach to Static Performance Guarantees for Programs with Run-time Checks
Instrumenting programs for performing run-time checking of properties, such
as regular shapes, is a common and useful technique that helps programmers
detect incorrect program behaviors. This is specially true in dynamic languages
such as Prolog. However, such run-time checks inevitably introduce run-time
overhead (in execution time, memory, energy, etc.). Several approaches have
been proposed for reducing such overhead, such as eliminating the checks that
can statically be proved to always succeed, and/or optimizing the way in which
the (remaining) checks are performed. However, there are cases in which it is
not possible to remove all checks statically (e.g., open libraries which must
check their interfaces, complex properties, unknown code, etc.) and in which,
even after optimizations, these remaining checks still may introduce an
unacceptable level of overhead. It is thus important for programmers to be able
to determine the additional cost due to the run-time checks and compare it to
some notion of admissible cost. The common practice used for estimating
run-time checking overhead is profiling, which is not exhaustive by nature.
Instead, we propose a method that uses static analysis to estimate such
overhead, with the advantage that the estimations are functions parameterized
by input data sizes. Unlike profiling, this approach can provide guarantees for
all possible execution traces, and allows assessing how the overhead grows as
the size of the input grows. Our method also extends an existing assertion
verification framework to express "admissible" overheads, and statically and
automatically checks whether the instrumented program conforms with such
specifications. Finally, we present an experimental evaluation of our approach
that suggests that our method is feasible and promising.Comment: 15 pages, 3 tables; submitted to ICLP'18, accepted as technical
communicatio
Costing JIT Traces
Tracing JIT compilation generates units of compilation that
are easy to analyse and are known to execute frequently. The AJITPar
project aims to investigate whether the information in JIT traces can be
used to make better scheduling decisions or perform code transformations
to adapt the code for a specific parallel architecture. To achieve this goal,
a cost model must be developed to estimate the execution time of an
individual trace.
This paper presents the design and implementation of a system for extracting
JIT trace information from the Pycket JIT compiler. We define
three increasingly parametric cost models for Pycket traces. We perform
a search of the cost model parameter space using genetic algorithms to
identify the best weightings for those parameters. We test the accuracy
of these cost models for predicting the cost of individual traces on a set
of loop-based micro-benchmarks. We also compare the accuracy of the
cost models for predicting whole program execution time over the Pycket
benchmark suite. Our results show that the weighted cost model
using the weightings found from the genetic algorithm search has the
best accuracy
Accurate modeling of parallel scientific computations
Scientific codes are usually parallelized by partitioning a grid among processors. To achieve top performance it is necessary to partition the grid so as to balance workload and minimize communication/synchronization costs. This problem is particularly acute when the grid is irregular, changes over the course of the computation, and is not known until load time. Critical mapping and remapping decisions rest on the ability to accurately predict performance, given a description of a grid and its partition. This paper discusses one approach to this problem, and illustrates its use on a one-dimensional fluids code. The models constructed are shown to be accurate, and are used to find optimal remapping schedules
Optimal pre-scheduling of problem remappings
A large class of scientific computational problems can be characterized as a sequence of steps where a significant amount of computation occurs each step, but the work performed at each step is not necessarily identical. Two good examples of this type of computation are: (1) regridding methods which change the problem discretization during the course of the computation, and (2) methods for solving sparse triangular systems of linear equations. Recent work has investigated a means of mapping such computations onto parallel processors; the method defines a family of static mappings with differing degrees of importance placed on the conflicting goals of good load balance and low communication/synchronization overhead. The performance tradeoffs are controllable by adjusting the parameters of the mapping method. To achieve good performance it may be necessary to dynamically change these parameters at run-time, but such changes can impose additional costs. If the computation's behavior can be determined prior to its execution, it can be possible to construct an optimal parameter schedule using a low-order-polynomial-time dynamic programming algorithm. Since the latter can be expensive, the performance is studied of the effect of a linear-time scheduling heuristic on one of the model problems, and it is shown to be effective and nearly optimal
Dynamic remapping decisions in multi-phase parallel computations
The effectiveness of any given mapping of workload to processors in a parallel system is dependent on the stochastic behavior of the workload. Program behavior is often characterized by a sequence of phases, with phase changes occurring unpredictably. During a phase, the behavior is fairly stable, but may become quite different during the next phase. Thus a workload assignment generated for one phase may hinder performance during the next phase. We consider the problem of deciding whether to remap a paralled computation in the face of uncertainty in remapping's utility. Fundamentally, it is necessary to balance the expected remapping performance gain against the delay cost of remapping. This paper treats this problem formally by constructing a probabilistic model of a computation with at most two phases. We use stochastic dynamic programming to show that the remapping decision policy which minimizes the expected running time of the computation has an extremely simple structure: the optimal decision at any step is followed by comparing the probability of remapping gain against a threshold. This theoretical result stresses the importance of detecting a phase change, and assessing the possibility of gain from remapping. We also empirically study the sensitivity of optimal performance to imprecise decision threshold. Under a wide range of model parameter values, we find nearly optimal performance if remapping is chosen simply when the gain probability is high. These results strongly suggest that except in extreme cases, the remapping decision problem is essentially that of dynamically determining whether gain can be achieved by remapping after a phase change; precise quantification of the decision model parameters is not necessary
- …