Compiler Techniques For Code Size And Power Reduction For Embedded Processors by Sarvani, V V N S
Abstract 
Embedded processors are becoming increasingly popular and widely used in several sys- 
tems. Embedded systems are extremely sensitive to cost and power consumption while 
still requiring performance. In this thesis, we consider two aspects of compilation tech-
niques for embedded systems, one that relates to cost, viz., code size reduction, and the
other to energy reduction.
DSP processors have address generation units that can perform address computation
in parallel with other operations. This feature reduces explicit address arithmetic instruc-
tions, often required to access locations in the stack frame, through auto-increment and
decrement addressing modes, thereby decreasing the code size. Effective utilization of
auto-increment and decrement modes requires an intelligent placement of variables in the
stack frame, which is termed "offset assignment". Although a number of algorithms
for efficient offset assignment have been proposed in the literature, they do not consider 
possible instruction reordering to reduce the number of address arithmetic instructions. 
In this thesis, we propose a unified approach which combines instruction reordering and
algebraic transformations to reduce the number of address arithmetic instructions. The 
proposed approach has been implemented in the SUIF compiler framework. We conducted 
our experiments on a set of real programs and compared our approach with four existing
approaches. 
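
The offset assignment problem described above can be illustrated with a minimal sketch. It is not the algorithm proposed in the thesis; it only shows the cost that such algorithms try to minimize. It assumes a single address register and auto-increment/decrement by exactly one stack slot, so a move of 0 or 1 slots between consecutive accesses is free and any larger move needs one explicit address arithmetic instruction. The function name, the variables and the access sequences are hypothetical.

```python
def address_arithmetic_cost(access_sequence, layout):
    """Count accesses that need an explicit address arithmetic instruction.

    Assumption: one address register, auto-increment/decrement by one slot.
    `layout` lists the variables in stack-frame order (slot 0, 1, 2, ...).
    """
    offset = {var: slot for slot, var in enumerate(layout)}
    cost = 0
    for prev, curr in zip(access_sequence, access_sequence[1:]):
        # A jump of more than one slot cannot be covered by auto-inc/dec.
        if abs(offset[curr] - offset[prev]) > 1:
            cost += 1
    return cost


# Effect of variable placement (offset assignment) on a fixed access sequence.
accesses = ["a", "c", "a", "c", "b", "d", "b"]
declaration_order = ["a", "b", "c", "d"]
tuned_order = ["a", "c", "b", "d"]
cost_declaration = address_arithmetic_cost(accesses, declaration_order)
cost_tuned = address_arithmetic_cost(accesses, tuned_order)

# Effect of instruction reordering on a fixed layout, assuming the four
# accesses are independent and may legally be reordered.
cost_original = address_arithmetic_cost(["a", "c", "b", "d"], declaration_order)
cost_reordered = address_arithmetic_cost(["a", "b", "c", "d"], declaration_order)
```

In this toy setting, placing a and c in adjacent slots removes all five explicit address updates for the first sequence, and reordering the second sequence removes both of its updates without changing the layout. This is the kind of interaction between placement and reordering that the abstract refers to.
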
Most embedded devices typically contain several DRAM chips (multiple memory mod-
ules) which can be individually put in low-power mode when a memory module is not
in use. Previous work on compiler optimizations for architectures involving DRAM chips
does not consider the effect of program transformations on the execution time, and their
impact on energy consumption. In this thesis, we study the effect of loop and code trans-
formations for such architectures in the presence of a cache. We propose a weighted fission
heuristic which decides between loop fission and loop fusion between a pair of statements 
based on the priorities given to performance and power. We also propose an array lay-
out heuristic to allocate arrays to memory modules/banks such that more memory banks
can be put into low-power mode for a longer duration. The weighted fission and the array
allocation heuristics have been implemented in the SUIF compiler framework and an en-
ergy simulator is implemented in the SimpleScalar tool set. We obtained our performance
and power results by running experiments on a set of array-intensive benchmarks. Our 
results indicate that loop fission and array layout heuristics in general facilitate putting 
the memory modules in low-power mode for a longer duration. However, this does not
necessarily result in a corresponding decrease in energy consumption due to an increase 
in the number of cache misses or an increase in loop overheads, which, in turn, increases 
the execution time of the application, causing an increase in the energy consumption.
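
The trade-off between loop fission and loop fusion described above can be sketched with a toy model. This is not the weighted fission heuristic or the energy simulator from the thesis. It assumes two arrays placed in two separate memory banks and treats a bank as idle for an entire loop when no statement in that loop touches it. All names and the iteration count are hypothetical.

```python
def bank_idle_iterations(loops, banks):
    """Return, per bank, the number of loop iterations in which it is untouched.

    Each loop is a pair (iteration_count, set_of_banks_accessed).
    Toy assumption: a bank can stay in low-power mode for a whole loop
    if that loop never accesses it.
    """
    return {
        bank: sum(n for n, used in loops if bank not in used)
        for bank in banks
    }


def total_iterations(loops):
    """Total loop-control iterations executed, a crude proxy for loop overhead."""
    return sum(n for n, _ in loops)


N = 1000
banks = ["bank0", "bank1"]

# Fused: one loop whose body touches both arrays, hence both banks.
fused = [(N, {"bank0", "bank1"})]
# Fissioned: two loops, each touching only one array and one bank.
fissioned = [(N, {"bank0"}), (N, {"bank1"})]

idle_fused = bank_idle_iterations(fused, banks)
idle_fissioned = bank_idle_iterations(fissioned, banks)
overhead_fused = total_iterations(fused)
overhead_fissioned = total_iterations(fissioned)
```

Under these assumptions, fission gives each bank N idle iterations where the fused loop gives none, but it also doubles the number of loop-control iterations. The model ignores cache behaviour, so it captures only the loop-overhead side of the trade-off reported in the abstract.
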
