






Tracing the flow of control in code generated from
switch-case statements is difficult for static program
analysis tools when the code contains jumps to




















Static analysis of the worst-case execution time
(WCET) of a program usually begins by building the
control-flow graph (CFG). On the machine code
level, where most WCET tools work, the tool has to
find the possible successor instructions of each







C or Ada is a very flexible control structure. The
programmer can choose the type of the switch
index, for example an 8-bit or a 32-bit number;





kinds of code to implement different kinds of
switch-case statements.
For small target processors such as the Intel 8051
or Atmel AVR some compilers try to reduce code
size by encoding the switch-case statement into a
ROM switch table and generating a call or jump to a
switch handler routine that interprets the table at
run-time. There may be several types of switch
table, for example depending on the index type,
each with its own switch handler.




handler for use in examples. Section 3 states the













The switch-table structure in this example was







entries. An entry represents a set of 8-bit switch-index
values that lead to the same case in the




index and the mask octet equals the match octet.
The order of entries in the table is arbitrary but the
last entry always has a zero mask and match. This
ECRTS 2007





















In this example a switch-case statement is
compiled into code that loads the switch index (the




SwHandler searches the table for an entry that
matches the switch index, then jumps to the code
address stored in this entry.
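As a rough illustration of the search that SwHandler performs, the following Python sketch models a switch table as a list of (mask, match, address) entries and returns the address from the first entry for which the switch index ANDed with the mask equals the match octet. The names Entry, EXAMPLE_TABLE and sw_handler are hypothetical, and the table contents are assumed values for a small four-entry table; this is a model of the behaviour, not the handler code itself.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    mask: int     # octet ANDed with the switch index
    match: int    # octet compared with the masked index
    address: int  # code address of the case to jump to


# Assumed example table: k = 4, k = 8 or 9, k = 11, then a default entry
# with zero mask and match, which matches every index.
EXAMPLE_TABLE = [
    Entry(0xFF, 4, 0x020B),
    Entry(0xFE, 8, 0x021C),
    Entry(0xFF, 11, 0x021C),
    Entry(0x00, 0, 0x0224),
]


def sw_handler(index: int, table: List[Entry]) -> int:
    """Return the case address selected by an 8-bit switch index."""
    for entry in table:
        if (index & entry.mask) == entry.match:
            return entry.address
    raise ValueError("table has no entry that matches this index")
```

Because the last entry has a zero mask and a zero match, the search always terminates there if no earlier entry matches.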
SwHandler can be written in AVR assembly
language [3] as shown in Listing 1 below. For later





and the hex form of the switch table. Listing 2








0100 pop r30 ; low octet of table address
0101 pop r31 ; high octet of table address
; Z = r31:r30 = word address of the switch table.
0102 add r30,r30 ; Multiply Z by two to make
0103 adc r31,r31 ; it an octet address for lpm.
loop: ; Z points at the next switch table entry.
0104 lpm r1,Z+ ; r1 := entry.mask
0105 lpm r2,Z+ ; r2 := entry.match
0106 and r1,r0 ; r1 := index and mask
0107 cp r1,r2 ; compare to entry.match
0108 breq found ; branch if entry matches index
0109 adiw Z,2 ; no match, point at next entry
010A rjmp loop ; try next entry
found: ; Entry matches. Z points at entry.address.
010B lpm r1,Z+ ; r1 := address low octet
010C lpm r31,Z ; r31 := address high octet














































processor-specific parts, for example the parts of
Bound-T that decode AVR instructions.
Bound-T can analyse many aspects of a sub-




cannot give a context-dependent resolution of the







switch table. The analysis must thus simulate or
execute these instructions. Furthermore, the analysis
must unroll the table-scanning loop in the switch
handler. Each iteration of the loop leads to a different









original program: it is specialized to the domain
where the bound inputs have the given values.
The proposed analysis of switch tables and switch
handlers uses partial evaluation of subprograms as
follows. A switch handler is a subprogram with two
inputs: the switch index and the switch table. At
analysis time, in a given invocation of a switch
handler for a given switch-case statement the switch
index is usually unbound (has an unknown, dynamic
value) but the switch table is bound to a static















implemented in a way that fits the Bound-T architecture,
not as a general-purpose partial evaluator
such as the mix evaluator described in [4]. In
Bound-T the original, unevaluated subprogram (the
switch handler) is represented implicitly by its entry
address and the instructions in the target program
that can be reached from the entry address. The
residual subprogram (the switch handler specialized
to a given switch table) is represented as a part of the
flow-graph of the subprogram that contains the
switch-case statement. This part is a subgraph rooted
at the node that invokes the switch handler. The




this partial evaluator is machine-code program-memory
images and the target language is Bound-T
flow-graphs. (As a part of Bound-T the implementation
language is Ada, but this is not important.)
The next two sections explain how partial
evaluation is implemented in Bound-T and why it is a
natural extension of the way in which Bound-T builds
flow-graphs from machine code. This says more





Bound-T and the iterative algorithm for building
flow-graphs from machine code. The next section
extends the algorithm to include partial evaluation.
First a definition of terms. The internal representation
of a subprogram in Bound-T is a flow-graph
(FG). A flow-graph differs from a control-flow graph
(CFG) because (as defined in this paper) a CFG node















Each flow-graph node has several attributes to
model the instruction in this node. For this paper the
main attribute is the computational effect: a set of
assignments of expressions to variables (registers or
memory locations). For example, the effect of the
AVR instruction lpm r1, Z+ is modeled by the





Each edge in the flow-graph is provided with a
Boolean expression that is a necessary but perhaps
not sufficient condition for taking this edge. For
example, the condition for the branch-taken edge
after the AVR instruction breq is that the “zero” flag
be set, here written as zf = 1. The condition is
evaluated after the effect of the source node.







subprogram that can be reached from the entry
address. The algorithm follows.
Building the flow-graph of a subprogram in Bound-T







instruction executed in this state. Fetch this instruction
from the memory image of the target program and fill in
the attributes of node N from this instruction.















For the example in section 2 the return point




extended to implement partial evaluation during
flow-graph building in Bound-T.
First note that the flow-graph building algorithm





partial evaluation of an instruction amounts to
finding the effect of the instruction on the PC, in
other words finding the successor instructions.
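The idea that building a flow-graph amounts to repeatedly finding successor instructions can be sketched with a generic worklist loop. The Python sketch below is only an illustration under strong assumptions: the toy instruction set ("op", "br", "jmp", "ret"), the one-address-per-instruction layout and the names successors, build_flow_graph and EXAMPLE_PROGRAM are all hypothetical and do not describe the actual Bound-T data structures.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Toy memory image: address -> (mnemonic, branch target or None).
Program = Dict[int, Tuple[str, Optional[int]]]

EXAMPLE_PROGRAM: Program = {
    0: ("op", None),   # ordinary instruction, falls through
    1: ("br", 3),      # conditional branch: target or fall-through
    2: ("jmp", 0),     # unconditional jump
    3: ("ret", None),  # return: no successors
    4: ("op", None),   # not reachable from the entry address 0
}


def successors(pc: int, program: Program) -> List[int]:
    """Find the possible next addresses of the instruction at pc."""
    op, target = program[pc]
    if op == "ret":
        return []
    if op == "jmp":
        return [target]
    if op == "br":
        return [target, pc + 1]
    return [pc + 1]


def build_flow_graph(entry: int, program: Program):
    """Collect the nodes and edges reachable from the entry address."""
    nodes: Set[int] = {entry}
    edges: Set[Tuple[int, int]] = set()
    work = deque([entry])
    while work:
        pc = work.popleft()
        for nxt in successors(pc, program):
            edges.add((pc, nxt))
            if nxt not in nodes:
                nodes.add(nxt)
                work.append(nxt)
    return nodes, edges
```

Only instructions that are actually reached from the entry address are ever fetched, so unreachable octets in the memory image are never decoded.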
We can extend this partial evaluation simply by







object models the values of program variables just
before executing the node tagged with this data-state.






stack word) to the value 0203 (hex) and leave all
other variables unbound.
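To make the notion of a data-state concrete, the following Python sketch represents a data-state as a dictionary that holds only the bound variables and propagates the initial binding tosw = 0203 through the first four instructions of SwHandler (pop, pop, add, adc). The function names and the dictionary representation are assumptions made for illustration; the real data-state objects in Bound-T are not shown in this paper.

```python
from typing import Dict, Optional

DataState = Dict[str, int]  # bound variables only; missing means unbound


def pop_return_address(state: DataState) -> DataState:
    """Model 'pop r30; pop r31': move the top-of-stack word into r31:r30."""
    out = {k: v for k, v in state.items() if k not in ("tosw", "r30", "r31")}
    tosw = state.get("tosw")
    if tosw is not None:
        out["r30"] = tosw & 0xFF          # low octet of the table address
        out["r31"] = (tosw >> 8) & 0xFF   # high octet of the table address
    return out


def double_z(state: DataState) -> DataState:
    """Model 'add r30,r30; adc r31,r31' as one 16-bit doubling of Z."""
    out = {k: v for k, v in state.items() if k not in ("r30", "r31")}
    lo, hi = state.get("r30"), state.get("r31")
    if lo is not None and hi is not None:
        total = lo + lo
        out["r30"] = total & 0xFF
        out["r31"] = (hi + hi + (total >> 8)) & 0xFF  # add the carry
    return out


def z_value(state: DataState) -> Optional[int]:
    """Return the 16-bit value of Z = r31:r30 if both octets are bound."""
    if "r30" in state and "r31" in state:
        return (state["r31"] << 8) | state["r30"]
    return None
```

Starting from {"tosw": 0x0203}, the two steps bind Z first to 0203 and then to 0406, while every other variable stays unbound.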
The flow-graph building algorithm is extended to
handle data-states as follows. When partial evaluation
is not in progress the data-state is null and
the algorithm works as before. Otherwise the
algorithm uses the data-state to partially evaluate the
computational effects and edge conditions and uses



























state then try to compute the target address from the
data-state. If this succeeds (i.e. if the DTC target depends






services for propagating constant values in flow-graphs
and computational effects. New code was
needed mainly for the container of data-state objects.
Most of the extensions for data-state handling are
implemented in the processor-independent parts of
Bound-T. The processor-specific modules only have to
start and stop the partial evaluation at suitable points
in the analysis. For this paper I assume that the
processor-specific modules detect when a call or
jump instruction enters a switch handler; at that
point these modules start partial evaluation by
putting the initial data-state for the switch handler in
the target of the edge that enters the switch handler.












data-state bindings if any. For brevity only relevant
bindings are shown. The AVR instruction is shown
below the flow-state, followed by the relevant parts
of its residual computational effect. An edge with no





instruction (the call) enters a switch handler. The
AVR-specific modules of Bound-T have accordingly
defined the successor of the call to be the first
instruction in SwHandler (PC = 0100 hex) with a
data-state that binds the return address (top of stack





first four instructions from SwHandler have been
inserted. Note how the partial evaluation of the pop
instructions transformed the tosw binding into a
binding for the Z pointer and how the evaluation of
the add and adc instructions doubled the value bound
to Z. (The asterisks indicate a computational effect
that was combined with a preceding instruction to
build a 16-bit operation from two or more 8-bit





The next figure shows the flow-graph when it
contains all the loop instructions and the first
possible exit from the loop (for k = 4). On the left
the loop exits when zf = 1. The ijmp DTC is resolved
to a static jump because the data-state binds Z to
020B hex. This identifies the first case of the switch.
Partial evaluation in this branch stops because the
successor flow-state [020B] has a null data-state.
On the right, when zf = 0, the loop is about to
repeat (rjmp loop). The successor flow-state contains
the address of the loop-head (0104) which is already















instruction   effect
[flow-state]









[0100, tosw = 0203]
pop r30 Z := 0203
pop r31 *
[0104, Z = 0406]
[0102, Z = 0203]





[0100, tosw = 0203]
pop r30 Z := 0203
pop r31 *
add r30,r30 Z := 0406
adc r31,r31 *
[0104, Z = 0406]
lpm r1,Z+ r1 := 255, Z := 0407
lpm r2,Z+ r2 := 4, Z := 0408
and r1,r0 r1 := r0
cp r1,r2 zf := r1 equals 4
breq found
[020B]
[010B, Z = 0408, zf = 1]




[0109, Z = 0408, zf = 0]




[0104, Z = 040A]
dition false. This ends the unrolling and also the
partial evaluation.
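The unrolling of the table-scanning loop can be imitated in a few lines of Python. The sketch below specializes a table-driven search to a bound (static) table while the switch index stays unbound: each loop iteration contributes one residual test, and unrolling stops at the first entry whose match condition is statically true, because the loop-repeating edge then has a false condition. The ROM layout (four octets per entry: mask, match, address low, address high), the example contents and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A residual step is (test, target); test is (mask, match), or None when
# the branch to target is unconditional.
Step = Tuple[Optional[Tuple[int, int]], int]


def make_rom(base: int, entries: List[Tuple[int, int, int]]) -> Dict[int, int]:
    """Lay out (mask, match, target) entries as octets starting at base."""
    rom: Dict[int, int] = {}
    z = base
    for mask, match, target in entries:
        rom[z], rom[z + 1] = mask, match
        rom[z + 2], rom[z + 3] = target & 0xFF, target >> 8
        z += 4
    return rom


# Assumed example table at octet address 0406 (hex).
EXAMPLE_ROM = make_rom(0x0406, [(0xFF, 4, 0x020B), (0xFE, 8, 0x021C),
                                (0xFF, 11, 0x021C), (0x00, 0, 0x0224)])


def specialize(rom: Dict[int, int], table_addr: int,
               max_entries: int = 64) -> List[Step]:
    """Unroll the table scan into a list of residual tests."""
    residual: List[Step] = []
    z = table_addr
    for _ in range(max_entries):
        mask, match = rom[z], rom[z + 1]
        target = rom[z + 2] | (rom[z + 3] << 8)
        if match & ~mask & 0xFF:
            pass  # statically false: (index & mask) can never equal match
        elif mask == 0:
            residual.append((None, target))  # statically true: stop here
            return residual
        else:
            residual.append(((mask, match), target))
        z += 4
    raise ValueError("no always-matching entry within max_entries")


def run_residual(residual: List[Step], index: int) -> int:
    """Execute the residual tests for a concrete switch index."""
    for test, target in residual:
        if test is None or (index & test[0]) == test[1]:
            return target
    raise AssertionError("a well-formed residual ends unconditionally")
```

The residual list no longer reads the table at all; the table contents have been folded into constant tests and constant jump targets, which is what lets a static analysis see the flow of control.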
The last and largest figure, below, is an overview of
the final flow-graph of foo. The residual form of
SwHandler within this flow-graph is a tree of com-















The first two points enable partial evaluation of
machine code into parts of flow-graphs. The third
point applies partial evaluation to reveal the flow of
control in switch tables.




that the partial evaluation of the switch handler
resolves the DTCs. For example, no version of
Bound-T now models the “half carry” flag for BCD
arithmetic. If a switch handler uses this flag in a DTC
the partial evaluation will not resolve the DTC.
The second problem is to detect when a switch
handler is entered or exited. Bound-T now uses the
compiler-specific identifiers of the switch handlers
and works only if these identifiers are present in the
symbol-table of the program. An alternative could be
to use data-flow analysis or slicing as in [5-9] to
detect that a given DTC is “table driven”.
Other applications of partial evaluation in WCET
analysis can be imagined. For example, the printf
function in C is notoriously difficult for WCET







[Figure: overview of the final flow-graph of foo]
[0200] [0100, tosw = 0203]
[0104, Z = 0406]  1st loop iteration
[010B, Z = 0408]  exit for k = 4
[020B]            case k = 4
[021C]            case k = 8, 9 or 11
[0224]            default case
[0104, Z = 040A]  2nd loop iteration
[010B, Z = 040C]  exit for k = 8 or 9
[0104, Z = 040E]  3rd loop iteration
[010B, Z = 0410]  exit for k = 11
[0104, Z = 0412]  4th loop iteration

[Figure: flow-graph for the 4th loop iteration]
[0104, Z = 0412]
lpm r1,Z+ r1 := 0, Z := 0413
lpm r2,Z+ r2 := 0, Z := 0414
and r1,r0 r1 := 0
cp r1,r2 zf := 1
breq found
[0224]
[010B, Z = 0414, zf = 1]
[0109, Z = 0410]

the constant format string should transform the
interpretive loop over the format string into
sequential code in the residual flow-graph. It should
also transform most format-dependent conditional
branches into unconditional flow of control. In the
above example, the residual flow-graph should
contain one %d (decimal integer) formatting action,
followed by formatting of the constant string
“ and ”, followed by one %f (decimal floating-point)
formatting action, followed by a new-line
action. Thus, partial evaluation should resolve the
format-dependent aspect of the WCET for printf.
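The effect described for printf can be imitated at source level. The Python sketch below specializes a format-interpreting loop to a constant format string and returns the residual sequence of formatting actions; the interpretive loop over the format string runs only at specialization time. The function name, the action encoding, and the example format string "%d and %f\n" are assumptions chosen to agree with the actions listed above, and the sketch handles only single-letter conversions without flags, widths or precisions.

```python
from typing import List, Tuple

Action = Tuple[str, str]  # ("convert", letter) or ("literal", text)


def specialize_format(fmt: str) -> List[Action]:
    """Turn a constant format string into a fixed list of actions."""
    actions: List[Action] = []
    literal = ""
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            spec = fmt[i + 1]
            if spec == "%":
                literal += "%"  # "%%" stands for a literal percent sign
            else:
                if literal:
                    actions.append(("literal", literal))
                    literal = ""
                actions.append(("convert", spec))
            i += 2
        else:
            literal += ch
            i += 1
    if literal:
        actions.append(("literal", literal))
    return actions
```

For the assumed format string the result is one decimal-integer conversion, the literal " and ", one floating-point conversion, and a literal new-line, with no remaining loop or format-dependent branch.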
In general, partial evaluation could help the
context-specific analysis of any subprogram that has




So far Tidorum has implemented this method of








use single-bit load and store instructions that work
with the dedicated “T” bit in the status register. As
foreseen in section 8 it was necessary to extend
Bound-T/AVR with models for this bit and these
instructions, in order to get good residual branch
conditions and to terminate the partial evaluation.
Secondly, some switch handlers call their own
subroutines, for example to load the next entry from
the switch table into registers, compare it to the
switch index, and execute a DTC when they match.
During partial evaluation all calls to these handler
subroutines also have to be in-lined in the flow-graph









Testing the method on several AVR and 8051
programs showed that the residual flow-graphs were
less complex than might have been feared. The switch





basic blocks in the residual flow-graphs (note that
Bound-T allows unconditional branches within basic













IAR switch tables and switch handlers. The ratio




program slicing and constant propagation to find
dense tables of addresses or jumps, indexed by the
switch index. This is related to but not the same as
partial evaluation. Only the “hashing form” in [8]
involves run-time search in a table.
Bound-T analyses jumps through dense address
tables with a combination of instruction-pattern
matching, to find these jumps, and data-flow analysis
(based on Presburger Arithmetic) to find the bounds
of the address table. The instruction patterns in
Bound-T are currently target-specific and inflexible;
slicing methods and data-flow patterns as in [5-8]
would be an improvement.








however they identify address tables by their
assembly-language form (lists of label identifiers),
not by their usage.
The drawback of such top-down flow-graph
pruning methods [5, 7] is that they first need some
other method to find all the instructions in the
program. That is impractical in the analysis of
binaries for processors with instructions of different
sizes, because the common executable-file formats
such as ELF do not mark instruction boundaries in
the memory images. Some cross-compilers even
deliberately overlay instructions so that, for example,
the code can jump to the second octet of a 3-octet
instruction to use the last two octets as a 2-octet
instruction. The bottom-up flow-graph extension
methods in Bound-T and [6] work also for such
processors and compilers.
Another problem with some top-down methods is
that they assume that a subprogram consists of
contiguous code. This is false for several cross-












an HRTL interpreter dynamically executes the
program and records the computed target addresses
of the virtual function calls. However, the interpreter
records only the executed paths, not all possible
paths as in partial evaluation, so the virtual function
tables may not be fully explored.










cooperation between the two main approaches to
program analysis, the first being concrete state
enumeration by executing the program and the
second being state comprehension by abstracting the
program. Partial evaluation represents the first
approach. Bound-T uses the second approach to find
loop bounds. I believe that WCET analysis would
benefit from more use of state enumeration. The
problem, of course, is choosing which state components
to enumerate. Enumerating the states in switch
handlers was an easy instance of this problem.
References
[1] G. Bernat and N. Holsti. Compiler Support for
WCET Analysis: a Wish List. In Proc. of the 3rd
International Workshop on WCET Analysis
(WCET 2003), Porto, July 2003.
[2] Tidorum Ltd. Bound-T Execution Time Analyzer.
http://www.bound-t.com.
[3] Atmel Corporation. 8-bit AVR Instruction Set.
Rev. 0856D-AVR-08/02.
[4] N.D. Jones. An Introduction to Partial
Evaluation. ACM Computing Surveys, Vol. 28,
No. 3, September 1996, pp. 480-503.
[5] B. De Sutter, B. De Bus, K. De Bosschere,
P. Keyngnaert and B. Demoen. On the Static
Analysis of Indirect Control Transfers in
Binaries. In Proc. of the International Conference
on Parallel and Distributed Processing Techniques
and Applications, Las Vegas, Nevada, USA,
June 2000, pp. 1013-1019.
[6] H. Theiling. Extracting safe and precise control
flow from binaries. In Proc. of the 7th International
Conference on Real-Time Computing Systems
and Applications, Dec. 2000, pp. 23-30.
[7] D. Kästner and S. Wilhelm. Generic Control
Flow Reconstruction from Assembly Code.
Proc. LCTES'02 – SCOPES'02, June 2002,
pp. 46-55.
[8] C. Cifuentes and M. Van Emmerik. Recovery of
Jump Table Case Statements from Binary Code.
In Proc. of the 7th International Workshop on
Program Comprehension, May 1999,
pp. 192-199.
[9] J. Tröger and C. Cifuentes. Analysis of Virtual
Method Invocation for Binary Translation.
In Proc. Ninth Working Conference on Reverse
Engineering (WCRE'02), 2002, pp. 65-74.
[10] J. Gustafsson, A. Ermedahl, C. Sandberg and
B. Lisper. Automatic Derivation of Loop Bounds
and Infeasible Paths for WCET Analysis Using
Abstract Execution. In Proc. of the 27th IEEE
International Real-Time Systems Symposium
(RTSS'06), December 2006, pp. 57-66.
