Deadline Analysis of Interrupt-driven Software by Brylow, Dennis & Palsberg, Jens
Marquette University 
e-Publications@Marquette 
Mathematics, Statistics and Computer Science 
Faculty Research and Publications 
Mathematics, Statistics and Computer Science, 
Department of (- 2019) 
10-2004 
Deadline Analysis of Interrupt-driven Software 
Dennis Brylow 
Marquette University, dennis.brylow@marquette.edu 
Jens Palsberg 
University of California - Los Angeles 
Follow this and additional works at: https://epublications.marquette.edu/mscs_fac 
 Part of the Computer Sciences Commons, Mathematics Commons, and the Statistics and Probability 
Commons 
Recommended Citation 
Brylow, Dennis and Palsberg, Jens, "Deadline Analysis of Interrupt-driven Software" (2004). Mathematics, 






Computer Sciences Faculty Research and Publications/College of Arts and 
Sciences 
 
This paper is NOT THE PUBLISHED VERSION; but the author’s final, peer-reviewed manuscript. The 
published version may be accessed by following the link in the citation below. 
 
IEEE Transactions on Software Engineering, Vol. 30, No. 10 (October 2004): 634-655. DOI. This article is 
© The Institute of Electrical and Electronics Engineers and permission has been granted for this version 
to appear in e-Publications@Marquette. The Institute of Electrical and Electronics Engineers does not 
grant permission for this article to be further copied/distributed or hosted elsewhere without the 
express permission from The Institute of Electrical and Electronics Engineers.  
 
Deadline analysis of interrupt-driven software 
 
D. Brylow 
Department of Computer Science, Purdue University, Lafayette, IN 
J. Palsberg 
Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 
 
Abstract: 
Real-time, reactive, and embedded systems are increasingly used throughout society (e.g., flight control, railway 
signaling, vehicle management, medical devices, and many others). For real-time, interrupt-driven software, 
timely interrupt handling is part of correctness. It is vital for software verification in such systems to check that 
all specified deadlines for interrupt handling are met. Such verification is a daunting task because of the large 
number of different possible interrupt arrival scenarios. For example, for a Z86-based microcontroller, there can 
be up to six interrupt sources and each interrupt can arrive during any clock cycle. Verification of such systems 
has traditionally relied upon lengthy and tedious testing; even under the best of circumstances, testing is likely 
to cover only a fraction of the state space in interrupt-driven systems. This paper presents the Zilog architecture 
resource bounding infrastructure (ZARBI), a tool for deadline analysis of interrupt-driven Z86-based software. 
The main idea is to use static analysis to significantly decrease the required testing effort by automatically 
identifying and isolating the segments of code that need the most testing. Our tool combines multiresolution 
static analysis and testing oracles in such a way that only the oracles need to be verified by testing. Each oracle 
specifies the worst-case execution time from one program point to another, which is then used by the static 
analysis to improve precision. For six commercial microcontroller systems, our experiments show that a 
moderate number of testing oracles are sufficient to do precise deadline analysis. 
SECTION 1 Introduction 
Real-time systems have become pervasive in the world. Commerce, health care, transportation, and 
telecommunication all rely increasingly on real-time sensing and control. Particularly for applications in areas 
that are a matter of life and death, the correctness of real-time software is of paramount importance. 
1.1 Background 
Correctness of real-time software can be thought of as having two parts. The first issue is correctness of input-
output behavior and the second is timeliness of that behavior. Verification and validation of input-output 
behavior has been widely studied; there are now many static-checking tools available, including type 
checkers [12], bytecode verifiers [31], and model checkers [13], as well as numerous tools for supporting the 
test process. Verification of timing properties is more difficult, but steady progress has been made toward 
understanding the foundations of checking the timing properties of real-time software in recent years [6], [5]. 
However, major open issues still remain. These issues are due to the low-level nature of real-time systems, with 
most still implemented either in assembly language or at lower levels, such as FPGAs or custom-built ASICs. Even 
when real-time software is written in a higher-level language such as C, it is desirable to check the real-time 
properties of the compiled code because it can be difficult to predict the effects of the compiler. Most previous 
work on analysis of assembly code [58], [7], [47] is not concerned with timing properties. 
The goal in this project is to provide tool support for checking timing properties of real-time assembly code. This 
work focuses on interrupt-driven software, where a signal from a source outside the direct control of the 
software can cause computation to be interrupted by control being transferred to an interrupt handler. Typical 
interrupts in such systems occur because new sensor data is available, a signal pulse arrives at the controller, an 
internal timer goes off, or for many other reasons. The specification of an interrupt-driven system will usually list 
deadlines for the handling of each type of interrupt. It is part of the correctness of the system that all deadlines 
are met. Reasoning about the timing behavior of interrupt-driven software is complicated because interrupts 
can be enabled and disabled by the software itself, an interrupt handler can be interrupted, and interrupts can 
arrive in a myriad of different scenarios. It is critical to know whether an interrupt arrives at a point where it is 
enabled and can be handled right away, or whether it arrives 50 clock cycles later, when the system has just 
disabled interrupt handling and will be doing other work for the next two million clock cycles. In summary, the 
interrupts and the main computation can interact in potentially bizarre ways. Programs in such a style can be 
found, for example, in the area of sensor networks [23] where a sensor node may not have space for a 
sophisticated scheduler and analysis of interrupts is known to be difficult [47], [29]. 
Tool support for real-time interrupt-driven software strives to answer the following question. 
Deadline Analysis: Will every interrupt be handled before its deadline? 
One can approach this question in a testing-based manner: Try a suite of interrupt schedules and measure 
whether all deadlines are met. Developing a good suite of interrupt schedules is a difficult problem because of 
the fine granularity of the timing domain. Even if a clock cycle is as long as one microsecond, it is very difficult to 
engineer or discover interrupt schedules that lead to any reasonable coverage of the program. Statement 
coverage would be easy in this setting, but is not a useful coverage criteria because it does not take into account 
the interplay of different interrupts and the times when they occur. Branch coverage is more accurate but far 
more expensive; at every program point where an interrupt is enabled, there is an implicit branch to the 
handler. Covering all branches can therefore be an intractable task. In summary, the problem with a test-based 
approach is that it is difficult to test a sufficiently wide variety of schedules to gain confidence in the software. 
An alternative is a static-analysis-based approach to deadline verification. A good illustration of the difficulties 
faced by that approach is given in our earlier paper with Damgaard [9], which showed that for six commercial 
microcontrollers the maximal stack size for interrupt-driven assembly code could be estimated successfully by a 
static analysis. The experiments of that paper also illustrated that static analysis of timing properties cannot 
work without information about the behavior of external devices. For example, if the code uses a loop to busy-
wait on a new value from a port, static analysis will view it as an infinite loop, even if the programmer knows 
that an external device will deliver a new value every 100 milliseconds. Once the static analysis has detected an 
infinite loop on the path from 𝐴𝐴 to 𝐵𝐵, it will determine that if an interrupt occurs when the execution is at 
program point 𝐴𝐴 and the handler for the interrupt has exit point 𝐵𝐵, the handling may never terminate, let alone 
meet its deadline. In summary, the static analysis approach of [9] predictably failed to perform useful deadline 
analysis. 
Our thesis is that we should combine static analysis and testing. In practical terms, the fundamental challenge is: 
Challenge: Can static analysis significantly decrease the required testing effort? 
There are previous success stories of combining static analysis and testing. For example, in the area of 
regression testing, rather than rerunning the software on the whole test suite every time a change has been 
made, one can use static analysis to conservatively estimate which test inputs must be tried again [26]. In the 
interrupt-driven setting, static analysis can reduce the required testing effort, allowing the testing effort to be 
more focused, which is exactly what is desired with a combined static-analysis/testing approach to deadline 
analysis. 
Our approach uses test oracles [49] for certain worst-case execution time (WCET) questions that cannot possibly 
or easily be answered by static analysis. The oracles assert to the static analysis that if execution reaches 
program point 𝐴𝐴, then it will reach program point 𝐵𝐵 at most 𝑡𝑡 microseconds later. When 𝐴𝐴 and 𝐵𝐵 are close, then 
a much smaller testing effort is required to verify such an oracle than to do the entire deadline analysis. 
Moreover, if more than one oracle is needed for a program, the work of validating the different oracles can be 
done in parallel. Our goal is to combine static analysis with timing oracles to improve the precision of the 
deadline analysis. 
Deadline analysis cannot be performed without WCET analysis. However, most research on deadline analysis 
assumes that WCET analysis has already been successfully completed, and most published papers on WCET 
analysis do not consider the needs of deadline analysis. Many papers in this area concentrate on estimating the 
execution time from one program point to another, usually from start to finish, sometimes even focusing on a 
particular input, and they rarely handle interrupts [42], [53], [18], [20], [8], [56], [14]. Deadline analysis is more 
complicated than simple WCET analysis because the interrupts can occur at any time and their handlers can be 
enabled or disabled at any program point. In deadline analysis, the starting point for the analysis is not given. It 
is a task of the analysis to identify the worst-case program point at which an interrupt can occur and then 
estimate the WCET to the exit point of the handler for that interrupt. 
In summary, deadline analysis for interrupt-driven assembly code remains a difficult and little-studied problem. 
1.2 Experimental Results 
We have designed and implemented ZARBI (Zilog Architecture Resource Bounding Infrastructure), a tool for 
integrated deadline and WCET analysis of interrupt-driven assembly code. In slogan form: 
deadline analysis = static analysis + testing oracles. 
For six commercial microcontroller programs, each on the order of 1, 000 lines of code, we found that at most 
17 oracles were sufficient for each program. In our experience, an expert user can add the oracles in less than an 
hour, in an interactive fashion, until the deadline analysis is complete. 
Our tool uses a multiresolution analysis, which allows it to explore difficult segments of the control flow graph in 
sufficient depth to bound the latency while staving off the intractable complexity that would arise from using 
such fine-grained analysis over the whole program. 
Our static analysis proceeds by building and coloring a flow graph. Each node is given one of five 
colors: Green, Magenta, Blue, Yellow, and Red. Intuitively, Green means that the time bound can be 
found, Magenta means that starvation is possible, Blue means that starvation is possible later in the 
computation, Yellow means that the analysis thinks that the time bound might not be calculable, and Red means 
that the analysis is certain that there is no time bound. For our test suite, no red nodes were found, we were 
able to eliminate all yellow nodes by adding oracles, and we observed that very few nodes were magenta. 
Fig. 1 illustrates a flow graph at the time the deadline analysis is complete, that is, when all yellow nodes have 
been eliminated. Notice that “other Handler” can starve an interrupt that is to be handled by “Handler.” 
 
Fig. 1. Coloring a flow graph. 
Our tool is intended to be used as part of a three step process. For a given interrupt, 1) add oracles until all 
nodes are green, magenta, or blue, 2) use simulation and testing to find a WCET for the magenta clouds, and 3) 
combine the WCET's from the green, blue, and magenta clouds to compute the WCET for handling the interrupt. 
The remaining sections of the paper proceed according to the following outline: Section 2 covers related 
work. Section 3 reviews control flow graph terminology and concepts that are used throughout the technical 
sections of the paper. Section 4 explains the coloring of control flow graphs for deadline analysis purposes, 
and Section 5 shows how testing oracles have been integrated into our tool. Section 6 describes the 
multiresolution analysis we have developed for deadline analysis. Section 7 details implementation issues 
encountered while implementing our prototype, and Section 8 presents our results and walks the reader 
through a session with ZARBI. Section 9 summarizes and closes. 
SECTION 2 Related Work 
In the general case, the problem of bounding stack sizes and maximum execution times is equivalent to the 
halting problem [48]; it is a basic theorem of computer science that these questions are undecidable. Much 
work has been done on tools that operate on decidable subsets of programming languages, for example, 
Berkeley Packet Filters [35] or Agere Systems' C-NP language [1] for programming network processors, which do 
not allow backward branching. 
Most research in the area of calculating real-time software resource bounds stems from Puschner and Koza's 
work [45], which uses the following conditions to guarantee decidability: no asynchronous interrupts, no 
recursion, no indirect calls, no goto instructions, and only strictly bounded loops. 
In the 1990's, researchers worked to relax several of these restrictions, with a variety of trade-offs. However, 
despite the fact that asynchronous interrupts are the most salient feature of actual real-time systems, they 
remain the least researched topic on the above list. 
2.1 The False Path Problem 
Altenbernd identified that a key issue in accurate worst-case execution time (WCET) analysis is the False Path 
Problem [4]. In constructing a control-flow graph, the abstraction often contains paths that cannot actually take 
place in a real program execution. In order to calculate tight bounds on execution time, the algorithm must 
search for the longest executable path in the graph, rather than the longest structural path in the graph. This is 
equivalent to an NP-complete problem that exists in hardware design; finding the longest executable path in a 
network of logic gates is substantially more difficult than finding the longest structurally connected path [33]. 
Altenbernd used symbolic execution to track possible values of key conditional variables, thereby pruning 
unrealizable paths out of the control-flow graph. This is essentially the same technique used by our tool to prune 
away a substantial number of unrealizable interrupt handler paths from the control-flow graphs. 
2.2 Testing Oracles 
There are previous success stories of combining static analysis and testing. For example, in the area of 
regression testing, rather than rerunning the software on the whole test suite every time a change has been 
made, one can use static analysis to conservatively estimate which test inputs must be tried again [26]. In our 
setting, static analysis can reduce the required testing effort, allowing the testing effort to be more focused, 
which is exactly what we desire to achieve with a combined static-analysis/testing approach to deadline analysis. 
Our approach uses test oracles [49] for certain worst-case execution time (WCET) questions that cannot possibly 
or easily be answered by static analysis. The oracles assert to the static analysis that if execution reaches 
program point 𝐴𝐴, then it will reach program point 𝐵𝐵 at most 𝑡𝑡 microseconds later. When 𝐴𝐴 and 𝐵𝐵 are close, then 
a much smaller testing effort is required to verify such an oracle than to do the entire deadline analysis. 
Moreover, if more than one oracle is needed for a program, the work of validating the different oracles can be 
done in parallel. Our goal is to combine static analysis with timing oracles to improve the precision of the 
deadline analysis. 
2.3 Worst-Case Execution Time Analysis 
Deadline analysis cannot be performed without WCET analysis. However, most research on deadline analysis 
assumes that WCET analysis has already been successfully performed, and most published papers on WCET 
analysis do not consider the needs of deadline analysis. Many papers in this area concentrate on estimating the 
execution time from one program point to another, usually from start to finish, sometimes even focusing on a 
particular input, and they rarely handle interrupts [42], [53], [18], [20], [8], [56], [14]. Deadline analysis is more 
complicated than simple WCET analysis because the interrupts can occur at any time and their handlers can be 
enabled or disabled at any program point. In deadline analysis, the starting point for the analysis is not given. It 
is a task of the analysis to identify the worst-case program point at which an interrupt can occur and then 
estimate the WCET to the exit point of the handler for that interrupt. 
Work in automatic detection of induction variables [36], and bounding of unnatural loops in low-level 
languages [27] is applicable to loops present in the commercial microcontroller systems examined later in this 
paper. Healy and Whalley's approach [28] concentrates on the branch instructions themselves. By searching 
backward to find all of the assignments that influence registers used in the branch comparison, they are able to 
classify all jumps as one of unknown, fall-through, or jump. The search continues until all registers in the 
expression can be replaced by immediate values, or a control-flow merge point is encountered. This 
intraprocedural analysis allows tighter bounds to be calculated for many loops. 
2.4 Call Graphs and Model Checking 
A static analysis of assembly code may attempt to approximate the values in specific registers or on the stack. 
This problem is closely related to the problems of call-graph construction and points-to analysis for object-
oriented programs. Accurate, scalable analyses for these purposes exist in the programming languages 
community [41], [55]. 
The FLAVERS system at University of Massachusetts, (FLow Analysis for VERifying Specifications), is a flexible 
framework for flow analysis of concurrent programs [16], [37]. FLAVERS has even been extended to analyze 
infinite executions [38], which are common in embedded systems. However, the FLAVERS system has a much 
higher-level abstraction of concurrent tasks; separate tasks do not have completely shared stack and data 
registers. Such a high-level analysis thrives on a more rigidly specified interface between tasks than can exist at 
the Z86 microcontroller level. 
The stack-size checking algorithm in ZARBI can be seen as a demand-driven version of an algorithm for model 
checking of pushdown systems like Podelski [44]. The algorithm presented in our work differs from 
Podelski [44] in that it generates edges on demand, thereby ensuring that many unreachable nodes are 
automatically pruned away. This demand-driven quality, combined with tight approximation of feasible IMR 
values, prevents the exponential state-space explosion that would occur in more naïve analyses. 
Maximum execution time is formulated as a graph theoretic problem in Puschner and Schedl [46], using T-
graphs and ILP solvers. 
Like Brylow et al. [9], Wegener and Mueller [57] shows that static analysis and evolutionary testing can be used 
in concert to seek both upper and lower bounds on worst-case execution time. 
2.5 Tools 
The ASTEC group [19] represents control flow using a basic unit called a scope, which is intuitively a looping 
construct. All scopes have an iteration counts associated with them; nonlooping code is a scope with zero or one 
iteration. Scopes are assembled into a scope tree, which implicitly represents all possible control flow in the 
program. Scopes are a very general concept, to which a wide variety of execution facts can be attached, 
including flow information facts [18] to describe feasible execution paths, or facts about low-level factors like 
pipeline effects on the execution time [20]. Scope trees are processed into a system of constraints using an 
implicit path enumeration technique (IPET) analysis to determine the maximum execution count for each point 
in the program. The ASTEC infrastructure now includes support for flow analysis of C programs [24]. 
The USES group has used abstract interpretation [15], [39] and ILP solvers to extensively model the Motorola 
“ColdFire” MCF 5307 processor [21]. Their modular architecture breaks down the overall WCET problem into 
smaller parts: a value analysis approximates possible addresses of memory accesses; a cache analysis 
characterizes all memory accesses as hits or possible misses; a pipeline analysis takes into account the speedup 
caused by subsequent instructions passing through the pipeline in succession; a final path analysis calculates the 
WCET of the program. Each analysis can make use of information provided by the previous analysis in the chain. 
The USES group's tool has been applied to test programs supplied by AIRBUS [21]. 
Commercial ILP solvers like CPLEX [30] and lp_solve [40] have been employed to analyze advanced processor 
features like cache and pipeline analysis [3], [22], and branch prediction [34]. 
2.6 Summary of Related Work 
Much work has been done on timing schema for high-level languages, and on mitigating the timing effects of 
pipelines and caches in modern processors. Symbolic execution and implicit path merging are among several 
techniques intended to eliminate false paths in representative control-flow graphs in order to keep static 
analysis tractable in size. Model checking and type system advances have been used to verify many useful 
software properties. Nevertheless, previous work in the area of bounding resources for real-time software can 
be separated into two categories: 
• Work that ignores preemptive interrupts altogether. 
• Work that assumes interrupt handlers are trivially isolatable from the main process. 
All of the real-time systems examined in our work have interrupt handlers heavily integrated with the main 
program; they share the same system stack, have no memory protection, and, in many cases, affect control flow 
within the main program. Prior research does not attempt analysis of interrupt handlers as an integral part of 
the real-time system and, thus, cannot provide useful bounds on interrupt-driven systems. Furthermore, for 
most prior work, the exponential increase in state-space that occurs when taking interrupt-handler control-flow 
into account would make analysis largely intractable. 
This paper presents techniques for analysis of interrupt-driven programs that mitigate much of the exponential 
increase in state-space during analysis. 
SECTION 3 Control Flow Graphs 
This section reviews common control flow graph terminology and concepts that will be used throughout the 
remainder of the paper. Readers familiar with this material can safely skip ahead to the next section. As a 
preview, we will be using control flow graphs in which nodes represent program states, edges represent possible 
transitions between states, and edge weights can represent either timing information or changes in stack height; 
stack analysis graphs can have zero and negative-weighted edges. 
A control flow graph [2] is an abstraction of program states and the transitions between them. Details and 
examples of control flow graph construction are given in later sections. 
Control flow graphs (hereafter abbreviated as CFG's) are a special case of the general graph data structure. A 
graph 𝐺𝐺 is defined as the tuple ⟨𝑉𝑉,𝐸𝐸⟩, consisting of a finite set of vertices 𝑉𝑉 and edges 𝐸𝐸 ⊆ 𝑉𝑉 × 𝑉𝑉. A vertex is 
also sometimes called a node. A control flow graph is a tuple ⟨𝐺𝐺,𝑤𝑤, 𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡⟩, where 𝑤𝑤 is a weight function 
that maps edges 𝑡𝑡 ∈ 𝐸𝐸 to integers (𝑤𝑤:𝐸𝐸 ↦ 𝑍𝑍𝑍𝑍) and terminus is the designated vertex (𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 ∈ 𝑉𝑉) to be 
the starting or ending point of a search in the CFG. 
The CFG is a digraph [50], meaning that all edges 𝑡𝑡 ∈ 𝐸𝐸 are directed, or one-way; the first vertex in 𝑡𝑡 is 
the source, and the second vertex is the destination. Let 𝐴𝐴(𝑣𝑣) be the set of edges 𝑡𝑡 ∈ 𝐸𝐸 such that 𝑣𝑣 is the 
destination vertex for e. Let Ω(𝑣𝑣) be the set of edges 𝑡𝑡 ∈ 𝐸𝐸 such that 𝑣𝑣 is the source vertex for 𝑡𝑡. 𝐴𝐴(𝑣𝑣) is 
vertex 𝑣𝑣 's incoming edge set, and Ω(𝑣𝑣) is 𝑣𝑣 's outgoing edge set. 
The vertices in a CFG are an abstraction of the possible states in the corresponding program. As such, each 
vertex may be associated with a wide range of state information. 
In the context of our analysis, a CFG vertex will consist of the triplet ⟨𝑃𝑃𝑃𝑃,𝜎𝜎, 𝐼𝐼𝐼𝐼𝐼𝐼⟩, where 𝑃𝑃𝑃𝑃 is the program 
counter value associated with program point, 𝜎𝜎 is a call string suffix (explained below), and 𝐼𝐼𝐼𝐼𝐼𝐼 is an 
approximation of the interrupt mask register's value. (The purpose of the IMR will be explained in the next 
section.) 
Call string suffixes are calculated using stacks. In general terms, a stack is a last-in, first-out data structure. The 
stack has at least two operations defined, push and pop. An element x pushed onto a stack 𝜎𝜎 results in a new 
stack, 𝑥𝑥𝜎𝜎. The pop operation on a stack 𝑥𝑥𝜎𝜎 returns element 𝑥𝑥 and stack σ. Let the pop operation be undefined 
for an empty stack, written “{}.” A call site [2] is a program point from which another program point (subroutine 
or function) is invoked. A CFG that distinguishes program states within a function by distinct call sites is context 
sensitive [39]. Context sensitivity can be expressed quite naturally as a stack of call sites at each node in the CFG. 
A stack of call sites is also known in the literature as a call string [51], [39]. The CFG's constructed in our analysis 
contain call strings or call string suffixes at each vertex. 
3.1 Control Flow Paths 
Resource-bounding algorithms deal extensively with paths in CFG's. A control flow path, or path, 𝜋𝜋 is a sequence 
of vertices 𝑣𝑣0, … , 𝑣𝑣𝑘𝑘  such that 
∀𝑡𝑡 ∈ {0, … ,𝑘𝑘 − 1}: ⟨𝑣𝑣𝑖𝑖 ,𝑣𝑣𝑖𝑖+1⟩ ∈ 𝐸𝐸 
A simple path is a path in which each 𝑣𝑣𝑖𝑖 in 𝜋𝜋 is distinct. A cycle [50] consists of a simple path from 𝑣𝑣0 to 𝑣𝑣𝑘𝑘, with 
an additional edge from 𝑣𝑣𝑘𝑘 back to 𝑣𝑣0. A vertex 𝑣𝑣𝑘𝑘 is reachable from vertex 𝑣𝑣0 if there exists a path 
from 𝑣𝑣0 to 𝑣𝑣𝑘𝑘. A vertex 𝑣𝑣0 is upstream of 𝑣𝑣𝑘𝑘 if there exists a path from 𝑣𝑣0 to 𝑣𝑣𝑘𝑘, but not vice-versa. 
The resources to be analyzed in a CFG can be represented as edge weights. The weight function w maps each 
edge to an integer cost. 𝐺𝐺 is therefore a weighted digraph, or network [50]. Every path 𝜋𝜋 has a path 
weight or cost𝑃𝑃(𝜋𝜋) = � 𝑤𝑤(𝑣𝑣𝑖𝑖, 𝑣𝑣𝑖𝑖+1).𝑖𝑖∈{0,…,𝑘𝑘−1} Let a null path be a path in which ∀𝑡𝑡 ∈ {0, … ,𝑘𝑘 −
1}:𝑤𝑤(𝑣𝑣𝑖𝑖 , 𝑣𝑣𝑖𝑖+1) = 0. 
With call string suffixes present at every vertex, the CFG contains a notion of valid or realizable paths in the CFG. 
Realizable paths 𝜋𝜋𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 ∈ 𝐺𝐺 are those in which the sequence of program states corresponding to vertices 
along 𝜋𝜋𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 preserve the procedure call semantics of the original program. That is, for all 𝜋𝜋𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 outgoing from 
vertex 𝑣𝑣𝑐𝑐𝑟𝑟𝑟𝑟𝑟𝑟, 𝜋𝜋𝑟𝑟𝑟𝑟𝑟𝑟𝑟𝑟 returns from the procedure subgraph to call site 𝑣𝑣𝑐𝑐𝑟𝑟𝑟𝑟𝑟𝑟, rather than some other call site. 
The longest path [50] in the graph is defined as the path with the largest cost, which is not necessarily the path 
with the largest number of edges. An analysis for bounding stack size or interrupt latency in a CFG is at its core a 
longest realizable path problem. The chief difficulty in finding the longest realizable path in these graphs is in 
first constructing a suitable CFG that expresses external information not normally available to control flow 
analysis. 
3.2 Control Flow Cycles 
Many of the algorithmic details of resource bound analysis in this paper deal with the different types of cycles in 
CFGs. A negative cycle refers to a cycle 𝜋𝜋 in which 𝑃𝑃(𝜋𝜋) < 0. A cycle is said to be positive if 𝑃𝑃(𝜋𝜋) > 0. A zero-
weight cycle is one in which 𝑃𝑃(𝜋𝜋) = 0. A zero-weight cycle which is also a null path is a null cycle. 
The longest path in a graph is undefined if the graph contains negative cycles. While it is possible to construct 
actual programs that result in negative cycles in CFGs, (when edge weights represent change in stack height, for 
example,) such programs are outside the scope of our analysis. Negative cycles can be detected in 𝑂𝑂(𝑉𝑉3) using 
Floyd's Algorithm [50]. 
The longest path in a CFG is not defined for graphs with positive cycles. (At least, it is not defined in the context 
of this paper.) If a positive cycle exists in the graph, a path can become arbitrarily long by passing through the 
cycle multiple times. A graph with neither negative nor positive cycles is bounded. Given a graph 𝐺𝐺 that has no 
negative edges, positive cycles can be checked for by a bounded depth-first search, in which a graph is not 
bounded if the cost of a path exceeds a given boundary, 𝑡𝑡. For stack size analysis, it is assumed that there is a 
known bound on allowable stack size for the program; the maximum allowable size is used as 𝑡𝑡 when checking 
for positive cycles in the graph. This check can be performed in time 𝑂𝑂(𝑉𝑉 ⋅ 𝑡𝑡), which is linear in 𝑉𝑉 when 𝑡𝑡 is 
constant. 
A graph with no negative edges and no positive cycles cannot contain any cycles except those that are zero-
weight cycles. Zero-weight cycles without negative-weighted edges can only be null cycles. Null cycles cannot 
contribute to the longest path and, thus, can be collapsed into a single vertex without changing the cost of the 
longest path. Null cycles can be detected in a graph with no negative edges and no positive cycles in at 
worst 𝑂𝑂(𝑉𝑉2) time [50]. 
A digraph with no cycles is a directed, acyclic graph, or DAG. For DAG's, the longest path problem can be solved 
in linear time, 𝑂𝑂(𝑉𝑉) [50]. 
SECTION 4 Coloring Control Flow Graphs 
Deadline analysis of the control flow graph for a program entails categorizing vertices according to their 
proximity to the deadline. In broad terms, these categories can be thought of as “close,” “far,” and various 
shades of “don't know.” This section presents a realistic example fragment of interrupt-driven code and 
demonstrates how the corresponding CFG can be colored to assign each vertex into one of the categories. 
Deadline analysis begins with stack size analysis. This section presents a running example to illustrate the 
analysis (Section 4.1), constructs an initial flow graph (Section 4.1.1), and finds the bound for the program's 
stack (Section 4.1.2). Next, the graph is reweighted with execution times on the edges instead of stack deltas, 
and graph coloring begins (Section 4.2). An iterative process of adding time summary oracles (Section 5) and 
recoloring the graph with additional resolution (Section 6) continues until the analysis is complete. 
4.1 Example 
The example program shown in Fig. 2 is a short excerpt of Z86 assembly code designed to exhibit interrupt 
latency characteristics hostile to static analysis. As a result, the corresponding flow graph in Fig. 3 is larger than 
one would expect for a code segment of this size. There are two vectored interrupt handlers, IRQVC0 and 
IRQVC1, both of which do nothing but execute the return-from-interrupt instruction, IRET. The procedure PROC 
pushes a value from a register onto the stack, pops it off, and returns. The main loop, LOOP branches to itself 
infinitely. The OUTLP loop outputs the bytes 255 through 1 to an external data port and terminates, while the 
BSYLP loop waits until data from an external port arrives with 0 as the most significant bit. 
 
Fig. 2. Example program. 
 
Fig. 3. Example program flow graph with stack annotation edges. 
The two-digit hexadecimal numbers along the leftmost column of the figure. are the ROM addresses that would 
be generated for this program if it were actually compiled into machine code. These addresses will be used 
throughout the rest of this section to refer to specific lines in the example. 
4.1.1 Example Flow Graph 
Fig. 3 shows the flow graph constructed for the example program in Fig. 2. Each vertex in the graph has three 
pieces of information: 
• Code address—the value of the instruction pointer when the processor begins executing the instruction. 
The upper leftmost node in the graph (“INIT”) contains address “0C,” which is the first instruction 
executed by the Z86 processor on powerup. 
• IMR value—the bits in the Interrupt Mask Register control vectored interrupt handling by the Z86 
processor. The layout of the IMR is “M.543210,” where bit “M” controls global interrupt handling, and 
the lower order bits enable the six correspondingly numbered interrupt sources. The seventh bit is 
reserved. The node at INIT has IMR value “00,” indicating that all interrupts are turned off, while the 
node at LOOP has IMR value “83,” indicating that vectored interrupt handling is turned on and the 
handlers for interrupts 1 and 0 are enabled. 
• Call string suffix—initially, this field contains the top element on the system stack, “{}” for an empty 
stack, or “?” when the exact value on the top of the stack is irrelevant. As shown later, multiresolution 
analysis may add additional items of stack context to the call string suffix of a vertex as needed. 
Solid edges in the graph represent possible control flow between vertices. When the transition between two 
vertices involves a change in the stack, the edges have been annotated with “!” and “?.” The notation “!3” 
indicates an operation that pushes three bytes onto the stack—an interrupt. (When an interrupt handler is 
invoked, the Z86 pushes two bytes of return address and one byte of condition code bits onto the stack.) The 
notation “?2” indicates two bytes being popped off of the stack—a return from a procedure call. Dashed edges 
in the graph represent stack summary edges, as defined in the next section. 
4.1.2 Stack Size Analysis 
Prior to coloring a CFG for deadline analysis, it must be known that the program represented by the CFG has 
bounded stack size. The methods we employ for checking this property are discussed extensively in our earlier 
paper with Damgaard on static checking of interrupt-driven software [9]. 
Stack size analysis begins with a CFG in which the edge weight function equals the change in stack height seen 
by the program when moving between the source state and destination state. Such labels can be seen in the 
example graph of Fig. 3. For the Z86 architecture, the edge weights can range over [−3,3]. 
A summary edge 𝑡𝑡Σ in the graph has weight zero, a source vertex 𝑣𝑣0, and a destination vertex 𝑣𝑣𝑘𝑘 such that there 
exists a path 𝜋𝜋Σ from 𝑣𝑣0 to 𝑣𝑣𝑘𝑘 in which the edge 𝑡𝑡+ = (𝑣𝑣0, 𝑣𝑣1) has a positive weight, edge 𝑡𝑡− = (𝑣𝑣𝑘𝑘−1,𝑣𝑣𝑘𝑘) has 
an equal but opposite negative weight, and the subpath from 𝑣𝑣1 to 𝑣𝑣𝑘𝑘−1 is a null path. An example summary 
edge is shown in Fig. 4. The first and last edges in 𝜋𝜋Σ are said to be matched since they have the same absolute 
value of weight, with opposite polarity. Because 𝜋𝜋Σ consists of two matched edges and a null path, the total cost 
of 𝜋𝜋Σ is zero. 
 
Fig. 4. Summary edge closure in stack analysis graphs. 
A graph is said to be closed with respect to summary edges if and only if every nonzero-weighted edge is part of 
a zero-weighted path 𝜋𝜋Σ and, thus, associated with a summary edge 𝑡𝑡Σ. Closed graphs cannot contain a negative 
edge 𝑡𝑡− that does not have a matching 𝑡𝑡+. Likewise, a closed CFG cannot contain an 𝑡𝑡+ that does not have a 
matching 𝑡𝑡−, or a summary edge 𝑡𝑡Σ with a nonzero weight. These conditions correspond to the type-checking of 
stack elements to ensure that pushes match pops, procedure calls match returns, etc. 
Summary edges summarize well-structured zero-weight paths in such a way that all negative-weighted edges 
can be deleted from the graph without altering the length of the longest paths. In a summary edge closed graph, 
any path from terminus through a negative-weighted edge must pass through an equal and opposite positive-
weighted edge. If a longest path passes through a negative-weighted edge, then there exists another path of 
equal length passing through the associated summary edge instead. If a longest path does not pass through a 
negative-weighted edge, then again no negative-weighted edges were required. Summary edge closure is a key 
property that allows all negative-weighted edges to be removed from the graph without altering the length of 
any longest paths. 
A graph with no negative cycles can be closed with respect to summary edges in time polynomial in 𝑉𝑉 [32]. 
Termination of the closure algorithm guarantees that the program has a known, bounded stack height. This 
bounded stack height is a requirement for the deadline analysis algorithm to terminate. 
4.1.3 𝑘𝑘-CFA 
At this stage of the analysis, the control flow analysis is 1-CFA [52], meaning that each vertex contains a call 
string suffix of at most one call site. This is relatively imprecise and inexpensive as control flow analyses normally 
go, but has proven sufficient to bound stack height in a solid majority of the interrupt-driven software we have 
examined. A 0-CFA analysis (one in which no stack context is stored at the vertices) is not sufficiently precise 
even for stack height analysis in realistic programs. Later sections describe how deadline analysis requires some 
portions of the graph to store additional stack context, 𝑘𝑘-CFA for values of 𝑘𝑘 greater than 1. This more costly 
control flow analysis incurs additional cost both in analysis time and in size of the vertex state space. 
4.2 Initial Coloring of the Example Graph 
The designer of the example program in Fig. 2 would like to know if the tasks corresponding to interrupts 0 and 
1 will meet their deadlines. This requires information about the minimum interarrival time for each interrupt 
source. But, even before that kind of data can be considered, there is another key piece of information that any 
such analysis must have: The WCET of the program with respect to interrupt latency. The maximum possible 
delay between the arrival of an interrupt request and subsequent handling of that request must be known in 
order to make any accurate statement about the system's ability to meet deadlines. 
In order to perform deadline analysis for a given interrupt, we first annotate the CFG with a new edge weight 
function—each edge cost now reflects the number of machine cycles taken by the program to transition from 
one program point to another. Calculating the worst-case interrupt latency is now another instance of the 
longest path problem on the new CFG. However, the longest path cannot yet be calculated because the CFG may 
contain many cycles which cannot be eliminated without additional information about the program. 
We classify the vertices in the flow graph into five colors. Three of those colors will be explained here; two more 
will be covered in Section 6. 
• Green nodes in the graph are those from which computation will inevitably reach the handler of 
interest. For a green node, the analysis can compute the WCET from the node to the handler in linear 
time (see Section 5.1). 
• Red nodes are those from which it is impossible to reach the handler of interest. In our model of 
computation, this would be a significant program error, such as an infinite loop with interrupt handling 
disabled. The test suite of production microcontroller software contained no such errors, so red will not 
be discussed any further. 
• Yellow nodes are those which could not be definitively classified as green or red for the handler of 
interest. 
When the analysis colors the example system flow graph (Fig. 3) with respect to interrupt handler 1, the nodes 
with addresses 2C, 23, and 20 are colored green, as is the node for the lowest instance of the interrupt zero 
handler, 2B, off of the LOOP node. Nodes 1B and 1E are colored yellow because the analysis cannot statically 
determine how long it will take to complete the BSYLP loop. Finally, since the remaining nodes in the graph 
above BSYLP can reach interrupt handler 1 only through BSYLP, they too will be colored yellow in the initial 
round. 
Eliminating all yellow nodes in the graph would allow the analysis to give firm bounds on the execution time of 
any path in the program leading to the interrupt handler. The yellow nodes fall into five basic categories: 
• External Yellow nodes comprise a cycle that depends on external input. These cannot be resolved 
through static analysis, and will require some form of additional information about the external 
environment of the controller. (For example, the node with PC value 1B in Fig. 3 is part of an external 
yellow cycle.) 
• Ultra Yellow nodes comprise a cycle in the graph corresponding to some kind of unbounded loop. 
• Starvation Yellow nodes are yellow because the interrupt handler of interest can be starved (delayed 
indefinitely [11]) by another interrupt source calling its own handler frequently enough to prevent the 
processor from making progress toward the handler of interest. (Nodes 15, 17, and 19 in the example 
can be starved by the handler starting at 2B.) 
• Artificial Yellow nodes comprise unrealizable cycles that appear in the graph as a result of implicit path 
merging. (The cycle of 0F, 26, 28, and 2A in the example is an artificial yellow cycle.) 
• Upstream Yellow nodes are yellow only because they are upstream of other yellow nodes. (Nodes 0C 
and 12 in the example are upstream yellow.) 
Intuitively, yellow represents a “don't know” category of nodes which lie along positive cycles in the CFG. 
External and ultra yellow nodes can be dealt with through the use of oracles, as explained in the next section. 
Artificial yellow nodes are eliminated using adaptive slicing, as outlined in the section on multiresolution 
analysis. Starvation yellow nodes will be assigned a new color, to be dealt with by simulation and testing. Finally, 
upstream yellow nodes will disappear when the other four classes of yellow nodes are eliminated. 
SECTION 5 Testing Oracles 
Real-time, interrupt-driven software can contain loops that cannot be bounded through static analysis. 
Synchronous communication with off-chip resources, decisions predicated on external data, or interaction with 
the user can be expressed as loops whose bounds depend on additional information outside the realm of the 
system source code. 
The BSYLP area of the example system is such a loop. It is a simplified version of a busy-wait loop found in 
several of the production microcontroller systems we have examined. Typically, such a loop could be waiting for 
a peripheral device to signal that it has received the last command, and can be issued further commands. The 
designers of the system would know that the manufacturer of the device guarantees the maximum response 
time for this operation will be, for example, 40ms, a fact that cannot be ascertained from the source code. In 
order to take advantage of this external information the analysis uses an oracle, an entity that answers 
questions about latency that cannot be answered by static analysis. 
An oracle gives an assertion of the form: 
𝐴𝐴𝐴𝐴𝐴𝐴𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡1 → 𝐴𝐴𝐴𝐴𝐴𝐴𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡2 = 𝐿𝐿𝐿𝐿𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿𝐿𝐿 
which says that the program will take at most Latency machine cycles to get from 𝐴𝐴𝐴𝐴𝐴𝐴𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡1 to 𝐴𝐴𝐴𝐴𝐴𝐴𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡2. 
When constructing the initial control flow graph, information provided by the oracle is used to insert time 
summary edges from a node 𝑁𝑁 in the graph with address 𝐴𝐴𝐴𝐴𝐴𝐴𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡1 to a node 𝐼𝐼 in the graph with 
address 𝐴𝐴𝐴𝐴𝐴𝐴𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡2 such that 𝐼𝐼 and 𝑁𝑁 have the same IMR value and stack context. It was initially anticipated 
that the analysis would need more complex syntax for specifying oracle edges, such as pattern matching on IMR 
values or stack arithmetic. However, in the six production microcontroller systems examined, the address-
matching-only edges have proven sufficient to bound all of the external yellow loops. 
The semantics of these time summary edges is such that the color of the destination node can be safely 
extended backward to the source node of the summary edge. This does not in itself imply anything about 
maximum latency between nodes that lie along a path from the source to the destination. The time summary 
applies strictly to the maximum latency between two nodes touched by the time summary edge. 
5.1 Time Summary Edges 
At this stage of the analysis, the CFG has an edge weight function w that associates each edge in the graph with 
a positive integer execution time count. The final WCET to the interrupt handler will be calculated as 
a multisource longest path problem [50], which can be computed in linear time once the graph has been 
transformed into an acyclic digraph. 
Given a weight function ranging over positive integers, there can be no negative cycles in the graph. However, 
positive cycles are common in deadline analysis CFG's because they correspond to the iterative control flow 
produced by looping constructs. Positive cycles must be removed from the graphs before deadline analysis can 
take place because the algorithm does not consider the longest path to be defined in CFG's with positive cycles. 
Given a positive cycle 𝜋𝜋𝑐𝑐𝑐𝑐𝑐𝑐𝑟𝑟𝑟𝑟 and a maximum cost bound 𝑃𝑃𝑚𝑚𝑟𝑟𝑚𝑚 that has been asserted by the oracle to be the 
maximum cost of any path along 𝜋𝜋𝑐𝑐𝑐𝑐𝑐𝑐𝑟𝑟𝑟𝑟, the cycle can be replaced with a time summary edge of weight 𝑃𝑃𝑚𝑚𝑟𝑟𝑚𝑚 as 
shown in Fig. 5. 
 
Fig. 5. Time summary edge. 
The validity of assertions made by the user to the oracle are taken for granted by the current system. In order 
for the deadline analysis algorithm to remain conservative, time summary edges must be conservative. That is, a 
time summary edge can overestimate the true execution time of the loop it summarizes, but it cannot 
underestimate. If underestimated time summary edges exist in a graph, the deadline analysis algorithm is not 
guaranteed to arrive at correct bounds. 
Time summary edges cannot be used to summarize cycles in all cases; there exist ill-formed loops for which no 
single time summary edge is sufficient. Section 8.2 presents results of a practical study of time summary edges 
required for real programs. 
In practice, one would want to concentrate system testing or simulation on areas with time summary edges to 
gain confidence in the validity of the assertions. However, the key point to be made is that the static analysis has 
greatly reduced the sheer volume of program states that must be tested. In each of the production 
microcontrollers analyzed, there were fewer than 20 overall assertions to the oracle, each of which covered only 
a handful of nodes in the graph, out of tens or hundreds of thousands of nodes in the graph overall. 
Static analysis can reduce the size of the latency testing problem from an utterly intractable scale down to a 
subset of the program small enough that one could conceivably use exhaustive simulation to ascertain the 
remaining WCET information, or apply other finer-grained and less-scalable analyses. 
5.2 Time Summary Edges in the Example 
For the example program, a time summary oracle specifies that the BSYLP loop takes at most 320, 000 machine 
cycles (40ms on the example architecture). The input to the oracle is: 
[0x001B] -> [0x0020] = 320000 
The resulting change to the graph is shown in Fig. 6. The time summary edge from 1B to 20 (which is already a 
green node) allows 1B to be recolored green. This in turn causes 1E to be recolored green as well, so this oracle 
edge has eliminated BSYLP as an obstacle to determining maximum interrupt latency for the entire program. 
 
Fig. 6. Time summary oracle in the example. (a) Before and (b) after. 
We use oracles in three ways: 
• External event delays—bounds for loops that rely on data external to the system, such as bytes arriving 
on the input ports of the processor. 
• Internal loop bounds—many of the for-loop style constructs could be bounded using well-known static 
analysis techniques [36], [18]. However, implementing the proper structural loop analysis for assembly 
language source, without any annotations from the programmer, could be far more expensive than 
ascertaining the loop bounds manually. Many of the loops found in the benchmarks are trivially 
bounded by casual examination of the code, and the time summary oracle construct is sufficiently 
general to bound the maximum loop execution time. This would not be a preferred use of the tool in 
practice. An industrial strength version of ZARBI would infer these bounds statically, or interactively 
assist the programmer in annotating the code with proper bounds. The current tool leaves this for 
future work. 
• Internal data dependent loop bounds—a small number of loops in the test suite relied not on 
immediate constants near the top of the loop, but rather on data elsewhere in the program. The most 
common example of this was a display routine that iterated over a zero-terminated ASCII string. 
Techniques exist to automatically infer these kinds of bounds, but, for simplicity of implementation, 
these were not employed. Instead, bounds on these loops were manually ascertained, and equivalent 
time summary edges were inserted. 
Fully two thirds of the input provided to the time summary oracle for these experiments were loop bounds that 
could either be statically checked as annotations or statically inferred by other means. The remaining third of 
the input was for external event delays of the kind that could not possibly be determined statically. A very small 
number of the input items were for loops dependent on internal data, which could probably be determined with 
a very thorough flow analysis of all registers in the program. 
The interface provided to assist the user in giving these assertions to the oracle is quite straightforward. After 
initial coloring of the graph, the tool produces a list of border yellow nodes—yellow nodes that are one edge 
away from green nodes. Typically, these will be branch or jump instructions that comprise the bottom of a loop. 
In the case of the example program, the prototype tool would produce the result, 
Border Yellow instructions: 
L001E: JR NZ, L001B 
directing the user to the BSYLP loop. 
SECTION 6 Multiresolution Analysis 
Initial construction of the control flow graph includes estimates of the possible IMR values and top stack 
elements for each node. Abstracting away the rest of the machine state implicitly merges control flow paths, 
thereby allowing the size of the graph to remain tractable—typically much less than a million nodes, rather than 
the 227 nodes which is the worst case for this model. (7 bits of IMR, 10 bits of stack element, and 10 bits of PC = 
27 bits per node.) However, the imprecision of having nodes distinguished by only one element of stack context 
(analogous to 1-CFA in flow analysis parlance [52]), can result in artificial cycles appearing in the control flow 
graph. 
Such is the case in the example program, where procedure PROC is called twice within a segment where 
interrupt handling is disabled. Ignoring for a moment the question of how to bound latency from node 12, the 
INIT segment of the graph would still be colored yellow because of the path [0F, 00,{}], [26, 00,{12}], [28, 00,{?}], 
[2A, 00,{0F}], and back to [0F, 00,{}]. This is a false path [4], which does not correspond to realizable control 
flow—the second call to PROC will return to the originating call site, not the previous call site. 
The approach to multiresolution analysis shown here improves the control flow graph by eliminating many 
unrealizable paths. 
False paths are a well-known problem in control flow analysis; one solution is to employ 𝑘𝑘-CFA with larger 
values of 𝑘𝑘. However, it could be expensive to recompute the entire control flow graph with a higher value of 𝑘𝑘, 
as this quickly causes a combinatorial explosion in graph size for interrupt-driven software. Instead, the CFG is 
constructed using multiresolution analysis, where the value of 𝑘𝑘 (the size of the call string suffixes) is increased 
only in the areas of the graph where it is necessary to alleviate ambiguity in latency analysis. Thus, nodes like 
[28, 00,{?}] in the example are adaptively sliced into nonyellow nodes with greater stack context, [28, 00,{?, 0F}] 
and [28, 00,{?, 12}], as shown in Fig. 7. This approach is inspired by Plevyak and Chien [43]. Independently of our 
work, Guyer and Lin [25] have also used multiresolution analysis. (a) Before and (b) after. 
 
Fig. 7. Example program adaptive slicing. 
Multiresolution analysis takes place automatically; the algorithm iteratively identifies nodes that are both border 
yellow and stack popping instructions (POP, RET, and IRET), and adaptively slices these nodes and their 
associated graph segments to the necessary depth. This technique represents a substantial savings in graph 
complexity, reducing the size of the graph by 20 to 60 percent compared to running the analysis of the 
production programs with a fixed, nonadaptive 𝑘𝑘-CFA. However, the reduction in graph size can come at the 
cost of increased analysis time, as mentioned below. 
While the multiresolution analysis reduces the number of nodes and edges in the graphs in all cases, when 
compared with the running time of straight 𝑘𝑘-CFA, it runs faster in some cases, but slower in others. In two 
cases, the multiresolution analysis is an order of magnitude slower than straight 𝑘𝑘-CFA. This wide variation in 
relative runtimes is highly dependent on the structure of the program under analysis—the depth that the 
adaptive slicing must go to in order to disambiguate latency, the number of call sites involved, and the lengths of 
the subroutines being sliced are all factors in the cost of multiresolution analysis. For this reason, the prototype 
tool includes a command-line option which tells it to use straight 𝑘𝑘-CFA with a specific 𝑘𝑘, rather than automatic 
multiresolution analysis, so that the user can choose whichever method performs better for their given program 
input. 
The multiresolution analysis is guaranteed to terminate because the control flow graphs have a bounded stack 
size, as verified by a previous phase of the tool. The full details of the adaptive slicing can be found in [10]. 
Time summary oracles allow the deadline analysis to resolve both external and internal yellow loops. 
Multiresolution analysis slices apart artificial yellow nodes. Of the five types of yellow nodes, all that remain are 
starvation yellow and upstream yellow. 
Because these nodes are yellow for a fundamentally different reason than the other nodes dealt with thus far, a 
new color is designated for them. 
• Magenta nodes are those which are one edge away from either green or magenta nodes in the graph, 
AND are one edge away from a nongreen interrupt handler. 
Magenta nodes are set aside as a special case for which maximum latency of the green interrupt handler cannot 
be bounded without additional, detailed meta-knowledge about the characteristics of the other nongreen 
interrupt handlers involved, (such as interarrival times of interrupts, jitter, etc.). These nodes are also different 
in that the straightforward oracle-inserted time summary edges cannot help render these nodes green even if 
the oracle provides bounds on the WCET of the segment of magenta nodes. This is because each magenta node 
can be starved since the nongreen interrupt handler can in the worst case execute so frequently that the 
computation does not make progress from the magenta node. (This is a point on which the Z86E30 
documentation is vague; it is not clear whether an interrupt can occur frequently enough to completely halt 
progress in the noninterrupt code. In the absence of a clear answer, the worst case is assumed. But, whatever 
the processor-specific behavior, either no progress or very little progress will be made in the main program 
under the worst-case interrupt arrival conditions, and starvation will be severe in either case.) 
The WCET of contiguous clusters, or clouds, of magenta nodes cannot be reasoned about at the individual node 
level, unlike all of the other analyses presented here. For this reason, the problem of bounding magenta 
clouds is left as future work beyond the scope of this paper. Fortunately, the current analysis has revealed that 
on average, fewer than 2 percent of the nodes in the production microcontroller suite are magenta; in several 
cases, there are no magenta nodes at all. 
Those yellow nodes which are upstream of the newly designated magenta nodes are also assigned a new color. 
• Blue nodes are those for which the deadline analysis algorithm can precisely bound the WCET to reach a 
cloud of magenta nodes. 
Intuitively, blue nodes are well-behaved segments of the graph which would be green if there were not a 
magenta cloud of potential interrupt starvation between them and the green handler, as suggested by Fig. 1. 
SECTION 7 Implementation Issues 
We now present details of our implementation of graph coloring and adaptive slicing. The implementation does 
those tasks in an integrated fashion. 
7.1 Coloring Algorithm 
The algorithm for coloring a graph is presented in Computation Tree Logic notation [17] in Fig. 8. 𝐻𝐻 is a predicate 
that is true for a node which represents the first instruction of the interrupt handler of interest. 
Predicate ℎ𝐿𝐿𝑡𝑡𝐴𝐴𝑎𝑎𝑡𝑡𝑡𝑡 is true for a node which represents the first instruction of any interrupt handler (so, 𝐻𝐻 ⇒
ℎ𝐿𝐿𝑡𝑡𝐴𝐴𝑎𝑎𝑡𝑡𝑡𝑡). We use CTL notation to talk about edges and paths to a node that satisfy a condition 𝑝𝑝: 
 
Fig. 8. Coloring of a graph for latency analysis. 
• 𝐸𝐸𝐸𝐸(𝑝𝑝) means there is an edge from the current node to a node that satisfies 𝑝𝑝. 
• 𝐴𝐴𝐸𝐸(𝑝𝑝) means every edge from the current node is to a node that satisfies 𝑝𝑝. 
• 𝐸𝐸𝐸𝐸(𝑝𝑝) means there is a path from the current node to a node that satisfies 𝑝𝑝. 
• 𝐴𝐴𝐸𝐸(𝑝𝑝) means every path from the current node reaches a node that satisfies 𝑝𝑝. 
A node is 𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 if it either satisfies H or there is an edge to a node that satisfies 𝐻𝐻 (as expressed 
with 𝐸𝐸𝐸𝐸(𝐻𝐻)). In other words, handling of the interrupt has just started or will start in the next computation step. 
The definition 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 ≡ 𝐴𝐴𝐸𝐸(𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡) means that 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 nodes are those for which all outgoing paths 
inevitably reach 𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 nodes. Thus, for 𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 nodes we can compute the WCET from the node to 
the handler. Notice that all 𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 nodes are also 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡. A 𝐼𝐼𝑡𝑡𝐴𝐴 node cannot reach the handler of interest 
(as expressed with ¬𝐸𝐸𝐸𝐸(𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡)). A 𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿 node can reach the handler of interest (as expressed 
with 𝐸𝐸𝐸𝐸(𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡)), but it also has an edge to some other handler (as expressed with 𝐸𝐸𝐸𝐸(ℎ𝐿𝐿𝑡𝑡𝐴𝐴𝑎𝑎𝑡𝑡𝑡𝑡)) which 
is neither 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 nor 𝐼𝐼𝑡𝑡𝐴𝐴 (as expressed with 𝐴𝐴𝐸𝐸(ℎ𝐿𝐿𝑡𝑡𝐴𝐴𝑎𝑎𝑡𝑡𝑡𝑡 ⇒ ¬(𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 ∨ 𝐼𝐼𝑡𝑡𝐴𝐴))). This other handler may be 
able to prevent the current node from reaching any 𝑈𝑈𝑎𝑎𝑡𝑡𝑡𝑡𝐿𝐿𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 nodes if the interfering interrupt or interrupts 
occur repeatedly. A 𝐵𝐵𝑎𝑎𝑡𝑡𝑡𝑡 node can reach a 𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿 node (as expressed with 𝐸𝐸𝐸𝐸(𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿)) but it cannot 
reach any handler in one step (as expressed with ¬(𝐸𝐸𝐸𝐸(ℎ𝐿𝐿𝑡𝑡𝐴𝐴𝑎𝑎𝑡𝑡𝑡𝑡)), and every path will lead to either 
and 𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿 node or a 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 node (as expressed with 𝐴𝐴𝐸𝐸(𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿 ∨ 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡)). In other words, a 𝐵𝐵𝑎𝑎𝑡𝑡𝑡𝑡 node 
is upstream from 𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿 nodes and we can compute the WCET from the node to the 𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿 nodes. It is 
straightforward to show that every node has exactly one of the colors 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡, 𝐼𝐼𝑡𝑡𝐴𝐴, 𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿, 𝐵𝐵𝑎𝑎𝑡𝑡𝑡𝑡, 
and 𝑌𝑌𝑡𝑡𝑎𝑎𝑎𝑎𝑌𝑌𝑤𝑤. The ZARBI implementation of the coloring algorithm is detailed in [10]. 
Let us consider the insertion of a time summary edge 𝐴𝐴 → 𝐵𝐵 = 𝑏𝑏𝑌𝑌𝑡𝑡𝑡𝑡𝐴𝐴, where 𝐴𝐴,𝐵𝐵 are nodes in the graph. We 
require that 1) 𝐴𝐴 can reach 𝐵𝐵, 2) 𝐴𝐴 is 𝑌𝑌𝑡𝑡𝑎𝑎𝑎𝑎𝑌𝑌𝑤𝑤 or 𝐵𝐵𝑎𝑎𝑡𝑡𝑡𝑡, and 3) 𝐵𝐵 is 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡. In other words, the time summary edge 
is a shortcut from 𝐴𝐴 to 𝐵𝐵 which creates a one-step link from a 𝑌𝑌𝑡𝑡𝑎𝑎𝑎𝑎𝑌𝑌𝑤𝑤 or 𝐵𝐵𝑎𝑎𝑡𝑡𝑡𝑡 node to a 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡 node. The 
insertion of 𝐴𝐴 → 𝐵𝐵 = 𝑏𝑏𝑌𝑌𝑡𝑡𝑡𝑡𝐴𝐴 is done by inserting an edge from 𝐴𝐴 to 𝐵𝐵 and removing all other edges from 𝐴𝐴. It is 
straightforward to show that, if we recalculate colors after inserting such time summary edges, the only possible 
color changes are that 𝑌𝑌𝑡𝑡𝑎𝑎𝑎𝑎𝑌𝑌𝑤𝑤, 𝐼𝐼𝐿𝐿𝑀𝑀𝑡𝑡𝑡𝑡𝑡𝑡𝐿𝐿, and 𝐵𝐵𝑎𝑎𝑡𝑡𝑡𝑡 nodes may have become 𝐺𝐺𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡. The insertion procedure 
can easily be extended to allow the simultaneous insertion of multiple time summary edges. 
Returning to the control flow graph from Fig. 3, the three nodes at 15, 17, and 19 are colored magenta. The 
interrupt handler nodes, 2B, hanging off of the magenta section are considered blue. The entire segment above 
OUTLP, with the help of the slicing explained in the previous section, is colored blue. 
All edges in the CFG are annotated with execution cycles; all timing information is taken from the Z86 reference 
manual [59]. The entire flow graph of the example program is now green, blue, or magenta. The magenta cycles 
cannot be statically bounded, but the green and blue nodes can be broken into directed, acyclic subgraphs, each 
of which can be evaluated for WCET by a recursive traversal in which 
𝑊𝑊𝑃𝑃𝐸𝐸𝑊𝑊(𝐵𝐵) = 𝑡𝑡𝐿𝐿𝑥𝑥(𝑊𝑊𝑃𝑃𝐸𝐸𝑊𝑊(𝐴𝐴) + 𝑡𝑡𝐴𝐴𝑀𝑀𝑡𝑡𝐴𝐴𝐴𝐴) 
where 𝐴𝐴 ranges over all nodes that connect directly to node 𝐵𝐵, and 𝑡𝑡𝐴𝐴𝑀𝑀𝑡𝑡𝐴𝐴𝐴𝐴 is the cost of the edge from 𝐴𝐴 to 𝐵𝐵. 
Running this traversal over the green nodes in the example program produces a WCET time of 320, 010 machine 
cycles between the magenta node at 19 and the interrupt handler at 2C. The same calculation over the blue 
subgraph reveals a maximum WCET of 102 machine cycles from the start of the program to the start of the 
magenta nodes. 
Combining this information with additional knowledge about the magenta section, such as, it will take at most 
200 cycles to get from 12 to 1B through the magenta section, bounds the maximum interrupt latency to be 320, 
312 cycles. 
7.2 Adaptive Slicing 
Our deadline analysis includes adaptive slicing, an automated technique for increasing the resolution of the 
analysis in areas of the graph where abstraction causes ambiguity. An example is presented in Section 6; the 
details of the implementation are presented here, with pseudocode shown in Fig. 9. 
 
Fig. 9. Adaptive slicing algorithm. 
In the overall scheme of deadline analysis, multiresolution analysis takes place after the initial coloring of the 
graph with respect to a given handler. The first pass scans backward from the ultragreen handler to collect a list 
of all yellow nodes which are one edge away from green or ultragreen nodes. These nodes are the border 
yellow nodes and are the primary candidates for both adaptive slicing and time summary oracle assertions. 
Not all border yellow nodes can be recolored green through adaptive slicing or time summary oracles—some 
could be yellow because of loops and path mergings elsewhere in the graph. These are upstream yellow nodes 
because their yellow classification depends entirely upon structures elsewhere in the graph. However, 
regardless of what percentage of the border yellow nodes is upstream yellow, it is still the case that some 
number of border yellow nodes can be recolored green with the help of slicing or oracles. 
The multiresolution analysis next iterates through the list of border yellow nodes and discards any nodes which 
are not pop nodes. Pop nodes correspond to one of three opcodes in the Z86 assembly language—POP, RET, and 
IRET. Only the border yellow pop nodes are of interest for adaptive slicing because they are the merge points in 
a backward traversal where stack context is lost. In other words, if a green node in the program has an incoming 
pop edge from a yellow POP instruction, it is the implicit merging of the node for the POP instruction with 
another node in a different stack context which causes an artificial yellow cycle to appear in the graph. 
While filtering nonpop nodes out of the list, the analysis also checks each border yellow pop node candidate to 
see that it has at least one outgoing yellow edge that does not lead to a nongreen interrupt handler. Pop nodes 
that are yellow only because of outgoing edges that lead in one step to a nongreen interrupt handler cannot be 
recolored green with additional stack context; they will be colored magenta in a later graph coloring pass. 
Finally, for each remaining candidate border yellow pop node, a 𝑡𝑡𝐿𝐿𝑥𝑥𝑂𝑂𝑡𝑡𝑡𝑡𝑀𝑀𝑌𝑌𝑡𝑡𝑡𝑡𝑀𝑀𝑚𝑚 tally is made, giving the 
maximum stack context value of any node reachable in one outgoing edge from the candidate. If a candidate 
node's 𝑡𝑡𝐿𝐿𝑥𝑥𝑂𝑂𝑡𝑡𝑡𝑡𝑀𝑀𝑌𝑌𝑡𝑡𝑡𝑡𝑀𝑀𝑚𝑚 is larger than the candidate's maximum stack context minus 1, the candidate is placed 
on the final adaptive slicing list. This condition prevents adaptive slicing from taking place on candidates where 
the stack context is already at least one more than all of the outgoing edge destinations. These nodes cannot be 
successfully recolored through slicing, as they already have full precision with regard to the stack information 
available at all of their successor nodes. An important caveat is that these nodes may not be their final color just 
because they were filtered out in the current pass; they may still need additional stack context to be colored 
green, but not before some outgoing destination node is itself sliced into nodes with greater context. 
In practical terms, the 𝑡𝑡𝐿𝐿𝑥𝑥𝑂𝑂𝑡𝑡𝑡𝑡𝑀𝑀𝑌𝑌𝑡𝑡𝑡𝑡𝑀𝑀𝑚𝑚 test also provides an important component to the stopping criteria for 
the multiresolution analysis. Without this test on candidate nodes, the analysis could loop indefinitely trying to 
add greater stack context to a graph segment that is yellow for some other reason. 
The multiresolution analysis iterates through the final list of nodes selected for adaptive slicing. For each node in 
the list, two new lists are calculated: The deletion list is the transitive closure of all nodes that can be reached by 
backward traversal of nonpush nodes; the rebuild list is the set of push nodes bordering the deletion list. Push 
nodes can correspond to PUSH or CALL instructions. In the case where the deletion list includes the first 
instruction of an interrupt handler, the push nodes can be any instruction from which that interrupt handler can 
be reached in one edge. 
The nodes on the deletion list are deleted. The nodes in the rebuild list are used to seed the initial worklist when 
the graph builder is called to reconstruct the deleted graph segment. Before reconstruction, however, the stack 
context ceiling is set to one item higher than whatever the highest stack context number was among all the 
nodes in the delete list. After reconstruction, the entire graph is recolored. 
The overall stopping criteria for the multiresolution analysis is expensive to calculate. Adaptive slicing on any 
given run can push back the green frontier to expose new border yellow pop nodes that were not candidates in 
the previous scan. Thus, the entire process must be repeated —the entire loop in Fig. 9—until the list of final 
slicing candidates is of zero size. 
The existing adaptive slicing algorithm is not optimal in that a lot of work is needlessly duplicated during the 
analysis. In practice, large segments of graph can be built, deleted, rebuilt with greater stack context, deleted 
again, and rebuilt with even more stack context. A cleverer algorithm would instead update existing nodes with 
greater context, rather than completely recalculating control flow each time. However, this would add 
substantial complexity to the implementation, as the adaptive slicer would need a different graph building 
engine, distinct from the primary graph builder. 
The complexity of the multiresolution analysis is surprisingly large, due both to the complexity of the stopping 
criteria, and the complexity of completely recoloring the graph after each slicing. Knowing when to stop looking 
for candidates for slicing requires global knowledge of the graph and, thus, cannot be inexpensively 
implemented in a system that focuses on per-node operations. The current prototype is designed around the 
assumption that most analysis will focus on a single node and its immediate neighbors and, thus, collecting 
global information can be more expensive. 
SECTION 8 Experimental Results 
The following sections present experiments applying the prototype implementation of our analysis to a test 
suite of commercially available microcontroller systems. Following these results, Section 8.4 presents a narrative 
of a representative session with the tool, starting from a fresh program, and iterating the deadline analysis until 
all nodes are either green, blue, or magenta. 
8.1 Benchmark Characteristics 
The benchmarks (see Fig. 10) used to evaluate our deadline analysis techniques are a collection of real-time, 
interrupt-driven systems programmed in Z86 assembly language and lent to us for analysis by Greenhill 
Manufacturing, Ltd. (www.greenhillmfg.com). 
 
Fig. 10. Benchmark characteristics. 
These systems operate on a descendant of the Z80 processor, the Z86E30 microcontroller, with 256 bytes of 
RAM, 4K of program ROM, and a 12MHz clock. The resources available to such a chip are moderate at best, but 
this is true of millions of units of similar 4-, 8-, and 16-bit embedded systems sold this year. The software for 
these systems was written by hand, in Z86 assembly language, and varies in size from about 800 to 1, 600 lines 
of code. The prototypes for each of these systems underwent months of testing prior to actual production, but 
the overall properties of these systems were still poorly understood, largely due to the lack of proper analysis 
tools like the one we present here. 
The test suite also includes the example program from Fig. 2, called “FSE03,” as well as examples from [9], called 
“ICSE01,” and “Micro00.” The commercial program “Fan” (included in earlier papers,) has been omitted because 
the stack size analysis cannot currently bound its maximum stack height (due to both positive and negative 
cycles in the corresponding CFG); bounded stack height is a precondition to running the deadline analysis 
algorithm. 
8.2 Measurements 
The results shown in Fig. 11 give the final percentages of nodes by color after completion of the deadline 
analysis. For clarity of presentation, interrupt sources in the tables are numbered as “IRQ1,” “IRQ2,” and “IRQ3.” 
This does not imply any kind of priority relationship between the various interrupt sources, nor are these the 
actual interrupt source numbers from the Z86 processor; they are merely organized into columns. (For example, 
Cturk has interrupt handlers for Z86 IRQ3, IRQ4, and IRQ5, and these are labeled 1st, 2nd, and 3rd IRQ, 
respectively, in the table.) Note that the tool rounds percentages down in most cases, or up in the case of 
percentages less than 1 percent, so the tables in Fig. 11 may not total precisely to 100 percent. 
 
Fig. 11. Results with completed oracles. 
Yellow nodes were entirely eliminated, and the percentages of green and blue were high. The magenta present 
in the final graphs was uniformly low, less than 2 percent on average. Several of the benchmarks had 0 percent 
magenta for a given IRQ, which means the analysis can safely and completely bound interrupt latency for those 
particular handlers from anywhere in the program. 
The ZARBI deadline analysis tool is implemented in Java and took less than the 128 Megabytes of available RAM 
to complete the analysis in all cases. The running time of the tool increases as the number of oracle assertions 
allows the tool to slice deeper into the graphs. Runtime varied from less than 2 seconds up to an hour for the 
largest benchmark (with full multiresolution analysis), with an average runtime of 15 minutes overall. The 
current implementation has been optimized toward rapid prototyping and easy debugging of the tool, with little 
regard for running time and space requirements. It is expected that an industrial-strength version of the tool 
could be constructed to run more efficiently. 
Fig. 12 shows the sizes of the graphs generated by the analysis, both with adaptive slicing, and with a 
fixed 𝑘𝑘-CFA, where the value for 𝑘𝑘 is fixed to the depth needed by the adaptive slicing. 
 
Fig. 12. Adaptive slicing versus fixed 𝑘𝑘-CFA. 
As mentioned earlier, employing multiresolution analysis results in a substantial savings in graph complexity, 
with multiresolution graphs 20 percent to 60 percent smaller than the equivalent fixed 𝑘𝑘-CFA graphs. While the 
fixed 𝑘𝑘-CFA graphs can be constructed substantially faster in some cases, the reduction in yellow nodes offered 
by the multiresolution analysis is usually far more valuable. When using the tool to iteratively discover time 
summary assertions for reducing yellow nodes, (as demonstrated in Section 8.4), anything that causes larger 
graphs potentially creates more yellow nodes, adding more data to the output of the tool, and making the entire 
process increasingly difficult. 
Fig. 13 characterizes the number and types of assertions that were provided to the time summary oracle in 
order to eliminate all yellow nodes in the test suite. 
 
Fig. 13. Oracle information provided. 
In all cases, there was only one contiguous magenta cloud for each program that had any magenta nodes. 
8.3 Assessment 
The complete elimination of yellow nodes from the control flow graphs of the commercial microcontrollers was 
the primary goal in the deadline analysis experiments, and this was accomplished by the algorithms presented. 
The high percentage of green and blue nodes makes it possible to completely bound interrupt latency for some 
of the interrupt sources in some of the benchmarks, and greatly decreases the remaining work to be done in 
bounding the others. 
The low percentage of magenta nodes in the graphs, combined with the fact that magenta nodes are 
constrained to a single, contiguous cloud in all of the benchmarks, paves the way for automatically bounding 
these most troublesome parts of the graph in the future. The only case where magenta levels reached a double 
digit percentage was the FSE03 example program, which was constructed to have a prominent magenta 
segment. In many cases, the magenta section is small enough that the total uninterrupted WCET of the magenta 
cloud could be less than the minimum period of the interfering interrupt handler(s), making it possible to reason 
about these sections with a first-order worst-case response time analysis [54] or by detailed simulation and 
testing. 
The number of time summary oracle assertions necessary to eliminate yellow nodes from the benchmarks is 
small and manageable. Well over half of the assertions are of the type that could be automatically inferred by 
local data flow analysis. 
The original motivating challenge for this work was to see if static analysis could significantly reduce the 
required testing effort. Our answer is a definitive “yes.” Our tool allows the testing problem to be reduced from 
needing to cover the entire state space of the system down to concentrating on a handful of key passages in the 
code corresponding to oracle assertions and magenta sections. It is difficult to quantify just how much of a 
reduction in testing we can achieve given that bounding interrupt latency for benchmarks such as ours is a 
largely intractable problem without our analysis technique. With the static analysis we propose, proper testing 
of such systems is not only possible, it can now be undertaken with standard test coverage techniques. 
8.4 User Experience 
This section details the complete process of starting with a raw program, and iterating with the deadline analysis 
to add time summary oracle assertions until all yellow nodes are eliminated. 
This example will use one of the medium sized benchmarks, Rop. 
The overall process of the deadline analysis is shown in Fig. 14. After an initial stack analysis, the user iterates 
the graph coloring deadline analysis, adding assertions until all of the yellow nodes are eliminated. 
 
Fig. 14. Tool usage flow diagram. 
The initial run of the tool takes 23 seconds and outputs: 
Border Yellow instructions: 
L0667: JR ULT, L0680 
L0675: JR ULT, L0680 
L00D2: JR EQ, L00E3 
L066C: JR UGT, L067C 
L067A: JR ULE, L0681 
L0312: JR C, L0308 
L062D: JR ULE, L061C 
L0268: JR UGE, L02B7 
L0080: JR EQ, L00F2 
L02BA: JR UGE, L02C3 
L034C: JR EQ, L0354 
L0396: PUSH %FBh 
L04E6: DJNZ r14, L04E0 
  
Edges = 24503 Green Yellow Magenta Blue 
Nodes = 18559 12522 6029 2 6 
Percent = 67% 32% 1% 1% 
 
The list of potential yellow nodes is long for the initial run, because it is not trivial for the tool to distinguish 
between key yellow loops that must be broken and loop instructions that happen to be on the yellow border for 
other reasons. 
Looking through some of the tool's suggested locations in the code, the user's attention is immediately drawn to 
a potential loop to bound—the DJNZ instruction at L04E6 is part of a double loop that debounces the input from 
a mechanical switch attached to the system. The design of the system specifies that this mechanical contact 
should not bounce for more than 10ms when in good working order. 
The double loop is actually two intertwined loops (which would be difficult to implement in most higher level 
languages), but can be bounded with a pair of assertions to the time summary oracle: 
[0x04E0]->[0x04E8]=80000; 
Debounce. (10ms) [E] 
[0x04DC]->[0x04E8]=80000; 
Debounce. (10ms) [E] 
 
The syntax on the left describes the source and destination nodes, and the length of time to assert. To the right 
of the semicolon, a comment documents the reason for the assertion and the time translated into seconds. (80, 
000 machine cycles equals 10 milliseconds with an 8MHz clock.) The full grammar of the time summary oracle 
file format can be found in [10]. 
The user reruns the tool, with the new oracle assertions. After 31 seconds, the tool responds: 
Border Yellow instructions: 
L0667: JR ULT, L0680 
L0675: JR ULT, L0680 
L00D2: JR EQ, L00E3 
L066C: JR UGT, L067C 
L067A: JR ULE, L0681 
L0312: JR C, L0308 
L062D: JR ULE, L061C 
L0268: JR UGE, L02B7 
L0080: JR EQ, L00F2 
L02BA: JR UGE, L02C3 
L034C: JR EQ, L0354 
L0396: PUSH %FBh 
L04DA: JR NZ, L04D6 
  
Edges = 24513 Green Yellow Magenta Blue 
Nodes = 18559 12528 6023 2 6 
Percent = 67% 32% 1% 1% 
 
Note that the node total has remained the same, but six nodes that were yellow are now green. The DJNZ 
instruction at L04E6 is no longer listed as a border yellow node, and a new border node is listed in its place. The 
tool also outputs the number of red nodes in the graph, if any, but none of these graphs contained red nodes. 
The loop at L04DA is a holding pattern that waits for the human operator to release one of the push buttons. 
The user interface segments of this microcontroller system are only executed when the system is in a 
programming mode, so attention to interrupt handlers is not important here. The user assumes that no one is 
pushing the button, and the branch will never be taken. 
The loop at L0312 waits on an external device that the microcontroller has synchronous communication with. 
The manufacturer guarantees a maximum 40ms delay before the device responds. 
The loop at L062D has a visible bound, but calls several levels of complex subroutines. This is the sort of loop 
that would be extremely tedious to estimate by hand with any accuracy, but which could probably be 
automatically bounded by a local data flow analysis around the loop and its subroutines. For now, the user puts 
in an outrageous overestimate of 3 full seconds; this area should be simulated in depth in order to tighten the 
estimate later. 
The jump instruction JR EQ, L0354 at L034C is part of a loop that writes ASCII strings to a connected LCD panel 
one byte at a time. The number of iterations for the loop is dependent upon the length of the string passed into 
the subroutine, but the system is designed to have a 16 character LCD display, and none of the zero-terminated 
ASCII string constants in the program are longer than 17 characters. The subroutine called from within the loop 
is green from some other call sites, so with some work, the user can conservatively bound the loop to be 17 
characters times at most 40ms, for a total of 680ms. 
The oracle is provided with the next set of assertions. The bracketed letters on the far right of the comment are 
personal notes about the type of assertion. An “[E]” indicates “external delay loops,” which are impossible to 
statically bound. An “[A]” indicates loops dependent on internal data, and the letter “[D]” indicates a more 
difficult class of internal data-dependent loops. 
[0x04D6]->[0x04DC]=30; 
No button press. [E] 
[0x061C]->[0x062F]=24000000; 
Punt. (3sec) [A] 
[0x0308]->[0x0314]=320000; 
Display. (40ms) [E] 
[0x033D]->[0x0354]=5440000; 
17 char (680ms) [D] 
 
This run takes 36 seconds, and has reduced the number of suggested border nodes to look at. The PUSH 
instruction continues to appear in the list only because some other yellow obstacle is preventing the slicer from 
identifying the correct segment to which additional stack context should be added. 
Border Yellow instructions: 
L0396: PUSH %FBh 
L0608: DJNZ r12, L0601 
L0650: JR ULE, L063F 
L042A: JR Z, L041C 
  
Edges = 25044 Green Yellow Magenta Blue 
Nodes = 18992 16470 2431 2 89 
Percent = 86% 12% 1% 1% 
 
The loop at L042A is part of another software debouncing area. The user will assume no button press. 
The loop at L0650 is a twin to the loop at L062D above, so the user duplicates the assertion edge with new 
source and destination addresses. 
The DJNZ instruction at L0608 is part of a nested loop that was designed to wait 20ms before sending more data 
to a peripheral chip. 
More assertions are added, and the tool is rerun. 
[0x0420]->[0x0427]=46; 
No button press. [E] 
[0x0420]->[0x042C]=66; 
No button press. [E] 
[0x063F]->[0x0652]=24000000; 
Punt. (3sec) [A] 
[0x0601]->[0x060A]=166086; 
EEPROM write (20ms) [A] 
[0x0603]->[0x060A]=166086; 
EEPROM write (20ms) [A] 
  
Border Yellow instructions: 
L0396: PUSH %FBh 
L05E5: DJNZ r13, L05D8 
L05F6: DJNZ r13, L05EA 
  
Edges = 25088 Green Yellow Magenta Blue 
Nodes = 19020 17562 1367 2 89 
Percent = 92% 7% 1% 1% 
 
After 39 seconds of analysis, the percentage of green nodes has topped 90 percent, and the remaining yellow 
nodes are in the single digit range. The user is in the home stretch now. 
Both of the suggested DJNZ instructions belong to loops with obvious bounds. While somewhat tedious, the user 
is able to total up the execution time of the dozen instructions in the bodies of the loops, and multiply them by 
the bounds. 
[0x05EA]->[0x05F8]=144; 
RDLP1 (8*18cyc=18uS) [A] 
[0x05D8]->[0x05E7]=1200; 
SENDBF (8*150c =150uS) [A] 
  
Border Yellow instructions: 
L0396: PUSH %FBh 
L0490: DJNZ r14, L048D 
  
Edges = 28728 Green Yellow Magenta Blue 
Nodes = 21837 21242 504 2 89 
Percent = 97% 2% 1% 1% 
 
After a 1 minute, 19 second analysis, the program has 97 percent green nodes. 
The next border node belongs to a loop with obvious bounds calling a 40ms subroutine. There are two very 
similar loops with slightly different bounds on the page above L0490. The user adds assertions for all three. 
[0x048D]->[0x0492]=1601000; 
DSPBCK 5x (201ms) [A] 
[0x046C]->[0x0471]=1601000; 
DSPBCK 5x (201ms) [A] 
[0x0445]->[0x044A]=1280800; 
DSPBCK 4x (161ms) [A] 
 
The final run of the tool takes 1 minute, 26 seconds, but produces zero yellow nodes. 
Edges = 28731 Green Yellow Magenta Blue 
Nodes = 21837 21746 0 2 89 
Percent = 99% 0% 1% 1% 
 
There is still much testing to be done for this embedded system. The user has presented 16 assertions to the 
oracle, 10 of those based upon manual inspection of the code, rather than external design criteria. Simulation 
and testing of the system should aim to validate and/or tighten these unchecked assertions. 
While the two magenta nodes in the system seem to be a small window of opportunity for interrupt starvation, 
they comprise an infinite loop with a nongreen interrupt source turned on. In other words, the system turns off 
all other interrupts, and waits for a particular, different interrupt to occur before returning to normal operation. 
Thus, deadline analysis for this system and this particular interrupt handler depends ultimately upon knowing 
the upper bound on the time the system will have to wait for this other interrupt source to be triggered. 
Overall understanding of the example system's timing behavior has increased as a result of the deadline 
analysis. Testing and simulation can concentrate on the lines of code for which assertions have been provided, 
and on the magenta nodes, both of which comprise a tiny fraction of the total state space for the code. The 
prototype implementation also produces flow graphs that depict the colors of code regions, or can dump the 
graph in a flat file format suitable for import into other visualization tools. Additional implementation details are 
presented in [10]. 
SECTION 9 Conclusion 
For interrupt-driven assembly code, our tool makes it significantly easier to perform deadline analysis. We use 
static analysis to reduce the required testing effort to concentrate on the validity of certain testing oracles. Our 
multiresolution analysis allows for compact and efficient representation of timing properties while smoothly 
incorporating the oracles. For each of our test programs, at most 17 oracles are sufficient, and these can be 
added in an interactive fashion until the deadline analysis is complete. In our experience, an expert user can go 
from a bare program of about 1, 000 lines of assembly code to a completed deadline analysis in less than an 
hour, not counting the testing of the oracles. 
While the current incarnation of the tool uses a Z86 front end, the abstractions used in the graph analysis are 
applicable to a wide range of other processors which use bit-maskable, vectored interrupt handling, such as the 
Motorola 68000 family and many RISC DSP chips. 
Several valuable lessons were learned while experimenting with our tool on real code “found in the wild.” First, 
for designers of deadline analysis tools, it is absolutely crucial to find good techniques for presenting and 
visualizing the vast amount of data available when analyzing even moderately sized interrupt-driven systems. 
For interrupt-driven system designers, it is key to note that stack boundedness is a precondition for this kind of 
deadline analysis. We have one commercial microcontroller system for which we cannot provide a deadline 
analysis because its stack behavior is too byzantine for the current incarnation of the stack analyzer to handle. 
Finally, for low-level, interrupt-driven systems of this kind it is essential that the full interrupt capabilities of the 
microcontroller be used as sparingly as possible. Global interrupts should be kept disabled whenever possible, 
and individual interrupts should be masked off when not needed. Judicious use of interrupts greatly reduces the 
complexity of the analysis problem, and ultimately reduces the size of the magenta sections that require further 
modeling after our deadline analysis. While we believe our work has shown that existing techniques can tackle 
deadline analysis for well-written, medium-sized interrupt-driven systems, it is surely the case that sloppy or 
naive implementation of an interrupt-driven system can still render the problem into intractability. 
Future work includes improvements in 1) discovery of loop variables bounds, 2) static analysis of magenta 
clouds, based on specifications of minimum interarrival times for interrupts, and 3) the interface for visualization 
of the graph. 
ACKNOWLEDGMENTS 
The authors would like to thank Mayur Naik, Krishna Nandivada, John Regehr, Michael Richmond, Ben Titzer and 
the anonymous reviewers for helpful comments on drafts of the FSE paper. They also thank Greenhill 
Manufacturing, Ltd., for the use of their proprietary software as test input. The authors were supported by a US 
National Science Foundation ITR Award number 0112628. 
References 
1. "CPLEX mixed integer optimizer", 2003. 
2. APP550 Student Training Guide, May 2003. 
3. A.V. Aho, R. Sethi and J.D. Ullman, Compilers: Principles Techniques and Tools., 1985. 
4. M. Alt, C. Ferdinand, F. Martin and R. Wilhelm, "Cache Behavior Prediction by Abstract Interpretation", Proc. 
SAS 96: Int'l Static Analysis Symp., 1996. 
5. P. Altenbernd, "On the False Path Problem in Hard Real-Time Programs", Proc. ERTS 96: Eighth EuroMicro 
Workshop Real-Time Systems, pp. 102-107, June 1996. 
6. D.W. Brylow, "Static Analysis of Interrupt-Driven Software", 2003. 
7. A. Burns and A. Wellings, Real-Time Systems and Programming Languages, 2001. 
8. L. Cardelli, "Type Systems", The Computer Science and Engineering Handbook, pp. 2208-2236, 1997. 
9. T. Lindholm and F. Yell, The Java Virtual Machine Specification, Apr. 1999. 
10. D. McAllester, "On the Complexity Analysis of Static Analyses", Proc. SAS 99: Int'l Static Analysis Symp., 1999. 
11. P.C. McGeer and R.K. Brayton, Integrating Functional and Temporal Domains in Logic Design., May 1991. 
12. T. Mitra and A. Roychoudhury, "A Framework to Model Branch Prediction for WCET Analysis", Proc. WCET 
02: Second IEEE Int'l Workshop Worst Case Execution Time Analysis, pp. 68-71, June 2002. 
13. J.C. Mogul, R.F. Rashid and M.J. Accetta, "The Packet Filter: An Efficient Mechanism for User-Level Network 
Code", Proc. SOSP 87: 11th ACM Symp. Operating Systems Principles, pp. 39-51, 1987. 
14. S. Muchnick, Advanced Compiler Design and Implementation., 1997. 
15. G. Naumovich, G.S. Avrunin and L.A. Clarke, "Data Flow Analysis for Checking Properties of Concurrent Java 
Programs", Proc. ICSE 99: 21st Int'l Conf. Software Eng., pp. 399-410, May 1999. 
16. G. Naumovich and L.A. Clarke, "Extending FLAVERS to Check Properties on Infinite Executions of Concurrent 
Software Systems", Apr. 2000. 
17. F. Nielson, H.R. Nielson and C. Hank, Principles of Program Analysis., 1999. 
18. P. Notebaert, "lp_solve: Mixed Integer Linear Program Solver", 2003. 
19. J. Palsberg and M.I. Schwartzbach, "Object-Oriented Type Inference", Proc. OOPSLA 91: Sixth Ann. ACM 
SIGPLAN Conf. Object-Oriented Programming Systems Languages and Applications, pp. 146-161, 1991. 
20. S. Petters and G. Frber, "Making Worst Case Execution Time Analysis for Hard Real-Time Tasks on State of 
the Art Processors Feasible", Proc. RTCSA 99: Sixth Int'l Conf. Real-Time Computing Systems and 
Applications, pp. 442-449, 1999. 
21. J. Plevyak and A.A. Chien, "Precise Concrete Type Inference for Object-Oriented Languages", Proc. OOPSLA 
94: Ninth Ann. Conf. Object-Oriented Programming Systems Language and Applications, pp. 324-340, 
1994. 
22. A. Podelski, "Model Checking as Constraint Solving", Proc. SAS 00: Int'l Static Analysis Symp., 2000. 
23. P. Puschner and C. Koza, "Calculating the Maximum Execution Time of Real-Time Programs", J. Real-Time 
Systems, vol. 1, no. 2, pp. 159-176, Sept. 1989. 
24. P.P. Puschner and A.V. Schedl, "Computing Maximum Task Execution TimesA Graph-Based Approach", J. 
Real-Time Systems, vol. 13, no. 1, pp. 67-91, 1997. 
25. J. Regehr, A. Reid and K. Webb, "Eliminating Stack Overflow by Abstract Interpretation", Proc. EMSOFT 03: 
Third Int'l Conf. Embedded Software, 2003. 
26. T. Reps, "Undecidability of Context-Sensitive Data-Independence Analysis", ACM Trans. Programming 
Languages and Systems (TOPLAS), vol. 22, no. 1, pp. 162-186, 2000. 
27. D.J. Richardson, S.L. Aha and T.O. O'Malley, "Specification-Based Test Oracles for Reactive Systems", Proc. 
ICSE 92: 14th Int'l Conf. Software Eng., pp. 105-118, 1992. 
28. R. Sedgewick, Algorithms in C Part 5: Graph Algorithms., 2001. 
29. M. Sharir and A. Pnueli, "Two Approaches to Interprocedural Data Flow Analysis", Program Flow Analysis 
Theory and Applications, 1981. 
30. E. Clarke, O. Grumberg and D. Peled, Model Checking., Jan. 2000. 
31. M. Corti, R. Brega and T. Gross, "Approximation of Worst-Case Execution Time for Preemptive Multitasking 
Systems", Proc. LCTES 00: ACM SIGPLAN Workshop Languages Compilers and Tools for Embedded 
Systems, 2000. 
32. P. Cousot and R. Cousot, "Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by 
Construction or Approximation of Fixpoints", Proc. POPL 77: Fourth Ann. ACM SIGPLAN-SIGACT Symp. 
Principles of Programming Languages, pp. 238-252, 1977. 
33. M.B. Dwyer, "Data Flow Analysis for Verifying Correctness Properties of Concurrent Programs", Sept. 1995. 
34. E.A. Emerson, "Temporal and Modal Logic", Handbook of Theoretical Computer Science (vol. B): Formal 
Models and Semantics, pp. 995-1072, 1990. 
35. J. Engblom and A. Ermedahl, "Modeling Complex Flows for Worst-Case Execution Time Analysis", Proc. RTSS 
00: 21st IEEE Real-Time Systems Symp., Nov. 2000. 
36. J. Engblom, A. Ermedahl, M. Sjdin, J. Gustafsson and H. Hansson, "Worst-Case Execution-Time Analysis for 
Embedded Real-Time Systems", Software Tools for Technology Transfer, vol. 14, 2000. 
37. J. Engblom and B. Jonsson, "Processor Pipelines and Their Properties for Static WCET Analysis", Proc. 
EMSOFT 02: Second Int'l Conf. Embedded Software, pp. 334-348, Oct. 2002. 
38. C. Ferdinand, R. Heckmann, M. Langenbach, F. Martin, M. Schmidt, H. Theiling, et al., "Reliable and Precise 
WCET Determination for a Real-Life Processor", Proc. EMSOFT 01: First Int'l Workshop Embedded 
Software, Oct. 2001. 
39. C. Ferdinand, F. Martin and R. Wilhelm, "Applying Compiler Techniques to Cache Behavior Prediction", Proc. 
LCTRTS 97: ACM SIGPLAN Workshop Languages Compilers and Tools for Real-Time Systems, pp. 37-46, 
1997. 
40. D. Gay, P. Levis, J.R. vonBehren, M. Welsh, E.A. Brewer and D.E. Culler, "The nesC Language: A Holistic 
Approach to Networked Embedded Systems", Proc. PLDI 03: ACM SIGPLAN Conf. Programming 
Language Design and Implementation, pp. 1-11, 2003. 
41. J. Gustafsson, B. Lisper, N. Bermudo, C. Sandberg and L. Sjberg, "A Prototype Tool for Flow Analysis of C 
Programs", Proc. WCET 02: Second IEEE Int'l Workshop Worst Case Execution Time Analysis, pp. 10-13, 
June 2002. 
42. S.Z. Guyer and C. Lin, "Client-Driven Pointer Analysis", Proc. SAS 03: 10th Ann. Int'l Static Analysis Symp., pp. 
214-236, 2003. 
43. M.J. Harrold, J.A. Jones, T. Li, D. Liang and A. Gujarathi, "Regression Test Selection for Java Software", Proc. 
OOPSLA 01: 16th Ann. ACM SIGPLAN Conf. Object-Oriented Programming Systems Languages and 
Applications, pp. 312-326, 2001. 
44. C.A. Healy, M. Sjdin, V. Rustagi, D.B. Whalley and R. vanEngelen, "Supporting Timing Analysis by Automatic 
Bounding of Loop Iterations", J. Real-Time Systems, pp. 129-156, 2000. 
45. C.A. Healy and D.B. Whalley, "Automatic Detection and Exploitation of Branch Constraints for Timing 
Analysis", IEEE Trans. Software Eng., vol. 28, no. 8, pp. 763-781, Aug. 2002. 
46. R. Alur and D.L. Dill, "A Theory of Timed Automata", Theoretical Computer Science, vol. 126, no. 2, pp. 183-
235, 1994. 
47. R. Alur and T.A. Henzinger, "Logics and Models of Real Time: A Survey", Real-Time: Theory in Practice, 1992. 
48. G. Balakrishnan and T. Reps, "Analyzing Memory Accesses in x86 Executables", Proc. CC 04: 13th Int'l Conf. 
Compiler Construction, vol. 2985, pp. 5-23, Mar. 2004. 
49. G. Bernat, A. Burns and A. Wellings, "Portable Worst-Case Execution Time Analysis Using Java Byte 
Code", Proc. 12th Euromicro Conf. Real-Time Systems, pp. 81-88, June 2000. 
50. D. Brylow, N. Damgaard and J. Palsberg, "Static Checking of Interrupt Driven Software", Proc. ICSE 01: 23rd 
Int'l Conf. Software Eng., pp. 47-56, June 2001. 
51. T.A. Henzinger, R. Jhala and R. Majumdar, "Race Checking by Context Inference", of PLDI 04: Int'l Conf. 
Programming Language Design and Implementation, pp. 1-13, 2004. 
52. O. Shivers, "Control-Flow Analysis of Higher-Order Languages", May 1991. 
53. H. Theiling, C. Ferdinand and R. Wilhelm, "Fast and Precise WCET Prediction by Separated Cache and Path 
Analyses", J. Real-Time Systems, pp. 157-179, 2000. 
54. K.W. Tindell, A. Burns and A.J. Wellings, "An Extendible Approach for Analysing Fixed Priority Hard Real-Time 
Tasks", J. Real-Time Systems, vol. 6, no. 2, pp. 133-152, Mar. 1994. 
55. F. Tip and J. Palsberg, "Scalable Propagation-Based Call Graph Construction Algorithms", Proc. OOPSLA 00: 
15th Ann. ACM SIGPLAN Conf. Object-Oriented Programming Systems Languages and Applications, pp. 
281-293, 2000. 
56. E. Vivancos, C. Healy, F. Mueller and D. Whalley, "Parametric Timing Analysis", Proc. LCTES 01: ACM SIGPLAN 
Workshop Languages Compilers and Tools for Embedded Systems, pp. 88-93, 2001. 
57. J. Wegener and F. Mueller, "A Comparison of Static Analysis and Evolutionary Testing for the Verification of 
Timing Constraints", J. Real-Time Systems, vol. 21, no. 3, pp. 241-268, Nov. 2001. 
58. Z. Xu, B.P. Miller and T. Reps, "Safety Checking of Machine Code", Proc. PLDI 00: ACM SIGPLAN Conf. 
Programming Language Design and Implementation, 2000. 
59. Z86E30/E31/E40 Preliminary Product Specification, 1998. 
 
 
