Composable Code Generation for Distributed Giotto 
Thomas A. Henzinger
EPFL and UC Berkeley
tah@ep.ch
Christoph M. Kirsch
University of Salzburg
ck@cs.uni-salzburg.at
Slobodan Matic
University of California, Berkeley
matic@eecs.berkeley.edu
Abstract. We present a compositional approach to the implemen-
tation of hard real-time software running on a distributed platform.
We explain how several code suppliers, coordinated by a system
integrator, can independently generate different parts of the
distributed software. The task structure, interaction, and timing are
specified as a Giotto program. Each supplier is given a part of the
Giotto program and a timing interface, from which the supplier
generates task and scheduling code. The integrator then checks,
individually for each supplier, in pseudo-polynomial time, if the
supplied code meets its timing specification. If all checks succeed,
then the supplied software parts are guaranteed to work together
and implement the original Giotto program. The feasibility of the
approach is demonstrated by a prototype implementation.
Categories and Subject Descriptors C.3 [Special-purpose and
Application-based Systems]: Real-time and Embedded Systems;
D.1.3 [Software Techniques]: Distributed Programming
General Terms Languages, Reliability
Keywords Real Time, Distributed Compilation
1. Introduction
The distributed implementation of hard real-time systems is a
key challenge in modern control systems, especially in automobile
(drive-by-wire) and aircraft (fly-by-wire) control. Much of the
work in this area has been devoted to hardware-focused solutions,
such as the time-triggered architecture [1], which guarantees hard
real-time constraints across a distributed system by strict adher-
ence to clock-synchronized networking protocols. The cost of such
a solution is paid in terms of flexibility, and even recent efforts
in the automotive industry (FlexRay, Autosar [2]) require that all
component processes, their dependencies, and their timing profiles
be known in advance. We suggest that the competing goals of
timely execution and composable design can be achieved together
by adopting a software solution that requires only basic hardware
services such as clock synchronization and redundancy manage-
ment. We have previously proposed the LET (logical execution
time) paradigm, and the LET-based language Giotto, as a software
This research was supported in part by the AFOSR MURI grant F49620-
00-1-0327 and the NSF grants CCR-0208875 and CCR-0225610.
LCTES'05, June 15–17, 2005, Chicago, Illinois, USA.
Copyright © 2005 ACM 1-59593-018-3/05/0006...$5.00.
model that guarantees predictable real-time execution and at the
same time supports portable, composable code [3]. In this paper,
we demonstrate how Giotto can be implemented on a distributed
platform by distributed compilation with little global coordination.
In this way, Giotto offers a framework for the compositional design
of hard real-time systems.
Giotto. Giotto is a domain-specic language for control appli-
cations [3]. A Giotto program executes a periodic set of LET tasks,
and the set of tasks, or their periods, may change whenever a Giotto
mode switch occurs. Instead of just a deadline, a LET task has a release
and a termination time: the release time specifies the exact
time at which the task inputs are made available to the task; the
termination time species when the task outputs become available
to other tasks. The task must start running, may be preempted, and
must complete execution during its LET, which is the time from
release to termination. Thus the times when a LET task reads and
writes data are decoupled from the task execution. LET avoids race
conditions, and thus ensures the predictable, deterministic execu-
tion of a set of real-time tasks. LET tasks can be replaced and com-
posed without modifying their behavior or timing. Since LET is
an abstract programming model, the compiler must ensure that the
generated code satises the LET assumption. This can be achieved
by compiling Giotto into schedule-carrying code (SCC) [4] for a
pair of virtual machines: the E (embedded) machine mediates be-
tween tasks and the physical environment [5]; the S (scheduling)
machine mediates between tasks and the CPU [4]. E code specifies
when sensors and task inputs are read, and when actuators and task
outputs are written; S code specifies when a task is executed on the
CPU. We have implemented the E and S machines as part of a
high-performance microkernel for real-time systems [6] and used Giotto
to implement the flight-control system for a model helicopter [7].
Distributed hard real-time code. A Giotto program specifies
the functional and timing behavior of a dynamic set of tasks, for
example, the tasks of an automotive control system. Such a sys-
tem is typically executed by an on-board network with several hosts
(CPUs). Moreover, such a system is typically put together from several
parts, which correspond to different control problems, for example,
fuel injection and anti-lock brake control. While the different
software parts may interact, they are often developed by different
suppliers: the brake supplier will deliver its own software, etc.
Furthermore, to optimize the use of computational resources, there
need not be a one-to-one correspondence between hosts and suppli-
ers. The contracting company, or integrator (e.g., the car manufac-
turer), then faces the challenge of putting together and maintaining
the entire system. Using today's methodologies, a simple modification
in the software of a single supplier may induce a series of
modifications in the whole system. For example, a change of timing
attributes (e.g., task execution times) in one software component
may cause the schedule of other components to change. We
show how this problem can be avoided using Giotto.
Our approach. We view the Giotto program as the overall
system specication (timing and task interaction). Each supplier isgiven a part of the Giotto program with the charge to implement
the corresponding tasks. This information can be regarded as a
component specication. So that all supplied software parts will
t together, each supplier also receives timing information in the
form of a timing interface. The timing interface species the time
slots that can be used by the supplier for computation on the
hosts, and the time slots that can be used by the supplier for
communication over the network. From a component specication
and a timing interface each supplier produces code. The integrator
then checks that the produced code complies to the timing interface
and meets, on the given hardware, the release and termination
times specied by the Giotto program. The rst check is called
interface compliance; the second, time safety. Both checks are local
for each piece of supplied code and can be performed in pseudo-
polynomial time. If all checks go through, the integrator is assured
that all supplied software parts t together and correctly implement
the original Giotto program (note that correctness includes the
satisfaction of all real-time constraints).
Why does this work? Essentially, we build a fully software-
based instance of the time-triggered paradigm. Instead of having
the hardware and network protocol enforce all timing interfaces,
each timing interface is enforced separately by the compiler (dur-
ing distributed code generation by the suppliers) and by program
analysis (during code integration by the integrator). The LET as-
sumption is crucial to this approach. The LET (release to termina-
tion) of a task is always non-zero. This allows us to communicate
values across the network without changing the timing of a task,
and without introducing nondeterminism, as long as the timing in-
terface ensures that all values are available in time to meet all task
release and termination times, and all sensor read and actuator up-
date times. By contrast, the synchrony assumption used by other
real-time languages [8] does not offer this flexibility, and hence an
important approach to distributing synchronous programs is based
on the Globally Asynchronous, Locally Synchronous paradigm [9].
What are the benets? We obtain the benets of the time-
triggered paradigm in terms of real-time assurance, and at the same
time achieve a high degree of exibility. For example, a supplier
may be replaced by another one, and as long as the code produced
by the new supplier complies to its component specication and
timing interface, it will work together properly with all other code
in the system. Likewise, if new functionality is added to the sys-
tem, say by adding a new supplier, as long as the new software
passes the two checks (interface compliance and time safety), it
will not change the behavior (neither functionality nor timing) of
the original system in any way. This is because interface compliance
succeeds only if the original set of timing interfaces can accommodate
an additional timing interface with sufficient capacity,
and time safety succeeds only if the original set of hosts can ac-
commodate the new tasks. The advantage of our approach lies in
the fact that the two checks can be performed automatically, and
the system integrator need not rely exclusively on testing to see if
the upgraded system behaves correctly.
Related work. Previously, Giotto had only been compiled for
single-CPU systems [10]. The contribution of this paper is two-
fold: we describe a methodology that supports (1) distributed real-
time code generation for (2) distributed real-time systems. Mul-
tiple suppliers (1) can independently compile different parts of a
Giotto program to run on a system of multiple CPUs (2). Because
of the time-driven nature of our timing interfaces, (1) immedi-
ately enables (2) on clock-synchronized systems. Other approaches
for (2), however, may not necessarily support (1); for example, syn-
chronous reactive programs written in Lustre have been compiled
globally for distributed real-time systems [11]. Aimed at (1) are
scheduling techniques that address the problem of dividing tasks
into groups, and scheduling tasks within groups [12, 13]: the chal-
lenge is to develop compositional schemes for resource partitioning
such that each task group may be programmed as if it had dedicated
access to the resource and may be tested for schedulability with-
out global task knowledge. However, these techniques typically as-
sume a single CPU and no interaction between tasks. In distributed
real-time systems there are efforts [14] to define minimal but complete
interfaces that link components together. In avionics software,
where previously each control subsystem had its own dedicated re-
source, new solutions are proposed which offer a common com-
puting platform for multiple functions; [15] presents requirements
for the temporal partitioning of such a platform. The car manufac-
turers' and suppliers' perspectives on embedded software reuse are
described in [16], which presents a general framework in which
different software components can be classified according to their
degree of reusability, albeit without considering real-time commu-
nication in detail.
Outline of the paper. In Section 2, we present a brief review of
Giotto and introduce a running example that we will use through-
out this paper. In Section 3, we discuss the algorithm that generates
from a given Giotto program virtual machine code (SCC) for each
host and each supplier. In Section 4, we introduce timing interfaces
and show how they can be composed. Section 5 describes our prototype
implementation of distributed Giotto. In Section 6, we analyze
distributed SCC generated from Giotto, present pseudo-polynomial
checks for interface compliance (w.r.t. a timing interface) and time
safety (w.r.t. the worst-case execution times of tasks), and prove the
distributed Giotto compiler correct.
2. The Giotto Language
We give a brief introduction to Giotto and refer to [3] for details.
A simple example of a Giotto program GA is shown in Fig. 1. For
now ignore the distribution annotations given in the brackets to the
right of the program. In this audio application a prerecorded
PCM-format audio file is read, processed, analyzed, and reproduced by
three real-time tasks. The Generator task synthesizes the digital
audio samples of the sound that resembles the plucking of a string.
This is done according to the Karplus-Strong algorithm [17], where
the period of the task determines the pitch of the generated sound.
The Mixer task merges the file samples with the synthesized samples,
amplifying the string pluck sound. The Analyzer task computes
a short-time Fourier series of the mix sound.
A Giotto program begins with port declarations. A port is a
typed variable. The set Ports is partitioned into the following four
sets: a set SensePorts of sensor ports, a set ActPorts of actuator
ports, a set InPorts of task input ports, and a set OutPorts of task
output ports. The sensor ports include the integer-typed port p_c, a
discrete clock. In Fig. 1 the sensor port AudioSampler represents a
vector of audio file samples, the actuator port MixPlayer a vector
of final waveform samples, and the task output ports Spectrum,
MixSound, and StringSound, respectively, represent vectors of
Fourier coefficients, mix samples, and string samples. Fig. 2
shows the data dependency graph for the tasks (rectangles with
rounded corners), the sensor, and the actuator. Each sensor (resp.
actuator) port p is read (resp. written) by a device driver dev[p].
Each task output port is double-buffered, i.e., it is implemented by
two copies, a local copy that is used by the task only, and a global
copy that is accessible to the rest of the program including other
tasks. The copy driver copy[p] copies data from the local copy to
the global copy of the task output port p.
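The double-buffering scheme just described can be illustrated with a minimal Python sketch. The class name OutputPort and its methods are hypothetical and are not part of Giotto or of the authors' implementation; the sketch only shows why readers never observe a value before the copy driver runs.

```python
class OutputPort:
    """Double-buffered task output port (illustrative sketch, hypothetical names)."""

    def __init__(self, initial):
        self.local = initial    # local copy: written by the owning task only
        self.global_ = initial  # global copy: visible to the rest of the program

    def task_write(self, value):
        # The task may complete at any instant inside its LET.
        self.local = value

    def copy_driver(self):
        # Plays the role of copy[p]: executed at the task's termination time.
        self.global_ = self.local


mix_sound = OutputPort(initial=0)
mix_sound.task_write(42)          # task finishes early, say after 1 ms
seen_before = mix_sound.global_   # other tasks still read the old value
mix_sound.copy_driver()           # termination time, say at 4 ms
seen_after = mix_sound.global_    # the new value is now visible
```

Because only the copy driver touches the global copy, the instant at which a value becomes visible does not depend on when the task actually finished.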
Giotto has two kinds of computational activities, tasks and
drivers. Tasks are released and their execution takes time, while
drivers are executed in logically zero time. A Giotto task t has a set
In[t] ⊆ InPorts of input ports, a set Out[t] ⊆ OutPorts of output
ports, and a task function task[t] from the input to the output
ports. The task function represents the result of the computational
activity performed by the task. For example, the task Mixer is defined
with input port In2, output port MixSound, and task function
task[Mixer]. In addition to the device and copy drivers described
above, drivers can be used to transport data between ports and to
initiate mode changes. A Giotto driver d has a set Src[d] ⊆ Ports
of source ports, a set Dst[d] ⊆ Ports of destination ports, a driver
function drv[d] from the source to the destination ports, and an
optional boolean condition on the source ports to control mode
switching. For instance, AudioSampler and StringSound are the
source ports and In2 is the destination port of the driver InDrv2.
Let Tasks (resp. Drvs) be the set of tasks (resp. drivers).
sensor
  AudioSampler uses dev[AudioSampler];            [s1, h1]
actuator
  MixPlayer uses dev[MixPlayer];                  [s1, h1]
output
  Spectrum uses copy[Spectrum];                   [s1, h1]
  MixSound uses copy[MixSound];                   [s2, h2]
  StringSound uses copy[StringSound];             [s3, h2]
task
  Analyzer(In1) output(Spectrum);
  Mixer(In2) output(MixSound);
  Generator(In3) output(StringSound);
driver
  InDrv1(MixSound) output(In1);
  InDrv2(AudioSampler, StringSound) output(In2);
  InDrv3() output(In3);
  ActDrv(MixSound) output(MixPlayer);
start m1 {
  mode m1() period 8 {
    actfreq 2 do MixPlayer(ActDrv);
    taskfreq 1 do Analyzer(InDrv1);
    taskfreq 2 do Mixer(InDrv2);
    taskfreq 1 do Generator(InDrv3); }
}
Figure 1. Audio mixer Giotto program GA
Figure 2. Data dependency graph for the program GA
A Giotto program is dened with a set of modes, each of which
consists of a set of periodic tasks. In each mode the invocation of
tasks is repeated after a xed amount of time we call the mode
period. The task set can change at transitions (switches) from one
mode to another. Let Modes be the set of modes, containing a start
mode start 2 Modes. A Giotto mode m has a period [m] 2
￿
>0, a set of task invocations, a set of actuator updates, and a set
of mode switches. Each task invocation (!task;t;d) consists of a
task frequency !task 2
￿
>0 relative to the mode period, a task t,
and a task input driver d, which loads the task inputs. In our ex-
ample there is only one mode m1 with the period [m1] = 8 time
units, in this case milliseconds. The audio le is discretized at the
rate of 11Khz, and 44 of its samples are read every 4ms. The mix
sound is also processed with the period of 4ms, so the frequency
of the Mixer task is 2, and one of the three task invocations of
mode m1 is (2;Mixer;InDrv2). The LET character of the Mixer
task implies that, even if it completes earlier, its output MixSound
is made available through the copy[MixSound] driver exactly at
4ms. Each actuator update (!act;d) consists of an actuator fre-
quency !act 2
￿
>0, and an actuator driver d. Each mode switch
(!switch;m
0;d) consists of a switch frequency !switch 2
￿
>0, a
target mode m
0, and a mode driver d which uses the boolean con-
dition on its source ports to control the mode switch. For the single
mode m1 ofthe example, wehave one actuator update (2;ActDrv)
mode m2() period 8 f
exitfreq 4 do m1(ModeDrv2);
actfreq 4 do MixPlayer(ActDrv);
taskfreq 1 do Analyzer(InDrv1);
taskfreq 4 do Mixer(InDrv2);
taskfreq 1 do Generator(InDrv3); g
Figure 3. Additional mode for the Giotto program GA
and no mode switches. In the rest of the paper we will refer to the
single-mode program in Fig. 1. However, if, for instance, we want
to be able to switch to a mode m2 in which task Mixer is executed
twice as fast, i.e.with !task=4, the program GA should also contain
code for m2 shown in Fig. 3.
For a mode m, the least common multiple of the task, actuator,
and mode-switch frequencies of m is called the number
of units of m, denoted ω_max[m]. The duration of a unit is
γ[m] = π[m]/ω_max[m]. For the compilation procedure we need
the following sets which can, given a mode m and an integer unit
0 ≤ k < ω_max[m], be directly determined from the Giotto program.
The set taskInvocations(m, k) contains all task invocations
of mode m that are released at unit k, i.e., for which k · γ[m]
is an integer multiple of π[m]/ω_task. For instance, γ[m1] = 4
and taskInvocations(m1, 1) = {(2, Mixer, InDrv2)}, because
the Mixer task is the only task that is released at unit 1 of m1,
at 4ms. An output port is in the set taskOutPorts(m, k) if in
mode m it is updated at unit k, i.e., if it is an output port of
a task in taskInvocations(m, k). A sensor port is in the set
senPorts(m, k) if in mode m it is read at unit k, i.e., if it is a
source port of an input driver of a task in taskInvocations(m, k).
The set actDrivers(m, k) contains all actuator drivers of mode
m that are invoked at unit k. Finally, an actuator port is in the set
actPorts(m, k) if in mode m it is updated at unit k, i.e., if it is
a destination port of a driver in actDrivers(m, k). For instance,
senPorts(m1, 1) = {AudioSampler} and actPorts(m1, 1) =
{MixPlayer}.
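The unit-based sets above can be computed mechanically. The following Python sketch does so for mode m1 of the audio example. The dictionary encoding of the mode and all function names are assumptions made for illustration; they are not taken from the Giotto compiler. The release test uses the fact that k · γ[m] is a multiple of π[m]/ω exactly when k · ω is a multiple of ω_max[m].

```python
from math import lcm  # Python 3.9+

# Mode m1 as plain data (hypothetical encoding of the program in Fig. 1).
MODE_M1 = {
    "period": 8,  # ms
    "tasks": [(1, "Analyzer", "InDrv1"), (2, "Mixer", "InDrv2"),
              (1, "Generator", "InDrv3")],
    "actuators": [(2, "ActDrv")],
    "switches": [],
}
DRIVER_SRC = {"InDrv1": {"MixSound"}, "InDrv2": {"AudioSampler", "StringSound"},
              "InDrv3": set(), "ActDrv": {"MixSound"}}
DRIVER_DST = {"ActDrv": {"MixPlayer"}}
TASK_OUT = {"Analyzer": {"Spectrum"}, "Mixer": {"MixSound"},
            "Generator": {"StringSound"}}
SENSOR_PORTS = {"AudioSampler"}


def units(mode):
    """Number of units: lcm of all task, actuator, and switch frequencies."""
    freqs = [f for f, *_ in mode["tasks"] + mode["actuators"] + mode["switches"]]
    return lcm(*freqs)


def unit_duration(mode):
    return mode["period"] // units(mode)


def released_at(mode, freq, k):
    # k * gamma is a multiple of period / freq  <=>  (k * freq) % units == 0
    return (k * freq) % units(mode) == 0


def task_invocations(mode, k):
    return [inv for inv in mode["tasks"] if released_at(mode, inv[0], k)]


def task_out_ports(mode, k):
    return set().union(*(TASK_OUT[t] for _, t, _ in task_invocations(mode, k)))


def sen_ports(mode, k):
    srcs = set().union(*(DRIVER_SRC[d] for _, _, d in task_invocations(mode, k)))
    return srcs & SENSOR_PORTS


def act_drivers(mode, k):
    return [d for f, d in mode["actuators"] if released_at(mode, f, k)]


def act_ports(mode, k):
    return set().union(*(DRIVER_DST[d] for d in act_drivers(mode, k)))
```

For mode m1 this yields two units of 4 ms each, with only the Mixer invocation released at unit 1, in agreement with the values stated in the text.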
E code, S code, and schedule-carrying code (SCC). In [4]
we presented the execution of a Giotto program on a single processor
through the interpretation of code compiled for two virtual
machines, the embedded machine and the scheduling machine. The embedded
machine [5] handles sensors, actuators, and all task requests. It runs
E code that specifies the timing and control flow of Giotto tasks and
drivers. The embedded machine has three non-control-flow instructions.
A call(d) instruction immediately invokes a driver d. A
release(t) instruction¹ releases a task t and proceeds to the next
E code instruction. A future(ℓ, a) instruction marks the E code at
the address a for execution after ℓ ms elapse. The positive integer
ℓ specifies a time trigger, the simplest and only form of trigger that
we consider in this paper. In order to handle multiple active triggers,
the embedded machine maintains a trigger queue. The Giotto
compiler generates a block of E code instructions for each unit of
each program mode.
For example, in Fig. 4, the block of E code for unit 0 of mode
m1 is identified by the label E(m1, 0). It initiates the execution
of the copy drivers that update the three task output ports, and the
execution of the audio player device driver. Then the audio sampler
device driver and the three task input drivers update the input ports
of the three tasks that are released next. Note the order of driver
call instructions: copy drivers are followed by device drivers,
followed by task input drivers. Finally, a time trigger with address
label E(m1, 1) is activated. So, after 4ms the embedded machine
executes the block of E code starting at the address E(m1, 1).
The last instruction of this block activates another 4ms trigger,
now with address E(m1, 0). In this way the execution of each
of the two blocks is repeated every 8ms. Note that the task and
driver functions are external to the embedded machine and must be
implemented in some other language.
¹ The release instruction corresponds to the schedule instruction in [5],
but has been renamed for clarity.
E(m1, 0):
  call(copy[Spectrum])
  call(copy[MixSound])
  call(copy[StringSound])
  call(ActDrv)
  call(dev[MixPlayer])
  call(dev[AudioSampler])
  call(InDrv1)
  call(InDrv2)
  call(InDrv3)
  release(Analyzer)
  release(Mixer)
  release(Generator)
  future(4, E(m1, 1))
E(m1, 1):
  call(copy[MixSound])
  call(ActDrv)
  call(dev[MixPlayer])
  call(dev[AudioSampler])
  call(InDrv2)
  release(Mixer)
  future(4, E(m1, 0))
Figure 4. E code blocks for the program GA
S(m1, 0):
  dispatch(Mixer, 4)
  dispatch(Generator, 4)
  dispatch(Analyzer, 4)
S(m1, 1):
  dispatch(Mixer, 4)
  dispatch(Generator, 4)
  dispatch(Analyzer, 4)
Figure 5. S code blocks for the program GA
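The trigger-driven execution of E code described above can be illustrated with a small interpreter. This is a sketch only: the tuple encoding of instructions, the helper block, and the function run_e_machine are hypothetical and much simpler than the actual embedded machine [5]. Drivers and tasks are not executed; the interpreter merely records when each call and release happens.

```python
import heapq


def block(calls, releases, delay, target):
    """Build one E code block: driver calls, task releases, one time trigger."""
    return ([("call", d) for d in calls]
            + [("release", t) for t in releases]
            + [("future", delay, target)])


# The two E code blocks of the single-processor audio example.
E_CODE = {
    "E(m1,0)": block(
        ["copy[Spectrum]", "copy[MixSound]", "copy[StringSound]", "ActDrv",
         "dev[MixPlayer]", "dev[AudioSampler]", "InDrv1", "InDrv2", "InDrv3"],
        ["Analyzer", "Mixer", "Generator"], 4, "E(m1,1)"),
    "E(m1,1)": block(
        ["copy[MixSound]", "ActDrv", "dev[MixPlayer]", "dev[AudioSampler]",
         "InDrv2"],
        ["Mixer"], 4, "E(m1,0)"),
}


def run_e_machine(e_code, start, horizon_ms):
    """Interpret E code until horizon_ms; return a trace of (time, op, arg)."""
    trace = []
    queue = [(0, 0, start)]  # trigger queue: (time, tie-breaker, address)
    seq = 1
    while queue and queue[0][0] < horizon_ms:
        now, _, addr = heapq.heappop(queue)
        for instr in e_code[addr]:
            if instr[0] == "future":
                _, delay, target = instr
                heapq.heappush(queue, (now + delay, seq, target))
                seq += 1
            else:
                # call and release take logically zero time
                trace.append((now, instr[0], instr[1]))
    return trace


trace = run_e_machine(E_CODE, "E(m1,0)", horizon_ms=16)
mixer_releases = [t for t, op, arg in trace if op == "release" and arg == "Mixer"]
analyzer_releases = [t for t, op, arg in trace
                     if op == "release" and arg == "Analyzer"]
```

Over two mode periods the trace shows Mixer released every 4 ms and Analyzer every 8 ms, which is the periodic behavior the two blocks are meant to produce.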
The scheduling machine [4] determines when, and in what order,
tasks released by the E code are executed (dispatched). It replaces
the system task scheduler, since the code that it runs, S code,
defines a schedule according to which, at run time, a simple dispatcher
selects which task to execute. The scheduling machine also
has three instructions, one of which is call(d) as for the embedded
machine. A dispatch(t, ℓ) instruction resumes (or starts) the
execution of a released task t until ℓ ms elapse, measured from
the start instant of the current S code block. The integer ℓ specifies
the simplest and the only form of timeout that we consider
in this paper. The task executes until either it completes or the
timeout becomes true, whichever happens first, and after that the
scheduling machine proceeds to the next instruction. An idle(ℓ)
instruction causes the scheduling machine to idle until the timeout
ℓ becomes true. Each block of E code is annotated with a block
of S code which starts execution in a separate thread after the last
instruction of the E code block. An important difference between
E and S code is that each E code block executes instructions
instantaneously, whereas each block of S code executes over time.
We call the resulting code, consisting of both E and S code blocks,
schedule-carrying code (SCC). The example S code in Fig. 5 contains
a possible schedule for the Giotto program GA. The block of
S code at the label S(m1, 0) is interpreted after the block of E code
at the label E(m1, 0). It starts with the execution of the Mixer
task followed by the other two tasks. The task executing at 4ms is
suspended and resumed with the corresponding dispatch instruction
in the S(m1, 1) block. We note that an S code instruction that
dispatches a task not yet released is simply ignored. With the SCC
in Figs. 4 and 5 the Mixer task is executed twice every 8ms,
and the tasks Generator and Analyzer once, exactly as specified
by the Giotto program GA.
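To make the dispatch semantics concrete, the following Python sketch simulates the S code of Fig. 5 on one CPU for a single mode period. The execution times passed in are made-up values, and the data encoding and the function simulate are hypothetical; the real scheduling machine [4] is preemptive and event-driven, whereas this sketch only reproduces the resulting schedule.

```python
# Block start time -> S code of that block, as (task, timeout in ms).
S_CODE = {0: [("Mixer", 4), ("Generator", 4), ("Analyzer", 4)],
          4: [("Mixer", 4), ("Generator", 4), ("Analyzer", 4)]}
# Block start time -> tasks released by the preceding E code block (Fig. 4).
RELEASES = {0: ["Analyzer", "Mixer", "Generator"], 4: ["Mixer"]}


def simulate(exec_time):
    """Run one mode period; return (schedule, time_safe)."""
    remaining = {}    # released, unfinished tasks -> ms of work left
    schedule = []     # execution slices (task, start, end)
    time_safe = True
    for block_start in sorted(S_CODE):
        for task in RELEASES[block_start]:
            if remaining.get(task, 0) > 0:
                time_safe = False      # previous invocation overran its LET
            remaining[task] = exec_time[task]
        now = block_start
        for task, timeout in S_CODE[block_start]:
            left = remaining.get(task, 0)
            if left == 0:
                continue               # task not released: instruction ignored
            # Run until completion or until the timeout, whichever comes first.
            run = max(0, min(left, block_start + timeout - now))
            if run > 0:
                schedule.append((task, now, now + run))
            remaining[task] = left - run
            now += run
    if any(left > 0 for left in remaining.values()):
        time_safe = False              # unfinished work at the end of the period
    return schedule, time_safe


schedule, ok = simulate({"Mixer": 1, "Generator": 1, "Analyzer": 3})
_, overloaded_ok = simulate({"Mixer": 2, "Generator": 2, "Analyzer": 3})
```

With the first set of execution times, Analyzer is suspended at 4 ms and resumed in the second block after Mixer, as described in the text. With the second set, Analyzer cannot finish within 8 ms, so the sketch reports the schedule as not time safe.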
3. Distributed Code Generation
In our distributed model the system integrator generates a Giotto
program G to be implemented by a set S of suppliers on a set H of
hosts. A supplier is an independent code developer. A host is a self-
contained computational element with its own processor, memory,
and communication interface. We assume that hosts are connected
by a shared bus or a broadcast network. Hosts communicate by
exchanging messages containing port values. For a port p ∈ Ports,
let μ[p] be the message carrying the value of port p.
The integrator assigns each task and each driver defined in G to
a particular host and supplier. For a task t ∈ Tasks let ψ_h(t) (resp.
ψ_s(t)) be the host (resp. supplier) which executes (resp. implements)
task t. We similarly define ψ_h(d) and ψ_s(d) for a driver d ∈ Drvs.
Let Tasks_{s,h} (resp. Drvs_{s,h}) be the set of all tasks (resp. drivers)
assigned to supplier s on host h. We require that a task and its
input and copy drivers be assigned to the same supplier on the
same host. Also, an actuator driver and the corresponding device
driver must be assigned to the same supplier on the same host. With
such an assignment the integrator also allocates each port of G to a
particular host and supplier. If p ∈ Ports is a sensor or an actuator
port, then ψ_s(p) = ψ_s(dev[p]) and ψ_h(p) = ψ_h(dev[p]). If p is an
input or output port of a task t, i.e., if p ∈ In[t] ∪ Out[t], then
ψ_s(p) = ψ_s(t) and ψ_h(p) = ψ_h(t). Finally, each message μ[p] is
associated with a supplier ψ_s(p) and host ψ_h(p), namely, the sending
supplier and host. Let Msgs_{s,h} be the set of all messages that are
associated with supplier s on host h.
In the rest of the paper we assume that the example Giotto
program GA, a streaming audio application, is to be implemented
by three suppliers on two hosts. In Fig. 1 each annotation given
in brackets to the right of a port denotes the supplier and the
host to which the port is allocated. The assignment for tasks is
shown in Fig. 2. The audio file is read on host h1, and every
4ms 44 of its samples are sent to host h2 for processing. The
Mixer and Generator tasks, implemented respectively by the suppliers
s2 and s3, run on h2. After receiving the samples from
h1, the task Mixer merges them with the generated samples, and
within the same 4ms, the resulting MixSound samples are sent
back to host h1. The final waveform is there reproduced and analyzed
by the Analyzer task implemented by supplier s1. The
sets of tasks, drivers, and messages that are associated, for instance,
with s2 on h2 are Tasks_{s2,h2} = {Mixer}, Drvs_{s2,h2} =
{InDrv2, copy[MixSound]}, and Msgs_{s2,h2} = {μ[MixSound]}.
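The assignment constraints and the derived port allocation can be checked mechanically. The Python sketch below encodes the audio example; the dictionary layout and function names are hypothetical, and the entries for the drivers of suppliers s1 and s3 are inferred from the co-location rule rather than listed explicitly in the text.

```python
# Entity -> (supplier, host). Driver entries for s1 and s3 are inferred.
ASSIGN = {
    "Analyzer": ("s1", "h1"), "InDrv1": ("s1", "h1"), "copy[Spectrum]": ("s1", "h1"),
    "Mixer": ("s2", "h2"), "InDrv2": ("s2", "h2"), "copy[MixSound]": ("s2", "h2"),
    "Generator": ("s3", "h2"), "InDrv3": ("s3", "h2"),
    "copy[StringSound]": ("s3", "h2"),
    "ActDrv": ("s1", "h1"), "dev[MixPlayer]": ("s1", "h1"),
    "dev[AudioSampler]": ("s1", "h1"),
}
# Task -> (input driver, input ports, output ports).
TASKS = {
    "Analyzer": ("InDrv1", ["In1"], ["Spectrum"]),
    "Mixer": ("InDrv2", ["In2"], ["MixSound"]),
    "Generator": ("InDrv3", ["In3"], ["StringSound"]),
}
ACTUATORS = {"ActDrv": "MixPlayer"}  # actuator driver -> actuator port


def check_colocation(assign):
    """Return the violated co-location constraints (empty list if none)."""
    errors = []
    for task, (in_drv, _, outs) in TASKS.items():
        for drv in [in_drv] + [f"copy[{p}]" for p in outs]:
            if assign[drv] != assign[task]:
                errors.append((task, drv))
    for drv, port in ACTUATORS.items():
        if assign[drv] != assign[f"dev[{port}]"]:
            errors.append((drv, f"dev[{port}]"))
    return errors


def port_allocation(assign):
    """Task ports follow their task; sensor/actuator ports follow dev[p]."""
    alloc = {}
    for task, (_, ins, outs) in TASKS.items():
        for p in ins + outs:
            alloc[p] = assign[task]
    for name, where in assign.items():
        if name.startswith("dev["):
            alloc[name[4:-1]] = where
    return alloc


def tasks_of(assign, supplier, host):
    return {t for t in TASKS if assign[t] == (supplier, host)}


violations = check_colocation(ASSIGN)
bad = dict(ASSIGN, InDrv2=("s3", "h2"))  # input driver moved to another supplier
bad_violations = check_colocation(bad)
```

Moving InDrv2 away from supplier s2 is reported as a violation, since a task and its input driver must stay with the same supplier on the same host.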
For each supplier s ∈ S and each host h ∈ H, the integrator
gives out (see the next sections for formal definitions)
1. an E code module E_{s,h} that describes the timing and control
flow of driver, task, and message invocations for supplier s on
host h, and
2. a timing interface T_{s,h} that specifies the computation and
transmission time instants on host h that are available for supplier s.
Once a supplier s receives the E code module E_{s,h} and timing
interface T_{s,h} for host h it generates
1. an S code module S_{s,h} for host h,
2. functionality code for all tasks Tasks_{s,h} and drivers Drvs_{s,h}
(sequential functions written in, e.g., native C code), and
3. worst-case execution (transmission) time estimates w_{s,h} for the
tasks in Tasks_{s,h} (messages in Msgs_{s,h}).
Provided with the worst-case execution and transmission times
the integrator then verifies each generated S code module against
the corresponding timing interface and E code module. In this way
the integrator can check the composability of all supplied S code
modules and ensure that the resulting distributed SCC program
satisfies the semantics (including the timing) of the original Giotto
program G. Moreover, once a supplier modifies its S code module
on a host it is sufficient to check whether the new module complies
with its timing interface to preserve Giotto semantics.
Distributed Giotto compilation. Let P be the entire distributed
SCC program. The set Ports_P of distributed SCC ports contains
additional ports (Ports ⊆ Ports_P) to store the data sent over the
network. Namely, if according to the Giotto program G and port-to-host
allocation a value of the port p ∈ Ports is needed as input to
a driver on a host h different from the originating host ψ_h(p), i.e., if
a message with the value of p must be sent over the network, then
the host h must keep its own copy p_h of port p. For a given port
p, let the set recHosts(p) be the set of hosts that need to receive
messages with port p values during program execution in at least
one mode, i.e., the set of hosts on which a task input, actuator, or
mode switch driver d is executed in at least one mode such that p is
a source port of d. The host ψ_h(p) to which the port p is allocated is
not in recHosts(p). For a given task t, let the set sendOutPorts(t)
be the set of task t output ports p for which there are hosts that
must receive the message with the port p value (i.e., those with
recHosts(p) ≠ ∅).
Algorithm 1 The distributed Giotto compiler (mode m)
    k := 0; γ[m] := π[m]/ω_max[m];
    while k < ω_max[m] do
      ∀s ∈ S. ∀h ∈ H: link E_{s,h}(m, k) to next address of E_{s,h};
      ∀p ∈ taskOutPorts(m, k). ∀h ∈ recHosts(p) ∪ {ψ_h(p)}. ∀s ∈ S:
 5:     emit(s, h, call(copy[p_h]));
      ∀d ∈ actDrivers(m, k):
        emit(ψ_s(d), ψ_h(d), call(d));
      ∀p ∈ actPorts(m, k):
        emit(ψ_s(p), ψ_h(p), call(dev[p]));
10:   Mode Switch Compilation Algorithm [10]
      ∀p ∈ senPorts(m, k):
        emit(ψ_s(p), ψ_h(p), call(dev[p]));
        if recHosts(p) ≠ ∅ then
          emit(ψ_s(p), ψ_h(p), release(μ[p], δ));
15:   ∀(·, t, d) ∈ taskInvocations(m, k):
        δ1 := 0; δ2 := 0;
        if Src[d] ∩ senPorts(m, k) ≠ ∅ then δ1 := δ;
        if sendOutPorts(t) ≠ ∅ then δ2 := δ;
        emit(ψ_s(t), ψ_h(t), release(δ1, t, δ2));
20:     ∀p ∈ sendOutPorts(t):
          emit(ψ_s(t), ψ_h(t), release(δ, μ[p]));
      ∀s ∈ S. ∀h ∈ H:
        emit(s, h, future(γ[m], E_{s,h}(m, (k + 1) mod ω_max[m])));
      ∀s ∈ S. ∀h ∈ H: emit(s, h, return);
25:   k := k + 1;
    end while
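The sets recHosts and sendOutPorts, and the two latency arguments of the task release instruction computed in Algorithm 1, can be sketched in Python for the audio example as follows. The data tables and function names are hypothetical. The function release_latencies follows the two if statements of the algorithm literally, with the set of sensor ports read at the current unit passed in by the caller.

```python
# Port -> host to which it is allocated.
PORT_HOST = {"AudioSampler": "h1", "MixPlayer": "h1", "Spectrum": "h1",
             "MixSound": "h2", "StringSound": "h2"}
# Task input and actuator drivers -> (host they run on, source ports).
READERS = {"InDrv1": ("h1", {"MixSound"}),
           "InDrv2": ("h2", {"AudioSampler", "StringSound"}),
           "InDrv3": ("h2", set()),
           "ActDrv": ("h1", {"MixSound"})}
TASK_OUT = {"Analyzer": {"Spectrum"}, "Mixer": {"MixSound"},
            "Generator": {"StringSound"}}


def rec_hosts(port):
    """Hosts other than the port's own host that run a driver reading it."""
    hosts = {h for h, srcs in READERS.values() if port in srcs}
    return hosts - {PORT_HOST[port]}


def send_out_ports(task):
    """Output ports of the task that some other host must receive."""
    return {p for p in TASK_OUT[task] if rec_hosts(p)}


def release_latencies(task, in_driver, sensor_ports_at_unit, delta):
    """Latency arguments (d1, d2) of the task release instruction."""
    srcs = READERS[in_driver][1]
    d1 = delta if srcs & sensor_ports_at_unit else 0
    d2 = delta if send_out_ports(task) else 0
    return d1, d2


mixer = release_latencies("Mixer", "InDrv2", {"AudioSampler"}, delta=1)
generator = release_latencies("Generator", "InDrv3", {"AudioSampler"}, delta=1)
analyzer = release_latencies("Analyzer", "InDrv1", {"AudioSampler"}, delta=1)
```

With a latency of 1 ms, only Mixer gets non-zero latencies on both sides, because it reads a sensor port and its output MixSound must be sent to h1. Generator and Analyzer are released with latencies 0 and 0.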
According to Giotto semantics, each task t input (resp. copy)
driver reads (resp. writes) input (resp. output) ports at the release
(resp. termination) time instants defined by the beginning (resp.
end) of the task t period. In the distributed SCC implementation
each copy driver is still executed by an E code instruction at the
end of the task period. However, each task input driver is executed
by an S code instruction and it is delayed if its source ports need to
be sent over the network first. In general, in each task period, the
transmission of sensor ports precedes task execution, which precedes
the transmission of task output ports. More precisely, let d be
the task input driver for a task t assigned to host h. For all sensor
ports p ∈ Src[d] such that ψ_h(p) ≠ h, a message μ[p] is received
at h. The completion of the message μ[p] transmission updates on
each host h′ ∈ recHosts(p) (including h) the sensor port p_{h′}. The
task t input driver reads p_h (and other ports), applies its function,
and writes to the task t input ports. It follows all sensor port
messages and precedes the task t execution. The completion of the task
t writes to the local copy of the task t output ports. The dispatch
of the task output port message μ[p′] for p′ ∈ Out[t] follows the
task t completion. The completion of the task output port message
μ[p′] writes on each of the hosts h″ ∈ recHosts(p′) to the task
output port p′_{h″}. Finally, at each h″ ∈ recHosts(p′) ∪ {h}, the
copy[p′_{h″}] driver copies local into global task output ports at the
end of the task t period (i.e., at the termination time of the task).
We assume that the transmission of a sensor port value is performed
in a time interval of length δ after the time instant the sensor
is read. The latency value δ must be determined at compile time
and for simplicity we also assume that this value is the same for all
ports. If a task reads a sensor port that needs to be received, then
the task input driver is called exactly δ time units after the task is
released. Otherwise, it is executed at the time the task is released.
Symmetrically, the transmission of task output ports is performed
in a time interval of length δ before the task is terminated (i.e.,
before its period expires). We require that the time δ be less than or
equal to the mode unit time γ[m] = π[m]/ω_max[m] for each mode
m. This implies that the task input driver is always called before its
source ports are updated with values that are more recent than what
is allowed by the LET semantics.
E_{s1,h1}(m1, 0):
  call(copy[MixSound_h1])
  call(copy[Spectrum])
  call(drv[ActDrv])
  call(dev[MixPlayer])
  call(dev[AudioSampler])
  release(μ[AudioSampler], 1)
  release(0, Analyzer, 0)
  future(4, E_{s1,h1}(m1, 1))
E_{s1,h1}(m1, 1):
  call(copy[MixSound_h1])
  call(drv[ActDrv])
  call(dev[MixPlayer])
  call(dev[AudioSampler])
  release(μ[AudioSampler], 1)
  future(4, E_{s1,h1}(m1, 0))
E_{s2,h2}(m1, 0):
  call(copy[MixSound])
  call(copy[StringSound])
  release(1, Mixer, 1)
  release(1, μ[MixSound])
  future(4, E_{s2,h2}(m1, 1))
E_{s2,h2}(m1, 1):
  call(copy[MixSound])
  release(1, Mixer, 1)
  release(1, μ[MixSound])
  future(4, E_{s2,h2}(m1, 0))
E_{s3,h2}(m1, 0):
  call(copy[MixSound])
  call(copy[StringSound])
  release(0, Generator, 0)
  future(4, E_{s3,h2}(m1, 1))
E_{s3,h2}(m1, 1):
  call(copy[MixSound])
  future(4, E_{s3,h2}(m1, 0))
Figure 6. E code modules for the program GA compiled by Alg. 1
Given a Giotto program, Alg. 1 generates all E code modules
$E_{s,h}$ executing in mode m. This is done in parallel for each supplier
$s \in S$ and each host $h \in H$. The while loop generates a block of
E code for each unit k of mode m. The E code compiler command
emit(s, h, instr) generates the E code instruction instr for supplier
s on host h. The compiler first generates call instructions to
the task output (copy) drivers, actuator drivers, and actuator device
drivers. Line 10 refers to [10] for details on generating a block of
E code instructions that addresses mode switching; this is orthogonal
to the issues discussed in this paper. The last segment handles
call instructions for sensor device drivers, the invocation of tasks
and messages, and the future invocation of the embedded machine
at the next unit. The release instructions in the algorithm (lines
14, 19, and 21) are of a special form not needed for single-processor
SCC. They indirectly contain precedence constraints that are necessary
for correct communication by explicitly specifying the latency
time $\delta$. This number does not affect the program execution itself,
but a supplier needs it in order to construct a correct schedule, i.e.,
an S code module. We treat messages sent over the network analogously
to tasks. In particular, we use the same SCC instructions for
messages. The instruction release([p], $\delta$) releases the message
[p] with the value of the sensor port p, but demands that the message
transmission be completed within time $\delta$ of the release. The instruction
release($\delta_1$, t, $\delta_2$) releases the task t with the constraint that
the task be dispatched no earlier than time $\delta_1$ after the release, and
completed at the latest $\delta_2$ time units before the termination time of t.
The instruction release($\delta$, [p]) releases the message with the output
port p of task t, with the constraint that the message be sent no earlier
than $\delta$ time units before the termination of t. The final future
instruction causes the embedded machine to wait for time $\gamma[m]$ and then
execute the E code for the next unit.
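The timing constraints carried by the three release forms can be sketched as time windows. The following Python fragment is only an illustration under stated assumptions: the names Window, sensor_message_window, task_window, and output_message_window are hypothetical and not part of Alg. 1. It also assumes that times are integers in milliseconds, and that an output message must be complete by the task termination time, as the text states for the transmission of task output ports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    earliest_start: int  # earliest time at which the activity may be dispatched
    deadline: int        # latest time by which the activity must be complete


def sensor_message_window(release_time, delta):
    # release([p], delta): transmission must complete within delta of the release
    return Window(release_time, release_time + delta)


def task_window(release_time, termination_time, delta1, delta2):
    # release(delta1, t, delta2): dispatch no earlier than delta1 after the release,
    # complete no later than delta2 before the termination time
    return Window(release_time + delta1, termination_time - delta2)


def output_message_window(termination_time, delta):
    # release(delta, [p]): send no earlier than delta before the task termination
    # (assumed here to finish by the termination time itself)
    return Window(termination_time - delta, termination_time)
```

For the block E_{s2,h2}(m1,0) of Fig. 6, assuming the Mixer is released at time 0 and terminates at 4 ms, task_window(0, 4, 1, 1) yields the window from 1 ms to 3 ms, and output_message_window(4, 1) yields the window from 3 ms to 4 ms.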
Fig. 6 shows the E code modules compiled by Alg. 1 from
the audio mixer Giotto program $G_A$. The code for different suppliers
on the same host is separated by a single horizontal line,
and the code for different hosts is separated by two lines. The
latency is chosen to be $\delta = 1$ ms. For instance, the command
release([AudioSampler], 1) releases the message with the value of the
sensor port AudioSampler, but also specifies a constraint that
the message must be sent before 1 ms expires.
Note that the code generation scheme of Alg. 1 implies the
order of execution: copy drivers are followed by actuator drivers,
mode switch drivers, and task input drivers, in that order. However,
E code blocks compiled for the same host and same unit of a mode
are fully composable, i.e., they can be executed in any order. If
a task output port p is a source port of an actuator, mode switch,
or task input driver that executes at a host h in a mode m, then
h is either in recHosts(p) or is the host to which p is allocated. The
set of hosts that receive the data of port p does not depend on the
program mode. This means that a message with the value of port p is
sent to the host h even if the program executes in a mode in which p
is not a source port of any driver on h. This is so because in a mode
where p is used, p must have a correct value even in the first period
of execution in the mode.
4. Timing Interfaces
As presented in Section 3, each supplier obtains for each host an
E code module specifying the release times of the tasks (resp.
messages) that it implements, and for which it has to determine
the times of execution (resp. transmission). Since both computation
and communication resources are shared, this information must be
accompanied by a temporal specification that provides exclusive
time windows for task execution (resp. message transmission). This
specification, which we call a timing interface, is also given to each
supplier. A timing interface defines the available computation and
communication time windows, but not when to perform a particular
action within these windows. This gives flexibility to a supplier,
especially if multiple tasks are assigned to a supplier on a host.
It also enables timing modifications that are local to a supplier
and host, if a modification in the corresponding E module (e.g.,
adding a task) is made. In the next sections we show that the timing
interface contains all information necessary for correct distributed
code generation.
Formally, a supplier $s \in S$ on host $h \in H$ receives for each
mode $m \in \mathrm{Modes}$ of the Giotto program G a timing interface,
which is a pair of predicates $T^m_{s,h} = (D^m_{s,h}, X^m_{s,h})$. The predicates
$D^m_{s,h}, X^m_{s,h} : \{0, \ldots, \pi[m] - 1\} \to \{0, 1\}$ are defined as follows:
 • $D^m_{s,h}(\ell) = 1$ iff in mode m at time $\ell$ supplier s on host h may
   execute a task from $\mathrm{Tasks}_{s,h}$;
 • $X^m_{s,h}(\ell) = 1$ iff in mode m at time $\ell$ supplier s on host h may
   send a message from $\mathrm{Msgs}_{s,h}$.
Let $T_{s,h} = \{T^m_{s,h} \mid m \in \mathrm{Modes}\}$ and $T = \{T_{s,h} \mid s \in S, h \in H\}$.
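As a minimal sketch of this definition, the two predicates of one timing interface can be stored as 0/1 lists indexed by time within the mode period. The helper predicate_from_windows is a hypothetical name introduced for illustration. The windows used are those described for supplier s1 on host h1 in the audio example (period 8 ms), read here as half-open intervals with 1 ms granularity, which is an assumption.

```python
def predicate_from_windows(period, windows):
    """Build a 0/1 list of length `period` that is 1 inside each half-open window [a, b)."""
    pred = [0] * period
    for a, b in windows:
        for t in range(a, b):
            pred[t] = 1
    return pred


# Computation windows of supplier s1 on host h1: the task Analyzer may run in [1,3) and [5,7).
D_s1_h1 = predicate_from_windows(8, [(1, 3), (5, 7)])
# Communication windows: the AudioSampler samples may be sent in [0,1) and [4,5).
X_s1_h1 = predicate_from_windows(8, [(0, 1), (4, 5)])
```

With this encoding, D_s1_h1[t] plays the role of $D^m_{s,h}(\ell)$ for $\ell = t$, and likewise for X.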
Fig. 7 shows a graphical representation of a timing interface for
the program GA from Fig. 1. The computation slots are shaded
light; for these time units the corresponding predicate D is equal
to 1. Recall the E module Es1;h1 of Fig. 6, in particular the blocks
labeled Es1;h1(m1;0) and Es1;h1(m1;1). The timing interface
given to supplier s1 on host h1 can be interpreted as follows. The
task Analyzer may be executed at any time in the intervals (1,3)
and (5,7) ms (modulo 8ms, which is the period of the mode m1).
Furthermore, the 0ms-sample of the AudioSampler sensor value
may be sent at any time in the interval (0,1) ms, and the 4ms-
sample of the same sensor may be sent in (4,5) ms.
We assume that all hosts are clock-synchronized, so that com-
munication is performed according to the Time Division Multiple
Access (TDMA) protocol: in each time slot only one node is al-
lowed to send data while all other nodes can listen for data. We
have dened timing interfaces considering a simple communication
architecture, where each host has only one processor for both com-
putation and communication tasks. A host with an additional dedi-
cated communication processor, e.g., a node in the Time-Triggered
Architecture [1], can be modeled as two hosts.
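Figure 7. Timing interface for the program $G_A$ (rows $T_{s2,h2}$, $T_{s1,h1}$, and $T_{s3,h2}$ over a time axis from 0 to 8 ms; D marks computation slots and X marks communication slots)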
We next dene interface feasibility, a property needed for the
composition of SCC modules. First, we require that the timing
interface windows for the same resource but different suppliers
must be disjoint, i.e., at every time instant on each host at most
one supplier may execute a task, and at most one of the suppliers
may send a message. Second, when a host is supposed to receive
data, no task execution is allowed. In particular, for sensor port data
this is true in the latency time window (-window) after the data
is read, and for task output port data, in the -window before the
task termination time. Both properties are satised for the interface
shown in Fig. 7.
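Formally, a timing interface T = (D, X) is feasible for a Giotto
program G if the following two conditions are satisfied:
 • (Resource Sharing) For all modes $m \in \mathrm{Modes}$, suppliers
   $s_1, s_2 \in S$ (with $s_1 \neq s_2$), hosts $h_1, h_2 \in H$ (with $h_1 \neq h_2$),
   and times $\ell \in \{0, \ldots, \pi[m] - 1\}$,
   - at most one of $D^m_{s_1,h_1}(\ell)$, $D^m_{s_2,h_1}(\ell)$, $X^m_{s_1,h_1}(\ell)$, and $X^m_{s_2,h_1}(\ell)$
     is equal to 1, and
   - at most one of $X^m_{s_1,h_1}(\ell)$, $X^m_{s_2,h_1}(\ell)$, $X^m_{s_1,h_2}(\ell)$, and $X^m_{s_2,h_2}(\ell)$
     is equal to 1.
 • (Data Reception) For all modes $m \in \mathrm{Modes}$, units
   $k \in \{0, \ldots, \omega_{\max}[m] - 1\}$, ports $p \in \mathrm{SensePorts} \cup \mathrm{OutPorts}$,
   and times $\ell \in \mathbb{N}_0$, if either
   - $p \in \mathrm{senPorts}(m, k)$ and $k \cdot \gamma[m] \le \ell < k \cdot \gamma[m] + \delta$, or
   - $p \in \mathrm{taskOutPorts}(m, k+1)$ and $(k+1) \cdot \gamma[m] - \delta \le \ell < (k+1) \cdot \gamma[m]$,
   and if $X^m_{s_p,h_p}(\ell) = 1$, where $s_p$ and $h_p$ denote the supplier and the
   host to which p is allocated, then $D^m_{s,h}(\ell) = 0$ for each supplier
   $s \in S$ and host $h \in \mathrm{recHosts}(p)$.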
Given a Giotto program and a set of timing interfaces, one for each
supplier, host, and mode, the feasibility conditions can be checked
independently for each interface.
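The Resource Sharing condition can be sketched as a direct check over 0/1 predicate lists. This is an illustrative fragment, not the integrator's actual tool: the function name resource_sharing_ok and the dictionary layout are hypothetical, and the example predicates for suppliers s2 and s3 on host h2 are assumed values read off the S code of the audio example rather than taken from Fig. 7. The check counts all D and X predicates on a host together, which coincides with the pairwise condition whenever at least two suppliers exist.

```python
def resource_sharing_ok(interfaces, period):
    """interfaces maps (supplier, host) to a pair (D, X) of 0/1 lists of length `period`."""
    for t in range(period):
        busy_per_host = {}   # number of predicates equal to 1 on each host at time t
        senders = 0          # number of X predicates equal to 1 on the shared network
        for (_supplier, host), (D, X) in interfaces.items():
            busy_per_host[host] = busy_per_host.get(host, 0) + D[t] + X[t]
            senders += X[t]
        if senders > 1 or any(n > 1 for n in busy_per_host.values()):
            return False
    return True


# Assumed example interfaces for one mode with period 8 ms.
example = {
    ("s1", "h1"): ([0, 1, 1, 0, 0, 1, 1, 0], [1, 0, 0, 0, 1, 0, 0, 0]),
    ("s2", "h2"): ([0, 1, 0, 0, 0, 1, 0, 0], [0, 0, 0, 1, 0, 0, 0, 1]),
    ("s3", "h2"): ([0, 0, 1, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0, 0]),
}
```

A call resource_sharing_ok(example, 8) accepts these interfaces; letting s3 also compute at time 1 on host h2 would overlap with s2 and be rejected.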
Earliest-deadline-rst S code. Provided with the pattern of
task and message releases in an E code module Es;h, and available
time windows in a timing interface Ts;h, the supplier s generates
the schedule for host h, i.e., the order and timing of tasks and mes-
sages on h, and encodes it as an S code module Ss;h. We briey
explain a potential generation scheme for Ss;h. Even with the tim-
ing constraints imposed by Ts;h, it can be shown that the Earliest
Deadline First (EDF) strategy is an optimal strategy with respect to
schedule feasibility, i.e., if tasks and messages are schedulable in
Ts;h time windows by some strategy, then they are also schedula-
ble by the EDFstrategy. The release and deadline times of tasks and
messages to be implemented by a supplier s on a host h in mode m
are implicitly contained in the E code module Es;h. So, the supplier
s can always check the EDF strategy and, if feasible, generate the
S code module Ss;h according to the following scheme.
Let, for instance, an interval $[\ell_1, \ell_2) \subseteq [0, \pi[m])$, with integer
bounds $\ell_1, \ell_2 \in \mathbb{N}_0$, be a computation window of the timing
interface $T^m_{s,h}$, i.e., $D^m_{s,h}(\ell) = 1$ for all $\ell \in [\ell_1, \ell_2)$. Let
$t_1, t_2, \ldots, t_{|\mathrm{Tasks}_{s,h}|}$ be the EDF permutation of the tasks $\mathrm{Tasks}_{s,h}$ at
unit k of mode m (the task $t_1$ has the earliest deadline). The EDF
S code module $S_{s,h}$ has the following sequence of instructions:

  idle($\ell_1 - k \cdot \gamma[m]$)
  dispatch($t_1$, $\ell_2 - k \cdot \gamma[m]$)
  dispatch($t_2$, $\ell_2 - k \cdot \gamma[m]$)
  ...
  dispatch($t_{|\mathrm{Tasks}_{s,h}|}$, $\ell_2 - k \cdot \gamma[m]$)

The entire EDF S code module consists of such code segments for
each computation or communication slot of the timing interface.
Fig. 8 shows all EDF S code modules for the Giotto program $G_A$
which are generated using the timing interface of Fig. 7. Note that
these modules also contain invocations of task input drivers.

  S_{s1,h1}(m1,0):
    call(InDrv1)
    dispatch([MixPlayer], 1)
    idle(1)
    dispatch(Analyzer, 3)
  S_{s1,h1}(m1,1):
    dispatch([MixPlayer], 1)
    idle(1)
    dispatch(Analyzer, 3)

  S_{s2,h2}(m1,0):
    idle(1)
    call(InDrv2)
    dispatch(Mixer, 2)
    idle(3)
    dispatch([MixSound], 4)
  S_{s2,h2}(m1,1):
    idle(1)
    call(InDrv2)
    dispatch(Mixer, 2)
    idle(3)
    dispatch([MixSound], 4)

  S_{s3,h2}(m1,0):
    call(InDrv3)
    idle(2)
    dispatch(Generator, 3)
  S_{s3,h2}(m1,1):
    idle(2)
    dispatch(Generator, 3)

Figure 8. S code modules for the program $G_A$
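The EDF code segment for one computation window can be sketched as follows. This is a simplified illustration, not the prototype's generator: the function edf_segment and its argument layout are hypothetical, task deadlines are assumed to be given explicitly, and an idle instruction of length zero is omitted, which matches the listings of the audio example. Times are relative to the start of unit k, whose length is the mode unit time.

```python
def edf_segment(window, tasks, k, unit):
    """Emit S code instructions (as strings) for one computation window.

    window: (l1, l2), a half-open window [l1, l2) in absolute mode time
    tasks:  list of (name, deadline) pairs released in unit k
    k:      index of the current unit
    unit:   mode unit time (period divided by the number of units)
    """
    l1, l2 = window
    start = l1 - k * unit  # window start relative to the beginning of unit k
    end = l2 - k * unit    # window end relative to the beginning of unit k
    ordered = sorted(tasks, key=lambda task: task[1])  # earliest deadline first
    code = []
    if start > 0:
        code.append(f"idle({start})")
    for name, _deadline in ordered:
        code.append(f"dispatch({name}, {end})")
    return code
```

For example, edf_segment((6, 7), [("Generator", 8)], k=1, unit=4) produces idle(2) followed by dispatch(Generator, 3), the same instruction sequence as the second block of supplier s3 in the audio example.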
5. Implementation
Our test system consists of several off-the-shelf PC hosts with
200MHz Pentium Pro processors and 128MB RAM. All hosts are
equipped with standard 100Mbit Ethernet network cards and are
locally connected. The underlying operating system is RTLinux,
where standard Linux runs under the control of a real-time ker-
nel as the lowest priority task [18]. In contrast to Linux' fair time-
sharing scheduling, RTLinux uses a simple priority-based preemp-
tive scheduler, thus permitting real-time functions to operate in a
predictable and low-latency environment. In our tests the maximum
scheduling latency was about 30 µs.
Real-time communication is attained through a special network
driver [19] that precludes the standard Ethernet CSMA/CD proto-
col by establishing a TDMA-based time-triggered protocol, where
each node has exclusive access to the network within its scheduled
time slot. A software-based synchronization of the hosts is carried
out by controlling the period of a thread that performs send and
receive network operations. The control algorithm uses the arrival
times of incoming data packets. The communication cycle is shown
in Fig. 9. For the purposes of synchronization, one of the hosts is
designated as master and all others as clients. In each cycle the mas-
ter sends a sync packet with the id of the client that is supposed to
respond by sending a resync packet in the next slot. The subsequent
slots are reserved for each of the hosts to send actual data packets.
If $T_0$ is the duration of a single slot, and N is the number of hosts
operating under the time-triggered protocol, then the cycle repeats
after time $T_0 \cdot (N + 2)$.
In general, the protocol latency, i.e., the time between the send
call of the network driver and the arrival of the data packet, depends
on the time instant at which the call is made. However, the
driver provides a function that synchronizes the sending thread with
the network schedule, i.e., the driver resumes the thread when it
reaches the exclusive time slot to send a message. This mechanism
enables precise timing in the interpretation of the SCC instructions
(including message dispatch) with respect to the global
time. The distributed SCC virtual machine is built as a dynamically
loadable RTLinux kernel module. For the code of each supplier
the machine maintains a context data structure similar to the
non-distributed implementation described in [6]. To implement
distributed SCC correctly we make use of special RTLinux calls that
suspend and resume task threads.

Figure 9. Cycle of the communication protocol [19]
To test the virtual machine we implemented the audio application
$G_A$ through the distributed SCC program shown in Fig. 6
and 8. Note that in Fig. 8 each dispatch instruction with a task
(resp. message) as an argument executes in the computation (resp.
communication) slots shown in Fig. 7. In this setup each time slot
lasts $T_0 = 1$ ms, and an entire communication cycle lasts 4 ms
(N = 2). The maximum bandwidth available to each host in such a
configuration is 2.86 Mbit/s. The tests show that the sound card
is fed continuously with samples. The audio reproduced at
h1 plays without any noticeable interruption or other sound defects.
The estimated overhead of the network driver synchronization
thread is 25 µs. The overhead of the virtual machine, i.e., the
time it takes to go through the machine event loop with two trigger
and thread instances, is less than 12 µs (divided roughly equally
between E and S parts). Since the machine is invoked at 1 kHz, the
system overhead is about 3.7%. The actuator jitter is less than 2 µs,
since in Giotto a task output is written at the task termination time.
In these measurements we used the Pentium time stamp counter,
the most precise PC clock.
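The cycle length and the overhead figure follow from simple arithmetic, which the following sketch spells out. The variable names are illustrative only, and the 12 µs value is used as the stated upper bound on the virtual machine overhead, so the result is an approximation of the reported figure.

```python
# Communication cycle: one sync slot, one resync slot, and one data slot per host.
slot_ms = 1          # T0
num_hosts = 2        # N
cycle_ms = slot_ms * (num_hosts + 2)

# Per-invocation overhead: driver synchronization thread plus virtual machine event loop.
driver_overhead_us = 25
machine_overhead_us = 12          # stated upper bound
invocation_period_us = 1000       # the machine is invoked at 1 kHz
overhead_percent = 100 * (driver_overhead_us + machine_overhead_us) / invocation_period_us

print(f"cycle = {cycle_ms} ms, overhead = {overhead_percent:.1f}%")
```

The 37 µs spent per 1 ms invocation period accounts for the reported overhead of about 3.7%.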
6. Compositional SCC Analysis
We rst characterize the control-ow graphs of the distributed SCC
program that is compiled from a Giotto program G according to
the scheme presented in Section 3. The distributed SCC program is
then represented as a set of state-transition systems, one for each
supplier and host, which are used to verify the correctness of this
implementation of G.
6.1 Giotto-Generated Distributed SCC
We start by describing E and S code modules separately, and then
define the entire distributed SCC program. Let G be a Giotto program
with M modes. Let $g_{s,h}$ be equal to $|\mathrm{Tasks}_{s,h}| + |\mathrm{Msgs}_{s,h}| + |\mathrm{Drvs}_{s,h}|$,
i.e., $g_{s,h}$ represents the size of the program part that
is allocated to supplier s on host h. Let a node of a directed graph
without predecessor (resp. successor) be called a source (resp. sink)
node of the graph. A G-generated E module $E_{s,h}$ consists of a
directed acyclic control-flow graph $(V^E_{s,h}, E^E_{s,h})$, two edge-labeling
functions, and a node-labeling function. Each edge
$e \in E^E_{s,h}$ is labeled with an instruction and an argument,
and each node $v \in V^E_{s,h}$ is labeled with a pair (m, k) such
that m is a mode and k is a unit of m, i.e., $k \in \{0, \ldots, \omega_{\max}[m]\}$.
The graph $(V^E_{s,h}, E^E_{s,h})$ has the following properties:
 • Each path from a source to a sink consists of
   - a sequence of $O(g_{s,h})$ edges, each labeled with a call instruction
     whose argument is a driver from $\mathrm{Drvs}_{s,h}$, followed by
   - a sequence of $O(g_{s,h})$ edges, each labeled with a release
     instruction whose argument is a task or message
     from $\mathrm{Tasks}_{s,h} \cup \mathrm{Msgs}_{s,h}$, and followed by
   - a single edge labeled with a future instruction whose argument
     is a pair of a delay and a node $v'$; it marks the source $v'$ of $V^E_{s,h}$ for
     execution after the delay, which is a number in $\mathbb{N}_{>0}$ of time units.
 • For each mode $m \in \mathrm{Modes}$ and each unit $k \in \{0, \ldots, \omega_{\max}[m]\}$
   there exists
   - exactly one source node v labeled with (m, k), and
   - at most one node v labeled with (m, k) such that v has
     more than one successor; such a node v has less than M
     successors.
Let all numbers in G, i.e., mode periods as well as task and actuator
frequencies and $\omega_{\max}[m]$, be bounded by n. For instance, for the
Giotto program $G_A$, the largest constant n is equal to 8. The
number of sources of $(V^E_{s,h}, E^E_{s,h})$ is $O(M \cdot n)$, and the number
of sinks is $O(M^2 \cdot n)$. Assuming, for simplicity, that the number
M of modes is bounded, the size of $V^E_{s,h}$ is $O(g_{s,h} \cdot n)$.
A G-generated S module $S_{s,h}$ consists of a directed control-flow
graph $(V^S_{s,h}, E^S_{s,h})$, two node-labeling functions (an instruction
and an argument), and an edge-labeling function. We require that the
graph $(V^S_{s,h}, E^S_{s,h})$ consists of chains of total length $O(g_{s,h} \cdot n)$.
Each control location $u \in V^S_{s,h}$ is labeled in one of the following ways:
 • The instruction of u is dispatch, its argument is in
   $\mathrm{Tasks}_{s,h} \cup \mathrm{Msgs}_{s,h}$, and u has a successor $u'$ such that the
   label of the edge $(u, u')$ is a timeout in $\mathbb{N}_{>0}$. If the argument is a
   task, then the execution of u dispatches that task.
   Control proceeds to $u'$ if the task completes or as soon as the
   timeout of $(u, u')$ has elapsed, counted from the time at which the
   thread with this control location was created. If the argument is a
   message, then the analogous explanation holds for the transmission
   of the message.
 • The instruction of u is idle and u has a successor $u'$ such that the
   label of the edge $(u, u')$ is a timeout in $\mathbb{N}_{>0}$. The execution of u
   idles the processor h until that many time units have passed
   from the time of thread creation.
 • The instruction of u is call and u has a successor $u'$ such that the
   label of the edge $(u, u')$ is a driver in $\mathrm{Drvs}_{s,h}$. The execution
   of u calls that driver.
 • u carries a distinguished terminal label and has no successor; this
   indicates thread termination.
A G-generated SCC module $P_{s,h}$ for a supplier s and a host h
consists of a G-generated E module $E_{s,h}$, a G-generated S module
$S_{s,h}$, and an annotation function that maps each sink of the
control graph of $E_{s,h}$ to a node in the control graph of $S_{s,h}$. When
the E code execution arrives at a sink v, this creates a new thread
of S code, which starts at the control location that the annotation
function assigns to v. Let $V^E_h$ be the
union of the node sets $V^E_{s,h}$ over all suppliers $s \in S$, i.e., the set of
all E code control locations on host h. Each annotation function maps a
sink node $v' \in V^E_{s,h}$ to a source node of $V^S_{s,h}$ such that
if the edge $(v, v') \in E^E_{s,h}$ is labeled with a future instruction whose
argument contains the delay $\ell$,
then the chain in $(V^S_{s,h}, E^S_{s,h})$ that starts from the node assigned to $v'$
does not contain numbers, i.e., clock timeouts in dispatch and
idle instructions, larger than $\ell$. According to the last condition,
if the next E code instruction is executed after $\ell$ time units, then
the chain of S code instructions describes the schedule for at most
the next $\ell$ time units. Note that if G is a single-mode program,
then both $(V^E_{s,h}, E^E_{s,h})$ and $(V^S_{s,h}, E^S_{s,h})$ consist of chains of size
$O(g_{s,h})$. Lastly, a G-generated distributed SCC program P over a
set S of suppliers and a set H of hosts is a function that assigns to
each $s \in S$ and each $h \in H$ a G-generated SCC module $P_{s,h}$ for
the supplier s and the host h.
Transition-system semantics. A state of a G-generated distributed
SCC program P consists of a port valuation function r that
maps each port in $\mathrm{Ports}_P$ to a value of the appropriate type, a
program counter function v that assigns to each host $h \in H$ a control
node $v_h \in V^E_h$, a status function $c : \mathrm{Tasks} \cup \mathrm{Msgs} \to \mathbb{N}_0 \cup \{\bot\}$,
a trigger function that assigns to each host $h \in H$ a queue
of future invocations, i.e., a finite sequence over $\mathbb{N}_0 \times V^E_h$,
and a thread function that assigns to each host $h \in H$ a set of threads.
Each thread $(u, \tau)$ consists of a program counter $u \in V^S_h$ and a
number $\tau \in \mathbb{N}_0$ of time units for which the thread has been executed.
For each task $t \in \mathrm{Tasks}$, the status $c(t) \in \mathbb{N}_0$ indicates that t has
been released and executed for $c(t) \ge 0$ time units; the status
$c(t) = \bot$ indicates that t has been completed (or not yet released).
For a message in Msgs, the status is defined analogously for the
message release and transmission.
The appendix presents the semantics of a distributed SCC program
P by defining a transition system on the space of states of P.
Each transition represents either the execution of an E or S code
instruction on one of the hosts, or a time step. A series of E tran-
sitions corresponding to a block of E code instructions are taken
when a trigger becomes true. A completion S transition is taken
when a task or message completes; a timeout S transition, when
a timeout on a dispatch or idle instruction becomes true; and a
transient S transition, when an S code call instruction is executed.
For a given initial state $q_0$, a trace of the distributed SCC
program P is an infinite sequence $q_0, q_1, \ldots$ of states of P such
that for all $i \in \mathbb{N}_0$, there exists a transition from $q_i$ to $q_{i+1}$. Let
$w_{s,h} : \mathrm{Tasks}_{s,h} \cup \mathrm{Msgs}_{s,h} \to \mathbb{N}_{>0}$ be the worst-case execution
or transmission time (wcet) function for the tasks and messages
of supplier $s \in S$ on host $h \in H$, and let w be the set of such
functions for all suppliers and all hosts. A trace of P is a w-trace if
for each supplier $s \in S$, host $h \in H$, and each invocation of a task
or message $x \in \mathrm{Tasks}_{s,h} \cup \mathrm{Msgs}_{s,h}$, the invocation of x completes
execution (transmission) within time $w_{s,h}(x)$.
6.2 Interface Compliance and Time Safety
For the compositional analysis of a distributed SCC program we
need the following two properties. Let G be a (multi-mode) Giotto
program, let $T_{s,h}$ be a timing interface for a supplier s and a host
h, let $P_{s,h}$ be the G-generated SCC module, and let $w_{s,h}$ be a
wcet function. The module $P_{s,h}$ interface-complies with $T_{s,h}$ if all
dispatch instructions of $P_{s,h}$ execute in time intervals provided
by $T_{s,h}$. In our example each SCC module $P_{s,h}$ defined by the
E and S code blocks in Fig. 6 and 8 interface-complies with the
timing interface $T_{s,h}$ shown in Fig. 7, because the S code in Fig. 8
was generated as EDF S code with respect to this interface.
The module $P_{s,h}$ is time-safe if (1) no driver reads from output
ports of a task (resp. message) assigned to supplier s on host
h before it completes execution (resp. transmission), and (2) no
driver writes to input ports of a task (resp. message) after it starts
execution (resp. transmission). This requirement ensures that all
task release and termination times of the original Giotto program
are maintained [10]. Let, for instance, the worst-case execution
(resp. transmission) times of all tasks (resp. messages) be 1 ms.
Each SCC module $P_{s,h}$ defined by the E and S code blocks in Fig. 6
and 8 is time-safe. For example, in $P_{s2,h2}$, the input ports of the task
Mixer are written at time 1 ms (InDrv2 driver), its output ports are
read at 4 ms (copy[MixSound] driver), and the task starts execution
at 1 ms, but completes before 2 ms.
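The two time-safety conditions can be sketched as a predicate over a single driver call. This is an illustration under assumptions, not the checker of the prototype: the function name and the port name mixer_in are hypothetical, and the status of a task or message is encoded as in the transition-system semantics, with None standing for "completed or not yet released" and a number for the time units executed since the release. The exact status tests follow the formal definition of this section: a driver may read output ports of x only if x has completed, and may write input ports of x only if x is released and has not yet executed.

```python
COMPLETED_OR_UNRELEASED = None  # encodes the status "completed (or not yet released)"


def violates_time_safety(src, dst, out_x, in_x, status_x):
    """Return True if calling a driver is unsafe with respect to a task or message x.

    src, dst:    source and destination ports of the driver
    out_x, in_x: output and input ports of x
    status_x:    None, or the number of time units x has executed since its release
    """
    reads_output_too_early = (
        bool(set(src) & set(out_x)) and status_x is not COMPLETED_OR_UNRELEASED
    )
    writes_input_too_late = bool(set(dst) & set(in_x)) and status_x != 0
    return reads_output_too_early or writes_input_too_late
```

In the Mixer example, the copy[MixSound] driver reading MixSound after the Mixer has completed is safe, whereas the same read while the Mixer has executed for 1 time unit would violate time safety.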
We now give the formal definitions of interface compliance
and time safety as safety properties, so that it becomes clear how
to check them. A state of a distributed SCC program P with a
program counter function v and a thread function violates interface
compliance with $T_{s,h} = (D_{s,h}, X_{s,h})$ if there exists a thread
$(u, \tau)$ on host h, with program counter u and elapsed time $\tau$, such
that the instruction of u is dispatch, $v_h$ is labeled with (m, k), and
either (1) the argument of u is in $\mathrm{Tasks}_{s,h}$ and $D^m_{s,h}(k \cdot \gamma[m] + \tau) = 0$, or
(2) the argument of u is in $\mathrm{Msgs}_{s,h}$ and $X^m_{s,h}(k \cdot \gamma[m] + \tau) = 0$. We say that
$(P_{s,h}, w_{s,h})$ interface-complies with $T_{s,h}$ if for all $w_{s,h}$-traces of
$\{P_{s,h}\}$, no state of the trace violates interface compliance with $T_{s,h}$.
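The violation condition for a single state can be sketched as a lookup into the interface predicates. The function below is an illustration only; its name and arguments are hypothetical, the predicates are assumed to be 0/1 lists over one mode period, and the absolute time computed from the unit index, the mode unit time, and the thread's elapsed time is assumed to stay within the period.

```python
def violates_interface_compliance(kind, D, X, k, unit, elapsed):
    """Check one dispatching thread against a timing interface.

    kind:    "task" or "message", the kind of the dispatched argument
    D, X:    computation and communication predicates as 0/1 lists
    k:       unit index of the current E code control location
    unit:    mode unit time
    elapsed: time units for which the thread has been executed
    """
    t = k * unit + elapsed
    predicate = D if kind == "task" else X
    return predicate[t] == 0


# Computation predicate of supplier s1 on host h1 in the audio example:
# the task Analyzer may run in [1,3) and [5,7) of the 8 ms period.
D_s1_h1 = [0, 1, 1, 0, 0, 1, 1, 0]
X_s1_h1 = [1, 0, 0, 0, 1, 0, 0, 0]
```

With a mode unit time of 4 ms, a thread of unit 1 that dispatches Analyzer after 1 ms runs at absolute time 5 and complies, while dispatching it after 3 ms would fall outside the window.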
A state of a distributed SCC program P with a program counter
function v, status function c, and a thread function violates time
safety on (s, h) if there exists a task or message
$x \in \mathrm{Tasks}_{s,h} \cup \mathrm{Msgs}_{s,h}$ such that either (a) $v_h$ has a successor $v'_h$ and the edge
$(v_h, v'_h)$ is labeled with a call instruction whose argument is a driver d
(E code driver), or (b) there exists a thread on host h whose program
counter u has the instruction call, u has a successor $u'$,
and the edge $(u, u')$ is labeled with a driver d (S code driver), and one of
the following holds: (1) $\mathrm{Src}[d] \cap \mathrm{Out}[x] \neq \emptyset$ and $c(x) \neq \bot$, or
(2) $\mathrm{Dst}[d] \cap \mathrm{In}[x] \neq \emptyset$ and $c(x) \neq 0$. We say that
$(P_{s,h}, w_{s,h})$ is time-safe if for all $w_{s,h}$-traces
of $\{P_{s,h}\}$, no state of the trace violates time safety on (s, h).

Figure 10. Graph related to $P_{s,h}$ for $G_A$ with additional mode m2 (nodes (m1,0), (m1,1), (m2,0), (m2,1), (m2,2), and (m2,3))
Checking interface compliance and time safety. The paper [4]
discusses time safety checking for single-mode, single-CPU Giotto
programs. These results are here generalized to both the distributed
and multi-mode settings. For distributed single-mode programs G
we give pseudo-polynomial algorithms for checking the interface
compliance and time safety of each G-generated SCC module.
For distributed multi-mode programs the checks are sufficient. For
details and proofs the reader is referred to [20]. Let a G-generated
SCC module be given as a G-generated E module $E_{s,h}$, a
G-generated S module $S_{s,h}$, and an annotation function. We first
construct a directed graph $P_{s,h}$ by connecting the control graphs
of $E_{s,h}$ and $S_{s,h}$ through edges from each sink of $V^E_{s,h}$ (resp. $V^S_{s,h}$)
to a source of $V^S_{s,h}$ (resp. $V^E_{s,h}$) determined by the annotation function
and the control flow of $E_{s,h}$. It can be shown that each graph $P_{s,h}$ is acyclic
even if G is a multi-mode program [20]. For instance, consider the
Giotto program $G_A$ with the original mode m1 and the additional
mode m2 given in Fig. 3, in which the Mixer task is invoked every
2 ms. Fig. 10 shows a graph in which each edge abstracts a chain
of $O(g_{s,h})$ edges of the graph $P_{s,h}$.
We next construct a state-transition graph by annotating each
node of the graph $P_{s,h}$ with a particular state of the SCC module
$P_{s,h}$. The graph $P_{s,h}$ is acyclic, so the nodes can be sorted and
processed in topological order. Each source node of $P_{s,h}$ (for
each mode there is exactly one such node) is annotated with the
state in which the trigger queue and thread set are empty and the
status function maps each $x \in \mathrm{Tasks}_{s,h} \cup \mathrm{Msgs}_{s,h}$ to $\bot$ (recall
that $c(x) = \bot$ means that x has not yet been released). For the
other nodes of $P_{s,h}$ we proceed by transforming the state of their
immediate predecessors. We do so by performing one or more
transition steps defined by the semantics of SCC programs (App.
A). Task execution-time nondeterminism in time transition steps is
eliminated by assuming that each task (or message) x completes
exactly after the time given by the wcet $w_{s,h}(x)$. If a node v has
more than one predecessor $v'$, then the status function value at node
v, for each $x \in \mathrm{Tasks}_{s,h} \cup \mathrm{Msgs}_{s,h}$, is the least value among the
status function values for x at all predecessors $v'$. So, for the nodes
with more than one incoming edge, we compute the task execution
time pointwise and conservatively.
Checking the states of the graph $P_{s,h}$ offers a sufficient condition
for time safety and interface compliance of all executions of the
distributed SCC module $P_{s,h}$. If no state of the graph $P_{s,h}$ violates
time safety and interface compliance, then the G-generated SCC
module $(P_{s,h}, w_{s,h})$ interface-complies with $T_{s,h}$ and is time-safe.
If this is not the case then, for a general multi-mode Giotto program
G, we cannot conclude that the SCC module $(P_{s,h}, w_{s,h})$ does
not interface-comply with $T_{s,h}$ (or is not time-safe). This is because
in the state construction of $P_{s,h}$ different incoming edges
of a node may impose conservative approximations on different
tasks. Also, there may be unreachable modes [10]. However, if G
is a single-mode program, then the state-transition graph $P_{s,h}$ is a
chain. So, if $P_{s,h}$ does not interface-comply or is not time-safe at
some state q, then the trace along the chain up to q is a counterexample.
The size of $P_{s,h}$ is $O(g_{s,h} \cdot n)$, because both $(V^E_{s,h}, E^E_{s,h})$
and $(V^S_{s,h}, E^S_{s,h})$ are of the same size. Constructing the transition
graph $P_{s,h}$, annotating it with states, and checking its states can be
done in $O(g_{s,h} \cdot n)$ time. Therefore, we have the following theorem.
THEOREM 1. Let G be a single-mode² Giotto program with all
numbers bounded by n. Let $g_{s,h}$ and $T_{s,h}$ be the size of the part
of G and the timing interface assigned to supplier s on host h. Let
$P_{s,h}$ and $w_{s,h}$ be the G-generated SCC module and wcet function
for supplier s on host h. It can be checked in time $O(g_{s,h} \cdot n)$
whether $(P_{s,h}, w_{s,h})$ interface-complies with $T_{s,h}$ and is time-safe.
6.3 Distributed Code Generation Correctness
We show that LET semantics of a Giotto program is preserved by
the distributed SCC program generated according to Alg. 1 if each
SCC module satises interface compliance and time safety. If an
SCC program preserves the LET semantics of a Giotto program we
say that it implements the Giotto program.
Let G be a Giotto program, let $T = \{T_{s,h} \mid s \in S \text{ and } h \in H\}$
be a feasible interface for G, let $P = \{P_{s,h} \mid s \in S \text{ and } h \in H\}$
be a G-generated distributed SCC program, and let
$w = \{w_{s,h} \mid s \in S \text{ and } h \in H\}$ be a wcet function for P. Let $r^G_\ell$ and $r^P_\ell$
be the port valuation functions at time $\ell \in \mathbb{N}_0$ for G and P [3].
A trace of P and a trace of G are input-compatible (resp.
output-compatible) if they have the same sensor (resp. actuator) port values
at the same times, i.e., $r^G_\ell(p) = r^P_\ell(p)$ for each $p \in \mathrm{SensePorts}$
(resp. $p \in \mathrm{ActPorts}$) and each time instant $\ell \in \mathbb{N}_0$. The pair (P, w)
implements the Giotto program G if for every w-trace of P and
every trace of G, input-compatibility implies output-compatibility
(i.e., if, for all sensor inputs, they produce the same actuator outputs
at the same times). The pair (P, w) interface-complies to T if for
each supplier $s \in S$ and host $h \in H$, the G-generated SCC module
$(P_{s,h}, w_{s,h})$ interface-complies with $T_{s,h}$. We say that (P, w) is
time-safe if $(P_{s,h}, w_{s,h})$ is time-safe for each $s \in S$ and $h \in H$.
THEOREM 2. Let G be a Giotto program, let T be a feasible
timing interface for G, let P be the distributed SCC program
G-generated according to Alg. 1, and let w be a wcet function.
If (P,w) interface-complies to T and is time-safe, then (P,w)
implements G.
For the proof of this theorem we refer to [20]. Instead, we give an
informal explanation of why interface feasibility, interface compliance,
and time safety ensure correctness of the implementation. If
interface feasibility is violated, e.g., the time windows on a host
are not disjoint, then even if each supplier produces interface-compliant
and time-safe code, the host may be overloaded and miss deadlines
defined by the LET semantics. A similar outcome is possible if the
interface is feasible and each supplier on each host generates an
SCC module that is individually time-safe but ignores the interface.
Lastly, if a module does not satisfy one of the time-safety
conditions, e.g., a time slot in the interface is not sufficiently large,
then a task or message invocation may result in incorrect output.
The compositional nature of interface compliance and time safety
of (P, w) ensures that if, for some supplier s and host h, one module
$P_{s,h}$ is modified, then for P to implement G it suffices to check
whether $(P_{s,h}, w_{s,h})$ interface-complies with $T_{s,h}$ and whether
it is time-safe.
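The disjointness example above can be made concrete. The sketch below assumes a hypothetical representation in which each supplier's share of one host is a list of half-open time windows `(start, end)` with `start <= end`. It checks only this one necessary condition, not interface feasibility as a whole.

```python
from typing import Dict, List, Tuple

Window = Tuple[int, int]  # assumed half-open interval [start, end)


def windows_disjoint(windows_by_supplier: Dict[str, List[Window]]) -> bool:
    """True if no two windows assigned on the same host overlap."""
    # Sort all windows by start time; overlap can then only occur
    # between neighbours in the sorted order.
    flat = sorted(w for ws in windows_by_supplier.values() for w in ws)
    return all(prev_end <= start
               for (_, prev_end), (start, _) in zip(flat, flat[1:]))
```

For instance, windows `(0, 2)` and `(2, 3)` for two suppliers are accepted, while `(0, 2)` and `(1, 3)` are rejected because both suppliers would claim the host during `[1, 2)`.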
Combining Theorems 1 and 2, we have the following.
COROLLARY 1. Let G be a single-mode$^2$ Giotto program of size g
with all numbers bounded by n. It can be checked in time $O(g \cdot n)$
if (P, w) implements G. Moreover, if $(P_{s,h}, w_{s,h})$ is modified for
a single supplier s and host h, then it can be checked in time
$O(g_{s,h} \cdot n)$ if (P, w) still implements G.
$^2$ For multi-mode Giotto, the pseudo-polynomial check is only sufficient
but not necessary.
Note that $(P_{s,h}, w_{s,h})$ can be modified by modifying $E_{s,h}$
(i.e., modifying task invocation and/or environment interaction),
$S_{s,h}$ (schedule), or $w_{s,h}$ (wcet). Suppose that in the audio example
the integrator wants to assign additional functionality to supplier $s_3$
on host $h_2$, say, mixing in another synthesized sound with a pitch
twice as high. Supplier $s_3$ implements a new task Generator2
(with twice the frequency) with input driver InDrv4, and
modifies the S module $S_{s_3,h_2}$ as shown below. Then, for correctness
of the entire program P, only the modified module $P_{s_3,h_2}$ needs to
be checked for interface compliance and time safety.
$S_{s_3,h_2}(m_1, 0)$:
    call(InDrv3)
    call(InDrv4)
    idle(2)
    dispatch(Generator2, 3)
    dispatch(Generator, 3)
$S_{s_3,h_2}(m_1, 1)$:
    call(InDrv4)
    idle(2)
    dispatch(Generator2, 3)
    dispatch(Generator, 3)
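The incremental re-check described above might be organized as follows on the integrator's side. This is a sketch under stated assumptions: the class `Integrator`, its method `submit`, and the callable `check` are hypothetical names, and `check` stands in for the per-module interface-compliance and time-safety test, which is not implemented here.

```python
from typing import Callable, Dict, Hashable, Tuple

Key = Tuple[str, str]  # (supplier, host)


class Integrator:
    """Caches one verdict per module so a change re-checks only that module."""

    def __init__(self, check: Callable[[Key, Hashable], bool]):
        self._check = check            # hypothetical per-module check
        self._ok: Dict[Key, bool] = {}
        self.checks_run = 0            # counts per-module checks performed

    def submit(self, key: Key, module: Hashable) -> bool:
        """Check the submitted module alone; return the verdict over all
        modules submitted so far (meaningful once every module is in)."""
        self._ok[key] = self._check(key, module)
        self.checks_run += 1
        return all(self._ok.values())
```

In a toy setting where a module is summarized by a single demand value and `check` compares it with a per-module budget, replacing the module of $s_3$ on $h_2$ triggers exactly one additional check, while the cached verdicts of all other modules are reused.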
7. Conclusion
We introduced timing interfaces and showed how they can be used
to distribute the code generation for Giotto programs and dis-
tributed target platforms. The integration of the individually com-
piled components is performed by individually checking the in-
terface compliance and time safety of each component. Our ap-
proach guarantees global timing requirements without solving a
global scheduling problem: as part of the continuing effort of the
Giotto project to trade performance for predictability and compos-
ability, the burden is shifted to the generation of timing interfaces.
There are related efforts [12, 13, 21]; how such interfaces can be
optimized for different criteria is a topic for future research.
References
[1] H. Kopetz. Real-Time Systems: Design Principles for Distributed
Embedded Applications. Kluwer, 1997.
[2] http://www.exray-group.com; http://www.autosar.org.
[3] T.A. Henzinger, B. Horowitz, and C.M. Kirsch. Giotto: a time-
triggered language for embedded programming. In Proc. IEEE 91,
pp. 8499, 2003.
[4] T.A. Henzinger, C.M. Kirsch, and S. Matic. Schedule-carrying code.
In Proc. EMSOFT, LNCS 2855, pp. 241256, Springer, 2003.
[5] T.A. Henzinger and C.M. Kirsch. The Embedded Machine:
predictable, portable real-time code. In Proc. PLDI, pp. 315326,
ACM, 2002.
[6] C.M. Kirsch, M.A.A. Sanvido, and T.A. Henzinger. A programmable
microkernel for real-time systems. In Proc. VEE, ACM, 2005.
[7] C.M. Kirsch, M.A.A. Sanvido, T.A. Henzinger, and W. Pree. A
Giotto-based helicopter control system. In Proc. EMSOFT, LNCS
2491, pp. 46-60, Springer, 2002.
[8] N. Halbwachs. Synchronous Programming of Reactive Systems.
Kluwer, 1993.
[9] A. Benveniste, L.P. Carloni, P. Caspi, and A.L.
Sangiovanni-Vincentelli. Heterogeneous reactive systems modeling and
correct-by-construction deployment. In Proc. EMSOFT, LNCS 2855,
pp. 35-50, Springer, 2003.
[10] T.A. Henzinger, C.M. Kirsch, R. Majumdar, and S. Matic.
Time-safety checking for embedded programs. In Proc. EMSOFT, LNCS
2491, pp. 76-90, Springer, 2002.
[11] P. Caspi, et al. From Simulink to SCADE/Lustre to TTA: a layered
approach for distributed embedded applications. In Proc. LCTES, pp.
153-162, ACM, 2003.
[12] A. Mok and X. Feng. Real-time virtual resource: a timely abstraction
for embedded systems. In Proc. EMSOFT, LNCS 2491, pp. 182-196,
Springer, 2002.
[13] I. Shin and I. Lee. Periodic resource model for compositional
real-time guarantees. In Proc. RTSS, pp. 2-13, IEEE, 2003.
[14] H. Kopetz and N. Suri. Compositional design of real-time systems: a
conceptual basis for the specification of linking interfaces. In Proc.
ISORC, pp. 51-60, 2003.
[15] J. Rushby. Partitioning in avionics architectures: requirements,
mechanisms, and assurance. In NASA Contractor Report 209347,
SRI International, 1999.
[16] B. Hardung, T. Koelzow, and A. Krueger. Reuse of software in
distributed embedded automotive systems. In Proc. EMSOFT, pp.
203-210, ACM, 2004.
[17] K. Karplus and A. Strong. Digital synthesis of plucked-string and
drum timbres. In Computer Music Journal 7, pp. 43-55, 1983.
[18] V. Yodaiken. RTLinux Manifesto. In Proc. LinuxExpo, 1999.
[19] S. Lankes, A. Jabs, and M. Reke. A time-triggered Ethernet protocol
for real-time CORBA. In Proc. ISORC, pp. 215-222, 2002.
[20] T.A. Henzinger and S. Matic. Distributed Schedule-Carrying Code.
Tech. Rep. UCB/CSD-04-1360, 2004.
[21] S. Shigero, M. Takashi, and H. Kei. On the schedulability conditions
on partial time slots. In Proc. RTCSA, pp. 166-173, IEEE, 1999.
Appendix A. Formal Distributed SCC Semantics
In [4] we give an operational semantics of schedule-carrying code by
defining a state-transition system in which all port values are abstracted
away. Here we are interested in the input-output behavior of distributed
SCC, so we extend the formalism by taking into account port values and the
distributed nature of code. We present the interleaving semantics for SCC
modules of all suppliers on all hosts. To use the same notation for messages
as for tasks, let the message input ports In[[p]] formally be $\{p\}$, let the
message output ports Out[[p]] be $\{p_h \mid h \in recHosts(p)\}$, and let
the message function task[[p]] be the identity function from the message
input to output ports. A state q = (r;v;c;;) has a transition to a state
q0 = (r0;v0;c0;0;0) if one of the following is true:
Completion S transition The state q is completion enabling, that is, there
exist a host h 2 H and a thread (u;) 2 h such that c((u)) = ?
and (u) = dispatch. Let the successor of u be u0. Then r0 = r ex-
cept that r0(Out[(u)]) = task[(u)](r(In[(u)])), (v0;c0;0) =
(v;c;), and 0 =  except that 0
h = (hnf(u;)g) [ f(u0;)g.
Transient S transition The state q is not completion enabling but transient
enabling, that is, there exist a host h 2 H and a thread (u;) 2 h
such that (u) = call. Let the successor of u be u0. Then r0 = r
except that r0(Dst[(u;u0)]) = drv[(u;u0)](r(Src[(u;u0)])),
(v0;c0;0) = (v;c;), and 0 =  except that 0
h = (hnf(u;)g) [
f(u0;)g.
E transition The state q is neither completion nor transient enabling but E
enabling, that is, there exists a host h 2 H and either (1) vh has no
successor and (0;) 2 h, or (2) vh has a successor v0
h. In case (1) let
(0;  v)bethe rstsuch pair inh.Thenp = p0,v0 = v except that v0
h =
 v, c0 = c, 0 =  except that 0
h = h n f(0;  v)g, and 0 = . In case
(2) one of the following: (a) (vh;v0
h) = call and r0 = r except that
r0(Dst[(vh;v0
h)]) = drv[(vh;v0
h)](r(Src[(vh;v0
h)])), c0 = c,
and 0 = ; (b) (vh;v0
h) = release and r0 = r, c0 = c except that
c0((vh;v0
h)) = 0, 0 = ; or (c) (vh;v0
h) = future and r = r0,
c0 = c, and 0 =  except that 0
h = h  f(vh;v0
h)g. In all three
cases,if v0
h isa sink,then 0 =  except that 0
h = h[f(h(v0
h);0)g;
if v0
h is not a sink, then 0 = .
Timeout S transition The state q is neither completion nor transient nor E
enabling but timeout enabling, that is, there exist a host h 2 H and a
thread (u;) 2 h such that (u) 2 fdispatch;idleg, the successor
of u is u0, (u;u0) 2
￿ 0, and (u;u0)  . Then (r0;v0;c;0) =
(r;v;c;), and  = 0 except that 0
h = (hnf(u;)g) [ f(u0;)g.
Time transition The state q is neither completion nor transient nor E nor
timeout enabling. Then r0(p) = r(p) for all p 2 PortsP n fpcg, and
r0(pc) = r(pc) + 1. For ` = r(pc), we call the function r` = r the
port valuation at time `. For each h 2 H, let Xh = fx j (u;) 2
h;(u) = dispatch;(u) = xg, and let  xh 2 Xh be the task or
message to be executed on h. Then v0 = v; the queue 0
h results from
h by replacing each trigger binding (;u) by (   1;u); the thread
set 0
h results from h by replacing each thread (u;) by (u; + 1); if
x 2 Taskss;h [ Msgss;h for some s 2 S, then c0(x) = c(x) + 1 or
c0(x) = ? if x =  xh,and c0(x) = c(x)if x 6=  xh.In case c0(x) = ?
we say that on the transition (q;q0), the task or message x completes
after execution time c(x) + 1.