Software in real-time embedded systems differs fundamentally from its desktop or Internet counterparts. Embedded computing is not simply computation on small devices. In most control applications, for example, embedded software engages the physical world. It reacts to physical and user-interaction events, performs computation on limited and competing resources, and produces results that further impact the environment. Of necessity, it acquires some properties of the physical world, most particularly, time.
gible tasks are competing for resources, the operating system must choose to grant the resources to one of them, and as a consequence, that task finishes sooner.
The process of choosing to which task to grant resources is called real-time scheduling. A typical strategy is to assign priorities to tasks and to execute the highest priority task that is eligible. Intuitively, priorities represent the relative importance of tasks at run time. Priorities can be statically assigned to tasks at design time, or they may be dynamically determined at run time. In today's embedded systems, given the run-time overhead of computing priorities, it is common to fulfill timing constraints by statically assigning priorities among the tasks.
Another powerful concept for dealing with timing properties is the notion of preemption, which is to pause a running task, say A, and execute another task, say B. By doing so, task B, although it is activated later than task A, can finish before task A finishes. Obviously, task B must have a higher priority to preempt task A.
Consider the example shown in Fig. 1 , where a Controller and a Supervisor are implemented on the same computer. Suppose that the controller is triggered by periodic samples, say, every 2 ms, and for each sampling input, the controller produces an output with a delay. For now, let's assume 4 of 30 the delay is fixed, say, 1 ms, from activation to activation. The supervisor task is only triggered once in a while (e.g., every second), and it takes a long time (>> 2 ms) to compute a new set of parameters to update the control algorithm. Since both tasks are implemented on the same embedded system, they share the computing resources. Assume it is unacceptable for the controller to stop producing any output for a long time, if the supervisor task keeps running,. A preemptive strategy is shown in Fig. 2 , where the controller task preempts the long-run supervisor task. In this figure, each box represents the execution of a task. The nonshadowed parts within the execution of the supervisor task indicate that its execution is preempted by the controller task. . . . . . .
of 30

REAL-TIME SCHEDULING
How to assign priorities to multiple tasks is a key part of the real-time scheduling problem.
This has been an active research area for more than 20 years, starting with the seminal work by Liu and Layland [4] . The goal of real-time scheduling is to devise a set of priority assignment rules to make sure that all tasks are finished before their deadlines.
Real-time scheduling algorithms typically make some assumptions about tasks and resources.
For example, Liu and Layland's original work makes the following assumptions:
• a single and arbitrarily preemptable resource (CPU);
• independent tasks;
• fixed and known task execution times;
• periodic task triggers (for rate monotonic scheduling);
• runability constraints −− each task must be completed before receiving the next trigger.
Under these assumptions, Liu and Layland derived rate-monotonic (RM) and earliest-deadline-first (EDF) scheduling policies. These algorithms are shown to be optimal (in terms of CPU utilization) for static and dynamic priority assignments, respectively. Further work in this area has developed more sophisticated timing analysis theories and has relaxed many assumptions in the original algorithms [5] - [9] . Most of these still rely on knowing all tasks's worst case execution time (WCET) and having arbitrary preemptability. When multiple resources are being managed, optimal scheduling becomes NP-hard [10] , so most methods deal with only one shared resource.
In reality, many of the assumptions in scheduling theories do not hold. Tasks may require multiple resources to execute, and they can be strongly coupled. For instance, in our controller/ supervisor example, the supervisor updates the controller's parameters by directly accessing them. Suppose that the lower priority supervisor task is writing to a set of parameters in the con-troller, and at the same time, the higher priority controller task is activated. The controller task cannot start immediately since its parameters may not be consistent. To ensure the integrity of the parameters, the controller can only preempt the supervisor after the update operations are finished. We call this situation partial preemptability. Partial preemptability not only complicates timing analysis in the sense that the response time of the controller is lengthened, it may also cause more serious problems, such as priority inversion.
PRIORITY INVERSION
The intuition behind assigning priorities to tasks is to prioritize resource utilization and obtain fast response for critical tasks. However, because of the partial preemptability of some tasks, blindly following the priority assignment and triggering high-priority tasks may cause a high-priority task to be blocked by low-priority tasks indefinitely.
Consider the following situation in our previous example, where there is one more task, say, fault diagnosis, which is independent of the controller and the supervisor, running on the same embedded computer. Suppose that the fault diagnosis task has an intermediate priority that is higher than that of the supervisor but lower than that of the controller. Suppose in addition that at some point, when the controller is blocked waiting for the supervisor to finish updating the control parameters, the fault diagnosis task is activated. Since the fault diagnosis task is independent of the supervisor and has a higher priority, it can preempt the supervisor and start executing. Now the controller, although having a higher priority than the fault diagnosis task, is in fact blocked by the fault diagnosis task. This situation is illustrated in Fig. 3 . Imagine that there are multiple intermediate-priority tasks that act like the fault diagnosis task here. Then the controller task may be postponed indefinitely.
Priority inversion problems are usually solved by the priority inheritance and priority ceiling protocols [11] . The basic idea of these protocols is to look into the content of each task, analyze critical sections (e.g., shared data access), and for each critical section, find the highest priority task that may access it. Call this highest priority value p. Then, if a lower priority task A enters this section, the priority of task A becomes p, so that no task of priority lower than p can preempt A. When the task leaves the critical section, its priority drops back to its normal value.
Priority inheritance and priority ceiling protocols successfully solve the priority inversion problem. By adding more constraints, such as requiring that all tasks need to grab resources in the same order, it also solves deadlock problems that may be caused by cross waiting on data access.
Thus, these protocols are widely implemented in real-time operating systems like VxWorks, QNX [12] , and Resource Kernel [13] . On the down side, the priority inheritance protocol may be hard to implement, and it adds run-time overhead to monitor priorities and critical sections. So some lightweight real-time kernels omit them and require software designers to take care of avoiding priority inversion and deadlock problems themselves. 
MORE PITFALLS
Using priorities as the only tuning parameter and applying priority-driven preemptive execution rules without considering the status of other tasks introduces many problems. Besides priority inversion, other pitfalls exist when the assumptions of real-time scheduling theories do not hold:
• Preemptive execution, especially with static priority assignment, are fundamentally fragile.
The timing behavior of tasks may be very sensitive to task activation time and their actual execution time. An early arrival of a high-priority task or an unexpectedly long execution of a task can have domino effects and add delays to the response time of all subsequent lower-priority tasks.
• The results of schedulability analysis may not be very useful. The typical answers from a schedulability analysis are the worst-case response time between the triggering and the finishing of the tasks. These values are required to be less than the deadlines to pass the schedulability test. However, schedulability analysis does not reveal how often the worstcase response time is met, what distribution it has, and what happens if it is greater than the deadline. In many control applications, the physical system is fundamentally robust, and missing a deadline occasionally may not cause catastrophic results.
• The worst-case execution time may not be the best representation of the execution time of a task. It could be much larger than the average execution time, and by using WCET for schedulability analysis, the results could be very conservative. As a consequence, the resources are not sufficiently used to be cost-effective.
• Pushing for fast response may not be optimal. Some hard-real-time algorithms may have strict requirements on the output time. An optimal result may only be achieved by emitting the output at a particular time. Early outputs may result in a suboptimal result, as well as late outputs. This is particularly the case for some multimedia applications and predictive control algorithms.
One fundamental problem is that there is not enough discipline in real-time programming.
Common embedded software development inherits empirical programming models from desktop software development and tries to patch timing properties by tuning priorities after fixing the functionality. Characteristics such as resource requirements, synchronization, and critical sections, which directly affect timing properties, are not explicit parts of the program specification.
Thus there is not much space for compilers and run-time systems to help ensure timing properties at run time. This in turn forces designers to work with worst-case execution time and worry about the domino effects, and to rely on exhaustive testing to develop confidence in a design.
TIME-TRIGGERED ARCHITECTURES
Timed-triggered architectures (TTA) were originally designed as system architectures and network protocols for safety-critical distributed embedded systems [14] . Various programming models use the same ideas to address the fundamental issues of real-time programming by using time as the only trigger of computation. They bring timing constraints explicitly to the programming model level, so that compilers and run-time systems can schedule and optimize the software to achieve timing determinism. We give two examples of time-triggered models in this section, port-based objects and Giotto, which significantly influence our design of the timed multitasking model.
PORT-BASED OBJECTS
In the port-based object (PBO) models [1] , tasks, called port-based objects, are time triggered, and the communication channels between them form a global data space. Once activated, a PBO is free to read from and write to the global data space. Read and write operations are atomic, and all computation within the PBOs is based on their internal variables. Although synchronization, using for example monitors and locks, is still necessary to guard the access of the global data space, it is managed outside of the objects, at the operating system level, instead of by the objects themselves. It is thus much easier to maintain the atomicity of the communications and avoid priority inversion problems.
The execution time of a PBO task may vary from activation to activation, and the time when the inputs are consumed and the outputs are produced may not be regular. The global data space has a state semantics, meaning that the value of a variable is preserved as long as it is not overwritten, and a newly written value will overwrite the old value regardless of whether the old one has been used by other tasks. An example of an execution trace of a controller is shown in Fig. 4 , where the controller is activated every 2 ms. Suppose that a successor task reads the controller's output at 1.5 ms and 3.5 ms in this timing diagram; then in the first period, it reads a fresh output from the controller, but in the second period, since a new output is not yet produced, it will use the controller output from the last cycle again. For example, suppose we specify that a controller is activated every 2 ms. In Giotto, this implies that the input data is consumed at the beginning of the 2 ms, the execution happens somewhere within one period, and the output is made available at the end of the period. A timing diagram is shown in Fig. 5 . Thus no matter how quickly or slowly the controller executes, as long as it can finish in 2 ms, the outputs will be produced at the exact time instants.
Tasks in Giotto execute concurrently, at least conceptually. Consider, for instance, a Giotto model as shown in Fig. 6 , where a task A with period 2 ms, feeds data to task B with period 2 ms.
Conceptually, the tasks begin executing simultaneously and end simultaneously. Thus, task B will Figure 5 . In Giotto, regardless of the exact execution time of an actor, the inputs are consumed and the outputs are produced at well-defined time instants.
Time (ms)
Controller Read Input . . . 2 4 Produce Output Execution 0 always see data provided by A in the previous cycle. Hence there is a precise 2-ms delay introduced by actor A. This one sample delay per task may not be desirable for some applications, but it yields a deterministic timing and functional behavior.
Notice that both PBO and Giotto models use the state semantics of communication, which has significant advantage to avoids unnecessary synchronizations among senders and receivers and allows tasks to be triggered at any time. However, state semantics may lose the notion of new events, such that some data may be processed more than once, and some others may be completely ignored. The problem is left to the designers in the PBO model, whereas it is made explicit by the one-sample delay in the Giotto model. Furthermore, pure time-triggered models require precise time synchronizations across distributed platforms, which may be a significant cost for complex systems.
TIMED MULTITASKING
This article describes an event-driven approach −− timed multitasking −− that tackles the real-time programming problem. Unlike in the PBO and Giotto models, a task in TM (called an actor) is executed when there are input events that fulfill certain conditions specified by the actor.
But like the Giotto model, it provides a predictable input/output timing so that the computational delays can be used as deterministic parameters in algorithm designs. The basic idea behind the TM model is simple: since the activation of a task depends either on other tasks or on interrupts, by controlling the time at which outputs are produced and triggering tasks with new events, we can effectively control both the starting time and stopping of each task, thus obtain deterministic timing properties.
ACTOR-ORIENTED PROGRAMMING
The TM model is built on top of a component-based approach, where components are called Ptolemy actors 1 . This model helps the designer to isolate atomic computations and identify interdependencies among software components. This notion of actors differs from Agha's actor model [16] in the sense that actors do not necessarily associate with their own thread of control.
In TM, an actor represents a sequence of reactions, where a reaction is a finite piece of com- 1. The name is due to its implementation in the Ptolemy project [15] .
Actor-oriented programming can be conveniently visualized using block diagrams, which are familiar to most control engineers. For example, the embedded system in Fig. 1 can be represented by a very similar actor model, as shown in Fig. 7 . There are two actors: a Supervisor and a
Controller. The supervisor has an input port, named data, and an output port, named param. The controller actor has two input ports, input and paramIn, and one output port. These ports decouple the threads of control among the supervisor, the controller, and the scheduler that manages their execution. For example, consider the link from the param port of the supervisor and the paramIn port of the controller. Suppose this link is implemented as a buffer, such that the output from the supervisor is first buffered at the paramIn port, and then the controller decides when to read it (e.g.
the next time the controller is activated). Although mutual exclusion between reads and writes on the port is still needed, the controller can always proceed using its internal state without waiting for the supervisor to finish writing, and can update parameters at the beginning of the next invocation. The supervisor task can now be arbitrarily preempted by the controller without worrying about priority inversion or about the consistency of parameter data.
The actor architecture can also make resource utilization explicit. A useful design pattern is to use specific actors for delegated resources. For example, suppose that two actors A and B both need to write to a same file, as shown in Fig. 8 . Obviously, an implicit locking mechanism is fileWriter, can be introduced to manage the file explicitly. By using this actor, the order of writing in actors A and B becomes visible and is formally managed in the same way as any other events in the system, and the actors no longer need to be partially preemptable.
TM PROGRAMMING CONCEPTS
The building blocks in the TM model are actors with further annotations. Actors in a TM model not only declare their computing functionality, but also specify their execution requirements in terms of trigger conditions, execution times, and deadlines. As illustrated in Fig. 9 , an actor is activated when its trigger condition is satisfied. If there are enough resources at runtime, then the actor will be granted at least the declared execution time before its deadline is reached.
The results of the execution are made available to other actors and the physical world only at the deadline time. This is sometimes called faster-than-real-time computation [17] in the sense that the results are computed well before it is required by the real-time constraints. Since the output may immediately trigger other tasks, whose deadlines are also known, we effectively control both starting time and stopping time of the actors, and provide deterministic timing behavior. In the cases where an actor cannot finish by its deadline, our model includes an overrun handler, much like the timing failure handlers in Chimera [18] , to preserve the timing determinism of all other actors and allow the actor who violates the deadline to come to a quiescent state.
Trigger Conditions
A trigger condition can be built using real time, physical events, communication packets, and/or messages from other actors. The trigger conditions are required to be responsible [20] , meaning that, once triggered, the actor does not need any additional data to complete its (finite) computation. The communication among the actors has an event semantics in which, unlike state semantics, every piece of data will be produced and consumed exactly once. Event semantics can be implemented by FIFO queues. Conceptually, the sender of a communication is never blocked on writing. The responsible trigger condition guarantees that if an actor is activated, there are enough data to complete a reaction, so actors will not be blocked on reading data.
Deadlines
The deadline of a reaction is expressed as a real-time value, indicating that the computational results are produced if and only if the deadline is reached. Explicitly expressing deadlines has two benefits:
• By knowing the deadlines of all actors and their triggering dependencies, designers know exactly what time delay their programs will introduce at runtime, so that they can be more confident in choosing algorithm parameters;
• Explicit deadlines are also useful in resource-aware algorithms, such as anytime algorithms [19] , which can provide results with different fidelities, depending on the computational time it has.
If a deadline is not specified, the results will be produced as soon as the computation has finished. This is useful to handle soft real-time actors or intermediate computational steps to provide prompt triggering for downstream actors.
Execution Time
An actor also declares its execution time, which is the amount of time it needs to finish its reaction, given that it is not preempted by any other actors. Unlike the deadline, which is given in terms of real time no matter how many times the actor is preempted, the execution time is measured with respect to the actual time spent on the actor's execution. This value is mainly used at compile time to determine schedulability and resource utilization. The execution time can be the worse-case execution time, so that the actor can always be finished. Alternatively, the declared time can be a typical execution time, so that resources may be better used, and the actor can rely on overrun handlers when deadlines are missed.
Overrun Handling
Overrun 
SOFTWARE SYNTHESIS
Building real-time embedded software using actors and the timed-multitasking model allows certain levels of analysis of timing properties and generation of run-time software and scheduling.
This formal step preserves the TM execution semantics and reduces the burden of writing errorprone code for event queues and multi-task synchronizations. We consider a single-processor platform at this time and discuss two aspects in the synthesis −− scheduling analysis and code generation.
Scheduling Analysis
General event triggered real-time systems with multiple shared resources are not amenable to compile-time schedulability analysis [10] . However, explicit actor partition and time determinism make real-time scheduling theories applicable to many TM models. When trigger conditions are predictable, such as periodic triggers, the execution time and the deadlines can be fed to scheduling algorithms for schedulability analysis and priority assignment. Notice that the semantics of TM does not require that a designer directly specify the priorities of actors. Typically, there are multiple priority assignment policies that can fulfill the timing requirements, and for any feasible scheduling policy, the execution result for a given set of inputs is exactly the same in terms of both time and value determinism. In this sense, the TM model is robust to scheduling policies.
Code Generation
Fully automated software synthesis is a very broad issue. In this article, we focus on generating the interfaces and interactions among actors into imperative languages like C, and leave the definition of the actor functionality as an open issue. In fact, after providing the interface and scheduling code, the actor code becomes single threaded and self-contained, which is much easier to manage. The generated code is linked with a TM run-time system, which will be described in the next section.
The first step of software synthesis is to distinguish two types of actors −− ones that respond to external events and ones that are triggered entirely by events produced by peer actors. By partitioning software components into actors, it is easy to make sure that these two types do not intersect. At runtime, we call the first type interrupt service routines (ISRs) and the second type tasks.
For code generation, they have different interfaces.
An ISR typically converts inputs from the outside world into events that triggers other actors.
In a TM model, an ISR usually appears as a source actor, or simply a port that transfers events into the model. These components do not have triggering rules. They are typically managed by lowlevel device drivers and are triggered immediately when interrupts occur. An ISR does not have a deadline, and an output is made immediately available as a trigger event to the downstream
actors. An ISR is synthesized as an independent thread, with an init() method that initializes hardware resources and a start() method that registers itself to the run-time system. A produceOutput() method is also generated for the ISR to produce trigger events for other actors.
Tasks, on the other hand, have a much richer set of interfaces than ISRs. In additional to init() and start() methods that initialize resources and states, there is a set of methods, shown in Fig. 10 , that defines of the split-phase reaction of a task, in this case, a Controller. Among the methods, isReady() returns a boolean that indicates whether the task is ready to execute when the method is called; getDeadline() assigns the deadline value if a deadline is specified; exec() is the main body of the reaction; stopExec() is the overrun handler; and produceOutput() produces events that may trigger other tasks. Obviously, this interface matches very closely to the TM programming model. In the interface generation step, for each TM actor that is not an ISR, a C file will be generated that contains the template of all methods listed. A TASK data structure is also generated, which defines a scheduling entry representing a task. In additional to the function pointers that point to the task methods, there are two variables: hasDeadline is a variable indicating whether the reaction should meet a deadline; priority is an integer indicating the ordering among triggered tasks, assuming a static priority assignment. The software synthesis process also generates the interaction relations among actors by sorting through ports and connections. In a TM model, ports are contained by actors. Thus, it is obvious to the code generator which task to trigger when a new event is received by a port. There are many ways to achieve event-triggered execution models. Here we assume an event dispatcher at the scheduler level, and use ports as proxies for the event dispatcher. An event is a global variable, containing the communicating data, a mutual-exclusion lock to guard the access of the variable if necessary, and a flag indicating whether the event has been consumed. Fig. 11 is an example of the code generated for the port paramIn of the actor Controller in Fig. 7 , assuming the data type of the port is a pair of floats, and assuming an underlying operating system that supports POSIX threads. A data structure for the events is generated, together with a set_Controller_paramIn() method that represents putting an event into the paramIn port. Figure 11 . Generated code that defines an event type, a port, and a triggering mechanism that activates the controller task when new events are sent to the port.
The last line in the generated Supervisor_produceOutput() template will call the set_Controller_paramIn() method with a new event. In this method, the data in the event is transferred to the global variable representing the port, and a task trigger is created and queued with a dispatcher. A run-time scheduler will use the dispatcher and trigger the task at a proper time.
TM RUNTIME SYSTEMS
The execution model of TM programs is a stylized use of priority-based multitasking execution, as seen in most RTOSs. The run-time scheduler, implemented as a highest priority task, uses a dispatcher to manage trigger events and to execute reactions. The scheduler is activated by an insertTriggeredTask(TASK_t* task) call when a port triggers a task. The scheduler then invokes the isReady() method of the corresponding task. If the task is ready, the scheduler will send the task's exec() to the underlying RTOS to schedule its execution. At the same time, it sets up a timer to monitor the task's deadline if necessary.
The run-time system for TM programs strictly obeys the deadlines for each actor if they are specified. It keeps track of the deadlines for all actors as timer interrupts, and when the deadline is reached, it asks the actor to stop execution and produce its outputs. If at the deadline time the actor is still executing, the overrun handler will be activated.
An overrun reaction can be terminated either gracefully or abruptly. A graceful termination may set a flag to ask the reaction to stop, and rely on the reaction code to check this flag at reason- 
FLIGHT CONTROL SYSTEM
The primary components in the flight control system are actuators, sensors, and a control computer. The actuators consist of servo motors controlling the collective pitch, cyclic pitch, throttle, and tail rotor. The primary sensors are:
• Inertial Navigation System (INS). The INS consists of accelerometers and rotational rate sensors, and provides estimations of the helicopter's position, velocity, orientation, and rate of rotation. Although the information is provided at a rate of 100 Hz, the error in the estimates could grow unbounded over time.
• Global Positioning System (GPS). The GPS solves the INS drifting problem by providing a more accurate position measurement, but more infrequently −− at roughly 5 Hz.
Both sensors push their measurement into the control computer. A Kalman filter, implemented in embedded software, fuses the INS-GPS readings to provide frequent and accurate estimates of the state of the helicopter.
The controller implements a modal state feedback algorithm, which partitions the helicopter trajectory into operation modes and uses state feedback to stabilize and drive the helicopter. The controller runs at 50 Hz, which is determined by the helicopter dynamics and the limitations of the actuators. In essence, the controller runs on every second output of the sensor fusion component.
Giotto has two difficulties with the flight control system. First, the clock of the GPS and INS are not synchronized with the control computer. If sensor fusion is implemented as a Giotto task of frequency 100 Hz, then when the task is triggered according to the control computer clock, the sensor inputs may not be fresh. Secondly, if the control task is modeled using Giotto as a 50 Hz task, then the controller output will only be available to the actuator by the end of a 20 ms period, even though the control algorithm can be finished in about 2 ms. The latter problem is solved in [22] by designing the controller as a 200 Hz task, where in three out of every four cycles, the controller does no computation. Even so, the control system introduces a total delay of 15 ms.
MODELING IN TM
TM is a suitable model for the UAV flight control system, since it is event-driven, so that no explicit clock synchronization is required between the sensors and the control computer. In addition, the explicit notion of deadline allows the timing properties of the fusion task and the controller to be expressed flexibly. There is no restriction that the output has to be produced just before the start of the next period.
The flight control system is modeled in the TM domain of Ptolemy II, and the helicopter dynamics is modeled in the continuous-time (CT) domain. Fig. 12 shows the model. The helicopter dynamics is implemented as a set of differential equations using the DifferentialSystem actor. Fig. 13 . Let k be an integer that increases by one for every 10 ms. Then, the deadline of the Fusion actor is ms in the cycle when the controller is not activated. Otherwise, the deadline is 
HARDWARE-IN-THE-LOOP SIMULATION
The controller is validated further in a hardware-in-the-loop (HITL) simulator implemented in the BEAR project [22] . Interfaces and schedules of the embedded software components are generated from the Ptolemy II model into C code running on Linux. The two input ports of the TM Controller are generated as interrupt service routines. The Fusion and Controller actors are generated into TM runtime tasks. The output port control_output is synthesized into a trivial task. Its 
CONCLUSION
This article reviews the challenges of developing embedded software for real-time control systems, and argues that timing properties should be introduced at the programming level to bridge the gap between algorithm development and real-time priority tuning. We further describe the timed multitasking model that uses an event-triggering mechanism and deadlines to provide deterministic timing behavior for embedded software. The actor-oriented programming model and a highly structured communication style allows interface and scheduling code to be generated from the high-level specification. A helicopter control system is designed as an example.
ACKNOWLEDGMENT
This work is part of the Ptolemy project, which is supported by DARPA, the State of California MICRO program, and the following companies: Agilent, Cadence Design Systems, Hitachi, and Philips. The authors would also like to thank Judy Liebman for providing the hardware-inthe-loop helicopter simulator.
