Abstract
Introduction
Embedded systems very often have to satisfy strict timing requirements. In the case of such hard real-time applications, predictability of the timing behaviour is an extremely important aspect. Frequently, such applications are implemented as distributed systems. This is the case, for example, with many applications in the automotive industry. Predictability of such a system has to be guaranteed globally, considering both the task schedules determined for the particular processing units as well as the timing of the communication between different components of the system.
Task scheduling and schedulability analysis have been intensively studied over the past decades. The reader is referred to [2], [3] for surveys on this topic.
A few approaches have been proposed for a holistic schedulability analysis of distributed real-time systems, taking into consideration both task and communication scheduling. In [23], Tindell provided a framework for holistic analysis of event-triggered task sets interconnected through an infrastructure based on a generic TDMA protocol. In [15], the authors improved Tindell's analysis by allowing dynamic task offsets, which produced tighter bounds for task response times. Work in the area of scheduling and schedulability analysis diversified significantly by considering particular communication protocols, like the Token Ring protocol [20] [22], the ATM protocol [8] [11], CAN bus [7] [24], or TTP bus [12]. In [17] and [18] we have developed a holistic analysis allowing for either time-triggered or event-triggered task sets communicating over a TTP bus. In addition to schedulability analysis, this work has also addressed the optimization of the TTP-based bus configuration in order to fit the particular application.
Two basic approaches for handling tasks in real-time applications can be identified [13] : the event-triggered (ET) and time-triggered (TT). There has been a long debate in the real-time and embedded systems communities concerning the advantages of each approach and which one to prefer [1] , [13] , [26] . Several aspects have been considered in favour of one or the other approach, such as flexibility, predictability, jitter control, processor utilization, testability, etc.
The same duality is reflected at the level of the communication infrastructure, where communication activities can be triggered either dynamically, in response to an event (as is the typical case with the CAN bus [4] ), or statically, at predetermined moments in time (as in the case of TDMA protocols and, in particular, the TTP [13] ).
An interesting comparison of the TT and ET approaches, from a more industrial, in particular automotive, perspective, can be found in [14] . Their conclusion is that one has to choose the right approach depending on the particularities of the scheduled tasks. This means not only that there is no single "best" approach to be used, but also that inside a certain application the two approaches can be used together, some tasks being time-triggered and others event-triggered.
The fact that such an approach is considered for future automotive applications is also indicated by the recent activities related to the development and standardisation of bus protocols which support both static (ST) and dynamic (DYN) communication. Such a protocol has been suggested in [16] and [21]. A mixed protocol has also been proposed by a consortium, to be used as a standard in automotive applications [10]. In [6], the authors describe the so-called Universal Communication Model (UCM), a framework for modelling the communication infrastructure in automotive applications at a high level of abstraction.
Efficient implementation of new, highly complex distributed automotive applications entails the use of TT task sets together with ET ones, implemented on top of a communication infrastructure with a mixed ST/DYN protocol. Given its flexibility, such an approach has the potential of highly efficient, fine-tuned, and optimised implementations.
Our main contribution in this paper is related to the scheduling and schedulability analysis of distributed embedded systems implemented with both ET and TT task sets, which communicate through mixed ST/DYN bus protocols. Such an analysis and scheduling procedure constitutes the foundation for any synthesis approach aiming at an efficient, highly optimised implementation of a distributed application which is also guaranteed to meet the timing constraints.
We also identified several design problems which offer the potential of significant optimization and which can be solved by efficient design space exploration, based on the timing analysis mentioned above. In order to illustrate the potential of such optimizations, we have looked more closely at one particular communication synthesis problem.
In the next section we present the architecture of the distributed systems and the application model that we are studying. Section 3 describes the holistic scheduling and schedulability analysis we have developed. Some specific optimization issues are presented in Section 4. Section 5 describes a particular optimization problem related to the bus access, while Section 6 presents some experimental results. The last section presents our conclusions.
System Architecture and Application Model

Hardware Architecture and Bus Access
We consider architectures consisting of nodes connected by a unique broadcast communication channel. Each node consists of a communication controller, a CPU, memories (RAM, ROM), and an I/O interface to sensors and actuators (see Figure 1) .
We model the bus access scheme using the Universal Communication Model [6] . The bus access is organized as consecutive cycles, each with the duration T bus . We consider that the communication cycle is partitioned into static and dynamic phases (Figure 1 ). Static phases consist of time slots, and during a slot only one node is allowed to send ST messages; this is the node associated to that particular slot. During a dynamic phase, all nodes are allowed to send DYN messages and the conflicts between nodes trying to send simultaneously are solved by an arbitration mechanism based on priorities assigned to messages. The bus access cycle has the same structure during each period T bus . Every node has a communication controller that implements the static and dynamic protocol services. The controller runs independently of the node's CPU.
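The cycle structure described above can be captured in a small data model. The following Python sketch is an illustration only: the class names `Phase` and `BusCycle`, the node names, and the example durations are hypothetical and do not come from the Universal Communication Model [6].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Phase:
    kind: str                     # "ST" (one static slot) or "DYN" (a dynamic phase)
    length: int                   # duration in abstract time units
    owner: Optional[str] = None   # node allowed to send in an ST slot; None for DYN


@dataclass(frozen=True)
class BusCycle:
    phases: List[Phase]

    @property
    def t_bus(self) -> int:
        """Cycle duration T_bus: the sum of all phase lengths."""
        return sum(p.length for p in self.phases)

    def phase_at(self, t: int) -> Phase:
        """Phase active at absolute time t; the structure repeats every T_bus."""
        offset = t % self.t_bus
        for p in self.phases:
            if offset < p.length:
                return p
            offset -= p.length
        raise AssertionError("unreachable: offset always falls inside the cycle")

    def dyn_share(self) -> float:
        """Fraction of the cycle that is available to dynamic phases."""
        return sum(p.length for p in self.phases if p.kind == "DYN") / self.t_bus


# Hypothetical cycle: two static slots (nodes N1 and N2) followed by one dynamic phase.
cycle = BusCycle([Phase("ST", 4, "N1"), Phase("ST", 4, "N2"), Phase("DYN", 12)])
```

Because the cycle has the same structure in every period, a lookup by `t % t_bus` is sufficient to decide which node (if any) owns the bus at a given instant.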
Software Architecture
For the systems we are studying, we have designed a software architecture which runs on the CPU of each node. The main component of the software architecture is a real-time kernel which supports both time-triggered and event-triggered activities. An activity is defined as either the execution of a task or as the transmission of a message on the bus. For the TT activities, the kernel relies on a static schedule table which contains all the information needed to take decisions on activation of TT tasks or transmission of ST messages. For the ET tasks, the kernel maintains a prioritized ready queue in which tasks are placed whenever their triggering event has occurred and they are ready for activation, or when they have been pre-empted.
The real-time kernel will always activate a TT task at the particular time fixed for that task in the schedule table. If at that moment, an ET task is running on that node, that task will be pre-empted and placed into the ready queue according to its priority. If no tasks are active, ET tasks are extracted from the ready queue and are (re)activated. ET tasks can pre-empt each other based on their priority.
The transmission of messages is handled in a similar way: for each node, the sending and receiving times of ST messages are stored in the schedule table; the DYN messages are organized in a prioritized ready queue. ST messages will be placed at predetermined time moments into a bus slot assigned to the sending node. DYN messages can be potentially sent during any dynamic phase. Conflicts due to simultaneous transmission of messages from different nodes are avoided, based on message priorities, by the communication controllers. In order to prevent the delay of an ST message by a DYN frame or the retransmission of a pre-empted DYN message, the DYN messages will be sent only if there is enough time available for that message before the dynamic phase ends.
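The rule that a DYN message is sent only if it fits into the remainder of the dynamic phase can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `pick_dyn_message` is hypothetical, a lower number denotes a higher priority, and only the highest-priority ready message is considered for transmission (the text does not say whether a shorter, lower-priority message may be sent instead).

```python
from typing import List, Optional, Tuple

# (priority, transmission_time, name); lower number = higher priority (assumed convention)
DynMessage = Tuple[int, int, str]


def pick_dyn_message(ready: List[DynMessage], now: int, phase_end: int) -> Optional[DynMessage]:
    """Return the highest-priority ready DYN message if it can finish before
    the current dynamic phase ends; otherwise return None (the bus stays idle)."""
    if not ready:
        return None
    best = min(ready)  # tuples compare by priority first
    if now + best[1] <= phase_end:
        return best
    return None
```

For example, with a dynamic phase ending at time 13, a message needing 3 time units is sent at time 10 but held back at time 11.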
TT activities are triggered based on a local clock available in each processing node. The synchronization of local clocks throughout the system is provided by the communication protocol.
Application Model
We model an application as a set of task graphs. Nodes represent tasks and arcs represent communication (and implicitly dependency) between the connected tasks. Each task is mapped on a certain node of the distributed application.
• A task belongs either to the TT or to the ET domain.
• Communication between tasks mapped to different nodes is performed by message passing over the bus. Such a message passing is modelled as a communication task inserted on the arc connecting the sender and the receiver tasks. The communication time between tasks mapped on the same node is considered to be part of the task execution time. Thus, such a communication activity is not modelled explicitly. For the rest of the paper, when referring to messages, we consider only the communication activity over the bus.
• A message belongs either to the static (ST) or to the dynamic (DYN) domain.
• All tasks in a certain task graph belong to the same domain, either ET or TT, which is called the domain of the task graph. However, the messages belonging to a certain task graph can belong to any domain (ST or DYN). Thus, in the most general case, tasks in either domain can communicate through both ST and DYN messages.
Figure 2 shows an application modelled as two task graphs mapped on two nodes (processors).
In order to keep the separation between the TT and ET domains, which are based on fundamentally different triggering policies, communication between tasks in the two domains is not included in the model. Technically, such a communication is implemented by the kernel, based on asynchronous non-blocking send and receive primitives (using proxy tasks if the sender and receiver are on different nodes). The transmission and reception of such a message are not considered as communication tasks or respectively events in the context described by our model, therefore they are outside the scope of our holistic analysis. Such messages are typically non-critical and are not affected by hard real-time constraints.
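The modelling rules of this subsection can be written down as a small data structure with a consistency check. The sketch below is illustrative only; the class names, field names, and the `validate` method are hypothetical and are not part of the model definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    name: str
    node: str   # processing node the task is mapped to
    wcet: int   # worst-case execution time


@dataclass
class Message:
    name: str
    sender: Task
    receiver: Task
    domain: str  # "ST" or "DYN"


@dataclass
class TaskGraph:
    name: str
    domain: str  # "TT" or "ET"; shared by all tasks of the graph
    period: int
    tasks: List[Task] = field(default_factory=list)
    messages: List[Message] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the graph violates the rules of the application model."""
        if self.domain not in ("TT", "ET"):
            raise ValueError(f"{self.name}: unknown task domain {self.domain!r}")
        for m in self.messages:
            if m.domain not in ("ST", "DYN"):
                raise ValueError(f"{m.name}: unknown message domain {m.domain!r}")
            if m.sender not in self.tasks or m.receiver not in self.tasks:
                # Communication across task graphs (and domains) is outside the model.
                raise ValueError(f"{m.name}: endpoints must belong to {self.name}")
            if m.sender.node == m.receiver.node:
                # Intra-node communication is part of the task execution time.
                raise ValueError(f"{m.name}: intra-node communication is not a bus message")
```

Keeping the task domain on the graph rather than on each task makes the "one domain per task graph" rule hold by construction.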
Holistic Scheduling
Given an application and a system architecture as presented in Section 2, the following problem has to be solved: construct a correct static schedule for the TT tasks and ST messages (a schedule which meets all time constraints related to these activities) and conduct a schedulability analysis in order to check that all ET tasks meet their deadlines. Two important aspects should be noticed:
1. When performing the schedulability analysis for the ET tasks and DYN messages, one has to take into consideration the interference from the statically scheduled TT tasks and ST messages.
2. Among the possible correct schedules for TT tasks and ST messages, it is important to construct one which favours, as much as possible, the schedulability of ET tasks and DYN messages.
In Section 3.1 we present the schedulability analysis for a set of ET tasks and DYN messages, considering a fixed given static schedule of TT tasks and ST messages. In Section 3.2 we discuss the construction of the static schedule which is driven by the objective of achieving global schedulability of the system. Three alternative scheduling heuristics are presented and they will be evaluated and compared in Section 6.
In order to keep the presentation reasonably simple and given the space limitations, we present here the analysis for a restricted model, in the sense that TT tasks are communicating only through ST messages, while the communication between ET tasks is only through DYN messages. This is not an inherent limitation of our approach and the analysis we have developed and implemented supports the general model (in [18] and [19] , for example, we have presented an approach to schedulability analysis of ET tasks communicating through ST messages).
Schedulability Analysis of the ET Sub-System Considering the Influence of a Given Static Schedule
An ET task graph Γ i is activated by an associated event which occurs with a period T i . Each activity τ ij (task or message) in an ET task graph has an offset φ ij which specifies the earliest activation time of τ ij relative to the occurrence of the triggering event. The delay between the earliest possible activation time of τ ij and its actual activation time is modelled as a jitter J ij (Figure 3 .a). Offsets and jitters are the means by which dependencies among tasks are modelled for the schedulability analysis. The response time of an activity τ ij is the time measured from the occurrence of the associated event until the completion of τ ij . Each ET activity τ ij has a best case response time R b,ij . The worst case response time R ij of an activity τ ij is determined by creating first a critical instant t c , which represents the starting point of the worst-case busy window w ij , a time interval which ends when τ ij finishes execution (Figure 3 .b).
Figure 2. Application Model Example
Figure 3. Model of the ET Sub-System: a) Tasks with offsets; b) Response time and busy period w for task τ ij
During the busy window w ij , Processor(τ ij ) executes only task τ ij or higher priority tasks. ϕ ij is the time interval between the critical instant and the earliest time for the first activation of the task after this instant. Considering a set of data dependent ET tasks mapped on a single processor, the analysis in [15] computes the worst case response time R ij of a task τ ij , based on the length of its busy period, considering all the critical instants initiated by higher priority activities in Γ i and by τ ij itself, and all job instances p of τ ij which can appear in the busy window w ij :
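Under the offset-based formulation of [15], and using only the symbols defined in this section (the period T i , the offset φ ij , the phase ϕ ijk and the busy window w ijk (p)), a form consistent with these definitions is the following. It is stated here as a reconstruction under that assumption:

```latex
R_{ij} = \max_{\forall k} \Big[ \max_{\forall p} \big( w_{ijk}(p) - \phi_{ijk} - (p-1)\,T_i + \varphi_{ij} \big) \Big]
```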
where w ijk (p) is the worst-case busy window of the p-th job of τ ij , numbered from the critical instant t c initiated by τ ik ; ϕ ijk is the time interval between the critical instant initiated by τ ik , and the earliest time for the first activation of τ ij after this instant.
The value of w ijk (p) is determined as follows:
where B ij represents the maximum interval during which τ ij can be blocked by lower priority activities (see footnote 1), W ik (τ ij ,t) is the interference from higher priority activities in the same task graph Γ i at time t, and W * a (τ ij ,t) is the maximum interference of activities from other task graphs Γ a on τ ij .
One problem that arises during the computation of response times is that the length of the busy window depends on the values of task jitters, which, in turn, are computed as the difference between the response times of two successive tasks (for example, if τ ij precedes τ ik in Γ i , then J ik = R ij - R b,ij ). Because of this cyclic dependency, the process of computing R ij is an iterative one: it starts by assigning R b,ij to R ij and then computes the values for J ij , w ijk (p) and then again R ij , until the response times converge to their final value.
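The cyclic dependency between jitters and response times can be illustrated with a fixed-point iteration. The sketch below deliberately uses a simplified, offset-free recurrence in the style of classical holistic analysis (busy window w = B + C + Σ ⌈(w + J h ) / T h ⌉ · C h and R = J + w); it is not the offset-based analysis of [15]. The dictionary keys, function names, and the single-processor example are assumptions made for illustration.

```python
import math
from typing import Dict, List


def busy_window(c: int, blocking: int, higher: List[dict],
                jitter: Dict[str, int], limit: int = 10_000) -> int:
    """Smallest w with w = B + C + sum(ceil((w + J_h) / T_h) * C_h)."""
    w = blocking + c
    while w <= limit:
        nxt = blocking + c + sum(
            math.ceil((w + jitter[h["name"]]) / h["T"]) * h["C"] for h in higher
        )
        if nxt == w:
            return w
        w = nxt
    raise RuntimeError("busy window did not converge (processor overloaded?)")


def response_times(tasks: List[dict], max_rounds: int = 1000) -> Dict[str, int]:
    """Iterate jitters and response times until they stop changing."""
    by_name = {t["name"]: t for t in tasks}
    resp = {t["name"]: t["Rb"] for t in tasks}  # start from best-case values
    for _ in range(max_rounds):
        # Jitter of a task = worst-case minus best-case response of its predecessor.
        jitter = {}
        for t in tasks:
            pred = t["pred"]
            jitter[t["name"]] = resp[pred] - by_name[pred]["Rb"] if pred else 0
        new = {}
        for t in tasks:
            higher = [h for h in tasks if h["prio"] < t["prio"]]  # lower number = higher priority
            w = busy_window(t["C"], t.get("B", 0), higher, jitter)
            new[t["name"]] = jitter[t["name"]] + w
        if new == resp:
            return resp
        resp = new
    raise RuntimeError("response times did not converge")


# Hypothetical task set on one processor; task b is released by task a.
tasks = [
    {"name": "h", "C": 2, "T": 5, "prio": 1, "pred": None, "Rb": 2},
    {"name": "a", "C": 1, "T": 20, "prio": 2, "pred": None, "Rb": 1},
    {"name": "b", "C": 2, "T": 20, "prio": 3, "pred": "a", "Rb": 3},
]
```

In this example the first round raises the response time of a, which in the next round becomes jitter for b; the iteration stops once a round reproduces the previous values.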
Starting from the analysis in [15] , we had to consider the following additional aspects:
• The interference from the set of statically scheduled tasks.
• The computation of worst case delays for the messages communicated on the bus and the global schedulability analysis of the distributed task set.
We solve both aspects using an analysis similar to the one developed in [16]. First we introduce the notion of ET demand associated with an ET activity τ ij as the amount of CPU time or bus time which is demanded only by higher priority ET activities and by τ ij during the busy window w ij . In Figure 4, the ET demand of the task τ ij during the busy window w ij is represented with H ij (w ij ), and it is the sum of worst case execution times for task τ ij and two other higher priority tasks τ ab and τ cd . During the same busy period w ij , we define the availability as the processing time which is not used by statically scheduled activities. In Figure 4, the CPU availability for the interval of length w ij is obtained by subtracting from w ij the amount of processing time needed for the TT activities.
During a busy window w ij , the ET demand H ij of a task τ ij is equal to the length of the busy window which would result when considering only ET activity on the system.
During the same busy window w ij , the availability A ij associated with task τ ij is the processing time which is not used by statically scheduled activities. Figure 4 shows how the availability and the demand are computed for a task τ ij : the busy window of τ ij starts at the critical instant qT i + t c initiated by task τ ab and ends at moment qT i + t c + w ij , when both higher priority tasks (τ ab , τ cd ), all TT tasks scheduled for execution in the analysed interval, and τ ij have finished execution.
The discussion above is, in principle, valid for both ET tasks and DYN messages. However, there exist two important differences. First, messages do not pre-empt each other; therefore, the demand equation is modified so that it will not consider the time needed for the transmission of the message under analysis (once the message has gained the bus it will be sent without any interference [16]). Second, the availability for a message is computed by subtracting from w ij the length of the ST slots which appear during the considered interval; moreover, because a DYN message will not be sent unless there is enough time before the current dynamic phase ends, the availability is further decreased by C A for each dynamic phase in the busy window (where C A is the transmission time of the longest DYN message).
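The availability computation for DYN messages can be sketched as follows. This is an illustration under simplifying assumptions that are not taken from the analysis itself: the window is assumed to start at the beginning of a bus cycle, all phase lengths are positive integers, every dynamic phase that overlaps the window is charged the full C A , and the result is clamped at zero. The function name and the phase encoding are hypothetical.

```python
from typing import List, Tuple


def bus_availability(w: int, phases: List[Tuple[str, int]], c_a: int) -> int:
    """Bus time usable by DYN messages in a window [0, w).

    phases: one bus cycle as a list of (kind, length), kind in {"ST", "DYN"}.
    c_a:    transmission time of the longest DYN message.
    """
    t = 0            # start time of the phase currently being examined
    st_time = 0      # total length of ST slots inside the window
    dyn_phases = 0   # number of dynamic phases that overlap the window
    while t < w:
        for kind, length in phases:
            if t >= w:
                break
            if kind == "ST":
                st_time += min(length, w - t)
            else:
                dyn_phases += 1
            t += length
    return max(0, w - st_time - dyn_phases * c_a)
```

With a cycle of two static slots of length 4 followed by a dynamic phase of length 12, and C A = 2, a window of one full cycle (w = 20) leaves 20 - 8 - 2 = 10 time units for DYN traffic.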
Our schedulability analysis algorithm determines the length of a busy window w ij for an ET task or DYN message by identifying the appropriate size of w ij for which the ET demand is satisfied by the availability: H ij (w ij ) ≤ A ij (w ij ).
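The search for the busy window can be sketched for the CPU case as follows. The sketch makes several assumptions for the sake of a short example: time is discrete, the static schedule is given as (begin, end) intervals that repeat with a fixed period, and the ET demand ignores jitter, offsets and blocking. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def availability(start: int, w: int, tt_slots: List[Tuple[int, int]], period: int) -> int:
    """Time units in [start, start + w) not occupied by statically scheduled TT tasks."""
    def is_free(t: int) -> bool:
        phase = t % period
        return not any(begin <= phase < end for begin, end in tt_slots)
    return sum(1 for t in range(start, start + w) if is_free(t))


def et_demand(w: int, c: int, higher: List[Tuple[int, int]]) -> int:
    """Execution time requested in a window of length w by the task under
    analysis (c) and by higher-priority ET tasks given as (C, T) pairs."""
    return c + sum(math.ceil(w / period) * wcet for wcet, period in higher)


def busy_window(start: int, c: int, higher: List[Tuple[int, int]],
                tt_slots: List[Tuple[int, int]], period: int, limit: int = 10_000) -> int:
    """Smallest w for which the ET demand is covered by the availability."""
    w = c
    while w <= limit:
        if et_demand(w, c, higher) <= availability(start, w, tt_slots, period):
            return w
        w += 1
    raise RuntimeError("no busy window found within the limit")
```

For a task with C = 2, one higher-priority ET task with C = 1 and T = 5, and a TT task occupying [0, 3) in every period of 10, the window grows from 3 (no TT interference) to 7.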
1. Such blocking can occur at access to a shared critical resource. 
Figure 4. Availability and Demand
This procedure for the calculation of the busy window is included in the iterative process for calculation of response times, presented earlier in this subsection. It is important to notice that this process includes both tasks and messages and, thus, the resulting response times of the ET tasks are computed by taking into consideration the delay induced by the bus communication.
After performing the schedulability analysis, we can check if R ij ≤ D ij for all the ET tasks. If this is the case, the set of ET activities is schedulable. In order to drive the global scheduling process, as will be explained in the next section, it is not sufficient to test if the task set is schedulable or not; we also need a metric that captures the "degree of schedulability" of the task set. For this purpose we use the function DSch, similar to the one described in [18]:
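Consistent with the definitions of N and N i and with the roles of f 1 and f 2 described in this subsection, the metric can be written in the following piecewise form (a reconstruction under that assumption):

```latex
DSch =
\begin{cases}
f_1 = \sum_{i=1}^{N} \sum_{j=1}^{N_i} \max\bigl(0,\; R_{ij} - D_{ij}\bigr), & \text{if } f_1 > 0 \\[4pt]
f_2 = \sum_{i=1}^{N} \sum_{j=1}^{N_i} \bigl(R_{ij} - D_{ij}\bigr), & \text{if } f_1 = 0
\end{cases}
```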
where N is the number of ET task graphs and N i is the number of activities in the ET task graph Γ i .
If the task set is not schedulable, there exists at least one task for which R ij > D ij . In this case, f 1 > 0 and the function is a metric of how far we are from achieving schedulability. If the set of ET tasks is schedulable, f 2 ≤ 0 is used as a metric. A value f 2 = 0 means that the task set is "just" schedulable. A smaller value for f 2 means that the ET tasks are schedulable and a certain amount of processing capacity is still available.
Now that we are able to perform the schedulability analysis for the ET tasks considering the influence from a given static schedule of TT tasks, we can go on to perform the global scheduling and analysis of the whole application.
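The two-regime behaviour of the metric can be sketched in a few lines. The sketch assumes that f 1 sums only the deadline overruns and that f 2 sums all differences R ij - D ij , which matches the properties stated above; the function name `dsch` and the (R, D) pair encoding are hypothetical.

```python
from typing import Iterable, Tuple


def dsch(activities: Iterable[Tuple[int, int]]) -> int:
    """Degree of schedulability for ET activities given as (R, D) pairs.

    Positive: total deadline overrun (not schedulable).
    Zero or negative: schedulable, with the magnitude reflecting the remaining slack.
    """
    pairs = list(activities)
    f1 = sum(max(0, r - d) for r, d in pairs)
    if f1 > 0:
        return f1
    return sum(r - d for r, d in pairs)
```

Using f 1 alone would make all schedulable configurations look identical (value 0); switching to f 2 lets a search procedure distinguish between them.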
Static Schedule Construction and Holistic Analysis
For the construction of the cyclic static schedule for TT tasks and ST messages, we use a list-scheduling based algorithm [5]. Assuming that in our application we have N time-triggered task graphs Γ 1 , Γ 2 , ..., Γ N , the static schedule will be computed over a period T SS = LCM(T 1 , T 2 , ..., T N ). The input to the list scheduling algorithm is a graph consisting of n i instances of each Γ i , where n i = T SS / T i . A ready list contains all TT tasks and ST messages which are ready to be scheduled (they have no predecessors or all their predecessors have been scheduled). From the ready list, tasks and messages are extracted one by one to be scheduled on the processor they are mapped to, or, in the case of a message, into a static bus slot associated with the processor on which the sender of the message is executed. The priority function which is used to select among ready tasks and messages is a critical path metric, modified for the particular goal of scheduling tasks mapped on distributed systems [17].
Let us consider a particular task τ ij selected from the ready list to be scheduled. We consider that ASAP ij is the earliest time moment which satisfies the condition that all preceding activities (tasks or messages) of τ ij in graph Γ i are finished and Processor(τ ij ) is free. The moment ALAP ij is the latest time when τ ij can be scheduled. With only the TT tasks in the system, the straightforward solution would be to schedule τ ij at ASAP ij . In our case, however, such a solution could have negative effects on the schedulability of ET tasks. What we have to do is to place task τ ij in such a position inside the interval [ASAP ij , ALAP ij ] that the chance to finally get a globally schedulable system is maximised.
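The schedule period T SS and the instance counts n i introduced at the beginning of this subsection follow directly from the graph periods. A minimal sketch, with hypothetical function names and example periods:

```python
import math
from functools import reduce
from typing import Dict, Iterable


def hyperperiod(periods: Iterable[int]) -> int:
    """T_SS = LCM(T_1, ..., T_N), computed pairwise via the GCD."""
    return reduce(lambda a, b: a * b // math.gcd(a, b), periods, 1)


def instance_counts(periods: Dict[str, int]) -> Dict[str, int]:
    """n_i = T_SS / T_i for every TT task graph."""
    t_ss = hyperperiod(periods.values())
    return {graph: t_ss // t for graph, t in periods.items()}
```

For periods 20, 30 and 50 the static schedule spans 300 time units and contains 15, 10 and 6 instances of the respective graphs, which shows how quickly the schedule table can grow when periods are not harmonic.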
In order to consider only a limited number of possible positions for the start time of a TT task τ ij , we take into account the information obtained from the schedulability analysis described in Section 3.1, which allows us to compute the response times of ET tasks. We started from the obvious observation that statically scheduling τ ij after an ET task τ kl has finished its execution will guarantee that task τ ij will not interfere with τ kl . Thus, we consider as alternative start times for τ ij the response times of all ET tasks which finish their execution inside the [ASAP ij , ALAP ij ] interval, together with ASAP ij itself.
The moment ASAP ij is included in alternative_start_times so that the set of alternative start times of a TT task will not be empty even if no ET tasks finish their execution during the interval [ASAP ij , ALAP ij ].
We illustrate the choice of possible start times of a TT task τ ij in Figure 5, which involves three ET tasks.
Statically scheduling τ ij at time R kl avoids the interference of τ ij with τ kl , while scheduling τ ij even later, at R k,l+1 , will guarantee that τ ij does not interfere with either τ kl or τ k,l+1 .
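The construction of the candidate set can be sketched as below. The sketch assumes that the ET response times have already been converted to absolute finishing instants on the timeline of the static schedule; the function signature is hypothetical.

```python
from typing import Iterable, List


def alternative_start_times(asap: int, alap: int, et_finish_times: Iterable[int]) -> List[int]:
    """Candidate start times for a TT task: ASAP plus every ET finishing
    instant that falls inside [ASAP, ALAP]."""
    candidates = {asap}  # guarantees a non-empty result
    candidates.update(r for r in et_finish_times if asap <= r <= alap)
    return sorted(candidates)
```

For ASAP = 10, ALAP = 30 and ET finishing instants 5, 12, 25 and 40, the candidates are 10, 12 and 25; the instants outside the interval are discarded.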
After identifying the set of candidate start times of a task, we have to select one of them as the static schedule for that task. Two aspects have to be considered in this context: 1. The interference with the ET activities should be minimised; 2. The deadlines of TT activities should be satisfied.
In order to evaluate the first goal, the value of the function DSch (see Section 3.1) is computed for each alternative start time t after performing the schedulability analysis of the ET task set considering the influence from the TT tasks, with τ ij scheduled at t. As will be shown in the following section, a global cost function is computed, which combines both goals defined above, and, based on a greedy approach, the start time of the task will be selected.
Figure 5. Selection of Alternative Start Times (horizontal axis: time on Processor(τ ij ))
The scheduling algorithm is presented in Figure 6 . If the selected TT activity extracted from the ready_list is a task τ ij , then the alternative_start_times are evaluated and the algorithm selects the one which generates the smallest value of the cost function. When scheduling an ST message extracted from the ready list, we place it into the first bus-slot associated with the sender node in which there is sufficient space available. If all TT tasks and ST messages have been scheduled and the schedulability analysis for the ET tasks indicates DSch ≤ 0, then the global system scheduling has succeeded.
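The placement rule for ST messages can be sketched as a first-fit search over the sender's slots. The `Slot` class, its fields, and the `ready_at` parameter (the time at which the message becomes available, so that earlier slots are skipped) are assumptions introduced for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    owner: str      # node that owns this static slot
    start: int      # start time of the slot on the schedule timeline
    capacity: int   # amount of data (or time) the slot can carry
    used: int = 0   # amount already allocated to ST messages


def place_st_message(slots: List[Slot], sender: str, size: int, ready_at: int) -> Optional[Slot]:
    """Put the message into the first slot of the sender node that starts no
    earlier than ready_at and still has enough free space; None if no slot fits."""
    for slot in sorted(slots, key=lambda s: s.start):
        if slot.owner == sender and slot.start >= ready_at and slot.capacity - slot.used >= size:
            slot.used += size
            return slot
    return None
```

A return value of None corresponds to a failed scheduling step, which is where a backtracking mechanism or a different bus configuration would come into play.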
For the case in which no correct schedule has been produced, we have implemented a backtracking mechanism in the list scheduling algorithm, which allows it to return to previous scheduling steps and to try alternative solutions. In order to avoid excessive scheduling times, the maximum number of backtracking steps can be limited.
In the following subsections we present three alternative ways to compute the cost function which drives the heuristic in Figure 6 . The three alternatives are identified as MxS1, MxS2 and MxS3 (from mixed scheduling).
MxS1
Scheduling a TT task τ ij inside its [ASAP ij , ALAP ij ] interval will, of course, guarantee that deadlines related to this particular task are satisfied, and that there exists the possibility that a valid static schedule can be constructed for the system. However, due to the data dependencies, scheduling τ ij later inside [ASAP ij , ALAP ij ] decreases the probability of finding a feasible static schedule for the tasks further down. This is why, for the evaluation of the alternative start times of a TT task τ ij (line 08 in Figure 6 ), we introduced a cost function which combines the degree of schedulability of the ET activities (DSch in Section 3.1) with a second metric which captures the "risks" taken by scheduling τ ij at later times:
where t is one of the alternative start times, A and B are normalization constants, and slack(t, τ ij ) 
Compute H ET as the sum of lengths of each of the intervals in .
It is easy to notice that if the slack has a very small value (even negative), then the first term in function f (the one depending on time t) has a much greater weight on the value of f. Consequently, earlier start times for τ ij will be preferred. On the other hand, if there is more available processor time than needed (in other words, slack has a high value), the function f will depend mainly on the value of the second term; thus, the main aspect taken into consideration will be the influence of TT activities on the ET ones, which is captured by DSch.
The static scheduling algorithm will select, among the alternative start times, that time t for which the value of the cost function f is minimum.
MxS2
The schedulability analysis algorithm described in Section 3.1 is applied very often during the static scheduling procedure presented in Figure 6, both in order to compute the values of the possible start times (line 04) and the Cost associated with each such start time (line 08). In order to reduce the amount of time needed for scheduling, we experimented with an algorithm which uses the schedulability analysis only for determining the set of start_times (line 04), while the evaluation process in line 08 is based on a simpler version of function f. In MxS1, when the alternative start times of a TT task are evaluated, running the schedulability analysis returns the value DSch which reflects the amount of new interference that has been introduced in the ET subsystem at a global level. The simpler function f, which we use in this second algorithm, avoids calling the global schedulability analysis for each possible start time of a TT task τ ij and considers only the interferences produced by τ ij on the ET tasks mapped on Processor(τ ij ):
where the value of DSch (as expressed in Section 3.1) is computed (on line 4, Figure 6 ) before τ ij has been scheduled, and ∆DSch is the amount of interference introduced by τ ij on the ET tasks mapped on Processor(τ ij ):
Here, R kl is the response time of an ET task τ kl before τ ij has been scheduled and R' kl is an approximation of the response time of τ kl after τ ij has been scheduled at time t.
We estimate that, depending on the time t when a TT task τ ij is scheduled, the response time of an ET task τ kl mapped on the same Processor(τ ij ) either remains unchanged (is not influenced at all) or is increased by a value of up to the worst case execution time C ij of the TT task. Figure 7 shows the situations in which the response time of an ET task τ kl remains unchanged and those in which it is increased because of the influence of a TT task τ ij . The cases represented in Figure 7.a show that when a TT task τ ij is scheduled at time t so that its associated execution interval [t, t + C ij ] does not intersect the time interval in which an ET task executes in the worst case, [φ kl , R kl ], then we estimate that after scheduling τ ij at t, the response time for τ kl will be the same, R' kl = R kl . However, if the intersection is not empty (as in the cases in Figure 7.b), then R' kl = R kl + ∆DSch kl . The value for the increment used in the function f'(t, τ ij ) will be computed as ∆DSch = Σ ∆DSch kl , for all τ kl in the ET domain with Processor(τ kl ) = Processor(τ ij ).
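The interval test behind this estimate can be sketched as follows. Since the text only bounds the increment by C ij , the sketch pessimistically charges the full C ij to every ET task whose worst-case execution interval is intersected; this choice, the treatment of intervals that merely touch as non-intersecting, and the function names are assumptions made for illustration.

```python
from typing import List, Tuple


def intervals_intersect(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if the two intervals overlap by a non-zero amount."""
    return a_start < b_end and b_start < a_end


def estimated_increment(t: int, c_ij: int, et_tasks: List[Tuple[int, int]]) -> int:
    """Upper-bound estimate of the total response-time increase caused by a TT
    task with execution time c_ij started at t.

    et_tasks: (phi_kl, R_kl) pairs for the ET tasks on the same processor,
              i.e. the interval in which each task may execute in the worst case.
    """
    return sum(
        c_ij
        for phi_kl, r_kl in et_tasks
        if intervals_intersect(t, t + c_ij, phi_kl, r_kl)
    )
```

Because this estimate needs only interval comparisons, it can be evaluated for every candidate start time without rerunning the schedulability analysis.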
In MxS2, the schedulability analysis of the system is called only once for each TT task (step 04), which will lead, as we will see in Section 6, to faster computation times.
MxS3
List scheduling, which is the basis for our scheduling algorithm, is a constructive method that builds the static schedule table incrementally, by adding one TT task or ST message at a time. In the previous two versions of the algorithm (Section 3.2.1 and Section 3.2.2), at each step, the effect of the static schedule, including the newly introduced task, on the ET subsystem is measured by function DSch. However, the problem is that the available static schedule is not complete when estimating, for a given task τ ij , the global influence of TT activities on the set of ET ones. For the alternatives MxS1 and MxS2 we have chosen the following simple approach: for evaluating the influence of the decision of which alternative start time to select for τ ij , we consider only that part of the static schedule which already has been built, up to that particular moment. The selection is fair, as the same conditions are applied to all alternative times; however, it is inaccurate, since a part of the final static schedule is ignored when taking the decision. For the alternative MxS3 we have considered a solution which tries to improve on this lack of accuracy by considering the whole set of TT activities when evaluating the degree of schedulability of the ET tasks and messages. This is solved by considering an approximate static schedule for the yet unscheduled TT activities. Therefore, a preliminary step is performed in preparation of the algorithm in Figure 6 .
First, we build an initial static schedule by using a simpler and faster version of the algorithm in Figure 6 . In this version, the response times of the ET activities are computed only once in the beginning of the algorithm and the evaluation of possible start times is performed using a simple function like in MxS2. This step allows us to rapidly obtain a static schedule which will be at the basis of the second step of our approach.
After the preparation step, we run the algorithm in Figure 6 , but whenever schedulability analysis of the ET subsystem is performed, we consider that the interfering static schedule not only contains the TT activities which were scheduled so far, but all the TT tasks and ST messages in the system. We obtain such a complete static schedule by considering:
• the start times of the TT tasks/ST messages scheduled so far in this second step;
• for the unscheduled TT tasks/ST messages, their start times as identified in the first step of the algorithm.
Figure 8 illustrates the way we obtain such a complete static schedule. The static schedule considered during the schedulability analysis of the ET subsystem (Figure 8.c) contains all the TT tasks and ST messages in the system. Such a complete schedule is obtained by putting together the two parts listed above.
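The combination of the two partial schedules can be sketched as a simple merge in which decisions taken in the second step override the preliminary ones. Representing a schedule as a mapping from activity name to start time is an assumption of this illustration.

```python
from typing import Dict


def complete_static_schedule(preliminary: Dict[str, int],
                             scheduled_so_far: Dict[str, int]) -> Dict[str, int]:
    """Start times of all TT tasks and ST messages: the second-step value where
    one exists, otherwise the value from the preliminary (first-step) schedule."""
    complete = dict(preliminary)
    complete.update(scheduled_so_far)
    return complete
```

As the second step proceeds, more entries come from `scheduled_so_far`, so the schedule seen by the ET analysis gradually converges to the final one.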
System Optimization
Considering a hard real-time system like the one described in Section 2, several design problems emerge. There are, of course, the classical issues such as the selection of an architecture (e.g. number and kind of nodes), the mapping of tasks on the processing nodes, or the assignment of priorities to ET tasks and DYN messages [1], [9], [25]. However, due to the heterogeneous ET and TT nature of the application and the mixed static/dynamic bus protocol, some new, very interesting problems can be identified:
• Partitioning of the system functionality into TT and ET activities. During the design process, a decision should be made on which tasks and messages will be implemented as TT/ET and ST/DYN activities, respectively. Typically, this decision is taken, based on the experience and preferences of the designer, considering aspects like the functionality implemented by the task, the hardness of the constraints, sensitivity to jitter, etc. There exists, however, a subset of tasks/messages which could be assigned to any of the domains. Decisions concerning the partitioning of this set of activities can lead to various tradeoffs concerning, for example, the size of the schedule table or the schedulability properties of the system.
• Determining the optimal structure of the bus access cycle. The configuration of the bus access cycle has a strong impact on the global performance of the system. The parameters of this cycle have to be optimised such that they fit the particular application and the timing requirements at the task level. Parameters to be optimised are the number of static and dynamic phases during a communication cycle, as well as the length and order of these phases. Considering the static phases, parameters to be fixed are the order, number, and length of slots assigned to the different nodes.
The optimization problems identified above can be approached once the holistic scheduling technique presented in Section 3 is available. In the next section we illustrate this by considering a particular problem related to bus access optimization.
Bus Access Optimization
We consider an application and an architecture like the one described in Section 2. The designer has mapped the tasks on the nodes of the system and has set the bus cycle to the best of their knowledge. After running the holistic scheduling presented in Section 3, it turns out that a correct static schedule for the TT tasks and ST messages has been generated, but the ET task set is not schedulable. One of the reasons for this could be that insufficient bandwidth is allocated for the communication of messages between ET tasks. The problem to be solved is to find a structure of the bus cycle such that more bandwidth is allocated to the dynamic phases, in order to improve the schedulability of the ET tasks while maintaining a correct static schedule.
As a first step, the optimization algorithm transforms some parts of the static phases into dynamic phases. For each static slot in the bus cycle and for each round in the static schedule we transform the periodically unused part of the slot into a dynamic phase (see Figure 9 ).
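A minimal sketch of this first step for a single static slot is given below. It rests on an assumption that should be stated plainly: "periodically unused" is read here as the part of the slot that carries no ST message in any round of the static schedule. The function name `trim_static_slot` and the list `used_per_round` are hypothetical.

```python
def trim_static_slot(slot_length, used_per_round):
    """Return (static_length, dynamic_length) for one static slot.

    slot_length:    length of the slot in the original bus cycle.
    used_per_round: part of the slot occupied by ST messages in each round
                    of the static schedule (same unit as slot_length).
    Assumption: only the part that is unused in every round is released.
    """
    needed = max(used_per_round, default=0)  # largest demand over all rounds
    if needed > slot_length:
        raise ValueError("static schedule does not fit into the slot")
    return needed, slot_length - needed


# A slot of length 8 that never carries more than 5 units of ST traffic.
print(trim_static_slot(8, [5, 3, 0, 4]))
```

Under this reading, the static schedule remains valid after the transformation, since every ST message still fits into the shortened slot in every round.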
After this initial step, various bus cycle configurations are explored by splitting and merging bus phases. Figure 10 illustrates the operations on dynamic phases. Three possible outcomes are shown for both the splitting and the merging example. We have implemented a simulated annealing based algorithm which applies successive splitting and merging transformations with the goal of improving the schedulability of the ET task set, under the constraint of achieving a correct static schedule for the TT tasks. The objective function driving the algorithm is the function DSch introduced in Section 3.1.
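The exploration loop can be illustrated with a generic simulated annealing sketch in Python. It is not the authors' implementation: the bus cycle is modelled as a hypothetical list of `(kind, length)` pairs, the moves `split` and `merge` are simplified versions of the operations in Figure 10, and `toy_cost` is a stand-in for DSch, which in the real flow would require re-running the holistic scheduling for every candidate (an infeasible static schedule could, for instance, be mapped to a very large cost).

```python
import math
import random


def split(cycle, rng):
    """Split one dynamic phase in two and move the second part elsewhere."""
    idx = [i for i, (kind, length) in enumerate(cycle)
           if kind == "DYN" and length >= 2]
    if not idx:
        return cycle
    i = rng.choice(idx)
    length = cycle[i][1]
    cut = rng.randint(1, length - 1)
    new = cycle[:i] + [("DYN", cut)] + cycle[i + 1:]
    new.insert(rng.randint(0, len(new)), ("DYN", length - cut))
    return new


def merge(cycle, rng):
    """Merge two dynamic phases into one, kept at the position of the first."""
    idx = [i for i, (kind, _) in enumerate(cycle) if kind == "DYN"]
    if len(idx) < 2:
        return cycle
    i, j = sorted(rng.sample(idx, 2))
    merged = ("DYN", cycle[i][1] + cycle[j][1])
    return cycle[:i] + [merged] + cycle[i + 1:j] + cycle[j + 1:]


def toy_cost(cycle):
    """Hypothetical stand-in for DSch: longest stretch without a dynamic phase."""
    longest = run = 0
    for kind, length in cycle + cycle:  # unrolled twice to cover wrap-around
        run = run + length if kind == "ST" else 0
        longest = max(longest, run)
    return longest


def anneal(cycle, cost, steps=2000, t0=10.0, alpha=0.995, seed=0):
    """Generic simulated annealing over bus cycle configurations."""
    rng = random.Random(seed)
    current, current_cost = cycle, cost(cycle)
    best, best_cost = current, current_cost
    temp = t0
    for _ in range(steps):
        candidate = rng.choice((split, merge))(current, rng)
        delta = cost(candidate) - current_cost
        # Always accept improvements; accept degradations with a
        # probability that shrinks as the temperature decreases.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= alpha
    return best, best_cost


initial = [("ST", 4), ("ST", 4), ("ST", 4), ("ST", 4), ("DYN", 16)]
best, best_cost = anneal(initial, toy_cost)
print(best_cost, "<=", toy_cost(initial))
```

Both moves preserve the total dynamic bandwidth and the order of the static slots, so only the number and placement of the dynamic phases change. The cooling parameters `t0` and `alpha` are arbitrary illustrative values.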
Experimental Results
For the evaluation of our scheduling and analysis algorithm we generated a set of 3600 tests representing systems of 2 to 10 nodes. The number of tasks mapped on each node varied between 10 and 30, leading to applications with 20 to 300 tasks. The tasks were grouped in task graphs of 5, 10 or 15 tasks. Between 20% and 80% of the total number of tasks were considered as event-triggered and the rest were set as time-triggered. The execution times of the tasks were generated in such a way that the utilization on each processor was between 20% and 80%. In a similar manner we ensured that between 20% and 60% of the total utilization on a processor is required by the ET activities. All experiments were run on an AMD Athlon 850MHz PC.
Figure 10. Operations on Dynamic Phases
The first set of experiments compares the three versions of the holistic scheduling algorithm we proposed in Section 3.2. In Figure 11.a) we illustrate the capacity of MxS1 and MxS2 to produce schedulable systems, compared to that of MxS3. For example, in the case of a 60% load, MxS2 generated 18% and MxS1 16% fewer schedulable solutions than MxS3. In addition, for each heuristic, we computed the quality of the identified solutions as the percentage deviation of the schedulability degree ($DSch_{MxS}$) of the ET activities in the resulting system, relative to the schedulability degree of an ideal solution ($DSch_{ref}$) in which the static schedule does not interfere at all with the execution of the ET activities:
In other words, we used the function DSch as a measure of the interference introduced by the TT activities on the execution of ET activities. In Figure 11.b) we present the average quality of the solutions found by the three algorithms. For this graph we used only those results where all three algorithms managed to find a schedulable solution. The solutions obtained with MxS3 are consistently at a minimal level of interference. The heuristics MxS1 and MxS2 produce solutions in which the TT interference is considerably higher, resulting in significantly larger response times of the ET activities and consequently in a decrease of the schedulability degree by 7-13%. As expected, our experiments show that the heuristic MxS3 is the most accurate and consequently produces results of the best quality. MxS2, which uses a local approximation for the evaluation of the ET schedulability, has a slightly lower quality than MxS1.
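As a small worked illustration of this quality measure, the percentage deviation described above can be computed as follows. The function name `interference` and the numeric values are hypothetical; the sign of the result depends on the sign convention of DSch defined in Section 3.1, which this sketch does not attempt to reproduce.

```python
def interference(dsch_mxs, dsch_ref):
    """Percentage deviation of dsch_mxs relative to dsch_ref."""
    if dsch_ref == 0:
        raise ValueError("reference schedulability degree must be non-zero")
    return 100.0 * (dsch_mxs - dsch_ref) / dsch_ref


# Hypothetical values: a heuristic result of 110 against an ideal 100.
print(interference(110.0, 100.0))
```

A value of zero means that the static schedule produced by the heuristic does not disturb the ET activities any more than the ideal reference does.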
In Figure 12 we illustrate the average execution times of the three scheduling heuristics. As expected, MxS2 is the fastest of the three heuristics, while MxS3 is slightly slower than MxS1. In conclusion, the heuristic MxS3 is the one which offers the best solutions at an acceptable computation time. MxS2 is very fast and can be used in certain particular cases, for example inside a design space exploration loop with an extremely large number of iterations. MxS3 has been used for the set of experiments presented in the rest of this section.
For the evaluation of the bus access optimization heuristic in Section 5, we generated a total of 400 applications, each of them consisting of 100 tasks mapped on 10 processor nodes. The percentage of ET tasks was 20%, 40%, 60%, or 80% of the total number of tasks. Processor utilisation was 60% or 80%. The bus bandwidth was equally divided between the dynamic and the static phases, and the static phase was equally divided into a number of slots equal to the number of nodes. This set of experiments concerns the potential of the bus access optimization discussed in Section 5. For this purpose we selected those generated applications for which the ET component turned out to be unschedulable. Table 1 shows the results after running our optimization heuristic on this application set. As can be observed, the average improvement of the schedulability obtained by bus access optimization is between 22% and 29% for the tests with balanced numbers of ET and TT activities (the two central columns), with an average optimization time below 6 minutes. For unbalanced distributions, the improvement can be considerably larger. As discussed in Section 5, these improvements have been obtained considering only a very limited optimization issue, namely the distribution of bandwidth between the static and the dynamic phases. This demonstrates the large optimization potential of the different design problems discussed in Section 4.
Finally, we considered a real-life example implementing a vehicle cruise controller and a control application related to the Anti Blocking System. The cruise controller consists of 42 TT tasks mapped over 5 nodes. The second control system consists of 30 ET tasks which are mapped on 3 of the same 5 nodes. Initially, the bandwidth on the communication bus is equally divided between the static and dynamic phases. The scheduling of the system took 4 seconds and resulted in a correct static schedule and an unschedulable ET domain. After running the bus access optimization, the schedulability (expressed in terms of the function DSch) improved by more than one order of magnitude, resulting in a completely schedulable system. The optimization completed in approximately 4 minutes.
$$\mathrm{Interference} = \frac{DSch_{MxS} - DSch_{ref}}{DSch_{ref}} \cdot 100$$
Conclusions
Distributed embedded systems based on mixed static/dynamic communication protocols are becoming a new standard for automotive applications. Such systems typically run applications consisting of both ET and TT tasks. We have presented a holistic scheduling and timing analysis approach for this class of systems. A static cyclic schedule is constructed for TT tasks and ST messages, and the schedulability of ET tasks and DYN messages is verified. The static schedule is constructed in such a way that it fits the schedulability requirements of the ET domain. We have identified a new class of system optimization issues typical for the heterogeneous systems considered in the paper. In particular, we have considered a bus access optimization problem and have shown that the system performance can be improved by carefully adapting the bus cycle to the particular requirements of the application.
