We 
Introduction
Modern real-time systems are increasingly becoming distributed and heterogeneous. They consist of multiple processing elements, hardware accelerators, memory units and communication subsystems. Each of these components run tasks that have very diverse execution/communication requirements and timing constraints. They support different scheduling and resource arbitration policies and often interact with the physical world (via sensors and as actuators). As a result, timing and performance analysis of such systems is a challenging problem.
There has been previous efforts to use multiple languages and formalisms for specifying such heterogeneous real-time and embedded systems (e.g. see [20] ). However, the issue of composing different analysis techniques has not been sufficiently explored so far. Recently, there has been some progress on this front. For example, [10] outlined a scheme where some of the components of a system were analyzed using purely simulation-oriented techniques (based on SystemC models), while the remaining components were modeled and analyzed using a set of algebraic equations based on the formalism called Real-Time Calculus (RTC) [4] . The main contribution of [10] was an interface between these two classes of components at the analysis level. While the RTC-based analysis represented the arrival patterns of data and event streams in an abstract fashion, the SystemC-based models required concrete input traces to trigger the simulation. To compose the two analysis techniques, one must be able to inter-convert the two models. Deriving abstract representations of arrival patterns from concrete traces (the SystemC-to-RTC conversion) is relatively straightforward (e.g. by using standard event models such periodic with jitter). However, a transformation in the other direction, i.e. deriving a small set of representative traces from an abstract event model (the RTC-to-SystenC conversion) is a challenging problem, the details of which may be found in [10] . A similar approach was presented in [16] , where certain system components were modeled and analyzed using SDF graphs [9, 11] with the remaining ones being subjected to classical real-time schedulability analysis techniques.
Contributions of this paper:
Following this line of work, in this paper we propose an interfacing technique to compose purely functional analysis schemes with state-based modeling and analysis methods. While the former are based on solving a set of algebraic equations, the latter rely on reachability analysis on the state space of an automatatheoretic model [7] . The systems we analyze are made up of a number of different components that communicate via unidirectional data or event streams which might be buffered. Each of these components is either a processing or a communication element. They support concurrent tasks which might have variable execution demands and process one or more incoming event/data streams.
Suppose a specification of such a system is given in terms of its architecture, the tasks mapped onto each of the components, their execution demands and the scheduling policies used. Further, suppose the timing properties of each incoming stream have also been specified. The goal then is to compute performance metrics such as the maximum endto-end delay suffered by the stream, minimum buffer requirements and the utilization of the different components. The Real-Time Calculus (RTC) framework was proposed earlier to analyze such setups in [4] , which was further extended in subsequent papers (e.g. see [5, 18] ). The key feature of this framework is that it allows a very general modeling of event streams and resource availability beyond the classical event models such as periodic, sporadic, periodic with jitter, etc. Another important feature -which is particularly relevant for performance analysis -is that, rather than recording the precise arrival times of events, RTC uses a count-based abstraction which specifies upper and lower bounds on the possible number of events that can arrive (or can get processed) within any pre-specified time interval length. This count-based abstraction captures in a very natural way, bursty arrival patterns of events/data, variable execution demands and irregular resource availability.
RTC-based models use upper and lower bounds on the number of events that arrive at a component (processing or communication element) to get processed within any time interval of a specified length. Such bounds are represented as functions and are used to estimate the workload to be supported by the component. Similar functions are also used to represent the service offered by the component. While such functional modeling allows for efficient analysis, the main drawback of this approach is that it can not model any statebased information. For example, a common feature such as "the processor stalls when the output buffer is full" cannot be modeled since the service offered by the processor in this case depends on the state (fill level) of the buffer.
To get around this problem, but at the same time retain the count-based abstraction, in [6] we proposed an automata-theoretic model called Event-Count Automata (ECA). ECAs are syntactically similar to timed-automata [1] but have very different semantics. ECA models represent families of arrival patterns of data/event streams as an automaton whose language is the set of integer sequences representing all possible permissible arrival traces. The service offered by a component is also captured in a similar fashion using an automaton. Performance/timing analysis questions related to this model can be formulated as standard model checking problems [3, 8] . This model easily and transparently overcomes the limitations of the RTC framework in terms of expressiveness and the set of analyzable properties. However, as is the case with most state-based modeling techniques, it suffers from the crippling state explosion problem.
In this paper we propose to combine the advantages of RTC and ECA-based modeling. The idea is to subject the components that do not require state-based information to RTC-based modeling analysis and the remaining to ECA-
Model System Architecture P2 Figure 1 . Overview of the proposed scheme. based modeling and analysis. The main contribution of this paper is a combined system-level analysis method derived by creating interfaces between these two types of models.
To bring out the basic ideas underlying our method, consider the architecture shown in Figure 1 . Suppose the processing element P 1 has been modeled as an ECA, perhaps due to the fact the number of items it processes in unit time depends on the fill level of its output buffer B2. Suppose further, the arrival pattern of the data stream being buffered in B1 as well as the pattern of service available at the processing element for this task have been modeled in the RTC framework as the arrival curve α and the service curve β respectively (definitions of which are outlined later in the paper). Finally, assume that items arriving in B2 are further processed by the processor P 2 which has been modeled as an RTC component whose semantics is captured by the function f P 2 .
We first use an RTC-ECA interface to convert α (β) into ECA α (ECA β ). We then analyze the performance of the small network of ECAs consisting of ECA P 1 , ECA α and ECA β . We note that there is no buffering between ECA β and ECA P 1 and in fact, these two automata will be collapsed into a single ECA in the performance model. Next we use an ECA-RTC interface to convert the event stream being buffered at B2 by the ECA network to produce the arrival curve α . This is then combined with f P 2 to perform an RTC-level analysis at the next stage.
In this way, we switch between ECA-based and RTCbased modeling and analysis to obtain a system-level performance analysis method. In a heterogeneous distributed system where there is a good mix of RTC and ECA components, our method will avoid the state explosion that would be caused by converting the whole system into a large network of ECAs. At the same time, it would avoid the accuracy penalty incurred if the system were to be modeled as large network of RTC components. It could also be the case that the pure RTC alternative is not available due to the fact that the semantics of some of the processing elements are state-dependent or because the performance property we are validating cannot be formulated in the RTC framework.
Before concluding this section, we would like to point out that our approach is orthogonal to the previous efforts on full-system performance analysis, for example, using holistic schedulability analysis [2, 12, 13, 14, 17] . Although these efforts were also directed towards analyzing systems with multiple resources and scheduling policies, the underlying analysis techniques used for the different components were similar and were mostly based on classical event models. Our goal, on the other hand, is to combine two fundamentally different performance models for system-level analysis.
Organization of this paper:
The rest of this paper is organized as follows. In the next section, we briefly outline the RTC framework, following which we describe ECAbased modeling and analysis. Sections 3 and 4 contain the core technical results of this paper, i.e. the composition of RTC and ECA models and the construction of the interfaces between them. Finally, the advantages of this combined method are brought out in Section 5 with the help of some experimental results. The paper concludes by outlining some directions for future work.
Performance Models
This section outlines the Real-Time Calculus (RTC) and Event Count Automata (ECA) based modeling and analysis. Figure 2 gives a high-level overview of both the schemes. It shows how a processing element processing an event/data stream is modeled. At the model level, load models describe the timing characteristics of the event streams. Since the analysis aims at providing results for a family of possible input streams instead of a (single) concrete event stream, this information must be represented in an abstract way (e.g. using a periodic or periodic with burst or an even general event model). Service models provide information about the resources available within a system (e.g. their capacities like processor or bus bandwidth). Lastly, processing models represent the processing semantics (e.g. the scheduling and arbitration policies) that is used to execute the application tasks. A performance model of a whole system is obtained by connecting the load models and service models to the corresponding models of processing components. The main distinction between RTC and ECA-based models is that while the former uses functions to represent load, service and processing models, the later relies on explicit state-based (automata-theoretic) modeling. 
Real-Time Calculus
Here, an event stream is described using upper and lower arrival curves, α u (∆) and α l (∆), which provide upper and lower bounds on the number of events that are seen on the event stream within any time interval of length ∆. In particular, there are at most α u (∆) and at least α l (∆) events within the time interval [t, t + ∆) for all time instances t. If R(t) denotes the number of events that arrive at a resource during the time interval
Analogously, a resource is characterized using upper and lower service curves, β u (∆) and β l (∆), which provide upper and lower bounds on the available resource in any time interval of length ∆. The unit of resource depends on the kind of shared resource, for example processing cycles (computation) or bytes (communication), which can be transformed to the same unit as the arrival curves (number of events). The formulation presented in this paper assumes the service curves and arrival curves are being both expressed in terms of number of events. A more technical definition of arrival and service curves is given in the next section. The processing semantics of a processing/communication component (e.g. a processor or a bus) can be captured using an RTC component as shown on Figure 3 . Here, an arrival curve α(∆) enters the component and is processed using the service curve β(∆). The output is an arrival curve α (∆) (which denotes the arrival pattern of the processed events) and the remaining resource is expressed as the service curve β (∆). Internally, the RTC component is specified by a set of functions, that relate the incoming arrival and service curves to the outgoing arrival and service curves. These functions are dependent on the processing semantics of the component (see e.g. [4] ). Similar representations exist for resource scheduling disciplines such as EDF, TDMA, and GPS, when multiple streams or tasks are being processed by the component. The outgoing arrival curves might serve as input to another component, denoting further processing of the event stream. Similarly, the outgoing service curve represents the service that is available to other tasks running on the same component. Various performance properties can be computed analytically using this performance model such as end-to-end delays and buffer requirements. These methods have been implemented as a toolbox which is based on Matlab and Java (see [19] ).
Event Count Automata
An Event Count Automaton (ECA) too describes arrival patterns of event streams (as well as arbitrary service patterns). It may be viewed as a device which records the arrival pattern of a stream and decides to accept or reject the stream. The accepted streams constitute the family of arrival patterns specified by the ECA. An ECA is an ordinary automaton augmented with count variables. These are integer variables that are used to keep track of the number of events that are seen on an event stream (or the number of events that can be processed by the available resource) over certain time intervals. In particular, an ECA consists of a set of states (modes), a set of count variables, and a transition relation that specifies possible transitions from one state to another and a guard associated with each transition which must be satisfied if the transition is to be taken. Each state has a rate vector [l, u] , which specifies that at least l and at most u events arrive in every unit of time when the system is in the state. The guard associated to a transition is a logical conjunction of formulas of the form x ≤ C or x ≥ C, where x is a count variable of the ECA and C is a constant.
ECAs take a transition at discrete time instances. During every unit interval, an ECA stays at its current state and counts the number of events that arrive from the event stream associated with it. At the end of the current unit interval, it increases all its count variables by the number of events that have arrived in this interval. It then checks the guards associated with the out-going transitions from the current state. If one or more of the guards evaluates to true, the ECA will take -instantaneously and nondeterministically-one of the enabled transitions to the next state. If no transition is enabled then, it will stay put in the current state. During this move, it may reset some of its count variables (as specified in the transition) to zero. The purpose of the resets is to adjust the time interval over which the count variable keeps the information. Hence, at any instant, the current value of the count variable x denotes the number of events that have arrived since x was last reset. For modeling convenience, a count variable may also be reset to a non-zero value determined by a linear expression over the count variables. Figure 4 shows an ECA that models an event stream that has at most 7 events arriving in every unit time interval when in mode S 0 , at least 1 event and at most 6 events arrive per unit interval when in mode S 1 and exactly 5 events per unit interval arrive when in mode S 2 . The behavior of an ECA that describes a resource is similar, except that the ECA counts the number of resource units (e.g. processor/bus cycles) available instead of the events seen on a stream. A run of the automaton will consist of a sequence of the form s 0 n 1 s 1 n 2 . . . n k s k where s o is an initial state and n i is the number of events that have arrived during the interval [i − 1, i). If s k is an accepting state, then we say Figure 4 . An example ECA. that the ECA "accepts" the event stream n 1 n 2 . . . n k . It is in this sense, that an ECA defines a family of event streams. More details about the syntax and semantics of ECAs may be found in [6] .
For system-level modeling, an ECA is augmented with a set of input and output buffers, and an update function using which items are removed from the input buffers and added to the output buffers during the execution of a transition. Various scheduling policies can be described using such update functions. The performance model consists of a network of ECAs (augmented with update functions) communicating with each other via shared buffers. Figure 5 depicts the ECA model of a system consisting of two input streams processed by a processor using some scheduling policy. The arrival patterns of the two streams are specified by the ECA 1 and ECA 2 while the semantics of P, the processing element, is captured by ECA P . Such ECA networks can be analyzed by converting them into Petri nets or might serve as input programs for any of the verification tools that perform efficient state space explorations to do model checking. 
Composing RTC and ECA Models
In this section, we describe our compositional method that combines RTC-based and ECA-based modeling and analysis into a single framework. The abstract model for each component in the system can be represented using either the RTC or the ECA formalism, depending on the processing semantics of the computing/communication resource in the component. The performance model of the whole system is obtained by connecting the components together to reflect the flow of data and resources, with interfaces between the RTC components and ECA components. The construction of the performance model for a given system consists of the following steps: (i) Identify all components of the system and based on the processing semantics of the components, select between RTC and ECA mod- (ii) Describe the components using the selected models. The construction of the component models is the same as outlined in Sections 2.1 and 2.2. (iii) Connect the component models using interfaces into a network that captures the performance aspects of the whole system.
Typically, RTC modeling is used for components in which the processing of the streams depends solely on known information about the event streams and the available computing resource. For example, components that process event streams based on their predefined priorities (e.g. rate monotonic, earliest deadline first) or sharing slots (e.g. TDMA). On the other hand, components in which the processing of the event streams depends on state information of the system or synchronization between the streams are modeled using ECAs. Examples of such components are the ones that implement blocking-read/write at some of the buffers, the ones in which the resource given to the streams are adjusted at run-time based on the fill-levels of the buffers (multi-level service) or ones that contain ANDtask activation (synchronization between multiple streams).
RTC and ECA components are connected through the RTC-ECA interfaces. An RTC-ECA interface converts RTC models into ECA models and ECA-RTC interfaces work the other way round. The algorithms for building these interfaces are explained in the next section.
An illustrative example:
We now present an example to illustrate typical hardware/software architectures that can be modeled and analyzed using our approach. Figure 6 shows an architecture consisting of three processing elements P E 1 , P E 2 and P E 3 onto which an MPEG-2 decoder has been partitioned and mapped. The architecture implements a picture-in-picture (PiP) application in which two concurrent video streams are being decoded and displayed on the same output device. The first and the second processing elements implement the variable length decoding (VLD) and inverse quantization (IQ) tasks of the decoder, each of which processes a different video stream. P E 3 implements the inverse discrete cosine transform (IDCT) and the motion compensation (MC) tasks. It processes both the output streams from P E 1 and P E 2 using a Fixed Priority scheduling algorithm.
The first stream (s HR in Figure 6 ) is associated with a higher frame resolution and generates a higher workload on P E 3 , compared to the lower resolution video s LR . The compressed bitstreams arrive over a network and enter the . system after some initial processing at the network interface. They are processed by P E 1 and P E 2 and their outputs, which contain partially decoded macroblocks, are fed as input to P E 3 . The fully decoded video streams after being processed by P E 3 are written into two playout buffers B HR and B LR associated with the high and low resolution streams respectively. These buffers are finally read by the display device at pre-specified constant frame rates r HR and r LR . Here, blocking-write is implemented at the play-out buffer associated with s HR . In particular, P E 3 is allowed to write into buffer B HR only if the buffer is not full. In this system, we are interested in answering questions such as (i) What is the maximum fill-level of the buffer b 4 ? (ii) What is the maximum end-to-end delay of the stream s LR ?
As can be seen from the architecture, the system consists of three components, corresponding to P E 1 , P E 2 and P E 3 . Since the processing of the input stream s HR on P E 1 is done in a greedy fashion and does not depend on any state information, we use RTC to describe the component. Similarly, RTC is used to describe the component corresponding to P E 2 . The processing on P E 3 , however, depends on the internal state of the output buffers, which cannot be modeled using RTC. Therefore, we use ECA to model this component.
Based on the above observations, the two input streams are modeled using arrival curves α HR and α LR . The two resulting arrival curves α 1 and α 2 are then transformed into ECA 1 and ECA 2 , respectively, using the RTC-ECA interface. Figure 7 sketches the performance model of the system. The maximum backlog at b 4 is computed using ECAbased analysis on the ECA component. The maximum endto-end delay of s LR is the summation of the end-to-end delay of α LR on P E 2 component and the end-to-end delay of ECA 2 on the ECA component. Clearly, various other analysis questions can also be answered based on this same performance model. In the next section, we present algorithms for building the above-mentioned interfaces between RTC and ECA components.
RTC-ECA Interfacing
This section presents the construction of the interfaces between RTC and ECA components. In particular, we will show how to construct: (i) from an arrival (service) curve, an ECA which models the same set of arrival (service) pat-terns described by the curve. This is done in Section 4.1.
(ii) from an ECA component, an arrival (service) curve which models all arrival (service) patterns of the output stream (remaining service) of the ECA component. This is done in Section 4.2.
From RTC to ECA
Our translation relies on a specific representation of arrival/service curves, which we first describe. A discrete arrival and service curve, say γ(∆), ∆ ∈ N ≥0 , can be compactly represented as a tuple v = {Σ A , Σ P } where y 1 ), (x 2 , y 2 ) , . . . , (x n , y n ) with x i < x i+1 is a segment sequence describing an initial aperiodic part of the curve γ(∆) for 0 < ∆ ≤ x n , and
is a segment sequence describing a periodic part of the curve γ(∆) that starts after the aperiodic part, i.e. ∆ > x n .
The periodic part of γ is defined by the period P = x p m − x n and the vertical offset E = y p m − y n between two consecutive repetitions of the periodic section defined by Σ P . The first occurrence of Σ P starts at (x n , y n ). The curve γ(∆) corresponding to v can now be determined as follows: The constraints given by an upper (lower) service curve corresponding to v are exactly the same as that of an upper (lower) arrival curve, except that the conditions are on the number of events that can be processed by the resource.
Given a compact representation v of an arrival (service) curve, we would like to construct an ECA A corresponding to the curve. In particular, an arrival (service) pattern of a given stream (resource) should satisfy the curve v if and only if it is an event stream accepted by A. Since guards on count variables are the means for an ECA to allow/reject a particular arrival pattern, we need to find a suitable set of count variables for A, the necessary and sufficient conditions on these variables under which an arrival pattern is said to satisfy v, i.e. accepted by A.
Constructing an ECA corresponding to an arrival curve: Let v = {Σ A , Σ P } be the compact representation of an arrival (service) curve as described above. The ECA that models the event streams con- 
R is the reset of the count variables associated with the transitions. The reset of the count variables f j and g h along each transition (s k , s k ) is defined as: Lemma 4.1. Let v be an arrival curve, and A be the corresponding ECA constructed using the procedure described above. Let τ = e 1 e 2 e 3 . . . be an event stream, with e i denoting the number of events that arrive in the time interval [i − 1, i). Then, τ satisfies the constraints given by v iff τ is accepted by A.
For the detailed technical basis of the above construction and a proof of this lemma, we refer the reader to www.comp.nus.edu.sg/∼phanthix/papers/rtss07.pdf. Figure 9 . The ECA model of the arrival curve v.
The ECA that models the upper service curve corresponding to v is exactly the same as the ECA which models an upper arrival curve v, with a different interpretationthe automaton counts the number of events which can be processed by the resource, and hence it models all possible service patterns described by v. As an example, Figure 9 shows the ECA corresponding to an arrival curve
With a minor modification, we can also construct an ECA that models a given lower arrival (service) curve. In particular, the ECA in this case has a similar structure as the one modeling the upper curve, with the same set of count variables. The guards and resets associated with the transitions are modified as follows: (i) Replace "max" by "min" in all the resets g h ← max(g h − E, f xn ). (ii) Replace the "≤" sign by "≥" in all the guards. The correctness of this construction follows the same reasoning as the one used for the upper bound case.
From ECA to RTC
Here, given an ECA component, we would like to compute a safe approximation 1 of the arrival curves (α u , α l ) of an output stream and the remaining service curves (β u , β l ) of a processing element from the component. For simplicity of exposition, the class of curves constructed in this section is a subset of the general curves presented in previous section. In particular, an approximated curve will have an initial aperiodic part and the periodic part will contain just a single entry (a long term period). This class of curves is also sufficient for representing most practical applications.
To reconstruct the arrival curves, we first introduce a specification size W , up to which the aperiodic part is defined. For each window of size ∆, 0 < ∆ ≤ W , we compute the maximal and minimal number of output events seen in each window, resulting in α u (∆) and α l (∆), respectively. 1 ECAs are more expressive than arrival/service curves. In addition, to capture the long-term behavior (long-term average arrival rate) of the stream, we select a window size W * which is large enough (W * W ) and compute the arrival curves for this window size. The arrival curves are then safely extended for arguments larger than W as follows:
These curves can now be refined by guaranteeing superand subadditivity (i.e.
. To this end, we perform a stepwise update of the upper and lower arrival curves for all ∆, starting at ∆ = 2:
The computation of the remaining service curves is done in the same manner by observing the service left from the component instead of the output stream. We compute the maximal and minimal number of output events seen in any interval of a specified size ∆ using a model checking procedure which we now outline. First we augment the ECA component with an ECA which observes the number of events written into the output buffer at every time unit. As shown in Figure 10 , this ECA has two states and two count variables t, s.
The count variable t is used as the timer whereas s is used to count the number of output events from the instant the timer is reset. The variable r on the figure is the number of output events written into the output buffer observed at every unit interval. Initially, the observer ECA is in "idle" state. At the end of every time unit, it non-deterministically chooses to stay put at the current state or move to the "busy" state.
When the ECA moves to the "busy" state, it resets the count variable t (the timer) and s. When the ECA is in "busy" state, it increases s by the value of r (the number of output events observed) at the end of each time unit. When ∆ time units have elapsed, the ECA moves back to the "idle" state. Since the rate vector associated with both states is [0, 0], the ECA generates no events and therefore when t = ∆, s contains the number of events of the output stream in an interval of length ∆. Since the interval is non-deterministically chosen by the ECA, the maximum and minimum values of s are the maximum and minimum number of events in any interval of length ∆, or α u (∆) and α l (∆), respectively. Using standard verification tools, we can verify if s is at least C 1 and at most C 2 , where C 1 ≤ C 2 are two prespecified values. This can be achieved by model checking the following LTL specification: G(s ≥ C 1 ∧ s ≤ C 2 ). The maximum and minimum values of s are then obtained using binary search on the values of C 1 and C 2 .
In addition, there are components which can be modeled using RTC, even though the analysis result thus obtained will be a pessimistic one. For example, in the AND-task activation example that we consider in Section 5.1. Now suppose that we need to compute the arrival curve of the output event stream of an ECA, then we may be able use RTC to model the component and use RTC analysis technique to derive the output arrival curve. The computed curves can be used as the initial value for the curves that we need to compute. The above procedure is then run on these initial bounds to get a tighter result.
Thus, the use of RTC for computing initial approximations is not applicable to all components but only for components which can pessimistically be modeled using RTC. When it can be used, the curve construction process will be speeded up.
Experimental Results
In this section, we illustrate how our compositional method can help to improve the performance analysis of practical real-time systems using two examples. The first example concerns a simple artificial system with AND-task activations, whereas the second one looks into the example Picture-in-Picture (PiP) application already described in Section 3. Our analysis aims at answering two important questions: (i) what is the maximum backlog at a buffer in the system? and (ii) what is the end-to-end delay of a stream? Two tools have been used to perform the necessary calculations: the MPA/RTC Toolbox [19] and the Symbolic Analysis Laboratory (SAL) [15] . We used Perl script to describe the interface between RTC and ECA components and to combine the analysis results from the two components.
The freely-available Real-Time Calculus (RTC) Toolbox is a toolbox for system-level performance analysis of distributed real-time and embedded systems. Built on top of Matlab, the toolbox supports most min-plus and max-plus algebra operators and it provides a library of functions for Modular Performance Analysis with Real-Time Calculus (see [4] ).
The Symbolic Analysis Laboratory (SAL) is a verification tool for transition systems. The key parts of the tool consist of an intermediate language for describing transition systems and a collection of state-of-the-art LTL model . AND-task activation. checkers. An ECA component can be easily described in the tool as a composition of different modules, each of which is an ECA. The count variables of an ECA are coded as integer variables and the guards on the ECA transition are written as guarded commands. Finally, the system properties of an ECA component can be defined as temporal logic formulas, which will be verified using one or a combination of the three model checkers: the symbolic model checker based on BDDs, the bounded model checker based on SAT (Boolean Satisfiability Problem) solving, and the infinite bounded model checking based on SMT (Satisfiability Modulo Theories) solving.
Case study 1: AND-task activation
The system contains two processing elements PE 1 and PE 2 , which run two tasks T 1 and T 2 . Two input streams, s 1 and s 2 enter the system at PE 1 and PE 2 , respectively. s 1 will be first processed by T 1 and the partially processed stream (denoted as s 1 ) then enters PE 2 . At PE 2 , the AND-task T 2 is only activated if there are events from both streams s 1 and s 2 . When activated, it takes an event from each stream, processes them and produces an output event for the playout buffer P B. The system is depicted in Figure 11 . In this example, both streams are periodic with burst/jitter and the processing elements have a bandwidth of one (i.e. one event can be processed per unit time). Since task T 1 processes a single input stream s 1 , it can be easily analyzed using Real-Time Calculus. Task T 2 , however, requires synchronization/correlation between the streams, which cannot be naturally modeled using Real-Time Calculus. As a result, we decide to use RTC to model PE 1 and ECA to model PE 2 and connect them using an RTC-ECA interface. This interface converts the arrival curves of the output stream s 1 after being processed by T 1 into an ECA, which will be fed as input to the ECA component. Figures 12 and 13 sketch the maximum backlog at T 2 and end-to-end delay of s 1 (on the y-axis) with respect to different execution times of T 2 (on the x-axis)for the three methods: (1) RTC for both PEs (2) ECA for both PEs and (3) RTC for PE 1 and ECA for PE 2 . Since the ECA-based only method performs an exhaustive search, its analysis result is always tight. The RTC-based only and the combined RTC+ECA methods both compute safe upper bounds of the backlog and delay; however the result obtained using only RTC is much more pessimistic than using RTC+ECA, as shown on the figure. On the other hand, although the ECA-based analysis produces optimal results, it requires long running times. Using our combined RTC+ECA approach, we are able to reduce the analysis time by an average of 23% for the calculation of the maximum backlog of T 2 and by 78% for the end-toend delay calculation of s 1 , as shown in Figure 14 . In other words, while achieving optimal or nearly optimal results, our method is substantially faster than using an ECA-only modeling and analysis. Note that the interface computation is done offline and was not counted into the analysis time. Although this may add additional cost to the running time of our method, we only need to do it once, before the analysis is repeated for several values of system parameters (e.g. different execution times of T 2 ). 
Case study 2: PiP application
In this case study, we look at the PiP application described in Section 3. Our main objectives are to estimate the maximum fill-level of the buffer b 4 and the end-to-end delay of the lower resolution stream s LR (please refer to Figure 6 ) with different sizes of the playout buffer B HR . In our experiments, both the incoming streams have the same frame resolution of 704×576 pixels (the display device is assumed to perform window resizing for the video stream being displayed at the smaller window). They arrive at the system at a constant bitrate of 8 Mbps and the playout buffers are read at a constant rate of 25 frames/second. The frequencies for PE 1 , PE 2 and PE 3 are set as 1.3 GHz, 1.25 GHz and 3 GHz, respectively. Figures 15 and 16 show the maximum backlog at buffer b 4 and the end-to-end delay of the lower resolution stream for different sizes of the playout buffer PB HR . As shown in these figures, the analysis result obtained using our method is relatively close to the optimal result given by ECA-only analysis. On the other hand, as can been seen in Figures 17  and 18 , our method provides a substantial improvement in terms of analysis time. Note that the comparison with RTConly analysis is omitted as the system cannot be modeled using this approach. In summary, we have shown through two examples that by using our compositional approach, we are able to model systems which cannot be modeled using an RTConly framework, or which suffer from substantially long analysis times when modeled using ECAs only. Our proposed approach leads to acceptable accuracy (in terms of performance estimation) and at the same time has reasonable run times.
Concluding Remarks
In this paper we proposed an interfacing technique for composing functional and state-based performance models for distributed real-time systems. Given the growing complexity and heterogeneity of real-time and embedded systems, we believe that composing such fundamentally different modeling and analysis techniques will increasingly become important. As already mentioned, our current implementation relies on a number of independently-developed tools -such as the RTC Toolbox (developed in Matlab and Java) and SAL -which have been hooked together using Perl script. We plan to further develop this tool-chain, and at the same time add user-interfaces to aid in the specification of large-scale real-time systems. We also plan to provide hooks to such a tool-chain that may be used to connect other related tools, such as those for architecture design space exploration.
