Cyber-Physical Systems require distributed architectures to support safety-critical real-time control. Hermann Kopetz's Time-Triggered Architecture (TTA) has been proposed as both an architecture and a comprehensive design paradigm for such systems. To relax the strict requirements on synchronization imposed by TTA, Loosely Time-Triggered Architectures (LTTA) have recently been proposed. In LTTA, computation and communication units are all triggered by autonomous, non-synchronized clocks. Communication media act as shared memories between writers and readers, and communication is non-blocking. In this paper we pursue our previous work by providing a unified presentation of the two variants of LTTA (token-based and time-based), with simplified analyses. We compare these two variants regarding performance and robustness, and we provide ways to combine them.
INTRODUCTION
Embedded electronics for safety-critical systems has experienced a drastic shift in the last decade, particularly in industrial sectors related to transportation (aeronautics and space, automobiles, and trains, trams, or subways). In the past, each function required its own set of sensors and actuators, its own controller, and its dedicated set of wires. This architecture, referred to as Federated Architecture, has proved safe and robust by ensuring built-in partitioning between the different functions. Federated Architectures could not be sustained in the late 90's, however, due to the drastic increase in the number and complexity of functions and in their interdependence. This led to a shift to Integrated Architectures [19], where several functions are hosted in the same computing unit and some functions are distributed across different computing units. Computing units as well as communication media can be standardised, thus allowing for a drastic reduction in computing devices and wiring. While this move from Federated to Integrated Architectures opens new possibilities for a further increase of embedded electronics in future embedded systems, it raises a number of challenging issues. First, the folding of different functions over shared computing units and the sharing of communication media can cause undesired interference. Second, since system integration involves a mix of hardware, communication infrastructure, middleware, and software in a complex way, mismatch and failure to meet overall requirements emerge as a high risk at that very late stage of system development. Third, since the overall system design relies on a layered view of the system, with several levels of abstraction corresponding to different computing or communication paradigms, it is not at all clear how detailed design can indeed match system-level specifications.
To address these problems as a whole, studies regarding System Architecture have been developed since the late 80's. Most remarkable is the Time-Triggered Architecture (TTA) developed by Hermann Kopetz and his school [16, 17]. TTA builds on a vision of the system in which physical time is seen as a first-class citizen and as a help, not an enemy. The Model of Computation and Communication (MoCC) of TTA is that of strong synchrony: the system is equipped with a discrete logical time that is consistently maintained throughout the overall system. Strong synchrony is achieved by maintaining strictly synchronized physical clocks throughout the distributed architecture, up to a certain maximum accuracy, which in turn specifies the finest granularity of the discrete time in a TT Architecture. Having the precise MoCC of strong synchrony makes the deployment of an application easy, provided that the latter is also based on the same MoCC. Fortunately, Simulink/Stateflow and Scade, which are standard tools in use in these industrial sectors, are examples of formalisms obeying the synchronous MoCC. In addition, TTA offers more possibilities to address the difficulties discussed above. Firstly, time can be used as a help in building fault tolerance services, with redundancy management and fault detection and mitigation. Secondly, time is also a help for partitioning, and for integrating components visible through their interfaces: Time Division Multiplexing (TDM) is a well-established technique to grant a function access to communication or computing. TDM is also at the very core of task scheduling.
However, the TTA approach carries cost and timing penalties that may not be acceptable for some applications. Indeed, jitter with smaller delays is preferred to fixed but longer delays for distributed control applications [3] . Also, TTA is not easily implementable for long wires (such as in systems where control intelligence is widely distributed) or for wireless communications. Finally and most importantly, re-designs are costly, due to the need for a global re-design of Time-Division multiplexing of the different functions or tasks. Hence, even for the safety critical hard real-time layers where TTA seems appropriate, it may not always be accepted.
Hence, there has been growing interest in less constrained architectures, such as the Loosely Time-Triggered Architecture (LTTA) [6] . LTTA is characterized by a communication mechanism, called Communication by Sampling (CbS), which assumes that: 1/ writings and readings are performed independently at all nodes connected to the medium, using different local clocks; and 2/ the communication medium behaves like a shared memory. See Figure 1 for an illustration. LTT architectures are widely used in embedded systems industries. The authors are personally aware of cases in aeronautics [21] , nuclear, automation, and rail industries where the LTTA architecture with limited clock deviations has been used with success. It is indeed the architecture of choice for railway systems, in which tracks are used as the communication medium and computing systems are carried by the trains and work autonomously.
By not requiring any clock synchronization, LTTA is non-blocking for both writes and reads. Hence, the risk of failure propagation throughout the distributed computing system is reduced, and latency is also reduced, albeit at the price of increased jitter and drift [3]. However, data can be lost due to overwrites, or alternatively duplicated, because reader and writer are not synchronized [2, 22, 9]. Issues regarding the use of LTTA for distributed continuous control are discussed in [5]. If, as in safety-critical applications that involve discrete control for operating modes or protection handling, data loss is not permitted, then special techniques must be developed to preserve the semantics of the specification.
The LTT bus based on CbS was first proposed in [6] and studied for a single writer-reader pair; [20] proposes a variation of LTTA where some master-slave re-synchronization of clocks is performed. LTT architectures of general topology were studied in [2, 22], using techniques reminiscent of back-pressure [8, 7] and elastic circuits [10]. In a different direction, [18] developed an alternative approach where upsampling is used in combination with "thick" events as a way to preserve semantics. This approach, which is more time-based than that of [2, 22], was further developed and clarified in [9].
In this paper, we cast the two variants of LTTA in a unified framework. We simplify the analyses of [22] and we compare the respective merits of these two variants. Finally we advocate blending them for different layers of the architecture and show how this can be safely done.
The paper is organized as follows. LTTA and Communication by Sampling are presented in Section 2. This is followed in Section 3 by the definition of a synchronous application and the problem of preserving synchronous semantics at LTTA deployment. The next two sections are the core of this paper. The two variants of LTTA are developed: back-pressure based in Section 4 and time-based in Section 5. The two architectures are compared in Section 6, where we also discuss why it makes sense to blend them and how this can be performed. Finally, the simplifying assumption we consider is relaxed in Section 7.
LTTA AND ITS ARTIFACTS
Figure 1: Communication by Sampling (CbS). For each variable x, y, or z, there is one shaded bus behaving as a shared memory, one writer and zero or more readers.
LTTA relies on Communication by Sampling, which is illustrated in Figure 1 and formalized as the following set of assumptions: Assumption 1.
Figure 2: Sensing multiple signals, for distributed clocks subject to independent drifts and jitters. Referring to Figure 1, we show the case of A1 reading two boolean inputs originating from A2 and A3, respectively, and computing their conjunction. Cases 1 and 2 correspond to two different outcomes for the local clock of A1.
on Figure 2 for the case of combinational functions. We show here the case of A1 reading two boolean inputs $a_t$ and $b_t$ originating from A2 and A3, respectively, and computing their conjunction $a_t \wedge b_t$. Cases 1 and 2 correspond to two different outcomes for the local clock of A1. Observe that the result takes three successive values f, t, and f for case 1, whereas case 2 yields the constant value f. The origin of the problem is that events attached to different signals can be separated by arbitrarily small time intervals; in Figure 2 the problem comes from the very close jumps of $a_t$ and $b_t$. Increasing the clock rate cannot prevent this from occurring. Due to the above artifacts, the semantics may not be preserved when deploying an application over an LTT architecture, and protocols are needed to cope with this issue. These are presented and studied in Section 4 and subsequent ones.
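The artifact can be reproduced with a few lines of simulation. The sketch below is only an illustration under assumed numbers: the jump times of the two signals (1000 and 1050 time units), the sampling period, and the two clock phases are hypothetical values chosen for this example and are not taken from Figure 2.

```python
def a(t):
    """Boolean signal a_t: rises at t = 1000 (assumed jump time)."""
    return t >= 1000


def b(t):
    """Boolean signal b_t: falls at t = 1050 (assumed jump time)."""
    return t < 1050


def sampled_conjunction(phase, period=500, n_ticks=4):
    """Values of a_t AND b_t seen by a reader ticking at phase + k * period."""
    return [a(t) and b(t) for t in (phase + k * period for k in range(n_ticks))]


def distinct_runs(values):
    """Collapse consecutive duplicates, keeping the successive values only."""
    out = []
    for v in values:
        if not out or out[-1] != v:
            out.append(v)
    return out


# Case 1: one tick (t = 1020) falls between the two jumps.
case1 = distinct_runs(sampled_conjunction(phase=20))
# Case 2: no tick falls between the two jumps.
case2 = distinct_runs(sampled_conjunction(phase=300))
print(case1)  # [False, True, False]
print(case2)  # [False]
```

The two runs differ only in the phase of the reader's clock, which is exactly the quantity that CbS leaves uncontrolled.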
DEPLOYMENT AND SEMANTICS PRESERVING
In this section we formalize the problem of preserving the semantics of a synchronous application when deploying it on an LTT Architecture.
The Synchronous Application
We are given an underlying set Z of variables. Our specifications for discrete controllers are modeled using dataflow diagrams. That is, they consist of a network of computing nodes of the following form:
In (1), k is the discrete time index, $u_1, \ldots, u_p, v_1, \ldots, v_q \in Z$ are the input variables of the node, X ⊆ Z is the tuple of state variables of the node, y ∈ Z is the output variable of the node, and f, g are functions. This model can capture multiple-clocked systems simply by extending the domains of variables with a special value denoting absence. Absence can be reasoned about using type systems such as the Lustre or Signal clock calculi. "Absence" values are not published.
Nodes N1, . . . , Nn can be composed by output-to-input connections to form systems, i.e., networks of nodes, denoted by S = N1 . . . Nn. Systems can be further composed in the same way, denoted by S1 . . . SI .
Node (1) is abstracted as the following labeled directed graph:
A branch $y \xleftarrow{\mathrm{UD}} X$ indicates that y depends on X through a Unit Delay, whereas a branch y ← v indicates a direct dependency. Systems S = N1 . . . Nn are abstracted as the union of the associated graphs G(S) = G(N1) ∪ · · · ∪ G(Nn), and the same holds, inductively, for G(S) when S = S1 . . . SI . Combinational loops are prohibited:
Assumption 3. We require that no loop exists in G(S) involving branches not labeled by delay symbols UD.
Referring to decomposition S = S1 . . . SI , consider G(S) = G(S1) ∪ · · · ∪ G(SI ). Using Assumption 3, erasing, in G(S), branches labeled by delay symbols UD yields a partial order denoted by . For z a vertex of G(S), define its level (z) as being
The level is illustrated on Figure 3 . 
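As an illustration of Assumption 3 and of the notion of level, the following sketch erases the branches labeled UD, checks that what remains is acyclic, and assigns a level to each vertex. It rests on an assumption that should be stated loudly: the level of a vertex is taken here to be the length of the longest chain of direct (non-delayed) dependencies leading to it, which is one natural reading and may differ in detail from the definition used in the paper. The function name `levels` and the edge encoding are hypothetical.

```python
from collections import defaultdict, deque


def levels(edges):
    """Compute vertex levels of a dependency graph.

    edges: iterable of (src, dst, delayed), meaning dst depends on src,
           through a Unit Delay when delayed is True.
    Returns {vertex: level}; raises ValueError if a delay-free loop exists.
    """
    vertices = set()
    succ = defaultdict(list)
    indeg = defaultdict(int)
    for src, dst, delayed in edges:
        vertices.update((src, dst))
        if not delayed:  # erase branches labeled UD
            succ[src].append(dst)
            indeg[dst] += 1

    # Kahn's algorithm on the delay-free graph, tracking the longest chain.
    level = {v: 0 for v in vertices}
    ready = deque(v for v in vertices if indeg[v] == 0)
    visited = 0
    while ready:
        v = ready.popleft()
        visited += 1
        for w in succ[v]:
            level[w] = max(level[w], level[v] + 1)
            indeg[w] -= 1
            if indeg[w] == 0:
                ready.append(w)
    if visited != len(vertices):
        raise ValueError("combinational loop: Assumption 3 is violated")
    return level


# y depends on x, x depends on u, and u depends on y through a unit delay.
example = [("u", "x", False), ("x", "y", False), ("y", "u", True)]
print(levels(example))
```

In this example the loop u, x, y is legal because it goes through a unit delay; making the branch from y to u direct would raise the error.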
LTTA Deployment and Semantics Preserving
Deploying the application S1 . . . SI of formula (2) over a strictly synchronous architecture is straightforward. In such an architecture, computing units and communication media are all triggered by a unique periodic global clock. The different computing units compute in lock steps, which we call reactions, according to the global clock. Each node is assigned some computing unit for its execution. Then, the computation of the different variables is scheduled within each reaction, respecting the partial order defined above. In this case, variables are updated at each reaction. We can instead cluster a (possibly variable) number of successive clock ticks together to form macro-reactions and update the variables once in each macro-reaction. Inside each macro-reaction, one then needs to schedule the computation of the different variables, again respecting the partial order. This leaves room for computing different variables at different clock ticks.
All the above designs correctly implement the application function, which consists in mapping input streams to output streams. We say that application semantics is preserved.
The question is: if, instead, deployment is performed over the LTT Architecture of Figure 1 , how can application semantics be preserved? As extensively discussed in Section 2, CbS communication by itself will not offer this. In the next two sections we propose two different protocols on top of the CbS infrastructure to ensure semantics preserving. We now complete this section with further assumptions and notations.
Further Assumptions and Notations
The following assumptions will also be considered, see 
and communication delays are bounded from above:
Assumption 6. For each computing unit, executions take at most one clock cycle and a computing unit which starts executing freezes its input data.
We stress that communicated variables are not updated at each reaction of the application, but only when required by the application. Consequently, a processor does not know a priori which variable update it is supposed to see at a given reaction.
Regarding notations, x, y, z, X, etc. denote variables. Nodes are indexed by 1, . . . , n. We can always group the output variables of a node into one tuple, which we regard again as a single variable. Thus, without loss of generality, we can assume that each node writes into a single variable labeled with the index of the node.
In the following sections we present two protocols that were proposed on top of CbS communication in order to ensure that the deployment of synchronous applications over the resulting LTT Architectures preserves the semantics. The first protocol is an adaptation of elastic circuits in hardware, and the second one is a relaxed, time-based adaptation of the original TTA. For each of them we indicate the needed assumptions.
BACK-PRESSURE LTTA
Assumptions: Throughout this section, Assumptions 1 and 6 must hold regarding the architecture. The application for deployment satisfies Assumptions 3 and 4.
Elastic circuits were proposed in [11, 14, 10] as a semantics-preserving architecture in which Kahn Process Network [15] type of execution is performed using bounded buffers. This is achieved by relying on a mechanism of back-pressure [7], by which each reading from a buffer by a node is acknowledged to the writer using a reversed virtual buffer. Petri net Nji of Figure 4 depicts how a link j → i with a 2-buffer is implemented in an elastic circuit for running a synchronous application with a 1-delay communication. Back-pressure places and arcs of this net are dashed, to distinguish them from the corresponding direct places and arcs, which are solid; solid and dashed places and arcs both obey the usual net semantics. Only direct places model data communication; back-pressure places are there to prevent buffer overflow at the link j → i.
In our study, however, we cannot make direct use of elastic circuits since the activation of nodes in elastic circuits is triggered by tokens, not by autonomous non-synchronized quasi periodic clocks as in LTTA. To adapt to the constraints of LTTA, the authors of [22] have proposed to enhance elastic circuits with a skipping mechanism that we present now under the name of back-pressure LTT Architecture. The model of this architecture is developed using Petri nets, which we assume 1-safe throughout this section. Some slight deviation from Petri net semantics will be needed to capture priority arising in certain conflicts.
Modeling the links. Modeling the nodes. Reactions at node i are captured by the net Ni shown in Figure 5, which is composed of a two-step "read; write" together with a local skipping mechanism. The following holds regarding this skipping mechanism, which was proposed in [22] in order to avoid computing units getting blocked (blocking is replaced by skipping):
The skipping mechanism is triggered by the local clock at node i.
Transitions with labels ri and wi have priority over the transition with label $skip_i$.
Observe that, while net Ni in isolation evolves according to a purely logical time, net Ni is triggered by the local clock $\kappa_i$ of the considered node. The composition indicated in Figure 5 by the symbol × is by superimposing transitions having the same label, thus forcing the synchronisation of the corresponding transitions in the composed nets. In particular, when clock $\kappa_i$ of node i has a tick, then action ri or wi is fired if enabled, and otherwise $skip_i$ is fired, expressing that node i keeps silent at that tick.
where the product is obtained by superimposing transitions having identical labels. Net N (without the skipping mechanisms) yields the elastic circuit implementing the original synchronous application according to the Kahn Process Network semantics, which is known to preserve synchronous semantics. Observe that net N exhibits no conflict and is thus an event graph (sometimes also called a marked graph). With our assumption of single-delay communications, net N is indeed 1-safe.
On the other hand, net N is the back-pressure net modeling our back-pressure LTTA, which involves the skipping mechanism. In the remainder of this section, we first analyse the preservation of synchronous semantics by N and then we study its performance.
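The interplay between back-pressure and skipping can be sketched on a single link. The code below is a deliberately reduced model and not the Petri net of the paper: each node performs a single action per tick instead of the two-step "read; write", communication delays are ignored, and the clock periods, the offset of the reader, and the names `run`, `capacity`, and `skips` are assumptions made for this illustration. It shows that a node whose action is not enabled simply skips, and that the reader nevertheless sees every value exactly once and in order.

```python
from collections import deque


def run(writer_period, reader_period, horizon, capacity=2):
    """Simulate one link j -> i with a bounded buffer and skipping nodes."""
    buf = deque()          # tokens currently held by the link
    sent = 0               # next value the writer will publish
    received = []
    skips = {"writer": 0, "reader": 0}

    # Two autonomous periodic clocks; the reader is offset by one time unit.
    events = sorted(
        [(t, "writer") for t in range(0, horizon, writer_period)]
        + [(t, "reader") for t in range(1, horizon, reader_period)]
    )
    for _, who in events:
        if who == "writer":
            if len(buf) < capacity:   # a free slot plays the role of a back-pressure token
                buf.append(sent)
                sent += 1
            else:
                skips["writer"] += 1  # write not enabled: skip instead of blocking
        else:
            if buf:
                received.append(buf.popleft())
            else:
                skips["reader"] += 1  # read not enabled: skip instead of blocking
    return received, skips


received, skips = run(writer_period=3, reader_period=7, horizon=100)
print(received == list(range(len(received))), skips)
```

With a fast writer the skips accumulate on the writer side; swapping the two periods moves them to the reader side, while the received stream stays free of losses and duplicates in both cases.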
Preservation of synchronous semantics
This result was first proved in [22], for more general architectures. We give here a very simple and direct proof, by explicitly using the associated elastic circuit N .
Theorem 1. Net N preserves synchronous semantics.
Proof. Net N is indeed an elastic circuit, which is known to implement a synchronous program with Kahn Process Network semantics; 2-bounded buffers can be used on the links since direct links all have a logical 1-delay by Assumption 4.
1 Observe that this first property only relies on assumptions 3-6; it does not use Assumption 1 nor conditions (5,6) regarding clocks and communication delays.
Let LN be the language of net N , i.e., the set of all its firing sequences, and similarly for L N . The following fairness condition is assumed:
transition $skip_i$ of the skipping mechanism cannot fire repeatedly forever. (11)
Using (11), the projection of the language LN over alphabet {ri, wi | i = 1, . . . , n} coincides with the language L N , which proves the preservation of synchronous semantics.
Observe that fairness condition (11) is indeed much weaker than the conjunction of (7) and Assumption 5.
Performance bounds
Assumptions: Assumption 5 is in force for the derivation of performance bounds.
Let us first focus on elastic circuit N defined in (9) . Assume the following conditions for this elastic circuit -they are similar to (5) and (6): (5') there exist lower and upper bounds T min and T max for the interval between any two successive firings of a black transition related to node i (it can be a read ri or a write wi); this is captured by assigning these bounds for the duration of the two transitions wj and ri in Figure 4 .
(6') the time spent in direct or back-pressure places of Nji is bounded by τ max, for any link j → i (direct places are the solid ones in Figure 4, whereas back-pressure places are dashed). The corresponding bound is assigned to the time spent in the place depicted in thick blue in Figure 4.
Following classical results on event graphs [13] (Chapter 6.7, p247), [4] (Chapter 2.5) or (max,plus) algebras [12] (Chapters 21-26), worst case throughput λ N of net N is given by the minimal ratio number of tokens/time over all cycles of the event graph, that is:
Next, net N consists in adding, at each node i of net N , the skipping mechanism shown on figure 4. Now, assume that conditions (5) and (6) hold for net N , namely:
(5) there exist lower and upper bounds Tmin and Tmax for the interval between two successive firings of the skipping mechanism at any node; (6) the time spent in any place of Nji is bounded by τmax for any link j → i.
We claim that, when synchronizing with all the local skipping mechanisms, net N inherits the following values for its bounds T max and τ max mentioned in (5') and (6'):
Indeed, bound (13) is reached by node i having slowest clock, since this node does not need to skip. Bound (14) is reached when the latest token reaches an input place of node i but net Ni fired just before. Combining (5',6'), (12) , and (13,14) yields:
Theorem 2. The worst case throughput λN of net N is 1/λN = 4Tmax + 2τmax
Recall that Theorem 1 only requires fairness condition (11) and Theorem 2 only requires the upper bounds in (5,6), but not the lower bounds. Performance results are provided in [22] for more general architectures (with arbitrary buffer sizes). However, the proof we give here is much more straightforward. General architectures are analysed in Section 7.1.
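The cycle-ratio rule for event graphs lends itself to a small numerical check. The sketch below is a hedged illustration, not a transcription of the paper's formulas: it assumes that the critical cycle of a link goes through the write transition, the direct place, the read transition, and the back-pressure place, that this cycle carries one token, and that the bounds inherited when skipping is present are Tmax for a firing and τmax + Tmax for a place. These assumptions are chosen because they are consistent with the bound stated in Theorem 2; the numerical values and all names are hypothetical.

```python
from fractions import Fraction


def worst_case_throughput(cycles):
    """Minimal ratio tokens / time over the given cycles of an event graph.

    cycles: iterable of (tokens, worst_case_time) pairs, one per cycle.
    """
    return min(Fraction(tokens, time) for tokens, time in cycles)


T_max, tau_max = 105, 10        # hypothetical values, in arbitrary time units

firing = T_max                  # assumed bound on the duration of a transition
sojourn = tau_max + T_max       # assumed bound on the time spent in a place

# Assumed critical cycle: w_j -> direct place -> r_i -> back-pressure place -> w_j,
# carrying a single token.
link_cycle = (1, 2 * firing + 2 * sojourn)

period = 1 / worst_case_throughput([link_cycle])
print(period, 4 * T_max + 2 * tau_max)
```

Under these assumptions the computed period coincides with 4Tmax + 2τmax for any choice of the two bounds.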
Issues of blocking communication.
The skipping mechanism ensures that computing nodes themselves never get blocked due to the failure of other nodes or communication. However, net N exhibits blocking read communication between the different computing nodes of the architecture. This means that, when focusing on the effective communication of fresh data at a given node, blocking does still occur in net N . This observation actually motivated considering the alternative, time-based, LTT Architecture that we propose and analyse in the next section.
TIME-BASED LTTA
Assumptions: Assumptions 1-6 all hold throughout this section, for both the preservation of semantics and performance bounds.
Time-based LTTA relies on an original idea of P. Caspi [18, 9]. The aim of this protocol is to ensure a clean alternation of writing and reading phases throughout the architecture, see Figure 6. The synchronization principle used in time-based LTTA is illustrated in Figure 7. Figure 7 shows two time lines, for two communicating nodes (1) and (2). Ticks of the local clocks are shown as short thick vertical bars (note the jitter). Magenta rectangles depict reading-and-computing periods; they correspond to $r^{k-1}$, $r^k$, $r^{k+1}$ in Figure 6. At the beginning, node (1) is the fastest. Thus, it waits for a certain number of ticks and then it publishes its computed value; this is indicated by the red dashed arrow pointing to (2). Upon noticing this publication, node (2) responds with its own publication, indicated by the blue dashed arrow pointing to (1). Meanwhile, node (1) stays frozen to make sure that node (2) has had enough time to publish its own fresh value, if any (fresh values need not be produced at every reaction of the synchronous application). Then it can repeat reading-and-computing. In the second round, node (2) is the fastest and publishes first. And so on.
Observe that the two publications in blue are not based upon time. They rather react to noticing a publication by another node. The key observation is that fast nodes slow down by waiting a number p of ticks of their local clocks, whereas slow nodes accelerate by actively synchronizing over fast nodes' publications. To summarize, in order to achieve sufficient synchronization to preserve synchronous semantics, tokens are used for speeding up only, whereas time is used for slowing down. The key issue is to tune p to the smallest value that is sufficient to ensure the alternation of Figure 6 (actually we will use different waiting times for the writing and reading periods). The architecture implementing this protocol is detailed next, by successively describing its links and nodes, using again the same Petri net framework.
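The synchronization principle can be tried out on a small behavioural simulation. The sketch below is a simplification written for illustration and is not the Petri net model of the paper: clocks are strictly periodic, a publication is signalled through a per-node notice that becomes visible after a delay `tau`, a node waits `p` ticks after writing before it reads, and after reading it writes either on noticing a publication or after `q` ticks. The names `simulate`, `alternates`, `notice_at`, and the numerical values are all assumptions of this sketch.

```python
import heapq


def simulate(periods, offsets, p, q, tau, rounds):
    """Run the nodes until each has written `rounds` times.

    Returns (reads, writes): per-node lists of the tick times of reads and writes.
    """
    n = len(periods)
    phase = ["wait_read"] * n   # every node starts as if it had just written
    count = [0] * n             # local ticks spent in the current phase
    notice_at = [None] * n      # time at which a pending publication notice becomes visible
    reads = [[] for _ in range(n)]
    writes = [[] for _ in range(n)]
    ticks = [(offsets[i], i) for i in range(n)]
    heapq.heapify(ticks)

    while min(len(w) for w in writes) < rounds:
        t, i = heapq.heappop(ticks)
        heapq.heappush(ticks, (t + periods[i], i))
        count[i] += 1
        noticed = notice_at[i] is not None and notice_at[i] <= t
        if phase[i] == "wait_read":
            if noticed:
                notice_at[i] = None          # discard a notice left over from the previous round
            if count[i] == p:                # time is used for slowing down
                reads[i].append(t)
                phase[i], count[i] = "wait_write", 0
        elif noticed or count[i] == q:       # a notice is used for speeding up
            writes[i].append(t)
            if not noticed:                  # timed publication: notify the other nodes
                for j in range(n):
                    if j != i and notice_at[j] is None:
                        notice_at[j] = t + tau
            notice_at[i] = None
            phase[i], count[i] = "wait_read", 0
    return reads, writes


def alternates(reads, writes, tau, rounds):
    """Check that round-k writes follow all round-k reads, and that round-k
    reads follow the availability of all round-(k-1) writes."""
    for k in range(rounds):
        if min(w[k] for w in writes) <= max(r[k] for r in reads):
            return False
        if k > 0 and min(r[k] for r in reads) < max(w[k - 1] for w in writes) + tau:
            return False
    return True


reads, writes = simulate([95, 105], [0, 50], p=2, q=2, tau=10, rounds=3)
print(alternates(reads, writes, tau=10, rounds=3))  # True
```

In this run the node with period 95 always publishes on its own timer, while the node with period 105 publishes as soon as it notices that publication, so the two nodes stay within one round of each other without any clock synchronization.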
Modeling the links. Figure 8 shows net Mji, which models CbS communication for directed link from node j to node i -note the read arc 2 ingoing to transition ri. In this net, reads and writes can occur concurrently and asynchronously.
. . . Focus on the half token-ring M w i of Figure 9 , which controls writes by node i. Label "∀j : Πj" indicates that a token is put in special place labeled Πj for every j. Observe that the places labeled w , has priority over the conflicting private transition.
The special square-round shaped place Πj exists for all j.
We take the special convention that tokens are overwritten (not added) in the special place Πj, so this place holds 0 or 1 token. Place Πj stores the information that some node has published after having spent enough ticks. Focus next on Figure 10. Publications are controlled by the nets $P^w_{ik}$, k ∈ {1, . . . , q − 1} (left) and $P^r_i$ (right). The labels on the arcs indicate the number of tokens needed in the preset of this arc for the postset transition to fire; label 0, 1 means "0 or 1". Each firing consumes the token, if any. Thus the special place Πi has its token consumed by node i when writing strictly before the last stage q. In contrast, when writing at the last stage, the special place Πi is reset whatever its content is; this is to avoid a node self-reacting to its own publication.
The protocol sitting at node i is then modeled by the net
where the product is by superimposing nodes (both transitions and places) with identical label. 3 Nodes with no label are considered private. Regarding triggering policy in net Mi, the following holds:
When enabled, transitions are triggered by the local clock of node i.
We show in Figure 11 the result of performing product (16) for the case of two nodes i = 1, 2, and p = q = 2. The role of the different components of Mi is as follows: considering the right hand side of (16) The network. The overall net modeling time-based LTTA is given by
Note that the time-based LTT Architecture involves no skipping mechanism. Let L N be the language of net N defined in (9) and similarly for LM. Project LM over subalphabet $\{r_i, w^1_i, \ldots, w^q_i \mid i = 1, \ldots, n\}$ by erasing transitions not belonging to this set. Then, identify, in this projected language, the conflicting transitions $w^k_i$ for k = 1, . . . , q by renaming them all $w_i$, and we finally call the resulting language LM.
Preservation of synchronous semantics
We then need to translate to net M assumptions 1-6 regarding the architecture and conditions (5) and (6) -the place of net Mji for each link j → i; -publication places Πi, for i = 1 . . . n.
If the above assumptions regarding net M are in force, then the following theorem holds, which expresses the preservation of synchronous semantics:
Theorem 3. The following conditions on integers p and q ensure that LM = L N :
Proof. The conclusion of Theorem 3 (namely, that LM = L N holds) is an immediate consequence of the following two properties, which hold for i = 1 . . . n, see Figure 6:
Property 1 (reads). The kth firing of transition ri occurs only after all places j ∈ {1. . .n} as in Figure 9 have been written k − 1 times.
Property 2 (writes). The kth firing of one of the (conflicting) transitions $w^1_i, \ldots, w^q_i$ occurs only after transitions rj, j = 1 . . . n have all been fired k times.
We prove the above two properties by induction over k > 0. Assume they are true up to k − 1.
Suppose the first (k − 1)st writing by some node occurs at real-time t. Then we claim that the last (k − 1)st writing by some node occurs at the latest at time min (t + qTmax , t + τmax + Tmax)
The first term in the min corresponds to an "autistic" node i that sees no publication and thus writes by firing transition $w^q_i$ after having performed its (k − 1)st reading ri, which must have occurred before t by the induction hypothesis. To derive the second term in the min, pick a node that is "latest to awake": this node just missed the publication by the earliest node, which was made available at the latest at t + τmax. It can notice this publication at the latest within one period Tmax, and then it fires.
Then, the earliest kth reading cannot occur before t + pTmin. Hence condition (19), which expands as pTmin ≥ τmax + Tmax, ensures that Property 1 remains valid at the kth round. On the other hand, the latest kth reading cannot occur later than t + min (qTmax , τmax + Tmax) + pTmax
Finally, the earliest kth writing cannot occur earlier than qTmin after the earliest kth reading. Hence it cannot occur before t + pTmin + qTmin. Hence condition (20), which expands as qTmin ≥ τmax + p(Tmax − Tmin), ensures that Property 2 remains valid at the kth round.
Performance bounds
Performance bounds are easily derived from the conditions of Theorem 3, which are assumed to be in force:
Theorem 4. The worst case throughput λM of net M is given by 1/λM = (p + q)Tmax, where p and q take their optimal values according to inequalities (19) and (20).
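A short numerical check makes the role of p and q concrete. The sketch below tests a candidate pair (p, q) against the two inequalities in the expanded form given in the proof of Theorem 3, namely pTmin ≥ τmax + Tmax and qTmin ≥ τmax + p(Tmax − Tmin), and evaluates the period (p + q)Tmax of Theorem 4. The function names and the numerical values are assumptions made for this example, and the pair p = q = 2 is simply a candidate, not claimed to be optimal.

```python
def satisfies_theorem3(p, q, T_min, T_max, tau_max):
    """Check the two conditions in the expanded form quoted from the proof."""
    reads_ok = p * T_min >= tau_max + T_max
    writes_ok = q * T_min >= tau_max + p * (T_max - T_min)
    return reads_ok and writes_ok


def time_based_period(p, q, T_max):
    """Worst case period 1 / lambda_M of the time-based architecture."""
    return (p + q) * T_max


# Hypothetical clock and delay bounds, in arbitrary time units.
T_min, T_max, tau_max = 95, 105, 10

print(satisfies_theorem3(2, 2, T_min, T_max, tau_max))  # True
print(satisfies_theorem3(1, 2, T_min, T_max, tau_max))  # False
print(time_based_period(2, 2, T_max))                   # 420
```

With these values p = 1 is rejected because a single tick of a fast clock (95) is shorter than τmax + Tmax (115), so a fast node could read before a slow node has published.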
Issues of blocking communication
Since net Mji has a read arc, it exhibits no blocking read. On the other hand, net Mi defined in (16) possesses a circuit shown in red in Figure 11. This circuit, however, involves no synchronization outside node i. It can therefore never be blocked. Hence, net M is free from blocking communication between different nodes. If a node or communication link fails by becoming silent, then the nodes reading from that node or that link will just proceed with their local computations using old data provided as backups by CbS communication, in combination with fresh data from live links and nodes. Time-based LTTA is thus fully non-blocking, although the application enters a degraded mode in case of silent failure of a node or link, by using outdated data from its failed input links.
HYBRID LTT ARCHITECTURES
In this section we compare the above two types of LTT Architectures and study their blending.
Discussion and comparison

Throughput
We start with a comparison regarding throughput. The lower bound for throughput in the back-pressure architecture is always given by Theorem 2, that is, 1/λN = 4Tmax + 2τmax. On the other hand, from Theorem 4, the performance of the time-based architecture depends on the situation:
For low-level safety-critical real-time control, delay and jitter are non-zero but small relative to nominal period. This then yields 1/λM = 4Tmax. Observe that this outperforms back-pressure architecture and that transmission delays do not matter as long as they remain small.
Another situation of interest is when communications are distant, reflected by 
Robustness
Recall that net N is the right abstraction for communication in the back-pressure architecture. This net is subject to blocking communication. This means that, if one node gets stuck, then all nodes will keep skipping forever, thus computing with outdated constant values and outputting nothing. The overall application is then stuck, even though computing nodes are not blocked. This implies that time-based monitoring must be added on top of the architecture, e.g., by means of watchdogs that can be used by neighbor nodes to detect the fail-stop of one node. This, however, causes slow-down and loss of performance.
In contrast, time based architecture can still survive in degraded mode without any slow-down in case a node has experienced fail-stop. Outdated values will be used by its neighboring nodes but the rest of the system is still at work.
Flexibility
Back-pressure architectures are very flexible since semantics preserving does not depend on their timing characteristics. In particular, hardware characteristics can be changed without retuning the back-pressure protocol. The same holds regarding the adding or removal of nodes and links in the application or architecture.
Summary
Since the two architectures are similar in performance regarding timing, the preferred architecture depends on the relative importance of robustness versus flexibility for the application at hand. Since complex applications typically combine both cases for different parts of the system, it makes sense to consider blending the two architectures. How can this be performed?
Blending the two architectures
In the architecture, partition the links into time-based ones and back-pressure based ones. Nodes that are adjacent to at least one time-based link are marked time-based as well and are implemented according to net Mi defined in (16). Other nodes are marked back-pressure based and are implemented according to net Ni of Figure 5.
For the links, several cases occur. Homogeneous links, for which all adjacent nodes are of the same kind as the link itself, are implemented according to net Nji of Figure 4 or net Mji of Figure 8, depending on the case. Heterogeneous links, which are adjacent to nodes of different kinds, must be slightly adapted. For a heterogeneous link j → i ending at a node i marked time-based, its associated net is obtained by equipping net Nji with a skipping mechanism at its sink transition labeled ri; this mechanism ensures that node i can perform its reads at its own pace, according to the time-based protocol. Symmetrically, for a heterogeneous link j → i originating from a node j marked time-based, its associated net is obtained by equipping net Nji with a skipping mechanism at its source transition labeled wj; this mechanism ensures that node j can perform its writes at its own pace, according to the time-based protocol.
In the resulting hybrid architecture, clocks of time-based nodes and delays of time-based links are subject to the conditions of Theorem 3. In contrast, no condition is required for the clocks and delays of the back-pressure part, although the performance of course depends on them.
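The marking and link classification described in this section can be sketched as a small procedure. This is an illustrative reading of the text, not code from the paper: the function names, the string labels, and the example graph are hypothetical. The sketch also assumes that a back-pressure link whose two end nodes are both marked time-based receives both skipping mechanisms.

```python
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]  # directed link (j, i), read as j -> i

TIME_BASED = "time-based"
BACK_PRESSURE = "back-pressure"


def mark_nodes(nodes: Iterable[str], link_kind: Dict[Link, str]) -> Dict[str, str]:
    """Mark a node time-based if it is adjacent to at least one time-based link."""
    touched = set()
    for (j, i), kind in link_kind.items():
        if kind == TIME_BASED:
            touched.update((j, i))
    return {n: (TIME_BASED if n in touched else BACK_PRESSURE) for n in nodes}


def plan_link(link: Link, link_kind: Dict[Link, str], node_kind: Dict[str, str]) -> str:
    """Describe which net the text prescribes for a link (labels are illustrative)."""
    j, i = link
    kind = link_kind[link]
    if node_kind[j] == kind and node_kind[i] == kind:
        # Homogeneous link: both end nodes are of the same kind as the link.
        return "M_ji" if kind == TIME_BASED else "N_ji"
    # Heterogeneous link: a back-pressure link touching a time-based node.
    skips: List[str] = []
    if node_kind[j] == TIME_BASED:
        skips.append("skip at source w_j")
    if node_kind[i] == TIME_BASED:
        skips.append("skip at sink r_i")
    return "N_ji + " + " + ".join(skips)


# Hypothetical four-node chain a -> b -> c -> d with one time-based link.
nodes = ["a", "b", "c", "d"]
link_kind = {("a", "b"): TIME_BASED, ("b", "c"): BACK_PRESSURE, ("c", "d"): BACK_PRESSURE}
node_kind = mark_nodes(nodes, link_kind)
for link in link_kind:
    print(link, "->", plan_link(link, link_kind, node_kind))
```

Note that, by the marking rule, a time-based link always has two time-based end nodes, so only back-pressure links can be heterogeneous.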
EXTENSION TO GENERAL ARCHITECTURES
In this section we indicate how to modify the results of the previous sections when Assumption 4 is relaxed.
Back-Pressure LTTA
A corresponding study was already performed in [22]. We nevertheless wish to extend our simpler analysis to this case as well. To this end, we simply replace net Nji of Figure 4 by the net with the same name in Figure 12. Observe that the link of Figure 12 captures pipelining. Buffer size must be not smaller than the number of delays.
Figure 12: Back-pressure net Nji associated to a directed link j → i of the architecture, for the case of a buffer size 5 and 2 delays. For colors and thickness of circles, see Figures 4 and 5.
Having done this we can again carry out the study of section 4. Theorem 1 is still valid. On the other hand, Theorem 2 regarding performance must be reformulated as follows. Consider net N defined as in (9) but with links Nji redefined according to the principles of Figure 12. In this figure, the two places (depicted in thick blue lines) are assigned a delay τmax, in order to capture the worst transmission delay for both data and back-pressure information. Construct net N according to formula (9). For each circuit σ of net N, define
where T(σ) =def number of (read or write) transitions in σ, P(σ) =def number of thick places in σ, and Tmax and τmax are defined in (13, 14). Then, let κ(N) be the maximum of all κ(σ) for σ ranging over all circuits of N. Using the same references as for deriving (12), the following theorem holds regarding performance:
Theorem 5. The worst case throughput λN of net N is 1/λN = κ(N). The throughput can be computed in O((n + m)^3), where n is the number of nodes in the network and m is the total number of tokens of the event graph.
Time-Based LTTA
In this section we relax Assumption 4. However, our study does not encompass pipelining (in contrast to the previous section, which was fully general). The study of pipelining in time-based LTTA requires further investigation. Relaxing Assumption 4 requires retuning parameters p and q of the protocol of section 5, using the notion of level introduced in (4). Let L be the maximal level in the considered synchronous application. Then the synchronous semantics is preserved if every reaction follows the schedule r0 w0 r1 w1 ... rL wL, where rℓ and wℓ denote the readings and writings by nodes of level ℓ, respectively, see Figure 13. This figure also illustrates the principle for tuning the parameters p and q. Parameter p must ensure that, for a node of level ℓ, reading remains frozen while wℓ, rℓ+1, wℓ+1, ..., rL, wL, r′0, w′0, ..., w′ℓ−1 is being performed, where "prime" refers to the next reaction. It is indeed enough to enforce the following properties:
Property 3 (reads). The kth firing of transition ri for a node of level ℓ occurs only after all places j ∈ {1...n} as in Figure 9 have been written k − 1 times and places j of level < ℓ have all been written k times.
Then, the earliest kth reading by a node of level ℓ cannot occur before t + pTmin. Hence the following condition
p ≥ (L + 1)(τmax + Tmax) / Tmin (25)
which expands as pTmin ≥ (L + 1)(τmax + Tmax), ensures that Property 3 remains valid at the kth round. On the other hand, the latest kth reading cannot occur later than
t + min(qTmax, τmax + Tmax) + pTmax ≤ t + τmax + (p + 1)Tmax (26)
Finally, the earliest kth writing by a node of level ℓ cannot occur earlier than qTmin after the earliest kth reading. Hence it cannot occur before t + pTmin + qTmin. Hence condition (20), which expands as qTmin ≥ τmax + p(Tmax − Tmin), ensures that Property 4 remains valid at the kth round.
Theorem 6. Conditions (25) and (20) ensure LM = LN, which expresses the preservation of synchronous semantics.
Finally, Theorem 4 regarding performance still holds, but with the values for p and q being given by Theorem 6.
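The tuning of p and q can be illustrated numerically. The sketch below takes the two inequalities exactly as they are stated in the text, pTmin ≥ (L + 1)(τmax + Tmax) and qTmin ≥ τmax + p(Tmax − Tmin), and picks the smallest values satisfying them. It assumes that p and q are integer tick counts; the function names and the numerical values are hypothetical. A helper also lists the schedule r0 w0 ... rL wL for a given maximal level L.

```python
import math
from typing import List


def schedule(max_level: int) -> List[str]:
    """Per-reaction schedule r0 w0 r1 w1 ... rL wL for maximal level L."""
    steps: List[str] = []
    for level in range(max_level + 1):
        steps += [f"r{level}", f"w{level}"]
    return steps


def tune_p(max_level: int, t_min: float, t_max: float, tau_max: float) -> int:
    """Smallest integer p with p*Tmin >= (L + 1)*(tau_max + Tmax)."""
    return math.ceil((max_level + 1) * (tau_max + t_max) / t_min)


def tune_q(p: int, t_min: float, t_max: float, tau_max: float) -> int:
    """Smallest integer q with q*Tmin >= tau_max + p*(Tmax - Tmin)."""
    return math.ceil((tau_max + p * (t_max - t_min)) / t_min)


# Hypothetical parameters in milliseconds (assumptions, not from the paper).
L_MAX, T_MIN, T_MAX, TAU_MAX = 2, 9.0, 10.0, 1.0

p = tune_p(L_MAX, T_MIN, T_MAX, TAU_MAX)
q = tune_q(p, T_MIN, T_MAX, TAU_MAX)
print(schedule(L_MAX), "p =", p, "q =", q)
```

Because p grows linearly with L + 1, deeper applications force longer freezing of reads, which in turn lowers the throughput obtained from Theorem 4.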
CONCLUSION
H. Kopetz' TTA was the first proposal for a MoCC-based architecture suited to distributed hard real-time systems involving feedback control. Of course, as explained in, e.g., [17], TTA cannot be used as the single architectural paradigm in a complex, multi-layered, embedded system.
LTTA was proposed as a softening of TTA for the very same layers. In fact, the objective of LTTA is to offer an abstraction that emulates TTA. This paper has unified the work done on LTTA [22, 9] by proposing a single framework in which the two existing variants of LTTA can be cast. This study reveals that the back-pressure based LTTA is more flexible but less robust against failures than time-based LTTA. It therefore makes sense to use different versions of LTTA for different parts of the system. We have thus proposed a way of blending the two architectures while maintaining the essential property of preservation of semantics. Further work is needed, regarding time-based LTTA, to address heterogeneous infrastructures where ensuring the global bounds arising in Assumption 5 may be a problem.
