It is difficult to construct correct models for distributed large-scale service-oriented applications. Typically, the behavior of such an application emerges from the interaction and collaboration of multiple components/services. On the other hand, each component, in general, takes part in multiple scenarios. Consequently, not only components, but also their interaction protocols are important in the development process for distributed systems. Coordination models and languages, like Reo, offer powerful ''glue-code'' to encode interaction protocols. In this paper we propose a novel synthesis technique, which can be used to generate Reo circuits directly from scenario specifications. Inspired by the way UML2.0 sequence diagrams can be algebraically composed, we define an algebraic framework for merging connectors generated from partial specifications by exploiting the algebraic structure of UML sequence diagrams.
Introduction
Service-oriented applications consisting of services that may run on large-scale distributed platforms are notoriously difficult to construct. It is well-known that most service-oriented applications rely on a collaborative behavior among their constituent services/components, and this implies complex coordination. Therefore, construction of these applications crucially depends on deriving a correct coordination model that specifies the precise order and causality of the actions of their constituent services. For example, in an online banking scenario, a user can log into the system only after the account information such as the account number and password are verified to be valid. Given the strong role that coordination of components/services plays in such applications, important questions from the software engineering perspective include:
• What are the connectors in an application that coordinate the behavior of its components/services?
• What does a service oriented development process look like?
• How can one systematically generate connectors from interaction specifications?
In this paper we address these questions by using Reo as the coordination language in service oriented applications, and show how correct Reo circuits (connectors) can be synthesized automatically from scenario-based interaction specifications.
Scenarios represent a global view of interactions among the components (in the broadest sense) within a system. Each scenario corresponds to a single temporal sequence of interactions among system components/services and provides a partial system description. Scenarios are close to users' understanding and they are often employed to refine use cases and provide an abstract view of the system behavior. In recent years, scenario based languages such as UML Sequence Diagrams (SDs) [30] , message sequence charts (MSCs) [17, 18] , and Live Sequence Charts (LSCs) [13] , have become popular for expressing behavioral requirements of applications. In this paper we focus on scenarios represented as UML sequence diagrams. However, our synthesis approach can be easily generalized to use other alternatives, such as HMSC [28] .
The idea of using scenario descriptions, such as UML SDs, to generate operational models and/or executable code, of course, is not new [15, 16, 25, 32] . We briefly describe some related work in this area in Section 6. However, most of the existing work takes an endogenous approach for coordination.
As an example, consider a use case scenario in a simple bank ATM, that involves a user, an ATM, and a number of remote processes, including a PIN verifier. The scenario describes that after feeding his card into the ATM, (1) the ATM asks the user to enter his PIN; (2) the ATM sends the user ID and his PIN to the PIN verifier; (3) the PIN verifier verifies the PIN to determine the validity of the access request; (4) the PIN verifier sends its (Allow/Deny/Confiscate) response back to the ATM; and (5) depending on the content of the response, the ATM proceeds to either allow or deny user access, or confiscates his card, presumably, after 3 unsuccessful attempts entering a wrong PIN. The common view of the transformation of this scenario to executable code yields an ATM process and a PIN verifier process that directly communicate with each other: the ATM process contains a ''send ⟨ID, PIN⟩ to PINverifier'' instruction somewhere in its code, implementing step 2; and the PIN verifier process contains a ''send response to ATM'' instruction somewhere in its code, implementing step 4, above.
These direct communication instructions implement the coordination protocol described in the scenario in an endogenous form.
Endogenous models implement/express a protocol only implicitly, through fragments of code in disparate entities that are hardwired to specifically realize that protocol. Suppose now that in a later version of this system, it is decided that the messages sent to the PIN verifier process must also be sent to some monitoring process, or instead of the PIN verifier to another more sophisticated process (e.g., one that tells the ATM to confiscate the user's card if the number of successive wrong PIN access attempts by the same card ID through all involved ATMs within, say, a 24 h sliding window, exceeds a threshold). Such changes to the protocol can easily be reflected in the SD specifications. However, implementing them requires invasive changes to a variety of independent software units that comprise the participating processes; worse, these changes may necessitate other less obvious changes that affect other software units and processes that are not directly involved in the modified portion of the protocol. Thus, small, ''local'' changes to a protocol can propagate through large spans of software units, touching them in ways that may invalidate their previously verified properties. Not only are such invasive modifications generally undesirable, in many cases they are impractical or even impossible, e.g., when they involve legacy code or third party providers.
Alternatively, the exogenous view of a scenario (e.g., the EnterPwd SD in the top right corner of Fig. 4 ) imposes a purely local interpretation on each inter-process communication, implementing it as a pure I/O operation on each side, that allows processes to communicate anonymously, through the exchange of untargeted passive data. For instance, Fig. 1 shows the behavior skeleton of the four processes involved in the EnterPwd SD, mentioned above. Observe that these processes are not hardwired to directly communicate with each other. Replacing exchanges of targeted messages with simple I/O localizes the range of their impact. This makes processes engaged in a protocol oblivious to changes in the protocol and their peers that do not directly impact their own behavior. Having expunged all communication/coordination concerns out of the parties involved, exogenous models relegate the task of conducting the required coordination to a (centralized or distributed) coordinator glue code that establishes the necessary communication links among the parties and engages them in the specified protocol. Reo is a good example of an exogenous coordination language that can be used to develop such glue code.
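The contrast between the two views can be illustrated with a small sketch. It is only a toy model under stated assumptions: the names `Port`, `glue` and `run`, the card identifier and the fixed valid PIN are all hypothetical, and the buffered ports are a simplification that does not reproduce the semantics of Reo channels. The point is that neither process names its peer, so adding a monitoring process touches only the glue code.

```python
from collections import deque


class Port:
    """A component's own I/O port; the component never names its peer."""

    def __init__(self):
        self._buf = deque()

    def write(self, item):
        self._buf.append(item)

    def take(self):
        return self._buf.popleft()

    def pending(self):
        return list(self._buf)


def atm(out_port, card_id, pin):
    # Pure output: the ATM does not know who consumes this data item.
    out_port.write((card_id, pin))


def pin_verifier(in_port, out_port, valid_pin="1234"):
    # Pure input followed by pure output; no reference to the ATM.
    _card_id, pin = in_port.take()
    out_port.write("Allow" if pin == valid_pin else "Deny")


def glue(source, *sinks):
    """Hypothetical stand-in for connector glue code: move one data item
    from `source` to every port in `sinks`."""
    item = source.take()
    for port in sinks:
        port.write(item)


def run(pin, with_monitor=False):
    atm_out, atm_in = Port(), Port()
    ver_in, ver_out = Port(), Port()
    monitor_in = Port()

    atm(atm_out, "card-42", pin)
    # Only this line changes when a monitor is added to the protocol.
    glue(atm_out, ver_in, *([monitor_in] if with_monitor else []))
    pin_verifier(ver_in, ver_out)
    glue(ver_out, atm_in)
    return atm_in.take(), monitor_in.pending()
```

Calling `run("0000", with_monitor=True)` routes the same request to the verifier and to the monitor without any change to `atm` or `pin_verifier`.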
The scheme that we advocate in this paper for generating coordination models from UML sequence diagrams uses the exogenous view in its interpretation of these scenario specifications. To our knowledge, this approach is novel and no other work has considered scenario specifications for exogenous coordination. Our approach is structural and embodies the advantages inherent in the exogenous models of coordination: coordinated processes are strictly isolated from the dependencies on their environment and the details of the protocol that do not involve their individual behavior. This leads to more reusable processes, and an independent, explicit coordination protocol that can in turn be reused to coordinate these or similar processes.
In [4] the problem of synthesizing Reo circuits from given automata specifications is discussed. In [7] we provide an approach for synthesizing constraint automata from scenario specifications represented as UML sequence diagrams. However, taking constraint automata as the bridge between scenarios and connectors, may introduce superfluous serialization in the synthesis process and result in unnecessarily complex Reo circuits, even for simple scenarios. Our work in the remainder of this paper goes one step further toward bridging the gap between low-level implementations and scenario-based specifications, by generating Reo circuits directly from UML SDs. This approach can reduce the redundancy in our previous work and make the resulting Reo circuits more compact and efficient. Furthermore, the proposed translation from UML SDs to Reo circuits in this paper is structural and therefore preserves the nature of the interaction specification inherent in a UML SD. This is an advantage over the translation from constraint automata to Reo circuits which cannot recover parallelism. In fact, in constraint automata, parallelism is lost and impossible to retract.
There is plenty of research on providing semantics for UML 2.0 sequence diagrams [14, 26, 33] . Here we use the coalgebraic approach in [33] for defining the semantics of basic sequence diagrams and formally define the operators for combining sequence diagrams. As pointed out in [33] , the coalgebraic approach is compositional, and leads to the coinductive proof style which provides an elegant way to check the bisimulation and refinement relations between models. On the other hand, the coalgebraic semantics of Reo is investigated in [6] . However, in [6] , the semantics of a Reo circuit is given not as a single coalgebra, but as a relation on the timed data streams (final coalgebras) on different nodes. This semantics precisely specifies the initial intuition behind Reo connectors. The declarative, relational nature of this semantics is one of its strengths; nevertheless, it also makes it difficult to operationalize and build formal proofs of bisimulation/simulation relations. Fortunately, the operational semantics of Reo has been investigated in [9] by using an automata-based formalism, called constraint automata. In a constraint automaton, states represent Reo configurations (which are determined by the contents of the buffers) and transitions encode maximally-parallel stepwise evolution. Transition labels show maximal sets of active nodes and sets of data constraints. It is well known that state-based systems such as automata and transition systems, can be described by coalgebras [19, 31] . Therefore, in this paper we adopt such a coalgebraic interpretation of Reo circuits, where the state space of a coalgebra has the same meaning as in its constraint automata semantics, which turns the coalgebra into an abstraction of its corresponding constraint automaton. One benefit of using this coalgebraic interpretation is that the correctness of the mapping from UML to Reo in our synthesis approach can, in principle, be formally judged by comparing their semantics.
The remainder of this paper is organized as follows: Section 2 contains a brief summary of Reo. In Section 3 we present the relevant features of UML Sequence Diagrams. We explain the construction of Reo circuits from given scenario specifications represented by UML Sequence Diagrams in Section 4. In Section 5 we prove the correctness of our synthesis approach by providing a bisimulation between the coalgebras for UML Sequence Diagrams and the synthesized Reo circuits. In Section 6, we present related work and compare it with our approach. Finally, Section 7 concludes the paper.
Reo
Reo [2] is a channel-based exogenous coordination model wherein complex coordinators, called connectors, are compositionally constructed from simpler ones. We summarize only the main concepts in Reo here. Further details about Reo and its semantics can be found in [2, 6, 9] .
Complex connectors in Reo consist of a network of primitive connectors, called channels. A connector provides the protocol that controls and organizes the communication, synchronization and cooperation among the components/services that it interconnects. Each channel has two channel ends. There are two types of channel ends: source and sink. A source channel end accepts data into its channel, and a sink channel end dispenses data out of its channel. It is possible for the ends of a channel to be both sinks or both sources. Reo places no restriction on the behavior of a channel and thus allows an open-ended set of different channel types to be used simultaneously together. Each channel end can be connected to at most one component instance at any given time.
Fig. 2 shows the graphical representation of some simple channel types in Reo. A FIFO1 channel represents an asynchronous channel with one buffer cell which is empty if no data item is shown in the box (this is the case in Fig. 2 ). If a data element d is contained in the buffer of a FIFO1 channel then d is shown inside the box in its graphical representation. A synchronous channel has a source and a sink end and no buffer. It accepts a data item through its source end iff it can simultaneously dispense it through its sink. A lossy synchronous channel is similar to a synchronous channel except that it always accepts all data items through its source end. The data item is transferred if it is possible for the data item to be dispensed through the sink end, otherwise the data item is lost. For a filter channel, its pattern P ⊆ Data specifies the type of data items that can be transmitted through the channel. Any value d ∈ P is accepted through its source end iff its sink end can simultaneously dispense d; all data items d ∉ P are always accepted through the source end, but are immediately lost.
The P-producer is a variant of a synchronous channel whose source end accepts any data item, but the value dispensed through its sink end is always a data element d ∈ P.
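The informal channel descriptions above can be summarized in a small executable sketch. It models a single interaction step of each channel in isolation, assuming that the caller knows whether the sink end is able to dispense (`sink_ready`); the names `Outcome`, `sync_step`, `lossy_sync_step`, `filter_step` and `Fifo1` are hypothetical and are not part of any Reo implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    accepted: bool   # did the source end accept the data item?
    delivered: bool  # did the sink end dispense it?


def sync_step(sink_ready):
    # Accepts iff the item can be dispensed at the same time.
    return Outcome(accepted=sink_ready, delivered=sink_ready)


def lossy_sync_step(sink_ready):
    # Always accepts; the item is lost when the sink cannot dispense it.
    return Outcome(accepted=True, delivered=sink_ready)


def filter_step(d, pattern, sink_ready):
    if d not in pattern:
        # Items outside the pattern are accepted and immediately lost.
        return Outcome(accepted=True, delivered=False)
    return Outcome(accepted=sink_ready, delivered=sink_ready)


class Fifo1:
    """Asynchronous channel with one buffer cell."""

    def __init__(self):
        self._cell = []  # holds at most one data item

    def is_empty(self):
        return not self._cell

    def write(self, d):
        if self._cell:
            return False  # buffer full: the write cannot succeed now
        self._cell.append(d)
        return True

    def take(self):
        if not self._cell:
            raise LookupError("FIFO1 buffer is empty")
        return self._cell.pop()
```

For instance, `filter_step("x", {"m"}, sink_ready=False)` reports that the item is accepted but not delivered, whereas `filter_step("m", {"m"}, sink_ready=False)` reports that the write is not accepted.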
There are some more exotic channels permitted in Reo: (A)synchronous drains have two source ends and no sink end. No data value can be obtained from drains since they have no sink end. A synchronous drain can accept a data item through one of its ends iff a data item is also available for it to simultaneously accept through its other end as well, and all data accepted by the channel are lost. An asynchronous drain accepts data items through its source ends and loses them, but never simultaneously. (A)synchronous Spouts are duals to the drain channels, as they have two sink ends.
Complex connectors are constructed by composing simpler ones via the join and hiding operations. Channels are joined together in nodes. A node consists of a set of channel ends. The set of channel ends coincident on a node A is disjointly partitioned into the sets Src(A) and Snk(A), denoting the sets of source and sink channel ends that coincide on A, respectively. Nodes are categorized into source, sink and mixed nodes, depending on whether all channel ends that coincide on a node are source ends, sink ends or a combination of the two. The hiding operation is used to hide the internal topology of a component connector. The hidden nodes can no longer be accessed or observed from outside. A complex connector has a graphical representation, called a Reo circuit, which is a finite graph where the nodes are labeled with pair-wise disjoint, non-empty sets of channel ends, and the edges represent their connecting channels. The behavior of a Reo circuit is formalized by means of the data-flow at its sink and source nodes. Intuitively, the source nodes of a circuit are analogous to the input ports, and the sink nodes to the output ports of a component, while mixed nodes are its hidden internal details. Components cannot connect to, read from, or write to mixed nodes. Instead, data-flow through mixed nodes is totally specified by the circuits they belong to.
A component can write data items to a source node that it is connected to. The write operation succeeds only if all (source) channel ends coincident on the node accept the data item, in which case the data item is transparently written to every source end coincident on the node. A source node, thus, acts as a replicator. A component can obtain data items, by an input operation, from a sink node that it is connected to. A take operation succeeds only if at least one of the (sink) channel ends coincident on the node offers a suitable data item; if more than one coincident channel end offers suitable data items, one is selected non-deterministically. A sink node, thus, acts as a non-deterministic merger. A mixed node nondeterministically selects and takes a suitable data item offered by one of its coincident sink channel ends and replicates it into all of its coincident source channel ends. At most one component can be connected to a (source or sink) node at a time. The I/O operations are performed through interface nodes of components which are called ports.
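The replicator and merger behavior of nodes can be sketched as follows. This is a simplified model under stated assumptions: the class `Cell` is a hypothetical one-place stand-in for a channel end, and the two functions treat a write or take as a single isolated step rather than as part of a synchronous data-flow across a whole circuit.

```python
import random


class Cell:
    """Hypothetical one-place stand-in for a channel end."""

    def __init__(self, *items):
        self.items = list(items)

    def can_accept(self):
        return not self.items

    def accept(self, d):
        self.items.append(d)

    def offers(self):
        return bool(self.items)

    def dispense(self):
        return self.items.pop(0)


def write_to_source_node(source_ends, d):
    """Succeeds only if every coincident source end accepts; then replicates."""
    if not all(end.can_accept() for end in source_ends):
        return False
    for end in source_ends:
        end.accept(d)
    return True


def take_from_sink_node(sink_ends, rng=random):
    """Succeeds if at least one coincident sink end offers an item;
    one offering end is selected non-deterministically."""
    ready = [end for end in sink_ends if end.offers()]
    if not ready:
        return False, None
    return True, rng.choice(ready).dispense()
```

A mixed node would combine the two steps: take an item from one offering sink end and replicate it into all coincident source ends.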
Example 1 (Sequencer). Fig. 3(a) shows an implementation of a sequencer by composing five synchronous channels and four FIFO1 channels together. The first (leftmost) FIFO1 channel is initialized to have a data item in its buffer, as indicated by the presence of the symbol e in the box representing its buffer cell. The actual value of the data item is irrelevant. The connector provides only the four nodes A, B, C and D for other entities (connectors or component instances) to take from. The take operation on nodes A, B, C and D can succeed only in the strict left-to-right order. This connector implements a generic sequencing protocol: we can parameterize this connector to have as many nodes as we want simply by inserting more (or fewer) Sync and FIFO1 channel pairs, as required. Fig. 3(b) shows a simple example of the utility of the sequencer. The connector in this figure consists of a two-node sequencer, plus a pair of Sync channels and a SyncDrain channel connecting each of the nodes of the sequencer to the nodes A and C , and B and C , respectively. The behavior of the connector can be seen as imposing an order on the flow of the data items written to A and B, through C : the sequence of data items obtained by successive take operations on C consists of the first data item written to B, followed by the first data item written to A, followed by the second data item written to B, followed by the second data item written to A, and so on.
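The observable behavior of the sequencer in Example 1 can be sketched by tracking which node currently holds the turn, which abstracts the single data item circulating through the FIFO1 channels. The class name `Sequencer` and its interface are hypothetical, and the sketch assumes that the turn returns to the first node after the last one.

```python
class Sequencer:
    """Abstract model of an n-node sequencer: takes succeed only in order."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("a sequencer needs at least one node")
        self.nodes = list(nodes)
        self._turn = 0  # index of the node whose take may succeed next

    def take(self, node):
        if self.nodes[self._turn] != node:
            return False  # out of order: the take cannot succeed yet
        self._turn = (self._turn + 1) % len(self.nodes)
        return True
```

With `seq = Sequencer("ABCD")`, the attempts A, B, D, C, D, A succeed except for the first attempt on D, which arrives before C has had its turn.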
In the remainder of the paper, we discuss the synthesis problem of Reo circuits where the input specification of the desired coordination is given by UML sequence diagrams, as presented in the next section.
UML sequence diagrams
UML Sequence Diagrams are used to model the dynamic behavior of systems. Graphically, a UML SD has two dimensions: a horizontal dimension representing the components participating in the scenario, and a vertical dimension representing time. Every component has a vertical dashed line called its lifeline. SDs focus on the message interchange among a number of lifelines. An SD describes an interaction by focusing on the sequence of messages exchanged during a system run. See Fig. 4 as an example of sequence diagrams that describe the interactions in the login phase of an online banking scenario. A UML SD is represented as a rectangular frame labeled by the keyword sd followed by the name of the interaction. The vertical lines in the SD represent lifelines for the individual participants in the interaction.
A message defines a particular communication between lifelines of an interaction. It can be either asynchronous (represented by an open arrow head) or synchronous (represented by a filled arrow head). Additionally, there are two special kinds of messages: lost (respectively, found) messages (not shown in Fig. 4 ), which are denoted by a small black circle at the end (respectively, the start) of the arrow representing the message. Note that what we are interested in is the coordination between components/services, so we consider only a subset of the UML2.0 SDs. What is left out of this subset consists of the following two parts:
(1) The internal behavior of processes represented as actions within the same lifelines in SDs (like the check action in Fig. 4) will not be considered in the synthesis process. This is justified because we intend the synthesized coordination in our approach to reflect only the inter-process interaction in a system. We specifically do not wish to obtain a global state machine that intertwines both the behavior of the components in a system and the interactions among them. (2) The neg, ignore and critical operators on sequence diagrams. These are not operators for composition of sequence diagrams, since they take as input a single sequence diagram and ''refine'' its semantics either by permitting more behaviors or by ruling out certain behaviors.
The internal behavior or internal actions within a lifeline in an SD (such as check in SD EnterPwd in Fig. 4 ) constitute an assumption that the component (Bank in Fig. 4 ) must fulfill. For the example, the check in EnterPwd means that after receiving the message verifywithBank, the bank must check the validity of the account number and password pair. This can be modeled by a constraint automaton for the bank. For the synthesis of the Reo circuit that captures the inter-process communication in the system, this constraint automaton is irrelevant. However, to verify the UML SD of a system via Reo model checkers, the automata for the system components (like the bank) may become important as well, for instance, to prove certain properties using the assume/guarantee paradigm. Given that in this paper we focus on the construction of the Reo circuit, we ignore such internal behavior, and assume that the automata for the components still can (and will) be used for the analysis of UML SDs.
Syntax
The signature of a basic UML sequence diagram is defined as follows: 
where for a set S, card(S) returns the cardinality of S.
• E is a relation such that a tuple (l 1 , ⟨m, t⟩, l 2 ) ∈ E represents a message m exchanged between l 1 and l 2 , and t denotes the type of the message (synchronous or asynchronous).
• ≤ ⊆ Loc × Loc is a partial order capturing the relative positions of locations within each lifeline in the diagram.
Note that in general, for a relation in E to represent a communication between participants in a sequence diagram, its source and target locations cannot be the same, i.e., the following property is assumed: $l_1 \neq l_2$ for every $(l_1, \langle m, t \rangle, l_2) \in E$.
Observe that this assumption does not rule out a process sending a message to itself: it only requires for the source and the target locations, which may still reside on the same lifeline, to be different.
Within this model the following function returns the next location in a particular lifeline. Formally,
where < stands, as usual, for the largest irreflexive subset of ≤.
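A possible data-structure reading of this signature is sketched below. It is a simplification under stated assumptions: each lifeline is stored as a tuple of its locations from top to bottom, which encodes the order ≤ within that lifeline, and the next-location function is taken to return the immediate successor on the same lifeline (or `None` at the end). The names `BasicSD`, `leq` and `next_loc`, as well as the location names in the usage note, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BasicSD:
    lifelines: dict  # instance identifier -> tuple of locations, top to bottom
    messages: list   # elements of E, written as (l1, (m, t), l2)

    def __post_init__(self):
        # Assumed property: source and target locations always differ.
        for l1, _label, l2 in self.messages:
            if l1 == l2:
                raise ValueError(f"message from {l1!r} to itself")

    def leq(self, l1, l2):
        """l1 <= l2 holds only for locations on the same lifeline."""
        for locs in self.lifelines.values():
            if l1 in locs and l2 in locs:
                return locs.index(l1) <= locs.index(l2)
        return False

    def next_loc(self, l):
        """Immediate successor of l on its lifeline, or None if l is last."""
        for locs in self.lifelines.values():
            if l in locs:
                i = locs.index(l)
                return locs[i + 1] if i + 1 < len(locs) else None
        raise KeyError(l)
```

For a diagram with two lifelines, `BasicSD({"p1": ("a1", "a2", "a3"), "p2": ("b1", "b2", "b3")}, [("a1", ("m1", "sync"), "b1")])` gives `next_loc("a1") == "a2"`, while locations on different lifelines are unrelated by `leq`.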
Let l 1 , l 2 range over Loc, and Σ m be the set of communication events executed concerning messages exchanged in a sequence diagram sd. For every relation (l 1 , ⟨m, t⟩, l 2 ) ∈ E, where ⟨m, t⟩ ∈ Mes, there are two events in Σ m , corresponding to the sending and receiving of m, which occur at locations l 1 and l 2 respectively. Each event e ∈ Σ m has one of the following forms:
(1) ⟨l 1 → l 2 , m⟩: l 1 sends the asynchronous message m to l 2 .
UML SDs may contain sub-interactions called interaction fragments that can be structured and combined using interaction operators. There are several possible operators in UML for composing sequence diagrams, such as alt, opt, strict, par, seq and loop. Depending on the operator used, an interaction fragment consists of one or more operands. For loop and opt, the fragment has exactly one operand, while the other operators have several operands. 1
1 As mentioned earlier, a few more operators are given in [30] than the ones we consider here, for instance neg, ignore and critical. These operators take as input a single sequence diagram and ''refine'' its semantics either by permitting more behaviors or by ruling out certain behaviors, but they are not used for composition of sequence diagrams. The negative operator neg designates that the fragment represents traces that are invalid; the ignore operator designates that there are some messages that are not shown within the fragment, which are insignificant and can appear anywhere in the traces; the critical operator designates that the fragment represents a critical region, which means that the traces of the region cannot be interleaved by other event occurrences (on those lifelines covered by the region). Such operators provide some sort of coercion which restricts or expands the underlying possible behaviors. They are useful, for example, for verifying system properties and test case construction. We can easily handle the cases for these operators in our framework. For simplicity, we will not consider these operators further in this paper.
Definition 2. The syntax for sequence diagrams is defined as follows:
where sd denotes a basic sequence diagram, and alt, par, strict, seq, opt and loop are the interaction composition operators.
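As an illustration, the composition operators can be represented as an abstract syntax tree. The sketch below is an assumption-laden reading rather than the grammar of Definition 2 itself: it takes alt, par, strict and seq to be binary, attaches a guard to alt and loop, and uses hypothetical class and diagram names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    name: str  # a basic sequence diagram, referred to by name


@dataclass(frozen=True)
class Opt:
    body: "SD"  # exactly one operand


@dataclass(frozen=True)
class Loop:
    guard: str
    body: "SD"  # exactly one operand


@dataclass(frozen=True)
class Alt:
    guard: str
    left: "SD"
    right: "SD"


@dataclass(frozen=True)
class Par:
    left: "SD"
    right: "SD"


@dataclass(frozen=True)
class Strict:
    left: "SD"
    right: "SD"


@dataclass(frozen=True)
class Seq:
    left: "SD"
    right: "SD"


SD = Union[Basic, Opt, Loop, Alt, Par, Strict, Seq]


def basic_diagrams(sd):
    """List the names of the basic diagrams in a term, left to right."""
    if isinstance(sd, Basic):
        return [sd.name]
    if isinstance(sd, (Opt, Loop)):
        return basic_diagrams(sd.body)
    return basic_diagrams(sd.left) + basic_diagrams(sd.right)
```

A term such as `Seq(Basic("EnterPwd"), Alt("valid", Basic("Welcome"), Opt(Basic("Retry"))))` then describes a composite scenario built from three basic diagrams; the names `Welcome` and `Retry` are invented for this example.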
Semantics
The semantics for a sequence diagram sd = (I, Loc • , Mes, Loc ini , loc, E, ≤) can be defined in terms of coalgebras as $(C, \langle \epsilon, \alpha \rangle : C \to \mathcal{P}(\Sigma_m) \times C^{\Sigma})$, where C is the set of the possible configurations of sd, ϵ : C → P(Σ m ) denotes the set of active events in a configuration, and $\alpha : C \to C^{\Sigma}$. Thus, the semantics of a sequence diagram is a coalgebra of the functor
$X \mapsto \mathcal{P}(\Sigma_m) \times X^{\Sigma}$
together with an initial configuration c 0 , which is given by the tuple of initial locations c 0 = ∏ l∈Loc ini l. The set of initially active events is
A configuration of a sequence diagram denotes a global state, composed of the local states of its participants. For every configuration, there is a set of active events that may happen in that configuration.
Definition 3.
A configuration c of a sequence diagram is a tuple of the local states (locations) of its participants.
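The coalgebraic reading can be made concrete with a small sketch: a coalgebra is given by an initial configuration, an observation function returning the active events, and a transition function. The class `SDCoalgebra` and the method `enabled_after` are hypothetical names, and returning `None` for a trace that is not enabled is a convention of this sketch only.

```python
class SDCoalgebra:
    """A configuration set with an observation (active events) and a
    transition function, together with an initial configuration."""

    def __init__(self, initial, active, step):
        self.initial = initial
        self.active = active  # configuration -> set of active events
        self.step = step      # (configuration, event) -> configuration

    def enabled_after(self, trace):
        """Set of active events after executing `trace` from the initial
        configuration, or None if some event of the trace is not active."""
        c = self.initial
        for e in trace:
            if e not in self.active(c):
                return None
            c = self.step(c, e)
        return self.active(c)


# Toy example: one asynchronous message m, sent and then received.
ACTIVE = {0: {"send m"}, 1: {"recv m"}, 2: set()}
toy = SDCoalgebra(initial=0, active=ACTIVE.__getitem__, step=lambda c, e: c + 1)
```

Here `toy.enabled_after(["send m"])` yields `{"recv m"}`, which mirrors the view of behavior as a function from event traces to sets of enabled events.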
Suppose C denotes the set of all possible configurations. For any event e ∈ Σ m , the location at which e happens is defined by π(e) = l iff e = ⟨l(· · · )⟩. This notation generalizes for a set of events
We now define α, the curried version of α, by enumerating all possible transition schemes. For synchronous messages, the events modelling both sending and receiving occur simultaneously (i.e., in an atomic, non-interruptible way): no other event can occur in between. So if the current configuration is c and both the sending event e = ⟨l 1 l 2 , m⟩ and its corresponding receiving event e′ = ⟨l 2 l 1 , m⟩ are active, i.e., e ∈ ϵ(c) and e′ ∈ ϵ(c), then we have
For asynchronous messages, however, when the sending event occurs, the location of the sender will be updated to the next location in its lifeline, while the locations of the other participants remain unchanged. The sending event is therefore removed from the set of active events. On the other hand, the corresponding receiving event will be added to this set. Furthermore, the events at the next location of the sender's lifeline will become active in the new configuration. If
Dually, when an asynchronous message is received, the receiver changes to the next location in its lifeline, while the locations of all other participants remain unchanged. Formally, if e = ⟨l 1 ← l 2 , m⟩ is active in configuration c, we have
The case of a lost message, represented by the event e = ⟨l → •, m⟩, is similar to the asynchronous communication: the sender updates its location and e is removed from the set of active events. However, no corresponding receiving event becomes active. Similarly, for a found message, when a receiving event e = ⟨l ← •, m⟩ occurs, only the location of the receiver is updated and e is removed from the set of active events. Both cases are, therefore, handled by
assuming the corresponding events are enabled in configuration c. Additionally, note that for a coregion in one lifeline in a sequence diagram the order of event occurrences is not significant. In our model, a coregion is taken as one location. Let l be the location for a coregion; we use Σ l = {e | π(e) = l} to denote all the events that are active in the coregion. If e ∈ Σ l , then
The semantics of an interaction fragment depends on the operator used in its definition, as informally described in the UML superstructure specification [30] . These operators have been formally investigated in [33] , which leads to an algebra for building new sequence diagrams from existing ones. The denotations given in the following paragraphs are similar to those in [33] for the operators. However, since the functor type is different than the one in [33] , 2 the definitions of all the
, the definitions are given as follows:
Strict sequential composition: strict(sd 1 , sd 2 )
The transition structure in
where the transition structure is defined as
The purpose of opt(sd 1 ) is to offer an alternative between an empty scenario (in which 'nothing happens') and the activation of its (sole) operand, sd 1 . To formalize its meaning we need to introduce a new event -skip -to capture the absence of effective behavior. The event skip does nothing but terminates successfully. Then
The transition structure is defined as
2 The reasons for the difference are: (1) the functor used in [33] also takes internal actions within one lifeline into consideration, but this is not useful for our synthesis of connectors; (2) the functor makes explicit that a set of enabled events is present in the initial state, i.e., before any interaction occurs; and (3) the carrier of the corresponding final coalgebra takes a quite simple form $\nu = \mathcal{P}(\Sigma_m)^{\Sigma^*}$, i.e., functions that relate each Σ-trace to the set of enabled events upon completion of its execution.
3 To avoid an excessive notational burden, we use the same syntax for the combinator over sequence diagrams and its denotation in the proposed semantics.
Semantically, an option is equivalent to a nondeterministic choice between two operands where one operand has non-empty content and the other is empty. So when opt(sd 1 ) is at c 0 , the choice between skip and e ∈ Σ 1 is nondeterministic.
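At the level of traces, this reading of opt can be sketched in a few lines. The sketch is a simplification that works on finite sets of traces instead of the coalgebraic definition, and the function names are hypothetical; only the event name skip is taken from the text.

```python
SKIP = "skip"


def opt_initial_events(initial_events_sd1):
    """Events active at the initial configuration of opt(sd1):
    those of sd1 plus the new event skip."""
    return frozenset(initial_events_sd1) | {SKIP}


def opt_traces(traces_sd1):
    """Traces of opt(sd1): either do nothing (skip) or behave as sd1."""
    return {(SKIP,)} | {tuple(t) for t in traces_sd1}
```

For an operand with the single trace `("send m", "recv m")`, `opt_traces` returns that trace together with the one-event trace `("skip",)`.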
Choice: alt(g
Denoting an alternative form of aggregation of sequence diagrams, it requires that c 
where x is a configuration in C , and e is an event in either Σ 1 or Σ 2 .
Weak sequential composition: seq(sd 1 , sd 2 )
The case for weak sequential composition seq(sd 1 , sd 2 ) for sd i , i = 1, 2 is a bit more demanding because its definition depends on whether the operands share any lifelines. If such is the case, then for every identifier s ∈ I 1 ∩I 2 , all event occurrences on s in sd 1 must happen before those on s in sd 2 . However, all other events in sd 1 and sd 2 on lifelines not in I 1 ∩ I 2 may occur in any order. Note that if the operands involve disjoint sets of participants (i.e., I 1 ∩ I 2 = ∅), the weak sequencing reduces to a parallel merge.
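The ordering constraint of weak sequencing can be illustrated on single traces. The sketch below is a trace-level simplification, not the coalgebraic definition: an event is a pair (lifeline, label), each operand is one trace, the shared lifelines are approximated by the lifelines that carry events in both traces, and the function names are hypothetical.

```python
from itertools import combinations


def interleavings(t1, t2):
    """All merges of t1 and t2 that keep the internal order of each trace.
    Every event is tagged with its origin (1 or 2)."""
    n, k = len(t1), len(t2)
    for positions in combinations(range(n + k), n):
        slots = set(positions)
        it1, it2 = iter(t1), iter(t2)
        yield tuple(
            (next(it1), 1) if i in slots else (next(it2), 2)
            for i in range(n + k)
        )


def weak_seq(t1, t2):
    """Merges in which, on every shared lifeline, all events of the first
    operand precede all events of the second operand."""
    shared = {lf for lf, _ in t1} & {lf for lf, _ in t2}
    result = set()
    for tagged in interleavings(t1, t2):
        started_second = set()  # shared lifelines already used by operand 2
        ok = True
        for (lifeline, _label), origin in tagged:
            if lifeline not in shared:
                continue
            if origin == 2:
                started_second.add(lifeline)
            elif lifeline in started_second:
                ok = False
                break
        if ok:
            result.add(tuple(event for event, _ in tagged))
    return result
```

When the two traces involve disjoint lifelines, `weak_seq` returns every interleaving, which corresponds to the parallel merge mentioned above.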
Assume an identifier s, such that I 1 ∩ I 2 = {s}, and the functions loc 1 and loc 2 assigning locations to the instances in sd 1 and sd 2 , respectively. Let loc(s) = loc 1 (s) ∪ loc 2 (s). Furthermore, and without loss of generality, let C 1 = loc 1 (s) × L and C 2 = loc 2 (s)×K be the set of configurations for sd 1 and sd 2 respectively, where
The transition structure is given by
We use ϵ as an abbreviation for seq(ϵ 1 , ϵ 2 ), and for any c ∈ C ,
This definition can be easily generalized to an arbitrary number of shared lifelines in sd 1 and sd 2 .
Loop: loop(g : sd 1 )
Finally, the semantics of the iteration combinator is given by
, and with the following transition structure
From UML sequence diagrams to Reo
We now address the issue of constructing Reo circuits from scenario specifications represented by UML SDs. Since the source and the sink nodes of a Reo circuit are used for components to exchange data through write and take operations, we first need to identify the node set N of a circuit involved in an interaction. Assume the participants (components) involved in the interaction are represented by the set of lifelines L = {p 1 , . . . , p n }. For simplicity, and without loss of generality, we assume every component has only one input and one output port connected to the corresponding sink and source nodes of the Reo circuit. Therefore, our starting point is a description of a component connector by its source nodes C 1 , . . . , C n and sink nodes D 1 , . . . , D n , such that each component p i can write messages to the node C i and take messages from the node D i .
Additionally, the interaction behavior coordinated by the connector is described by a set of UML SDs.
In the sequel, let N = {C 1 , . . . , C n } ∪ {D 1 , . . . , D n } contain all nodes attached to the components involved in a scenario specification, where we assume that the C i 's are source nodes and the D j 's are sink nodes. Our goal is to construct a Reo circuit R with source nodes C 1 , . . . , C n and sink nodes D 1 , . . . , D n , such that the behavior represented by the scenario specification is permitted by the communication protocol encoded in R.
For the construction of R, we first consider the construction of Reo circuits for basic sequence diagrams without interaction operators. Assume that there are n lifelines p 1 , . . . , p n in a basic SD. Every lifeline p i represents an individual participant in the interaction, and we can derive an order of event occurrences along the lifeline, which is significant as it denotes the order in which these events occur.
Reo circuits for individual participants
The first step in our synthesis approach consists of deriving a sequencer for every lifeline p i in a basic sequence diagram.
If there are k events (sending and receiving of messages) on a lifeline p i (and thus k locations l 1 , . . . , l k ), then the sequencer corresponding to p i also has k nodes (e.g., A 1 , . . . , A k ), the order of which corresponds to the order of the events/locations on p i . Without loss of generality, we assume here that the component/service/process that implements the behavior described by p i produces/consumes all of the k messages corresponding to these k events through two separate, dedicated I/O ports (e.g., C i for the output port and D i for the input port). Next, we link each of the two ports of the process that implements p i to its corresponding nodes of the sequencer using synchronous drains and either synchronous or filter channels. If the event corresponding to a location l j involves the sending of a message m, then a filter with pattern m is used to link the node (output port) C i to the synchronous drain channel connected to that location's respective sequencer node A j . On the other hand, if the event involves receiving a message, then a synchronous channel is used to link the node (input port) D i to the synchronous drain channel connected to that location's respective sequencer node A j . Note that when a lifeline has only one location, i.e., there is only one sending or receiving event on the lifeline, there is no need for a sequencer; in other words, a one-node sequencer together with its attached synchronous drain degenerates into a single node.
Fig. 5 shows an example of our approach. There are two participants p 1 and p 2 involved in the scenario. The interaction between them is shown by the messages m 1 , m 2 and m 3 , which are all synchronous in this example. We first consider p 1 , whose behavior consists of sending message m 1 , then receiving message m 2 , and finally sending message m 3 .
There are three events happening on the lifeline, so we introduce a 3-node sequencer, where the first and the last nodes are connected to the node C 1 by two filters respectively, and the node in the middle is connected to D 1 via a synchronous channel. The patterns m 1 and m 3 on the filters ensure that p 1 can write only these two messages out, and the synchronization between the nodes of the sequencer and the nodes of the channels connected to C 1 and D 1 ensures that the sending of m 1 happens first. Note that on the p 1 side, there is no restriction on the channel A 2 D 1 , like a filter, to ensure that the message received through this channel is m 2 . This is guaranteed by the filter in the synthesized part for p 2 , as shown in Fig. 6 . In other words, the type of a message is always guaranteed by its sender.
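The per-lifeline construction can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name lifeline_channels, the tuple encoding (kind, end 1, end 2, pattern) and the intermediate link nodes named X are hypothetical and do not appear in the paper; only the ports C i and D i and the channel kinds follow the description above.

```python
def lifeline_channels(i, events):
    """Derive the sequencer nodes and linking channels for lifeline p_i.

    events: list of ("send" | "recv", message) pairs in lifeline order.
    Returns (sequencer_nodes, channels); each channel is a tuple
    (kind, end1, end2, pattern). The link nodes X{i}_{j} are hypothetical.
    """
    k = len(events)
    channels = []
    for j, (kind, message) in enumerate(events, start=1):
        link = f"X{i}_{j}"
        if kind == "send":
            # A filter with the message as pattern leaves the output port C_i.
            channels.append(("Filter", f"C{i}", link, message))
        elif kind == "recv":
            # A plain synchronous channel delivers to the input port D_i.
            channels.append(("Sync", link, f"D{i}", None))
        else:
            raise ValueError(f"unknown event kind: {kind}")
        if k > 1:
            # A synchronous drain ties the link to sequencer node A_j.
            channels.append(("SyncDrain", link, f"A{i}_{j}", None))
    # A lifeline with a single event needs no sequencer at all.
    sequencer = [f"A{i}_{j}" for j in range(1, k + 1)] if k > 1 else []
    return sequencer, channels


# Lifeline p1 of the running example: send m1, receive m2, send m3.
seq_p1, chans_p1 = lifeline_channels(
    1, [("send", "m1"), ("recv", "m2"), ("send", "m3")]
)
# A lifeline with one event degenerates to a single channel.
seq_single, chans_single = lifeline_channels(2, [("send", "m")])
```

For p 1 the sketch yields a three-node sequencer, two filters with patterns m1 and m3, one synchronous channel towards D1, and three synchronous drains.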
Reo circuits for basic SDs
After we derive the sequencers for all participants from their respective lifelines in isolation, we can connect their respective nodes A i , B j , . . . pairwise by synchronous or asynchronous Reo channels, according to the types and the order of messages, as defined in a basic SD. If (l 1 , ⟨m, t⟩, l 2 ) ∈ E represents the exchange of a message m between locations l 1 and l 2 , and m is a synchronous (respectively, asynchronous) message, then a synchronous (respectively, asynchronous) channel is used to link the nodes corresponding to l 1 and l 2 . The direction of the channel is consistent with the direction of the message exchange, i.e., the source (respectively, sink) node of the channel corresponds to the source (respectively, target) location of the message.
For example, the Reo circuit on the right-hand-side in Fig. 7 is the result of composing the Reo circuits in Figs. 5 and 6, according to the basic SD on the left-hand-side of Fig. 7 , where all messages are synchronous. In the synthesized connector for the whole SD, source nodes C i and sink nodes D i are attached to p i respectively. Component p 1 can write messages m 1 and m 3 to the source node C 1 , and receive message m 2 from the sink node D 1 . The filter connected to C 1 and the sequencer ensure that p 1 receives some message after it sends out the message m 1 and before it sends out the message m 3 . From the synchronous channel between A 2 and B 2 , and the filter C 2 B 2 , we know that the message received by p 1 is m 2 .
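The pairwise connection step can be sketched in Python as follows. The function name basic_sd_channels, the location naming scheme and the tuple encoding are hypothetical. The sketch further assumes that the order in which messages are listed fixes the order of events on every lifeline, which holds for synchronous messages but not necessarily for asynchronous ones.

```python
from collections import defaultdict


def basic_sd_channels(messages):
    """Link the locations of a basic SD by one channel per message.

    messages: list of (sender, receiver, name, synchronous) in diagram order.
    Returns (locations, channels): locations maps each lifeline to its
    ordered location names, and channels holds tuples
    (kind, source_location, target_location, message).
    """
    locations = defaultdict(list)
    channels = []
    for sender, receiver, name, synchronous in messages:
        # The sending event occupies the next free location of the sender.
        src = f"{sender}.l{len(locations[sender]) + 1}"
        locations[sender].append(src)
        # The receiving event occupies the next free location of the receiver.
        dst = f"{receiver}.l{len(locations[receiver]) + 1}"
        locations[receiver].append(dst)
        # The channel direction follows the direction of the message.
        kind = "Sync" if synchronous else "FIFO1"
        channels.append((kind, src, dst, name))
    return dict(locations), channels


# Three synchronous messages exchanged between p1 and p2.
locs, chans = basic_sd_channels([
    ("p1", "p2", "m1", True),
    ("p2", "p1", "m2", True),
    ("p1", "p2", "m3", True),
])
```

Each message contributes exactly one channel, and each lifeline receives as many locations as it has events, which is the input expected by the per-lifeline sequencer construction.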
Messages in a UML SD can also be asynchronous. Such messages are graphically represented by open arrowheads, as for the messages displayindexpage and displayonlineBank in the SD UserArrives in Fig. 4. There are different possibilities for the ordering of events for asynchronous messages, as shown in Fig. 8. Since the order of asynchronous message passing may differ, it is not possible to use only one asynchronous channel for all asynchronous communications. In Fig. 9, we give the Reo circuits for the scenarios in Fig. 8(a) and (b), respectively. The FIFO1 channels are used for asynchronous messages, where the ordering of events is controlled by the topology of the Reo circuits.
There can be a coregion area in a lifeline in UML SDs where the order of event occurrences on the lifeline is not significant. Fig. 8(c) shows an example of a coregion. In this case, the corresponding Reo circuit is as shown in Fig. 10(a) , in which an exclusive router EXR is needed, which is, in turn, composed of five synchronous channels, two lossy synchronous channels and a synchronous drain, as shown in Fig. 10(b) .
One participant in a scenario can send the same message multiple times. Fig. 11 shows an example. In this case, an exclusive router EXR can be used on the side of the sender, where the messages through the two sink nodes of EXR are ordered by the sequencer, and connected to the nodes corresponding to the different receivers, respectively. Another possible solution is to use lossy synchronous channels on the sender side, so that each time the message m 1 is sent by p 1 , it is transmitted through only one of the branches and lost by the others. The order is decided, again, by the sequencer.
Messages in UML SDs can be lost. Lost messages are messages with known sender, but the reception of the message does not happen. Such a situation can be captured by the Lost connector as shown in Fig. 12 , where the source node B lost can take a message from outside, and lose it in the synchronous drain. Such a component can be integrated in the synthesized connector by connecting the node B lost to the node A i synchronized with the sequencer for the sender of the lost message.
A message can also be found. A found message is a message whose receiving event occurrence is known, but has no sending event occurrence. This is because the origin of the message is outside the scope of the participants. We can describe such messages by the Found connector in Fig. 12 whose found message is m. The sink node A found can be connected to some node B i for the receiver of the message m in the corresponding Reo circuit.
Composing Reo circuits following UML operators
So far, we have focused on Reo circuits for basic SDs. Next, we describe our compositional construction of the Reo circuits of basic SDs following a structural induction approach. To structure the connectors according to the operators in UML SDs, we use a general structure for the Reo circuits as shown in Fig. 13: R sd is the Reo circuit for a basic SD sd, which is obtained as in the previous section. In this construction, six (3 × 2) more nodes are added to R sd :  A sd and  B sd are two nodes synchronized with the nodes of the sequencers inside R sd , that correspond to the first sending event and the last receiving event in sd, respectively. If the source node A sd is fed from outside with some data element, then it is put into the buffer between A sd and A sd . As soon as A sd takes the data element from the buffer, the subcircuit R sd is ''activated'' by the first message received through some C i . Similarly, the communication via the subcircuit stops as soon as a data element arrives at B sd , which puts it into the buffer between B sd and B sd . Thus, data-flow at the sink node B sd can be viewed as a signal that R sd has terminated. As an example, the generalized Reo circuit for the connector in Fig. 7 is shown in Fig. 14.
Assume we have already constructed the circuits for the interaction fragments (basic SDs) of a sequence diagram SD. We now explain how to construct a Reo circuit R SD for the whole diagram. Note that in the transformation rules for alt, par, strict and seq, we only consider the case for two operands. In fact, using the graph transformation approach proposed in [20, 24] to instantiate parameterized Reo circuits, these rules extend to an arbitrary number of operands.
For SD = alt(g 1 : sd1, g 2 : sd2), the Reo circuit R SD is obtained by combining R sd1 and R sd2 with a replicator connected to three filters and one exclusive router. The patterns on the filters correspond to the two guard conditions g 1 , g 2 , and the conjunction of ¬g 1 and ¬g 2 . The data item d to be transmitted via the channel A SD A SD is related to the guard condition that may be obtained from other nodes (i.e., g 1 and g 2 in Fig. 15). In this case, there is another Reo circuit R connected to A SD , in which a FIFO channel  A sd  B sd is used for the control flow (see Fig. 14 as an example). If a data item (message) d is used as a parameter in the guard conditions g 1 and g 2 , and it is transmitted through node A i in R, then we can move the source channel end of the FIFO1 channel  A sd  B sd in R from node  A sd to node A i to get the data item d related to the guard conditions, so that it can be used in the alternative choice. If only one guard condition is satisfied, then the corresponding Reo circuit (R sd1 or R sd2 ) will be activated. If both conditions are satisfied, the token flows through the exclusive router, which makes a non-deterministic choice, and one of the two subcircuits is activated by the corresponding filter and synchronous drain. If neither condition is satisfied, the token goes through the filter with pattern ¬g 1 ∧ ¬g 2 from A SD to B SD directly.
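The routing decision taken by the alt construction can be summarized by the following Python sketch. It models only the choice among the three filters and the exclusive router, not the channels themselves. The function name alt_route and the string labels are hypothetical, and the guards are assumed to be Boolean predicates on the data item carried by the token.

```python
import random


def alt_route(token, g1, g2, rng=random):
    """Return which branch of alt(g1 : sd1, g2 : sd2) the token activates.

    g1, g2: predicates on the token (assumed encoding of the guards).
    rng: source of the non-deterministic choice made by the exclusive
    router when both guards hold.
    """
    holds1, holds2 = g1(token), g2(token)
    if holds1 and holds2:
        # Both filters accept: the exclusive router picks exactly one branch.
        return rng.choice(["sd1", "sd2"])
    if holds1:
        return "sd1"
    if holds2:
        return "sd2"
    # Only the filter with pattern (not g1) and (not g2) accepts:
    # the token bypasses both subcircuits.
    return "skip"


def is_small(d):
    return d < 10


def is_even(d):
    return d % 2 == 0


only_first = alt_route(3, is_small, is_even)    # only g1 holds
only_second = alt_route(12, is_small, is_even)  # only g2 holds
neither = alt_route(13, is_small, is_even)      # no guard holds
both = alt_route(4, is_small, is_even, random.Random(0))
```

When both guards hold, either branch may be returned, which mirrors the non-deterministic choice of the exclusive router.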
For SD = opt(sd), the Reo circuit R SD is obtained by combining R sd and a FIFO1 channel with an exclusive router that chooses to either activate the behavior of the operand sd or skip the fragment while nothing happens (Fig. 16 ).
For SD = par(sd1, sd2), the Reo circuit is obtained by combining R sd1 and R sd2 with a replicator, which represents a parallel activation of both operands, where the internal FIFO channels in R sd1 and R sd2 (those connected to A sdi and B sdi inside the boxes, which are not drawn in the picture) ensure that in the combination, the events in the two branches can be interleaved.
In parallel composition, if there exists some common participant p i in sd1 and sd2, then R sd1 and R sd2 should also have shared nodes C i and D i , which are obtained by merging the nodes with the same name in the two Reo circuits (as shown in Fig. 18(a)). For the source node C i , if there is some message m sent by p i in both sd1 and sd2 (as shown in Fig. 18(c)), then in the resulting Reo circuit R SD , the two filters with pattern m are replaced by one filter and one exclusive router, as shown in Fig. 18(b). All filters with source end C i whose pattern P does not appear in the other operand are kept unchanged in the resulting circuit R SD .
For SD = strict(sd1, sd2), the Reo circuit is obtained by combining R sd1 and R sd2 as in Fig. 19 , where the FIFO channels in R sd1 and R sd2 ensure that in the combined connector none of the events in sd2 can happen before the communication in sd1 has finished.
The case for weak sequencing seq(sd1, sd2) is more complex, because the definition of the seq operator depends on whether the operands share some lifelines. If a common lifeline p i exists in both sd1 and sd2, then all the event occurrences on p i in sd1 must happen before those on p i in sd2. However, event occurrences on different lifelines in the two operands may occur in any order. If the operands involve disjoint sets of participants, the weak sequencing reduces to a parallel merge, as shown in Fig. 17. Otherwise, suppose they share a lifeline p i , and the sequencers in the circuits for sd1 and sd2 for p i have n 1 and n 2 nodes, respectively. Then we can add two more nodes A w and B w to the two Reo circuits R sd1 and R sd2 , respectively, and synchronize them with the nodes of the sequencers inside the two Reo circuits such that they correspond to the last event on lifeline p i in R sd1 and the first event on p i in R sd2 . The two nodes A w and B w are ordered by adding a FIFO1 channel and a SynchDrain between them, as shown in Fig. 20.
Note that the resulting Reo circuit in Fig. 20 for weak sequencing can be optimized by replacing the two sequencers in R sd1 and R sd2 corresponding to the same lifeline p i with one sequencer with n 1 + n 2 nodes, where the first n 1 nodes are synchronized with the n 1 nodes of the sequencer for p i in R sd1 using synchronous drains, and the last n 2 nodes are similarly synchronized with the n 2 nodes of the sequencer for p i in R sd2 . The two sequencers Sequencer i sd1 and Sequencer i sd2 together with the synchronous drains connecting them can then be removed in the resulting Reo circuit R SD since the ordering information is now kept by the new sequencer, as shown in Fig. 21.
For SD = loop(sd), the Reo circuit is obtained from R sd as in Fig. 22 . The connector gEXR is a variant of the exclusive router in Fig. 10(b) , where we replace the lossy synchronous channels by two filters with the patterns g and ¬g respectively, where g is the guard condition of the loop. If g is satisfied, then the loop iterates, otherwise, it stops. 
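The control flow realized by gEXR can be paraphrased in Python as follows. This is a behavioral sketch only: the names gexr and run_loop are hypothetical, the loop body is assumed to be a function that may update the data item on which the guard is evaluated, and the iteration limit is an addition that merely keeps the sketch from running forever.

```python
def gexr(token, guard):
    """Guarded exclusive router: exactly one of the two filters accepts."""
    return "iterate" if guard(token) else "exit"


def run_loop(token, guard, body, limit=1000):
    """Re-activate the body for as long as the guard holds on the token.

    body: function standing in for one activation of R_sd (assumption).
    limit: safety bound on the number of iterations (not in the paper).
    Returns the final token and the number of completed iterations.
    """
    iterations = 0
    while gexr(token, guard) == "iterate":
        if iterations == limit:
            raise RuntimeError("iteration limit reached")
        token = body(token)
        iterations += 1
    return token, iterations


# The guard n < 3 holds for 0, 1 and 2, so the body runs three times.
final_token, count = run_loop(0, lambda n: n < 3, lambda n: n + 1)
```

Because the filters carry the complementary patterns g and ¬g, exactly one of the two outcomes is possible for any given token, in contrast to the non-deterministic choice of the plain exclusive router.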
Example 2.
To illustrate our approach, we consider the sequence diagram for the on-line banking example in Fig. 4 . The generated Reo circuit is given in Fig. 23 . Note that for simplicity, we do not give the Reo circuit for the whole scenario. Instead, we show the structure of the connector for the whole scenario in Fig. 23(a) and give one subcircuit R Login in Fig. 23(c) , which corresponds to the basic sequence diagram Login. Fig. 23(b) shows the internal structure of the guarded exclusive router gEXR used in Fig. 23(a) . Details of the other building blocks (subcircuits) in Fig. 23(a) are similar to Fig. 23(c) , and can be easily obtained by our synthesis approach.
One step further
Our construction can be easily extended to treat timing constraints in UML SDs. Since the UML semantics of messages with timing constraints is ambiguous and admits different interpretations, we shall not be exhaustive here, but rather suggest a possible step in this direction and leave a full investigation to future work. As an example, we consider a message m{0..t}, which states that the message m is constrained to last between 0 and t time units. If the receiver of the message is not ready to accept it within t time units, the message can either be lost or be stored in some queue, waiting for the receiver to process it. We assume that a message with a time constraint m{0..t} can always be successfully transmitted to the receiver side, where it waits to be processed. Under this interpretation, we just need to connect the nodes A i and B j , which are internal nodes of the synthesized circuit under construction, to the nodes for the sender and the receiver of the message, respectively, via a P-producer (where P is the singleton data set {expire}), a synchronous drain and a timer channel with early expiration. Such a timer channel allows the timer to produce its timeout signal through its sink end and reset itself when it consumes a special ''expire'' value through its source [3]. As an example, we replace the message m 1 in Fig. 8(a) by m 1 {0..10}, and show the resulting Reo circuit in Fig. 24.
Correctness
To provide a soundness proof of our approach, we use the framework of coalgebraic bisimulations. For this purpose, we provide a coalgebraic interpretation for the Reo circuit of a sequence diagram and then establish a homomorphism from the coalgebra semantics of a sequence diagram to the coalgebra assigned to its Reo circuit. The bisimulation relation between the coalgebras for sequence diagrams composed by interaction operators and the corresponding Reo circuits can be obtained from the bisimulations between the component coalgebras. Recall that the denotational semantics of a sequence diagram can be viewed as a coalgebra (C, ⟨ϵ, α⟩) where C is the set of possible configurations.
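The homomorphism condition underlying this proof strategy can be checked mechanically on finite examples. The following Python sketch assumes that both coalgebras are given as dictionaries, that the observation functions ϵ and ρ return the set of enabled events, and that the transition functions α and β are defined on every enabled event; the function name is_homomorphism and the toy data are hypothetical.

```python
def is_homomorphism(eps, alpha, rho, beta, h):
    """Check that h maps the coalgebra (C, <eps, alpha>) into (U, <rho, beta>).

    eps, rho: dict from state to the set of enabled events (assumption).
    alpha, beta: dict from (state, event) to successor state.
    h: dict from configurations in C to states in U.
    """
    for c, enabled in eps.items():
        # Observations must agree: eps(c) = rho(h(c)).
        if enabled != rho[h[c]]:
            return False
        # Transitions must commute: h(alpha(c, e)) = beta(h(c), e).
        for e in enabled:
            if h[alpha[(c, e)]] != beta[(h[c], e)]:
                return False
    return True


# Toy instance: a diagram with a single message m and a two-state circuit.
eps = {"c0": {"m"}, "c1": set()}
alpha = {("c0", "m"): "c1"}
rho = {"u0": {"m"}, "uF": set()}
beta = {("u0", "m"): "uF"}

good = is_homomorphism(eps, alpha, rho, beta, {"c0": "u0", "c1": "uF"})
bad = is_homomorphism(eps, alpha, rho, beta, {"c0": "u0", "c1": "u0"})
```

Whenever such a map exists, its graph is a bisimulation between the two coalgebras, which is the property the soundness argument relies on.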
Let is decided by the buffers of the FIFO channels. The FIFO channels in the synthesized Reo circuits can be separated into two categories: channels for the control structure and channels for the communication behavior. Taking Fig. 24 as an elements only in the case of channels that contribute to the communication behavior, i.e., the messages. Since the actual data values in the buffers of the channels that comprise its control structure cannot be observed through the ports of the synthesized connector circuit, we abstract away from these data values and consider only the empty or non-empty status of such FIFO channels as the indicator of a state. Then the behavior preservation for the synthesized circuit can be witnessed by the bisimulation relation between the two coalgebras [31] . Note that here the coalgebraic interpretation for Reo circuits is different than the coalgebraic semantics defined in [6] , where the semantics is expressed in terms of relations on infinite Timed Data Streams, i.e., relations on final coalgebras. Our approach uses one coalgebra, instead of a relation on multiple coalgebras, for the semantics of a Reo circuit, and this coalgebra is an abstraction of the constraint automaton that captures the operational semantics of the Reo circuit [9] .
In the sequence diagram sd, for every lifeline p i corresponding to the instance i ∈ I, suppose card(loc(i)) = n, then according to the construction in Section 4.1, the sequencer on the p i side also has n external nodes and n FIFO channels inside. 
For a state u ∈ U and a buffer buf ∈ u, elem(buf ) returns the element in buf . Thus the function ρ is defined as we can assume that the buffer buf F is in u after ⟨l
Proof. We prove bisimulation equivalence of the coalgebras of sd and its Reo circuit R sd by establishing a homomorphism
All we have to prove is that h is a coalgebra homomorphism, i.e., that
ϵ(c) = ρ(h(c))
We first give the proof for Eq. (4). Let c = ∏
The corresponding results for lost and found messages and co-region location can be similarly obtained.
We now provide the proof for Eq. (5) 
As the induction step, we assume that ρ(h(c)) = ϵ(c) and show that for any e, ρ(h · α(c, e)) = ϵ(α(c, e)).
We first consider e = ⟨⟨l 
The proof for lost and found messages and co-region can be similarly obtained.
We obtain the equivalence result for arbitrary sequence diagrams by an induction on the structure of the sequence diagrams as in the following theorem. There is a slight change in the proof style used for Theorem 1 and Proposition 1. For Proposition 1, the bisimulation between each sequence diagram and its translation is shown by means of a homomorphism between their corresponding coalgebras. In the proof of Theorem 1, however, we construct the concrete bisimulation relation directly instead of using the graph of a homomorphism.
Note that the composed R sd has the general structure as in Fig. 13 , which has three FIFO channels besides the ones in the sequencers and for the message passing between the external ports. Let buf AA , buf A  B , and buf BB be the corresponding buffers in these channels. Then the semantics of R sd can be defined as
where U = 2
Buf ∪{buf AA ,buf 
Note that for weak sequential composition we consider only the case for I 1 ∩ I 2 ̸ = ∅, the case for I 1 ∩ I 2 = ∅ can be obtained by the proof for parallel composition.
strict: strict(sd 1 , sd 2 ) The semantics of R strict(sd 1 ,sd 2 ) is given by the coalgebra (U, ⟨ ρ,  β⟩, u 0 ), where
Because of the structure of R strict(sd 1 ,sd 2 ) , when the circuit is in state u 0 , since the sink end of A SD A SD is connected to the source end of A sd 1 A sd 1 by a synchronous channel, and  ρ(u 0 ) = ∅, the circuit can take an internal transition by transmitting the token in the buffer buf A SD A SD to the buffer buf A sd 1 A sd 1 , which makes the state change to {buf
Similarly, when the circuit is in state {buf 
From the above discussion, we know that there is an internal transition from The semantics of R par(sd 1 ,sd 2 ) is given by the coalgebra (U, ⟨ ρ,  β⟩, u 0 ), where
There is an internal transition from u 0 to u 
), the result can be similarly obtained. Analogously, we can obtain that for any (c, u) ∈ ≈, (par(α 1 , α 2 )(c, e),  β(u, e)) ∈ ≈ and  ρ(u) = par(ϵ 1 , ϵ 2 )(c). So ≈ is a bisimulation relation, and
Option: opt(sd 1 ) The semantics of R opt(sd 1 ) is given by the coalgebra (U, ⟨ ρ,  β⟩, u 0 ), where 
According to the semantics, we have
and for e ∈ Σ 1 , if ρ 1 (β 1 ({buf
Similarly, we can obtain that for any (c, u) ∈ ≈, 
When the circuit is in state u 0 , and the token in A SD A SD satisfies g i , the token can be transmitted through the filter A SD A sd i by an internal transition, which changes the state from u 0 to {buf 
If neither of the guard conditions is satisfied by the token in buf A SD A SD , then skip happens, which moves the token in buf A SD A SD to buf B SD B SD , and the state changes from u 0 to u F . In this case, we have 
According to the definition, we have
and if e ∈ Σ 1 0 ,
, e) and β 1 ({buf
For the case c 0 g 2 ∧ ¬g 1 and c 0 g 1 ∧ g 2 , the results can be similarly derived. We can also obtain that for any (c, u) ∈ ≈, (alt(α 1 , α 2 )(c, e),  β(u, e)) ∈ ≈ and  ρ(u) = alt(ϵ 1 , ϵ 2 )(c . We consider the Reo circuit R seq(sd 1 ,sd 2 ) as given in Fig. 20 . Its semantics is given by the coalgebra (U, ⟨ ρ,  β⟩, u 0 ) as follows, where
and if e ∈ ρ
), the result can be similarly derived. Analogously, we can obtain that for any (c, u) ∈≈, (seq(α 1 , α 2 )(c, e),  β(u, e)) ∈ ≈ and  ρ(u) = seq(ϵ 1 , ϵ 2 )(c). So ≈ is a bisimulation relation, and
Loop: loop(g : sd 1 ) The semantics of R loop(g:sd 1 ) is given by the coalgebra (U, ⟨ ρ,  β⟩, u 0 ), where
When the circuit is in state u 0 , and the token in A SD A SD satisfies g, there is an internal transition from state u 0 to {buf 
, e) (here we assume that loop(ϵ 1 )(α 1 (c 0 , e)) ̸ = ∅) and β 1 ({buf
Similarly, we can derive that for any (c, u) ∈ ≈, 
Related work
One closely related work is the synthesis of adapters in component based systems. The authors of [39] propose an approach to modify the interaction mechanisms that are used to glue components together by integrating the interaction protocol into components. However, this approach acts only at the signature level. The work reported in [8, 34] goes beyond the signature level and supports protocol transformations in the synthesis process, but the initial coordinator being synthesized behaves only as the ''no-op'' coordinator, which requires the assembly of new components to enhance its protocol for communication.
Brogi et al. [12] set a formal foundation for the adaptation of heterogeneous components that may present mismatching interaction behavior. Session types are used to cope with heterogeneous descriptions of component interfaces. An adaptor can be automatically generated from an adaptor specification, which establishes a correspondence between messages in different components. However, the adaptor specification in their approach requires a good deal of implementation details such as correspondences among methods (and their parameters) of different components.
In [10] , an approach to scheduler synthesis for discrete event physical systems using supervisory control is proposed, where supervisors are defined as processes and the allowable executions of a system are specified as a set of traces. The supervisory controller interacts with the running system and makes it conform to the specification, which is given as a collection of languages that can be intersected to yield a global specification. A supervisor is synthesized to restrict the system's behavior by synchronizing the events in the system. Our approach goes beyond behavioral restriction, and our synthesized circuits can interact with system components through different communicating mechanisms encoded in the channels.
A number of approaches for the synthesis of state-based models from scenario descriptions have been developed. For example, the authors of [23] present a state-chart synthesis algorithm, but their approach does not support High-Level Message Sequence Charts (HMSC), which provide a composition mechanism very close to UML2.0 SDs. The authors of [36, 37] propose an approach to synthesize LTS models from MSC specifications, where the mechanism for communication among components is synchronous. The authors of [25] use MSCs for service specifications and propose an algorithm for synthesizing component automata from specifications. In [15, 16] , the problem of synthesizing state machines from LSC models was tackled by defining the notion of consistency of an LSC model. A global system automaton can be constructed and then decomposed. However, this approach suffers from the state explosion problem due to the construction of the global system automaton, which is often huge in size because of the underlying weak partial ordering semantics of LSC. The authors of [32] combine the LSC notation with Z, and propose a synthesis approach for generating distributed finite state designs from the combined specifications.
The authors of [27] propose an interactive algorithm that can be used to generate flat state-charts from UML sequence diagrams. In [22] , the authors also provide an interactive algorithm to generate state-charts from multiple scenarios expressed as UML collaborations. In [35] , the existing LTS synthesis algorithms are extended to produce Modal Transition Systems from the combination of properties (expressed in temporal logic) and scenarios. An algebraic approach was adopted in [40] to synthesize state-charts of components from sequence diagrams, but it takes only the operators alt, seq and loop into account, and does not consider any of the other UML2.0 operators on SDs.
Regardless of the scenario notations used (MSC, LSC or UML), all scenario-based synthesis approaches focus only on generating the state-based models for separate components, or a global state machine for the whole system. These approaches differ from ours as (1) we are concerned with the coordination aspects in distributed applications instead of the behavior models for separate individual components, and (2) our synthesized connectors also provide the actual protocols used for communication among components/services in the system, and our components do not need to contain any protocol information. Therefore, changes in the communication protocol caused by system evolution require us to change only the connector implementation, without changing any of the components that are not directly involved in the evolution. Furthermore, the framework of synthesizing Reo circuits from scenario specifications provides a certain flexibility in the synthesis process. When we modify the scenario specification (for example, by adding, removing or changing a sequence diagram), parts of the previously synthesized Reo circuits can be reused. Since every scenario described by UML SDs captures only a possible system behavior and it is possible to add more scenarios during the development, the specification can be taken as complete at some point, after which we can block whatever is not given in the specification.
Conclusion
In this paper we have presented a novel approach for constructing Reo circuits from scenario specifications represented as UML sequence diagrams. This work extends our previous work described in [4] that presents an algorithm for automatically synthesizing Reo circuits from constraint automata specifications, and the results in [7] that show the synthesis of constraint automata from UML sequence diagrams. The method described in this paper allows us to derive a Reo circuit as the implementation of the coordinator for a concurrent system directly from UML scenario specifications, which can have greater structural fidelity compared to the connectors generated by using CA as a bridge, and thus higher reusability.
In [4], we have shown how to synthesize Reo circuits from constraint automata specifications. On the other hand, an algebraic approach for generating constraint automata from scenario specifications has been proposed in [7]. However, like most program-generated code, the synthesis of a Reo circuit from a constraint automaton, as reported in [4], generally yields verbose circuits that do not ''look natural'' to the human eye. Therefore, generating a Reo circuit from a constraint automaton synthesized from UML2 SDs yields Reo coordinator circuits that may not easily correlate back to their original SD specifications. The merit of synthesizing Reo circuits directly from SDs lies in the greater structural fidelity between the resulting Reo circuits in this approach and their original SD specifications. The new contribution in this paper is that we go one step further and generate Reo circuits directly from scenario specifications represented by UML sequence diagrams. This bridges the gap between requirements and implementation of coordination in the development of large-scale, distributed systems. It follows from the construction that the size of the generated Reo circuit, i.e., the number of channels in the circuit, is linear in the size of the corresponding sequence diagrams, measured by the number of lifelines and messages.
Among our next steps is the automation of the synthesis approach described in this paper. We already have a set of integrated, visual tools to support coordination of components/services, including graphical editors, animation and simulation tools, code generators, and model checkers [1, 11, 21, 38] . We expect our tool to be useful in model-based development of service oriented applications. Our aim is to aid designers who are interested in complex coordination scenarios by enabling them to use UML SDs as the basis for generating implementations automatically using our synthesis approach. Once the Reo circuit is generated from a scenario specification, we can also apply the existing tools, for example, the Reo model checker [21] , to check for containment and equivalence of connectors. It would also be interesting to consider extensions of UML, for example, the UML Profile for Schedulability, Performance and Time (UML-SPT) [29] , which can be used to provide appropriate representation of QoS aspects in UML, and their connection with quantitative Reo circuits [5] . Another future direction is to establish a formal consistency result for our translation from UML SDs to Reo circuits and the synthesis algorithm of constraint automata from UML SDs suggested in [7] . The proof obligation will be to show that the constraint automaton of the Reo circuit constructed from a given UML SD is (bisimulation) equivalent to the constraint automaton obtained by the techniques of [7] for the same UML SD. As both approaches are based on structural induction, we may use inductive arguments to establish this consistency result. The investigation on the link between timing constraints in UML and the verification methods for timed constraint automata [3] is in the scope of our future work as well.
