One of the main problems in component assembly is how to establish properties on the assembly code by only assuming a limited knowledge of the single component properties. Our answer to this problem is an architectural approach in which the software architecture imposed on the assembly prevents black-box integration anomalies. The basic idea is to build applications by assuming a "coordinator-based" architectural style. We, then, operate on the coordinating part of the system architecture to obtain an equivalent version of the system which is failure-free. A failure-free system is a deadlock-free one and it does not violate any specified coordination policy. A coordination policy models those interactions of components that are actually needed for the overall purpose of the system. We illustrate our approach by means of an explanatory example and validate it on an industrial case study that concerns the development of systems for safeguarding, fruiting, and supporting the Cultural Heritage.
Introduction
One of the main problems in component assembly is related to the ability to establish properties on the assembly code by only assuming a limited knowledge of the single component properties. Our answer to this problem is a software-architecture-based approach in which the software architecture imposed on the assembly prevents black-box integration anomalies. Notably, in the context of component-based concurrent systems, Commercial-Off-The-Shelf (COTS) component integration may cause deadlocks or other software anomalies within the system [4, 19, 20, 28]. Building a system from a set of COTS components introduces a set of problems. Many of them arise because of the nature of COTS components. They are truly black-box and developers have no method of looking inside the box. This limit is coupled with an insufficient behavioural specification of the components, which does not permit the understanding of the interaction of components. Component assembly can result in architectural mismatches when trying to integrate components with incompatible interaction behaviour [10]. Thus, if we want to ensure that a component-based system obeys specified behavioural properties, we must take into account the component interaction behaviour. In this context, the notion of software architecture assumes a key role since it represents the reference skeleton used to compose components and let them interact. In the software architecture domain, the interaction among the components is represented by the notion of a software connector.
In this paper, we illustrate our approach to the assembly problem and validate it on an industrial case study that concerns the development of systems for safeguarding, fruiting, and supporting the Cultural Heritage. Our aim is to analyze and prevent dynamic behavioural problems that can arise from component composition. We propose an architectural "coordinator-based" approach. The idea is to build applications by assuming a formal architectural model of the system representing the components to be integrated and the connectors over which the components will communicate [27]. We will consider the special case of a generic layered architecture in which components can request services of components above them, and notify components below them. We compose a system in such a way that it is possible to check whether the system exhibits integration failures. We derive in an automatic way, from the COTS (black-box) components, the code that implements a new component to be inserted in the composed system. This new component implements a software coordinator. The coordinator mediates the interaction among components in order to prevent possible integration failures.
For the aims of this work, we assume that some specification of the externally "observable" behaviour of each component (forming the system to be assembled) is available in the form of a Labelled Transition System (LTS) [36] . The rationale behind this choice is that LTSs constitute a fundamental model of concurrent computation which is widely used in light of its flexibility and applicability. LTSs are often used as semantic model for many formal languages that are used to model concurrent systems. Example of these languages are CCS [23] , and CSP [34] . Actually, very often, these calculi are formalized operationally by using an LTS-based semantics [35] . Furthermore, note also that an LTS can be seen not only as a semantic model but also as a notation for behavioural specification purposes. Thus, in the remainder of the paper, when we consider a set of COTS components, we will associate to each COTS component an LTS representing the externally observable behaviour of the component. That is the behaviour of the component in terms of the messages it exchanges with its environment.
In practice, as described in [30, 39], we assume that we deal with so-called behavioural interfaces. In our context, a behavioural interface is an augmented IDL file. According to "design by contract" approaches [28], we can assume that the IDL file of a component is augmented, by the component developer, through a commented header. Such a header encodes (e.g., by using XML) an LTS that models the observable behaviour performed by the component when it interacts with its expected environment (i.e., this LTS models the component interaction protocol). For a client, such an XML file is directly provided by the client developer. Note also that, in our context, a component always respects its interaction protocol specification since it is provided by the developer of the same component, who is aware of the information needed to specify the component protocol.
In other works [17, 18] we have shown how it is possible to automatically derive these LTS-based behavioural descriptions by assuming a partial specification of the system to be assembled. This partial specification is given in the form of a basic Message Sequence Charts (bMSCs) and High Level MSCs (HMSCs) specification [1, 32]. The automatic derivation of the behavioural specification for each component can be performed by applying our implementation of the algorithm described in [32]. HMSC and bMSC specifications are useful as an input language, since they are commonly used in software development practice. Thus, LTSs can be regarded as an internal specification language.
Moreover, we assume that a specification of the desired behaviours of the composed system is available in the form of LTSs. Under these two assumptions we have developed a framework that automatically derives the assembly code (i.e., the coordinator's actual code) for a set of components. This code is derived in order to obtain a failure-free system, i.e., a deadlock-free system that does not violate any specified desired behaviour. A desired behaviour specification models those interactions of components that are actually needed for the overall purpose of the system.
We have implemented the approach in our SYNTHESIS tool [30, 31, 39]. We have validated and applied SYNTHESIS for assembling Microsoft COM/DCOM components [30] and EJB components [39]. The code synthesized by SYNTHESIS refers either to Microsoft Visual Studio with Active Template Library (for COM/DCOM components) or to Eclipse with AspectJ (for EJB components) as reference development platforms. In this paper, the use of SYNTHESIS is limited to the explanatory example and the case study used to illustrate the approach. We refer to [30, 39] for a detailed presentation of SYNTHESIS at work.
The remainder of the paper is organized as follows. Sections 2 and 3 describe the problem we want to address and introduce a set of background notions, respectively. In particular, Section 3 is related to the formalism that we use for modelling our application context. This formalism is LTSs. Section 4 describes and formalizes our reference architectural style, which belongs to the layered styles. Section 5 formalizes our method and describes it at work by means of a simple explanatory example. Section 6 validates our method by means of an industrial case study that concerns the development of systems for safeguarding, fruiting, and supporting the Cultural Heritage. In Section 7 we describe how it is possible to apply our approach to multi-layered systems. Section 8 proves correctness and completeness properties of the entire method described in Section 5. It is not critical for the understanding of the approach. Thus, readers who are not interested in the correctness and completeness properties of the described approach can skip it. Section 9 presents related works. Section 10 summarizes the work and discusses applications and future extensions.
In the Appendix, at the end of the paper, we describe the algorithms that are performed by SYNTHESIS to automatically build a behavioural model of the deadlock-free assembly code (they are also described in [17, 18]). These algorithms are not crucial for the understanding of the approach. However, we kept them in the Appendix for those readers interested in the implementation details of the approach.
Problem description
The problem we want to treat can be phrased as follows: Given a set of interacting components C and a set of behavioural properties P, automatically derive a deadlock-free assembly A of these components which guarantees every property in P, if possible.
The basic ingredients of this problem are: (i) the type of components that we refer to, (ii) the type of properties that we want to guarantee and (iii) the type of systems that we want to build. We consider COTS components that are truly black-box components. For now, each property in P is a functional property expressing precise ways to coordinate the interaction behaviour of the components forming the system to be assembled. Thus, hereafter, we call them coordination policies. The assembly A depends on the constraints induced by the architectural model the system is based on. This architectural model, which defines the rules used to build the composed system, is called the CBA (i.e., Coordinator-Based Architecture) style.
Besides assuming that the system architecture must reflect the rules of a well-defined architectural style (namely the CBA style), we also assume that a behavioural specification of each component is provided in the form of an LTS. Thus when we say: Given a set of interacting components C in the problem definition we mean that we consider a set of component behavioural specifications C (i.e., of LTSs). Informally our approach is the following. The method starts with a set of components, and builds a deadlock-free coordinator following the reference style constraints. Then coordination policy analysis is performed. If the synthesized deadlock-free coordinator contains policy violating behaviours, a prevention strategy is applied. Depending on the behaviours specified by the coordination policy, the analysis of only the coordinator is enough to automatically obtain a version of the system which is deadlock-free and guarantees those behaviours.
Broadly speaking, in order to prevent deadlocks, our method automatically synthesizes a coordinator that preempts all the component interactions so that the "execution traces" that always lead to deadlocks are never performed, hence restricting the set of all possible behaviours of the composed system. In doing so, there is no guarantee that the component interactions actually needed for the overall purpose of the system are preserved. The coordination policy analysis addresses this issue. That is, each coordination policy is an LTS specification of the component interactions that are required for the purposes of the system. The coordination policy analysis step of our method further restricts the set of behaviours exhibited by the deadlock-free coordinator in order to avoid the component interactions that do not guarantee the specified policies (and that, hence, are not required for the overall purpose of the system). It might be the case that, given the set of components provided as input to our method, it is not possible to assemble a deadlock-free system that, at the same time, also guarantees the specified policies. In this case, since we are dealing with black-box components, nothing can be done and our method reports an unsuccessful outcome to the user. Otherwise, our method synthesizes a deadlock-free coordinator that exhibits only the component interactions specified through the coordination policies, which are the ones required for the purposes of the system. It is worth noting that the correct composed system (with respect to deadlock freedom and the specified coordination policies) need not let the components exhibit all their possible interactions (as specified by their behavioural interfaces), but only the ones that are needed for the system's purposes.
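The restriction idea described above can be illustrated with a minimal sketch. It is not the SYNTHESIS algorithm (which is given in the Appendix); it only assumes that the coordinator model is a finite graph, stored here in a hypothetical dictionary format, and it repeatedly removes sink states so that every state from which a deadlock is unavoidable disappears. The function name `prune_deadlocks` and the data layout are assumptions made for illustration.

```python
def prune_deadlocks(transitions, initial):
    """Remove every state from which reaching a sink (deadlock) is unavoidable.

    transitions: dict mapping a state to a list of (label, next_state) pairs.
    Returns the pruned dict, or None when the initial state itself is pruned
    (i.e., no deadlock-free restriction exists in this simplified model).
    """
    # Work on a copy so that the caller's model is left untouched.
    graph = {s: list(out) for s, out in transitions.items()}
    # Make sure that target-only states also appear as keys.
    for out in transitions.values():
        for _, target in out:
            graph.setdefault(target, [])

    changed = True
    while changed:
        # A sink is a state without outgoing transitions.
        sinks = {s for s, out in graph.items() if not out}
        changed = bool(sinks)
        for s in sinks:
            del graph[s]
        # Drop the transitions that entered a removed state; this may
        # create new sinks, which the next iteration removes.
        for s in graph:
            graph[s] = [(a, t) for a, t in graph[s] if t not in sinks]

    return graph if initial in graph else None


# Hypothetical coordinator model: the branch through state 2 ends in a sink.
model = {0: [("a", 1), ("b", 2)], 1: [("c", 0)], 2: [("d", 3)], 3: []}
pruned = prune_deadlocks(model, 0)
```

In this toy model only the loop between states 0 and 1 survives. When the initial state itself is pruned, the sketch returns `None`, mirroring the unsuccessful outcome mentioned in the text.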
Deadlock and beyond
In our context, the deadlock is the base failure because it is directly identifiable in the behavioural model of the synthesized coordinator. That is, we distinguish the deadlock handling process and the handling of failures different from deadlocks. However, we might require the user to provide our tool with a specification of the deadlock-free desired behaviours (of the system to be built) in order to deal with both deadlocks and other different failures by means of the same technique. In spite of this, we maintain a special handling of deadlock freedom because we do not want to force the user to provide such a specification. This is a reasonable choice since for large systems deadlocks are very often unpredictable and hence their detection and prevention is required to be as automatic as possible without involving the user in the detection and prevention process.
We give the following definition to describe the deadlock problem in a component-based context.
Definition 1 (Deadlock).
A set of components is deadlocked if each component in the set is waiting for an event that only a different component in the set can cause.
Informally we can say that in component-based architectures, there are two types of deadlock problems:
• observable deadlocks;
• hidden deadlocks.
For both kinds of deadlock, the behaviour of a component is wrong with respect to the behaviour of its "environment" although the component behaviour is "correct" in a stand-alone context. Moreover, the deadlock occurs during the interaction among a component and its environment. The difference between these two kinds of deadlock is that for the first one the failure is an event that is observable by the component environment. For the second class the failure is an externally non-observable event since it might depend on internal characteristics of the component. Thus while observable deadlocks can be treated in the component setting by operating on the architectural context, namely on the coordinator, the hidden deadlocks cannot be automatically addressed. The only way to solve the problem is to modify the internal behaviour of a component. This is not possible with black-box components. An example for this deadlock type is offered by the Compressing Proxy problem [8] . For these reasons, we focus only on the first class of problems, attempting to create coordinators that can prevent observable deadlocks. In the remainder of the paper we use the term deadlock to mean an observable deadlock.
In our approach we also consider the analysis of failures beyond deadlock. As mentioned in Sections 1 and 2, to do so, we specify a set of component interactions that are needed for the overall purpose of the system. As before, we call them "coordination policies" or "desired behaviours" since they represent, among all possible composed system behaviours, only the ones that we wish the composed system to satisfy. Obviously, the user can specify a desired behaviour that is always violated by the composed system. In this case, our tool will report to the user that this behaviour cannot be satisfied and that, probably, something in the specification has to be changed. We recall that each desired behaviour represents a coordination policy (for the interaction of the components in the system to be assembled) which is given in terms of an LTS. In the composed system that we want to build, in order to progress in the execution, a component has to perform actions according to the policy. In other words, a coordination policy can be seen as an abstract and high-level specification of a new component (i.e., the coordinator that we want to build automatically) that serves as the desired environment for the components given as input to our method. Furthermore, the system composed of the components and the desired environment (i.e., the coordinator) is built according to a precise software architecture, i.e., the components cannot communicate with each other directly but they are all connected to the coordinator and indirectly communicate with each other only through the coordinator.
Labelled transition systems
In this section we provide the background needed to understand the approach presented in this paper. We summarize the relevant definitions regarding the notation that is used to specify the externally observable behaviour of a component and the coordination policies that are required for the realization of the system's purposes. This notation concerns LTSs [36] . It allows us to rigorously define the semantics of the interaction behaviour of a component with its environment and of the composed system's behaviour (by means of the LTS parallel composition operator, see Definition 8). Let Act be the universal set of observable actions, and Act τ = Act ∪ {τ}, where τ denotes an internal action that is not observable to a component's environment. 
Definition 2 (LTS). An LTS
It is worth mentioning that the SYNTHESIS tool deals with both deterministic and non-deterministic LTSs, hence leaving the user the flexibility of modelling the behaviour of a component by means of either a deterministic or, when needed, a non-deterministic LTS.
We will refer to sink states of an LTS as deadlock states. A deadlock state models the fact that a safety violation has occurred in the associated component/system. We also denote as deadlock-free all LTSs that do not have deadlock states. 
where:
We model the externally observable behaviour of a component C i by means of an LTS modelling the interaction of C i with its expected environment. LTSs can be used to define finite state systems [29]. For our purposes, in the following we assume that all the systems we deal with are finite state. Note that, in our context, this is not a restriction. We are dealing with black-box components, each of them exporting through its interface a finite number of operations. In our model, each operation of a component interface can be seen as a point of interaction of the component with its expected environment (e.g., an observable action in an automaton). If we modelled all the possible externally observable component interactions with an automaton, what would matter about a particular component interaction is not whether it drives the automaton into an accepting state (since we cannot detect this due to the black-box nature of the component) but whether the automaton is able to perform the corresponding sequence of actions interactively. Thus, we should consider an automaton in which every state is an accepting state [23, 24] (i.e., an LTS). A consequence is that if an automaton accepts a particular component interaction seen as a sequence of component interface operation invocations (i.e., a trace of actions in our model), then it also accepts any initial part of that interaction/sequence. In other words, due to the finiteness of the set of component interface operations, although the possible component interactions can be infinitely many, we can always finitely represent them since the language built over the component interface operations (i.e., the model of the component interaction behaviour) is prefix-closed [24]. Prefix-closed languages are generated by prefix-grammars that describe exactly all regular languages. It is well-known that regular languages are always accepted by finite-state automata.
Thus, due to our component interaction model and to the fact that we deal with black-box components, for us it is sufficient to consider finite state systems for dealing with all the systems we are interested in.
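To make the notions of deadlock (sink) state and prefix-closed trace acceptance concrete, the following minimal sketch encodes an LTS as a set of transition triples. The class name `LTS`, its methods, and the action labels are hypothetical and chosen only for illustration; the sketch supports non-deterministic LTSs by tracking a set of current states.

```python
from dataclasses import dataclass


@dataclass
class LTS:
    initial: str
    transitions: set  # triples (source_state, action_label, target_state)

    def states(self):
        """All states mentioned by the initial state or by a transition."""
        result = {self.initial}
        for source, _, target in self.transitions:
            result.update((source, target))
        return result

    def deadlock_states(self):
        """Sink states, i.e., states without outgoing transitions."""
        with_outgoing = {source for source, _, _ in self.transitions}
        return self.states() - with_outgoing

    def accepts(self, trace):
        """True if the LTS can perform the given sequence of actions.

        Every state counts as accepting, so only executability matters.
        """
        current = {self.initial}
        for label in trace:
            current = {t for (s, a, t) in self.transitions
                       if s in current and a == label}
            if not current:
                return False
        return True


# Hypothetical client protocol: request, then either acknowledgement or abort.
client = LTS("s0", {("s0", "!req", "s1"),
                    ("s1", "?ack", "s0"),
                    ("s1", "!abort", "s2")})
trace = ["!req", "?ack", "!req"]
# Prefix-closure: every initial part of an accepted trace is accepted too.
prefixes_ok = all(client.accepts(trace[:k]) for k in range(len(trace) + 1))
```

Because acceptance only asks whether the sequence can be executed, prefix-closure holds by construction: the loop that accepts a trace has already accepted each of its prefixes along the way.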
In order to model component-based systems, LTSs can be combined using the LTS parallel composition operator. In the literature, several variants of the operator have been defined. The one used here (see Definition 8) has an interleaving semantics. That is, if α is an observable action (i.e., α ≠ τ) of an LTS L i, then α synchronizes with the complementary action ᾱ of an LTS L j (with i ≠ j) producing an internal action τ at the level of the parallel composition. Synchronization of actions is thus determined by the alphabets of the component LTSs. An action β of an LTS L i for which no complementary action exists in an LTS L j (with i ≠ j) is executed only by L i, hence producing the same action β at the level of the parallel composition (β is a so-called "non-shared" action). Analogously, the internal action τ is executed by exactly one component LTS at a time.
Hereafter, we will interchangeably use the terms "complementary action" and "co-action". In the following we formally define the parallel composition of LTSs. For the sake of clarity, we give the formal definition by considering only two LTSs. In the Appendix we give the same definition by considering the general case of more than two LTSs.
• Reachability: ∀s ∈ S : ∃µ ∈ T * : µ is a trace leading to s.
In practice, the parallel composition operator "|" combines the behaviours of two LTSs by synchronizing their shared/common actions and interleaving their non-shared and internal actions. The part of the parallel composition that is not reachable from the initial state is ignored, as it has no semantic significance. Note that "Interleaving" has a symmetric version that is not given since its definition is trivial.
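The informal reading of the operator can be sketched in code. The block below is a hedged illustration and not the formal Definition 8: it assumes that every observable label starts with "!" or "?" and that "!x" and "?x" are complementary (a convention adopted only for this sketch), and it uses a hypothetical dictionary encoding of LTSs. Complementary actions synchronize into τ, non-shared and internal actions interleave, and only the part reachable from the initial state is built.

```python
from collections import deque

TAU = "tau"  # label used here for the internal action


def co(action):
    """Complementary action under the assumed convention: '!x' <-> '?x'."""
    return ("?" if action[0] == "!" else "!") + action[1:]


def alphabet(lts):
    """Observable actions of an LTS given as {'initial': s, 'transitions': set}."""
    return {a for (_, a, _) in lts["transitions"] if a != TAU}


def moves(lts, state):
    return [(a, t) for (s, a, t) in lts["transitions"] if s == state]


def compose(l1, l2):
    """Reachable part of the parallel composition of two LTSs."""
    alpha1, alpha2 = alphabet(l1), alphabet(l2)
    initial = (l1["initial"], l2["initial"])
    seen, queue, result = {initial}, deque([initial]), set()
    while queue:
        p, q = queue.popleft()
        steps = []
        for a, p2 in moves(l1, p):
            if a == TAU or co(a) not in alpha2:
                steps.append((a, (p2, q)))  # interleaving (left component)
            else:
                # Shared action: it can only fire together with its co-action.
                steps += [(TAU, (p2, q2)) for b, q2 in moves(l2, q) if b == co(a)]
        for b, q2 in moves(l2, q):
            if b == TAU or co(b) not in alpha1:
                steps.append((b, (p, q2)))  # interleaving (right component)
        for label, target in steps:
            result.add(((p, q), label, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return {"initial": initial, "transitions": result}


# Hypothetical client and server; '!log' has no co-action in the client.
client = {"initial": "c0",
          "transitions": {("c0", "!req", "c1"), ("c1", "?ack", "c0")}}
server = {"initial": "s0",
          "transitions": {("s0", "?req", "s1"), ("s1", "!log", "s2"),
                          ("s2", "!ack", "s0")}}
system = compose(client, server)
```

In this example the request and the acknowledgement each become a τ step of the composition, while `!log` stays visible because the client has no complementary action for it.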
The CBA reference architectural style
In this section we define the reference architectural style that represents the starting point of our work. This style imposes constraints on the architecture of the system to be assembled that allow us to automatically derive, from a set of component specifications, a behavioural model of the environment expected by the components (i.e., in our setting, the coordinator). As we will see in Section 5.2, this model plays a key role in synthesizing the deadlock-free assembly A (of our problem description) in such a way that it guarantees the specified policies.
According to [27] , we define an architectural style as a set of constraints on a software architecture that identify a class of architectures with similar features. A software architectural style is determined by the following: • a set of component types that perform some function at runtime;
• a topological layout of these components indicating their runtime interrelationships;
• a set of syntactic and topological constraints;
• a set of connectors that mediate communication, coordination or cooperation among components.
Under the architectural style classification in [27] the reference architectural style we use in this paper belongs to layered systems.
Our reference architectural style is derived from C2 [27] . It consists of components and connectors plus a set of constraints dictating how they may be legally composed.
We assume each component has a top and bottom interface. Top (bottom) interfaces are points of interaction and they define the set of messages (i.e., of request and/or notification type) that can be exchanged with a component or connector above (below). Connectors between components are synchronous communication channels defining a top and bottom interface too. The top (bottom) interface of a component may be connected to the bottom (top) interface of one or more connectors.
Components communicate by passing two types of messages: notifications and requests. A notification is sent downward, while a request is sent upward. As usual, components implement the system functionality, and they are the primary computational constituents of a system (typically implemented as COTS components). As already said in Section 1, the aim of our approach is to automatically derive, from the COTS components, the code that implements a new component to insert in the composed system. This new component implements a software coordinator. A software coordinator mediates the interaction among components in order to prevent possible integration failures. Hereafter, we will refer to it as coordinator component or simply coordinator, and to a component that is not a coordinator as component. Coordinator components, in our style, do not have an unconstrained input/output behaviour (as is the case for components). In fact, they simply route messages (sent or received by other components) and each input they receive is strictly followed by a corresponding output, i.e., they have a strictly sequential input/output behaviour. Coordinators are introduced as a means to act on the integration and communication behaviour of the original composed system. Within this architectural style, we will refer to a system as a Coordinator-Free Architecture (CFA) if it is defined without any coordinator. Conversely, a system in which coordinators appear is termed a Coordinator-Based Architecture (CBA) and is defined as a set of components directly connected to one or more coordinators, through connectors, in a synchronous way. In contrast to C2 where:
• both synchronous and asynchronous communication is possible;
• coordinators might perform message filtering;
• coordinators have an unconstrained input/output behaviour;
in our style:
• only synchronous messages can be exchanged;
• a coordinator is only a routing device without any filtering policy;
• a coordinator performs a strictly sequential input/output behaviour.
We have introduced the first constraint because the deadlock is a typical problem that occurs in synchronous systems. This is not a limitation because it is well-known that with the introduction of a buffer component we can always simulate an asynchronous system by a synchronous one [23] . We have introduced the second constraint because in order to apply our methodology without human intervention we have to make assumptions on the behaviour of the coordinator. The last constraint introduces a precise (i.e., a strictly sequential one) input/output structure in the coordinator. The aim of this constraint is to make the coordinator behave as a reactive component.
Since we are assuming a synchronous communication, input and output actions, that are common to the components, are considered to be blocking actions. The other actions can be performed autonomously. Thus, component behaviour will be described as LTSs and system configuration will be specified using the LTS parallel composition operator (see Definition 8) .
To define the behaviour of a composition of components, we simply place in parallel the LTS descriptions of those components. In this way, the components (in the parallel composition) synchronize only on "complementary" common actions. In other words, if a component C i is going to perform an action α, its only way to progress is to synchronize with some component C j (i ≠ j), which is going to perform the action ᾱ, unless either α is an action only of C i or it is an internal action. This gives a CFA for a set of components.
Given a CFA for a set of components, we can also produce a corresponding CBA for these components by automatically deriving and interposing a "no-op" coordinator between communicating components. The coordinator at this point (i.e., the no-op one) simply passes events between communicating components (as we will see later the no-op coordinator will play a key role in restricting the system interaction behaviour to a subset of deadlock-free and specified desired behaviours).
Definition 9 (CFA).
A Coordinator-Free Architecture (CFA) is a set of components directly connected, through connectors, in a synchronous way.
Definition 10 (CBA). A Coordinator-Based Architecture (CBA) is a set of components directly connected to one or more coordinators, through connectors, in a synchronous way.
Given n components (i.e., n component LTSs) AC 1 , . . . , AC n , we will formalize a CFA system as follows:
The corresponding CBA system that our method will generate can be modelled as:
where K is the LTS modelling the behaviour of the coordinator that is automatically generated by our method. We recall that for all i = 1, . . . , n, f i are relabelling functions and AC i [f i ] are the corresponding relabelled LTSs (see Definition 6) . Relabelling functions model the wrapping and deployment mechanisms that are used, in the practice, to interpose the coordinator component among the other components in the system [30] .
Although the CBA style is a generic layered style, for the sake of presentation we will only deal with single-layer systems (e.g., standard client-server architectures in COM/DCOM). We provide the foundations to deal with multi-layered systems in Section 7.
Method description
In this section we formalize and describe our method by using an explanatory example. As illustrated in Fig. 2 we proceed in three steps. The first step starts with a CFA system and automatically produces, obeying our CBA style, a new configuration with the same components plus a no-op coordinator. Although this new configuration contains a coordinator, under a suitable notion of equivalence, it behaves equivalently to the CFA configuration. That is, the no-op coordinator is automatically derived to model all possible interactions of components. Although it plays a key role for the execution of the second and third steps, at this point, it is a simple delegator of the requests (notifications) performed by the components below (above) it towards the components above (below) it.
The second step performs deadlock analysis on the CBA system to detect deadlocks. Subsequently, we can operate on the no-op coordinator in order to obtain a deadlock-free equivalent system, which prevents the detected deadlocks.
The third step concerns the problem of guaranteeing the coordination policies against the model of the deadlock-free coordinator. This step produces the policy-satisfying coordinator representing the failure-free composition code for the components forming the composed system. That is, the synthesized coordinator represents the deadlock-free composition code that guarantees all those component interactions that are needed for the overall purpose of the system. Note that although in principle we could carry on the three steps together we decided to keep them separate. This has been done to support internal data structures' traceability.
In the following sections we formalize and describe our approach.
Explanatory example
The explanatory example that we use in this paper to better describe and illustrate the formalization of our methodology is concerned with the automatic assembly of a client-server component-based system. This system is formed by three components: two clients, respectively denoted as C1 and C2, and one server denoted as C3. This example, although very simple, exhibits coordination problems that exemplify the kind of problems our methodology can solve. These are typically due to the presence of race conditions in accessing shared resources.
By continuing the description of the example and by referring to the method depicted in Fig. 2 , we want to assemble a system formed by C1, C2, and C3. In doing so, we want to automatically prevent possible deadlocks and guarantee the specified coordination policies, hence, guaranteeing that the system's purposes are satisfied. state. By abusing notation, hereafter, we will interchangeably use the terms "state" and "node". Each action or co-action performed by interacting with the environment of the component (i.e., all other components in parallel) is represented as a label of a transition into a new state. By referring to our reference architectural style (see Section 4), actions and co-actions can be requests or notifications. Requests and notifications can be modelled as input/output actions of a component. Within an LTS of a component, the label of an input action is prefixed by the question mark "?" (e.g., ?C3.retValue1 of C1). The label of an output action is prefixed by the exclamation mark "!" (e.g., !C3.method2 of C2). In the case of our example, input (resp., output) actions of the clients are notifications (resp., requests). For the server, input (resp., output) actions are requests (resp., notifications).
The interface of server C3 exports three methods denoted as C3.method1, C3.method2, and C3.method3, respectively. While C3.method2 has no return value, C3.method1 and C3.method3 can return some value. C3.method1 returns two possible return values denoted as C3.retValue1 and C3.retValue2. The former is returned when a call of C3.method1 has not been preceded by a call of C3.method2. Otherwise, the latter is returned. C3.method3 returns only one value, i.e., C3.retValue2. The two clients perform method calls according to the server interface.
Indeed, since we want to deal with either reusable black-box components or COTS components, there might not be a direct syntactical correspondence between the action labels used by the different component LTSs. In general, this kind of mismatch cannot be solved automatically and it requires developing (by hand) component wrappers that solve the syntactical mismatch, as done in the work described in [3, 30]. Since in this work we are focusing on automatically preventing interaction protocol mismatches, we consider this problem out of the scope of this paper and, hereafter, we will assume that the component interfaces syntactically match, since either they already match or suitable component wrappers have been previously developed by the system assembler (i.e., a possible user of the SYNTHESIS tool).
In the literature, semi-automatic approaches for solving mismatches at the level of the interface's syntax can be found [5, 33, 37]. They use LTSs and a means to define syntactical correspondences [5, 33], e.g., a set of synchronous vectors [37, 38] that is assumed to be given as input to the approach. The knowledge that is required to give synchronous vectors or, in general, syntactical mappings as input is the same as the one required to develop a suitable component wrapper. Thus, the previously considered assumption is not a limitation of our work with respect to the work described in [5, 33, 37].
In the following sections we formalize our method illustrating it by means of the explanatory working example introduced above.
Method formalization
For defining a suitable LTS encoding within the SYNTHESIS tool we define the Actual Behaviour Graph (AC-Graph) which is the first basic structure we use in the framework. The term actual emphasizes the difference between the component behaviour and the intended, or assumed, behaviour of the environment.
where T is the set of arc labels such that:
An AC-Graph is an LTS with a specific textual syntax for its actions/co-actions, i.e., the way SYNTHESIS implements LTSs. The point of using them is just to clearly introduce the syntax SYNTHESIS uses for its implementation of LTSs, and hence the syntax we will use hereafter in the paper. Thus, all definitions regarding LTSs that we have given in Section 3 (i.e., Definitions 2-8) directly regard also AC-Graphs. It is worthwhile noticing that !α denotes a co-action α and ?α denotes an action α. Thus, hereafter, we extend the action complement operator of LTSs [36] to actions whose labels are prefixed by '?' or '!'. This is done by considering $\overline{!\alpha} = ?\alpha$ and $\overline{?\alpha} = !\alpha$. By abusing notation, we also extend such an operator to sets of transition labels, traces, and sets of traces as follows: let T ⊆ Act τ be a set of transition labels, then
Note that the three component LTSs drawn using SYNTHESIS and shown in Fig. 3 are the AC-Graphs of the components of our working example. For the purposes of the work described in this paper, we assume (without loss of generality) that a component AC-Graph is always deadlock-free. Now, we wish to automatically derive from the components' behaviour (modelled by means of an AC-Graph, one for each component) the requirements on their environment that guarantee a deadlock-free interaction. A system is in deadlock when it cannot perform any computation. In our setting, deadlock means that all components are blocked waiting for an action from the environment that is not possible. Thus, a first requirement is that if a component has reached a state in which it performs an input (resp., output) action ?α (resp., !α), its environment has to have reached a state in which it performs !α (resp., ?α). Furthermore, in the architectural style that we have chosen (i.e., CBA), the environment of the component can be represented only by one or more coordinators. We recall that the coordinator performs strictly sequential input/output operations only. Thus if it receives an input ?α from a component (that performs !α) it will then immediately output the received input message (i.e., it will perform !α) towards a destination component (that, in turn, performs ?α). Analogously, if the coordinator outputs a message, this means that, immediately before, it received that message as input from a source component. Intuitively, in a CBA, if there are two components AC i and AC j respectively performing !α and ?α from their respective current states, the no-op coordinator K (at its current state) performs ?α_i followed by !α_j,
in the CBA model (see Definition 10) would perform !α_i and ?α_j, respectively.
In other words, if a component AC i outputs a message α (i.e., it performs !α_i) the coordinator gets in input that message hence performing ?α_i (the symmetric case is analogous). The coordinator is the component's environment and hence it has a behaviour that is symmetric (i.e., complementary) with respect to the component behaviour. That is, if a component performs an output then the coordinator performs an input and vice versa.
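As an illustration of the complementarity just described, the following Python sketch swaps the '?'/'!' prefix of a label and lists the two labels that a no-op coordinator performs when it forwards a message. It is not part of SYNTHESIS: the helper names complement and coordinator_steps are hypothetical, and we assume that coordinator labels carry the index of the involved component as a suffix, as in the traces of our explanatory example.

```python
def complement(label: str) -> str:
    """Swap the '?' (input) and '!' (output) prefix of an action label."""
    if label.startswith("?"):
        return "!" + label[1:]
    if label.startswith("!"):
        return "?" + label[1:]
    raise ValueError(f"label {label!r} has no '?' or '!' prefix")


def coordinator_steps(action: str, sender: int, receiver: int) -> list[str]:
    """Labels performed by the no-op coordinator to forward `action`
    from component `sender` to component `receiver` (hypothetical helper)."""
    return [f"?{action}_{sender}", f"!{action}_{receiver}"]


# C2 (component 2) calls C3.method2 on the server C3 (component 3).
steps = coordinator_steps("C3.method2", sender=2, receiver=3)
print(steps)  # ['?C3.method2_2', '!C3.method2_3']

# Each component performs the complement of the corresponding coordinator step.
print([complement(s) for s in steps])  # ['!C3.method2_2', '?C3.method2_3']
```

The first printed pair is the prefix of the deadlock trace discussed later for the working example; the second pair is what the sender and the receiver perform, respectively.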
Furthermore, it is worth mentioning that the coordinator does not preempt non-observable actions of a component (i.e., τ actions) since they are component internal actions. Analogously, it does not preempt non-shared actions (from the coordinator's point of view, they can be considered as a kind of internal action since they do not concern the interaction with the environment actualized at assembly-time).
The following definition formalizes the characterizing properties of a no-op coordinator AC-Graph. 
The property "Strictly I/O behaviour" states that the no-op coordinator is synthesized in such a way that each input it receives is strictly followed by a corresponding output. The property "Reachability" simply states that each state of the synthesized no-op coordinator is reachable from the initial state. The other properties do not need further comments.
In our previous work [17, 18], we have described the algorithm that is used to automatically create the no-op coordinator AC-Graph from the component AC-Graphs. This algorithm makes use of graph structures beyond AC-Graphs (i.e., the so-called EX-Graphs) and of a unification technique over first-order terms [21]. The no-op coordinator synthesis is actually based on the unification of the component EX-Graphs [17, 18]. An EX-Graph represents the behaviour that a single component expects from the no-op coordinator (i.e., it is a partial view of the no-op coordinator to be created, and it is partial since it reflects the expectation of a single component). Each component has only its partial view of the no-op coordinator behaviour; by unifying the views of all components we can synthesize the global behaviour of the no-op coordinator. In the Appendix, at the end of this paper, we report the algorithms performed by SYNTHESIS to automatically generate the model of the no-op and the deadlock-free coordinator. We refer to [17, 18] for further details on the creation algorithm for both the no-op coordinator and the deadlock-free one. For the purposes of this paper, it is sufficient to say that once the no-op coordinator AC-Graph has been automatically created its characterizing properties are the ones listed in Definition 12. It is worth noticing that an empty coordinator AC-Graph can be obtained. In this case, a coordinator for the set of components given as input to our method does not exist. That is, the components' interaction always deadlocks and, hence, dealing with black-box components, there is no way to prevent the problem.
In Fig. 4, we show a screen-shot of our SYNTHESIS tool illustrating the no-op coordinator AC-Graph automatically generated from the component AC-Graphs of our working example (shown in Fig. 3). By referring to Definition 12, SYNTHESIS associates to each coordinator AC-Graph's state an ordered tuple of component AC-Graph states. It is worthwhile noticing that this is done only internally. In fact, externally, SYNTHESIS denotes the ith generated state as Si (see Fig. 4). Moreover, as shown in Fig. 4, SYNTHESIS denotes as filled states both the deadlock states (see Definition 3), e.g., the state S9, and the states that can only lead to a deadlock state, e.g., the state S8. Note that if we remove a deadlock state and all its incoming transitions, all the states from which it was possible to reach only that deadlock state in one step transition become deadlock states as well. By filling deadlock states and the ones that can only lead to them, deadlock traces (see Definition 5) can be easily discovered also by the user. For instance, the trace ?C3.method2_2 !C3.method2_3 ?C3.method1_1 !C3.method1_3 is a possible deadlock trace (the shortest one) of the no-op coordinator AC-Graph of our explanatory example.
Deadlocks occur because of a race condition between C1 and C2. In fact, one client (i.e., C2) performs a call of C3.method2 (see the sequence of transitions from the state S0 to S4 in Fig. 4), hence leading the server C3 into a state in which it expects a call of C3.method1. While C2 is attempting to perform the call of C3.method1, the other client (i.e., C1) performs such a call (see the sequence of transitions from S4 to S9). In this scenario C1, C2, and C3 are in the states S1, S1, and S3 of their AC-Graphs, respectively (see the bottom side of Fig. 3). Now, C3 expects to return C3.retValue2 as return value of C3.method1 but C2 is still waiting to perform a call of C3.method1 and C1 expects a different return value. Thus, a behavioural mismatch occurs and it results in a deadlock state in the no-op coordinator AC-Graph.
Deadlock detection and prevention
In Section 2 we divided the kinds of deadlock that can occur in our context in two: observable deadlocks and hidden deadlocks. We recall that we can focus only on deadlocks of the first class due to the different nature of these two kinds of deadlock.
In this section we present our technique to detect possible deadlocks and to prevent them. By referring to Definitions 3 and 5, deadlocks can be prevented by directly operating on the structure of the no-op coordinator AC-Graph. If deadlocks exist we can find them by performing an analysis of the no-op coordinator AC-Graph in order to discover its possible deadlock traces. By continuing our explanatory example, we already said that the trace ?C3.method2_2 !C3.method2_3 ?C3.method1_1 !C3.method1_3 is a deadlock trace of the LTS shown in Fig. 4 . In general, to prevent the deadlocks that have been detected as deadlock states of an LTS L, our SYNTHESIS tool performs backwards error propagation [12] 
In the Appendix, at the end of this paper, we formalize the algorithm used to prune all possible deadlock traces of the no-op coordinator AC-Graph in order to automatically build the model of the deadlock-free coordinator.
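To give an intuition of the pruning, the following Python sketch repeatedly removes the states without successors together with their incoming transitions, until no such state is left. It is only a minimal illustration of the idea, not the algorithm implemented by SYNTHESIS (for which we refer to the Appendix and to [17, 18]). The dictionary encoding of the LTS, the function name prune_deadlocks, and the graph fragment are assumptions made for the example: the fragment borrows the labels of the deadlock trace and the state names S0, S4, S8, and S9, while the states S1 and S5 and the remaining transitions are invented.

```python
from typing import Dict, List, Tuple

# state -> [(label, target state), ...]; every target must also be a key.
LTS = Dict[str, List[Tuple[str, str]]]


def prune_deadlocks(lts: LTS, initial: str) -> LTS:
    """Repeatedly remove states without successors and their incoming arcs.

    Returns an empty graph when the initial state itself gets removed,
    i.e., when no deadlock-free behaviour exists.
    """
    graph = {state: list(arcs) for state, arcs in lts.items()}
    while True:
        dead = {state for state, arcs in graph.items() if not arcs}
        if not dead:
            return graph
        if initial in dead:
            return {}
        graph = {
            state: [(label, target) for label, target in arcs if target not in dead]
            for state, arcs in graph.items()
            if state not in dead
        }


# Hypothetical fragment of a no-op coordinator (not the graph of Fig. 4).
noop = {
    "S0": [("?C3.method2_2", "S1")],
    "S1": [("!C3.method2_3", "S4")],
    "S4": [("?C3.method1_1", "S8"), ("?C3.method1_2", "S5")],
    "S5": [("!C3.method1_3", "S0")],
    "S8": [("!C3.method1_3", "S9")],
    "S9": [],  # deadlock state: no successors
}

deadlock_free = prune_deadlocks(noop, "S0")
print(sorted(deadlock_free))  # ['S0', 'S1', 'S4', 'S5']
print(deadlock_free["S4"])    # [('?C3.method1_2', 'S5')]
```

In this fragment S9 is removed first, which turns S8 into a state without successors; S8 is removed in the next round, while S4 survives because it still has another outgoing transition.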
The following definition formalizes the characterizing properties of a deadlock-free coordinator AC-Graph.
is a corresponding deadlock-free coordinator AC-Graph if the following properties hold:
"States", "labels", and "Transitions" are trivial properties. The property "Deadlock-freedom" states that each state of a deadlock-free coordinator AC-Graph has a successor and, hence, it does not lead to a state from which states without successors can be reached, i.e., a deadlock-free coordinator AC-Graph has no deadlock states. Note that given a no-op coordinator AC-Graph, the corresponding deadlock-free coordinator AC-Graph is not unique. For the purposes of the work described in this paper, our SYNTHESIS tool generates the maximal deadlock-free coordinator AC-Graph with respect to the cardinality of D, i.e., in our context, it derives the most permissive deadlock-free coordinator. By continuing our explanatory example, this means that our algorithm for ensuring deadlock-freedom [17, 18] (reported in Appendix) prunes the two transitions from S4 to S9, and their source/target states except for S4 that is kept in the set of states.
We refer to [17, 18] (or to the Appendix) for a formal description of our deadlock prevention algorithm. For the purposes of this paper it is sufficient to say that, by using SYNTHESIS, once the deadlock-free coordinator AC-Graph has been automatically built it satisfies property "Deadlock-freedom" and it is the maximal one. Notice that the deadlock prevention step might lead to an empty coordinator AC-Graph. In this case, a deadlock-free coordinator for the set of components given as input to our method does not exist and the SYNTHESIS user is informed of this.
In Fig. 5 we show the maximal deadlock-free coordinator's AC-Graph of our explanatory example automatically generated by SYNTHESIS.
By referring to Fig. 5, from the state S4 the action ?C3.method1_1 (i.e., the method1 request performed by the client C1 towards the server C3) has been "disabled" since it led to a deadlock. Consistently with the rules of our CBA style and with its formalization (see Section 4), for ?C3.method1_1, being disabled means that the LTS C1[f1] (which has a transition labelled with !C3.method1_1) is not able to synchronize with the coordinator AC-Graph on the action ?C3.method1_1 from the (global) state S4. This means that, for the set of components given as input to our method, the only (and most permissive) way to compose them in order to prevent possible deadlocks is to disable, for this particular case, the request of C3.method1 whenever the coordinator's execution is in the state S4. That is, the deadlock-free composed system will not exhibit that specific interaction. Although it is not the case for our explanatory example, in general, it might happen that some specific component interactions are disabled forever (i.e., for every state of the coordinator). This might surprise the component developer, but not the system assembler. It is worth mentioning that our work should be read from the system assembler's perspective and not from the component developer's perspective. That is, the main goal is to build component-based systems as automatically as possible and, following the Component-Based Software Engineering vision [28], by stressing a reuse-based approach according to Brooks' "buy, don't build" philosophy [40]. Thus, in this scenario, the system assembler is looking for a set of components that "together" can provide the functionality specified for the system to be built (i.e., the coordination policies). It might be the case that the components also provide more than what is required (if they provide less, there is nothing to do and the composed system cannot be built by composing those components).
The important thing for the system assembler is to reuse the acquired components and keep only the specified desired behaviours although it might be the case of "under-using" some component. Thus, after deadlock prevention, a further step is required and it concerns the step of guaranteeing the coordination policies (as described in Section 5.4).
At the level of the coordinator's actual code, the coordinator is a multi-threaded component that creates a thread for each request and for each caller performing such a request. Removing deadlock traces corresponds to putting in a waiting state the thread that handles the request leading to the deadlock state and performed by the identified caller. Thus the coordinator will return control to the caller, for that request, only when it reaches a state in which the blocked request is allowable. Such multi-threaded servers are supported by existing component technologies such as COM/DCOM or CORBA. Since this implementation is correctly reflected by our LTS modelling, hereafter, we will focus only on the effects that our approach has on the coordinator's behavioural model (i.e., its AC-Graph) and we refer to [30, 39] for implementation details concerning the coordinator's actual code.
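The waiting behaviour described above can be illustrated with a small Python sketch based on a condition variable. It is only an illustration under our own assumptions and it is unrelated to the COM/DCOM and EJB code generated as described in [30, 39]: the class BlockingCoordinator, the table of enabled requests, and the states S5 and S6 are hypothetical.

```python
import threading
from typing import Dict


class BlockingCoordinator:
    """Toy coordinator: a request waits until the current state enables it."""

    def __init__(self, enabled: Dict[str, Dict[str, str]], initial: str) -> None:
        self._enabled = enabled  # state -> {request label: next state}
        self._state = initial
        self._cond = threading.Condition()

    def request(self, label: str) -> str:
        with self._cond:
            # The handling thread waits while `label` is disabled in this state.
            self._cond.wait_for(lambda: label in self._enabled[self._state])
            self._state = self._enabled[self._state][label]
            self._cond.notify_all()  # let other waiting requests re-check
            return self._state


# Hypothetical fragment: in S4 only C2 may call method1; C1 may do so in S5.
enabled = {
    "S4": {"?C3.method1_2": "S5"},
    "S5": {"?C3.method1_1": "S6"},
    "S6": {},
}
coordinator = BlockingCoordinator(enabled, "S4")
results: Dict[str, str] = {}


def client1() -> None:
    # This request cannot proceed while the coordinator is in S4.
    results["C1"] = coordinator.request("?C3.method1_1")


worker = threading.Thread(target=client1)
worker.start()
results["C2"] = coordinator.request("?C3.method1_2")  # moves S4 -> S5
worker.join(timeout=5)
print(results["C2"], results["C1"])  # S5 S6
```

Whichever thread is scheduled first, the request of C1 is served only after the request of C2 has moved the coordinator out of S4, which is the effect of disabling ?C3.method1_1 in that state.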
Generic coordination failures detection and prevention
Our approach also considers the analysis of failures beyond deadlock. The specification of these failures is provided in terms of precise ways to coordinate the components forming the composed system, which are focused on preventing such failures or, in other words, on guaranteeing the system's purposes. Each coordination policy is given in terms of an LTS.
In this section we formalize the third step of the method shown in Fig. 2 . This step concerns the problem of guaranteeing the coordination policies against the deadlock-free coordinator's AC-Graph. Moreover, by continuing our explanatory example, we also apply our technique to the synthesized deadlock-free coordinator shown in Fig. 5 .
In particular, in Section 5.4.1, we informally introduce the step of guaranteeing a policy and formalize our LTS-based notation for the specification of coordination policies, by also using it to specify a possible policy for our explanatory example. In Section 5.4.2 we formalize our step of guaranteeing a policy by providing a formal definition for the failure-free coordinator AC-Graph, and we apply it to our explanatory example. In Section 5.4.3 we evaluate our coordination policy specification notation with respect to property specification notations commonly used in the software verification domain, e.g., Linear-Time Temporal Logic (LTL) [44, 45].
Generic coordination failures specification
The problem we want to treat can be informally rephrased as follows: Given a set of interacting components C and a set of coordination policies P that describes precise ways to coordinate the interaction behaviour of the components in C, automatically derive a deadlock-free assembly A of these components that guarantees any policy in P, if possible. Thus the coordination policies that must be guaranteed are related to the interaction behaviour of the components forming the system to be built (through the addition of the coordinator). They represent those component interactions that are needed for the overall purpose of the composed system. The component interactions that do not reflect the ones specified by the coordination policies represent behavioural failures of the system because they do not guarantee its overall purpose. Thus, the set of components given as input to our method is taken into account for coordination policy specification purposes, whereas the synthesized deadlock-free coordinator is taken into account for guaranteeing a coordination policy. In fact, analogously to deadlock, we cannot prevent behavioural failures of the CBA-system that are not identifiable with precise behaviours of the synthesized deadlock-free coordinator. A coordinator behaviour is modelled as a trace in the coordinator's AC-Graph. Thus the coordination policies we deal with are behaviours that might correspond to possible traces of the coordinator's AC-Graph. Since a coordinator AC-Graph trace corresponds to a sequence of component AC-Graph actions, one can specify a coordination policy by looking only at the alphabet of the component AC-Graphs given as input.
We recall that each coordination policy is specified in terms of an LTS. In particular, for the purposes of our method, the LTS that models a coordination policy is an AC-Graph called coordination policy AC-Graph. It will always be a deadlock-free AC-Graph. It has a syntax for the action labels expressive enough to model regular (i.e., specific), negative (i.e., all possible but one), universal (i.e., all possible) actions, logical "AND"-composition of negative actions, and logical "OR"-composition of regular actions.
To guarantee a coordination policy on the composed system interactions, our method restricts the behaviours of the deadlock-free coordinator in order to keep only those behaviours specified by the policy hence obtaining a failure-free coordinator.
Informally, first, a trace containment check [23] between the policy AC-Graph P and the deadlock-free coordinator AC-Graph K df is performed. If the set of traces of P is contained (under a suitable notion of trace containment) in the set of traces of K df , then guaranteeing P is possible since the components, among other functionalities, provide also the functionality required for the overall purpose of the system. Otherwise, as already said in Section 5.3, the components given as input to our method do not provide the desired functionality and, dealing with black-box components, there is nothing to do. In this case, the SYNTHESIS user is notified with an unsuccessful answer.
If the containment check successfully terminates, a kind of synchronous product [36, 38] between the deadlock-free coordinator AC-Graph and the coordination policy AC-Graph is performed, hence mechanically producing the failure-free coordinator AC-Graph. By exploiting it, the components given as input to our method can be composed, to make a system, in a way that it is deadlock-free and guarantees the specified composed system purposes.
Note that, guaranteeing a coordination policy is done analogously to guaranteeing deadlock freedom except for the fact that we do not require a specification for deadlock freedom. Actually, in order to prevent deadlocks, it is required that the components' interaction (embedded in the no-op coordinator) exhibits at least one deadlock-free behaviour. Then, through the deadlock-free coordinator, the non-deadlock-free behaviours are disabled and only the deadlock-free ones are exhibited. Analogously, for guaranteeing a coordination policy, it is required (after deadlock prevention) that the components' interaction (embedded in the deadlock-free coordinator) exhibits at least all the behaviours modelled by the specified policy. Then, through the failure-free coordinator, the behaviours that do not guarantee the policy are disabled and only the ones that guarantee the policy are exhibited.
A coordination policy AC-Graph is defined over a specific alphabet of actions that are semantically equivalent to component actions. Let $C_1, \ldots, C_n$ be the components given as input to our method, and let $U = \left(\bigcup_{i=1}^{n} T_i\right) \setminus \{\tau\}$ be the universal set of observable component actions. U is ranged over by $\alpha, \alpha_1, \alpha_2, \ldots$. Unlike actions in a component AC-Graph, each action in U has an associated identifier specifying which component (in the CFA-system) performs that action. For instance, for our explanatory example, !C3.method1_1 models the action !C3.method1 performed by the component C1. In the following we formally define the syntax of coordination policy actions. As is usually done, the syntax is formalized by means of a grammar that is used to build action labels for a coordination policy.
Definition 14 (Coordination Policy Actions Syntax).
The universal set CPAct U , of coordination policy actions (ranged over by l, l 1 , l 2 , . . .) over U, is the set of action labels generated by the following grammar:
where Neg is a relabelling function over U such that, let α 1 =?a 1 , α 2 =!a 2 ∈ U, Neg(α 1 ) =?-a 1 and Neg(α 2 ) =!-a 2 .
The syntax of the action labels in CPAct U is similar to the syntax of the action labels in a relabelled component AC-Graph except for two kinds of action: (i) a universal action (i.e., ?true_#) which models any possible observable component action (i.e., any action in U), and (ii) a negative action which models any possible observable component action different from the negated one; for instance, for our explanatory example, the negative action !-C3.method1_1 models all the actions in U different from !C3.method1_1. Moreover, action labels in CPAct U can also be simple formulae obtained as logical "AND" or "OR" composition of action labels. The "AND" composition is restricted to negative actions only and it is denoted by means of the notation {. . .}. The "OR" composition is restricted to regular actions only and, for it, the notation [. . .] is used.
The semantics of action labels in CPAct U is defined as follows, by means of a specific notion of semantic equivalence between coordination policy actions. Note that, since observable component actions in a relabelled AC-Graph are a particular case of coordination policy actions (i.e., they are regular actions α), our notion of semantic equivalence defines also a semantic correspondence between component AC-Graph observable actions and coordination policy actions.
Informally, our notion of semantic equivalence between two coordination policy actions l 1 and l 2 is defined with respect to the set of regular actions that can correspond to l 1 and l 2 . If l 1 and l 2 are such that the intersection of their sets of corresponding regular actions is not empty, then we say that l 1 shares the meaning of (or, simply, matches) l 2 .
Definition 15 (Coordination Policy Actions Semantics).
Regular action:
Negative action:
Universal action:
AND-composition of negative actions:
OR-composition of regular actions:
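The informal reading of the five kinds of action given above can be sketched in Python by mapping each coordination policy action to the set of regular actions it corresponds to, so that two actions match when these sets intersect. This is only a sketch under our own assumptions: the function names are hypothetical, the set U is a small invented fragment, and the readings of the "AND" composition (different from all the listed actions) and of the "OR" composition (any one of the listed actions) are what we assume from the informal description rather than a transcription of the formal definition.

```python
from typing import FrozenSet, Iterable

Action = str
ActionSet = FrozenSet[Action]


def regular(alpha: Action) -> ActionSet:
    """A regular action corresponds to itself only."""
    return frozenset({alpha})


def negative(alpha: Action, universe: ActionSet) -> ActionSet:
    """A negative action corresponds to every action in U except alpha."""
    return universe - {alpha}


def universal(universe: ActionSet) -> ActionSet:
    """The universal action corresponds to every action in U."""
    return universe


def and_of_negatives(alphas: Iterable[Action], universe: ActionSet) -> ActionSet:
    """Assumed reading: the actions in U different from all the listed ones."""
    return universe - frozenset(alphas)


def or_of_regulars(alphas: Iterable[Action]) -> ActionSet:
    """Assumed reading: any one of the listed regular actions."""
    return frozenset(alphas)


def matches(l1: ActionSet, l2: ActionSet) -> bool:
    """l1 matches l2 when their sets of regular actions intersect."""
    return not l1.isdisjoint(l2)


# Hypothetical fragment of U for the client-server example.
U = frozenset({"!C3.method1_1", "!C3.method1_2", "!C3.method2_2"})

print(matches(negative("!C3.method1_1", U), regular("!C3.method1_2")))  # True
print(matches(negative("!C3.method1_1", U), regular("!C3.method1_1")))  # False
print(matches(universal(U), regular("!C3.method2_2")))                  # True
```

With this encoding the matching relation is symmetric by construction, since it only tests whether two sets of regular actions have a common element.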
A coordination policy AC-Graph (with respect to a universal set U of relabelled component AC-Graph observable actions) is defined over a set of transition labels that is a sub-set of CPAct U .
Definition 16 (Coordination Policy AC-Graph).
be their corresponding relabelled AC-Graphs, and let $U = \left(\bigcup_{i=1}^{n} T_i\right) \setminus \{\tau\}$; a coordination policy AC-Graph over $CPAct_U$ for $C_1, \ldots, C_n$ is a deadlock-free and, possibly, non-deterministic AC-Graph $P = (S_P, T_P, D_P, s^P_0)$ where $S_P$ is the set of states, $T_P$ is the set of transition labels such that $T_P \subseteq CPAct_U$, $D_P$ is the set of transitions, and $s^P_0$ is the initial state.
Abusing terminology, for a coordination policy AC-Graph P we say that P is non-deterministic if
To increase the expressiveness of our LTS-based notation for the specification of coordination policies, SYNTHESIS can deal with both deterministic and non-deterministic coordination policy AC-Graphs.
For our purposes, we extend the matching operator " ∼ =U" to traces of actions in CPAct U : let
In Fig. 6 , we report a SYNTHESIS screen-shot that shows a possible coordination policy AC-Graph for our explanatory example (it is a deterministic policy AC-Graph).
Each state represents states of the system to be built (through the insertion of a failure-free coordinator, if possible). The state S0 is the initial state of the coordination policy AC-Graph. "AlternatingProtocol" (shown in Fig. 6 ) specifies behaviours of the system to be built guaranteeing that method1 of C3 will be invoked by C1 and C2 using an alternating invocation protocol, i.e., it will be invoked first by C1, then by C2, and so on. By continuing our explanatory example and by considering AlternatingProtocol as the coordination policy that must be guaranteed, we show (in the following section) the application of the coordination policy guarantee step to the coordinator shown in Fig. 5 .
Generic coordination failures prevention
As informally introduced in Section 5.4.1, our algorithm for guaranteeing coordination policies can be organized in two phases.
The first phase concerns a variant of the algorithm used to perform a trace containment check [23] between two LTSs. The second phase concerns a variant of the algorithm used to perform the synchronous product [36, 38] between LTSs.
In other words, to derive the failure-free coordinator AC-Graph, our method first checks whether guaranteeing a coordination policy is possible, through a trace containment check between a coordination policy AC-Graph and the deadlock-free coordinator AC-Graph. Given a coordination policy AC-Graph $P$ and the deadlock-free coordinator AC-Graph $K_{df}$, this check is used to verify whether $Tr(P) \subseteq_{\cong_U} Tr(K_{df})$ or not. This check is implemented by a suitable notion of refinement [23]. Refinement, in general, formalizes the relation between two LTSs at different levels of abstraction. Refinement is usually defined as a variant of simulation. In this paper, we use a suitable notion of strong simulation [23] to check a refinement relation between two LTSs with observable actions over $CPAct_U$ (i.e., $P$ and $K_{df}$). To do this, we use the matching operator "$\cong_U$" as action comparison operator of the simulation.
Definition 17 (Trace Containment Under $\cong_U$). Let $L_1 = (S_1, T_1, D_1, s^1_0)$ and $L_2 = (S_2, T_2, D_2, s^2_0)$
Definition 18 (Simulation Under $\cong_U$). Let $L_1 = (S_1, T_1, D_1, s^1_0)$ and $L_2 = (S_2, T_2, D_2, s^2_0)$ be two LTSs, and let $T_1, T_2 \subseteq CPAct_U$ for some universal set of observable actions $U$; a relation $\leq_{\cong_U} \subseteq S_1 \times S_2$ is a strong simulation under $\cong_U$, or simulation under $\cong_U$ for short, where $s \leq_{\cong_U} v$ if and only if $\forall s' \in S_1 : \forall l \in T_1 : s \xrightarrow{l} s' \Rightarrow \exists v' \in S_2, \exists l' \in T_2 : v \xrightarrow{l'} v' \wedge l \cong_U l' \wedge s' \leq_{\cong_U} v'$. We say that $L_2$ simulates $L_1$ under $\cong_U$, written $L_1 \leq_{\cong_U} L_2$, if and only if $s^1_0 \leq_{\cong_U} s^2_0$.
Theorem 1 (A Trivial Variant of the Analogous Theorem Described in [23]). Let $L_1$ and $L_2$ be LTSs where
If $Tr(P) \subseteq_{\cong_U} Tr(K_{df})$ (otherwise it is not possible to guarantee $P$), the second phase of our algorithm for guaranteeing coordination policies takes $K_{df}$ and $P$ and performs a kind of synchronous product between them. We denote with $K_f$ the obtained failure-free coordinator AC-Graph.
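The simulation-based check used in the first phase can be sketched in Python as a greatest-fixpoint computation: we start from all pairs of states and discard the pairs that violate the simulation condition, using a matching predicate instead of label equality. This is a minimal sketch and not the SYNTHESIS implementation. The function names, the dictionary encoding, and the two toy graphs are hypothetical; the matcher handles only regular labels and the universal label ?true_#, and we gloss over the distinction between the component view and the coordinator view of a label.

```python
from typing import Callable, Dict, List, Tuple

# state -> [(label, target state), ...]
LTS = Dict[str, List[Tuple[str, str]]]


def simulates(l1: LTS, init1: str, l2: LTS, init2: str,
              matches: Callable[[str, str], bool]) -> bool:
    """Return True if l2 simulates l1 from the initial states, comparing
    labels with `matches` instead of plain equality."""
    # Start from all pairs and discard the ones violating the condition.
    relation = {(s, v) for s in l1 for v in l2}
    changed = True
    while changed:
        changed = False
        for s, v in list(relation):
            ok = all(
                any(matches(lbl1, lbl2) and (s_next, v_next) in relation
                    for lbl2, v_next in l2[v])
                for lbl1, s_next in l1[s]
            )
            if not ok:
                relation.discard((s, v))
                changed = True
    return (init1, init2) in relation


def simple_match(policy_label: str, coordinator_label: str) -> bool:
    """Simplified matcher: regular labels and the universal label only."""
    return policy_label == "?true_#" or policy_label == coordinator_label


# Hypothetical alternating policy and hypothetical deadlock-free coordinator.
policy = {
    "P0": [("?C3.method1_1", "P1")],
    "P1": [("?C3.method1_2", "P0")],
}
k_df = {
    "K0": [("?C3.method1_1", "K1"), ("?C3.method2_2", "K0")],
    "K1": [("?C3.method1_2", "K0")],
}

print(simulates(policy, "P0", k_df, "K0", simple_match))  # True
print(simulates(k_df, "K0", policy, "P0", simple_match))  # False
```

In the toy example the coordinator can mimic every step of the policy, so the check succeeds; the converse fails because the policy offers no counterpart for ?C3.method2_2.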
corresponding failure-free coordinator AC-Graph if the following properties hold:
In order to correctly guarantee a coordination policy, we should guarantee that the synthesized failure-free coordinator exhibits all the interactions specified through the coordination policy AC-Graph. This is ensured by $Tr(P) \subseteq_{\cong_U} Tr(K_{df})$ and by Property "Policy-Guarantee", which guarantees that the set of traces of the deadlock-free coordinator AC-Graph is restricted to those traces that "match" all the policy AC-Graph traces.
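The restriction performed in the second phase can be illustrated with a Python sketch of a product between the deadlock-free coordinator and the policy: a coordinator transition is kept only where the policy offers a matching transition, and only the pairs of states reachable from the initial pair are built. This is a sketch of a generic synchronous product under our own assumptions, not the exact construction used by SYNTHESIS. The function restrict_to_policy, the simplified matcher, and both toy graphs are hypothetical; in particular, the toy coordinator omits the forwarding steps towards C3.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

LTS = Dict[str, List[Tuple[str, str]]]          # state -> [(label, target)]
Pair = Tuple[str, str]                           # (coordinator state, policy state)
Product = Dict[Pair, List[Tuple[str, Pair]]]


def restrict_to_policy(k_df: LTS, k_init: str, policy: LTS, p_init: str,
                       matches: Callable[[str, str], bool]) -> Product:
    """Keep the coordinator transitions that the policy can match,
    exploring only the pairs reachable from the initial pair."""
    product: Product = {}
    queue = deque([(k_init, p_init)])
    while queue:
        pair = queue.popleft()
        if pair in product:
            continue
        k_state, p_state = pair
        arcs = []
        for k_label, k_next in k_df[k_state]:
            for p_label, p_next in policy[p_state]:
                if matches(p_label, k_label):
                    arcs.append((k_label, (k_next, p_next)))
                    queue.append((k_next, p_next))
        product[pair] = arcs
    return product


def simple_match(policy_label: str, coordinator_label: str) -> bool:
    """Simplified matcher: regular labels and the universal label only."""
    return policy_label == "?true_#" or policy_label == coordinator_label


# Hypothetical coordinator in which both clients may always call method1.
k_df = {"K0": [("?C3.method1_1", "K0"), ("?C3.method1_2", "K0")]}
# Hypothetical alternating policy: C1 first, then C2, and so on.
policy = {
    "P0": [("?C3.method1_1", "P1")],
    "P1": [("?C3.method1_2", "P0")],
}

k_f = restrict_to_policy(k_df, "K0", policy, "P0", simple_match)
for state, arcs in sorted(k_f.items()):
    print(state, arcs)
```

In the resulting graph the pair (K0, P0) enables only the request of C1 and the pair (K0, P1) enables only the request of C2, i.e., the two requests are forced to alternate.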
We have given Definitions 17-19 with only one coordination policy in mind. Indeed, our method iterates the policy guarantee step formalized by means of Definitions 17-19 for all the specified coordination policies. That is, after our method has built K f for the kth policy, it proceeds to build a new K f from the current K f and the (k + 1)th policy. The method moves on to the next policy guarantee step only if the set of traces of the failure-free coordinator AC-Graph derived at the last iteration contains the set of traces of the next policy AC-Graph; otherwise it stops unsuccessfully.
In Fig. 7 , we show the failure-free coordinator AC-Graph for our explanatory example (the trace containment check between the AC-Graph of "AlternatingProtocol" and the deadlock-free coordinator AC-Graph of our example is valid).
By referring to Fig. 7, we can see that the failure-free coordinator, from its initial state S0, prevents the component C2 from performing a request of C3.method1 before component C1 has performed that request. Moreover, C1 and C2 perform the request of C3.method1 by following a strictly alternating protocol, i.e., C1 first, then C2, and so on.
From the failure-free coordinator's AC-Graph, by exploiting the information stored in each state and arc, we can automatically derive the code that implements the deadlock-free coordinator component that performs the specified coordination policies (i.e., the correct composition code). The technique used to automatically derive the actual code that implements the failure-free coordinator component is presented in [30, 39] in the context of COM/DCOM and EJB components, respectively.
Evaluation
In this section, by exploiting the analysis, made in [55], of some concurrent system specification notations existing in the literature, we evaluate our notation with respect to the existing ones and motivate its use within our application context. In the literature there are many languages for reasoning about concurrent systems in order to support their functional property analysis. For example, in the context of software verification via model-checking techniques, these languages all belong to the formalism of temporal logics, e.g., Linear-Time Temporal Logic (LTL) [44, 45] and other similar formalisms (such as CTL, ACTL). These formalisms make use of temporal operators (e.g., the globally, eventually, next, and until operators of LTL) that allow one to express a variety of functional system requirements that range from safety to liveness properties.
In this paper, for the coordination policy specification task, we focused on a simpler, although less expressive, LTS-based formalism. In particular, our LTS-based formalism belongs to the class of regular expressions and, hence, it supports the process of guaranteeing only safety properties.
The main motivation behind our choice has been to find, within the application domain of the SYNTHESIS tool (i.e., automatic component composition and coordination), an acceptable compromise between expressive power and simplicity of use. Actually, the simplicity of our LTS-based notation (used to specify coordination policies) does not much increase the practicality of the automatic coordinator synthesis, which might also be carried out (by implementing the needed technical modifications) taking into account, e.g., LTL; rather, it increases the usability of the SYNTHESIS tool. As discussed in depth in [55], it is well known that expressing properties in LTL and other related logics is a difficult task. For example, Holzmann in [46] shows that writing LTL formulae is an error-prone task and, hence, the inherent complexity of LTL may cause users to specify properties incorrectly. For this reason, the introduction of temporal logic-based techniques in an industrial software life-cycle requires specific skills and good tool support. As a matter of fact, industries are not willing to use the above-mentioned techniques [47].
Many works in recent years propose solutions to overcome this problem. While one proposal is to construct a library of predefined LTL formulae from which a user can choose [48], other works propose the specification of temporal properties through graphical formalisms [49-54]. Each of these solutions has advantages and disadvantages. For example, as reported in [55], "... Graphical Interval Logic (GIL) [49] is sufficiently expressive but its formulae become potentially difficult to understand. This difficulty comes from the fact that its graphical notation is very close to temporal logic syntax. Visual Timed event Scenarios (VTS) [52,53] is a visual language for expressing event-based requirements. In this language system events are considered any observable and interesting changes (from the point of view of the verification) during the system execution. Thus events are considered to be more abstract and general than message exchanging (e.g., an event can be a key press or an internal state change) ...." Consequently, differently from us, they do not explicitly consider component-based systems (where interaction is modelled by message passing). "... Other approaches [58,61] define graphical languages that appear to be not easily comprehensible and not easily integrable into industrial software development processes ...."
It is well known [57, 59] that every LTL formula can be translated into a Büchi automaton [60]. Although this representation, as is usual for graphical representations, looks more intuitive than the corresponding LTL formula, it can still be difficult to directly represent a property as a Büchi automaton. To overcome this problem, extended Message Sequence Chart (MSC) notations have been proposed in the literature (see [51, 55, 56] and references therein). Being based on MSCs, these notations are more intuitive and more commonly used in software development practice, although they still require specific skills of their users.
We are aware of the limited expressiveness of our LTS-based notation with respect to the temporal logic notations discussed above (e.g., LTL), but, as already said, this was a deliberate choice made to increase the usability of SYNTHESIS in specifying coordination policies. Furthermore, we are also aware of the fact that our LTS-based notation, for complex coordination policies, can be less intuitive than an MSC-based notation such as the one described in [55, 56]. However, we have chosen an LTS-based notation in order not to force the SYNTHESIS user to learn new notations beyond LTSs, that is, in order to reduce the formal knowledge needed to use SYNTHESIS to a single formalism, i.e., LTSs.
Moreover, note that, by continuing to use LTSs as the internal coordination policy specification language, we could easily simplify the task of specifying coordination policies within SYNTHESIS by implementing a library of predefined coordination policies. Each policy would be expressed in a human-understandable language and automatically translated into the corresponding coordination policy AC-Graph, analogously to what has been done in [48] for properties internally expressed in LTL. However, note that the main topic of this paper is automatic component composition and coordination, not behavioural property or coordination policy specification. Thus, the complete treatment of property specification patterns and their mapping into our notation is out of the scope of this paper and left to possible future work.
Failure-free coordinator synthesis algorithm
In this section, we summarize the steps of the entire algorithm used to automatically build the failure-free coordinator's AC-Graph and to automatically derive from it the correct (with respect to failure freedom) assembly code for the components forming the specified CFA-system: (6) 
exit(SUCCESS).
Note that steps 1 and 2 are realized by implementing, within SYNTHESIS, the algorithms reported in the Appendix and described in [17, 18].
Step 3 is trivial and consists in checking whether K df has no transitions or not.
Step 4(a) is realized by applying Theorem 1 that, in turn, is realized by implementing the algorithm formalized by Definitions 17 and 18.
Step 4(b) is realized by implementing, within SYNTHESIS, the algorithm that is formalized through Definition 19 (a preliminary version of this algorithm is also described in [17, 18]). Referring to Definition 19, and as already said in Section 5.4.1, in order to build the failure-free coordinator AC-Graph from the maximal deadlock-free coordinator AC-Graph K df and the AC-Graph of a coordination policy P k , SYNTHESIS performs a suitable version of the classical synchronous product [36, 38] between LTSs. Differently from the classical version, this new version does not match the transition labels syntactically (i.e., by checking for the equality of two strings); instead, it uses the operator ≅U that is rigorously formalized by Definition 15. Thus, the algorithm first performs the Cartesian product of the sets of states of K df and P k , hence building the set of states of K f (see property "States" of Definition 19). Then, the set of transitions of K f is generated by performing the step formalized by property "Policy-Guarantee" of Definition 19. Note that, in general, at this point K f can be a disconnected graph made of either strongly or weakly disconnected components. Finally, the algorithm discards all the disconnected components that do not contain the initial state, hence satisfying property "Reachability" of Definition 19. Since this algorithm comes directly from the formalization of Definition 19, we do not add its description to the Appendix and rely instead on the previous informal explanation coupled with Definition 19.
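As a rough illustration of step 4(b), the following sketch builds a synchronous product of two LTSs with a pluggable label matcher. It is not the algorithm of Definition 19: the function name policy_product, the dictionary encoding of an LTS, and the matches parameter (a stand-in for the matching operator of Definition 15, whose definition is not reproduced here) are assumptions. Instead of building the full Cartesian product and discarding components afterwards, the sketch explores only the state pairs reachable from the initial pair, which is one simple way to obtain a graph in which every state is reachable.

```python
from collections import deque


def policy_product(k_df, k_init, policy, p_init, matches):
    """Build the reachable part of the product of two LTSs.

    Each LTS maps a state to a list of (label, next_state) pairs.
    `matches(k_label, p_label)` decides whether a coordinator label and a
    policy label may synchronize (a stand-in for the real matching operator).
    """
    initial = (k_init, p_init)
    product = {}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if state in product:
            continue  # already expanded
        k_state, p_state = state
        arcs = []
        for k_label, k_next in k_df.get(k_state, []):
            for p_label, p_next in policy.get(p_state, []):
                if matches(k_label, p_label):
                    arcs.append((k_label, (k_next, p_next)))
        product[state] = arcs
        queue.extend(target for _, target in arcs)
    return initial, product


if __name__ == "__main__":
    k_df = {"K0": [("?a", "K1"), ("?b", "K1")], "K1": [("!a", "K0")]}
    policy = {"P0": [("a", "P1")], "P1": [("a", "P0")]}
    # Illustrative matcher: ignore the leading "?" or "!" direction marker.
    initial, k_f = policy_product(k_df, "K0", policy, "P0",
                                  lambda k, p: k[1:] == p)
    for state, arcs in k_f.items():
        print(state, arcs)
```

In this toy example the coordinator transition labelled "?b" has no matching policy label and therefore does not appear in the product.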
Step 4(c) is trivial.
Step 5 is out of the scope of this paper and it is described in detail in [30] .
Step 6 terminates.
Case study
In this section, we validate our approach by means of an industrial case study. The case study concerns the semi-automatic assembly of part of a large distributed system built in the context of the CUSPIS project [41].
The CUSPIS project
In European society [41], increasing importance is given to the issue of safeguarding, fruiting and supporting the Cultural Heritage. The European Commission [41] attaches great importance to that issue, promoting actions for protection and safeguarding, improving understanding and dissemination of the culture and history of European citizens, and making Cultural Heritage increasingly available and accessible. The CUSPIS project combines the Cultural Assets (CAs) infrastructure with the GALILEO and EGNOS ones in order to support Cultural Heritage safeguarding and protection. To this end, the CUSPIS project focuses on the specification, implementation and deployment of secure information mobility platforms that offer two basic services: Cultural Assets Management (CAM) and Cultural Assets Fruition (CAF).
CAF concerns the dissemination of CA information everywhere, e.g., people can walk around a museum and receive CA information on their mobile devices. CAM concerns the secure transport of CAs from the owner (the organization that holds the CAs) to a renter (the organization requiring the CAs).
The CAM process requires three sub-processes: (i) the certificate request, (ii) the certificate generation, and (iii) the monitoring of the CA transport.
In this work we focus on the CAM service and we show how our approach has been used to automatically implement the certificate generation service out of a set of already implemented black-box components. In Fig. 8 we show the two basic activities that the certificate generation service has to support.
In the first activity (see Fig. 8(A)) the renter and the owner produce a request certificate that expresses their approval to move a CA from the owner location to the renter one (i.e., the CA journey). In the following we describe in detail all the request certificate fields. The CA_ID field is a signed string that contains the unique identifier of the CA to be moved. Motivation is a string that describes the motivation leading to the Cultural Asset journey. The field renter (resp., owner) contains the X500 name [42] of the renter (resp., owner) entity. The field RenterSignature (resp., OwnerSignature) contains the signature of the fields (CA_ID, Motivation, Owner, Renter) that is generated with the renter (resp., owner) private key. In the second activity (see Fig. 8(B)) the CA owner and a ministry authorized person produce a validation certificate that is used to certify the ministry consent to the CA journey. The request certificate field contains the request certificate produced during the first activity. The owner and ministry fields contain the X500 name of the owner and of the ministry authorized person, respectively. The Ministry Signature (resp., Owner Signature) contains the signature of the fields (request certificate, Owner, Ministry) that is generated with the ministry (resp., owner) private key. In the following section we describe the set of existing components that we have taken into account to automatically and correctly assemble the part of the CUSPIS project that realizes the certificate generation service.
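The structure of the two certificates can be sketched as plain data types. The sketch makes several assumptions that are not taken from the CUSPIS implementation: the class and function names are hypothetical, the request certificate signatures are assumed to cover the CA identifier, the motivation and the two X500 names, the nested request certificate is serialized through its two signatures, and an HMAC over the joined fields is used as a placeholder for a real private-key signature so that the example runs with the standard library only.

```python
import hashlib
import hmac
from dataclasses import dataclass


def sign(key: bytes, *fields: str) -> str:
    """Placeholder signature: HMAC-SHA256 over the joined fields.

    The described system signs with private keys; HMAC is only a stand-in.
    """
    message = "\x1f".join(fields).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


@dataclass(frozen=True)
class RequestCertificate:
    ca_id: str
    motivation: str
    owner: str   # X500 name of the owner
    renter: str  # X500 name of the renter
    renter_signature: str
    owner_signature: str


@dataclass(frozen=True)
class ValidationCertificate:
    request_certificate: RequestCertificate
    owner: str
    ministry: str
    owner_signature: str
    ministry_signature: str


def make_request_certificate(ca_id, motivation, owner, renter,
                             renter_key, owner_key):
    signed = (ca_id, motivation, owner, renter)
    return RequestCertificate(ca_id, motivation, owner, renter,
                              renter_signature=sign(renter_key, *signed),
                              owner_signature=sign(owner_key, *signed))


def make_validation_certificate(request, owner, ministry,
                                owner_key, ministry_key):
    # Representing the nested certificate by its two signatures is an
    # arbitrary serialization choice made for this sketch.
    signed = (request.renter_signature, request.owner_signature,
              owner, ministry)
    return ValidationCertificate(request, owner, ministry,
                                 owner_signature=sign(owner_key, *signed),
                                 ministry_signature=sign(ministry_key, *signed))


if __name__ == "__main__":
    req = make_request_certificate("CA-001", "exhibition", "CN=Owner",
                                   "CN=Renter", b"renter-key", b"owner-key")
    val = make_validation_certificate(req, "CN=Owner", "CN=Ministry",
                                      b"owner-key", b"ministry-key")
    print(val.request_certificate.ca_id)
```

The validation certificate embeds the request certificate, which mirrors the requirement that the request certificate exists before the validation certificate is produced.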
The existing components and SYNTHESIS at work
In Fig. 9 we show the CUSPIS sub-system that realizes the certification service.
The component adaptor Ao, the X500 name server So, and the security component To reside on the owner host. The component adaptor Am, the X500 name server Sm, and the security component Tm reside on the ministry host. The renter certificate client Cr, the owner certificate client Co, and the ministry certificate client Cm can access the owner and ministry hosts through the public network. The clients Cr and Co interact in order to produce the request certificate. The clients Co and Cm interact in order to produce the validation certificate.
We remark that the request certificate must always be produced before the validation one. This is the overall purpose of the (sub-)system to be assembled.
In our case study we have to face two main adaptation problems. The first problem is a consequence of using existing components in a context different from the one for which they were originally designed. In particular, these components were developed in a previous project and we want to reuse them in the context of the CUSPIS project because they already realize the required functionalities. The second problem is due to the use of the adaptor and of the X500 server on different hosts, i.e., the ministry and the owner host. These different uses require different adaptations of the same components. In Figs. 10-13 we show the AC-Graphs of our existing components as they are displayed by the SYNTHESIS tool. The AC-Graphs of the two adaptor components Ao and Am are shown in Fig. 10; Fig. 11 shows the AC-Graphs of the server components So, Sm, To, and Tm; finally, the AC-Graphs of the two client components Cr and Cm and the AC-Graph of the client component Co are shown in Figs. 12 and 13, respectively.
The AC-Graphs of the adaptor and server components can be easily understood by looking at Figs. 10 and 11, respectively. They do not need further explanation because the semantics of their transitions is explained in the following, while discussing the LTSs of the clients.
In the initial state the renter (resp., ministry) client Cr (resp., Cm) can send the connection request to the server So (resp., Sm) and, from the state S1, it can receive the successful connection notification. After a correct connection, the client (either the renter or the ministry) can send the request setAdaptor (to either Ao or Am), followed by the request setX500name (to either So or Sm according to the previous call of setAdaptor). The setAdaptor request is used to set the motivation and the CA_ID of the request certificate. The setX500name request is used to set the X500 renter (or ministry) name in the validation certificate. The request releaseX500 is sent from Cr (resp., Cm) to So (resp., Sm) in order to release the resource it has acquired. The request releaseAdaptor is used to release the Ao (resp., Am) resource. Note that the releaseAdaptor request involves the process of sending the renter (resp., the ministry) signature in order to sign the request certificate.
The owner client Co (see Fig. 13) exhibits almost the same behaviour as Cr and Cm. The only difference is that Co calls setX500name followed by setAdaptor, whereas Cr and Cm call setAdaptor followed by setX500name. In Fig. 14 we show the coordination policy AC-Graph specified, by using SYNTHESIS, to model the desired system behaviour that we wish to guarantee in order to correctly assemble the previously specified components. We denote it as P 1 . It is a high-level description of a desired behaviour that we want to guarantee on the interaction of the components to be assembled in order to form the desired composed system. P 1 specifies that the ministry client Cm (i.e., C5 within SYNTHESIS) and the owner client Co have to interact with the ministry server (i.e., C4) only after both the renter client Cr (i.e., C6) and the owner client Co (i.e., C7) have released the adaptor resource (state S3 shown in Fig. 14). In other words, it models the fact that the request certificate must always be written before the validation certificate, which is the overall purpose of the system to be assembled.
Referring to the method described in Section 5, and taking into account the AC-Graphs shown in Figs. 10-13, SYNTHESIS automatically derives the no-op coordinator AC-Graph K no-op and, from it, the corresponding deadlock-free version K df . Finally, by taking into account P 1 (see Fig. 14), and after the trace containment check between P 1 and K df has been successfully performed, SYNTHESIS automatically derives the failure-free coordinator AC-Graph K f that models the correct assembly code for the system to be built. The generation of K no-op took 11.5 min, running SYNTHESIS on a MacBook Pro, 1.83 GHz Intel Core Duo, 1 GB DDR2 SDRAM. This AC-Graph has 8031 states and 15332 transitions. Due to its size, the graphical representation of K no-op within the SYNTHESIS tool is unreadable, hence we do not show it. Despite this, SYNTHESIS returns useful information about the possible deadlocks.
For instance, K no-op has two deadlock states, i.e., S4842 and S5204. Beyond these two states, it also has eight states that always lead to deadlock states. Their IDs are S4881, S4994, S5215, S5240, S4841, S4891, S5203, and S5220. From the states S4881, S4994, S4841, and S4891 only the deadlock state S4842 can be reached. From the states S5215, S5240, S5203, and S5220 only the deadlock state S5204 can be reached. Referring to the deadlocking states mentioned above, in Fig. 15 we show a fragment of K no-op that concerns the final portion of each deadlock trace. In the figure the deadlock states are drawn in light gray and the ones always leading to them are drawn in dark gray.
For instance, one deadlock (among all the detected ones) occurs whenever the components Ao, So, Am, Sm, Cm, Cr, Co, To, and Tm reach, respectively, the states S1, S2, S0, S2, S3, S3, S5, S0, S0 (i.e., the tuple of component states corresponding to the global state S3553 of K no-op ), and the ministry certificate client Cm performs a request of setAdaptor towards the adaptor Am. At this point, SYNTHESIS automatically proceeds by performing a deadlock prevention procedure on K no-op , hence producing the deadlock-free coordinator AC-Graph K df . Its generation took 2.5 s and used 10 MB of memory; K df has 8021 states and 15316 transitions. Now, the coordination policy guarantee step must be performed by taking into account the synthesized K df and the specified P 1 . It produces the failure-free coordinator AC-Graph K f , after Tr(P 1 ) ⊆≅U Tr(K df ) has been checked.
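The distinction between deadlock states and states that always lead to them can be illustrated with a small fixpoint computation. The sketch below is an assumption-laden illustration, not the deadlock prevention procedure of SYNTHESIS: an LTS is reduced to a dictionary from states to successor lists, every state without outgoing transitions is treated as a deadlock, and the function names are hypothetical. The reported sizes are at least consistent with this kind of pruning, since 8031 states minus the 2 deadlock states and the 8 states always leading to them gives 8021, although the text does not state that exactly these ten states are removed.

```python
def deadlock_analysis(lts):
    """Return (deadlocks, doomed) for an LTS given as state -> successors.

    deadlocks: states without outgoing transitions.
    doomed: other states all of whose successors are deadlocks or doomed,
            i.e. states from which a deadlock can no longer be avoided.
    """
    deadlocks = {state for state, succ in lts.items() if not succ}
    bad = set(deadlocks)
    changed = True
    while changed:  # least fixpoint: repeat until no new state is added
        changed = False
        for state, succ in lts.items():
            if state not in bad and all(t in bad for t in succ):
                bad.add(state)
                changed = True
    return deadlocks, bad - deadlocks


def prune(lts, bad):
    """Remove the given states and every transition that enters them."""
    return {state: [t for t in succ if t not in bad]
            for state, succ in lts.items() if state not in bad}


if __name__ == "__main__":
    toy = {"A": ["B", "C"], "B": ["D"], "C": ["A"], "D": []}
    deadlocks, doomed = deadlock_analysis(toy)
    print(deadlocks, doomed)
    print(prune(toy, deadlocks | doomed))
```

Because a state survives only if at least one of its successors survives, pruning in this way does not introduce new deadlock states.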
The generation of K f took 5.7 min and used 27 MB of memory. This failure-free coordinator AC-Graph has 17825 states and 33867 transitions. Due to its size, the graphical representation of K f within the SYNTHESIS tool is unreadable, as was the case for K no-op , hence we do not show it. However, SYNTHESIS also outputs a textual format of the failure-free coordinator AC-Graph, and by looking at its content we can see that P 1 has been guaranteed.
From K f , by exploiting the information stored in each state and arc, SYNTHESIS automatically derives the code that implements the failure-free coordinator component, that is, the deadlock-free coordinator component which guarantees the specified coordination policy on the interaction of the other components in the system (i.e., the correct assembly code). The technique used to automatically derive the actual code that implements the failure-free coordinator for our case study, together with the code itself, is presented in [43], where it is discussed in the context of EJB applications. In that paper the failure-free coordinator has been implemented in a distributed way, that is, as a set of component wrappers, each of them local to one component. Each component wrapper is an AspectJ aspect instrumenting the code of the wrapped component and cooperating with the other wrappers in order to realize the specified coordination policy and avoid possible deadlocks. We refer to [43] for further details on the distributed implementation of the synthesized failure-free coordinator.
Dealing with normalization
In this section we show that, under suitable assumptions, within our reference architectural style it is possible to reduce an n-layer system to a set of n single layered (sub-)systems. By means of this decomposition, the automatic coordinator synthesis approach presented above can be applied to single layered systems as well as to multi-layered ones.
In Section 4 we said that a component in our architectural style has a notion of both top and bottom interface. A component can request a service provided by another component and can receive a response (i.e., component client side). On the other hand, a component can receive a request for a service it provides and can return a response (i.e., component server side). In a single layered system (e.g., a client-server application in COM/DCOM) a component that declares only its top interface is seen as a client component. Analogously, a component that declares only its bottom interface is seen as a server component. In the case of a multi-layered system, it is possible to have a component that is both server and client. This component declares both a top and a bottom interface. In Section 4, we specified the behaviour of a component in terms of the LTS representing the sequences of messages exchanged with the environment. Referring to the notions of top interface and bottom interface, we can separate the behaviour of a component within the hierarchy of a multi-layered system into two behaviours: (i) top component behaviour, which is the behaviour representing only the sequences of top interface messages exchanged with the environment, and (ii) bottom component behaviour, which is the behaviour representing only the sequences of bottom interface messages exchanged with the environment. In Fig. 16 we show how to decompose an n-layer system into n single layered systems in the case of n = 2.
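The separation into top and bottom component behaviour can be illustrated at the level of a single message sequence. The sketch assumes hypothetical action labels for an intermediate component (a request received and answered on the bottom interface, and a request sent and answered on the top interface) and simply projects one trace onto each interface alphabet; it is not the AC-Graph construction of the paper.

```python
def project(trace, alphabet):
    """Keep only the actions of `trace` that belong to `alphabet`."""
    return [action for action in trace if action in alphabet]


# Hypothetical interface alphabets of an intermediate component.
TOP_INTERFACE = {"!top.request", "?top.response"}           # client side
BOTTOM_INTERFACE = {"?bottom.request", "!bottom.response"}  # server side

if __name__ == "__main__":
    # One possible message sequence: serve a request from below by
    # delegating to a component above.
    trace = ["?bottom.request", "!top.request",
             "?top.response", "!bottom.response"]
    print(project(trace, TOP_INTERFACE))
    print(project(trace, BOTTOM_INTERFACE))
```

Since the two alphabets are disjoint, every action of the trace appears in exactly one of the two projections.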
We are able to perform the normalization shown in Fig. 16 because, under the constraints of our architectural style and by assuming that we deal with multi-layered systems that are built by imposing non-cyclic architectural configurations, we are able to decompose the component behaviour in order to make the layers of a multi-layered system completely independent of each other. In this way, we first reduce a multi-layered system to a set of single layered subsystems and then we build a coordinator for each of them. In fact, by exploiting our architectural style and the above-mentioned assumptions, if C i is a component in an intermediate layer of a multi-layered system, we can always partition the actions performed by C i into two disjoint sets of actions that are related to the top interface and to the bottom interface of C i , respectively. Otherwise, referring to how we model a system (see Definitions 9 and 10), C i might not be an intermediate component for all components above and below it. That is, the two sets {C j } and {C k } of components above and below C i (respectively) might have components that directly synchronize with components of the other set. This, in turn, implies that the system formed by C i , the components in {C j }, and the components in {C k } would not be a multi-layered system (e.g., a three layered system with C i in the intermediate layer) but a single layered system where C i and the components in {C j } and in {C k } might be directly connected (through connectors) to each other.
Although it is common to see cyclic component dependencies in many real-scale component-based systems, the restriction required to perform the multi-layered system decomposition mentioned above can often be bypassed. In fact, in many cases, such cyclic dependencies can be avoided by changing the system design, e.g., by aggregating in a single component all components in the cyclic chain or by duplicating component instances that share the same internal state in order to sequentialize the cyclic chain of components. Moreover, in a black-box or COTS component setting such as legacy systems, this kind of configuration is not common (refer to Chapter 4, Page 36 of [28]) and, hence, our context is not always completely cyclic.
is the set of actions {α ∈ LA AC i such that α ∈ LA AC k for some k}.
Informally, the algorithm of Definition 21 "collapses" (steps 1, 2 and 3) linear and/or cyclic paths made only of actions of the component's bottom interface. Moreover, it also avoids (step 4) possible "redundant" non-deterministic behaviours.
• for each pair of arcs ((ν, λ, µ) and (ν, λ, υ)) or ((ν, λ, ν) and (ν, λ, υ))
The formal definition of the BAC-Graph construction algorithm is very similar to Definition 21, and is given by considering a set of C i actions called "TopInterface(C i )", which is defined analogously to "BottomInterface(C i )" (see Definition 20). Thus, for the sake of brevity, we omit the formal definition of the BAC-Graph construction algorithm.
Correctness and completeness
In this section we prove correctness and completeness properties of our approach. Broadly speaking, to do this, we prove:
• Failure-freedom: the CBA-system is failure-free or, in other words, it is deadlock-free and guarantees the specified coordination policies; that is, (i) Deadlock-freedom: the CBA-system is deadlock-free; and (ii) Coordination-policy-preservation: all the traces of K f , "match" with traces of the coordination policy AC-Graph;
• Component-protocol-preservation: all the traces of K f , "projected" on the alphabet of a component AC-Graph AC i , are included in the set of traces of AC i (for all i and by ignoring possible τ-transitions in a projected trace of K f );
• Completeness: all the deadlock-free interleavings of component AC-Graph actions that "respect" the coordination policy AC-Graph (i.e., all the failure-free interleavings of component AC-Graph actions), "correspond to" traces of K f .
Proving "Deadlock-freedom", "Coordination-policy-preservation", and "Component-protocol-preservation" means proving the correctness of our approach. That is, the system formed by the components given as input plus the synthesized failure-free coordinator is deadlock-free and guarantees the specified coordination policies. Furthermore, the synthesized coordinator is the right one in the sense that it is correct with respect to the interaction protocol of the components and the interactions specified by the coordination policy. Proving "Completeness" establishes the completeness of our approach. That is, all the "safe" component interactions (with respect to deadlock-freedom and the specified coordination policy) are also interactions performed by the synthesized failure-free coordinator.
Thus, by proving the above mentioned four properties, we prove that, after the failure-free coordinator has been automatically synthesized, the only component interactions that can be performed through the insertion of the coordinator in the system are all the possible safe interactions (i.e., correctness and completeness).
Note that it is enough to prove correctness and completeness with respect to only one coordination policy, although our method allows the user of SYNTHESIS to specify more than one policy. This is trivially true by induction, since we guarantee coordination policies by following an iterative process that, at each generic step, always considers one failure-free coordinator and one coordination policy.
This contradicts the hypothesis that K f is deadlock-free and, hence, the proof is given. Before concluding the proof, let us explain why v is a deadlock state of K f . v is a deadlock state of K f because otherwise, i.e., if v were not a deadlock state of K f , from v , according to the specified coordination policy (see property "Policy-Guarantee" of Definition 19), K f would be able to perform a sequence of input-output transitions. By construction of K f (see property "Strictly I/O Behaviour" of Definition 12), K f is a passive component that performs an input only if there exists a component performing the corresponding output and vice versa. Thus, if v were not a deadlock state of K f , this would mean that, in s err , there would exist two components that can synchronize with K f : one would perform the output action that synchronizes with the input action of K f and the other would perform the input action that synchronizes with the output action of K f . This would also mean that s err is not a deadlock state of S cba , because we have shown above that, by construction, from s err it would still be possible to perform some action. Due to this contradiction, as done above, we can only conclude that v is a deadlock state of K f .
the CBA-system modelled by the parallel composition of the relabelled component AC-Graphs
AC 1 [f 1 ] = (S 1 , T 1 , D 1 , s 1 0 ), . . . , AC n [f n ] = (S n , T n , D n ,
Proposition 2 (Coordination-Policy-Preservation
Proof. For the sake of simplicity, let us suppose that AC i [f i ] has no non-shared actions. By contradiction, let us suppose that there exists a trace σ such that
trivially, contradicts the hypothesis that σ ∈ Tr(K f • T i ) τ and, hence, the proof is given.
], for some i, j with i ≠ j, on complementary actions α q−1 and α q with α q−1 =!a and α q =?a for some a, t can be rewritten as a trace of input/output actions t I/O = α 0 . . . α 2m−1−k where k is the number of non-shared actions in t, and for all h = 0 . . . 2m − 1 − k, α h is an input action ?a_i (for some a and i) and α h+1 is an output action !a_j (for some a and j). By construction, we have that t I/O ∈ Tr(K ).
Proposition 4 (Completeness). Let AC 1 = (S 1 , T 1 , D 1 , s 1 0 ), . . . , AC n = (S n , T n , D n , s n 0 ) be the AC-Graphs of the components forming the CFA-system (modelled as
By hypothesis, we can also suppose that there exists a t ∈ Tr(P j ) such that t = l 0 . . . l 2m−1−k and, for all h = 0, . . . ,
For property "Policy-Guarantee" (see Definition 19) , this means that
Thus, we also have that t I/O ∈ Tr(K f ). This lets us conclude that if there exists, in the CFA-system, a failure-free interleaving t (i.e., a deadlock-free interleaving that respects P j ), then it corresponds to a trace t I/O of K f and, hence, the proof is given.
Related work
The architectural approach to correct and automatic coordinator synthesis presented in this paper is related to a large number of other problems that have been considered by researchers over the past two decades. For the sake of brevity we mention below only the works closest to our approach. The most closely related approaches are in the "scheduler synthesis" research area. In the discrete event domain they appear as "supervisory control" or "discrete controller synthesis" problems [6, 26] addressed by Wonham, Ramadge et al. In very general terms, these works can be seen as an instance of a problem similar to the problem treated in our approach. However, the application domain of these approaches is considerably different from the software component domain. Dealing with software components introduces a number of further problematic dimensions to the original synthesis problem. In the scheduler synthesis approaches the possible system executions are modelled as a set of event sequences, and the system specification describes the desired executions. The role of the supervisory controller is to interact with the system in order to meet the system specification. The aim of these approaches is to restrict the system behaviour so that it is contained in a desired behaviour, called the specification. To do this, the system is constrained to perform events only in strict synchronization with another system, called the supervisor (or controller). This is achieved by automatically synthesizing a suitable supervisor with respect to the system specification. In contrast to our method, there is one main assumption to deal with deadlocks: in order to automatically synthesize a supervisor which avoids deadlocks, they need to consider a specification of the deadlocking behaviours of the base system (i.e., the event sequences that might cause deadlocks). This is a problem because, for large systems, the designers might not know the deadlocking behaviours since they might be unpredictable.
Other works that are related to our approach appear in the context of model checking of software components, in which compositional reachability analysis [11] and automatic assumption generation [12] techniques are largely used. In [11] Giannakopoulou, Kramer and Cheung described a compositional approach to efficiently perform functional analysis of distributed systems. They validate the behaviour of a distributed system with respect to specified safety and liveness properties. The hierarchical software architecture imposed on the system model to be validated allows them to reduce its size. In fact, by exploiting the hierarchical structure of the system, they are able to check its subsystems against the specified properties. At this point, each subsystem can be minimized in order to be modelled as a single component and the analysis is incrementally carried out. In contrast to our method they are able to minimize the model of the global system by performing efficient analysis. However, the problem faced by their approach is limited to analysis, while our technique goes beyond analyzing functional properties of a system by also considering the problem of automatically forcing the system to exhibit only deadlock-free and specified behaviours. In [12] Giannakopoulou, Pasareanu and Barringer faced a problem that can be seen as an instance of the general problem formulated in Section 2. In the case of these approaches the treated problem can be formulated as follows: given a component C and a desired behaviour B, find an environment E for C in such a way that E(C) ≡ B under an appropriate notion of equivalence.
In this approach when model checking a component against a property, the algorithm returns one of the following three results: (i) the component satisfies the property for any environment; (ii) the component violates the property for any environment; or finally (iii) an automatically generated set of assumptions that characterizes exactly those environments in which the component satisfies the property. The difference with our approach is that they automatically synthesize the assumptions that represent the weakest environment in which the component satisfies the specified properties. That is, they deal with only two components: (i) one actual component and (ii) its environment. Moreover, they find an environment in such a way that the specified property is ensured but they do not guarantee the property for any possible environment.
Promising formal techniques for the compositional analysis of component-based design have been developed in [9, 25]. The key to these works is modular reasoning, which supports the modular checking of behavioural properties. In [9], De Alfaro and Henzinger use an automata-based approach to capture both input assumptions about the order in which the methods of a component are called, and output guarantees about the order in which the component calls external methods. The formalism supports automatic compatibility checks between interface models, and thus constitutes a type system for component interaction. The purpose of this work is different from ours. The authors consider two components to have compatible interfaces if there exists a legal environment that lets them interact correctly. Each legal environment is an adaptor for the two components. They provide only a consistency check among component interfaces. That is, they do not deal with the automatic synthesis of component interface adaptors (i.e., the automatic synthesis of legal environments). However, in [25] De Alfaro, Henzinger, Passerone and Sangiovanni-Vincentelli use a game-theoretic approach for checking whether incompatible component interfaces can be made compatible by inserting between them a converter that satisfies specified requirements. This approach is able to automatically synthesize the converter. In contrast to the work presented in this paper, with respect to deadlock-freedom, the specification of the converter's requirements is assumed to be correct. Thus if, e.g., the specification erroneously introduced deadlocks, they would not be prevented by the converter, since it is synthesized to be completely compliant with its requirements specification. In other words, a deadlock-preventing specification of the requirements to be satisfied by the adaptor has to be provided, which delegates to the user the non-trivial task of specifying it.
Our research is also related to work in the area of protocol adaptor synthesis developed by Yellin and Strom [33]. The main idea is to modify the interaction mechanisms that are used to glue components together so that compatibility is achieved. This is done by integrating the interaction protocol into components by means of adaptors. However, they only consider syntactic incompatibilities between the interfaces of components and they do not allow the kind of interaction behaviour that our synthesis approach supports. Moreover, they require a formal specification of the adaptor dictating, for example, a mapping function among events of different components. Although requiring this kind of specification enhances the applicability of their approach with respect to the one described in this paper, it is in contrast with our need to be as automatic as possible. In fact, even if other kinds of techniques to specify the adaptor are possible, providing the adaptor specification requires knowing too many implementation details, thus missing part of the goals of the work presented in this paper. However, if we assume that such a detailed adaptor specification is given as input, our approach can be used to deal with the kind of incompatibilities that Yellin and Strom face in their work. In [3, 30], we extended the approach described in this paper in order not only to restrict the coordinator behaviour but also to augment it so that such incompatibilities are considered as well.
In other work by Bracciali, Brogi and Canal [5], in the area of component adaptation, it is shown how to automatically generate a concrete adaptor from: (i) a specification of component interfaces, (ii) a partial specification of the components' interaction behaviour, (iii) a specification of the adaptation in terms of a set of correspondences between actions of different components and (iv) a partial specification of the adaptor. The key result is the setting of a formal foundation for the adaptation of heterogeneous components that may present mismatching interaction behaviour. Analogously to the work of Yellin and Strom, although this work provides a fully formal definition of the notion of component adaptor, its application domain is different from ours. Since, in specifying a system, we want to maintain a high abstraction level, assuming a specification of the adaptation in terms of a set of correspondences between methods (and their parameters) of two components requires knowing many implementation details (about the adaptation) that we do not want to consider in order to synthesize the adaptor.
In our previous work [15, 16] we describe an approach to automatically synthesize software connectors whose aim is to restrict all possible component interactions in order to prevent deadlocks. In [17] we also started dealing with generic behavioural failures beyond deadlocks. Although these previous works share most of the ideas contained in this paper, they represent first attempts to face the problem of correct component assembly. The formalization provided in this paper allows us to rigorously characterize the kind of deadlocks (i.e., observable deadlocks mentioned in Section 2.1 and formally characterized in Section 5.3) and generic failures (i.e., formally characterized in Section 5.4) that we can automatically prevent in assembling a component-based system. Thanks to this formalization, we also prove correctness and completeness properties of our approach. Moreover, in our previous work, the decomposition described in Section 7 had not yet been developed and, hence, it was not clear whether our approach could easily be generalized to multi-layered systems. Finally, the case study described here is new with respect to the case studies treated in our previous work.
Conclusions and future work
In this paper, by means of an explanatory example, we have described a coordinator-based architectural approach to component assembly. Furthermore, we have validated the described approach on an industrial case study that concerns the development of systems for safeguarding, fruiting, and supporting the Cultural Heritage. Our approach focuses on detection and prevention of the assembly concurrency conflicts (i.e., deadlocks) and on guaranteeing coordination policies against the interaction behaviour of the components constituting the system that must be assembled.
A key role is played by the software architecture structure since it allows all interactions among components to be explicitly routed through a synthesized coordinator. By imposing this software architecture structure on the composed system we isolate the component interaction behaviour in a new component (i.e., the synthesized coordinator) that is inserted in the composed system. By acting on the coordinator we ensure that the system interaction behaviour is deadlock-free and satisfies the specified coordination policies.
Our approach requires having a bMSC and HMSC specification of the system that must be assembled from which we automatically derive the LTS description of the interaction behaviour of each component with its environment. Since bMSCs and HMSCs are common practice in real-scale contexts, this is an acceptable assumption. Moreover we assumed that an LTS specification of the coordination policies that must be guaranteed will be provided by the user.
In [30], we show that our approach is compositional with respect to the coordinator synthesis process. That is, if we build the coordinator for a given set of components and later insert new components in the obtained system, we can simply extend the coordinator model already available without having to perform the entire synthesis process again. Moreover, we have applied our approach in COM/DCOM and EJB real-scale contexts. To do this we have developed a tool called SYNTHESIS implementing the entire method presented in this paper [30].
Analogously to other program synthesis algorithms (e.g., [22] by Manna and Wolper), our approach suffers from the well-known state-explosion phenomenon. The space complexity of the synthesis algorithm is exponential. This value of complexity is obtained by considering the complexity of the coordinator AC-Graph generation and the size of the data structure used to build the coordinator AC-Graph. At present we are able to mitigate this problem by efficiently implementing the model of the centralized adaptor. That is, we internally represent the model of the no-op adaptor symbolically by using Binary Decision Diagrams. Moreover, it is worth noticing that our approach suffers from the state-explosion phenomenon because we want to keep the deadlock prevention process as automatic as possible, i.e., without requiring a specification of the deadlocking behaviours or of the coordination policies that, when guaranteed, allow us to prevent deadlocks. Currently, by requiring a specification of such deadlock-preventing coordination policies, we are able to synthesize the failure-free coordinator in an efficient way (i.e., the synthesis algorithm has a polynomial space complexity in the maximum number of states of the components) as described in [2]. In this way, we lose in terms of applicability of the approach but we definitively solve the state-explosion problem suffered by the current approach.
As future work, we plan to further reduce the state-explosion phenomenon by suitably combining our method with partial order reduction [14] techniques. This would allow us to reduce the size of the state space of the coordinator AC-Graph generation algorithm. By referring to automata-based model checking [7], we are also working to perform on-the-fly analysis during the coordinator model building process. Other possible limits of the approach are: (i) we completely centralize the coordinator logic and we provide a strategy for the derivation of the coordinator source code which derives a centralized implementation of the coordinator component. We do not think this is a real limit because, although we centralize the coordinator logic, we can actually think of deriving a distributed implementation of it; (ii) we assume that a HMSC and bMSC specification for the system that must be assembled is provided. Although this is a reasonable expectation, it is interesting to investigate testing and inspection techniques (such as, for example, the one described in [13]) to directly derive from a COTS (black-box) component some kind of (possibly partial) behavioural specification; (iii) we also assume an LTS specification for the coordination policies that must be guaranteed. It would be interesting to investigate more user-friendly coordination policy specifications, for example by extending the HMSC and bMSC notations to express more complex system behaviours.
Appendix. Algorithms of the deadlock-free coordinator synthesis, and LTSs parallel composition
In this appendix, in order to make the paper self-contained, we report the algorithms performed by SYNTHESIS to synthesize the deadlock-free coordinator and that are described in [17, 18] . We also include the formal definition of parallel composition of LTSs in the general case of more than two LTSs.
EX-Graph creation algorithm
As already mentioned in Section 5.2, in order to automatically construct the LTS of the deadlock-free coordinator, SYNTHESIS automatically derives, for each component AC-Graph, another LTS that is a component EX-Graph. An EX-Graph is a partial model/view of the behaviour of the coordinator to be built. It is partial since it reflects only the expectations of a single component. As will be described in the remainder of this appendix, the LTS of the deadlock-free coordinator is built by means of a unification algorithm over EX-Graphs.
The following is the algorithm that SYNTHESIS uses to automatically derive from an AC-Graph the corresponding EX-Graph:
• for all $(\mu, \alpha, \mu') \in D_i$ do:
$EX_i$ contains complete information for the interactions performed in order to make the coordinator able to synchronize with $C_i$ (e.g., action ?a_i synchronizes with action !a performed by $C_i$), and partial information for the interactions performed in order to synchronize with components different from $C_i$ (e.g., action !a_# can synchronize with all the actions ?a performed by $C_j$ for some $j$ different from $i$). As will be described in the following, the partial information of all the EX-Graphs will be handled by a unification algorithm over EX-Graphs and solved (i.e., instantiated with suitable complete information) in order to produce the complete model of the no-op coordinator behaviour. The no-op coordinator is not yet the deadlock-free one. It performs/models all the possible component interactions, i.e., both the deadlock-free and the deadlocking interactions. As will be described in the following, through backwards error propagation [12], starting from the LTS of the no-op coordinator and cutting the deadlocking interactions, it is possible to automatically derive the LTS of the deadlock-free coordinator.
In the following sections we present the algorithms required for the automatic construction of the no-op coordinator and for the automatic synthesis, from it, of the LTS of the deadlock-free coordinator.
No-op coordinator synthesis
Let us now present the algorithm for the no-op coordinator creation. Since our algorithm makes use of a unification technique over first-order terms [21], we name it unification of EX-Graphs. The no-op coordinator synthesis is based on the unification of the component EX-Graphs. Each EX-Graph represents the behaviour that a component expects from the no-op coordinator. Each component has only its partial view of the no-op coordinator behaviour; by unifying the views of all components we can synthesize the global behaviour of the no-op coordinator.
Definitions of "unifiable" pair of actions:
Referring to [21] we can say that abstractly, the unification problem is the following: "given two descriptions x and y, can we find an object z that fits both descriptions?"
For our purpose the unification problem can be stated as follows: let an action_term be a known action label plus an integer, which denotes the component that knows that action (e.g., $(t, i)$ is the term that denotes the action $t$ that is known to the component $C_i$, that is, $t$ is an action label in the action alphabet of $C_i$); let an action_variable be an unknown action label plus an integer, which denotes the component that does not know that action (e.g., $(v, j)$ is the variable which denotes the action $v$ that is unknown to the component $C_j$, that is, $v$ is an action label in the action alphabet of a component $C_k$ for some $k \neq j$); then, given an action_term $(t, i)$ and an action_variable $(v, j)$, do $t$ and $v$ denote the same action? (i.e., is $t$ equal to $v$ and is $i$ different from $j$?). It is worth noticing that our notion of term and variable does not exactly correspond to the usual notion given in the context of unification over first-order terms [21]. Since our synthesis algorithm is based on the idea of iteratively trying to match unknown actions in an EX-Graph with known actions in a different EX-Graph (i.e., finding a syntactic match between their labels), for the sake of simplicity, we consider a known (unknown) action as an action_term (action_variable).
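The unifiability test stated above (same label, different component index) can be illustrated with a minimal Python sketch. The names `ActionTerm`, `ActionVariable`, `unifiable` and `matches` are hypothetical and introduced only for illustration; they are not taken from SYNTHESIS.

```python
from typing import List, NamedTuple, Tuple


class ActionTerm(NamedTuple):
    """A known action: label t plus the index i of the component C_i that knows it."""
    label: str
    component: int


class ActionVariable(NamedTuple):
    """An unknown action: label v plus the index j of the component C_j that does not know it."""
    label: str
    component: int


def unifiable(term: ActionTerm, var: ActionVariable) -> bool:
    # (t, i) and (v, j) unify when t equals v and i differs from j.
    return term.label == var.label and term.component != var.component


def matches(terms: List[ActionTerm],
            variables: List[ActionVariable]) -> List[Tuple[ActionTerm, ActionVariable]]:
    # Hypothetical helper: every (term, variable) pair that unifies.
    return [(t, v) for t in terms for v in variables if unifiable(t, v)]
```

For instance, `unifiable(ActionTerm("a", 1), ActionVariable("a", 2))` holds, whereas the same label paired with the same component index does not unify.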
In the following we give the formal definitions necessary to fully-define the notion of unification that the coordinator synthesis algorithm is based on: 
EX-Graphs unification algorithm
Intuitively, we attempt to match known actions in an EX-Graph $EX_i$ (i.e., action_terms of $C_i$) with unknown actions in another EX-Graph $EX_j$ (i.e., action_variables of $C_j$). In the following we will use the terms state and node interchangeably. The following is the EX-Graphs unification algorithm we use to automatically synthesize the coordinator AC-Graph:
To ease the understanding of a generic step of the unification algorithm, in Fig. 17 we show its first step of execution applied to the explanatory example introduced in Section 5.1. Fig. 17 shows portions of the EX-Graphs for the components of the example. It also shows the no-op coordinator portion automatically synthesized by the unification algorithm after the first unification step. Moreover, the sets of action_terms and action_variables built during the first unification step are reported in the figure. Note that for each no-op coordinator node, there is also its internal label written as a tuple of EX-Graph states.
The synthesized no-op coordinator AC-Graph (see Definition 12) might contain deadlock traces. The deadlock states, in these deadlock traces, model possible deadlocks in the interaction of the components that has to be controlled by the synthesized coordinator. As said in Section 5.3, by pruning the deadlock traces of the no-op coordinator, the LTS of a deadlock-free coordinator can be automatically derived (see Definition 13) . We recall that, for the purposes of the work described in this paper, our deadlock-free coordinator synthesis algorithm produces the maximal deadlock-free coordinator. That is, SYNTHESIS automatically synthesizes the most permissive deadlock-free coordinator.
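The pruning of deadlock traces can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: an LTS is given as a set of `(source, label, target)` triples, a deadlock state is any state without outgoing transitions, and the removal is iterated to a fixpoint rather than performed through a recursive depth-first visit. The name `prune_deadlocks` is hypothetical.

```python
def prune_deadlocks(transitions, initial):
    """Return the transitions of the most permissive deadlock-free sub-LTS.

    transitions: set of (source, label, target) triples.
    initial: the initial state.
    Returns None when the initial state itself ends up deadlocked,
    i.e., when no deadlock-free restriction exists under these assumptions.
    """
    live = set(transitions)
    while True:
        states = {initial} | {s for s, _, _ in live} | {t for _, _, t in live}
        sources = {s for s, _, _ in live}
        # Assumption: a state with no outgoing transition is a deadlock state.
        dead = states - sources
        if not dead:
            return live
        if initial in dead:
            return None
        # Cut every transition entering a deadlock state; predecessors that
        # lose all their outgoing transitions are caught on the next round,
        # which propagates the error backwards.
        live = {(s, a, t) for (s, a, t) in live if t not in dead}
```

Only transitions leading into (eventual) deadlock states are removed, so every behaviour that can always continue is retained. States that become unreachable after pruning are not trimmed in this sketch.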
In the following section, we formalize the algorithm that is performed by SYNTHESIS to automatically derive, from the no-op coordinator AC-Graph, the maximal deadlock-free coordinator AC-Graph.
Construction algorithm for the maximal deadlock-free coordinator
Let K = (S K , T K , D K , s K 0 ) be
The previous algorithm, in order to build the maximal coordinator AC-Graph, removes all possible deadlock traces in the no-op coordinator AC-Graph by performing backwards error propagation. Informally, by means of a classical depth-first visit, the algorithm first looks for a deadlock state. If a deadlock state has been found, the algorithm prunes all its incoming transitions and the deadlock state itself. Otherwise, the algorithm is recursively performed on all the non-visited successors of the current state. Additionally, once a successor has been visited and hence the algorithm has been recursively performed on it, the algorithm is recursively performed also on the current state since, due to recursion on its successors, it might have become a deadlock state. This is the case in which the current state was not directly a deadlock state but all the traces originating from it were deadlock traces.
Parallel composition of LTSs
$$((s_1, \ldots, s_n), \beta, (s'_1, \ldots, s'_n)) \in D \iff \exists i \in \{1, \ldots, n\} : (s_i, \beta, s'_i) \in D_i \,\wedge\, \forall j \in \{1, \ldots, n\} : j \neq i \Rightarrow s'_j = s_j \,\wedge\, \Big(\beta \notin \bigcup_{k=1,\, k \neq i}^{n} T_k \,\vee\, \beta = \tau\Big)$$
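The interleaving rule of the parallel composition of $n$ LTSs (one component moves alone, on an action $\beta$ that is either $\tau$ or absent from the action alphabets of all the other components, while every other component keeps its state) can be illustrated with a minimal Python sketch. The representation of each LTS as an `(alphabet, transitions)` pair, the string `"tau"` for $\tau$, and the name `interleaving_moves` are assumptions made for illustration; synchronised moves on shared actions are not covered.

```python
TAU = "tau"  # assumed encoding of the internal action


def interleaving_moves(state, ltss):
    """Moves of the composed LTS from a global state, per the interleaving rule.

    state: tuple (s_1, ..., s_n) of local states.
    ltss: list of (alphabet, transitions) pairs, one per component, where
          transitions is a set of (source, label, target) triples.
    Returns a set of (label, next_global_state) pairs.
    """
    moves = set()
    for i, (_, delta_i) in enumerate(ltss):
        # Union of the action alphabets T_k of all components k != i.
        others = set().union(*(alph for k, (alph, _) in enumerate(ltss) if k != i))
        for src, beta, dst in delta_i:
            if src != state[i]:
                continue
            if beta == TAU or beta not in others:
                # Only component i changes state; all s_j with j != i are kept.
                moves.add((beta, state[:i] + (dst,) + state[i + 1:]))
    return moves
```

With two components that share an action `s` but own private actions `a` and `b`, only the private actions (and `tau`) yield moves here, since a shared action is excluded by the condition on the other alphabets.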
