Abstract
Introduction
The software engineering community has long understood the importance of requirements elicitation. Stakeholder involvement in the elicitation process, and tools that help build common ground between stakeholders and developers, are essential to obtaining a good requirements definition. Consequently, it is not surprising that scenarios have become increasingly popular as a requirements specification technique. Scenarios describe how system components (in the broadest sense) and users interact in order to provide system-level functionality. Each scenario is a partial story which, when combined with all other scenarios, should provide a complete system description. Thus stakeholders may develop descriptions independently, contributing their own view of the system to those of other stakeholders.
Our objective is to facilitate the development of behaviour models in conjunction with scenarios. Such models are complementary to scenarios. In addition to providing an alternative view, we believe that there is benefit to be gained by experimenting with and replaying analysis results from behaviour models in order to help correct, elaborate and refine scenario-based specifications. We aim to provide a workbench for supporting various approaches to MSC specification, behaviour synthesis and analysis.
0-7695-1050-7/01 $10.00 © 2001 IEEE
Message Sequence Charts (MSCs) [1], together with their UML counterpart Sequence Diagrams [2], are widely accepted notations for scenario-based specification. Nevertheless, up to now, there is little agreement on the exact meaning of these graphical languages. There is, of course, a core semantics on which most approaches coincide, especially in terms of explaining a single scenario. However, when interpreting several scenarios together, we can identify very distinct approaches.
For instance, in the approach adopted by the International Telecommunication Union (ITU) [1] and others [3-5], the focus is on providing MSC specifications with a means for managing complexity. Basic MSCs (bMSCs) are used to specify simple sequences of behaviour. High-level MSCs (hMSCs) are directed graphs with bMSCs as nodes and edges indicating their possible order. hMSCs allow stakeholders to reuse scenarios within a specification and to introduce sequences, loops, and disjunctions of bMSCs [1]. The advantage of the hMSC approach is that it allows stakeholders to break up a scenario specification into manageable parts in a simple, intuitive, and operational way, and to show how these different parts relate.
Another approach that differs significantly is presented in [6-8]. The focus here is on identifying, throughout the set of bMSCs, those states that are considered to refer to the same component state. As explained by Rudolph et al. [4], the use of hMSCs as the only way of introducing alternative system behaviours forces stakeholders to specify many short bMSCs; whereas, in this approach, complex component behaviour can be shown in bMSCs of any length as information is provided at the component level. The main problem with this approach is that the criteria used for identifying component states are rarely made explicit within the MSC specification. Instead, when constructing a behaviour model for components, the criteria are often embedded into the synthesis algorithms.
For example, Whittle and Schumann [6] use the Object Constraint Language (OCL) to express pre- and postconditions for messages. These are traversed with bMSCs to produce a valuation of global state variables in bMSC states. These valuations are used to identify equivalent states. Another example is the statechart synthesis algorithm in SCED [8]. This approach employs the domain-specific assumption that the capability of outputting a specific message uniquely identifies the state of a component.
In this paper we present an MSC language that integrates approaches based on hMSCs and on identifying component states. However, instead of assuming specific criteria for identifying component states, we provide a simple mechanism for making this information explicit within an MSC specification using bMSC state labels. In this way we aim to provide a workbench for approaches such as [6-8] that allows additional information (usually given in some other formalism such as OCL) and domain-specific or other assumptions to be made explicit within an MSC specification. Furthermore, we show how many of these assumptions can be automatically translated into bMSC state labels. MSC semantics is given in terms of Labelled Transition Systems (LTS) and parallel composition [9].
In addition, we provide for the automatic synthesis of system behaviour models from MSCs. We integrate our synthesis process with an existing model checking tool to support system requirements validation. This is done by first translating the MSC specification into a Finite Sequential Processes (FSP) specification [10], which can then be analysed using the Labelled Transition System Analyser (LTSA) [10] by model checking for deadlock, safety and liveness properties and by model animation [11].
In Section 2 we present the Message Sequence Chart syntax and semantics. The well-known ATM example is used throughout the paper to illustrate the presentation. Section 3 introduces the synthesis algorithm and Section 4 shows how to map some existing approaches to this one. Related work is discussed in Section 5 and the conclusions and future directions of our work are given in Section 6.
Message sequence charts
In this section, we briefly describe the syntax and semantics of Message Sequence Charts (MSCs). We also introduce the ATM example (see e.g. [8]) which is used to illustrate the different aspects of our approach. This example has several scenarios showing how a customer operates a bank account through an ATM machine and a consortium. For the sake of brevity, we use a reduced set of scenarios.
Syntax
Syntactically, the language is a subset of the ITU MSC language [1]. A basic MSC (bMSC) describes a finite interaction between a set of components (see Figure 1). [...]
- $L$ is a finite set of labels.
- $< \subseteq (E \times E)$ is a total ordering of events. We denote the minimal event $e'$ such that $e < e'$ as $suc(e)$.
- $lbl: E \rightarrow L$ is a function that describes each event's label.
A high-level MSC (hMSC) provides the means for composing bMSCs: it is a digraph where nodes are bMSCs and edges indicate their possible continuations (see Figure 1, bottom left, for an example). An hMSC can also have special initial and final nodes that correspond to the initial and final system states. We define hMSCs within the definition of MSC specifications. [...]
- [...] is the hMSC function that determines the possible continuations of the bMSCs.
- $C$ is a finite set of components. [...]
- $name$ is a family of bijective functions [...] $\rightarrow C$ that determines to which component each instance belongs.
A portion of the MSC specification of the ATM example is shown in Figure 1. It consists of five bMSCs and one hMSC.
The ITU MSC language [1] has several more features. For simplicity we have excluded them, as many are simply syntactic sugar and others do not substantially change any of the results and algorithms that follow. These features are asynchronous messages, queues, co-regions, horizontal composition, inline expressions, actions and global and non-global conditions. There are other features that we have left out as we consider them to be out of scope in this initial stage of our work; however, they may be included in the future. These features are timers, gates, process creation and termination, and incomplete messages.
We define the semantics of MSC specifications in terms of labelled transition systems (LTSs) and parallel composition [9]. We first define the semantics of an instance, then go on to that of components, and finally we define the system that is determined by an MSC specification.
There are two types of information that an MSC specification provides: sequences of message inputs and outputs, and information on states. Information on sequences of messages is provided by instances. For example, reading from top to bottom, it is intuitive to say that in the Bad Bank Account bMSC of Figure [...]
To simplify the presentation of the semantics, we shall assume normalised instances. A normalised instance is an instance in which there are no consecutive states and no consecutive message events, but in which they alternate, i.e. for all events $e$, $e'$ such that $e' = suc(e)$, either ($e \in Cond$ and $e' \in In \cup Out$) or ($e \in In \cup Out$ and $e' \in Cond$). In addition, it has states as the first and last events, i.e. if $e$ is the minimal or maximal event in $E$ then $e \in Cond$.
Normalised instances have events of a special kind, $\tau$-events, that represent internal changes of a component's state. Normalising an instance is done by using $\tau$-events to separate consecutive states and states labelled with $\varepsilon$ to separate consecutive message events. [...]
- Start is the minimal event in $E$.
- $\Delta \subseteq (S \times A \times S)$ is the transition relation, where $(q, a, q') \in \Delta$ if and only if there is a message event $e \in E$ such that $suc(q) = e$, $suc(e) = q'$, and $lbl(e) = a$.
We shall refer to the maximal state in $E$ as Stop.
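The normalisation step and the transition relation described above can be sketched as follows. This is a minimal illustration only, not the implementation used in this work: the `Event` class, the use of list positions as state identifiers, and the message names in the usage note are all assumptions made for the example.

```python
from dataclasses import dataclass

TAU = "tau"     # label of the internal tau-events (assumed spelling)
EPSILON = ""    # the empty state label, written epsilon in the text


@dataclass(frozen=True)
class Event:
    kind: str   # "state" for a condition, "msg" for a message input or output
    label: str


def normalise(events):
    """Return an alternating state/message sequence that starts and ends with a state."""
    out = []
    for ev in events:
        if not out:
            # A normalised instance must begin with a state.
            if ev.kind == "msg":
                out.append(Event("state", EPSILON))
        elif out[-1].kind == ev.kind:
            # Consecutive states are separated by a tau-event,
            # consecutive messages by an epsilon-labelled state.
            if ev.kind == "state":
                out.append(Event("msg", TAU))
            else:
                out.append(Event("state", EPSILON))
        out.append(ev)
    # A normalised instance must also end with a state.
    if not out or out[-1].kind == "msg":
        out.append(Event("state", EPSILON))
    return out


def transitions(normalised):
    """Triples (q, a, q') with suc(q) = e, suc(e) = q' and lbl(e) = a.

    States are identified by their position in the normalised sequence.
    """
    return [(i, normalised[i + 1].label, i + 2)
            for i in range(0, len(normalised) - 2, 2)]
```

For instance, an instance made of two hypothetical messages `verifyAccount` and `badAccount` followed by two consecutive states is normalised into seven events, and `transitions` yields one triple per message or tau-event.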
State labels and hMSCs provide information on the states of instance LTSs. State labels identify component states, indicating that, although they appear as distinct in instances, they are actually the same internal component state. For example, there are three different bMSCs in Figure 1 where the ATM reaches a state called Verifying. In terms of component behaviour, this means that the ATM could "switch" between bMSCs when arriving at that state. hMSCs provide information on how components can continue once they have completed a bMSC. In other words, they determine a relation between the Start and Stop states of instance LTSs. For example, according to the hMSC in Figure 1 [...]
- If $lbl(q) = lbl(q')$ and $lbl(q) \neq \varepsilon$ then $(q, q') \in R$ and $(q',$ [...]
- If $lbl(q) = Init$, then $(Init, q) \in R$.
In conclusion, components defined by an MSC specification are the result of putting together their instance LTSs and their continuation relation.
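As an illustration of how state labels induce the relation $R$, the sketch below relates every pair of states that share a non-empty label and relates Init to every state labelled Init. It covers only these two label-based rules; the contribution of the hMSC is not modelled. The state identifiers (pairs of bMSC name and position) are assumptions made for the example.

```python
from itertools import product

EPSILON = ""    # the empty state label
INIT = "Init"   # label of the initial state


def continuation_relation(labels):
    """labels maps each state identifier to its state label.

    Returns the set of pairs (q, q') related by the two label-based rules.
    """
    relation = set()
    # States that share a non-empty label denote the same component state.
    for q, q2 in product(labels, repeat=2):
        if labels[q] == labels[q2] and labels[q] != EPSILON:
            relation.add((q, q2))
    # Every state labelled Init is related to the initial state.
    for q, label in labels.items():
        if label == INIT:
            relation.add((INIT, q))
    return relation
```

With two states labelled Verifying in different bMSCs, the relation contains both orderings of the pair, which is what allows a component to "switch" between bMSCs at that state.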
- $(q, q) \in R$.

Synthesis of behaviour models
In this section we show how an LTS model can be synthesised from an MSC specification. We translate the MSC specification into a model specification in the form [...] The synthesis algorithm is outlined in Figure 3. We provide our explanation while applying it to the synthesis of the ATM component of Figure 1. The algorithm first adds state labels to bMSCs to distinguish those states that correspond to the starting or stopping of bMSCs: the modified Bad Bank Account bMSC is shown in Figure 5.
Second, the algorithm constructs a subset of the relation described in Section 2. Actually, the inverse relation, "is a continuation of", is built, and only for states, as opposed to all bMSC states. Figure 4 shows each state and the set of states that are a continuation of it. Third, every instance is broken up in order to start and [...]. Figure 6 shows the two instances that result from breaking up the ATM instance of the Bad Bank Account bMSC. Fourth comes the actual synthesis. Every instance is trivially translated into an FSP production by using its first state as the left side of the production and the sequence of events and last state as the right side of the production. Additional productions are constructed by replacing each state with one of the states that can continue it. For the instances in Figure 6, two productions are constructed by the translation, and two from the fact that E-CustomerArrives and E-BadBankPassword are continuations of B-BadBankAccount.
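The fourth step can be sketched as below. This is one possible reading, not the algorithm of Figure 3: it assumes that the substitution is applied to the left side of a production, and the representation of instances as triples, the function name, and the message names in the usage note are hypothetical.

```python
def productions(instances, continued_by):
    """Translate broken-up instances into productions.

    instances:    list of (first_state, events, last_state) triples.
    continued_by: maps a state to the states recorded as continuations of it
                  (the inverse relation built in the second step).
    Returns a list of (left, events, right) productions.
    """
    result = []
    for first, events, last in instances:
        # Direct translation: first state on the left, events and last state on the right.
        result.append((first, tuple(events), last))
        # Assumption: one extra production per state related to the first state.
        for other in continued_by.get(first, ()):
            result.append((other, tuple(events), last))
    return result
```

Given two instances, the first starting at `B_BadBankAccount`, and a map recording `E_CustomerArrives` and `E_BadBankPassword` against `B_BadBankAccount`, the function returns four productions: two from the translation and two from the substitution, which matches the count given above.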
In the fifth step the algorithm generates an FSP specification by merging productions with the same left side into one production using the choice operator. Finally, unreachable non-terminals and duplicate productions are eliminated. The final FSP specification for the ATM component process, which is the actual output of our implementation, is shown in Figure 9. The complete system is the parallel composition of all FSP components.
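The merging and clean-up steps can be illustrated with the following sketch. It is not the Java implementation: productions are assumed to be (left, events, right) triples, and `render` emits FSP-like text whose layout has not been checked against LTSA.

```python
def merge_and_clean(prods, start):
    """Merge productions by left side, drop duplicates and unreachable non-terminals."""
    merged = {}
    for left, events, right in prods:
        alternatives = merged.setdefault(left, [])
        if (events, right) not in alternatives:   # duplicate productions are dropped
            alternatives.append((events, right))
    # Collect the non-terminals reachable from the start non-terminal.
    reachable, todo = set(), [start]
    while todo:
        nt = todo.pop()
        if nt in reachable:
            continue
        reachable.add(nt)
        todo.extend(right for _, right in merged.get(nt, []))
    return {nt: alts for nt, alts in merged.items() if nt in reachable}


def render(spec):
    """Write each merged production with '|' as the choice operator."""
    lines = []
    for left, alternatives in spec.items():
        body = " | ".join(" -> ".join(list(events) + [right])
                          for events, right in alternatives)
        lines.append(f"{left} = ({body})")
    return ",\n".join(lines) + "."
```

Productions that share a left side become alternatives of a single production, and any non-terminal that cannot be reached from `start` disappears from the result.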
Once an FSP specification has been generated, LTSA can be used to build an LTS model of each component and of the complete system. Furthermore, LTSA can build minimised models of the FSP specification with respect to observational equivalence [9], which provides a more compact model and potentially clearer insight into its behaviour. In Figure 8 we show a minimised LTS of the ATM generated by LTSA. It is worth noting the impact that the state Verifying has had on the resulting model of the ATM component. Stakeholders have explicitly stated that the state the ATM is in after sending a Verify Account message is the same whatever bMSC is being executed. This fact can be seen in Figure 8, where state 5 represents the Verifying state. Since the synthesis algorithm preserves the semantics of the MSC specification [12], the synthesised model can be analysed to provide sound and useful feedback to those who wrote the MSC specification. An immediate result when the complete ATM example is analysed is that the system may deadlock. In Figure 10 LTSA shows a trace that takes the system to deadlock: if the User cancels just after entering a password but before receiving an answer, the ATM, which has requested the account to be verified, does not wait for the answer from the Consortium. Eventually, when the ATM serves the User again, it cannot communicate with the Consortium as the latter is still trying to communicate the results of verifying the previous account.
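To illustrate the kind of deadlock described above, the following sketch composes LTSs in parallel, synchronising on shared action labels, and searches breadth-first for a reachable state with no outgoing transitions. It is a simplified stand-in for the analysis, not LTSA's algorithm, and the two-component model in the usage note is a hypothetical reduction of the ATM example.

```python
from collections import deque
from itertools import product


def successors(state, components, alphabets):
    """Moves of the composite state: shared actions need every participant to move."""
    moves = []
    for action in sorted(set().union(*alphabets)):
        choices = []
        for local, (_, trans), alphabet in zip(state, components, alphabets):
            if action not in alphabet:
                choices.append([local])   # component not involved: it stays put
            else:
                choices.append([nxt for a, nxt in trans.get(local, []) if a == action])
        # product() yields nothing if some participant cannot perform the action.
        for combo in product(*choices):
            moves.append((action, combo))
    return moves


def find_deadlock(components):
    """components: list of (initial_state, {state: [(action, next_state), ...]}).

    Returns a shortest trace leading to a deadlocked state, or None if there is none.
    """
    alphabets = [{a for outs in trans.values() for a, _ in outs}
                 for _, trans in components]
    start = tuple(init for init, _ in components)
    seen, queue = {start}, deque([(start, [])])
    while queue:
        state, trace = queue.popleft()
        moves = successors(state, components, alphabets)
        if not moves:
            return trace
        for action, nxt in moves:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [action]))
    return None
```

In a hypothetical model where the ATM can `cancel` after `verify` without waiting for `answer`, while the Consortium insists on delivering `answer`, the search finds a trace in which the ATM has returned to its initial state and can no longer synchronise with the Consortium.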
Implementation
The synthesis algorithm is implemented in Java. It inputs MSC specifications in textual format [1] and outputs an FSP specification. We intend to embed the implementation into LTSA. The implementation, together with the examples used throughout this paper, is available at [13].
As our synthesis algorithm builds component specifications one at a time, computational complexity is not a critical issue; however, the number of states and the size of the hMSC specification could have an important impact on the number of synthesised productions and clean-up procedures. Table 1, which reports the execution times for the three examples presented in this paper, shows that this impact is low. All examples were run on a Pentium III, 300 MHz, 256 MB with Windows NT 4.0 and Java 1.3.
A workbench for synthesis of behaviour models
So far we have presented an MSC specification language that integrates approaches based on hMSCs and on identifying component states. We have also provided a synthesis algorithm that generates LTS behaviour models. In this section we illustrate how this approach can be used as a workbench for experimenting with different approaches for behaviour model synthesis.
Many synthesis procedures have been proposed and, although they agree on the basic interpretation of MSCs, they differ greatly in terms of the algorithms and the results obtained. This is because these approaches embed assumptions in their synthesis algorithms. These can be domain-specific assumptions, assumptions on how to include additional information provided in alternative specification languages, or other assumptions based on, for example, characteristics of the stakeholders, organization or development process. Synthesising behaviour models with certain assumptions in mind is not a problem; however, having assumptions embedded in the algorithms results in less flexibility. In this section we show how our approach can be used as a common workbench for these approaches by making assumptions explicit using states. We demonstrate this for two approaches: firstly, where additional information is used to determine equivalent states between scenarios [6, 7]; and secondly [8], where it is assumed that the capability of outputting a particular message uniquely identifies the state of the component.
Additional information and process-specific assumptions
Whittle and Schumann [6] present an algorithm for automatically generating UML statecharts from scenarios combined with a set of message pre- and post-conditions given in the UML Object Constraint Language (OCL). In fact, a labelled transition system is first synthesised and then some abstraction techniques are used to build statecharts. LTSs are synthesised by combining information from MSCs and OCL: a valuation for every state of the bMSCs is inferred from the OCL specification. These valuations are used in two different ways:
- Within an instance, two states with the same valuation determine a loop. As not all message occurrences provoke a change in valuation, it is also required that valuations be the result of a state-changing message.
- Between instances, two states with the same valuation and a common incoming message label are considered to be referring to the same state.
In addition, there is an underlying assumption: stakeholders describe system behaviour from its initial state.
In order to place this approach in our setting and make all assumptions explicit, we address each of the three issues mentioned above separately. First of all, an hMSC is constructed to make explicit the assumption that scenarios describe system behaviour from its very start (see Figure 11). Second, after valuations have been inferred from the OCL specification, and loops have been detected, all states in bMSCs are labelled with the message that precedes them and their inferred valuation. In Figure 12 we show the bMSC Bad Bank Account as provided in [6] and part of the same bMSC annotated as explained above. Valuations are shown as vectors where each position represents the value of a variable in the OCL specification.
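The labelling step can be pictured with the short sketch below, which builds a state label from the preceding message and the inferred valuation vector. The label format, the function name and the message names in the usage note are assumptions made for illustration; the valuations themselves are taken as given, since inferring them from OCL is outside the scope of the sketch.

```python
def annotate(messages, valuations):
    """Build one state label per state of an instance.

    messages[i]   is the label of the message that precedes state i.
    valuations[i] is the valuation vector inferred for state i.
    """
    labels = []
    for message, vector in zip(messages, valuations):
        values = ",".join(str(v) for v in vector)
        labels.append(f"{message}<{values}>")
    return labels
```

Two states in different bMSCs that share both the incoming message and the valuation receive the same label, so they are identified as the same component state by the state-label mechanism.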
Finally, every loop, detected using the algorithm provided in [6], is used to split bMSCs into three: the initial part that occurs before the loop, the looping part, and the part that occurs after the loop is exited. The hMSC is modified to reflect the relation between the new bMSCs, resulting in an hMSC as in Figure 13.
The final MSC specification has all the information in it and all assumptions have been made explicit through a process that can be automated. In other words, we have shown that both the ideas and algorithms developed in [6] can be used with our approach, giving the benefit of a potentially standardised synthesis algorithm and placing all information in one specification. It is worth noting that the resulting LTS corresponding to the ATM component turns out to be the same as the one for the previous specification (Figure 8). The complete example can be found in [13].
Domain-specific assumptions
Systa [8] presents the statechart synthesis algorithm implemented in SCED. The algorithm is based on the BK-algorithm [14] for constructing programs from example computations. In the synthesised statecharts, states are labelled with component actions (message outputs) and transitions with a sequence of events (message inputs). In addition, all states that represent the same action are merged as long as they do not introduce non-determinism.
The underlying rationale for this approach is the assumption that scenarios describe a particular type of component in which the capability of outputting a particular message uniquely identifies its state. This assumption can be made explicit in an MSC specification using state labels. However, there are some minor aspects of the approach that cannot be completely mapped into our setting. Firstly, in [8] the resulting statecharts have no initial state. This may not be appropriate, so we prefer to be explicit using an hMSC as shown in Figure 11. Secondly, in [8] non-deterministic models are prohibited as a way of avoiding unwanted component behaviours [15]. We see no need for imposing such a constraint, particularly when tools such as LTSA are available for analysing and detecting such behaviours. Besides, non-determinism may be desirable and deliberately introduced by designers in some cases.
In conclusion, modulo non-determinacy and initial states, we map the assumptions that are used in the approach by adding two sets of states. To merge states with the same enabled output messages, a state label B-<Message Label> is added to component instances just before outputting any message. To model statechart states that represent actions, a state label E-<Message Label> is added to component instances just after outputting any message. In Figure 14, the bMSC of Figure 12 is annotated as explained before. Thus, using state labels, an MSC specification can be built that has all the information on how states are supposed (according to [8]) to be merged. The complete example can be found in [13].
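The annotation just described can be sketched as follows. The event representation (pairs of kind and label) and the message names in the usage note are assumptions made for the example; only the B-<Message Label> and E-<Message Label> naming comes from the text.

```python
def label_outputs(events):
    """Surround every output message of an instance with B- and E- state labels.

    events: list of (kind, label) pairs, with kind in {"in", "out", "state"}.
    """
    annotated = []
    for kind, label in events:
        if kind == "out":
            annotated.append(("state", f"B-{label}"))   # state just before the output
            annotated.append((kind, label))
            annotated.append(("state", f"E-{label}"))   # state just after the output
        else:
            annotated.append((kind, label))
    return annotated
```

Applied to a hypothetical ATM instance that receives `password`, sends `verifyAccount` and then receives `badAccount`, only the output is wrapped, so every point at which `verifyAccount` can be sent carries the same label B-verifyAccount.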
Related work
Several semantics for scenario-based languages have been proposed, and also a number of synthesis techniques for building models from a scenario description have been developed. Our work focuses on integrating some of these approaches and on providing a workbench for supporting these and future approaches.
There are many approaches that generate statechart models from MSCs [6, 8, 16]. Authors argue that statecharts provide a more structured, and therefore understandable, view of behaviour. Automatically synthesising this structure does require that some design decisions be embedded into the synthesis process. However, we argue that this can be counter-productive. Design decisions should be explicit and changeable, particularly as they may vary a great deal according to the system, designer, and organization. Our approach gives special importance to producing analysable models and uses standard minimisation techniques to help to provide compact, comprehensible models.
Many approaches do not explicitly provide semantics for the scenario language they use, providing instead a synthesis algorithm to some other notation. Broy et al. [16] present a statechart synthesis algorithm in which interpretation of conditions is similar to our use of state labels. However, their approach does not support hMSCs. We share the authors' view of MSC specifications as an exact representation of interaction sequences, and also the synchronous communication setting.
Whittle and Schumann [6] focus on synthesising readable and understandable statecharts. Besides the LTS synthesis discussed previously, they provide a means of introducing structure and hierarchy into the synthesised model. Although the use of additional information in terms of OCL specification provides interesting feedback of possible specification errors, and the whole approach provides a bridge between initial MSC specifications and more complex specification techniques, the approach tends to be obscured by the assumptions that are embedded into the synthesis algorithms. We have shown how these assumptions can be made explicit, thereby providing a clearer and more transparent basis for their work. Some et al. [7] also use additional information to infer equivalences between states. We believe that they too might benefit from being mapped onto our approach to make the results of the inference process explicit. Systa's approach [8] discussed in previous sections also focuses on synthesising readable and understandable statecharts. The approach has strong assumptions embedded into the synthesis algorithm as to when states should be merged. We have shown how these assumptions can also be made explicit by mapping the approach to ours.
Some approaches give a different semantics to hMSCs and state labels. [...] ways of expressing the system's behaviour, using long or short, and several or few bMSCs as appropriate.
The formal semantics of MSCs proposed by Cobens et al. [5] is part of the Z.120 recommendations for MSCs. The semantics of hMSCs differs slightly as a late decision assumption is used. Late decision means that a component, when choosing between two different possible scenarios, will postpone the decision if both scenarios have common initial events. In our approach this needs to be explicitly stated using state labels. The advantage of the late decision assumption is that it can reduce the size of specifications. However, we again prefer to make this assumption explicit. Late decision semantics could be translated automatically into state labels in our workbench. Furthermore, the Z.120 formal semantic definition is given in terms of process algebra, with non-standard operators of delayed choice and delayed parallel composition. We prefer the more standard model of LTS with parallel composition. Other formalisations similar to [5] are given in [...]; however, only bMSC semantics is given.
Van Lamsweerde et al. [20] present a different approach to synthesis. A set of examples and counterexamples expressed as scenarios is used to infer a temporal logic specification, thus generating explicit declarative requirements from an operational description. Combining these requirements with LTS models may be an interesting possibility for future work.
Alur et al. [21] give the semantics of bMSCs in terms of a partial order of events occurring in the whole system, as opposed to considering one component at a time. The focus is on characterizing and providing algorithms for checking satisfiability, and weak and strong realizability. As in our view MSCs determine a unique set of system components rather than a set of system traces, issues such as satisfiability and weak realizability do not apply, while strong realizability (essentially deadlock freedom) can be verified using LTSA. However, we are currently looking into MSCs with semantics based on traces and analysing the scenarios implied by a model that satisfies the MSC specification. Finally, Ben-Abdallah et al. [22] focus on detecting process divergence and non-local choice, for which efficient algorithms are given.
Conclusions
We have presented a language for MSC specifications that integrates approaches based on hMSCs and on identifying component states. We have implemented a synthesis algorithm that generates LTS models for model analysis in LTSA, and illustrated how this approach can be used as a workbench for model synthesis and analysis.
Using hMSCs we help to manage the complexity of MSC specifications, promoting scenario reuse and providing a simple, intuitive, operational way of showing how scenarios relate. Using state labels to provide information on component states, we help to make explicit any additional information and any domain-specific or general assumptions in MSC specifications. By generating FSP specifications, our approach integrates with LTSA, thus supporting model checking of deadlock, safety and liveness properties. There is also the potential for model animation [11] as a means of including further domain constraints and of making the models more comprehensible to stakeholders and developers. Finally, by taking two dissimilar approaches with their own assumptions and their own means of adding information to MSC specifications, and by showing how they can be built using our approach, we have indicated how our approach could serve as a workbench for other synthesis approaches.
Scenarios have proved to be a good tool for bridging the gap between stakeholders and developers. However, up to now, this has mainly been a one-way bridge in which developers gain more insight into stakeholders' domain knowledge. Future work will focus on building a bridge in the other direction, i.e. building mechanisms to provide feedback from the developer's world to stakeholders. Preliminary work in this direction is promising. We are automating the construction of alternative system views from synthesised LTS models. Interestingly, many views can be generated by taking advantage of the semantic overlap between hMSCs and state labels. The latter identify component states across scenarios, while the former provide information about all components by relating bMSCs. Moving information from state labels to hMSCs allows for a large number of possible views that vary from long bMSCs that start at the system's initial state to short bMSCs that optimise reuse. These views can allow stakeholders to gain more insight into their own MSC specifications, or be used by designers to show the impact of their changes to behavioural models in a language that stakeholders can manage.
