Abstract
Transforming Unified Modelling Language (UML) models into a formal representation to check certain properties has been addressed many times in the literature. However, the lack of automatic formalization for executable UML models and provision of model checking results as modeller-friendly feedback has inhibited the practical use of such approaches in real life projects. In this paper, we address those issues by performing the automatic formalization of the Foundational subset for executable UML (fUML) models into communicating sequential processes without any interaction with the modeller, who should be isolated from the formal methods domain. The formal analysis provides the modeller with a UML sequence diagram that represents the model checking result in the case where an error has been found in the model. This work also considers the formalization of systems that depend on asynchronous communication between components in order to allow checking of the dynamic concurrent behaviour of systems. We have designed a comprehensive framework that is implemented as a plugin to MagicDraw (the CASE tool we use) that we call Compass. The framework depends on Epsilon as a model transformation tool that utilizes the Model Driven Engineering approach. It also implements an optimization approach to be able to model check concurrent systems using FDR2, and at the same time comply with the fUML inter-object communication mechanism. In order to validate our framework, we have checked a Tokeneer fUML model against deadlock.
Introduction
Formal methods benefit from their mathematically rigorous representation, which enables automatic analysis using model checkers and theorem provers. However, not many software engineers (modellers) have the specialist mathematical knowledge to model their industrial size systems formally. On the other hand, semi-formal modelling notations, such as Unified Modelling Language (UML) [24], are easy for software engineers to use and understand, making UML the de-facto standard for modelling object oriented systems. The impossibility of automated analysis or checking of UML models, however, makes it very risky to use UML in modelling safety-critical systems.
Much work has been done to make use of the two domains' advantages (formal and semi-formal modelling) by letting the modeller develop the system model using UML and then automatically transforming it to a formal representation which can be checked against certain properties. Throughout the paper we will refer to this process as "formalization".
By reviewing and analyzing the previous work (refer to Sect. 9 for more details) we have observed several issues that we consider to be the main barriers to the practical use of UML formalization in real life projects. First, there is no comprehensive framework that isolates the modeller from dealing with the formal methods and at the same time integrates with the current CASE tools. This isolation requires providing the modeller with modeller-friendly debug feedback in case of a problem in the checked model. Second, asynchronous inter-object communication has been addressed rarely in this field, yet in many systems this kind of communication is preferred due to its simplicity and modularity compared to other ways of communication that require tight synchronization between the system's objects (e.g., using a clock). Finally, using UML as a semi-formal language requires tremendous effort to formalize such a huge standard, which has been developed mainly to provide modellers with a multi-view modelling approach. Moreover, formalizing UML models cannot be a direct process because the excessive flexibility of UML increases the gap between a UML model and the corresponding formal model.
The main originality of our work comes from addressing the aforementioned issues. We propose a comprehensive framework that uses Foundational subset for executable UML (fUML) [25] as a semi-formal modelling language. Compared to UML, fUML is a more restricted subset of the UML2 standard that has well defined structural and behavioural semantics. Our framework also isolates the modeller from the formal methods domain through the whole model checking cycle, from the beginning until providing the modeller with a (modeller-friendly) UML sequence diagram that describes a problem scenario (if found). We have implemented this framework as a plugin that integrates with MagicDraw, the CASE tool we use in this work.
We also consider in this work the formalization of the asynchronous communication mechanism between the system objects. We took the well defined specification of the inter-object communication in the fUML standard and formalized it in communicating sequential processes (CSP) [14] . Although the standard was clear in defining this mechanism, it left the event dispatch scheduling (how are signals processed when received?) as a semantic variation point to be defined by the fUML execution engine implementor. The formalization of this point allowed us to test different interpretations.
Having the inter-object communication mechanism formalized allowed for checking overall system behaviours. In this paper, we will focus on deadlock freedom only as a sample system behaviour to check. We also chose the Tokeneer project [3] as a case study to validate our framework. This paper extends our previous paper [1] on this area; it introduces the formalization framework that automates the transformation process using Epsilon [17] as a Model Driven Engineering (MDE) framework. We developed a group of Epsilon transformation rules which depend on the available fUML [25] and CSP [32] meta-models. This paper also considers the automatic generation of a sequence diagram that represents the counter-example in case of deadlock.
The rest of this paper is organized as follows. In Sect. 2, we give a brief background about the fUML standard and CSP. In Sect. 3, we introduce the Tokeneer project as the case study used in this work. In Sect. 4, we give an overview of the formalization framework. In Sect. 5, we describe the Model Formalizer, the most important component in the framework. In Sect. 6, we describe the role of FDR2 in checking the model against deadlock. In Sect. 7, we describe how the framework automatically provides modeller-friendly feedback. In Sect. 8, we outline the implementation of the framework as a plugin to MagicDraw. Finally, we discuss related work and conclude in Sects. 9 and 10, respectively.
Background

fUML
As defined by OMG, fUML acts as an intermediary between "surface subsets" of UML models and platform executable languages (e.g., Java) [25] . fUML models are executable models, which means they can be used by code-generators to generate full executable code directly from the models, or model-interpreters that rely on a virtual machine to directly read and run the models (e.g., fUML Reference Implementation [19] ).
The fUML specification is a subset of the original UML2 specification [24] . This subset was defined by specifying modifications to the original abstract syntax (of UML2) of the class and activity diagrams. These modifications are specified in clause 7 of the standard [25] by merging/excluding some packages in the UML2 specification, as well as adding new constraints.
As defined in the fUML standard, we are listing below some of the modifications to UML2 that are relevant to our case study (Tokeneer ID Station) fUML model. All of those modifications are related to the fUML activity diagrams since our goal is to capture the behaviour of our model.
- Central buffer nodes are excluded from fUML because they were judged to be unnecessary for the computational completeness of fUML.
- Variables are excluded from fUML because the passing of data between actions can be achieved using object flows.
- Exception handlers are not included in fUML because exceptions are not included in fUML.
- Opaque actions are excluded from fUML since, being opaque, they cannot be executed.
- Value pins are excluded from fUML because they are redundant with the use of value specifications to specify values.
The operational semantics of fUML is an executable model with methods written in Java, with a mapping to UML activity diagrams. The declarative semantics of fUML is specified in first order logic and based on Process Specification Language (PSL) [11].
Inter-object communication mechanism in fUML
This part gives an overview of the semantics of the inter-object communication in fUML as defined by clause 8 in the standard [25]. Such communication is conducted between active objects only. Active objects in fUML communicate asynchronously via signals. Each active object is associated with an object activation which handles the dispatching of asynchronous communications received by its active object. Figure 1 shows the structure related to object activation.
Object activation maintains two main lists: the first list (event pool) holds the incoming signal instances waiting to be dispatched, and the second list (waiting event accepters) holds the event accepters that have been registered by the executing classifier behaviour. Event accepters are allowable signals with respect to the current state of the active object.
The fUML standard permits the specifier (tool implementer) to define a suitable dispatching mechanism for signals within the event pool (semantic variation point). The default dispatching behaviour, as described in [25] , dispatches events on a first-in first-out (FIFO) basis.
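The structure described above can be illustrated with a minimal Python sketch. The class and method names (`ObjectActivation`, `send`, `register`, `dispatch`) are hypothetical and are not taken from the fUML specification or from our framework; the sketch assumes only the two lists and the default FIFO dispatching described in this section, and it further assumes that a matched accepter is consumed once it fires.

```python
from collections import deque


class ObjectActivation:
    """Illustrative model of an fUML object activation (hypothetical names)."""

    def __init__(self):
        self.event_pool = deque()        # incoming signal instances, oldest first
        self.waiting_accepters = set()   # signals acceptable in the current state

    def send(self, signal):
        """Asynchronous delivery: the sender only appends to the pool."""
        self.event_pool.append(signal)

    def register(self, *signals):
        """The classifier behaviour registers the signals it is waiting for."""
        self.waiting_accepters = set(signals)

    def dispatch(self):
        """Default FIFO dispatch: take the oldest signal and try to match it.

        Returns (signal, matched), or None when the event pool is empty.
        """
        if not self.event_pool:
            return None
        signal = self.event_pool.popleft()
        matched = signal in self.waiting_accepters
        if matched:
            # Assumption of this sketch: a matched accepter is consumed.
            self.waiting_accepters.clear()
        return signal, matched
```

For example, if an object has registered only `unlockLatchSignal` and receives `doorIsOpenSignal` first, the first call to `dispatch()` yields an unmatched signal and the second call yields the match.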
CSP
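CSP [14] is a modelling language that allows the description of systems of interacting processes using a few language primitives. Processes execute and interact by means of performing events drawn from a universal set $\Sigma$. Some events are of the form $c.v$, where $c$ represents a channel and $v$ represents a value being passed along that channel. Our UML/fUML formalization considers the following subset of the CSP syntax:
\[
P ::= a \rightarrow P \mid c?x \rightarrow P(x) \mid c!v \rightarrow P \mid P_1 \mathbin{\Box} P_2 \mid P_1 \sqcap P_2 \mid P_1 \mathbin{{}_{A}\|_{B}} P_2 \mid P_1 \mathbin{\|_{A}} P_2 \mid P \setminus A \mid \mathbf{let}\ N_1 = P_1, \ldots, N_n = P_n\ \mathbf{within}\ N_i \mid \mathbf{if}\ b\ \mathbf{then}\ P_1\ \mathbf{else}\ P_2
\]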
The CSP process $a \rightarrow P$ initially allows event $a$ to occur and then behaves subsequently as $P$. The input process $c?x \rightarrow P(x)$ will accept a value $x$ along channel $c$ (corresponding to performance of the event $c.x$) and then behave subsequently as $P(x)$. The output process $c!v \rightarrow P$ will output $v$ along channel $c$ (corresponding to performance of the event $c.v$) and then behave as $P$. Processes interact by synchronising on the events $c.v$. Channels can have any number of message fields, as a combination of input and output values, for example $c!v?x : E \rightarrow P(x)$, where $x$ is constrained to be a value from the set $E$.
The choice $P_1 \mathbin{\Box} P_2$ offers an external choice between processes $P_1$ and $P_2$ whereby the choice is made by the environment. Conversely, $P_1 \sqcap P_2$ offers an internal choice between the two processes.
The parallel combination $P_1 \mathbin{{}_{A}\|_{B}} P_2$ executes $P_1$ and $P_2$ in parallel. $P_1$ can perform only events in the set $A$, $P_2$ can perform only events in the set $B$, and they must simultaneously engage in (i.e., synchronize on) events in the intersection of $A$ and $B$. The interface parallel $P_1 \mathbin{\|_{A}} P_2$ requires synchronization only on those events in the common set (interface) $A$. The process $P \setminus A$ behaves like $P$ except that the events from $A$ have been internalized. In other words, all these events are removed from the interface of the process and no other process will be able to engage with them. The $\mathbf{let} \ldots \mathbf{within}$ statement defines $P$ with local definitions $N_i = P_i$. The conditional choice $\mathbf{if}\ b\ \mathbf{then}\ P_1\ \mathbf{else}\ P_2$ behaves as $P_1$ or $P_2$ depending on the evaluation of the condition $b$.
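To make the operator descriptions above concrete, the following Python sketch models a small fragment of them (prefix, external choice and interface parallel) and searches for the shortest trace that ends in deadlock. It is only an illustration under simplifying assumptions: a process is represented as a function returning the events it currently offers, internal choice and hiding are omitted, and all function names are hypothetical. It does not reflect how FDR2 works internally.

```python
STOP = lambda: {}   # a process that offers no events at all


def prefix(event, cont):
    """a -> P: offer `event`, then behave as `cont`."""
    return lambda: {event: cont}


def ext_choice(p, q):
    """P1 [] P2: the environment picks among the initial events of both.

    Simplification: if both sides offer the same event, q's branch wins.
    """
    return lambda: {**p(), **q()}


def par(p, q, sync):
    """Interface parallel: synchronise on events in `sync`, interleave the rest."""
    def offers():
        pm, qm = p(), q()
        out = {}
        for e, p2 in pm.items():
            if e not in sync:
                out[e] = par(p2, q, sync)        # p moves alone
            elif e in qm:
                out[e] = par(p2, qm[e], sync)    # both move together
        for e, q2 in qm.items():
            if e not in sync:
                out[e] = par(p, q2, sync)        # q moves alone
        return out
    return offers


def first_deadlock(proc, depth):
    """Shortest trace (at most `depth` events) after which nothing is offered."""
    frontier = [((), proc)]
    for _ in range(depth + 1):
        nxt = []
        for trace, p in frontier:
            offers = p()
            if not offers:
                return trace
            nxt.extend((trace + (e,), p2) for e, p2 in offers.items())
        frontier = nxt
    return None


# Example: P = a -> b -> P and Q = a -> c -> Q, synchronising on {a, b, c}.
def P():
    return prefix("a", prefix("b", P))()


def Q():
    return prefix("a", prefix("c", Q))()


SYSTEM = par(P, Q, {"a", "b", "c"})
```

In this example, `first_deadlock(SYSTEM, 5)` returns `("a",)`: after the two processes synchronise on `a`, `P` insists on `b` while `Q` insists on `c`, so no further event is possible.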
Tokeneer: case study introduction
The Tokeneer project [3] is one of the most interesting pilot projects forming part of the Verified Software Initiative [15], and has been cited by the US National Academies as exemplifying best practice in software engineering [16]. The project was certified to Common Criteria Level 5, achieving Levels 6 and 7 in the areas of specification, design and implementation. The Tokeneer project re-developed one component of a Tokeneer system that was developed by the National Security Agency (NSA) to provide protection to secure information held on a network of workstations situated in a physically secure enclave. A survey of other projects using formal methods is presented in [37].
The entire project archive has been released [2] for experimentation by researchers. This includes the project specifications written in Z [4] and an open source implementation. Woodcock and Aydal [36] have conducted several experiments using model-based testing techniques to discover 12 anomalous scenarios which challenged the dependability claims for Tokeneer as a security-critical system. Several of the scenarios highlight the importance of the behaviour of the user because one of the security objectives for Tokeneer is to prevent accidental, unauthorized access to the enclave by a user. The user was not formally modelled in the Z specification [2] . We also note the importance of modelling the user in our analysis.
Our motivation for using the Tokeneer project as a case study was not to re-validate the project but rather to investigate the concurrent behaviour of the various components of the Tokeneer ID station (TIS) subsystem in the context of asynchronous communication.
The correspondence between the Tokeneer formal specifications [2] and our Tokeneer fUML model is not a one-to-one relationship. Our Tokeneer fUML model contains more implementation details that are abstracted in the Tokeneer Z specifications. Therefore, our formal analysis benefits from being able to examine the low level details of asynchronous communication. Such an analysis allows us to investigate potential deadlocks which might occur if the formal specifications were implemented using such communication mechanisms.
TIS subsystem structure
The components of interest in the TIS subsystem are represented on the class diagram in Fig. 2 . We do not formalize the class diagram, and its inclusion is just to illustrate the relationship between the system's components.
Door: This is the physical enclave's door that the user opens to access the secure enclave. It has no intelligent behaviour as it is entirely controlled by the door controller component. The two main attributes of this component are: isOpen attribute which indicates the status of the door (opened or closed), and the isLocked attribute which indicates the status of the door's latch (locked or unlocked).
Door controller: This component controls the door's latch status (isLocked) by setting its value based on the incoming signals from the User Panel. It also manages two timers: the first timer watches if the door is kept closed and unlocked, and the second timer watches if the door is kept opened and locked.
User: This component models the user's behaviour toward the system. The user is responsible for requesting entry to the enclave, and for opening the door once it has been successfully unlocked by the User Panel. The user is also responsible for closing the door after accessing the enclave. The system may serve more than one user at the same time. However, the results in this paper focus on a single user only.
User panel: This component models the behaviour of the panel with which the user interfaces to gain access to the enclave. It is responsible for deciding whether the user is allowed to access the enclave or not.
Alarm: This component holds the status of the alarm (alarming or silent), based on the setting/resetting by the Door Controller component to the isAlarming attribute.
TIS subsystem behaviour
In the Tokeneer fUML model all objects (of the above classes) which have interesting behaviour have associated activity diagrams. The Alarm object is a simple data holder and thus no activity diagram is associated with it. For the purpose of this paper, we choose to focus on a segment of the Door Controller activity (depicted in Fig. 3), which includes all the elements described in Sect. 5.1. Initially, the Door Controller waits for the unlockLatchSignal to be sent by the User Panel when the User requests entry to the enclave and is authorized to do so. When receiving this signal, the Door Controller changes the status of the Door's lock to Unlocked by setting the attribute isLocked to FALSE. Consequently, the Door Controller sends unlockingDoorCompleteSignal to the User Panel to indicate the completion of the Door unlocking. At this point, the Door Controller starts a timer to watch whether the User fails to open the Door after getting the permission for entry. The two possible scenarios for the timer expiry (lockTimeoutExceeded or lockTimeoutNotExceeded) are represented as an internal decision. The lockTimeoutNotExceeded choice corresponds to the door opening within the allowed time. If the timer times out, the Door Controller sends the lockLatchSignal to itself to change the Door's lock status to Locked. Otherwise, the Door Controller will accept the doorIsOpenSignal from the Door's object to continue its normal behaviour until sending the entryAuthorizedSignal to the User's object.
Framework overview
In this work we propose a framework that allows fUML models to be formalized to CSP automatically and checked for deadlock using FDR2. The framework also translates FDR2 output to a modeller-friendly format (UML sequence diagram). Figure 4 shows the overall architecture of this framework and the components used.
Initially, the modeller develops the system fUML model using the CASE tool (MagicDraw). The model should include an fUML activity diagram for each active class in the system to describe its behaviour. Based on a feature in the CASE tool, the framework exports the fUML model into the XML Metadata Interchange (XMI) [23] format, so that it can be read by any MDE framework for transformation.
At this point, the Model Formalizer reads the fUML model (represented in XMI) and transforms it to a CSP script based on the available fUML [34] and CSP [32] meta-models. The Model Formalizer uses the Epsilon model management framework to perform the model-to-model and model-to-text tasks. The generated CSP script contains a process for each active class in the system, as well as a formalization for the inter-object communication mechanism to allow those processes to communicate with each other asynchronously via signals. The Model Formalizer also generates an Object-to-Class mapping table, which will be used for traceability to relate the modeller-friendly feedback to the original fUML model. In the case of a problem during the formalization process (e.g., an fUML activity diagram without a connected initial node cannot be formalized), the Model Formalizer generates the Formalization Report which reports the error(s) in the fUML model which led to this problem.
Subsequently, the framework launches FDR2 to check the generated CSP script for deadlocks. In case of deadlock, FDR2 generates a counter-example which includes the trace (sequence of events) that led to the deadlock. The UML Sequence Diagram Generator reads this counter-example and visualizes it in the form of a UML sequence diagram, making use of the information stored in the Object-to-Class mapping table. The generated sequence diagram represents the deadlock scenario in a modeller-friendly format which visualizes the object interactions in chronological order.
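The step from a counter-example trace to a sequence diagram can be sketched as follows. This is a hedged illustration, not the generator implemented in the framework: the event layout `send.<sender>.<receiver>.<signal>` is an assumed format, the function name is hypothetical, and the output uses PlantUML text notation as a stand-in for the UML sequence diagram that the framework produces inside MagicDraw.

```python
def trace_to_plantuml(trace, object_to_class):
    """Render the `send` events of a trace as a PlantUML sequence diagram.

    trace           -- list of event strings in chronological order
    object_to_class -- mapping from instance handler to class name
    """
    participants = []   # instance handlers in order of first appearance
    messages = []
    for event in trace:
        parts = event.split(".")
        # Assumed layout: send.<sender>.<receiver>.<signal>
        if len(parts) != 4 or parts[0] != "send":
            continue
        _, sender, receiver, signal = parts
        for obj in (sender, receiver):
            if obj not in participants:
                participants.append(obj)
        # "->>" draws an open arrow head, used here for asynchronous signals.
        messages.append(f"{sender} ->> {receiver} : {signal}")

    lines = ["@startuml"]
    for obj in participants:
        cls = object_to_class.get(obj, "Unknown")
        lines.append(f'participant "{obj} : {cls}" as {obj}')
    lines.extend(messages)
    lines.append("@enduml")
    return "\n".join(lines)
```

The mapping table supplies the class name shown next to each lifeline, which is the traceability role described above; events other than `send` are ignored in this sketch.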
The following sections provide more detail regarding each component included in this framework.
The Model Formalizer
The main functionality of the Model Formalizer component is to translate the input fUML model to CSP. The component achieves this translation in three stages:
1. Translating the fUML activity diagrams into CSP processes.
2. Generating CSP processes that represent the inter-object communication mechanism.
3. Combining all the previous CSP processes into one single process that represents the whole system (SYSTEM).
The following sections describe each of those stages and how the Model Formalizer automates the formalization.
fUML activity diagrams formalization
We perform the translation from fUML activity diagrams to CSP based on a collection of mapping rules. Table 1 shows the fUML activity diagram's elements and the corresponding CSP representation that reflects the semantic behaviour for each element.
In the mapping rules, aIH and bIH represent the instance handlers of the sender and receiver objects, respectively. Instance handlers are used to uniquely identify each object in the system and are included in all the CSP events. The values rp1 and rp2 in Rules (3) and (4) represent the registration points where the object (bIH) is waiting to accept the signal instance sig1 (for rp1), or any of sig1, sig2, or sig3 (for rp2). Each AcceptEventAction in an fUML activity diagram (e.g., Fig. 3) is associated with a unique registration point.
Mapping from UML activity diagrams to CSP has been addressed several times in the literature [39, 40] . The novel points of our mapping are as follows:
Rule (1) maps the fUML activity to a parent CSP process with several parameters (param1, param2, . . .). Within this process we define sub-processes, each of which acts as a different fUML element within this activity. The within statement defines the action (sub-process) connected to the initial node (AC1). Rules (2) and (3) map the SendSignalAction and AcceptEventAction to the CSP parameterized events send and accept, respectively. The registerSignals event is used to let the object activation fill the waiting event accepters list with the signals that are allowed to be accepted at this point (registration point). The value rp1 is explicitly included in the event so that each AcceptEventAction is uniquely identified. Without those registration points, the model checker would not be able to identify the possible signals to be accepted by the accept event. Section 5.2 describes how those events synchronize with the object's buffer process to allow the asynchronous communication between processes (active objects).
The fUML standard supports the fact that the AcceptEventAction handles more than one signal at a time. When the control flow of the activity reaches this action, the object waits for any of the defined signals (sig1, sig2, or sig3) to be received. If any of those signals arrive, the object execution proceeds and the incoming signal instance is passed to the AcceptEventAction output pin. For that reason, in Rule(4), we connect the decision node to the action's output pin to branch the flow based on the incoming signal. We use the same concept of Rule(3) followed by an external choice to represent the branching semantics. Rules like (2), (3), and (4) are not presented in [39, 40] because the focus there is not on interaction between activity diagrams.
Rule (5) maps the combination of the actions valueSpecificationAction and addStructuralFeatureValueAction to two events to allow (for example) the aIH instance handler's attribute isOpen to be set to FALSE. We represent the decision node as an internal choice (as in Rule (6)) when the incoming edge to the decision node is a control flow, but as an external choice (as in Rule (4)) when the incoming edge is an object flow. Having the decision nodes in the fUML standard allowed for modelling internal decisions, which was not possible using Executable UML (xUML) [18].
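The flavour of Rules (2), (3), (4) and (6) can be sketched as small text generators. This is a hedged sketch only: the exact field layout of the send, accept and registerSignals events is an assumption made for illustration and may differ from the rules in Table 1, the helper names are hypothetical, and the output uses machine-readable CSP operators (`->` for prefix, `[]` for external choice, `|~|` for internal choice).

```python
def send_signal(name, sender, receiver, signal, nxt):
    """Rule (2) flavour: a SendSignalAction becomes a send event."""
    return f"{name} = send.{sender}.{receiver}.{signal} -> {nxt}"


def accept_event(name, obj, rp, branches):
    """Rules (3)/(4) flavour: register at point `rp`, then accept a signal.

    branches -- dict mapping each acceptable signal to the next sub-process;
                more than one entry yields an external choice.
    """
    choices = " [] ".join(
        f"accept.{obj}.{rp}.{sig} -> {nxt}" for sig, nxt in branches.items()
    )
    if len(branches) > 1:
        choices = f"({choices})"
    return f"{name} = registerSignals.{obj}.{rp} -> {choices}"


def control_flow_decision(name, targets):
    """Rule (6) flavour: a decision fed by a control flow is an internal choice."""
    return f"{name} = " + " |~| ".join(targets)
```

For instance, `accept_event("AC1", "selfObj", "rp1", {"unlockLatchSignal": "AC2"})` produces `AC1 = registerSignals.selfObj.rp1 -> accept.selfObj.rp1.unlockLatchSignal -> AC2`, where the registration point precedes the accept event as described above.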
The mapping rules scope
It is obvious that the mapping rules do not support all the fUML standard elements, and for the chosen elements not all the properties are considered in the formalization. This part discusses the rationale behind the inclusion and the exclusion for some elements.
The formalization rules include all the fUML elements that have been used in the Tokeneer fUML model (our primary case study), and the chosen properties for each element are sufficient to check deadlock freedom between the communicating active objects. This explains why we have excluded elements such as Activity Final Nodes from the formalization, especially since dynamic object creation and destruction are not supported in this work. Also, formalizing unnecessary properties would lead to a complicated CSP model that FDR2 might fail to check. For example, the formalization of the addStructuralFeatureValueAction considers the assignment of unordered boolean structural features only.
Some of the excluded fUML elements such as Fork and Join nodes are appropriate to use when modelling the concurrent behaviour within the active object. We will show in Sect. 5.2.2 that modelling the concurrent behaviours is considered in our formalization but only between the active objects which are communicating with each other asynchronously.
As we are constrained to CSP as the formal representation, some aspects of the fUML standard cannot be formalized directly, such as the Join nodes which are used to combine multiple/parallel flows in the activity diagram into one flow. That is mainly because parallel processes in CSP can only synchronize on some events; their behaviours cannot be combined to act as one process.
The fUML standard includes many intermediate actions such as: Read Structural Feature, Write Structural Feature and Test Identity actions. Our framework is flexible enough to support adding more formalization rules for such actions.
However, some actions such as Create/Destroy Objects require adding additional processes to the CSP model to handle the objects management tasks.
Formalization automation
We use Epsilon as an MDE framework to do the transformation from the source model (fUML) to a CSP script. The transformation is done in two stages: firstly, a Model-to-Model transformation from the fUML model to a CSP model using the Epsilon Transformation Language (ETL), and secondly, a Model-to-Text transformation from the CSP model to a CSP script using the Epsilon Generation Language (EGL) [17]. The Model-to-Model transformation includes all the rules shown in Table 1 represented in ETL. Epsilon performs the transformation based on the source/target meta-models. In this work, we use the available UML2 meta-model [34] and the CSP meta-model used in our previous work [32]. Figure 5 illustrates a sample ETL rule (Rule (1)) and segments of the meta-models involved in this rule. The first meta-model segment (fUML) shows that each Activity in the fUML model can have many ActivityNodes, and the ActivityParameterNodes are a kind of those nodes. This small segment is sufficient to understand the ETL representation of Rule (1) from the fUML aspect. Similarly, the second meta-model segment (CSP) shows that each LocalizedProcess holds mainly a ProcessAssignment entity which relates the ProcessID (e.g., AC1) with the ProcessExpression (the expression after the "=" operator).
The execution of this ETL rule (Activity To Localized Process) applies the mapping shown in Rule(1) in Table 1 , as it transforms any activity in the fUML source model (activity) to a CSP localized process (locProc) and all its related elements (ProcessAssignment, ProcessID and ProcessParameterList). The actions and the nodes inside the fUML activity are translated using the other mapping rules.
The fUML and CSP model elements can be accessed using the variables AD and CSP, respectively, using the '!' operator. The for loop and the nested if condition in the rule's body are concerned with the activity parameter nodes (ActivityParameterNode) that should be represented as ProcessParameterListItems in the CSP model. Inside the loop, the rule sets the items' names, adds them to the ProcessParameterList (ppl) and adjusts the ppl size. After the loop, the rule sets the CSP ProcessID (pid) name to the activity name augmented with '_Proc' and then associates the CSP elements with each other. The reader can refer to [17] for more details about the Epsilon ETL language.
Table 1 The fUML to CSP mapping rules
The Model Formalizer uses Epsilon to execute all the ETL rules followed by the EGL script to perform the Model-to-Text transformation, which generates a comprehensive CSP script that represents the behavioural semantics of the source fUML model.
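The two-stage transformation described above can be sketched in Python for Rule (1) alone. This is not the ETL or EGL code used by the Model Formalizer: the dataclasses are simplified stand-ins for the meta-model segments, the function names are hypothetical, and the `_Proc` suffix and parameter handling follow the textual description of the rule.

```python
from dataclasses import dataclass, field


@dataclass
class ActivityNode:
    name: str


@dataclass
class ActivityParameterNode(ActivityNode):
    pass


@dataclass
class Activity:
    name: str
    nodes: list = field(default_factory=list)


@dataclass
class LocalizedProcess:
    process_id: str                                  # stands in for ProcessID
    parameters: list = field(default_factory=list)   # ProcessParameterList


def activity_to_localized_process(activity):
    """Model-to-Model step: one Activity becomes one LocalizedProcess."""
    params = []
    for node in activity.nodes:
        # Only parameter nodes become process parameters.
        if isinstance(node, ActivityParameterNode):
            params.append(node.name)
    return LocalizedProcess(process_id=activity.name + "_Proc", parameters=params)


def to_csp_header(proc):
    """Model-to-Text step (header only): emit the left-hand side of the process."""
    if proc.parameters:
        return f"{proc.process_id}({', '.join(proc.parameters)}) ="
    return f"{proc.process_id} ="
```

Keeping the model-to-model and model-to-text steps separate, as the framework does, means that the text layout can change without touching the mapping rules.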
Tokeneer fUML activity diagrams formalization
As mentioned in Sect. 3, our motivation is not to re-validate the Tokeneer project but to use it as a case study, primarily to validate our framework and secondly to study the fUML model behaviour in the context of asynchronous communication as a possible implementation for the Tokeneer Z specifications. This section shows a sample output from the Model Formalizer when using the Tokeneer fUML model as an input. Figure 6 shows the Door Controller CSP process (DoorControllerActivity_Proc) that represents the behavioural semantics of the DoorControllerActivity depicted in Fig. 3.
As a direct application of Rule (1), the DoorControllerActivity is translated to the DoorControllerActivity_Proc CSP localized process with the corresponding parameters. AC2, AC8 and AC10 are generated by Rule (5). Applying Rule (6) on the timer expiry decision node resulted in the internal decision in DS1.
When the process registers (using registerSignals event) and accepts (using accept event) the unlockLatchSignal in AC1, this means that the process is ready to accept this signal when it is placed in its object's (selfObj) event pool. On the other hand, when the send event in AC4 happens, the unLockingDoorCompleteSignal will be placed in the User Panel object's (upObj) event pool. The mechanism that allows for signals sending/accepting is described in more detail in the following sections.
Representing the fUML activity as a localized process (using let · · · within statement) with a sub-process for each action makes the CSP process more readable and the transformation task easier. This style also allows for recalling the same action several times without repetition.
Inter-object communication formalization
In the second stage, the Model Formalizer formalizes the inter-object communication semantics (described in Sect. 2.1) into CSP. However, having the events dispatching scheduling as one of the fUML standard semantic variation points led to different interpretations and thus different performances and results for the model checking using FDR2.
The initial attempts of the events dispatching formalization
We have conducted several attempts to formalize the events dispatching scheduling before reaching the current implementation. Although all of the attempts are compatible with the fUML standard, each of them implements the semantic variation point in a different way. The events dispatching scheduling is mainly controlled by the representation of the event pool. Among those attempts, we outline below two of them:
In the first attempt, we represented the event pool as a bag, which means that any signal can be dispatched from it arbitrarily. The main problem in this representation was that it does not preserve the order of the incoming signals. Also, when the bag becomes full, any incoming signal is dismissed, which quickly leads to an invalid deadlock. The effect of the latter problem can be reduced by increasing the bag's size. However, with this representation, FDR2 failed to compile the CSP script when the bag's size was larger than 4 slots (for the Tokeneer case study), which in practice is too small to keep the system alive.
Fig. 6 The corresponding CSP process for the Door Controller activity segment (i.e., the translation of Fig. 3)
In the second attempt we represented the event pool as a queue, which preserves the FIFO order of the incoming signals. This is the default fUML strategy for dispatching events from the event pool. Using the queue solved the problem of the non-deterministic dispatching of signals from the event pool, preserving the order of the incoming signals. However, a new problem was introduced when an object receives an unexpected signal (i.e., one not matched to any of the waiting event accepters). In this situation, the object dismisses the incoming signal directly because it has already been removed from the event pool for matching and the fUML standard does not allow signals to be returned to the event pool. In many cases the object may need to accept this dismissed signal after a few further actions, which generally leads to an invalid deadlock in the system.
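The difference between the two attempts can be illustrated with a short Python sketch. It is a simplified abstraction rather than the CSP used in either attempt: the function names are hypothetical, the bag is modelled only by the set of dispatch orders it permits, and the object in the queue model is assumed to wait for one signal at a time.

```python
from itertools import permutations


def bag_dispatch_orders(pool):
    """Attempt 1: a bag may dispatch its signals in any order."""
    return set(permutations(pool))


def fifo_run(pool, expected):
    """Attempt 2: FIFO dispatch against accepters that wait in a fixed order.

    pool     -- signals in arrival order (oldest first)
    expected -- signals the object waits for, one at a time, in order
    Returns (accepted, dismissed, still_waiting_for).
    """
    accepted, dismissed = [], []
    waiting = list(expected)
    for signal in pool:
        if waiting and signal == waiting[0]:
            accepted.append(waiting.pop(0))
        else:
            # Unmatched signals are dismissed and never return to the pool.
            dismissed.append(signal)
    return accepted, dismissed, waiting
```

If `doorIsOpenSignal` arrives before `unlockLatchSignal` while the object expects them in the opposite order, `fifo_run` dismisses `doorIsOpenSignal` and the object is left waiting for a signal that will never be dispatched, which corresponds to the invalid deadlock described above.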
In the following sections we will describe the representation of the event pool as a Controlled Buffer in CSP which is the most optimized implementation (compared to the initial attempts) that led to the minimum compilation and checking time.
The event pool list as a Controlled Buffer
In the current implementation, the event pool is represented as a Controlled Buffer (described below). The Controlled Buffer is defined using only the CSP primitives (parallel composition, prefix, etc.) and avoids the Haskell functions that can be used to allow functional definitions within process definitions. Although FDR2 allows such functions, they significantly degrade its performance during compilation and slow down the model checking; a pure CSP definition makes the model checking faster. The current implementation also maintains the order in which signals were sent and provides a scalable solution for the event pool size. The idea of this implementation came from Michael Goldsmith and P. Armstrong (personal communication, 2010).
The Controlled Buffer consists of a sequence of nodes, where each node holds one signal at a time. When adding a new signal to the buffer, it is placed in the first empty node on a queue basis. Signals can be removed from any slot of the buffer (not on a queue basis). However, when selecting the signal to be removed, the buffer controller checks the oldest signal first (i.e., the signal that matches the selection criteria and at the same time spent the longest time in the buffer). All the signals located after the removed signal are shifted up when it is removed. When the buffer becomes full, the controller drops the oldest signal in the buffer and shifts all the other signals. Figure 7 shows the general structure of a Controlled Buffer consisting of N consecutive nodes.
Fig. 7 The event pool as a Controlled Buffer
When an object sends a signal to another object (performs the send event), the signal is placed in the receiver object's buffer (event pool) by placing it in the first node (B0), then the signal will move down the chain automatically until reaching the rightmost node in the buffer. The same will be repeated for any other incoming signal filling the buffer from right to left. When the buffer is full, the accepting of a new signal will result in the signal in the rightmost node (oldest signal) being dropped out (drop event) and all the signals shifted right by one node. Signals are moved down as a parameter to the c1, c2, …, cN events. According to the fUML standard, the dropped signals cannot be returned back to the event pool, and thus will never reach the destination.
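The observable behaviour of the Controlled Buffer described above can be sketched in Python as follows. This is an abstraction under stated assumptions, not the CSP definition: the class `ControlledBuffer` and its method names are hypothetical, and the chain of nodes is flattened into a list whose index 0 holds the oldest signal.

```python
class ControlledBuffer:
    """Abstract model of the event pool; index 0 of `signals` is the oldest signal."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.signals = []
        self.dropped = []  # dropped signals never reach the receiver

    def send(self, signal):
        """Add a signal; when the buffer is full, the oldest signal is dropped first."""
        if len(self.signals) == self.capacity:
            self.dropped.append(self.signals.pop(0))
        self.signals.append(signal)

    def accept(self, waiting):
        """Remove and return the oldest signal that matches a waiting event accepter."""
        for index, signal in enumerate(self.signals):
            if signal in waiting:
                # pop() closes the gap, i.e. later signals shift up by one slot.
                return self.signals.pop(index)
        return None  # nothing matches; unmatched signals stay in the buffer


buf = ControlledBuffer(capacity=3)
for name in ["a", "b", "c", "d"]:
    buf.send(name)

accepted = buf.accept(waiting={"c", "d"})
unmatched = buf.accept(waiting={"x"})
```

Unlike the plain FIFO queue, a signal that does not match the current accepters is left in place and can still be accepted later.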
As will be outlined below, the receiver object uses the testY event (where Y represents the current node: A, B, . . .) to check if the contained signal is a member of the object's waiting event accepters list. If so, the signal is removed from the event pool via the acceptY event; otherwise the rejectY event is enabled to allow checking the next node. We represent each of those nodes as a mutually recursive CSP process with a simple logic illustrated in Fig. 8 for the first node (B0) and the general node (B). Notice the example possible values for the process parameters between square brackets.
The processes B0 and B represent the node when it is empty, while B1 and B2 represent them when the node is holding a signal. In B0 and B the only allowed event is c, which fills the node with the signal passed in its parameter (x). In B1 and B2 the held signal (x) can either be passed to the next node (d) or tested (g) by the buffer controller for acceptance (e) or rejection (h). If the c event happens in B1, the oldest signal will be dropped (f) and then the d event will be allowed to shift the signals to the right.
As the buffer consists of a sequence of nodes, we combine the previous processes (B0 and B's) in parallel to form a new process (CB NODES) that represents the Controlled Buffer (event pool) but without being controlled yet. Figure 9 shows the CSP representation of a three-node buffer which can hold three signal instances at a time. The process CB NODES is defined using one B0 process and two B processes whose parameters are instantiated appropriately. The functionality of chase will be described in Sect. 5.2.4.
Controlling the buffer
To dispatch signals in the same order in which they were sent, we developed a controller process (CB CTRL) that checks the nodes one by one, from the oldest (rightmost) to the newest (leftmost), before removing a signal from the event pool. If the signal exists in the waiting event accepters list, the process allows its acceptance (accept event); otherwise the signal is rejected (reject event) and the next node is checked. Figure 10 shows our representation of the buffer controller process (CB CTRL) for a three-node event pool.
The getRegisteredSignals function is a mapping function that returns the allowed signal(s) at a certain registration point (rp). For example, in the Door Controller activity, getRegisteredSignals(rp2) returns lockLatchSignal and doorIsOpenSignal. The registerSignals event synchronizes with the corresponding event in the translation of the activity diagram (Rules (3) and (4)) to initiate the signal checking process. The controller process (CB CTRL) checks (testY) the nodes starting from the rightmost node (C) to the leftmost node (A). If the signal is a member of the waiting event accepters list (EA), the controller allows for its acceptance (acceptY) and flushes all the other signals in EA; otherwise it is rejected (rejectY) and the next node is checked. The controller at the end synchronizes with the send event to stop the checking until an object sends any signal to aIH.
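The scanning order of the controller can be sketched as a Python function that emits the test, accept, and reject events for a three-node pool. The function `controller_scan`, the node contents, and the decision to skip empty nodes are assumptions made for illustration; the actual CB CTRL process is the one shown in Fig. 10.

```python
def controller_scan(nodes, waiting):
    """Scan nodes from oldest (rightmost) to newest (leftmost).

    `nodes` is a list of (label, signal) pairs ordered oldest first;
    `signal` is None for an empty node (skipped here, as an assumption).
    Returns the generated event trace and the accepted signal, if any.
    """
    trace = []
    for label, signal in nodes:
        if signal is None:
            continue
        trace.append(f"test{label}")
        if signal in waiting:
            trace.append(f"accept{label}")
            return trace, signal
        trace.append(f"reject{label}")
    return trace, None  # no match: wait until a new signal is sent


# Signals registered at rp2 of the Door Controller activity (from the text).
registered = {"lockLatchSignal", "doorIsOpenSignal"}

# Hypothetical pool contents: node C is the oldest, node A is empty.
nodes = [("C", "entryAuthorized"), ("B", "doorIsOpenSignal"), ("A", None)]
trace, accepted = controller_scan(nodes, registered)
```

Here the oldest signal does not match and is rejected without being removed, so the scan proceeds to node B, where the signal is accepted.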
To allow the CB CTRL process to control the buffer CB NODES we combine them in parallel in the new process CB NODES CTRL as illustrated in Fig. 11 . The set aSynchEvents contains the synchronization events: test, reject, and accept for all nodes.
Moving signals along the Controlled Buffer
The parallel combination in the process CB NODES CTRL does not provide a mechanism to force FDR2 to move the signals along the nodes from left to right. For that reason we depend on the chase function of FDR2 to complete the definition.
Chase gives priority to internal (tau) transitions over external ones, and chooses one internal transition arbitrarily when there is a choice of several. This reduces the state space of the labelled transition system in FDR2 by removing external transitions competing with internal ones, and selecting one internal transition where there is a choice of them. This results in a refinement of the original process, which can only perform external events once all internal progress has completed. Thus, chase is not semantics-preserving, in general (and in this case), but it is exactly what is required here so that shuffling the signals along always occurs after an output event before further visible events are possible. For more details about how chase works the reader can refer to [41]. For example, using the chase function for analyzing the left hand side tree (a tree with some hidden events) in Fig. 12 will produce only two possible traces: ⟨tau, tau, g⟩ or ⟨tau, tau, h⟩. Figures 9 and 13 illustrate the application of chase to the processes CB NODES and CB NODES CTRL, respectively, after hiding the buffer internal events (test, reject, c, and drop) for all nodes (grouped in aHiddenEvents for the CB NODES CTRL process). With those events hidden (taus), FDR2 will follow them, causing signals to be propagated along the nodes whenever a send event happens. The process CB is the complete definition of the Controlled Buffer for one instance in the system.
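The effect of prioritizing tau transitions can be sketched on a small labelled transition system in Python. This is only an illustration of the idea, not FDR2's implementation: the transition system below is hypothetical (it is not a reproduction of Fig. 12), the "arbitrary" choice is fixed to the first tau listed, and the sketch assumes there are no tau loops.

```python
TAU = "tau"

def chase(lts, state):
    """Follow tau transitions eagerly, ignoring competing external events.

    Returns the taus taken and the stable state reached (no tau available).
    """
    taken = []
    while True:
        tau_targets = [nxt for event, nxt in lts.get(state, []) if event == TAU]
        if not tau_targets:
            return taken, state
        state = tau_targets[0]  # one internal transition, chosen arbitrarily
        taken.append(TAU)

def chased_traces(lts, state):
    """Enumerate complete traces (taus shown) of the chased system; acyclic LTS only."""
    taus, stable = chase(lts, state)
    outgoing = lts.get(stable, [])
    if not outgoing:
        return [tuple(taus)]
    traces = []
    for event, nxt in outgoing:
        for rest in chased_traces(lts, nxt):
            traces.append(tuple(taus) + (event,) + rest)
    return traces


# Hypothetical tree: external events a and b compete with taus and are pruned.
lts = {
    "s0": [(TAU, "s1"), ("a", "x")],
    "s1": [(TAU, "s2"), (TAU, "s3"), ("b", "y")],
    "s2": [("g", "end"), ("h", "end")],
    "s3": [("k", "end")],
}
traces = chased_traces(lts, "s0")
```

In this sketch the external events a and b, and the alternative internal branch towards s3, are never explored, which is where the reduction in state space comes from.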
It is important to note that we are not using the chase function in the conventional way. Chase here prevents further external events from occurring following output from the buffer until the signals are propagated internally along the buffer. This is precisely the behaviour required: that the effect of an output is instantaneous. Representing the buffer as a parallel combination of small processes (B0 || B || B || ···) rather than as a single process parameterized by a sequence of signals (CB(⟨sig1, sig2, ..., sigN⟩)) shows a substantial performance improvement during the model checking compared to the latter representation. The non-deterministic design of the CB process allowed chase to move the signals, because chase has no effect on deterministic processes (chase(P) = P, when P is deterministic).
The SYSTEM process
This is the third stage of formalizing the fUML model, where the Model Formalizer generates the overall system process (SYSTEM). This process is a parallel combination of all processes that synchronize on the send, accept, and registerSignals events, as depicted in Fig. 14. This in turn, for example, will allow object A to send signals to object B by inserting the signals into object B's event pool (Controlled Buffer).
The SYSTEM process can then be used by FDR2 to check the whole system against a specific behaviour such as deadlock, livelock or determinism. The Model Formalizer component generates all the processes related to the inter-object communication automatically using Epsilon. A copy of those processes will be generated using an EGL script for each instance in the input fUML model to allow asynchronous communication between its objects. For example, the Model Formalizer generates the following processes for the Door Controller instance dcIH0: CB NODES dcIH0, CB CTRL dcIH0, CB NODES CTRL dIH0 and CB dIH0. The Model Formalizer also generates the required sets of alphabets which will be used in those CSP processes, such as aSynchEvents and aHiddenEvents. Finally, the Model Formalizer generates the SYSTEM process described in the previous section.
Deadlock checking using FDR2
After the Model Formalizer completes its function and generates the comprehensive CSP script, the framework initiates FDR2 to perform the model checking. In this paper, we will focus on the deadlock checking as one of the possible behaviours that FDR2 can check automatically. FDR2 reports a deadlock when it reaches a state in which no further actions are possible, which means in our model that all the objects in the system (SYSTEM process) are waiting to accept signals from each other. In case of deadlock, FDR2 displays a counter-example (sequence of events) that led to this deadlock.
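The notion of deadlock checking with a counter-example can be sketched as a breadth-first search over an explicit transition system. This is a generic illustration, not how FDR2 is implemented; the function `find_deadlock`, the state names, and the event names are hypothetical, and the sketch treats every state without outgoing transitions as a deadlock.

```python
from collections import deque

def find_deadlock(lts, initial):
    """Return the shortest event trace leading to a state with no transitions.

    `lts` maps a state to a list of (event, next_state) pairs.
    Returns None when no reachable state is deadlocked.
    """
    parent = {initial: None}  # state -> (previous state, event) for trace rebuilding
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        transitions = lts.get(state, [])
        if not transitions:
            trace = []
            while parent[state] is not None:
                state, event = parent[state]
                trace.append(event)
            return list(reversed(trace))
        for event, nxt in transitions:
            if nxt not in parent:
                parent[nxt] = (state, event)
                frontier.append(nxt)
    return None


# Hypothetical system: s2 has no outgoing transitions.
deadlocking = {
    "s0": [("send.requestEntry", "s1")],
    "s1": [("accept.requestEntry", "s2"), ("registerSignals.rp1", "s0")],
    "s2": [],
}
live = {"s0": [("a", "s1")], "s1": [("b", "s0")]}

counter_example = find_deadlock(deadlocking, "s0")
no_deadlock = find_deadlock(live, "s0")
```

The returned list plays the role of the counter-example: the sequence of events that leads from the initial state to the state in which nothing further can happen.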
Tokeneer deadlock checking
The Tokeneer CSP model SYSTEM process includes four interacting processes (Door, Door Controller, User, and User Panel). Each process has its own event pool with 10 slots. When checking SYSTEM, FDR2 managed to compile the CSP script (about 600 lines) and reported a deadlock scenario (counter-example) after exploring 2.5K states in 5 s. The following trace shows part of this counter-example:
The trace shows the sequence of events generated from the checking of the Tokeneer SYSTEM process. The registerSignals event causes the object to wait until one of the registered signals arrives. As highlighted in the trace, eventually all the system's objects are waiting for each other, causing deadlock. The Door Controller (dc0) will never send the entryAuthorized signal to the User (u0) because it does not make sense for a User to enter when the door is locked. Consequently, the User cannot evolve its behaviour. Also, the unlockLatchSignal will never be sent from the User Panel (up0) to the Door Controller and so the Door Controller cannot evolve its behaviour. This scenario might happen in real life if the user takes a long time (more than the timer period (lockTimeoutExceeded)) to open the door after getting permission to enter from the User Panel.
We cannot claim that this deadlock is a breach of the Tokeneer requirements [7] for two reasons: firstly, the entry expiration timer that caused this deadlock was not specified explicitly in the requirements document. However, we added this timer as part of the system implementation to prevent the Door Controller from waiting forever for a User to enter the enclave. Secondly, the requirements do not specify a certain communication mechanism between the system components (objects), leaving that as an implementation issue. We would argue that this deadlock was identified because we modelled concurrent behaviour of all the components within the TIS subsystem.
When we disabled the entry expiration timer (i.e., the door can be kept closed and unlocked forever), the system did not deadlock and FDR2 succeeded in doing a full model check in 8 s after exploring 9.2K states on the same hardware mentioned in Sect. 6. The Controlled Buffer implementation described in Sect. 5.2.2 allowed for this fast compilation and model checking compared to the previous implementations of the inter-object communication mechanism.
We also tried to reproduce this deadlock scenario on the Tokeneer simulator [2] ; however, this scenario did not happen due to the different implementation decisions that were taken in the SPARK implementation especially for the door unlocking timer.
Formalization and model checking feedback
There are two kinds of feedback that can be provided by the framework to the modeller. The first kind is the Formalization Report which is generated by the Model Formalizer in case of errors during the formalization process. The second kind is a UML sequence diagram which visualizes the counterexample in case of deadlock.
The Formalization Report
The formalization rules described in Sect. 5 include only a subset of the fUML elements, which means that not every fUML diagram can be formalized using the Model Formalizer. The diagrams have to fulfill minimum requirements in order to be formalized. These requirements include the existence of certain elements and the assignment of certain properties. For example, the Model Formalizer cannot formalize an fUML activity diagram that does not include a connected initial node, because this will prevent the Model Formalizer from setting the initial CSP sub-process in the within clause of the localized process. Another example is not assigning the name of an edge emerging from a decision node in an fUML activity diagram.
To be able to check the formalizability of each diagram ("is formalizable?"), each transformation rule is divided into two parts. The first part checks for the required elements/assignments, and if they are met, the second part performs the transformation. Otherwise, a formalization error is reported to the modeller, pointing to the missing items.
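The two-part structure of a rule (check, then transform) can be sketched in Python as follows. This is not the ETL code used by the framework; the dictionary layout of the activity, the function names, and the generated process name are hypothetical, and only the two requirements mentioned above are checked.

```python
def check_activity(activity):
    """Part one: collect formalization errors for missing elements or properties."""
    errors = []
    if not activity.get("initial_node_connected", False):
        errors.append("missing connected initial node")
    for edge in activity.get("decision_edges", []):
        if not edge.get("name"):
            errors.append("unnamed edge leaving a decision node")
    return errors

def formalize(activity):
    """Part two runs only if part one reports no errors.

    Returns (process_name, errors); the process name is a placeholder
    standing in for the real transformation output.
    """
    errors = check_activity(activity)
    if errors:
        return None, errors
    return f"{activity['name']}_PROC", []


good = {"name": "DoorController", "initial_node_connected": True,
        "decision_edges": [{"name": "locked"}]}
bad = {"name": "User", "initial_node_connected": False,
       "decision_edges": [{"name": ""}]}

ok_result = formalize(good)
report = formalize(bad)
```

Collecting all errors before returning, rather than stopping at the first one, lets a single Formalization Report list every missing item.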
The UML Sequence Diagram Generator
We have shown in Sect. 6 the output that FDR2 produces in case of deadlock (a counter-example as a sequence of events). This representation may not be accessible to the modeller, who originally developed the model as an fUML model. For that reason, we include the UML Sequence Diagram Generator component as part of our framework to transform FDR2 output to a modeller-friendly format. This component takes the counter-example generated by FDR2 as an input and generates a UML sequence diagram that represents this counter-example.
The UML Sequence Diagram Generator also makes use of the Object-to-Class mapping table (generated by the Model Formalizer) to relate the behaviour of each object to its class in the fUML model. Figure 15 shows the automatically generated sequence diagram which corresponds to the trace in Sect. 6.
The UML Sequence Diagram Generator depends on an open-source tool called Quick Sequence Diagram Editor [27] . The tool takes an input script (*.sdx file) that specifies the system objects and how they interact with each other. Based on that script, the tool generates an image of that sequence diagram. A sub-component of the UML Sequence Diagram Generator translates FDR2 output to an sdx script based on a group of simple mapping rules. Table 2 shows two samples of FDR2 output and the corresponding sdx representation.
To list the corresponding signals of rp19, we use the information stored in a mapping table called RP-to-Signals which had been generated by the Model Formalizer during the formalization process. This table maps between each rp and the possible accepted signal(s) at this point.
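The kind of translation performed by this sub-component can be sketched in Python. This is a minimal sketch under several assumptions: the counter-example is taken to be already parsed into (sender, receiver, signal) triples, the function `to_sdx` is hypothetical, and the output assumes the sdx conventions of `name:Type` object declarations followed by `caller:callee.message` lines. The actual mapping rules are those of Table 2.

```python
def to_sdx(events, object_to_class):
    """Build an sdx-style script from parsed send events.

    `events` is a list of (sender, receiver, signal) triples (assumed format);
    `object_to_class` plays the role of the Object-to-Class mapping table.
    """
    lines = [f"{obj}:{cls}" for obj, cls in object_to_class.items()]
    lines.append("")  # blank line separating declarations from messages
    for sender, receiver, signal in events:
        lines.append(f"{sender}:{receiver}.{signal}")
    return "\n".join(lines)


# Hypothetical excerpt of a counter-example between the User and the User Panel.
mapping = {"u0": "User", "up0": "UserPanel"}
events = [("u0", "up0", "requestEntry"), ("u0", "up0", "readUserToken")]
script = to_sdx(events, mapping)
```

Events that are not signal sends, such as registerSignals, would need their own rule; they are left out of this sketch.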
Multiple counter-examples
FDR2 has the option to generate more than one counter-example in case of deadlock. Instead of aborting the model checking upon detecting a sequence of events that leads to a deadlock, FDR2 continues the model checking until reaching another sequence. Our framework utilizes this option by allowing the modeller to specify, before the model checking, the maximum number of counter-examples to be generated in case of deadlock through a simple Graphical User Interface (GUI), as shown in Fig. 16. This is made possible by the FDR2 batch mode, which gave us this level of control through command line parameters.
The UML Sequence Diagram Generator has the ability to detect if more than one counter-example has been generated by FDR2, and thus generates a corresponding sequence diagram for each counter-example.
Loop detection
Fig. 16 The modeller selects the counter-examples per check
Sometimes the generated counter-example includes a repetition of certain pattern(s) (sub-sequence of events) many times, which decreases the readability of the corresponding sequence diagram as it becomes too long to track. To avoid this issue, the UML Sequence Diagram Generator has the ability to detect this repetition automatically using an advanced search algorithm and replace it with one pattern surrounded by a "loop" box. Figure 17 shows part of a generated sequence diagram. As shown inside the "loop" box, the repetition of sending the signals requestEntry and readUserToken three times has been detected by the UML Sequence Diagram Generator. Such a scenario can happen due to a bug in the User activity diagram.
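The idea of collapsing repeated patterns can be sketched with a simple greedy scan in Python. This is not the search algorithm used by the generator, which is not detailed here; the function `collapse_loops` and its output format are hypothetical.

```python
def collapse_loops(events):
    """Replace immediately repeated sub-sequences by ("loop", count, pattern).

    At each position, the repetition covering the most events is chosen.
    """
    out = []
    i, n = 0, len(events)
    while i < n:
        best_len, best_count = 0, 1
        for length in range(1, (n - i) // 2 + 1):
            pattern = events[i:i + length]
            count = 1
            # Count how many times the pattern repeats back to back.
            while events[i + count * length:i + (count + 1) * length] == pattern:
                count += 1
            if count > 1 and count * length > best_len * best_count:
                best_len, best_count = length, count
        if best_count > 1:
            out.append(("loop", best_count, events[i:i + best_len]))
            i += best_len * best_count
        else:
            out.append(events[i])
            i += 1
    return out


# The repetition mentioned in the text, followed by one further hypothetical event.
trace = ["requestEntry", "readUserToken"] * 3 + ["entryAuthorized"]
collapsed = collapse_loops(trace)
```

Each `("loop", count, pattern)` entry corresponds to one "loop" box in the generated sequence diagram, with the pattern drawn once inside it.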
Framework implementation
We have implemented the framework within MagicDraw as a plugin called "Compass" (Checking Original Models means Perfectly Analysed SystemS). To use Compass, the modeller should first model the system objects' behaviours using fUML activity diagrams. The modeller can then use the plugin GUI to initiate the deadlock checking. In case of deadlock, the plugin presents a UML sequence diagram to the modeller in a separate window. Compass totally isolates the modeller from dealing with the formal representation of the model. Figure 18 shows a screenshot of MagicDraw/Compass while checking the Tokeneer fUML model for deadlock. The screen shows part of the TIS subsystem fUML activity diagrams and the sequence diagram which shows the deadlock scenario.
We would argue that implementing the framework in the form of a plugin to an already existing case tool is more practical than implementing it as a standalone application for several reasons. Compared to a standalone formalization application, a plugin will allow for having a single integrated modelling environment. Also, modifying the plugin to work with other case tools is a straightforward task, which means that the plugin can be made available for several case tools. This in turn will allow the modellers who are already using a certain case tool not to change their modelling environment to check the models (or even to re-check legacy models).
Related work
Much research work has been done on formalizing semi-formal models to check different properties. Among these works, Hansen et al. [12] and Xie et al. [38] focused on checking user-defined safety specifications for xUML models formalized into mCRL2 [10] and S/R (the input language of COSPAN [13]), respectively. Roscoe et al. [28] developed a CSP-M-based compiler to formalize Statemate Statecharts [8] into CSP for the purpose of checking several properties such as consistency with application-specific requirements.
Our work is more closely related to work that focused on checking model-independent system behaviours (i.e., those that can be checked as part of the toolset) such as deadlock or livelock. In this category, Ng et al. [22] used CSP as a formal representation to check deadlock and divergence for the input UML state machines. Thierry-Mieg et al. [31] used Instantiable Petri Nets (IPN [21]) to check deadlock and unreachable final states for the input UML activity diagrams. Also, Turner et al. [33] automatically formalized xUML state machines into CSP||B [29] (an integrated formal language that combines CSP and B) to check deadlock.
Formally representing the asynchronous communication between objects has been discussed in a limited way in [9, 12, 33], where part of xUML was formalized, which specifies a communication mechanism different from that of fUML. On the other hand, Xie et al. [38] simulated the asynchronous message passing by synchronous communication between processes modelling objects and their message queues. Our previous work [1] also considered the asynchronous communication mechanism between system objects; however, the manual formalization reduced the practicality of the approach.
To perform the formalization automatically, some authors developed their own tools. For example, Cabot et al. [6] developed a tool called UMLtoCSP to do the formalization. Also, Shah et al. [30] used UMLtoAlloy and Alloy Analyzer to do the formalization and model checking, respectively. Another group of authors used MDE tools to do the transformation. Varró et al. [35] summarized a comparison between 11 different MDE tools used to transform from UML activity diagrams into CSP (UML-to-CSP case study [5]), as part of the AGTIVE'07 tool contest. Also, Treharne et al. [32] used the Epsilon framework to transform UML state diagrams to CSP||B.
Providing modeller-friendly feedback to report the model checking results has been addressed only a few times in the literature. The authors in [6, 30] proposed presenting the model checking results (e.g., counter-example) as an object diagram that represents a snapshot of the system during the error. Alternatively, Mrugalla et al. [20] present the counter-example as sequence and timing diagrams. In another approach, the authors in [26, 31] proposed compiler-style errors with valuable feedback.
Compared to all the reviewed literature, this work is the first attempt to automatically formalize the fUML activity diagrams, including the formalization of the fUML asynchronous inter-object communication mechanism.
Conclusion and future work
In this paper, we have presented a framework that helps modellers to check the behaviour of their fUML model automatically. The framework depends on formalizing the fUML model into CSP and then checks it using FDR2 taking into consideration the formalization of the asynchronous inter-object communication mechanism. The comprehensive formalization (for fUML diagrams and communication mechanism) allowed for checking the system against deadlock which may occur if all the system's objects stop working waiting for each other.
In case of deadlock, the framework provides the user with a UML sequence diagram that describes that deadlock scenario in terms of the fUML model, not the formal CSP model, to isolate the modeller from the formal domain.
We have developed an implementation of this framework as a MagicDraw plugin called Compass. Compass makes use of the Epsilon MDE framework to translate the fUML model into a CSP script in two stages (Model-to-Model then Model-to-Text).
Validating the framework's functionality and applicability was achieved by applying it to a non-trivial case study (Tokeneer ID Station). Using the implementation of the communication mechanism described in Sect. 5.2, FDR2 succeeded in compiling the generated CSP script and detected the deadlock scenario in 5 s for a 10-slot event pool for each object. The detected deadlock scenario was due to an implementation decision added to the Tokeneer fUML model (i.e., not a breach of the Tokeneer specification).
Currently, the framework supports having only one instance for each class. Such a constraint will be addressed in our future work to support multiple instances for each class in the system. Also, we will modify the framework to include safety and security specifications checking.
This appendix includes a simplified version (just showing the main logic) of the ETL rules used in the framework and the associated meta-models for each rule. We have developed a group of Epsilon operations to allow for more compact ETL rules. The following list outlines those operations:
A.1 Operations
- getCSP Process: Takes an activity diagram element reference as an input and returns the corresponding CSP ProcessID for that element.
- createSendEvent: Creates a CSP Event entity (send) for the SendSignal action and returns it. It also creates the corresponding CSP EventParameters and associates them with the Event entity.
- createRegisterSignalsEvent: Creates a CSP Event entity (registerSignals) for the AcceptEvent action and returns it. It also creates the corresponding CSP EventParameters and associates them with the Event entity.
- createAcceptEvent: Creates a CSP Event entity (accept) for the AcceptEvent action and returns it. It also creates the corresponding CSP EventParameters and associates them with the Event entity.
- createValueSpecificationEvent: Creates a CSP Event entity (valueSpec) for the ValueSpecification action and returns it. It also creates the corresponding CSP EventParameters and associates them with the Event entity.
- createAddStructuralFeatureValueEvent: Creates a CSP Event entity (addStFeatureValue) for the AddStructuralFeatureValue action and returns it. It also creates the corresponding CSP EventParameters and associates them with the Event entity.
- createInternalChoiceEvent: Creates a CSP Event entity for a given internal choice branch and returns it.
- addToLocalizedProcess: Adds the created subprocess (ProcessAssignment) to the corresponding localized process.
- getTargetNode: Returns the target node (connected to the other side of the edge) given the edge reference.
