State-based languages are widely used for modelling systems that have an internal state, such as communications protocols and embedded control systems. As testing is a vital part of system development, this has led to much interest in testing from finite state machines (FSMs). However, complex systems are seldom designed in one step; usually, the design is constructed gradually, through a process of refinement. In the case of state-based models, this may lead to a hierarchy of machines. Furthermore, some of the components of the hierarchy may exhibit concurrent behaviour. In this paper, we present a method for generating tests for a hierarchical FSM by reusing and refining the tests for the FSM components of the hierarchy. The method is also adapted for testing a system of communicating FSMs, in which the communication is one-directional, from one master to one or more slaves.
INTRODUCTION
State-based languages, such as Statecharts [1] and SDL [2], are widely used for modelling systems that have an internal state, such as communications protocols and embedded control systems. State diagrams have also become a standard model for representing object behaviour; in particular, UML [3] adopted this model. As testing is a vital part of system development, this has led to much interest in testing from finite state machines (FSMs) [4-9]. Given an FSM specification, whose transition diagram is known, and an implementation, which is a 'black box' for which we can only observe its input/output behaviour, we want to select a set of test sequences that check whether the implementation under test conforms to the specification. This is called conformance testing. Many test-selection methods rely on the assumption that the implementation cannot have more states than the specification; among these, the best known are based on Transition Tours [7], (Multiple) Unique Input Output (UIO) sequences [7, 8], Characterizing Sequences [5] and Distinguishing Sequences [7, 9] (the methods are enumerated in increasing order of their fault-detection capability [6]). Although this assumption is quite restrictive, these methods have been successfully used in testing of network protocols [6-8]. A less restrictive assumption is required by the W-method [10]: this generates test suites that guarantee the correctness of the implementation provided that the number of states of the implementation remains below a known upper bound.
Complex systems are seldom designed in one step. Usually, the design is constructed gradually, through a process of refinement. In the case of state-based models, this may lead to a hierarchy of machines [1, 11] . Furthermore, some of the components of the hierarchy may exhibit concurrent behaviour. In conformance testing, hierarchical and concurrent models are usually turned into behaviourally equivalent FSMs, which are then used as the basis for test generation [12, 13] . However, this approach suffers from the state explosion problem [4] and, furthermore, the refinement used in the design process is not reflected in the construction of the implementation and of the test data. An alternative approach is to develop the implementation in parallel with the design, that is each change in the design is accompanied by an appropriate change in the implementation. In this case, tests can also be constructed and refined in parallel with the specification, and so test generation will be kept consistent with the design and we will avoid leaving testing to the final pre-release stage. Furthermore, by developing the implementation in parallel with the design, some implementation faults are avoided from the outset and so fewer test cases will be needed.
In this paper, we present a method for generating tests for a hierarchical FSM by reusing and refining the tests for the FSM components of the hierarchy. Test generation from hierarchical FSMs was previously investigated, from a theoretical point of view, in [14], but only certain particular types of hierarchies were considered. This paper generalizes these previous results. The application of this technique to hierarchies with history states is also discussed in the paper. The method is then adapted for testing a system of communicating FSMs, in which the communication is one-directional, from one master to one or more slaves.
THE COMPUTER JOURNAL, Vol. 52 No. 3, 2009
The paper is structured as follows. Section 2 introduces the FSM model and briefly presents the W-method. The technique for generating tests from hierarchical FSMs is presented in Section 3, while its application to hierarchies with history states is discussed in Section 4. A case study to illustrate the development and testing of hierarchical FSMs is given in Section 5. The method adapted for a system of communicating FSMs is presented in Section 6. An analytical evaluation of the proposed approach is given in Section 7 and related work is discussed in Section 8. Finally, conclusions are drawn in Section 9.
FINITE STATE MACHINES
First, we introduce basic FSM concepts and results and briefly describe the W-method.
An FSM M consists of a finite set of states Q, of which one is the designated initial state $q_0$, and transitions between states are labelled by pairs of input/output symbols from a finite input alphabet X and a finite output alphabet Y, respectively [4]. FSMs are usually represented as state-transition diagrams. For example, the FSM model of a simple tape recorder is given in Fig. 1.
The FSMs referred to in this paper are assumed to be deterministic. In such an FSM, for any given state q and any given input x, there is at most one transition from q labelled by x. When there is exactly one such transition, the FSM is said to be completely specified; otherwise, it is said to be partially specified. An FSM may be transformed into one that is completely specified by assuming that the 'refused' inputs produce a designated error output, which is not in the output alphabet of M; the erroneous behaviour can be represented as self-looping transitions or transitions to an extra (error) state [4] . However, in this paper we discuss the more general case when FSMs may be partially specified.
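A deterministic, partially specified FSM can be sketched as a transition table. The table below is only loosely modelled on the tape recorder: the state and input names come from the text, whereas the output symbols and the helper names `run` and `is_completely_specified` are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (state, input) -> (next_state, output). Using a dict keyed by (state, input)
# makes the machine deterministic by construction: at most one transition per key.
# Output symbols are hypothetical placeholders, not taken from Fig. 1.
Table = Dict[Tuple[str, str], Tuple[str, str]]

TAPE: Table = {
    ("Off", "on"): ("Idle", "ON"),
    ("Idle", "off"): ("Off", "OFF"),
    ("Idle", "stop"): ("Idle", "S"),
    ("Idle", "play"): ("Playing", "P"),
    ("Idle", "rec"): ("Recording", "R"),
    ("Playing", "off"): ("Off", "OFF"),
    ("Playing", "stop"): ("Idle", "PS"),
    ("Recording", "off"): ("Off", "OFF"),
    ("Recording", "stop"): ("Idle", "RS"),
}


def run(delta: Table, state: str, inputs: Iterable[str]) -> Optional[List[str]]:
    """Return the outputs produced by `inputs` from `state`, or None if an input is refused."""
    outputs = []
    for x in inputs:
        if (state, x) not in delta:
            return None  # partially specified: no transition for this input
        state, y = delta[(state, x)]
        outputs.append(y)
    return outputs


def is_completely_specified(delta: Table, states: Iterable[str], alphabet: Iterable[str]) -> bool:
    """True when every state has exactly one transition for every input."""
    alphabet = list(alphabet)
    return all((q, x) in delta for q in states for x in alphabet)
```

Under these assumptions the machine is partially specified, since, for instance, off is refused in Off.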
Given a state q, the finite sequences of input/output symbols that can be traced out of q make up the language accepted by M in q, denoted $L_M(q)$ or simply $L_M$ when q is the initial state $q_0$. When only input sequences are of interest, the language will be denoted $Li_M(q)$ or $Li_M$, respectively.
Once we have an FSM representation of a system, we can use appropriate techniques to uncover possible errors in the implementation, such as erroneous transition labels, erroneous next-states, missing states, extra states, etc. One of the most general approaches is the W-method [10], which generates sequences of input symbols to (1) reach every state in the diagram, (2) check all the transitions from that state and (3) identify all destination states to ensure that their counterparts in the implementation are correct. Consider, for example, the FSM representation of a tape recorder given earlier and take the Playing state. This is reached from the initial state, Off, by a sequence of input symbols, $s_{Playing}$ = on play. All possible transitions emerging from this state need to be checked; in order to achieve this, each input symbol is applied after $s_{Playing}$. Furthermore, the original state, Playing, as well as the next-states of all these transitions, will have to be identified, so the resulting sequences will be, in turn, concatenated with each element of a set W that distinguishes between every pair of states. Therefore, the W-method involves the construction of two sets of input sequences:
• A state cover $S \subseteq X^*$ that reaches every state of the FSM; in particular, the empty sequence $\epsilon$, which reaches the initial state $q_0$, is contained in S.
• A characterization set $W \subseteq X^*$ that distinguishes between every pair of states in the FSM. In other words, for every two distinct states $q_1$ and $q_2$, W contains at least one input sequence that produces different outputs when applied from $q_1$ and $q_2$, respectively. When the FSM is partially specified, this includes the case in which the input sequence can be applied in $q_1$ but not in $q_2$ or vice versa.
An FSM in which all states are reachable and pairwise distinguishable is called minimal. As a state cover and a characterization set are assumed to exist, a prerequisite of the method is that the FSM specification is minimal. However, this condition does not restrict the applicability of the W-method since for every FSM a minimal, functionally (i.e. input/output) equivalent FSM can be constructed [4].
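The two sets can be explored with a short sketch: a breadth-first search yields one shortest input sequence per reachable state (a state cover), and a pairwise comparison of responses checks whether a candidate set is a characterization set. The transition table reuses the tape recorder's state and input names, but its output symbols, the candidate set `W` and all function names are assumptions introduced here.

```python
from collections import deque
from itertools import combinations

# Hypothetical table: (state, input) -> (next_state, output); outputs are placeholders.
TAPE = {
    ("Off", "on"): ("Idle", "ON"),
    ("Idle", "off"): ("Off", "OFF"),
    ("Idle", "stop"): ("Idle", "S"),
    ("Idle", "play"): ("Playing", "P"),
    ("Idle", "rec"): ("Recording", "R"),
    ("Playing", "off"): ("Off", "OFF"),
    ("Playing", "stop"): ("Idle", "PS"),
    ("Recording", "off"): ("Off", "OFF"),
    ("Recording", "stop"): ("Idle", "RS"),
}
ALPHABET = ["on", "off", "stop", "play", "rec"]


def state_cover(delta, initial, alphabet):
    """Breadth-first search: map each reachable state to a shortest input sequence reaching it."""
    cover = {initial: ()}
    queue = deque([initial])
    while queue:
        q = queue.popleft()
        for x in alphabet:
            if (q, x) in delta:
                nxt = delta[(q, x)][0]
                if nxt not in cover:
                    cover[nxt] = cover[q] + (x,)
                    queue.append(nxt)
    return cover


def response(delta, state, seq):
    """Outputs of `seq` from `state`; a refused input is recorded as None and ends the run."""
    out = []
    for x in seq:
        if (state, x) not in delta:
            out.append(None)
            break
        state, y = delta[(state, x)]
        out.append(y)
    return tuple(out)


def is_characterization_set(delta, states, W):
    """True when every pair of distinct states is separated by at least one sequence in W."""
    return all(
        any(response(delta, a, w) != response(delta, b, w) for w in W)
        for a, b in combinations(states, 2)
    )


S = state_cover(TAPE, "Off", ALPHABET)
W = [("play",), ("stop",)]  # candidate set, valid only for the hypothetical outputs above
```

Recording a refused input as `None` reflects the remark that, for partially specified machines, two states are also distinguished when a sequence can be applied in one but not in the other.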
For the FSM represented in Fig. 1, $\epsilon$ reaches Off, on reaches Idle, on play reaches Playing, while on rec reaches Recording. Thus S = {$\epsilon$, on, on play, on rec} is a state cover. On the other hand, the input symbol off can be applied in Idle, Playing and Recording but not in Off [...]

Suppose we have an FSM specification M and we have constructed a state cover S and a characterization set W of M. Naturally, we assume that the implementation of M can be modelled by an unknown FSM, $M'$. The only information we need about $M'$ is an estimate of the maximum possible number of states $n'$ that it may have. This is based on the tester's knowledge of how the system has been implemented. Suppose we denote by k, $k \ge 0$, the difference between $n'$ and the number of states n of the specification M. Then a test suite T can be constructed by first concatenating S with the set

$$X[k+1] = X^{k+1} \cup X^{k} \cup \dots \cup X \cup \{\epsilon\}$$

of all input sequences of length at most k + 1, and then concatenating the resulting set with W. Thus

$$T = S \, X[k+1] \, W.$$
Note that the above formula is given in [10] for the case in which the FSM specification is completely specified. As explained in [15] , when the specification is partially specified, the test suite will also include some of the prefixes of the sequences in T. However, in order to check the result produced by an input sequence, one would normally check the results produced by all its prefixes; consequently, this extra sub-set of prefixes does not explicitly appear in the formula.
The idea is that the set $S X[1] = S \cup S X$ (usually called a transition cover of M) ensures that all the states and all the transitions in M are also present in $M'$, and $X[k] W$ ensures that $M'$ is in the same state as M after performing each transition. Note that the latter set contains W and also all sets $X^i W$, $1 \le i \le k$. This ensures that $M'$ does not contain extra states: if there were up to k extra states, then each of them would be reached by some input sequence of length up to k from the existing states.
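The concatenation of a state cover, the bounded set of input sequences and a characterization set can be sketched as follows. The function names `bounded_sequences` and `w_method` are assumptions; the sketch does not filter out sequences that a partially specified machine would refuse, and it does not add the prefixes mentioned above.

```python
from itertools import product


def bounded_sequences(alphabet, max_len):
    """X[max_len]: all input sequences over `alphabet` of length 0..max_len, as tuples."""
    return [seq for n in range(max_len + 1) for seq in product(alphabet, repeat=n)]


def w_method(state_cover, alphabet, char_set, k):
    """Test suite S . X[k+1] . W, where k is the assumed number of extra implementation states."""
    middle = bounded_sequences(alphabet, k + 1)
    return {s + m + w for s in state_cover for m in middle for w in char_set}


# Toy usage with a two-symbol alphabet (all values hypothetical).
S = [(), ("a",)]
W = [("a",), ("b",)]
T = w_method(S, ["a", "b"], W, k=0)
```

Returning a set removes duplicates that arise when different choices of state-cover sequence and middle sequence concatenate to the same test.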
In general, a transition cover is derived from a transition tree constructed in a breadth-first fashion, while a characterization set is constructed by gradually partitioning the state set based on the responses produced by sequences of length $k \ge 1$, using the so-called $P_k$ tables [10].
Variants of the W-method also exist in the literature. The partial W-method (or $W_p$-method) [16] is an improvement of the W-method that may reduce the size of the test suite at the expense of a slightly more complex generation algorithm. The round trip [17] approach is based on a transition tree constructed in a depth-first fashion.
A precise oracle, which dumps the values of state variables and compares them to what is expected, can be used instead of W to check the state of the implementation under test. On the other hand, when the states of the FSM are actually abstract states, obtained by suitably partitioning the domain of the system data (as in the case of a statechart model of a class, for example), a precise oracle may be too expensive and so an abstract oracle [18], which entails checking the invariants of the states that are expected to be reached during the execution of test cases, can be used instead.
[...] and Rew to the initial internal state Stop. Furthermore, transitions from internal states to external states may also exist, e.g. the play/P transition from Stop to PLAYING. Using the aforementioned rules, it is possible to flatten a hierarchical FSM, that is, to turn it into a behaviourally equivalent flat FSM.
TESTING HIERARCHICAL FSMS
There are two ways we can handle test generation from a hierarchical FSM. We can either generate tests solely on the basis of the final, flattened, specification or we can have the implementation developed in parallel with the specification, through a process of refinement, and reuse the test suites for the previous versions in the construction of the final test suite. Consider again the tape recorder example. Suppose the original specification represented in Fig. 1 is implemented first and then the implementations of the internal FSMs are 'placed' in the appropriate states. Then the implementation of the overall system can also be modelled by a hierarchical FSM.
More generally, suppose we have built a hierarchical FSM specification by representing one or more states of an original, flat, FSM M as internal FSMs. The set of all such (compound) states is denoted by $Q_c$; for each state $q \in Q_c$, the corresponding internal FSM is denoted by $M_q$. The input alphabets of M and $M_q$ are denoted by X and $X_q$, respectively. The internal and external input alphabets are disjoint (otherwise the system may exhibit non-deterministic behaviour), so for every $q \in Q_c$, $X \cap X_q = \emptyset$.
As previously discussed, the hierarchical FSM preserves the original functionality described by M, but adds further detail. Consequently, every transition x/y from state $q_1$ to state $q_2$ in M will be replaced by one or many x/y transitions in the hierarchical FSM. More precisely, the following two cases can be distinguished:
• If $q_1 \notin Q_c$ (i.e. $q_1$ does not become a compound state in the hierarchical FSM) then the original transition will be replaced in the hierarchical FSM by exactly one transition, from $q_1$ to the initial state of $M_{q_2}$. Note that the case in which $q_2 \notin Q_c$ (i.e. $q_2$ does not become a compound state) is a particular case of this rule; in this situation, $M_{q_2}$ is the 'trivial' FSM, consisting only of one state and no transitions, so the corresponding transition in the hierarchical FSM is from $q_1$ to $q_2$.
• If $q_1 \in Q_c$ then the original transition will be replaced in the hierarchical FSM by at least one transition, from one or more states of $q_1$ to the initial state of $M_{q_2}$. As explained above, the case $q_2 \notin Q_c$ is a particular case of this rule.
If the above condition is satisfied, we say that the hierarchical FSM is a refinement of M. The equivalent, flattened, FSM of the hierarchical specification is denoted by $M'$. As all states of the internal FSMs are reachable (all internal FSMs are assumed to be minimal), each transition in M will be 'refined' into one or more sequences of transitions in $M'$.
Suppose that M and $M_q$ have been implemented; the models of their implementations are denoted by $M_I$ and $M_{qI}$, respectively. Suppose also that the whole implementation can be modelled by an (unknown) hierarchical FSM, in which the FSMs $M_{qI}$ are placed within the states of $M_I$. As in the case of the specification, the hierarchical model of the implementation is a refinement (in the sense indicated above) of the original model. Analogously to the W-method, we assume that we know the difference between the estimated maximum number of states of the implementation model and the number of states of the specification for M and $M_q$, $q \in Q_c$; these are denoted by k and $k_q$, respectively. Finally, suppose that we have generated a test suite T for M and test suites $T_q$ for $M_q$, $q \in Q_c$. Then, a test suite $T'$ for $M'$ can be constructed as the union of three sets, $T_1$, $T_2$ and $T_3$: $T_1$ will test the implementation of the original FSM M, $T_2$ will test the implementations of the internal FSMs $M_q$, while $T_3$ will check the integration of these implementations.
The first set $T_1$ is constructed by 'refining' the sequences in the original test suite T. The procedure is outlined below.
• Initialize $T_1 = T$.
• For every sequence $t \in T_1$ and for every compound state $q \in Q_c$ such that t crosses q (one or more times), replace each transition from q with one of the corresponding sequences of transitions from the hierarchical FSM.
Note that the definition of refinement for FSMs given earlier ensures that at least one such sequence will exist. Naturally, when many corresponding sequences exist, the shortest will normally be selected.
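The refinement step above can be sketched as follows, under an assumed encoding of the hierarchy: each compound state carries its internal next-state table, its initial internal state and, for each external input, the set of internal states from which that input leaves. Whenever a test sequence applies an external input in a compound state, a shortest internal input sequence leading to a suitable internal state is inserted first. The encoding, the toy machine and the function names are all hypothetical.

```python
from collections import deque


def shortest_internal_prefix(internal, start, exit_states):
    """Shortest internal input sequence from `start` to any state in `exit_states` (BFS)."""
    seen = {start: ()}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if s in exit_states:
            return seen[s]
        for (src, x), dst in internal.items():
            if src == s and dst not in seen:
                seen[dst] = seen[s] + (x,)
                queue.append(dst)
    return None  # no internal state offers the required external transition


def refine(seq, ext_delta, ext_initial, compound):
    """Refine an external input sequence into one for the hierarchical FSM.

    ext_delta: (state, input) -> next_state for the original flat machine.
    compound:  state -> (internal_delta, initial_internal_state, exits), where
               exits maps an external input to the internal states it leaves from.
    Entering a compound state always starts in its initial internal state.
    """
    state, refined = ext_initial, []
    for x in seq:
        if state in compound:
            internal, start, exits = compound[state]
            prefix = shortest_internal_prefix(internal, start, exits.get(x, set()))
            if prefix is None:
                raise ValueError(f"{x!r} cannot leave compound state {state!r}")
            refined.extend(prefix)
        refined.append(x)
        state = ext_delta[(state, x)]
    return tuple(refined)


# Toy hierarchy (hypothetical): B is compound, and y can only leave from internal state b1.
EXT = {("A", "x"): "B", ("B", "y"): "A"}
COMPOUND = {"B": ({("b0", "i"): "b1"}, "b0", {"y": {"b1"}})}
```

For instance, `refine(("x", "y"), EXT, "A", COMPOUND)` inserts the internal input `i` before `y`. When the external input is available from the initial internal state, the prefix is empty and the sequence is unchanged.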
TEST SELECTION FOR FINITE STATE MACHINES 337
The construction of the set $T_3$, which checks the integration of the internal implementations within the system, is slightly more complex. From the system passing all tests in the set $T_1$ we can deduce that $M_I$ is functionally equivalent to M. However, $M_I$ may not be minimal: unlike the specification, minimality cannot be enforced on the implementation. Consequently, if the difference k between the maximum number of states of $M_I$ and the number of states of M is greater than 0, a state q of the specification may correspond to many, equivalent, states in the implementation. If q is a compound state, $M_{qI}$ will be used to detail the behaviour of each of the corresponding states of the implementation, so the integration of $M_{qI}$ within any such state needs to be checked. The procedure for constructing $T_3$ is given below.
• Select a state cover S of M.
• Extend all sequences in S with sequences of length at most k and, for each $q \in Q_c$, select the set $R_q$ of all sequences in $S X[k]$ that reach q. Since M and $M_I$ have been shown to be equivalent, $R_q$ will reach all states in $M_I$ that are equivalent to q.
• For each $q \in Q_c$, refine $R_q$ (by replacing transitions from the original specification with sequences of transitions from the hierarchical FSM, as explained above); the resulting set of sequences is denoted by $R'_q$.
• For each $q \in Q_c$, select a state cover $S_q$ of $M_q$ (the same state cover used in the construction of $T_q$ will normally be used). Since the system has passed all tests in $T_2$, $M_{qI}$ is already known to be equivalent to $M_q$, so the sequences in $S_q$ will reach a number of states of $M_{qI}$ equal to the number of states of $M_q$. Then, if we extend each path in $S_q$ with sequences of length at most $k_q$, the resulting set of valid paths, $P_q = S_q X_q[k_q] \cap Li_{M_q}$, will reach all states of $M_{qI}$ (see [19] for a proof). Therefore, $P_q X$ will check all transitions from the (internal) states of $M_{qI}$ to external states.
• Finally, all we have to do is to concatenate every sequence in $R'_q$ with $P_q X$. Thus $T_3 = \bigcup_{q \in Q_c} R'_q P_q X$.
In our example, consider q = Idle and suppose k = 1 and $k_{Idle}$ = 1. X = {on, off, stop, play, rec} and $X_{Idle}$ = {ff, rew} are the input alphabets and S = {$\epsilon$, on, on play, on rec} and $S_{Idle}$ = {$\epsilon$, ff, rew} are state covers. $R_{Idle}$ = {on, on stop, on play stop, on rec stop} is the set of all sequences in $S \cup S X$ that reach Idle and $R'_{Idle} = R_{Idle}$. Finally, $P_{Idle}$ = {$\epsilon$, ff, rew} is the set of all valid paths in $S_{Idle} \cup S_{Idle} X_{Idle}$.
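The last two steps can be replayed on the Idle example with a short sketch. The sets of sequences are those given in the text; the internal next-state table for Idle is an assumption inferred from the example (only ff and rew from Stop are valid internal moves), the state name `FF` is hypothetical, and the helper names are introduced here.

```python
from itertools import product


def concat(*sets):
    """Concatenate sets of sequences (tuples) pairwise, left to right."""
    result = {()}
    for s in sets:
        result = {a + b for a in result for b in s}
    return result


def bounded(alphabet, max_len):
    """All sequences over `alphabet` of length 0..max_len."""
    return {seq for n in range(max_len + 1) for seq in product(alphabet, repeat=n)}


def is_valid(internal, start, seq):
    """True when `seq` can be traced from `start` in the internal next-state table."""
    state = start
    for x in seq:
        if (state, x) not in internal:
            return False
        state = internal[(state, x)]
    return True


X = ["on", "off", "stop", "play", "rec"]
X_IDLE = ["ff", "rew"]
IDLE_INTERNAL = {("Stop", "ff"): "FF", ("Stop", "rew"): "Rew"}  # assumed internal FSM
S_IDLE = {(), ("ff",), ("rew",)}
R_REFINED = {("on",), ("on", "stop"), ("on", "play", "stop"), ("on", "rec", "stop")}
K_IDLE = 1

# P_q: sequences of S_q X_q[k_q] that are valid input sequences of M_q
P_IDLE = {p for p in concat(S_IDLE, bounded(X_IDLE, K_IDLE))
          if is_valid(IDLE_INTERNAL, "Stop", p)}

# Contribution of q = Idle to T_3: R'_q P_q X
T3_IDLE = concat(R_REFINED, P_IDLE, {(x,) for x in X})
```

Under these assumptions the computed `P_IDLE` coincides with the set given in the example, and the Idle contribution to the integration suite contains 60 sequences (4 × 3 × 5, with no duplicates).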
The above explanations that accompany the construction of the three test suites outline a soundness proof of the method. This is not formalized since it would make the paper much harder to read. Formal proofs for two particular cases are given in [14]. The two cases considered are (1) when, for each state q, every transition from q in the original machine is replaced in the hierarchical FSM with transitions from all states of $M_q$ and (2) when there exists an 'escape sequence' e, constructed exclusively from internal input symbols, such that, for each state q, every transition from q in the original machine is replaced with at least one transition, from state $q'$ of $M_q$, where $q'$ is the internal state reached by applying e in the initial state of $M_q$. It can be observed that the hierarchical FSM for the tape recorder considered above is in neither of these two categories.
For multiple-level hierarchies (i.e. hierarchies in which some internal FSMs are themselves described by lower-level FSMs), the procedure will be applied bottom-up to gradually test each FSM component and integrate it within the upper-level FSM. As in the case of flat FSMs, precise or abstract oracles may be used instead of characterization sets to check the precise or abstract state, respectively, of the corresponding FSM.
TESTING HIERARCHIES WITH HISTORY CONNECTORS
Hierarchical FSMs may contain a special kind of state, called history states or history connectors [3] . A history connector, usually represented as a circled 'H', is attached to a compound state and remembers its last current internal state. In this paper, we only discuss the so-called shallow history connector [3] , which remembers the last current internal state at the same hierarchy level, as opposed to a deep history connector, which remembers the last current leaf state at the lowest hierarchy level. However, the underlying ideas presented here also apply to deep history connectors. Consider, for example, the hierarchical FSM model of a computer system consisting of a simplified word processor (that allows text editing, table and diagram insertion) and a screen saver, as represented in Fig. 3 . The screen saver is activated if the system has not received any input (i.e. key press or mouse click) for a certain time interval; the interrupted processing is then resumed when a new input arrives. The diagram is kept simple in order to illustrate more clearly the underlying ideas; the outputs produced by transitions, which do not play any part in our discussion, are ignored. When the compound state WORD_PROC is entered for the first time, the processing will start from the initial internal state, Text. However, when the system subsequently re-enters the WORD_PROC state, after the screen saver is deactivated, the processing will have to be resumed from where it has left off and not always be restarted from the initial Text sub-state. This is specified by using the history connector attached to WORD_PROC. For the sake of discussion, we consider that the screen saver may have two different layouts and the user can switch between them; consequently, SCREEN_SAVER is also a compound state, but without a history connector.
A hierarchical state diagram with a history connector can be transformed into an equivalent hierarchical FSM without the connector by multiplying the external states of the diagram, so that distinct external states are created for each internal state which can be 'remembered' by the connector. For example, the equivalent model for the word processor/screen saver system will have three SCREEN_SAVER states, one for each sub-state of the word processor, as represented in Fig. 4. The equivalent model can then be used as a basis for test generation and the procedure presented in the previous section can be applied. It has to be noted that the multiplication of states associated with the construction of the equivalent model without connectors will increase the size of the 'integration' test suite $T_3$, but will not affect the 'internal' test suite $T_2$. In our example, it is sufficient to test the (internal) SCREEN_SAVER FSM once, but, since it is used in three distinct places, its integration within the system will need to be tested three times, once for each state of the word processor.
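The multiplication of external states can be sketched in a few lines. The sketch assumes a single external state that is entered from, and returns to, a compound state with a shallow history connector; the sub-state names `Table` and `Diagram`, the input names `timeout` and `input`, and the function name are hypothetical.

```python
def expand_history(substates, external_state, leave_input, resume_input):
    """Replace a shallow history connector by one copy of `external_state` per sub-state.

    Returns a next-state table in which leaving sub-state s enters the copy that
    'remembers' s, and resuming from that copy returns to s.
    """
    table = {}
    for s in substates:
        copy = f"{external_state}[{s}]"
        table[(s, leave_input)] = copy     # the copy encodes the remembered sub-state
        table[(copy, resume_input)] = s    # re-entry goes back to where processing left off
    return table


FLAT = expand_history(["Text", "Table", "Diagram"], "SCREEN_SAVER", "timeout", "input")
SAVER_COPIES = {dst for dst in FLAT.values() if dst.startswith("SCREEN_SAVER")}
```

With three sub-states the sketch yields three copies of the external state, which is why the integration of the same internal screen-saver machine has to be checked once per copy.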
On the other hand, since the equivalent model (without connectors) may have a significantly larger number of states than the original, it would be tempting to avoid its construction and generate test sets directly from the original model, by separating the testing of the underlying hierarchy from the testing of the history connector itself. Again, tests for the hierarchical FSM can be derived using the above procedure. As suggested in [20] , the implementation of the history connector can be tested by entering and exiting the corresponding internal diagram, entering it again and verifying the entered state. While this approach may yield a significantly smaller test suite, it is usually unrealistic since it works on the assumption that there is an actual implementation of the history connector which can be identified and tested separately. This is not normally the case. The history state reflects an external observer's (designer's) view of the behaviour and not the implementer's view of behaviour. A history connector is normally implemented by passing information about the last-visited internal state to a global variable which is then read when the internal FSM is re-entered. Consequently, the states of the internal FSM are externalized through the use of a global variable. Thus, realistic tests for such an implementation can only be derived from an equivalent model of the type discussed earlier.
CASE STUDY: USING A HIERARCHICAL FSM TO SPECIFY A WORD PROCESSOR
Hierarchical FSMs provide a simple but effective means of gradually developing specifications for complex systems. We illustrate the approach on a word processor. For the sake of clarity, the functionality considered is fairly basic, but a much more complex word processor can be specified in this way. In order to keep the diagrams simple, the outputs produced by transitions are omitted. At the first level in the hierarchy (Fig. 5) we look at the main functionalities of the system and decide which require separate states. Suppose that, apart from normal text editing, the system also allows the user to work with tables and equations and to draw pictures. In order to use these additional facilities, the user must select the appropriate option (e.g. ins_draw, ins_tab, etc.), perform the corresponding operation (which is not described at this level) and then either successfully finalize the operation (ok) or cancel it (cancel). Each of these operations is then detailed at the next level of the hierarchy, by replacing each state with an appropriate internal FSM. Some of these internal FSMs may only consist of one state and the required operations (e.g. normal editing will not involve additional, internal, states so it will be sufficient to add a loop-back transition in the Normal state), others may be more complex. Suppose, for example, that in order to draw a picture the user will be required to choose its shape (sel_shape), set its position (set_position) and, finally, adjust (i.e. move and resize) the selected shape. This functionality is described by the DRAWING internal FSM in Fig. 6. The 'adjusting' operation is not described at this stage; it will be detailed at the next level of the hierarchy (Fig. 7). According to Fig. 7, the picture can be moved by first enabling the 'move' operation (enable_move) and then dragging the picture (drag); alternatively, the move operation can be aborted (abort). Similarly, the 'resize' operation can be performed by selecting (select_point) and dragging (drag) a point of the shape in question. In this manner, it is possible to gradually refine the system functionality until the required level of detail is reached.
However, the case study also reveals a few limitations of the approach. Firstly, in the type of hierarchical FSMs considered in this paper, any transition leading to a compound state will always enter the initial state of the corresponding internal FSM. This may result in the duplication of some internal states. Suppose, for example, that the system allows not only the insertion of new drawings but also the adjustment of existing drawings. This functionality corresponds to the Adjust state of DRAWING in Fig. 6, which is then detailed by an internal FSM in Fig. 7. Clearly, rather than representing this functionality as a separate diagram, distinct from DRAWING, it would be tempting to reuse the DRAWING internal FSM and thus allow external transitions to enter a non-initial state of DRAWING, in this case Adjust. However, this approach would not suit our refinement-based testing philosophy. Consider an extended version of the word processor that also incorporates the 'modify drawing' functionality and suppose external transitions that lead to non-initial states of internal FSMs are permitted. At the first level in the hierarchy (Fig. 8), the ins_draw and modif_draw transitions from Normal will both lead to the same state, Drawing. Consequently, when test suites for the second level of the hierarchy (Fig. 9) are constructed, the DRAWING FSM and its integration within the external diagram will only be checked once (by the sets $T_2$ and $T_3$ in Section 3), either via the ins_draw transition or the modif_draw transition. This would not be sufficient since the behaviour exhibited by the DRAWING FSM in the state Adjust will differ from the behaviour exhibited by the same FSM in the initial state, Start. Furthermore, some of the external transitions of DRAWING may be accessible from its initial state, but not from other internal states (e.g. no cancel transition is accessible from Adjust).
Thus, the correct approach would be to represent ins_draw and modif_draw as transitions leading to different states (as in Fig. 10), which are then detailed as separate FSMs at the next level in the hierarchy (as in Fig. 11). Furthermore, test savings can still be achieved if the implementation of DRAWING is reused. In this case, this implementation will not have to be checked twice (in the construction of the test suite $T_1$ in Section 3); it will be sufficient to test it once, via the ins_draw transition (that leads to its initial state) and then just check that the modif_draw transition leads to the correct state, Adjust; for this it is sufficient to concatenate modif_draw with a characterization set of DRAWING.
Other limitations relate to the use of FSMs as a modelling tool, rather than being limitations of the hierarchical model itself. As the FSM model does not include a data structure, the level of abstraction is low and, consequently, different states may be required for essentially very similar things. For example, distinct Shape and Adjust states (and implicitly different Idle, Move and Resize states) will have to be created for different shape types that are moved or resized differently (e.g. a rectangle can only be dragged but a line can also be rotated). The level of abstraction of a specification can be increased by using extended FSMs such as stream X-machines [11, 19]. On the other hand, the transitions of an extended FSM will also depend on internal variables and so not all paths may be feasible; that is, there may be paths that cannot be driven by any sequence of inputs applied to the machine. However, stream X-machine-based test-generation methods that can deal with this situation exist [15].
We now consider the application of our test-generation method to the hierarchical specification represented in Fig. 7. The input alphabets of the original specification (represented in Fig. 5), DRAWING internal FSM (see Fig. 6) and ADJUST internal FSM (see Fig. 7) are denoted by X, $X_D$ and $X_A$, respectively. These alphabets are
TESTING MASTER-SLAVE COMMUNICATING FSMS
FSMs can also be used to model concurrent processes. When these processes communicate with each other, a slightly more complex model, called a communicating finite state machine (CFSM) [13], is needed. Basically, a CFSM is an FSM plus an input FIFO (first-in, first-out) queue; the CFSM only consumes the inputs from the queue. A system of CFSMs $(M_1, \dots, M_n)$ works as follows:
• An input symbol x received from the external environment will go to the input queue of a CFSM, say $M_i$, provided it is contained in the input alphabet of $M_i$. In this paper, the input alphabets of the CFSMs are assumed to be disjoint, so x will enter one of the queues in a deterministic fashion.
• An output symbol y produced by a CFSM, say $M_i$, will pass to the input queue of another CFSM, say $M_j$, provided y is included in the input alphabet of $M_j$. If no such $M_j$ exists, then y will go to the output environment.
A system of CFSMs is normally assumed to run in a slow environment and to have no live-lock [13] . We say that a system runs in a slow environment if inputs can be sent from the environment to the system only when the input queues of all CFSMs are empty. We say that a CFSM has a live-lock if it is possible to execute an infinite number of transitions without further inputs.
Consider, for example, the behaviour of an alarm radio modelled by a system of two CFSMs, as represented in Fig. 12 . The CFSM at the top, representing the alarm side of the process, sends a switch signal to the other CFSM, representing the radio, each time it senses the starttime or endtime event of the alarm; as a result, the radio will switch from Idle to Play or vice-versa. The outputs produced by the radio will be released to the output environment; for simplicity, a two-valued output is used: 1, for when the radio is playing and 0, for when it is not.
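The queue-based semantics described above can be illustrated with a minimal simulation sketch of the alarm radio. The transition tables are assumptions reconstructed from the prose only (the alarm alternates between START and END; the radio toggles between Idle and Play and emits 1 or 0), and the function name `run_system` is hypothetical. Undefined inputs are assumed to be ignored, and the sketch relies on the system having no live-lock.

```python
from collections import deque

# Transition tables assumed from the prose description of the alarm radio
# (the figure is not reproduced here): state -> {input: (next_state, output)}.
ALARM = {
    "START": {"starttime": ("END", "switch")},
    "END": {"endtime": ("START", "switch")},
}
RADIO = {
    "Idle": {"switch": ("Play", 1)},
    "Play": {"switch": ("Idle", 0)},
}


def run_system(machines, initial_states, env_inputs):
    """Run a system of CFSMs in a slow environment.

    Each environment input is applied only once all queues are empty.
    Returns the final local states and the outputs released to the environment.
    """
    states = dict(initial_states)
    queues = {name: deque() for name in machines}
    released = []

    def owner(symbol):
        # Input alphabets are assumed disjoint, so at most one machine matches.
        for name, table in machines.items():
            if any(symbol in row for row in table.values()):
                return name
        return None

    for x in env_inputs:
        target = owner(x)
        if target is None:
            raise ValueError(f"no CFSM accepts input {x!r}")
        queues[target].append(x)
        # Process internal messages until every queue is empty again.
        while any(queues.values()):
            for name, table in machines.items():
                if not queues[name]:
                    continue
                symbol = queues[name].popleft()
                row = table[states[name]]
                if symbol not in row:
                    continue  # assumption: undefined inputs are ignored
                states[name], y = row[symbol]
                dest = owner(y)
                if dest is None:
                    released.append(y)  # no receiver: goes to the environment
                else:
                    queues[dest].append(y)
    return states, released


final_states, outputs = run_system(
    {"alarm": ALARM, "radio": RADIO},
    {"alarm": "START", "radio": "Idle"},
    ["starttime", "endtime", "starttime"],
)
print(final_states, outputs)
```

Under these assumed tables, each alarm event toggles the radio, so the released outputs alternate between 1 and 0.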
If a system of CFSMs runs in a slow environment and has no live-lock, then it can be turned into a behaviourally equivalent FSM whose (global) state set is the product of the state sets of the CFSM components [13]. As this equivalent product FSM suffers from the state explosion problem [4], the test suites derived from it may be unmanageably large. Consequently, whenever the particular situation allows it, alternative test-generation strategies are sought.
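As a rough illustration of the product construction, the sketch below flattens the alarm radio example into a single FSM over global states. It is not the general construction of [13]: it covers only two machines in which messages flow from the first to the second, the transition tables are assumptions taken from the prose, and the helper name `flatten_master_slave` is hypothetical. A message for which the receiving machine has no transition is assumed to be ignored.

```python
from itertools import product

# Assumed tables: state -> {input: (next_state, output)}.
MASTER = {
    "START": {"starttime": ("END", "switch")},
    "END": {"endtime": ("START", "switch")},
}
SLAVE = {
    "Idle": {"switch": ("Play", 1)},
    "Play": {"switch": ("Idle", 0)},
}


def flatten_master_slave(master, slave):
    """Return global transitions: ((m, s), x) -> ((m', s'), output)."""
    slave_inputs = {x for row in slave.values() for x in row}
    master_outputs = {y for row in master.values() for (_, y) in row.values()}
    flat = {}
    for m, s in product(master, slave):
        # Transitions triggered by inputs of the first machine.
        for x, (m_next, y) in master[m].items():
            if y in slave_inputs:
                # The output is consumed by the second machine
                # (assumption: ignored if no transition is defined).
                s_next, out = slave[s].get(y, (s, None))
            else:
                s_next, out = s, y  # released directly to the environment
            flat[((m, s), x)] = ((m_next, s_next), out)
        # Inputs of the second machine that arrive from the environment.
        for x, (s_next, out) in slave[s].items():
            if x not in master_outputs:
                flat[((m, s), x)] = ((m, s_next), out)
    return flat


flat = flatten_master_slave(MASTER, SLAVE)
for (state, x), (nxt, out) in sorted(flat.items()):
    print(state, x, "->", nxt, out)
```

Even in this tiny example the flat machine has as many states as the product of the component state sets, which is the source of the state explosion mentioned above.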
One case of practical interest is when the system is made of two CFSMs and the communication is one-directional: one of the CFSM components, say $M_1$, sends messages to the other, $M_2$, but not vice-versa. This master-slave type of interaction between $M_1$ and $M_2$ can be described by a hierarchical model, in which the states of $M_1$ (the external FSM) become compound states and the behaviour of each is represented by $M_2$ (the internal FSM). The equivalent hierarchical model of the radio alarm system is given in Fig. 13. There are, however, differences between the hierarchical FSM model presented earlier and the hierarchical equivalent of a system of two CFSMs. First, the output produced by an external transition (of $M_1$) may be passed as input to the internal FSM, in which case the next internal state will depend on the input received (instead of always being the initial state of $M_2$, as in the hierarchical model presented earlier). Second, the next internal state may also depend on the internal state from which the external transition departs. Suppose, for example, that the starttime/switch transition leaves the Idle internal state of the compound state START. The switch output produced by the external transition will cause $M_2$ to move from Idle to Play, so the equivalent overall transition will enter the Play internal state of END. Similarly, when starttime/switch leaves the Play internal state of START, the destination internal state of END will be Idle. With this observation in mind, the test-generation procedure given earlier can be adapted for a system of two CFSMs, $M_1$ and $M_2$, in which the communication is one-directional, from $M_1$ to $M_2$. Analogously to the original procedure, three sets of test sequences, $T_1$, $T_2$ and $T_3$, will be produced:
( 
The above procedure can be extended to a system of CFSMs with one master $M_1$ and slaves $M_2, \ldots, M_n$, $n > 2$, which do not interact with each other. Since the slaves run independently, they can be tested separately from one another. Thus, the three test suites produced by the procedure will have the following form:
• $T_1$ is the test suite for the master CFSM $M_1$.
i , where $T_3^i$ is the integration test suite for $M_i$, constructed as indicated above.
Furthermore, the procedure can naturally be extended to a system of CFSMs that behaves like a multi-level hierarchy of master-slave interactions.
A system of CFSMs can also be used to describe the internal behaviour of compound states in a hierarchical FSM [3] . In this case, the above procedure can generate tests for the compound states, which are then used as inputs in the construction of the test suite for the hierarchical model.
DISCUSSION
Traditional methods for test generation from FSMs [4, 7] or (UML) state diagrams [17] can only be applied to flattened specifications. In comparison, the divide and conquer strategy used in this paper can yield considerably reduced test suites.
Consider the case of an FSM hierarchy with one compound state. According to [10], for an FSM specification with $n$ states and $r$ input symbols, and for $n'$, the estimated maximum number of states of the implementation under test, equal to $n$, the W-method will produce at most $n^2 \cdot r$ sequences. Then, for a hierarchy formed from an external FSM $M$ with $n$ states and $r$ input symbols and an internal FSM $M_q$ with $n_q$ states and $r_q$ input symbols, and for $n'$ and $n'_q$, the estimated maximum numbers of states of the corresponding implementations, equal to $n$ and $n_q$, respectively, the upper bounds on the number of elements of the test sets $T_1$ and $T_2$ will be $n^2 \cdot r$ and $n_q^2 \cdot r_q$, respectively. In order to construct the "integration" set, $T_3$, each of the $n_q$ sequences that form a state cover of $M_q$ is concatenated with every input symbol of $M$, so $T_3$ will contain $n_q \cdot r$ elements. Thus, the upper bound for the total number of sequences in $T_1 \cup T_2 \cup T_3$ will be $n^2 \cdot r + n_q^2 \cdot r_q + n_q \cdot r$. On the other hand, the maximum number of sequences produced by applying, under the same conditions, the W-method to the equivalent flat FSM will be $(n + n_q - 1)^2 \cdot (r + r_q)$. It can be observed that our method yields a considerably smaller test suite; for $n$, $n_q$, r and $r_q$ reasonably large, the size difference can be substantial.
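To make the comparison concrete, the two upper bounds can be evaluated for sample sizes. The sketch below simply encodes the bounds stated above, n^2 r + n_q^2 r_q + n_q r for the hierarchical method and (n + n_q - 1)^2 (r + r_q) for the flat FSM; the function names and the sample values are illustrative assumptions, not figures from the paper.

```python
def hierarchical_bound(n, r, n_q, r_q):
    """Upper bound on |T1 u T2 u T3| for one compound state."""
    t1 = n ** 2 * r        # W-method on the external FSM
    t2 = n_q ** 2 * r_q    # W-method on the internal FSM
    t3 = n_q * r           # state cover of the internal FSM x external inputs
    return t1 + t2 + t3


def flat_bound(n, r, n_q, r_q):
    """Upper bound for the W-method applied to the equivalent flat FSM."""
    return (n + n_q - 1) ** 2 * (r + r_q)


# Illustrative sizes only (not taken from the paper).
n, r, n_q, r_q = 5, 4, 4, 3
print(hierarchical_bound(n, r, n_q, r_q))  # 100 + 48 + 16 = 164
print(flat_bound(n, r, n_q, r_q))          # 8**2 * 7 = 448
```

For these sample values the hierarchical bound is roughly a third of the flat bound; the gap widens as the component sizes grow, since the flat bound is quadratic in the combined state count.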
Consider also the case of a system composed of two CFSMs, a master $M_1$ and a slave $M_2$, having $n_1$ and $n_2$ states and $r_1$ and $r_2$ input symbols, respectively. This can be transformed into a behaviourally equivalent flat FSM (using the approach in [13]) with $n_1 \cdot n_2$ states and $r_1 + r_2$ input symbols. Thus, the maximum number of test sequences obtained by applying the W-method to this flat specification will be $(n_1 \cdot n_2)^2 \cdot (r_1 + r_2)$. On the other hand, by using our approach, the system of two CFSMs will be represented as a hierarchy formed of an external FSM $M_1$ in which all states are replaced by the same internal FSM, $M_2$. The set $T_1$, which checks the external FSM, will have at most $n_1^2 \cdot r_1$ sequences, while the set $T_2$, which checks the internal FSM, will have at most $n_2^2 \cdot r_2$ sequences. The integration of $M_2$ within $M_1$ will be checked $n_1$ times, each time with $n_2 \cdot r_1$ sequences, so $T_3$ will contain $n_1 \cdot n_2 \cdot r_1$ sequences. Thus, an upper bound for the number of test sequences produced by our method will be $n_1^2 \cdot r_1 + n_2^2 \cdot r_2 + n_1 \cdot n_2 \cdot r_1$, which is much lower than the upper bound for the case in which test sequences are derived directly from the equivalent flat FSM.
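As a small worked instance, consider the alarm radio system, assuming (from the prose description only) that the alarm has $n_1 = 2$ states and $r_1 = 2$ inputs (starttime, endtime) and the radio has $n_2 = 2$ states and $r_2 = 1$ input (switch). The two bounds then evaluate as follows:

```latex
\begin{align*}
\text{flat FSM:} \quad & (n_1 \cdot n_2)^2 \cdot (r_1 + r_2) = (2 \cdot 2)^2 \cdot (2 + 1) = 48 \\
\text{our method:} \quad & n_1^2 \cdot r_1 + n_2^2 \cdot r_2 + n_1 \cdot n_2 \cdot r_1
  = 2^2 \cdot 2 + 2^2 \cdot 1 + 2 \cdot 2 \cdot 2 = 8 + 4 + 8 = 20
\end{align*}
```

Even for this very small system the bound is more than halved; because the flat bound grows with the square of the product $n_1 \cdot n_2$, the difference becomes far larger for realistic component sizes.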
RELATED WORK
Belli [21] applies the divide and conquer principle for reducing the complexity of tests derived from an FSM model of a GUI, but hierarchical FSMs are not explicitly considered. Andrews et al. [22] use constraints to reduce the set of input values and to help solve the state explosion problem in hierarchical FSM models of Web applications. Paiva et al. [23] exploit the structure of a hierarchical FSM to reduce the number of states in the equivalent flat FSM. However, neither [22] nor [23] approaches the test-generation problem from the perspective of conformance testing, as in this paper; instead, test sequences are generated to achieve certain coverage criteria (e.g. node coverage, transition coverage, etc.). Bogdanov and Holcombe [20, 24] investigate test selection from hierarchical statecharts, in particular from statecharts with Harel's semantics. Basically, two extreme cases are considered: when no restrictions are placed on the development of the implementation and when the specification is a 'geometrical hierarchy', so that faults related to not properly entering or exiting internal statecharts are ruled out by the construction of the system. In the former case, they provide, under certain constraints, formulas for constructing state covers and characterization sets for the equivalent (flat) statechart of the hierarchy from state covers and characterization sets of the external and internal statecharts. A test suite for the hierarchy can then be derived by simply applying the W-method. In the latter case, test generation for the overall system is reduced to test generation for the internal and external FSMs; basically, the sets $T_1$ and $T_2$ produced by our method.
The former approach may produce large test suites and does not take advantage of the refinement strategy used in the design, while the latter has limited applicability; it cannot be used, for example, to generate tests for hierarchies in which external transitions leave some, but not all, of the sub-states of an internal FSM (e.g. in the tape recorder example, the play/P transition leaves PauseRec, but not Record). The method presented in this paper tests not only the implementation of the external and internal FSMs, but also the integration of these implementations, and so it can cope with a much richer hierarchy semantics. Bogdanov and Holcombe [20, 24] also investigate test selection from concurrent statecharts; the approach is similar to that used for test selection from hierarchical statecharts. Furthermore, the presented methods require communication to be disabled during testing, so they can in fact only be applied to non-communicating concurrent statecharts. A similar approach is also used by Li and Qi [25] for testing hierarchical and communicating UML statecharts, but the test-selection strategies proposed are based on the Wp-method instead.
In general, a system of communicating FSMs is first transformed into a behaviourally equivalent (non-deterministic) FSM which is then tested using one of the established techniques, such as the W or Wp-methods [13]. While this approach is the most general and provides the best coverage, as pointed out earlier in the paper, it may suffer from a state explosion problem. Hierons [26] points out that for some communicating FSMs, called semi-independent communicating FSMs, it is possible to test the core transition structure with transitions that do not communicate and then check the remaining ones separately. The execution of a communicating transition can lead to a sequence of transitions being executed, so the final states of these transitions need to be checked. The problem of finding a set of sequences that minimizes the total cost is shown to be NP-complete and possible heuristics are discussed in [26]. The test strategy employed is based on the UIO method. The method presented in this paper, on the other hand, allows more general strategies, such as the W and Wp-methods, to be used for testing the individual FSMs.
CONCLUSIONS
In this paper, we present a method for generating tests for a hierarchical FSM by refining and reusing test suites for the components of the hierarchy. Test generation for FSM hierarchies with history states is also discussed. The method is also adapted for a system of communicating FSMs in which the communication is one-directional, from one master to one or more slaves. The proposed method is general, in the sense that the test suites for the individual components can be generated using any FSM-based technique. A design is usually constructed gradually: initially, we may have only a skeleton, then step by step we fill it with details. Our approach is to develop the test sets in parallel with the design, rather than generating them only on the basis of the final specification. In this way, testing is kept consistent with the design and we avoid leaving testing to the final, prerelease, stage. Furthermore, by refining test suites in parallel with the specification and using a divide and conquer strategy, the size of the test data is considerably reduced in comparison with the case in which tests are derived directly from the final specification.
For testing to be efficient, it must be automated as much as possible. It is straightforward to convert an FSM into a computer program and this process can be easily automated. The test-generation procedures given in this paper, including the test selection for the individual FSMs (using the W or Wp-methods, for example), can be fully automated. Appropriate tools are under development.
Further work involves looking at test generation from systems of CFSMs with more complex types of communication and extending the method given here to extended FSM models, such as stream X-machines [11, 15] . Test generation from particular types of hierarchical stream X-machines has already been investigated in [19] .
