Abstract. In this paper we attempt to demonstrate that on-the-fly techniques, developed in the context of verification, can help in deriving test suites. Test purposes are used in practice to select test cases according to some properties of the specification. We define a consistency preorder linking test purposes and specifications. We give a set of rules to check this consistency and to derive a complete test case with preamble, postamble, verdicts and timers. The algorithm, which implements the construction rules, is based on a depth-first traversal of a synchronous product between the test purpose and the specification. We briefly report our experience on an industrial protocol with TGV, a first prototype of the algorithm implemented as a component of the CADP toolbox.
Introduction
It is widely recognized that testing is an essential component of the full lifecycle of communicating systems. However, the process of generating test suites is complicated, error-prone and expensive. The intrinsic difficulty comes from the black-box nature of the implementation: its behaviour is only observable and controllable at the interfaces. In that context, a formal framework is a prerequisite for giving precise and consistent meanings to test verdicts. The usual theoretical approach [Bri88] is to consider a formal specification of the intended behaviour of the Implementation Under Test (IUT). This makes it possible to define a conformance relation linking an implementation to the specification, and the verdict associated with the application of a test case (a set of interaction sequences) to an implementation, with respect to the conformance relation. The problem is to automatically generate correct test cases from a formal specification of the IUT. A correct test case, applied to an IUT, will declare "fail" only for implementations which do not conform to the specification (soundness property). We also require that an implementation which does not conform to the specification can be detected by repeating the application of a test case, under a fairness assumption on the implementation (exhaustivity property).
During the last decade, testing theory and test generation algorithms have been developed from Labelled Transition System (LTS) specifications. Test generation involves sub-problems of traversal, comparison or reduction of LTS that have already been addressed by verification. Consequently we think, like others [CGPT95], that the time is ripe for linking test and verification. Practitioners know from experience that it is not reasonable to try to validate all possible behaviours of their protocol. This is why they use informal test purposes. Basically, we traverse in a depth-first manner the synchronous product of the IUT specification and of the test purpose. During the traversal, we check their mutual consistency. If they are consistent, an acyclic test graph is generated and decorated with verdicts and timers. The algorithm is an original extension of the on-the-fly verification kernel we developed a few years ago [JJ89, JJ91, FM91, FMJJ92]. It provides a complete treatment of the problem of test cases, including preambles and postambles, verdicts and timer management. It is now well known that depth-first traversals are at the heart of some good verification algorithms, for behavioural comparison and reduction [FM91] as well as for model-checking [JJ89, CVWY90, JJ91]. We show that this is also true for test generation, which constitutes a good example of transfer from verification to testing.
The test generator, based on verification technology, has been prototyped in the context of an industrial consortium linking Verilog, Cap-Sesa, Cnet, Inria and the French Army. An experiment was performed on a real ISDN protocol specification. The results were very encouraging, confirming the interest of this kind of algorithm, which is now mature enough to be transferred to the industrial world and to deal with real formal specifications. Our approach is compatible with symbolic (or structural) ones like TVEDA [Pha94b], which may compute test purposes using reachability analysis.
The presentation is organized as follows. We start by defining the different models used for describing test purposes, test cases, the specification and the IUT. We define a consistency preorder between test purposes and specifications, and a test conformance relation linking implementations to specifications. We give the formal rules allowing the construction of a test case from a test purpose and a specification. We give some results concerning soundness and exhaustivity of our generated test cases. Finally, we give the main results gained during an experiment on an ISDN protocol.
Models
In this section we first describe the models used for the description of the different objects involved in the generation of test cases. They are used to define the notion of consistency relating a test purpose with a specification and the notion of conformance relating an implementation with a specification. These models are then used to define formal rules for the construction of test cases.
Input-Output labelled transition systems
The models used are all based on Input-Output Labelled Transition Systems (IOLTS) in which input and output actions are differentiated because of the asymmetrical nature of the testing activity.
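We adopt the following notations and conventions for an IOLTS $M$ with set of states $Q^M$, alphabet $A = A_I \cup A_O$ and transition relation $T^M$. Let $\alpha \in A$ and $p, q \in Q^M$. We write $p \xrightarrow{\alpha}_M q$ iff $(p, \alpha, q) \in T^M$, and $p \stackrel{\sigma}{\Rightarrow}_M q$ iff $\exists \alpha_1, \ldots, \alpha_n \in A$, $p_0, \ldots, p_n \in Q^M$ such that $\sigma = \alpha_1 . \alpha_2 \ldots \alpha_n$, $p_0 = p$, $p_i \xrightarrow{\alpha_{i+1}}_M p_{i+1}$ for $i < n$, and $p_n = q$. $A(q) = \{\alpha \mid \exists q',\ q \xrightarrow{\alpha}_M q'\}$ is the set of immediate actions after $q$, $I(q) = A(q) \cap A_I$ is the set of inputs after $q$, and $O(q) = A(q) \cap A_O$ is the set of outputs after $q$. $Succ_\alpha(q) = \{q' \mid q \xrightarrow{\alpha}_M q'\}$ is the set of states reachable from $q$ by means of a transition labelled by $\alpha$. We write $\neg(p \xrightarrow{\alpha}_M)$ if there is no transition starting from $p$ and labelled by $\alpha$, i.e. $\neg(p \xrightarrow{\alpha}_M) \equiv (Succ_\alpha(p) = \emptyset)$. We denote by $p \text{ after } \sigma = \{q \in Q^M \mid p \stackrel{\sigma}{\Rightarrow}_M q\}$ the set of states reachable from $p$ by the sequence of transitions $\sigma$, and by $traces(p) = \{\sigma \in A^* \mid p \text{ after } \sigma \neq \emptyset\}$ the set of sequences starting from $p$ and reaching a state in $Q^M$. In the sequel, we will not distinguish between a transition system and its initial state.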
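As an illustration of these notations, here is a minimal Python sketch of a finite IOLTS. The class name `IOLTS`, its field names and the toy actions (`!req`, `?ack`, `?nak`) are assumptions made for this example only; they do not come from the TGV prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IOLTS:
    """Finite IOLTS; transitions are (source, action, target) triples."""
    transitions: frozenset
    inputs: frozenset   # A_I
    outputs: frozenset  # A_O
    init: str

    def actions(self, q):
        """A(q): immediate actions after q."""
        return {a for (p, a, _) in self.transitions if p == q}

    def ins(self, q):
        """I(q) = A(q) ∩ A_I."""
        return self.actions(q) & self.inputs

    def outs(self, q):
        """O(q) = A(q) ∩ A_O."""
        return self.actions(q) & self.outputs

    def succ(self, q, a):
        """Succ_a(q): states reached from q by one transition labelled a."""
        return {r for (p, b, r) in self.transitions if p == q and b == a}

    def after(self, q, sigma):
        """q after sigma: states reached from q by the sequence sigma."""
        states = {q}
        for a in sigma:
            states = {r for s in states for r in self.succ(s, a)}
        return states

    def is_trace(self, q, sigma):
        """sigma ∈ traces(q) iff (q after sigma) is not empty."""
        return bool(self.after(q, sigma))


# Hypothetical toy system: the environment sends !req and observes ?ack or ?nak.
M = IOLTS(
    transitions=frozenset({
        ("s0", "!req", "s1"),
        ("s1", "?ack", "s2"),
        ("s1", "?nak", "s0"),
    }),
    inputs=frozenset({"?ack", "?nak"}),
    outputs=frozenset({"!req"}),
    init="s0",
)
```

With this toy system, `M.after("s0", ["!req", "?nak", "!req"])` evaluates to `{"s1"}`, and `["?ack"]` is not a trace of `s0`.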
An IOLTS satisfies the controllability condition if and only if, for each state, if an output is enabled then there is exactly one outgoing transition. More formally, if $|X|$ denotes the cardinality of the set $X$, $\forall p : |O(p)| \ldots$
An IOLTS is deterministic if and only if $\forall p, \forall \sigma : |p \text{ after } \sigma| \le 1$.
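Both conditions can be checked directly on a finite transition relation. The sketch below assumes transitions are given as `(source, action, target)` triples; the function names and the example data are hypothetical. For determinism it uses the fact that, when the quantification ranges over all states, $|p \text{ after } \sigma| \le 1$ for every $\sigma$ is equivalent to every state having at most one successor per action.

```python
def is_controllable(transitions, outputs):
    """If a state enables an output, it has exactly one outgoing transition."""
    sources = {p for (p, _, _) in transitions}
    for p in sources:
        outgoing = [t for t in transitions if t[0] == p]
        enables_output = any(a in outputs for (_, a, _) in outgoing)
        if enables_output and len(outgoing) != 1:
            return False
    return True


def is_deterministic(transitions):
    """At most one target per (state, action) pair."""
    target_of = {}
    for (p, a, q) in transitions:
        # setdefault records the first target seen; a different one is a conflict.
        if target_of.setdefault((p, a), q) != q:
            return False
    return True


# Hypothetical test purpose: one output, then a choice between two inputs.
tp = {("a", "!x", "b"), ("b", "?y", "c"), ("b", "?z", "c")}
# Adding a second output in state "a" breaks controllability.
two_outputs = tp | {("a", "!w", "c")}
# Two targets for the same action break determinism.
nondeterministic = {("a", "?y", "b"), ("a", "?y", "c")}
```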
We consider four kinds of IOLTS: the specification, the implementation, the test purpose behaviour and the test cases, whose meanings are described below.
Specification and implementation
An IUT is placed in a test environment in which the tester can only interact with inputs and outputs. Thus the tester has an external view of the implementation. In contrast, a specification generally models the internal view of the system, i.e. the behaviour of the system with its internal actions, without considering the way it interacts with the environment. But this interaction should be taken into account in the test generation. As an example, if the implementation communicates asynchronously with its environment through several points of control and observation (PCOs), two subsequent and causally ordered outputs may be observed by the environment as two concurrent inputs if they occur on two different PCOs. In the following, we will consider the environmental point of view: outputs are controllable actions initiated by the environment (which may be the tester) and sent to the IUT, whereas inputs are observable actions, initiated by the IUT and received by the environment.
While testing an IUT, we check the conformance of the IUT in its environment with the specification in the same environment. Thus we first have to transform the specification into its external view. Internal actions which are not observable by the environment have to be hidden, i.e. replaced by a $\tau$ transition. Inputs are replaced by outputs and vice versa, taking care of the concurrency which may be produced by asynchronous interaction. This is called the mirror image operation. After that, we apply a $\tau$-reduction which suppresses $\tau$ transitions. The transition system of the resulting specification is then an IOLTS. This has been implemented on-the-fly in our prototype but, due to space limitations, we will not give more details of how this can be done effectively. Deadlocks are often supposed to be observable by a tester. In practice the tester uses timers to achieve this (see 3.2) and we have to suppose that a timeout occurs if and only if the implementation is deadlocked. This is why timeouts are considered as inputs of the tester. If the specification is allowed to deadlock in a particular state, this is modelled by a special transition $\delta$ considered as an input of the environment initiated by the system. This treatment of deadlocks is quite similar to what is done in [Tre95, Pha94a]. Finally, the last operation is determinization.
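Part of this transformation can be sketched on a finite graph as follows. This is only an illustration under simplifying assumptions, not the on-the-fly procedure of the prototype: it covers hiding, $\tau$-reduction and determinization (the last two merged into one textbook subset construction), and it leaves out the mirror image operation and the modelling of deadlocks. All names (`TAU`, `tau_closure`, `external_view`) and the toy specification are hypothetical.

```python
TAU = "tau"  # label used for hidden (internal) actions


def tau_closure(transitions, states):
    """States reachable from `states` using only tau transitions."""
    closure, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for (q, a, r) in transitions:
            if q == p and a == TAU and r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def external_view(transitions, init, internal):
    """Hide internal actions, then remove tau and determinize."""
    # Hiding: every internal action is relabelled to tau.
    hidden = {(p, TAU if a in internal else a, q) for (p, a, q) in transitions}
    # Subset construction over tau-closed sets of states.
    start = tau_closure(hidden, {init})
    result, todo, seen = set(), [start], {start}
    while todo:
        block = todo.pop()
        visible = {a for (p, a, _) in hidden if p in block and a != TAU}
        for a in visible:
            targets = {r for (p, b, r) in hidden if p in block and b == a}
            nxt = tau_closure(hidden, targets)
            result.add((block, a, nxt))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return result, start


# Hypothetical specification with one internal action "log".
spec = {("0", "?req", "1"), ("1", "log", "2"), ("2", "!ack", "3"), ("1", "!ack", "3")}
view, start = external_view(spec, "0", {"log"})
```

In this example the states `1` and `2` are merged because they differ only by the hidden action, so the resulting graph has two transitions.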
The resulting specification is a deterministic IOLTS $S = (Q^S, A, T^S, q^S_{init})$ with $A = A_I \cup A_O$ and $\delta \in A_I$ a distinguished input. Without loss of generality, we will suppose that $S$ starts with outputs of the environment, i.e. $A(q^S_{init}) \subseteq A_O$. In the following, specification will always refer to the external view $S$ of the specification. Though the implementation is not necessarily a transition system (it may be a physical system), as in all testing theories we have to reason formally about it and model its behaviour. As it is only considered through its interactions with the environment, it is also modelled as an IOLTS $I = (Q^I, A^I, T^I, q^I_{init})$, with $A^I = A^I_I \cup A^I_O$ and $A_I \subseteq A^I_I$.
Test Purpose
A test purpose defines a property on some particular interactions between the IUT and the tester. It consists of two parts: a behavioural part and a constraint part. The constraint part gives some property on the state of the implementation. It can be seen as computable by the environment and will be modelled by an input for the tester. Thus it is integrated in the behavioural part.
Definition 2.1 (Test Purpose behaviour) A test purpose behaviour is a deterministic acyclic IOLTS $TP = (Q^{TP}, A, T^{TP}, q^{TP}_{init})$ satisfying the controllability condition and with a set of distinguished states $Accept \subseteq Q^{TP}$ with no successor.
Test Cases
A test case is a set of sequences of actions describing all the interactions occurring between an IUT and a tester which wants to verify that an implementation conforms to the specification according to a test purpose. In an industrial context, test cases are often described using the Tree and Tabular Combined Notation (TTCN [ISO92]). Some transitions are decorated with verdicts with the following informal meaning:
(PASS): the test purpose is satisfied by the current sequence. However, a sequence leading back to the initial state (the postamble) must be applied before another test case can be carried out. This is a temporary verdict, as the application of the postamble may produce Fail verdicts.
PASS: a definitive verdict meaning that the initial state has been reached after a (PASS) verdict. The sequence between (PASS) and PASS is a postamble.
FAIL: non-conformance of the IUT.
INCONCLUSIVE: this verdict is used in practice when a reception is allowed in the specification but cannot lead to a (PASS), or leads to a behaviour that is not considered in the test case because testing cannot be exhaustive in practice.
We consider a conformance relation quite similar to those in [Tre95, Pha94a]. Informally, the conformance relation states that outputs of the environment which are not accepted by the specification may be accepted by the implementation, but inputs produced by the implementation must also be produced by the specification.
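The relation described informally above (stated formally as ioconf in Definition 2.4) can be checked on small finite examples. The following sketch is an assumption-laden illustration rather than part of the test generation method: it explores the pairs ($I$ after $\sigma$, $S$ after $\sigma$) for $\sigma \in traces(S)$ and compares the enabled inputs. Function names and the toy systems are hypothetical; recall that inputs are the actions initiated by the IUT and observed by the environment.

```python
def enabled(transitions, states):
    """Actions enabled in at least one state of `states`."""
    return {a for (p, a, _) in transitions if p in states}


def step(transitions, states, a):
    """States reached from `states` by one transition labelled a."""
    return frozenset(r for (p, b, r) in transitions if p in states and b == a)


def ioconf(impl, impl_init, spec, spec_init, inputs):
    """Check that I(I after s) is included in I(S after s) for all traces s of S."""
    start = (frozenset({impl_init}), frozenset({spec_init}))
    todo, seen = [start], {start}
    while todo:
        i_states, s_states = todo.pop()
        i_inputs = enabled(impl, i_states) & inputs
        s_inputs = enabled(spec, s_states) & inputs
        if not i_inputs <= s_inputs:
            return False
        # Only traces of the specification are followed.
        for a in enabled(spec, s_states):
            nxt = (step(impl, i_states, a), step(spec, s_states, a))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


inputs = {"?ack", "?err"}
spec = {("s0", "!req", "s1"), ("s1", "?ack", "s0")}
# Accepts an extra environment output (!ping): still conforming.
good = {("i0", "!req", "i1"), ("i1", "?ack", "i0"), ("i0", "!ping", "i0")}
# Produces an input (?err) that the specification does not allow after !req.
bad = {("i0", "!req", "i1"), ("i1", "?err", "i0")}
```

Here `ioconf(good, "i0", spec, "s0", inputs)` holds, while `bad` is rejected because it produces `?err` after `!req`.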
Definition 2.4 (Test conformance relation) Let $S$ and $I$ be two IOLTS describing the external view of a specification and an implementation. $I$ ioconf $S$ if and only if $\forall \sigma \in traces(S),\ I(I \text{ after } \sigma) \subseteq I(S \text{ after } \sigma)$.
Construction rules
The essence of the on-the-fly method is to traverse a kind of synchronous product between two graphs, one for the specification and the other for the property to be checked. We first define this synchronous product. Then we give the rules for the test case construction, including decoration with verdicts and timers. Finally we give some properties of the generated test cases.
Synchronous product
A transition is firable in the product if either it is firable in the two components or it is firable only in the specification.
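The firing condition above can be illustrated by a small traversal of the product with an explicit stack. This is a rough sketch, not the construction rules themselves: it assumes both components are deterministic, and it assumes that when only the specification can fire, the test purpose component moves to a placeholder `OUT`, a hypothetical marker introduced here to record that the run has left the test purpose.

```python
OUT = "out"  # hypothetical marker: the test purpose no longer follows the run


def moves(transitions, p):
    """Map each action enabled in p to its (unique) target state."""
    return {a: q for (r, a, q) in transitions if r == p}


def product(tp, tp_init, spec, spec_init):
    """Edges of the product reachable from the pair of initial states."""
    start = (tp_init, spec_init)
    edges, seen, stack = set(), {start}, [start]
    while stack:
        p_tp, p_s = stack.pop()
        tp_moves = moves(tp, p_tp)
        # Every action of the specification is firable in the product;
        # the test purpose follows it when it can.
        for a, q_s in moves(spec, p_s).items():
            target = (tp_moves.get(a, OUT), q_s)
            edges.add(((p_tp, p_s), a, target))
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return edges


spec = {("s0", "!req", "s1"), ("s1", "?ack", "s2"), ("s1", "?nak", "s0")}
tp = {("t0", "!req", "t1"), ("t1", "?ack", "t2")}
edges = product(tp, "t0", spec, "s0")
```

In this example `?nak` is firable only in the specification, so the product contains the edge from `("t1", "s1")` to `(OUT, "s0")`. Transitions that are firable only in the test purpose never appear in the product.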
Definition 3.1 (synchronous product) We define the product $P = (Q^P, A, T^P, (q^{TP}_{init}, q^S_{init}))$, with $Q^P \subseteq Q^{TP} \times Q^S$, where $Q^P$ and $T^P$ are the smallest sets obtained by application of the following rules:
Timers
Timers are useful in practice in order to guard against implementation deadlocks. The management of timers is performed on the DAG generated by the test generation rules (3.2). As timers depend on inputs, we associate a timer $t_i$ with each input $i$ labelling a transition in the test case. Three operations on a timer are available: $Start(t_i)$, which initializes the timer and must be done as soon as input $i$ is expected; $Cancel(t_i)$, which is done when $i$ is received or when, due to a choice, $i$ is no longer expected; and $Timeout(t_i)$, which represents the observation of a deadlock when waiting for $i$.
Let $(p^{TP}, p^S) \xrightarrow{\alpha}_T (q^{TP}, q^S)$ be a transition of the synchronous product and $t : (n, \alpha, m)$ the corresponding transition in the DAG. Let $Ind((p^{TP}, p^S))$ be the independency relation which represents concurrency: it is a binary symmetric relation defined on the inputs of a state, and two inputs are independent if they may be received in any order. We denote by $Running(n)$ the set of timers that have been started in the sequences leading to $n$ and have not yet been cancelled; $Cl$ and $St$ are the sets of timers that have to be respectively cancelled and started after action $\alpha$. Finally, $discard(t)$ means that $t$ is discarded from the DAG. The following rules specify the management of timers:
Init: Like specifications, test cases start with an output; thus if $r$ is the root of the DAG, $Running(r) = \emptyset$.
Cancel and Start: for $t : (n, \alpha, m) \in DAG$ with $Running(n) = R$,
$Cl = \{t_i \mid i \in I((p^{TP}, p^S)) \wedge (\alpha, i) \notin Ind((p^{TP}, p^S))\}$
$St = \{t_i \mid i \in I((p^{TP}, p^S))\} \setminus (R \setminus Cl)$
i.e. all timers corresponding to inputs not concurrent with $\alpha$ must be cancelled, and a timer must be started for each input available in $m$ if it is not already running in $n$, except if it has just been cancelled.
Timeouts: We suppose that $\delta \in A_I$. By the construction and verdict rules, in each node of the DAG there is a transition labelled $\delta$, and its verdict may be (PASS) if $\delta$ is in the test purpose, Inconclusive if it is in the specification, or Fail otherwise.
If an input $i$ ($i$ may be $\delta$) is possible in a state of the synchronous product, a transition labelled by $Timeout(t_i)$ is added. The verdict assigned to this timeout must be the same as the verdict assigned to $\delta$.
Discard: For each transition $t : (n, \delta, 1) \in DAG$, apply $discard(t)$.
Another depth-first search is performed on the DAG to generate the timer operations. Unlike the DAG construction, which works by synthesis (just around the pop operation), the operations on timers are generated before the exploration of the state successors (around the push operation). The running set associated with each state is initialized to the empty set at the initial state. It is inherited by a state from its predecessor. During this step, on the one hand each transition of the DAG is decorated with cancel and start operations on timers, and on the other hand some transitions labelled by timeout are added, following the previous rules.
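The Cancel and Start rule can be sketched as a small set computation. Several points are assumptions of this example: timers are identified by the name of their input, the independency relation is encoded as a set of unordered pairs, $St$ is computed over the inputs available in the target node $m$ (following the prose reading of the rule), and the running set propagated to $m$ is taken to be $(R \setminus Cl) \cup St$, which is an assumption of this sketch. The function name `timer_update` is hypothetical.

```python
def timer_update(alpha, inputs_at_source, inputs_at_target, independent, running):
    """Return (Cl, St, running set after the transition labelled alpha)."""
    # Cancel every timer whose input is not concurrent with alpha.
    # Note that (alpha, alpha) is never independent, so the timer of a
    # received input is always cancelled.
    cl = {i for i in inputs_at_source
          if frozenset((alpha, i)) not in independent}
    # Start a timer for each input expected next, unless it is still running.
    st = set(inputs_at_target) - (set(running) - cl)
    # Assumption: timers kept from n plus the newly started ones.
    running_after = (set(running) - cl) | st
    return cl, st, running_after


# ?a and ?b may arrive in any order; both timers are running when ?a arrives.
ind = {frozenset({"?a", "?b"})}
cl, st, running = timer_update(
    alpha="?a",
    inputs_at_source={"?a", "?b"},
    inputs_at_target={"?b", "?c"},
    independent=ind,
    running={"?a", "?b"},
)
```

In this example the timer for `?a` is cancelled, the timer for `?b` keeps running because `?b` is concurrent with `?a`, and a new timer is started for `?c`.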
Results
Proposition 3.1 Let $P$ be the synchronous product between $S$ and $TP$ (definition 3.1) and $T(TP, S)$ be the DAG synthesized by applying the rules of definition 3.1. If $(q^{TP}_{init}, q^S_{init})$ is the root of the DAG, then $q^{TP}_{init} \preceq q^S_{init}$, where $\preceq$ denotes the consistency preorder; otherwise the test purpose and the specification are not consistent.
Let $OT(S) = \{TP \in IOLTS \mid q^{TP}_{init} \preceq q^S_{init}\}$ be the set of test purposes which are consistent with the specification $S$ ($\preceq$ being the consistency preorder). Let $TS(S) = \{T(TP, S) \mid TP \in OT(S)\}$, i.e. the set of test cases (test suite) that can be constructed for a specification $S$. For $T \in TS(S)$, we denote by $MaxTraces(T) = \{\sigma \mid A(T \text{ after } \sigma) = \emptyset\}$ the set of maximal traces of $T$. For $\sigma = \sigma'.\alpha \in MaxTraces(T)$ and for an implementation $I$, we define $verdict(\sigma, I) = verdict(T \text{ after } \sigma', \alpha, 1)$. Notice that $T$ is deterministic, thus $T \text{ after } \sigma'$ is unique. We have the two following results:
This second proposition is not exactly the converse of the first. Implementations can be non-deterministic, so the application of the same sequence of tester actions may produce different verdicts. Thus, like other authors [Pha94a], we assume a bounded fairness hypothesis on implementations. Informally, this means that a bounded number of executions of a non-deterministic implementation will show all its behaviours. For $n \in \mathbb{N}$, we define $verdict(n, \sigma, I)$ to be Fail if one of the $n$ applications of $\sigma$ on $I$ produces a Fail verdict, and Pass otherwise.
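The definition of $verdict(n, \sigma, I)$ amounts to a simple aggregation over repeated runs. In the sketch below, `apply_once` is a hypothetical callable standing for one application of $\sigma$ on $I$, and `replay` is a helper invented for this example that plays back a fixed list of verdicts in place of a real implementation.

```python
def replay(outcomes):
    """Hypothetical stand-in for an implementation: replays fixed verdicts."""
    iterator = iter(outcomes)
    return lambda: next(iterator)


def bounded_verdict(n, apply_once):
    """Fail if one of the n applications yields Fail, Pass otherwise."""
    results = [apply_once() for _ in range(n)]
    return "Fail" if "Fail" in results else "Pass"


# A non-deterministic implementation whose third run exhibits the fault.
runs = ["Pass", "Inconclusive", "Fail"]
after_three = bounded_verdict(3, replay(runs))
after_two = bounded_verdict(2, replay(runs))
```

With these runs, three applications give Fail while two give Pass, which shows why the bound in the fairness hypothesis matters.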
Experimentation
The algorithms and transformations described in the previous sections have been developed in the CADP toolbox [FGM+92] as a software component named TGV (for Test Generation using Verification techniques). In order to prove the feasibility of the approach, we have applied TGV to an industrial protocol, the DREX protocol.
TGV
As we were primarily interested in demonstrating the feasibility of our approach before a real implementation, the algorithms are not yet all combined into a single on-the-fly algorithm. We have used the Geode simulator [ALHH93] from Verilog as an SDL [CCI88] front-end which produces state graphs representing the behaviour of a specification, constrained by the test purpose constraints.
Thus the inputs of TGV are a state graph produced by Geode (from an SDL specification of the protocol) and an automaton formalizing the behavioural part of a test purpose. The output is the behaviour description and constraint definitions of a test case in the standard TTCN format [ISO92].
This output is produced in several steps. The first step takes as input the state graph produced by Geode and transforms it into a graph representing the observable behaviour of the protocol specification in the testing environment (external view graph). Several transformations are performed in this step: abstraction of unobservable internal actions, determinization, mirror image (which transforms inputs into outputs and vice versa) and construction of the diamonds modelling the concurrency introduced by the asynchronous interaction between the tester and the IUT. The next step is the kernel of TGV. Its output is the DAG, which contains all the information needed in TTCN test cases. The last step takes the DAG as input. The algorithm extracts the message parameters from the transition labels and produces the constraint part in TTCN GR format. The remaining graph is unfolded into a tree describing the behavioural part of the test case in TTCN GR format. Finally the constraint and behavioural parts of the test case are translated into the graphical format TTCN GR.
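The unfolding of the DAG into a tree in the last step can be pictured with a short recursive sketch. The encoding of the DAG as a dictionary from node to a list of `(label, child)` pairs, the function name `unfold` and the example labels are assumptions of this illustration; the actual TTCN output format is not modelled.

```python
def unfold(dag, node):
    """Unfold the part of the DAG below `node` into a nested tree.

    Each tree is a list of (label, subtree) pairs; a node shared by
    several paths in the DAG is duplicated once per path.
    """
    return [(label, unfold(dag, child)) for (label, child) in dag.get(node, [])]


# Hypothetical diamond: two outputs lead to n1 and n2, which share n3.
dag = {
    "n0": [("!a", "n1"), ("!b", "n2")],
    "n1": [("?x", "n3")],
    "n2": [("?x", "n3")],
    "n3": [("?done", "n4")],
}
tree = unfold(dag, "n0")
```

The recursion terminates because the graph is acyclic; the price of the tree form is that the behaviour below a shared node such as `n3` is repeated on every branch that reaches it.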
Experiment with the DREX protocol
TGV has been used during an industrial contract for the Direction Générale pour l'Armement. The protocol used for the experiment was a military protocol called the DREX protocol, which allows access to the transit network Socrate of the French Army, defined in the framework of the Integrated Service Military Network. This protocol was chosen for three main reasons: firstly, we wanted to prove the feasibility of automatic test generation methods on realistic specifications; secondly, an SDL specification of a similar protocol was already available; and finally, hand-written test suites had already been produced. This last point is important, as the hand-written test cases have served as a basis for comparison with the automatically generated test suites.
The SDL specification models the behaviour of the DREX protocol on the network, communicating asynchronously with two users through two PCOs. The size of the SDL specification was about 2000 lines. 54 test purposes have been considered and 54 corresponding test cases have been generated. The time needed for the generation of a test case has to be separated into two parts: the graph generation with Geode, which took between 3.5 s and 400 s, and the test case generation with TGV, which took between 1 s and 2 s.
We have compared the test suites generated automatically by TGV with the hand-written test suites in a qualitative way. Even though TGV is just a prototype, all the hand-written test suites, or similar ones, have been generated. The differences that were observed were principally due to the fact that TGV treats concurrency and timers systematically. For example, in some hand-written test cases, concurrency between events had been forgotten, with the risk of an incorrect verdict. Some differences were also due to the formal interpretation of test purposes. More details and other quantitative results of this study can be found in [FJJV96].
Conclusion
In this paper, we have shown how on-the-fly verification techniques can be used in the generation of test suites. Starting from an already known conformance relation and from the experience gained with the analysis of hand-written test cases, we have formally defined the rules allowing the construction of complete test cases, with preambles, postambles, verdicts and timers. These rules allowed us to prove that generated test cases are sound (correct implementations are not rejected) and exhaustive (if we assume a fairness hypothesis on implementations under test, incorrect implementations can be detected) with respect to the conformance relation. A depth-first search algorithm implementing these rules has been described. A first version of this algorithm has been implemented in a prototype named TGV which produces TTCN test suites from SDL specifications. TGV has been tried out on an industrial protocol, demonstrating the efficiency and maturity of the algorithm.
The next step in this study will be the development of a new prototype which will incorporate the algorithm described in this paper into a single on-the-fly algorithm, and its integration in a complete validation tool. Another continuation of this work is to deal with concurrent testing and its links with interoperability testing.
