Abstract. In this paper we show how to automatically generate test sequences aimed at testing the interconnections of embedded and communicating systems. Our proposal is based on the connectivity fault model proposed in [8], where faults may occur in the interface between the software and its environment rather than in the software implementation. We show that the test generation task can be carried out by solving a reachability problem in a system consisting essentially of a specification of the communicating system and its fault model. Our technique can be applied using most off-the-shelf model-checking tools to synthesize minimal test sequences, and we demonstrate it using the UppAal real-time model-checker. We present two algorithms for generating minimal tests: one for single faults and one for multiple faults. Moreover, we demonstrate how to exploit the unique time- and cost-planning facilities of UppAal to derive cheapest possible test suites for restricted types of timed systems.
Introduction
Testing modern embedded and communicating systems is a challenging task. In part, this is due to their complex communication patterns and to their reduced controllability and observability, caused by the embedding and close integration with hardware. Although testing is the primary validation technique used by industry today, it remains quite ad hoc and error prone. Therefore there is a high demand for systematic and theoretically well-founded techniques that work in practice and that can be supported by automated test tools.
A promising approach to improve the effectiveness of testing is to base test generation on an abstract formal model of the system under test (SUT) and use a test generation tool to generate and execute test cases, either automatically or guided by the user. A main problem is to automatically generate and select a reasonably small number of effective test cases that can be executed within the time allocated to testing. This paper presents a technique for formal model-based black-box behavioral testing of embedded systems, using extended finite state machines as models, where a particular fault model, connectivity faults, is used to select test cases. Moreover, we demonstrate how such test cases can be generated using the diagnostic trace facility of a standard, unmodified model-checking tool using standard reachability analysis.
Connectivity Testing
An embedded system may, as presented in [8], ideally be regarded as consisting of embedded software encapsulated by hardware, as depicted in Figure 1(a), where all communications to and from the software pass through the hardware. This is visualized by letting the inputs from the system environment to the software (a, b, c, d, e) pass through the hardware towards the software via connections (the unlabeled arrows). Likewise the outputs (0, 1, 2, 3) generated by the software have to pass via connections through the hardware in order to emerge at the system environment. A connection is by assumption related to exactly one input or output. This assumption implies a one-to-one correspondence between external inputs to the system and the inputs to the embedded software; likewise there is a one-to-one correspondence between the outputs from the software and the outputs from the system.
Ideally it should be ascertained that the specification of the software component is correct. For instance, it may have been verified by some FSM verification technique. Then, exploiting the ability to automatically generate executable code from specifications and assuming a careful construction of such compilers, it would be reasonable to expect the generated code to be correct with respect to the specification, that is, the two exhibit the same FSM behaviour.
In the composition of the two system components it then follows that the hardware (and probably drivers managing the interaction between the hardware and the software or malfunctioning sensors and actuators) may be the only error prone part. Therefore, in order to manage the multitude of potential errors we shall make an abstraction and regard the hardware (and the drivers, sensors, and actuators) as a black box interfacing the embedded software through the connections. As a consequence system errors may now only be referred to in terms of the connections.
In the system in Figure 1(a) a fault could for instance be that one of the input connections is missing, as shown in Figure 1(b), where the b-input is disconnected. In the physical world, say for a mobile phone, this may correspond to the situation where some button of the phone is not connected, so that the software never receives the input and pressing the button has no effect.
To ensure that the faults are testable they are assumed to be permanent. Testing in order to detect the kind of faults addressed in this paper is a matter of providing sequences of inputs that will reveal the missing connections. If, say, the b-input connection in Figure 1(a) is missing, this may be revealed by an input sequence that contains b and after which an expected output event is not produced or (in case the system is not input enabled) an expected input is not allowed.
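The effect of a missing connection can be illustrated with a minimal Python sketch. It assumes a plain Mealy-style machine (the setting of [8]) rather than the I/O EFSM used later, and the machine `SPEC`, its state names, and the helper names `disconnect` and `run` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (state, input) -> (next state, output or None); a hypothetical toy machine
Fsm = Dict[Tuple[str, str], Tuple[str, Optional[str]]]


def disconnect(fsm: Fsm, alpha: str) -> Fsm:
    """Mutant in which input alpha never reaches the software:
    every alpha-transition becomes a silent self-loop."""
    return {
        (s, i): ((s, None) if i == alpha else (t, o))
        for (s, i), (t, o) in fsm.items()
    }


def run(fsm: Fsm, start: str, inputs: List[str]) -> List[Optional[str]]:
    """Feed the inputs one by one and collect the observed outputs."""
    state, observed = start, []
    for i in inputs:
        state, out = fsm[(state, i)]
        observed.append(out)
    return observed


# Assumed specification: output "1" is produced only after a, b, a.
SPEC: Fsm = {
    ("s0", "a"): ("s1", None),
    ("s0", "b"): ("s0", None),
    ("s1", "a"): ("s1", None),
    ("s1", "b"): ("s2", None),
    ("s2", "a"): ("s0", "1"),
    ("s2", "b"): ("s2", None),
}
MUTANT_B = disconnect(SPEC, "b")

# A sequence containing b reveals the fault: the expected output is missing.
print(run(SPEC, "s0", ["a", "b", "a"]))      # [None, None, '1']
print(run(MUTANT_B, "s0", ["a", "b", "a"]))  # [None, None, None]
```

A sequence without b, such as a, a, gives the same observations on both machines and therefore cannot reveal the disconnected b-input.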
We require that test generation is sound and complete, but from a practical perspective the generated suite should also be cost effective, e.g., in terms of test execution time. Thus, the suite should be minimized in the number of tests and length.
Contributions
We provide algorithms for generating tests for embedded systems with respect to fault models for input connectivity errors, where it is assumed that the embedded software of the system under test behaves as an FSM. By exploiting a real-time model-checker like UppAal [13] we are able to generate timed test sequences. However, to ease presentation we define our algorithms in the untimed setting of I/O deterministic EFSM (previous work [8] defined connectivity errors in terms of Mealy machines). We prove that a minimal length sound and complete test (with respect to single connectivity faults) can be found via a reachability question on a composition of the system model, its fault model, and a simple environment model. We extend the basic algorithm to generate a minimal length test for multiple connectivity faults, and we prove its soundness and completeness.
Previous work [8] provided dedicated, heuristic polynomial time reduction algorithms; ours always produce a minimal test suite (at the expense of increased complexity). Our algorithms can be implemented in most model-checking tools, and are additionally valid for a particular class of timed automata when a real-time model-checker like UppAal is used. UppAal symbolically solves clock constraints to perform reachability analysis on a network of timed automata, and produces a timed diagnostic trace (an alternating sequence of discrete transitions and time delays) to explain how the property is (or is not) satisfied. We demonstrate the applicability of the algorithms on a medium size example (a cruise controller), both in an untimed and a timed version, and indicate how the unique time- and cost-optimizing features of UppAal can be used to generate optimal tests.
The paper is organized as follows: Section 2 formally presents I/O EFSMs and tests. Section 3 presents the modelling of connectivity faults and illustrates how to test for such faults. Section 4 and Section 5 present the algorithms for single and multiple faults, respectively. Section 6 presents the case study, and Section 7 elaborates on the generation of time- and cost-optimal tests using UppAal's unique diagnostic trace features. Section 8 concludes and outlines future work.
Related Work
The use of diagnostic traces produced by model-checkers as test sequences has been proposed by many others [2, 3, 5-7, 9, 10, 11, 15]. A simple approach is based on manually stated test purposes (i.e., specific observation objectives to be made on the system under test), such as observing a given output, or bringing the SUT to a given state or mode, see e.g. [6]. The test purpose is then formalized and translated to a logical (reachability) property to be analyzed by a model-checker. The resulting diagnostic trace is interpreted as a test case for that test purpose.
Another common approach is based on producing test suites that satisfy some coverage criterion of the specification, e.g. state or transition coverage, def-use pair coverage, MC/DC coverage etc. The simplest way of realizing, e.g., transition coverage is to formulate a property for each transition separately and use the model checker to produce a test case for each transition. More advanced techniques will naturally try to reduce the size of the test suite by removing redundant prefix-traces [15] or by composing test cases by generating (minimal [9]) transition tours [9, 11].
In [2] mutation testing is considered, although in a different setting from ours (software testing). Mutations are used for generating tests for implementations of FSMs using model checking. Given are an FSM, M, and a constraining temporal logic formula, φ. A mutation may be either a change of a transition in M or a change of φ. For each mutation a test is generated as a counterexample explaining why the mutated model does not satisfy the formula (if it does not). Duplicates and tests that are prefixes of other tests are removed; hence, unlike in our case, the resulting test suite is not guaranteed to be the smallest possible.
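The reduction step mentioned for [2], removing duplicates and tests that are prefixes of other tests, can be sketched in a few lines of Python. This is only an illustration of that filtering step under the assumption that a test is a plain sequence of action labels; the function name is hypothetical and the code is not taken from [2].

```python
from typing import List, Sequence


def remove_duplicates_and_prefixes(tests: Sequence[Sequence[str]]) -> List[List[str]]:
    """Keep only tests that are not a proper prefix of another test."""
    unique = sorted(set(tuple(t) for t in tests))
    kept = [
        t for t in unique
        if not any(other != t and other[:len(t)] == t for other in unique)
    ]
    return [list(t) for t in kept]


suite = [["a"], ["a", "b"], ["a", "b"], ["c"]]
print(remove_duplicates_and_prefixes(suite))  # [['a', 'b'], ['c']]
```

Such filtering only discards redundant tests; it never merges two tests into a shorter combined one, which is why it gives no guarantee of a smallest suite.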
I/O EFSM
In this section we define input/output extended finite state automata (I/O EFSM) and their semantics.
Definition 1. An I/O EFSM is a tuple $M = (S, s_0, I, O, X, \rightarrow)$, where S is a finite set of states, $s_0 \in S$ is the initial state, I and O are finite disjoint sets of input and output labels respectively, X is a finite set of integer variables, and $\rightarrow \subseteq S \times G_X \times (I \cup O) \times A_X \times S$ is a set of transitions. Here $G_X$ is a set of guards over the variables in X, and $A_X$ is a set of finite sequences (possibly the empty sequence $\epsilon$) of assignments to variables in X. Each guard is a boolean expression over integer constants and variables in X, and each assignment is of the form $x := e$, where $x \in X$ and $e \in E_X$ is an arithmetical expression over the variables in X and integer constants.
We write $s \xrightarrow{g, \alpha, a} s'$ whenever $(s, g, \alpha, a, s') \in \rightarrow$. Often we write α! (α?) whenever α is an output (input) symbol. Note that, for reasons of clarity and ease of presentation, we have omitted internal τ actions; our algorithms can easily be adapted to handle these as well.
Semantics
The semantics of an I/O EFSM M is a labelled transition system defined wrt. a valuation function assigning values to the variables of M and used to evaluate the guards on transitions.
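Definition 2. A valuation v for a set of integer variables X is a function $v : X \to \mathbb{Z}$.
We let $V_X$ denote the set of all valuations for X. $0_X \in V_X$ is the valuation where $0_X(x) = 0$ for all $x \in X$.
Given a valuation $v \in V_X$, the value of a guard $g \in G_X$ with respect to v, denoted by v(g), is the obvious evaluation of the boolean expression g relative to v. Moreover, for a sequence of assignments $a \in A_X$ we let $v[a]$ denote the valuation obtained from v by performing the assignments in order: $v[\epsilon] = v$ and $v[(x := e)\, a'] = (v[x \mapsto n])[a']$,
where n is the value obtained by evaluating expression e using the valuation v.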
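Valuations, guards, and assignment sequences can be given a rough executable reading as follows. This is a sketch under the assumption that guards and expressions are represented as Python callables over a dictionary, and that assignments are performed one after the other; all names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Valuation = Dict[str, int]
Guard = Callable[[Valuation], bool]
Expr = Callable[[Valuation], int]
Assignment = Tuple[str, Expr]  # the pair (x, e) stands for x := e


def zero_valuation(variables: List[str]) -> Valuation:
    """The valuation mapping every variable to 0."""
    return {x: 0 for x in variables}


def apply_assignments(v: Valuation, assignments: List[Assignment]) -> Valuation:
    """Perform the assignments in order and return the new valuation.
    Each expression is evaluated in the valuation reached so far
    (an assumed sequential reading); v itself is left unchanged."""
    result = dict(v)
    for x, e in assignments:
        result[x] = e(result)
    return result


v0 = zero_valuation(["x", "y"])
guard: Guard = lambda v: v["x"] == 0
updates: List[Assignment] = [
    ("x", lambda v: v["x"] + 1),  # x := x + 1
    ("y", lambda v: v["x"] * 2),  # y := x * 2, sees the updated x
]
print(guard(v0))                       # True
print(apply_assignments(v0, updates))  # {'x': 1, 'y': 2}
```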
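The semantics of an I/O EFSM is defined as a labelled transition system.
Definition 3. The semantics of an I/O EFSM M is the labelled transition system $T_M = (S \times V_X, L, (s_0, 0_X), \longrightarrow)$, where $L = I \cup O$ and $\longrightarrow$ is the least relation satisfying: $(s, v) \xrightarrow{\alpha} (s', v')$ whenever M has a transition from s to s' with guard g, label α and assignments a such that v(g) is true and v' is the valuation obtained from v by performing the assignments a.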
We say that a transition system is I/O deterministic if for any state there is at most one output transition and, for any input, at most one input transition. M is I/O deterministic if its induced transition system $T_M$ is I/O deterministic.
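The I/O determinism condition can be checked mechanically on a finite transition system. The following Python sketch assumes that transitions are given as (source, label, target) triples and that every label not listed as an output is an input; the function name is hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Set, Tuple

Transition = Tuple[Hashable, str, Hashable]  # (source, label, target)


def is_io_deterministic(transitions: Iterable[Transition], outputs: Set[str]) -> bool:
    """True if every state has at most one output transition and,
    for each input label, at most one transition with that label."""
    out_count: Counter = Counter()
    in_count: Counter = Counter()
    for src, label, _tgt in set(transitions):  # ignore repeated triples
        if label in outputs:
            out_count[src] += 1
        else:
            in_count[(src, label)] += 1
    return all(c <= 1 for c in out_count.values()) and \
        all(c <= 1 for c in in_count.values())


ok = [("s0", "a", "s1"), ("s0", "b", "s0"), ("s1", "x", "s0")]
two_outputs = ok + [("s1", "y", "s0")]
ambiguous_input = ok + [("s0", "a", "s0")]
print(is_io_deterministic(ok, {"x", "y"}))               # True
print(is_io_deterministic(two_outputs, {"x", "y"}))      # False
print(is_io_deterministic(ambiguous_input, {"x", "y"}))  # False
```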
Two automata M and M' are equivalent, $M \sim M'$, if the initial states in $T_M$ and $T_{M'}$ are trace equivalent.
We only consider the parallel composition of I/O EFSMs at the semantic level. For convenience, and without loss of generality, we assume all machines have the same variables. It follows from the definition that output actions are broadcast.
Definition 4. Let $T_i = (S_i \times V_X, L_i, (s_0^i, 0_X), \longrightarrow_i)$, $i = 1, \ldots, k$, be I/O EFSM induced labelled transition systems. The parallel composition $\Pi_{i=1}^{k} T_i$ is defined by $\Pi_{i=1}^{k} T_i = ((S_1 \times \ldots \times S_k) \times V_X, L, ((s_0^1, \ldots, s_0^k), 0_X), \longrightarrow)$ where $L = \bigcup_{i=1}^{k} L_i$,
and −→ is the least relation satisfying
−→ n and v is a valuation accumulating all the updates v 1 , . . . , v k .
Tests
In our setting a test is an I/O EFSM except that each state is annotated by either the verdict pass or fail.
has precisely one complete run.
Modelling and Testing Connectivity Faults
As mentioned previously, a connection is assumed to be related to exactly one input. That is, when the connection related to a given input (say α) is faulty, the software will not receive any α-input, i.e. the state of the software will remain unchanged whenever the environment makes an input to the system via α. We can therefore model a connectivity fault as a so-called mutation $M[\alpha]$ of a correct model M by changing all α-transitions so that the state is not changed. This is made precise in the following definition:
The Test Generation Algorithm
In this section we present an algorithm for generating a test that distinguishes an I/O EFSM from a single mutation (if they are distinguishable). In the algorithm we use the following two operators: $M^?$ is M where outputs become inputs, and $M(x := e)$ is M where on any transition x is updated by e. Formally we have
The Algorithm
The problem we want the algorithm to solve is the following: Intuitively, the idea behind the algorithm is to put M and its mutation together in parallel with a third machine, the environment E. Only E is allowed to submit actions, the other machines are modified to contain solely input actions. The role of E is to broadcast actions such that whenever the two other machines do not agree on receiving an action (recall they are I/O deterministic) a fault has been detected. The algorithm searches for a shortest possible trace of actions broadcast by E leading to a fault.
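The search idea can be sketched in Python as a breadth-first search over pairs of states. The sketch makes several simplifying assumptions that are not part of the construction above: machines are plain finite state machines without variables, inputs and outputs are both treated as labels that the environment may offer, and the toy machine `SPEC` and all function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional

Machine = Dict[str, Dict[str, str]]  # state -> {label: next state}


def mutate(m: Machine, alpha: str) -> Machine:
    """Connectivity mutant: every alpha-transition leaves the state unchanged."""
    return {s: {a: (s if a == alpha else t) for a, t in moves.items()}
            for s, moves in m.items()}


def shortest_distinguishing_trace(m: Machine, mutant: Machine,
                                  start: str) -> Optional[List[str]]:
    """Breadth-first search for a shortest label sequence after which
    exactly one of the two machines can perform the last label."""
    queue = deque([((start, start), [])])
    seen = {(start, start)}
    while queue:
        (s1, s2), trace = queue.popleft()
        moves1, moves2 = m.get(s1, {}), mutant.get(s2, {})
        for label in sorted(set(moves1) | set(moves2)):
            if (label in moves1) != (label in moves2):
                return trace + [label]  # the machines disagree: fault detected
            nxt = (moves1[label], moves2[label])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [label]))
    return None  # the mutant cannot be distinguished from m


# Hypothetical machine: inputs a and b, output "1" after a followed by b.
SPEC: Machine = {
    "s0": {"a": "s1"},
    "s1": {"a": "s1", "b": "s2"},
    "s2": {"1": "s0"},
}
print(shortest_distinguishing_trace(SPEC, mutate(SPEC, "b"), "s0"))  # ['a', 'b', '1']
```

Because the search is breadth-first and each pair of states is expanded only once, the first disagreement found yields a trace of minimal length, which mirrors the role of shortest diagnostic traces in a model checker.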
Pseudo Code
1. Let x, y and z be disjoint variables none of which belong to X. and
The correctness of the construction of the test automaton follows from the theorem below. 
Example
If we apply the algorithm on input M (Figure 3) and action b, then the three machines put in parallel, $M_1$, $M_2$, and E, are as shown in Figure 3. For illustrative clarity x++ is taken to mean x incremented modulo 2. The test $M^{\mathit{fail}}_{awbc}(\{w\}, \{a, b, c\})$ is a minimal length test that may be constructed by the algorithm. Clearly, awbc is a shortest possible sequence leading to a state in $T_E \parallel T_{M_1} \parallel T_{M_2}$ where $x \neq y$, and since only $M_2$ can engage in the last event c, the value of z is $c_2$.
Next, we generalize the algorithm above such that a whole suite of test automata is generated for a set of mutations (if all mutations are distinguishable from M).
The Algorithm
The problem the algorithm solves is The main idea is to extend the previous algorithm by running all mutants concurrently, but tightly synchronized, with the unmutated automaton M . Whenever the unmutated machine M cannot match a transition by one of its mutations a connectivity error has been detected, and M needs to be reset (and only then) to extend the sequence to kill more mutants. 
Pseudo Code
To control when M has to be reset, every α transition by M is now followed by a new output action called go, which intuitively acknowledges to the environment E that M could match α. After having output an action, E waits for this acknowledgement before it sends a new action. If the acknowledgement does not arrive, E knows that M could not perform the action, implying that a test has been found for at least one of the mutations. In that case the only possible synchronization is a reset between M and the environment automaton.
In order to detect when a connectivity error has been identified we introduce an observation automaton $M_\alpha$ for each mutation α. It consists of two states and one transition that fires when M and $M[\alpha]$ do not agree on some input or output transition, i.e. when $x \neq x_\alpha$. All mutations have been revealed when all observation automata have fired, i.e. when all $y_\alpha = 1$.
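The bookkeeping done by the observation automata can be imitated by replaying a candidate sequence against M and all mutants in lock step and recording which mutants have disagreed with M. The Python sketch below only replays a given sequence; it does not search for one. It assumes plain finite state machines without variables, a special label "reset" that returns all machines to the initial state, and hypothetical names throughout.

```python
from typing import Dict, List, Set

Machine = Dict[str, Dict[str, str]]  # state -> {label: next state}


def mutate(m: Machine, alpha: str) -> Machine:
    """Connectivity mutant: every alpha-transition leaves the state unchanged."""
    return {s: {a: (s if a == alpha else t) for a, t in moves.items()}
            for s, moves in m.items()}


def killed_mutants(m: Machine, mutants: Dict[str, Machine],
                   start: str, trace: List[str]) -> Set[str]:
    """Names of mutants that disagree with m somewhere along the trace
    (the analogue of the observation flags being set to 1)."""
    killed: Set[str] = set()
    state, stuck = start, False
    mstates = {name: start for name in mutants}
    for label in trace:
        if label == "reset":
            state, stuck = start, False
            mstates = {name: start for name in mutants}
            continue
        if stuck:            # m could not follow; wait for the next reset
            continue
        m_can = label in m.get(state, {})
        for name, mut in mutants.items():
            if name in killed:
                continue
            mut_can = label in mut.get(mstates[name], {})
            if mut_can != m_can:
                killed.add(name)            # observation fires
            elif mut_can:
                mstates[name] = mut[mstates[name]][label]
        if m_can:
            state = m[state][label]
        else:
            stuck = True
    return killed


SPEC: Machine = {
    "s0": {"a": "s1"},
    "s1": {"a": "s1", "b": "s2"},
    "s2": {"1": "s0"},
}
MUTANTS = {alpha: mutate(SPEC, alpha) for alpha in ("a", "b")}
print(sorted(killed_mutants(SPEC, MUTANTS, "s0", ["a", "b", "1"])))  # ['a', 'b']
```

In this toy case a single sequence reveals both input faults, so no reset is needed; a sequence such as a alone reveals neither.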
Based upon the trace $t = t'\beta$ found (if a trace is found), a set of test automata is constructed. First all go's and τ's are removed from $t'$. Then $t'$ is split into the parts $t_1, \ldots, t_k$ separated by reset labels. For all $t_i$ but $t_k$, fail test automata $M^{\mathit{fail}}_{t_i}(O, I)$ are created, since M clearly cannot perform those traces; that was the sole reason why M was reset. To be able to tell whether the final part, $t_k$, should give rise to a fail or pass test automaton, we force the trace to always end with either reset or go. This is done by introducing a variable z in the environment automaton that is set to 0 on transitions with labels in $I \cup O$ and to 1 when a go or reset is performed. Then, when searching for t, we require that z is not zero. Clearly, if the last event in t, i.e. β, is a go then M 
Example
Given the I/O EFSM M in Figure 2 
Cruise Controller Example
In this section we exemplify and benchmark our technique on a medium sized cruise controller example. The cruise controller is commonly studied and found in many variations in the literature, and thus serves as an illustrative example, see e.g., [14, 2] . 
The Cruise Controller
The model consists of two automata. The user interface controls the different modes of operation according to the various user inputs, whereas the speed control keeps the actual speed close to a given desired speed by affecting the throttle of the engine. The user interface (Figure 5(a)) basically has four modes: inactive when the engine is turned off, active when the engine is turned on, cruising when the speed control is enabled, and standby when the speed control is temporarily suspended. When the engine is turned on, the desired speed is cleared, and when cruise mode is entered, the actual speed is recorded and set as the desired speed. The cruise mode may be reentered from standby mode.
The speed control (Figure 6) switches between its two operational modes disabled and enabled according to enable and disable control signals from the user interface. In disabled mode, it sets the desired speed to zero or to the sampled actual speed when commanded by the user interface. In enabled mode, it samples the actual speed and, based on the difference between actual and desired speed (represented by variables cSpeed and dSpeed), it stops acceleration of the engine (output inc0), or commands the engine to do medium (output inc1) or high (output inc2) acceleration. Further, in enabled mode, the user can manually increase or decrease the desired speed.
The actions of the user interface are
For the system composed of the user interface and speed controller synchronizing internally (see footnote 5) on actions $O_u \cap I_s$, the actions are
Footnote 5: Recall that our technique can be adapted to handle these. The semantics of the input fault mutations in a composed system is as if they were made to their (synchronous) product I/O EFSM, hiding internal communication channels.
Generated Test Sequences
Unless the length of the test suite is important, the normal and computationally most efficient method is to generate a separate test sequence for each mutant. Our experimental results show that a sequence could be successfully generated for each mutant, and that the sequences are quite short. The test suites generated for the cruise interface, the speed controller and the composed system contain respectively 5 (16), 7 (31), and 8 (34) test cases (total steps in parentheses). All were generated on a standard PC in less than one second. Table 1 lists some examples. These results indicate that our technique may be feasible for much larger systems, both in terms of test suite size and model size (number of inputs and state space). Note that a single increment does not suffice to check for connectivity of incr, because inc0 would also be output if incr was disconnected (given that maxDiff = 2 and that dSpeed and cSpeed initially equal 0, acc becomes 0 in both cases). Hence at least two increments are needed.
Also note that, because our algorithms do not require the specification or implementation to be input enabled, not all sequences end with an output, meaning that if the last input can be performed by the tester, the test will pass (or fail, depending on the verdict). If this is felt to be unnatural for some applications, it is very easy to force our algorithms to produce tests that end with an output. The generated test for engineOff is then $M^{\mathit{pass}}_{\mathit{engineOn.clearSpeed.engineOff.engineOn.clearspeed}}(O_u, I_u)$.
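The arithmetic behind the remark on incr can be made concrete with a small Python sketch. The acceleration rule used here, integer division of the speed difference by maxDiff and capped at 2, is an assumption made only for illustration; the actual rule is given by the model in Figure 6 and may differ. Only maxDiff = 2 and the initial values of dSpeed and cSpeed are taken from the text.

```python
MAX_DIFF = 2  # value quoted in the text


def acceleration(d_speed: int, c_speed: int, max_diff: int = MAX_DIFF) -> int:
    """Hypothetical rule (assumed, not taken from Figure 6): 0, 1 or 2
    depending on how far the desired speed exceeds the current speed."""
    diff = d_speed - c_speed
    return max(0, min(2, diff // max_diff))


def output_after_increments(pressed: int, connected: bool) -> str:
    """Output observed after pressing incr `pressed` times from
    dSpeed = cSpeed = 0; a disconnected incr never changes dSpeed."""
    d_speed = pressed if connected else 0
    return f"inc{acceleration(d_speed, 0)}"


for presses in (1, 2):
    print(presses,
          output_after_increments(presses, connected=True),
          output_after_increments(presses, connected=False))
# 1 inc0 inc0   (one increment: both systems look the same)
# 2 inc1 inc0   (two increments: the outputs differ)
```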
Multi-fault Test Sequences
In some cases it is important to produce a smallest test suite with as few and as short tests as possible. A simple reduction technique like prefix elimination does not work well for connectivity testing (see sequences presented in Section 6.2). Our generalized algorithm from Section 5 is therefore more involved and guarantees that the minimal length test suite is computed, although at the expense of computational complexity (the problem is NP-hard [8]). It involves analyzing a system consisting of all mutants running concurrently in a synchronized lock-step fashion. Thus, state space explosion theoretically limits how many mutants can be composed, and it should be examined where this limit occurs in practice. The following experiments are run on an 8x900 MHz Sun Sparc Fire v880R workstation with 32 GB memory running Sun Solaris 9 (SunOS 5.9). However, UppAal only exploits one CPU and addresses at most 4 GB of memory. The results are tabulated in Table 2.
For the user interface, it turns out that it is possible to compute (using only a few seconds and megabytes of memory) a single test of 11 steps that detects all input faults M 
Timed Test Generation
We next demonstrate how connectivity tests for a class of timed systems can be generated. The tester now needs to be time aware to reveal such faults. This result requires no change to the basic algorithm if a real-time model-checker like UppAal is used.
Informally, a timed automaton [1] is an I/O EFSM equipped with a set of nonnegative real-valued variables called clocks that may be used in guards, and may be set to zero in transition assignments. In addition, location invariants force the automaton to take a transition before the invariant becomes false. The semantics of a timed automaton is defined in terms of an infinite timed transition system consisting of both discrete transitions and time delay transitions. To ensure testability we impose semantic restrictions similar to those in [16]: our models, called DOUTA, are deterministic, output urgent (an output or τ occurs as soon as it is enabled) timed automata. DOUTA is formally defined in [9].
Consider the following real-time requirements for the user-interface automaton in Figure 5(a). 1) For safety reasons, the engine must be on for at least onDelay before cruise control may be switched on. Earlier requests must be ignored. 2) When cruise mode is suspended, at least resumeDelay must elapse before reengagement, to avoid too rapid enabling and disabling of the speed controller. 3) It takes controlDelay to enable or disable the speed controller (this involves external communication), whereas the speed can be set or cleared with a zero delay (assumed internal communication). These requirements are satisfied by the DOUTA in Figure 5( are not a trivial insertion of the delay constants occurring in the model (e.g., the 2800 ms between disableControl and resume). It is usually infeasible to compute these by hand because it involves solving a large set of inequations on clock variables. The zero delays in the above sequence can be avoided by replacing the universal environment E by a more accurate (and slower) environment model timed automaton E which restricts the choices of the tester.
UppAal also has efficient facilities for generation of time-and cost-optimal diagnostic traces [4, 12] . In fact, the above test is not only of minimal length, but also the fastest (minimal accumulated time delay). To avoid expensive operations, e.g., resets, UppAal can be used to generate suites with the fewest such operations. As a simple example, the generated multi-fault test presented in Section 6.3 for the speed controller required two tests, and thus one reset. 
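The difference between a shortest and a fastest test can be illustrated on timed traces, viewed as alternating sequences of delays and actions. The Python sketch below uses two made-up candidate traces; the delays and the choice of actions are hypothetical and are not output from UppAal.

```python
from typing import List, Union

Step = Union[int, str]  # int = delay in milliseconds, str = action label


def accumulated_delay(trace: List[Step]) -> int:
    """Total time spent waiting in the trace."""
    return sum(step for step in trace if isinstance(step, int))


def length(trace: List[Step]) -> int:
    """Number of discrete actions in the trace."""
    return sum(1 for step in trace if isinstance(step, str))


# Two hypothetical candidate tests for the same fault.
candidate_a: List[Step] = [0, "engineOn", 0, "clearSpeed", 3000, "engineOff"]
candidate_b: List[Step] = [0, "engineOn", 0, "clearSpeed", 0, "engineOff",
                           0, "engineOn"]

shortest = min([candidate_a, candidate_b], key=length)
fastest = min([candidate_a, candidate_b], key=accumulated_delay)
print(length(shortest), accumulated_delay(shortest))  # 3 3000
print(length(fastest), accumulated_delay(fastest))    # 4 0
```

A cost-optimal search generalizes this idea by attaching a cost to selected operations, such as resets, and minimizing the accumulated cost instead of the accumulated delay.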
Conclusions and Future Work
This paper describes two sound and complete algorithms that generate minimal test cases and test suites, respectively, for input connectivity faults. The algorithms are based on reachability analysis and may thus be implemented in most model-checkers. Based on experiments with a concrete model-checker, UppAal, and a medium sized example, we conclude that our techniques are feasible and that the simple algorithm appears to scale to larger systems. For the generalized algorithm the number of simultaneous mutants that can be handled is limited due to state space explosion (recall that the problem is NP-hard). Finally, we show how timed connectivity tests and cost-optimized test suites can be generated by the same algorithms.
We only looked at input connectivity faults; however, it is trivial to generate test sequences for output connectivity faults, since this amounts to finding a sequence that visits a transition where the output is produced, hence making it observable.
As future work we plan to examine other, more involved fault models, e.g. models where connections may be whole protocols. Since our algorithms are based on finding a trace that can be performed by the original automaton and not its mutant, or vice versa, they appear to be general enough to support many other fault models. In particular we plan to investigate how to test wrongly interconnected communicating (distributed) components that have been tested or verified in isolation. Also, we plan to investigate a timed connectivity fault model where disconnects are not permanent, and we intend to carry out practical applications and further experiments with time- and cost-optimal test suite generation.
