Abstract. We present a semantics for fault tree analysis, a technique used for the analysis of safety critical systems, in the real-time interval logic Duration Calculus with Liveness and show how properties of fault trees can be checked automatically. We apply this technique in two examples and show how it can be connected to other verification techniques.
Introduction
In this paper we bring together the two worlds of safety engineering on the one hand and real-time model-checking on the other. We present an approach that uses model-checking to determine whether a fault tree is designed properly. Fault tree analysis [VGRH81] is a technique widely used by engineers to analyse the safety of safety-critical systems. Originally, it did not have a formal semantics and relied on the expertise of safety engineers. Recently there have been several attempts to define a formal semantics for fault trees [RST00, Han96]. In this paper we go one step further and show how to combine fault tree analysis with real-time model-checking.
Both parties benefit from this combination. From the point of view of the safety engineer formal models and proofs by model-checking raise the quality of safety analysis. The aim is to make implicit assumptions on the behaviour of the system explicit and to discover problems that have been overlooked. So we add extra redundancy to the safety analysis itself.
On the other hand, model-checking benefits because the formal model is compared with the fault tree that is created from the system independently. Additionally, the knowledge of the system which is present in the fault tree can be used to simplify the verification process. Instead of verifying one complex property of the whole system, we decompose the property into simpler properties of subsystems using fault tree analysis. Then we verify that the decomposition is correct and finally show that the simple properties hold.
As the underlying formalism we use Duration Calculus with Liveness (DCL) [Ska94], which is designed to describe and reason about real-time systems. As the operational formalism for model-checking we use Phase Automata [Tap01, DT03] because they have a semantics in Duration Calculus. Since we also define the fault tree semantics in Duration Calculus with Liveness, we stay entirely within this formal framework.
As an example consider the fault tree in figure 1. Let it be designed for a system in which a relay K2 controls a pump which pressurizes a tank. We assume that the tank will burst if the contacts of relay K2 are closed for more than 60 seconds. The fault tree decomposes this event and states that if it occurs then either an electromagnetic field (EMF) must have been applied to the coil for more than 60 seconds or the relay erroneously does not open. The aim is to verify that this is in fact true. To this end, we create an operational model of our relay to express our assumptions on its behaviour. We formalise each event given in the fault tree by a formula in Duration Calculus with Liveness. In this example let $E_1$, $E_2$ and $E_3$ be these formalisations. We can be sure that no cause of the event $E_1$ is forgotten if the implication $E_1 \Rightarrow (E_2 \lor E_3)$ holds with respect to our operational model $M$ of the relay. So in fact we have to verify

$$M \models E_1 \Rightarrow (E_2 \lor E_3).$$
This is done by translating each formula and its complement into a Phase Automaton. Let these Phase Automata be called $A_{E_1}$, $A_{E_2}$, $A_{E_3}$ and $A_{\neg E_1}$, $A_{\neg E_2}$, $A_{\neg E_3}$. We check whether there is a run of the model $M$ which is also possible for $A_{E_1}$, $A_{\neg E_2}$ and $A_{\neg E_3}$. If this is not the case, the implication is true. Thus no causes of the event $E_1$ have been overlooked.

In this paper we give a precise semantics of fault trees to express that a fault tree is well designed. Apart from the or-connective considered in this example, the other connectives that may appear in fault trees are also treated. For a subclass of DCL formulae which is relevant for fault trees we give algorithmic constructions of Phase Automata, and we show how they can be composed so that model-checking can establish that the fault tree is well designed for a given model of the system.

The rest of this paper is organised as follows. In sections 2 and 3, we introduce Duration Calculus with Liveness and Phase Automata. In section 4 we give a semantics for fault trees in Duration Calculus with Liveness. In section 5 we show how properties can be model-checked automatically using Phase Automata. This approach is applied to one example in section 6 and one case study in section 7 where we design and verify a more complex system. We integrate the fault tree analysis into a verification process with PLC-Automata [Die00], which can be directly compiled into software for embedded systems and into timed automata.
Duration Calculus with Liveness
Duration Calculus (DC for short) [ZHR91] is a real-time interval logic which allows reasoning about durations of states. As the properties which will be important for fault trees are liveness properties, we use the extension Duration Calculus with Liveness (DCL) [Ska94], which introduces special modalities to express real liveness properties.
Real-time systems are described by a finite number of observables (time-dependent variables) which are denoted by X, Y, . . . and interpreted by an interpretation I which assigns to each observable X a function $I(X) : \mathit{Time} \to D$.
Here Time is the time domain (in this case the real numbers) and D is the finite domain of the observable. Additionally we use rigid variables denoted by x, y, . . . and valuations V which assign a real number to each rigid variable. State assertions π are generated by the grammar
and describe the state of the real-time system at a certain point of time, with the semantics:
and the usual definition for the propositional connectives. Duration terms θ are either rigid variables or derived from state assertions using the integral operator $\int$; their semantics depends on an interpretation I, a valuation V, and an interval [a, b], and is defined by
Duration formulae F are generated by the grammar
and are evaluated in a given interpretation I, a valuation V, and a time interval [a, b]. The symbol p denotes a predicate symbol like =, ≤, ≥. In general, the meaning of a predicate p is given by the interpretation and denoted by $p_I$. A formula $F_1 ; F_2$ holds iff the given interval can be "chopped" into two parts such that $F_1$ holds on the left part and $F_2$ on the right part. The two expanding modalities allow an expansion of the interval to the left and to the right, respectively. In addition to negation and conjunction we allow quantification over rigid variables and observables. Other propositional connectives can be defined as abbreviations. Formally,
The definitions of the remaining connectives and of quantification over rigid variables and observables are as in first-order logic. Additionally, the following abbreviations will be used:
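To make duration terms and the chop operator more tangible, the following Python sketch evaluates them over a piecewise-constant interpretation of a single observable. The segment representation, the function names `duration`, `holds_everywhere` and `chop`, and the relay timeline are assumptions made purely for illustration. The sketch only tries segment boundaries as chop points, which is enough for "holds everywhere" formulae but not for arbitrary duration formulae.

```python
from typing import Callable, List, Tuple

# Assumed representation: consecutive (start, end, value) segments on which
# the observable is constant.
Segment = Tuple[float, float, str]
IntervalFormula = Callable[[float, float], bool]


def duration(segments: List[Segment], pred: Callable[[str], bool],
             a: float, b: float) -> float:
    """Accumulated time within [a, b] during which pred holds."""
    total = 0.0
    for start, end, value in segments:
        lo, hi = max(a, start), min(b, end)
        if hi > lo and pred(value):
            total += hi - lo
    return total


def holds_everywhere(segments: List[Segment], pred: Callable[[str], bool],
                     a: float, b: float) -> bool:
    """pred holds (almost) everywhere on the non-point interval [a, b]."""
    return b > a and abs(duration(segments, pred, a, b) - (b - a)) < 1e-9


def chop(segments: List[Segment], f1: IntervalFormula, f2: IntervalFormula,
         a: float, b: float) -> bool:
    """F1 ; F2 on [a, b]: some point m splits the interval so that F1 holds
    on [a, m] and F2 on [m, b]. Only segment boundaries are tried."""
    points = {a, b} | {t for s, e, _ in segments for t in (s, e) if a <= t <= b}
    return any(f1(a, m) and f2(m, b) for m in sorted(points))


# Hypothetical relay timeline: open for 5 s, closed for 65 s, open for 10 s.
relay = [(0.0, 5.0, "open"), (5.0, 70.0, "closed"), (70.0, 80.0, "open")]
is_open = lambda a, b: holds_everywhere(relay, lambda v: v == "open", a, b)
is_closed = lambda a, b: holds_everywhere(relay, lambda v: v == "closed", a, b)
```

On this timeline the interval [0, 70] can be chopped at time 5 into an "open" part and a "closed" part, whereas [0, 80] cannot.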
Phase Automata
As an operational model for real-time systems we use Phase Automata [Tap01], which possess a formal semantics in DC and allow model-checking using the tool Moby/DC [DT03]. The intuition is similar to Timed Automata [AD94]. A Phase Automaton $A = (P, E, C, cl, s, d, P_0)$ consists of finite sets of states $P$ and clocks $C$, a transition relation $E \subseteq P \times P$, and a set $P_0$ of initial states. The function $cl$ assigns a set of clocks to each state, the function $s$ assigns a state assertion to each state, and the function $d$ assigns to each clock a time interval.
A Phase Automaton can stay in the present state only if the state assertion holds. Additionally, for each clock $c$ the amount of time the automaton stays in states in $cl^{-1}(c)$ must be within the interval given by $d(c)$. In figure 2 we present an example of a Phase Automaton modelling the formula $\Diamond_L(\lceil P \rceil \land 4 < \ell \,;\, \lceil Q \rceil \land 3 < \ell)$. The open intervals $(0, \infty)$ and $(4, \infty)$ express that the automaton may stay in $s_0$ and $s_1$ arbitrarily long but has to leave these states eventually, whereas the interval $(3, \infty]$ allows the automaton to stay in $s_2$ forever.
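The tuple structure and the two staying conditions can be sketched as a small Python data structure. This is a simplified illustration, not the Moby/DC representation: runs are finite prefixes given as (state, dwell time, observation) steps, all clock bounds are treated as open, and the example automaton is only loosely modelled on the description of figure 2 (its exact shape is an assumption).

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Observation = Dict[str, bool]
Step = Tuple[str, float, Observation]  # (state, dwell time, observation)


@dataclass
class PhaseAutomaton:
    states: Set[str]                              # P
    edges: Set[Tuple[str, str]]                   # E, a subset of P x P
    clocks: Set[str]                              # C
    cl: Dict[str, Set[str]]                       # state -> clocks of that state
    s: Dict[str, Callable[[Observation], bool]]   # state -> state assertion
    d: Dict[str, Tuple[float, float]]             # clock -> (lower, upper), open
    initial: Set[str]                             # P_0

    def accepts_prefix(self, run: List[Step]) -> bool:
        """Check a finite run prefix against edges, assertions and clocks."""
        if not run or run[0][0] not in self.initial:
            return False
        if any((p, q) not in self.edges
               for (p, _, _), (q, _, _) in zip(run, run[1:])):
            return False
        if any(not self.s[p](obs) for p, _, obs in run):
            return False
        for c in self.clocks:
            lower, upper = self.d[c]
            elapsed, active = 0.0, False
            for p, dwell, _ in run:
                if c in self.cl[p]:
                    elapsed, active = elapsed + dwell, True
                    if elapsed >= upper:       # stayed too long
                        return False
                else:
                    if active and elapsed <= lower:  # left too early
                        return False
                    elapsed, active = 0.0, False
            # A block still active at the end is a prefix: no lower-bound check.
        return True


# Assumed shape: s0 (true) -> s1 (P, more than 4) -> s2 (Q, more than 3).
fig2_like = PhaseAutomaton(
    states={"s0", "s1", "s2"},
    edges={("s0", "s1"), ("s1", "s2")},
    clocks={"c0", "c1", "c2"},
    cl={"s0": {"c0"}, "s1": {"c1"}, "s2": {"c2"}},
    s={"s0": lambda o: True, "s1": lambda o: o["P"], "s2": lambda o: o["Q"]},
    d={"c0": (0.0, math.inf), "c1": (4.0, math.inf), "c2": (3.0, math.inf)},
    initial={"s0"},
)
```

A run that leaves `s1` after only 2 time units is rejected, because the lower bound of the clock attached to `s1` is violated.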
Semantics
The semantics of a Phase Automaton A is defined in terms of one big DC formula. It encodes the behaviour using one fresh observable ph A , which ranges over the state space of the automaton. The subformulae model the initial states, the successor state relation, and the clock constraints. To give a flavour of these formulae we just present one of them. It expresses that it is impossible to stay in a set of states which belong to the same clock c longer than the upper bound given in the clock interval d(c). It encodes the progress of the automaton.
Model-Checking and closure under complementation
The model-checker Moby/DC [Tap01, DT03] checks whether a set of Phase Automata running in parallel have a common run. To exploit this, we use the following automata-theoretic approach to model-checking. We model the system to be checked by a set of Phase Automata. The property which is to be verified is negated. For this negated property we also construct a Phase Automaton and check whether there is a run which satisfies both the model of our system and the negated property. If this is not the case, the property holds. Unfortunately, Phase Automata are, like Timed Automata, not closed under complementation. Therefore we will have to restrict ourselves to a subset of Phase Automata that permits complementation.
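The automata-theoretic idea can be sketched in a few lines of Python. The sketch replaces Phase Automata by untimed finite automata over finite words, which is a deliberate simplification and not what Moby/DC does; the function names and the toy relay model are assumptions. The property holds exactly when the model and the automaton for the negated property have no common accepting run.

```python
from typing import Dict, Set, Tuple

# Assumed stand-in for Phase Automata: state -> set of (letter, successor).
Automaton = Dict[str, Set[Tuple[str, str]]]


def have_common_run(a: Automaton, a0: str, a_acc: Set[str],
                    b: Automaton, b0: str, b_acc: Set[str]) -> bool:
    """Search the synchronous product for a jointly accepting state."""
    seen, todo = set(), [(a0, b0)]
    while todo:
        p, q = todo.pop()
        if (p, q) in seen:
            continue
        seen.add((p, q))
        if p in a_acc and q in b_acc:
            return True
        for letter_a, p2 in a.get(p, ()):
            for letter_b, q2 in b.get(q, ()):
                if letter_a == letter_b:
                    todo.append((p2, q2))
    return False


def satisfies(model: Automaton, m0: str, m_acc: Set[str],
              negated: Automaton, n0: str, n_acc: Set[str]) -> bool:
    """The property holds iff no run of the model is also a run of the
    automaton built for the negated property."""
    return not have_common_run(model, m0, m_acc, negated, n0, n_acc)


# Toy model: a relay that only ever reports "open" or "closed".
model = {"idle": {("open", "idle"), ("closed", "on")},
         "on": {("closed", "on"), ("open", "idle")}}
# Negated property: the letter "stuck" is eventually observed.
negated = {"wait": {("open", "wait"), ("closed", "wait"), ("stuck", "seen")},
           "seen": set()}
# A faulty variant of the model that can report "stuck".
faulty = {**model, "on": model["on"] | {("stuck", "on")}}
```

For the toy model the search finds no common run, so the property "never stuck" holds; for the faulty variant a common run exists and the property fails.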
Fault Tree Analysis
Fault tree analysis (FTA) [VGRH81] is an engineering technique to identify causes of system failures. Its main areas of application are safety-critical components in the nuclear and aviation industries. Starting with an undesired event (called top-event) all possible causes (called sub-events) are identified. These causes are joined to the top-event using and and or gates, depending on whether all events have to occur to yield the top-event or whether one event is sufficient. This procedure is iterated until a given granularity is reached. Events that are not developed further are called basic-events. In figure 1 we gave an example taken from the Fault Tree Handbook [VGRH81] in which for one event two possible causes are identified. We use the notation defined by the IEC 61025 standard [IEC93].
DCL Semantics
In order to use model-checking techniques to verify that a fault tree is constructed properly and to combine it with other formal techniques in one verification process we need a formal semantics. Originally, there was no formal semantics [VGRH81] but there have been several attempts [Han96,BA93,RST00] to define one in order to avoid ambiguities.
Events. Events are formalised by DCL formulae. Gorski [Gór94] divides the events occurring in fault trees into three groups. So we will restrict ourselves to DCL formulae for these groups and give a DCL formula pattern for each of them. We require the events to be formalized by such a DCL formula.
where $\sim\, \in \{<, \leq\}$, $a$ and $b$ are real numbers and π is a state assertion describing the system state. Upper bounds for the duration are to be specified using the sequence pattern, as in $\Diamond_L(\lceil \neg\pi \rceil \,;\, \lceil \pi \rceil \land \ell \leq b \,;\, \lceil \neg\pi \rceil)$. Examples of these types of pattern are:
The relay is eventually closed for more than 60 sec.
-Deadlock: Once the relay is closed, it will never open again.
-State sequence: Eventually the relay is closed for at least 60 sec and is open for at least 20 sec after that.
Gates. Events are joined by gates to express their dependence. We follow Reif et al.'s approach [RST00] and distinguish two kinds of gates:
-decomposition gates, which do not impose any temporal relationship between the events joined by this gate, and
-cause consequence gates, which express temporal dependencies.
We consider the two decomposition gates and (∧) and or (∨) and three cause consequence gates. For the asynchronous or (∨-acc) gate we require the event to happen after one of the causes has happened. For the asynchronous and gate (∧-acc) we require the event to happen after all of the causes have happened. And for the synchronous and gate (∧-scc) we require the event to happen after all events occurred simultaneously, which means that there is a time interval in which all formulae expressing the events hold.
After having given ideas what events and gates are, we can proceed to define syntax and semantics for fault trees.
Syntax
The set of fault trees is defined inductively as follows.
-Every basic event E formalized as a DCL formula is a fault tree with top event E.
-For a nonempty and finite set T of fault trees, a gate G ∈ {∧, ∨, ∧-scc, ∧-acc, ∨-acc} and an event E, the term (E, G, T) is a fault tree with top event E.
As all gate conditions will be associative we do not have to impose an order on the fault trees in T.
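The inductive definition maps directly onto a recursive data type. The following Python sketch is one possible encoding; the class and field names are assumptions, and DCL formulae are kept as opaque strings rather than parsed. The example tree is the or gate below event E3 from the pressure tank example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Union


class Gate(Enum):
    AND = "∧"
    OR = "∨"
    AND_SCC = "∧-scc"
    AND_ACC = "∧-acc"
    OR_ACC = "∨-acc"


@dataclass(frozen=True)
class BasicEvent:
    name: str
    formula: str  # DCL formula, kept as opaque text in this sketch


@dataclass(frozen=True)
class GateNode:
    name: str
    formula: str
    gate: Gate
    children: FrozenSet["FaultTree"]  # unordered, as in the definition

    def __post_init__(self) -> None:
        if not self.children:
            raise ValueError("a gate needs a nonempty set of sub-trees")


FaultTree = Union[BasicEvent, GateNode]


def top_event(tree: FaultTree) -> str:
    """Name of the top event of a fault tree."""
    return tree.name


e31 = BasicEvent("E31", "EMF applied to K2 relay coil for more than 60 sec")
e32 = BasicEvent("E32", "K2 relay fails to open")
e3 = GateNode("E3", "K2 relay contacts closed for more than 60 sec",
              Gate.OR, frozenset({e31, e32}))
```

Using a `frozenset` for the children reflects that no order is imposed on the sub-trees of a gate.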
We continue to present fault trees graphically and not as a term structure. According to the IEC 61025 standard [IEC93] we use "&" and "≥ 1" in the graphical notation instead of "∧" and "∨" and omit and gates with one child.
Semantics
For each gate we define two DCL proof obligations, one stating that the occurrence of sub-events is necessary for the event to occur and the other that they are sufficient. For a fault tree $T$, we define in our semantics $[\![T]\!]_S$ to be the conjunction of all sufficient conditions and $[\![T]\!]_N$ to be the conjunction of all necessary conditions. The proof obligations for the necessary conditions are especially important. They express that no cause has been overlooked. If all necessary conditions can be proved by model-checking, we call a fault tree complete. Additionally, if it can be proved that the sub-events cannot happen, it follows that the event itself cannot happen. The precise semantics is defined inductively:
Let $(E, G, \{T_1, \ldots, T_n\})$ be a fault tree and let $E_1, \ldots, E_n$ be the top events of $T_1, \ldots, T_n$. Then we define
where the proof obligations $F^G_S$ and $F^G_N$ are given by
For the cause consequence gates we have to express the notion of "after" in terms of DCL. So we have to get rid of the expanding modalities in $\Diamond_L$ and use the substitution $\{\Diamond_L/\}$ which removes the leading occurrence of a $\Diamond_L$ operator from a formula. After this substitution the formula is still a well-formed DCL formula.
If the consequence has already happened due to other causes the consequence does not need to happen again. So for the sufficient condition we do not require the consequence to happen after the causes.
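For the two decomposition gates, a reading of the proof obligations that is consistent with the introductory example $E_1 \Rightarrow (E_2 \lor E_3)$ can be written out as follows. This is an assumed form, stated here only for orientation, with the sufficient and necessary obligations of a gate $G$ written as $F^G_S$ and $F^G_N$; the cause consequence gates additionally involve the substitution described above and are not shown.

```latex
\begin{align*}
F^{\lor}_{N} &= E \Rightarrow (E_1 \lor \dots \lor E_n), &
F^{\lor}_{S} &= (E_1 \lor \dots \lor E_n) \Rightarrow E,\\
F^{\land}_{N} &= E \Rightarrow (E_1 \land \dots \land E_n), &
F^{\land}_{S} &= (E_1 \land \dots \land E_n) \Rightarrow E.
\end{align*}
```

Under this reading, the necessary obligation of an or gate says that whenever the event occurs at least one sub-event occurs, which is exactly the "no cause has been overlooked" property.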
Model-Checking
The proof obligations presented in the previous section should be verified automatically by model-checking. As one can only prove correctness propositions with respect to a given model, we assume that a formal model of the system is given in terms of Phase Automata. So the model-checking problem for a given fault tree T is to check if a model M satisfies
Zhou et al. [ZHS93] showed that the Duration Calculus is undecidable. As we can write every DC formula F as true ⇒ F and this is the necessary condition of the fault tree (true, ∧, {F}), the model-checking problem for fault trees is undecidable in general. So a restriction to a subset of DCL is unavoidable. Therefore we only consider the three classes of events proposed by Gorski [Gór94] and restrict the formulae to the patterns given in section 4.1. If we assume for the third pattern that two different state expressions occurring in a DCL formula of that type cannot hold at the same time, all formulae matching one of these patterns can be translated into Phase Automata. Therefore the fragment is decidable. The expressiveness can only be evaluated by considering case studies, which we do in sections 6 and 7.
Idea. To verify the gate conditions arising from fault tree analysis, they are translated into Phase Automata as well. Model-checking establishes whether or not the conditions hold for the given model. First we present constructions of Phase Automata for each event pattern of section 4.1 and its complement. After that we show how the proof obligations arising from different gates can be model-checked.
Constructing Phase Automata for Event Patterns
The Complement construction for Sequence Pattern. In general, Phase Automata are not closed under complementation and there are examples of Phase Automata for which no complement exists and which belong to the group of sequence formulae. To avoid these problems, we require that two state expressions within the pattern cannot be satisfied at the same time, that is
A construction for this case is sufficient for our case studies. The detailed construction is given in the appendix. The restriction to this kind of formula unfortunately disallows formulae like which describes that eventually π holds for less than t time units. This is the only way to describe upper bounds in DC. But this simple type of formula can still be translated although requirement (1) does not hold.
Decomposition Gates
Using the Phase Automata constructed in the previous subsection, we can express the negation of the different proof obligations by a set of Phase Automata as presented in table 1. Then we check if this set of Phase Automata has a run together with the Phase Automata model M of our system. If this is not the case, the proof obligation is verified. In table 1, $\overline{A}$ denotes the complement of the Phase Automaton $A$, $\parallel$ denotes the parallel composition, and $\lor$ the alternative. The alternative $A_1 \lor A_2$ is, as for finite automata, just the union of the two Phase Automata, where the name-spaces are disjoint.
Table 1. Proof obligations (Condition) and the corresponding sets of Phase Automata (Negation of Condition).
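The shape of such negated conditions can be sanity-checked at the propositional level. The Python sketch below treats every event as a Boolean and assumes that the decomposition-gate obligations are plain implications between the event and the disjunction or conjunction of its sub-events (an assumption in line with the introductory example). It confirms by truth table that the negation of each obligation is a conjunction, which corresponds to a parallel composition, possibly containing a disjunction, which corresponds to an alternative. All names are illustrative.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    return (not a) or b


def equivalent(f, g, arity: int) -> bool:
    """Compare two Boolean functions on every assignment."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=arity))


# Negated necessary condition of an or gate: E and not E1 and not E2.
neg_or_nec = lambda e, e1, e2: not implies(e, e1 or e2)
par_or_nec = lambda e, e1, e2: e and (not e1) and (not e2)

# Negated necessary condition of an and gate: E and (not E1 or not E2).
neg_and_nec = lambda e, e1, e2: not implies(e, e1 and e2)
par_and_nec = lambda e, e1, e2: e and ((not e1) or (not e2))

# Negated sufficient conditions.
neg_or_suf = lambda e, e1, e2: not implies(e1 or e2, e)
par_or_suf = lambda e, e1, e2: (e1 or e2) and (not e)
neg_and_suf = lambda e, e1, e2: not implies(e1 and e2, e)
par_and_suf = lambda e, e1, e2: e1 and e2 and (not e)

checks = [
    equivalent(neg_or_nec, par_or_nec, 3),
    equivalent(neg_and_nec, par_and_nec, 3),
    equivalent(neg_or_suf, par_or_suf, 3),
    equivalent(neg_and_suf, par_and_suf, 3),
]
```

For the or gate this matches the check described in the introduction: the automaton for the event runs in parallel with the automata for the complements of all sub-events.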
Cause-Consequence Gates
To perform model-checking for cause-consequence gates, we use the same automata constructed above, but introduce a new observable Syn to synchronise the automata and express the temporal dependencies. Obviously the second pattern expressing a final state cannot be regarded as a cause for an event which takes place after this event, so we do not consider this type of pattern for cause-consequence gates.
Asynchronous Gates. The sufficient condition is the same for asynchronous cause-consequence gates and for the decomposition gates.
To check the necessary condition we construct a set of Phase Automata which allow all runs which violate the property. For the and-gate condition F by Syn and add the assertion ¬Syn to all other states. So E has to be true finally and Syn must hold before, so at least one formula E i has not been true before.
The construction for or gates is similar, except that in step 2 instead of the union the parallel composition is used.
Synchronous Gates. Again, checking the sufficient condition is very easy and is skipped here. The necessary condition for synchronous and gate describes that an event occurs only if the sub-events occur at the same time, which means that several state assertions hold in the same interval. Therefore we only consider cases where the formulae E i for the sub-events are of the first type
1. Let Syn be a fresh boolean observable.
2. Construct an automaton for the complement of the formula
) using the construction in figure 3. The semantics of this automaton is the set of all runs in which not all state assertions are true simultaneously. Add the assertion Syn to every state. Add a new state $p_{true}$ with the assertion ¬Syn and transitions from all other states to this state. So Syn is true as long as not all state assertions are true on the same time interval.
3. Construct the automaton $A_E$ and replace the condition true in the first state by Syn and add the assertion ¬Syn to all other states. So $E$ has to be true finally and Syn must hold before, which means that beforehand there is no time interval on which all state assertions in $E_1, \ldots, E_n$ have been true.
Example -Pressure Tank
We apply our approach to the classical pressure tank example [VGRH81] . In the original work, the fault tree analysis is done completely manually; no formal techniques are considered. We present the scenario, a part of our formal model of the system, the first part of the fault tree, and explicitly check one gate condition. 
Scenario
The pressure tank system shown in figure 6 consists of three parts: a pressure tank, a pump-motor device and an associated control system to regulate the operation of the pump. We use the following assumptions of the system [VGRH81]:
-It takes 60 sec to pressurise the tank.
-The pressure switch contacts are closed until the threshold pressure is reached.
-The tank is fitted with an outlet valve which drains the entire tank in negligible time. The valve is not a pressure relief valve.
Then the operation of this system is as follows:
-Initially the system is dormant: the switch S1 contacts are open, the relay K1 contacts are open, the relay K2 contacts are open and the pressure switch is closed.
-Pressing switch S1 starts the system. Power is applied to the coil of relay K1, closing K1's contacts, so that relay K1 is electrically self-latched.
-The closure of relay K1 applies power to the coil of relay K2, causes relay K2's contacts to close and starts the pump.
Modelling
We use the following observables to model the state of the pressure tank system:
-tankstate ranging over the set {empty, fill, full, rupture} to model the filling state of the pressure tank.
-flowstate ranging over {flow, noflow} to model whether the fluid is pumped into the tank or not.
-K2Coil ranging over {K2EMF, K2noEMF} to model whether there is an electromagnetic field on the coil.
-K2Contacts ranging over {K2open, K2closed} to model whether the contacts are open or closed.
For simplicity, additional observables for the states of the other components are omitted here. Phase Automata are used to model our assumptions on the behaviour of the system. The ones presented in figures 7 to 9 model the operation of relay K2, the tank and the assumption that the tank will not withstand more than 60 sec of continuous flow. The possible failure of relay K2 is modelled by an extra failure state. The rest of our system model is again omitted.
Fault Tree Analysis
Figure 10 presents a simplified and shortened version of the fault tree developed by Vesely et al. [VGRH81]. Additionally, we have annotated every event with its DCL formula.
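Figure 10. Fault tree for the pressure tank system, from the top event downwards, with the DCL formula of each event:
-Rupture of pressure tank: $E_1 = \Diamond_L(\lceil rupture \rceil)$
-Pump operates continuously for t > 60 sec: $E_2 = \Diamond_L(\lceil flow \rceil \land \ell > 60)$
-K2 relay contacts closed for t > 60 sec: $E_3 = \Diamond_L(\lceil K2closed \rceil \land \ell > 60)$, decomposed by an or gate (≥ 1) into:
-EMF applied to K2 relay at K2 coil for t > 60 sec: $E_{31} = \Diamond_L(\lceil K2EMF \rceil \land \ell > 60)$
-K2 relay fails to open: $E_{32} = \Diamond_L(\lceil \neg K2EMF \land K2closed \rceil)$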
Verification
We are going to verify that the decomposition at the or gate is correct with respect to our model of the system. That means that the events $E_{31}$ and $E_{32}$ are necessary for event $E_3$. To this end, we have to check the validity of

$$M \models E_3 \Rightarrow (E_{31} \lor E_{32}),$$
where M is our model of the system in terms of Phase Automata. Therefore we use the construction given in section 5 for the first pattern to obtain Phase Automata $A_{E_3}$, $A_{\neg E_{31}}$, $A_{\neg E_{32}}$ representing $E_3$, $\neg E_{31}$ and $\neg E_{32}$ as given in figure 11 and check whether they have a common run together with the automata of our system model. In fact we only need $A_{K2}$ of our model to prove this. The answer is obtained in 1.2 seconds using the tool Moby/DC. This result holds only because we have neglected the time the relay K2 takes to open its contacts. If we considered this in our model, the implication would not hold any longer. Using this technique the engineer has to put all her assumptions on the behaviour of the system in the formal model, which adds additional safety as implicit assumptions are discovered. On the other hand, the engineer can easily alter the model and check whether the fault tree remains correct under different assumptions.
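The remark about the relay's opening time can be illustrated on individual traces. The following Python sketch is not the Phase Automata check performed with Moby/DC; it evaluates finite, hypothetical traces of the observables K2Coil and K2Contacts. Two readings are assumptions of this sketch: "for more than 60 sec" is taken as an uninterrupted stretch longer than 60, and "K2 relay fails to open" is approximated as a final-state property, namely that the trace ends with closed contacts although no EMF is applied.

```python
from typing import Callable, List, Tuple

# Assumed trace format: consecutive (duration, K2Coil, K2Contacts) phases.
Phase = Tuple[float, str, str]


def longest_stretch(trace: List[Phase],
                    pred: Callable[[str, str], bool]) -> float:
    """Length of the longest uninterrupted stretch on which pred holds."""
    best = current = 0.0
    for dur, coil, contacts in trace:
        current = current + dur if pred(coil, contacts) else 0.0
        best = max(best, current)
    return best


def e3(trace: List[Phase]) -> bool:   # contacts closed for more than 60 sec
    return longest_stretch(trace, lambda coil, c: c == "K2closed") > 60


def e31(trace: List[Phase]) -> bool:  # EMF applied for more than 60 sec
    return longest_stretch(trace, lambda coil, c: coil == "K2EMF") > 60


def e32(trace: List[Phase]) -> bool:  # assumed final-state reading
    _, coil, contacts = trace[-1]
    return coil == "K2noEMF" and contacts == "K2closed"


def necessary_condition_holds(trace: List[Phase]) -> bool:
    """E3 implies (E31 or E32) on this single trace."""
    return (not e3(trace)) or e31(trace) or e32(trace)


# Contacts follow the coil immediately (opening time neglected).
ideal = [(10.0, "K2noEMF", "K2open"), (70.0, "K2EMF", "K2closed"),
         (5.0, "K2noEMF", "K2open")]
# The relay fails and never opens.
stuck = [(30.0, "K2EMF", "K2closed"), (40.0, "K2noEMF", "K2closed")]
# The relay needs 2 sec to open after the EMF is removed.
delayed = [(59.0, "K2EMF", "K2closed"), (2.0, "K2noEMF", "K2closed"),
           (10.0, "K2noEMF", "K2open")]
```

On the `delayed` trace the contacts are closed for 61 sec although the EMF was applied for only 59 sec and the relay does open eventually, so the implication fails on that trace, in line with the observation above.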
Combination with other Model-Checking Techniques
In the previous section we have shown how an engineer can benefit from the combination of fault tree analysis and real-time model-checking. In this section we look at the benefit from the model-checking point of view. We demonstrate how fault tree analysis can be used as a decomposition method to allow model-checking of larger systems. The case study is the single track line segment [STL01].
Scenario
Two trains drive on the tracks shown in figure 12 . On the outermost track the trains may go clockwise and on the innermost track counterclockwise. In the critical section trains may go in both directions and may change their direction once. The task is to design a distributed controller ensuring that no collision may happen in the critical section. Each component of the controller has three sensors (S) attached and controls one light signal (L) and one point. This controller has to allow two trains to pass the critical section one directly after the other. In this case the first train may not change its direction. 
Design
We built a real-life model of this case study using Lego-Mindstorms and the open source operating system BrickOS. We designed the controller using PLC-Automata [Die00], which also have a semantics in DC. Using the tool Moby/PLC [TD98] these automata can be compiled into ST-Code for Programmable Logic Controllers, into C++ Code for BrickOS (Lego-Mindstorms), and into Timed Automata [AD94]. We used the compilation into C++ Code for BrickOS.
Verification
The goal is to verify that two trains do not collide in the critical section. The obvious idea would be to compile the PLC-Automata for the distributed controller into Timed Automata. Then one would model the environment using Timed Automata, and finally use the model-checker Uppaal [BBD+02] to verify that a collision in the critical section is impossible. But the model is too complex and hence direct model-checking failed. So instead we chose the following approach, which is sketched in figure 13.
We perform a fault tree analysis with the top event "collision of two trains in critical section". In the fault tree this top-event is iteratively decomposed until we obtain a number of basic events. For each gate in the fault tree we apply the technique described in section 5, i.e. we translate the events into Phase Automata and verify, using Moby/DC, that for each decomposition the sub-events are necessary for the upper event. The fault tree for this example consists of 38 events and 27 gates. It turns out that due to symmetry only 14 gate conditions have to be checked.
For each basic event we verify that it cannot occur. First, all basic events in which the first of two subsequent trains turns around in the critical section may not occur, simply because this behaviour is forbidden by the specification. Second, all other basic events are simple enough for automatic verification. We show that they cannot occur in the distributed controller modelled by PLC-Automata. To this end, we use Moby/PLC to compile these automata into Timed Automata which are then checked by Uppaal against the basic events. Since none of the events is possible in the controller model, we conclude that the top-event, i.e. the collision, does not occur.
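The final step of this argument is a simple propagation over the fault tree: once every necessary gate condition has been verified, an event can only occur if enough of its sub-events can occur. The Python sketch below illustrates this with a tiny tree; the event names, the tree itself and the dictionary encoding are hypothetical and are not taken from the 38-event fault tree of the case study.

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding: non-basic event -> (gate kind, sub-events).
# Every event that is not a key is a basic event.
Tree = Dict[str, Tuple[str, List[str]]]


def may_occur(event: str, tree: Tree, possible_basic: Set[str]) -> bool:
    """Over-approximate whether an event can occur, assuming that all
    necessary gate conditions of the tree have been verified."""
    if event not in tree:
        return event in possible_basic
    gate, subs = tree[event]
    results = [may_occur(sub, tree, possible_basic) for sub in subs]
    return all(results) if gate == "and" else any(results)


# Hypothetical miniature tree for a collision top event.
tree: Tree = {
    "collision": ("or", ["head_on", "rear_end"]),
    "head_on": ("and", ["both_enter", "signals_green"]),
}
```

If model-checking shows that no basic event is possible, `may_occur("collision", tree, set())` evaluates to `False`, which mirrors the conclusion that the top event cannot occur.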
Modelling. Our formal model of the single track line segment system in terms of Phase Automata describes the topology of the tracks and the movement of the two trains.
Experimental Results. The verification that a basic event cannot occur took 1:04:37 h for the hardest one. We used Uppaal (Version 3.2.11) on a Dual Pentium with 450 MHz and 1 GB RAM. Checking each gate condition takes about 10 seconds on a Sun Ultra-1 with 384 MB RAM using Phase Automata and Moby/DC.
Related Work
There are several approaches to define formal semantics for fault tree analysis. Special timed transition systems and a first order logic with special predicates are introduced by Gorski [Gór94] . Dugan et al. [DBB93] introduced Markov Models to resolve ambiguities. Bruns and Anderson [BA93] use a modal µ-calculus semantics to check the validity of formal system models. Hansen [Han96] gives a Duration Calculus semantics and uses fault tree analysis to derive safety requirements from a given fault tree. However, the work does not consider whether a fault tree is constructed properly.
In the FORMOSA project [RST00, STR02] semantics in Duration Calculus, CTL and ITL are considered. Discrete time model-checking, using Raven [Ruf01] and SMV, and fault tree analysis have been applied to several case studies, but they are used rather independently and are not tightly integrated; further integration is one aim of this project. Currently, embedding fault tree analysis in the interactive theorem prover KIV is being addressed.
The ESACS project (http://www.cert.fr/esacs/) uses fault tree analysis and model-checking in different areas. It is used for test-case generation from fault trees and for compilation of mode automata into a boolean formula, which is presented as a fault tree. Furthermore, a tool for the automatic generation of fault trees from a Statemate model is being developed. But neither the order of events nor time is considered in current versions of this tool.
Conclusion and Future Work
We have shown how fault tree analysis can be turned into a formal method and how model-checking can be applied to prove necessary and sufficient conditions of this analysis. In the case study we integrated fault tree analysis with two other formal techniques, PLC-Automata and Timed Automata, to verify a larger system.
In our future work we would like to investigate whether we captured all usual cases of events which might occur in fault trees. We also would like to implement tool support. This tool should compile a given fault tree into Phase Automata and check which gate conditions hold and which do not. Translation into other operational models like Timed Automata may also be considered.
A Complement-Construction for Sequence-Pattern
We give a construction for the complement of an automaton corresponding to the sequence pattern $\Diamond_L(\lceil\pi_1\rceil \land a_1 \sim \ell \,;\, \lceil\pi_2\rceil \land a_2 \sim \ell \sim b_2 \,;\, \ldots \,;\, \lceil\pi_{n-1}\rceil \land a_{n-1} \sim \ell \sim b_{n-1} \,;\, \lceil\pi_n\rceil \land a_n \sim \ell)$ with $\pi_i \land \pi_j \equiv \mathit{false}$ for all $i \neq j$. The cases where the relation < occurs are analogous. This case is simpler than the more general one where only $\pi_i \land \pi_{i+1} \equiv \mathit{false}$ is required. But the sequences which occurred in the case study presented in this paper were of this simpler type. For each state assertion $\pi_i$ which occurs in the given sequence we create four states.
-$p_i$, which is taken iff the assertion $\pi_i$ holds and the sequence up to $\pi_i$ has not yet been seen.
-$p_i^*$, which is taken iff the assertion $\pi_i$ holds and the sequence up to $\pi_i$ has been seen.
-$p_{i<}$, which is taken iff $\pi_i$ holds and the duration is too short.
-$p_{i>}$, which is taken iff $\pi_i$ holds and the duration is too long.
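As a small illustration of the four states created per state assertion, the following Python sketch enumerates their names for a sequence of length n. The naming scheme is an assumption, and the sketch performs no pruning, so the precise state space of the construction may be smaller than what is listed here.

```python
from typing import List


def states_per_assertion(n: int) -> List[str]:
    """Names of the four states created for each of the n state assertions."""
    states: List[str] = []
    for i in range(1, n + 1):
        states += [
            f"p{i}",    # assertion i holds, sequence up to it not yet seen
            f"p{i}*",   # assertion i holds, sequence up to it has been seen
            f"p{i}<",   # assertion i holds, duration too short
            f"p{i}>",   # assertion i holds, duration too long
        ]
    return states
```

For a sequence with three assertions this yields twelve candidate states.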
Additionally we have a state $p_{else}$ which is taken iff no state assertion in $\pi_1, \ldots, \pi_n$ holds. Let $A_S = (P, E, C, cl, s, d, P_0)$. The state space and transition relation are defined by
We associate exactly one clock to each state. The state assertions for each state, the initial states and the assigned clock intervals are defined as follows.
= {p 2 , . . . , p n , p *
