Abstract. Model checking timed automata becomes increasingly complex with the increase in the number of clocks. Hence it is desirable that one constructs an automaton with the minimum number of clocks possible. The problem of checking whether there exists a timed automaton with a smaller number of clocks such that the timed language accepted by the original automaton is preserved is known to be undecidable. In this paper, we give a construction, which for any given timed automaton produces a timed bisimilar automaton with the least number of clocks. Further, we show that such an automaton with the minimum possible number of clocks can be constructed in time that is doubly exponential in the number of clocks of the original automaton.
Introduction
Timed automata [3] are a formalism for modelling and analyzing real-time systems. The complexity of model checking depends on the number of clocks of the timed automaton (TA) [3, 2]. Many model checking and reachability algorithms use a region graph or a zone graph of the timed automaton, whose size is exponential in the number of clocks. Hence it is desirable to construct a timed automaton with the minimum number of clocks that preserves some property of interest. It is known that, given a timed automaton, checking whether there exists another timed automaton accepting the same timed language as the original one but with a smaller number of clocks is undecidable [9]. In this paper, we show that checking the existence of a timed automaton with a smaller number of clocks that is timed bisimilar to the original timed automaton is, however, decidable. Our method is constructive and we provide a 2-EXPTIME algorithm to construct the timed bisimilar automaton with the least possible number of clocks. We also note that if the constructed automaton has a smaller number of clocks, then there exists an automaton with a smaller number of clocks accepting the same timed language.
Related work: In [6], an algorithm has been provided to reduce the number of clocks of a given timed automaton and produce a new timed automaton that is timed bisimilar to the original one. The algorithm detects a set of active clocks at every location and partitions these active clocks into classes such that all the clocks belonging to a class in the partition always have the same value. However, this may not result in the minimum possible number of clocks since the algorithm works on the timed automaton directly rather than on its semantics. Thus if a constraint associated with clock x implies a constraint associated with clock y, and both of them appear on an edge, then the constraint with clock y can be eliminated. However, the algorithm of [6] does not capture such implication. Also, by considering constraints on more than one outgoing edge from a location collectively, e.g. $l_0 \xrightarrow{a,\,x \le 3,\,\emptyset} l_1$ and $l_0 \xrightarrow{a,\,x > 3,\,\emptyset} l_2$, we may sometimes eliminate constraints, which may in turn remove some clock. This too has not been accounted for by the algorithm of [6].
In [17] , it has been shown that no algorithm can decide the minimality of the number of clocks while preserving the timed language and for the non-minimal case find a timed language equivalent automaton with fewer clocks. Also for a given timed automaton, the problem of finding whether there exists another TA with fewer clocks accepting the same timed language is undecidable [9] .
Another result appearing in [14] which uses the region-graph construction is the following. A (C, M )-automaton is one with C clocks and M is the largest integer appearing in the timed automaton. Given a timed automaton A, a set of clocks C and an integer M , checking the existence of a (C, M )-automaton that is timed bisimilar to A is shown to be decidable in [14] . The method in [14] constructs a logical formula called the characteristic formula and checks whether there exists a (C, M )-automaton that satisfies it. Further, it is shown that a pair of automata satisfying the same characteristic formula are timed bisimilar. Our result and method differ from the above paper in the following three ways: (1) Given a timed automaton A, we construct a timed automaton B with the least number of clocks such that B is timed bisimilar to A. (2) We use a zone graph which is usually a much more succinct representation of the state space of the timed automaton. (3) Our method does not involve any intermediate step of creating a logical formula.
The rest of the paper is organized as follows: in Section 2, we describe timed automata and introduce several concepts that will be used in the paper. We also describe the construction of the zone graph used in reducing the number of clocks. In Section 3, we discuss our approach in detail along with a few examples. Section 4 is the conclusion. Due to space constraints, the proofs have been relegated to the appendix.
Timed Automata
Formally, a timed automaton (TA) is defined as a tuple $A = (L, Act, l_0, E, C)$ where $L$ is a finite set of locations, $Act$ is a finite set of visible actions, $l_0 \in L$ is the initial location, $E \subseteq L \times B(C) \times Act \times 2^C \times L$ is a finite set of edges and $C$ is a finite set of clocks. The set of constraints or guards on the edges, $B(C)$, is given by the grammar $g ::= x \bowtie k \mid g \wedge g$, where $k \in \mathbb{N}$, $x \in C$ and $\bowtie \in \{\le, <, =, >, \ge\}$. Given two locations $l, l'$, a transition from $l$ to $l'$ is of the form $(l, g, a, R, l')$, i.e. a transition from $l$ to $l'$ on action $a$ is possible if the constraints specified by $g$ are satisfied; $R \subseteq C$ is a set of clocks which are reset to zero during the transition.
The semantics of a timed automaton (TA) is described by a timed labelled transition system (TLTS) [1]. The timed labelled transition system $T(A)$ generated by $A$ is defined as $T(A) = (Q, Lab, Q_0, \{\xrightarrow{\alpha} \mid \alpha \in Lab\})$, where $Q = \{(l, v) \mid l \in L, v \in \mathbb{R}_{\ge 0}^{|C|}\}$ is the set of states, each of which is of the form $(l, v)$, where $l$ is a location of the timed automaton and $v$ is a valuation assigned to the clocks of $A$; $Lab = Act \cup \mathbb{R}_{\ge 0}$ is the set of labels. Let $v_0$ denote the valuation such that $v_0(x) = 0$ for all $x \in C$. $Q_0 = (l_0, v_0)$ is the initial state of $T(A)$. A transition may occur in one of the following ways:
(i) Delay transitions: $(l, v) \xrightarrow{d} (l, v + d)$. Here, $d \in \mathbb{R}_{\ge 0}$ and $v + d$ is the valuation in which the value of every clock is incremented by $d$.
(ii) Discrete transitions: $(l, v) \xrightarrow{a} (l', v')$ if, for an edge $(l, g, a, R, l') \in E$, $v \models g$ and $v' = v_{[R \leftarrow 0]}$, where $v_{[R \leftarrow 0]}$ denotes that every clock in $R$ has been reset to 0, while the remaining clocks are unchanged. That is, from a state $(l, v)$, if $v \models g$, then there exists an $a$-transition to a state $(l', v')$ in which the clocks in $R$ are reset while those in $C \setminus R$ remain unchanged.
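The two kinds of transitions can be illustrated with a small sketch. It is only an illustration of the semantics above for concrete valuations, not part of the construction in this paper; the names `Edge`, `satisfies`, `delay` and `fire` are hypothetical, and guards are assumed to be tuples of elementary constraints `(clock, op, k)`.

```python
import operator
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple

# Comparison operators allowed in elementary constraints.
OPS = {"<=": operator.le, "<": operator.lt, "==": operator.eq,
       ">": operator.gt, ">=": operator.ge}

Valuation = Dict[str, float]
Guard = Tuple[Tuple[str, str, int], ...]   # conjunction of (clock, op, k)
State = Tuple[str, Valuation]              # (location, valuation)


@dataclass(frozen=True)
class Edge:
    source: str
    guard: Guard
    action: str
    resets: FrozenSet[str]
    target: str


def satisfies(v: Valuation, guard: Guard) -> bool:
    """v |= g: every elementary constraint of the guard holds."""
    return all(OPS[op](v[x], k) for x, op, k in guard)


def delay(state: State, d: float) -> State:
    """Delay transition (l, v) --d--> (l, v + d)."""
    if d < 0:
        raise ValueError("delays must be non-negative")
    l, v = state
    return (l, {x: t + d for x, t in v.items()})


def fire(state: State, edge: Edge) -> Optional[State]:
    """Discrete transition along `edge`, or None if it is not enabled."""
    l, v = state
    if l != edge.source or not satisfies(v, edge.guard):
        return None
    return (edge.target, {x: (0.0 if x in edge.resets else t)
                          for x, t in v.items()})
```

For instance, with an edge guarded by x ≤ 3 and another guarded by x > 3 from the same location, delaying by 2.5 from the initial state enables only the first edge.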
For simplicity, we do not consider annotating locations with clock constraints (known as invariant conditions [13] ). Our results extend in a straightforward manner to timed automata with invariant conditions. In Section 3, we provide the modifications to our method for dealing with location invariants. We now define various concepts that will be used in the rest of the paper. Definition 1. Let A = (L, Act, l 0 , E, C) be a timed automaton, and T (A) be the TLTS corresponding to A.
1. Timed trace: A sequence of delays and visible actions $d_1 a_1 d_2 a_2 \ldots d_n a_n$ is called a timed trace iff there is a sequence of transitions $p_0 \xrightarrow{d_1} p_0' \xrightarrow{a_1} p_1 \xrightarrow{d_2} p_1' \xrightarrow{a_2} p_2 \cdots \xrightarrow{d_n} p_{n-1}' \xrightarrow{a_n} p_n$ in $T(A)$, with $p_0$ being the initial state of the timed automaton.
2. Zone: A zone $Z$ is a set of valuations $\{v \in \mathbb{R}_{\ge 0}^{|C|} \mid v \models \beta\}$, where $\beta$ is of the form $\beta ::= x \bowtie k \mid x - y \bowtie k \mid \beta \wedge \beta$, $k$ is an integer, $x, y \in C$ and $\bowtie \in \{\le, <, =, >, \ge\}$. $Z\uparrow$ denotes the future of the zone $Z$: $Z\uparrow = \{v + d \mid v \in Z, d \ge 0\}$ is the set of all valuations reachable from $Z$ by time elapse. A zone by definition is a convex set.
3. Pre-stability: A zone $Z_1$ of location $l_1$ is pre-stable with respect to another zone
4. Canonical decomposition: Let $g = \gamma_1 \wedge \dots \wedge \gamma_n$ be a guard, where each $\gamma_i$ is an elementary constraint of the form $x_i \bowtie k_i$, such that $x_i \in C$ and $k_i$ is a non-negative integer. A canonical decomposition of a zone $Z$ with respect to $g$ is obtained by splitting $Z$ into a set of zones $Z_1, \ldots, Z_m$ such that for each $1 \le j \le m$ and $1 \le i \le n$, either $\forall v \in Z_j,\ v \models \gamma_i$ or $\forall v \in Z_j,\ v \not\models \gamma_i$. For example, consider the zone $Z = x \ge 0 \wedge y \ge 0$ and the guard $x \le 2 \wedge y > 1$.
$Z$ is split with respect to $x \le 2$, and then with respect to $y > 1$, hence into four zones: $x \le 2 \wedge y \le 1$, $x > 2 \wedge y \le 1$, $x \le 2 \wedge y > 1$ and $x > 2 \wedge y > 1$. An elementary constraint $x_i \bowtie k_i$ induces the hyperplane $x_i = k_i$ in the zone graph of the timed automaton.
5. Zone graph: Given a timed automaton $A = (L, l_0, E, I)$, a zone graph $G_A$ of $A$ is a transition system $(S, s_0, Lep, \rightarrow)$ that is a finite representation of $T(A)$. Here $Lep = Act \cup \{\varepsilon\}$. $G_A$ consists of nodes and transitions, which are the edges of $G_A$. $S \subseteq L \times \{Z\}$ is the set of nodes, $\{Z\}$ being the set of zones of $A$. The node
Here the zone $Z'$ is called a delay successor of zone $Z$, while $Z$ is called the delay predecessor of $Z'$. We call a zone $Z$ corresponding to a location a base zone if $Z$ does not have a delay predecessor. The relation $\xrightarrow{\varepsilon}$ is reflexive and transitive and so is the delay successor relation. A zone $Z'$ is called the immediate delay successor of a zone $Z$ iff $Z'$ is a delay successor of $Z$ and there does not exist any zone $Z''$ such that $Z''$ is a delay successor of $Z$, $Z'$ is a delay successor of $Z''$ and $Z \ne Z'' \ne Z'$.
6. A hyperplane $x = k$ is said to bound a zone $Z$ from above if $k$ is the smallest integer such that
A zone, in general, can be bounded above by several hyperplanes. A hyperplane x = k is said to bound a zone Z fully from above if k is the smallest integer such that
Analogously, we can say that a hyperplane $x = k$ bounds a zone from below if $k$ is the largest integer and there exists a valuation $v \in Z$ such that
Similarly, we can also define a hyperplane bounding a zone fully from below. When not specified otherwise, in this paper, a hyperplane bounding a zone implies that it bounds the zone from above. A zone Z is said to be bounded above if it has an immediate delay successor zone.
We create a zone graph such that for any location $l$ the zones $Z$ and $Z'$ of any two nodes $(l, Z)$ and $(l, Z')$ in the zone graph are disjoint and all zones of the zone graph are pre-stable. This zone graph is constructed in two phases in time exponential in the number of clocks. The first phase performs a forward analysis of the timed automaton while the second phase ensures pre-stability in the zone graph. The forward analysis may cause a zone graph to become infinite [10]. Several kinds of abstractions have been proposed in the literature [5, 10, 11] to make the zone graph finite. We use location dependent maximal constants abstraction [10] in our construction. In phase 2 of the zone graph creation, the zones are further split to ensure that the resultant zone graph is pre-stable.

Fig. 2. Zone graph for the TA in Figure 1.
The following lemma states an important property of the zone graph which will further be used for clock reduction.
Lemma 1. Pre-stability ensures that there exists a hyperplane x = h, where x is some clock and h ∈ Z such that a zone is bounded fully from above by the hyperplane x = h.
Some approaches for preserving convexity and implementing pre-stability have been discussed in [18]. As an example, consider the timed automaton in Figure 1. The pre-stable zones of location $l_1$ are shown in the right side of the figure. An algorithmic procedure for the construction of the zone graph is given in [12].
A relation $R \subseteq Q \times Q$ is a timed simulation relation if the following conditions hold for any two timed states $(p, q) \in R$: $\forall a \in Act$, $p \xrightarrow{a} p' \implies \exists q' : q \xrightarrow{a} q'$ and $(p', q') \in R$, and $\forall d \in \mathbb{R}_{\ge 0}$, $p \xrightarrow{d} p' \implies \exists q' : q \xrightarrow{d} q'$ and $(p', q') \in R$.
A timed bisimulation relation is a symmetric timed simulation. Two timed automata are timed bisimilar if and only if their initial states are timed bisimilar.
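On a finite transition system, the largest bisimulation can be computed as a greatest fixed point. The sketch below is a generic, naive check of strong bisimilarity on a finite labelled transition system; it is not the zone-graph procedure of [12], and it assumes that delay steps have already been abstracted into ordinary labels. The function name `bisimilar` is hypothetical.

```python
from itertools import product


def bisimilar(trans, p0, q0):
    """Decide whether p0 and q0 are bisimilar in a finite LTS.

    trans is a set of (source, label, target) triples. The relation starts
    as all pairs of states and pairs violating the transfer condition are
    removed until nothing changes (greatest fixed point).
    """
    states = {p0, q0} | {s for s, _, _ in trans} | {t for _, _, t in trans}
    labels = {a for _, a, _ in trans}
    succ = {}
    for s, a, t in trans:
        succ.setdefault((s, a), set()).add(t)

    rel = set(product(states, states))
    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            # Every move of p must be matched by q ...
            forth = all(any((p2, q2) in rel for q2 in succ.get((q, a), ()))
                        for a in labels for p2 in succ.get((p, a), ()))
            # ... and every move of q must be matched by p.
            back = all(any((p2, q2) in rel for p2 in succ.get((p, a), ()))
                       for a in labels for q2 in succ.get((q, a), ()))
            if not (forth and back):
                rel.discard((p, q))
                changed = True
    return (p0, q0) in rel
```

The quadratic relation makes this suitable only for small examples; partition refinement would be the usual choice for larger systems.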
Clock Reduction
Unlike the method described in [6] , which works on the syntactic structure of the timed automaton, we use a semantic representation, the zone graph described in Section 2 to capture the behaviour of the timed automaton. This helps us to reduce the number of clocks in a more effective way. For a given TA A, we first describe a sequence of stages to construct a TA A 4 that is timed bisimilar to A. Later we prove the minimality in terms of the number of clocks for the TA A 4 . The operations involved in our procedure use a difference bound matrix (DBM) [4, 7] representation of the zones. A DBM for a set C = {x 1 , x 2 , . . . , x n } of n clocks is an (n + 1) square matrix E where an extra clock x 0 is introduced such that clock x 0 is always 0. An element E ij is of the form (m ij , ≺) where ≺∈ {<, ≤} such that x i − x j ≺ m ij . The following are important considerations in reducing the number of clocks.
-There may be some clock constraints on an edge of the timed automaton that do not affect the transition. Such constraints may be removed. The time required in this stage is proportional to the size of the zone graph and hence exponential in the number of clocks of the timed automaton.
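The DBM representation mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used for the construction: a bound $(m, \prec)$ is encoded as the tuple `(m, 1)` for ≤ and `(m, 0)` for <, so that Python's tuple ordering orders bounds by tightness, and the helper names (`new_dbm`, `close`, `constrain`, `is_empty`, `implies`) are hypothetical. The `implies` helper shows how one can test whether a zone already entails an elementary constraint, which is the kind of check needed to decide that a constraint does not affect a transition.

```python
import math

INF = (math.inf, 0)   # no bound on x_i - x_j
LE0 = (0, 1)          # the bound "<= 0"


def add(a, b):
    """Sum of two bounds; the result is strict if either operand is."""
    return (a[0] + b[0], min(a[1], b[1]))


def new_dbm(n):
    """DBM over n clocks (index 0 is the reference clock x_0 = 0)
    describing all valuations with x_i >= 0."""
    d = [[INF] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][i] = LE0
        d[0][i] = LE0   # x_0 - x_i <= 0, i.e. x_i >= 0
    return d


def close(d):
    """Tighten all bounds (Floyd-Warshall); modifies and returns d."""
    n = len(d)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], add(d[i][k], d[k][j]))
    return d


def constrain(d, i, j, m, le):
    """Return a closed copy of d intersected with x_i - x_j (<= or <) m."""
    e = [row[:] for row in d]
    e[i][j] = min(e[i][j], (m, 1 if le else 0))
    return close(e)


def is_empty(d):
    """A closed DBM is empty iff some diagonal entry is below (0, <=)."""
    return any(d[i][i] < LE0 for i in range(len(d)))


def implies(d, i, j, m, le):
    """Zone d entails x_i - x_j (<= or <) m iff d and the negated
    constraint, x_j - x_i (< or <=) -m, have an empty intersection."""
    return is_empty(constrain(d, j, i, -m, not le))
```

With clocks x and y at indices 1 and 2, the zone 0 ≤ y − x < 2 intersected with y > 7 entails x > 5, whereas the zone 0 ≤ y − x < 2 alone does not.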
Stage 2: Splitting locations and removing constraints not affecting transitions: Locations may also need to be split in order to reduce the number of clocks of a timed automaton. Let us consider the example timed automaton of Figure 1 and its zone graph in Figure 2. There are three base zones corresponding to location $l_1$ in the zone graph, i.e. $Z_1 = \{0 \le y - x < 2, x \le 5\}$, $Z_2 = \{y - x = 2, x \le 5\}$ and $Z_3 = \{2 < y - x \le 4, y \le 7\}$. This stage splits $l_1$ into three locations $l_{11}$, $l_{12}$ and $l_{13}$ (one for each of the base zones $Z_1$, $Z_2$ and $Z_3$) as shown in Figure 3(a). While the original automaton, in Figure 1, contains two elementary constraints on the edge between $l_1$ and $l_2$, the modified automaton, in Figure 3(a), contains only one of these two elementary constraints on the outgoing edges from each of $l_{11}$, $l_{12}$ and $l_{13}$ to $l_2$. Subsequent stages modify it further to generate an automaton using a single clock as in Figure 3(b).
Splitting ensures that only those constraints, that are relevant for every valuation in the base zone of a newly created location, appear on the edges originating from that location. Since clocks can be reused while describing the behaviours from the individual locations created after the split, this may lead to a reduction in the number of clocks. We describe a formal procedure for splitting a location into multiple locations in Algorithm 1. We note that a zone can as well be perceived as a set of constraints defining it. Similarly a guard can also be considered in terms of the valuations satisfying it. Input of this algorithm, A 1 is the TA obtained after stage 1. If there are m base zones in G A1 corresponding to a location l i in A 1 , then Line 3 and Line 4 split l i into m locations l i1 , · · · l im in the new automaton, say A 2 . For each of these newly created locations Line 6 to Line 11 determine the constraints on their incoming edges.
For each incoming edge $l_r \xrightarrow{a,\,g_r,\,R_r} l_i$, there exists a zone $Z_{rj}$ such that $Z_{rj}$ has an $a$ transition to $Z_{ij}$, the $j$-th base zone of $l_i$. Line 8 calculates the lower bounding hyperplane of $Z_{ij}$ by resetting the clocks $R_r$ in the intersection of $Z_{ij}\uparrow$ with $\mathbb{R}_{\ge 0}^{|C|}$. In Line 11, $free(Z_{ij}, R_r)$ represents a zone that becomes the same as $Z_{ij}$ after resetting the clocks in $R_r$. Further, $g_{rj}$ is calculated as the weakest guard that simultaneously satisfies the constraints $g_r$, $Z_{rj}$ and $free(Z_{ij}, R_r)$ and has the same set of clocks as in $g_r$. For our running example if we consider
We can see that x < 2 is the weakest formula such that x ≤ 4 ∧ x ≥ 0 ∧ y < 2 ∧ x = y ∧ x < 2 ⇒ x < 2 holds and hence g rj = {x < 2}.
The loop from Line 14 to Line 28 determines the constraints on the outgoing edges from these new locations. Line 15 checks if the zone $Z_{ij}\uparrow$ has any valuation that satisfies the guard $g_i$ on an outgoing edge from location $l_i$. If no satisfying valuation exists then this transition will never be enabled from $l_{ij}$ and hence this edge is not added in $A_2$. The loop from Line 19 to Line 26 checks if some elementary constraints of the guard are implied by other elementary constraints of the same guard. If so, we can remove from the guard those elementary constraints that are implied by the other elementary constraints.
Algorithm 1
    ...
 4: Remove location $l_i$ and all incoming and outgoing edges to and from $l_i$
 5: for each $j$ in 1 to $m$ do
 6:   for each incoming edge $l_r \xrightarrow{a,\,g_r,\,R_r} l_i$ do    ▷ Split the constraints on the incoming edges to $l_i$ for the newly created locations
 8:     Let $Z_{ij}$ be the base zone corresponding to $l_{ij}$
 9:     Let $Z_{rj}$ be a zone of location $l_r$ from which there is an $a$ transition to $Z_{ij}$
10:     Let $g_{rj}$ be the weakest formula such that
11:     $g_r \wedge free(Z_{ij}, R_r) \wedge Z_{rj} \Rightarrow g_{rj}$ and $g_{rj}$ has a subset of the clocks used in $g_r$
12:     Create an edge $l_r \xrightarrow{a,\,g_{rj},\,R_r}$ ...
13:   end for
14:   for each outgoing edge $l_i$ ...
    ...
16:     Do not create this edge from $l_{ij}$ to $l_r$ since it is never going to be enabled for any valuation of $Z_{ij}$
    ...
18:     Let $S_r$ be the set of elementary constraints in $g_i$
    ...
22:     else
23:       Create an edge $l_{ij} \xrightarrow{a,\,g_i,\,R_i}$ ...
    ...
      end for
30: end for

For our running example, the modified automaton of Figure 3(a) does not contain the constraint $x > 5$ on the edge from $l_{11}$ to $l_2$ even though it was present on the edge from $l_1$ to $l_2$. The reason is that the future of the zone of $l_{11}$ (that is, $0 \le y - x < 2$) along with the constraint $y > 7$ implies $x > 5$; hence we do not need to put $x > 5$ explicitly on the outgoing edge from $l_{11}$ to $l_2$. Such removal of elementary constraints helps future stages to reduce the number of clocks. The maximum number of locations produced in the timed automaton as a result of the split is of the same order as the number of zones in the zone graph. However, we note that the zones of a location in the original TA are distributed across multiple locations as a result of the split. This gives us the following lemma.
Lemma 3. The splitting procedure does not increase the underlying state space of the original TA.
Splitting locations and removing constraints as described above do not alter the behaviour of the timed automaton, which leads us to the following lemma.
Lemma 4. The operations in stage 2 produce a timed automaton A 2 that is timed bisimilar to the TA A 1 obtained at the end of stage 1.
Lemma 5. The splitting procedure described in this stage does not increase the number of clocks but may cause the timed automaton to have |L A2 | locations where |L A2 | is exponential in the number of the clocks of the given timed automaton A. However, there is no increase in the underlying state space of the TA.
The number of locations after the split can be exponential in the number of the clocks. The constraints on the incoming edges of l are also split appropriately into constraints on the incoming edges of the newly created locations. Hence this stage too runs in time that is exponential in the number of the clocks of timed automaton.
Stage 3: Removing constraints by considering multiple edges with the same action: We consider the example in Figure 4 again. Note that the constraints x ≤ 3 and x > 3 on the edges from l 0 to l 1 and from l 0 to l 2 respectively could as well be merged together to produce a constraint without any clock.
For every action a enabled at any location l, this stage checks whether a guard enabling that action at l can be merged with another guard enabling the same action at that location such that the timed bisimilarity is preserved. The transformation made in this stage has been formally described in Algorithm 2. The input to this algorithm is the TA obtained after stage 2, say A 2 . For each location l i , the algorithm does the following: for every action a ∈ Act, it determines the zones of l i from which action a is enabled. We call this set Z ia . Zone graph construction and splitting of locations in stage 2 ensures that all zones in Z ia form a linear chain connected by edges (as shown in Figure 5 ). We use to capture this total ordering relation. Let us use ordered indexed variable 1, . . . , m to name the zones in this total order, i.e.
Lemma 1 ensures that for each $Z_{i_k}$, $k \ge 1$, there exists a hyperplane that bounds the zone fully from above and, similarly, for each $Z_{i_k}$, $k > 1$, there exists a hyperplane that bounds the zone fully from below. For a zone $Z$, let $LB(Z)$ and $UB(Z)$ denote these lower and upper bounding hyperplanes of $Z$ respectively. Let $\Gamma_{(l_i,a)} = \{g \mid l_i \xrightarrow{a,\,g,\,R} l' \in E_{A_2}\}$ be the set of guards on the outgoing edges from $l_i$ in $A_2$ which are labelled with $a$. For any $g \in \Gamma_{(l_i,a)}$ let us define the following:
- $Strt(g)_{(l_i,a)} = Z \in Z_{ia}$ is the zone in $Z_{ia}$ which is bounded from below by the same constraints as the lower bound of constraints in $g$.
- $End(g)_{(l_i,a)} = Z \in Z_{ia}$ is the zone in $Z_{ia}$ which is bounded from above by the same constraints as the upper bound of constraints in $g$.
} is the set of zones ordered by relation in between Strt(g) (li,a) and End(g) (li,a) .
In Algorithm 2 we use a rather informal notation g := [C1, C2] to denote that C1 and C2 are the constraints defining the lower and upper bound of g. We define a total order ≪ on
. Similar to the zones let us use ordered indexed variable g i1 , . . . , g ip to denote
One such total order on guards is shown in Figure 5. The loop from Line 5 to Line 38 in Algorithm 2 traverses the elements of $\Gamma_{(l_i,a)}$ in this total order with the help of a variable next initialized to 2. In every iteration of this loop the invariant $g_{curr} \ll g_{i_{next}}$ holds. Three possibilities exist based on whether the set union of zones corresponding to these guards is (i) not convex, (ii) convex but non-overlapping, or (iii) convex as well as overlapping. If the union is non-convex then both $g_{curr}$ and index are changed in Line 7 to pick the next ordered pair in this order. For cases (ii) and (iii), new guards are created by merging corresponding zones as long as the modified automaton preserves timed bisimilarity. If timed bisimilarity is preserved then the modified automaton is set as the current automaton (Line 13 and Line 28) and next is incremented to process the next guard. Otherwise the guard $g_{curr}$ is set to $g_{i_{next}}$ and next is incremented by 1 (Line 15 and Line 36). Therefore the only difference in these two cases is in creating the new guard.
For case (ii), convex but non-overlapping zones, a new guard is created from the lower bound of Strt(g curr ) (li,a) and the upper bound of End(g inext ) (li,a) . For case (iii), there are three possibilities of combining guards, mentioned in Line 20, Line 22 and Line 24. First possibility is the same as in case (ii). Second and third possibilities are replacing the upper bound of g curr with the lower bound of Strt(g inext ) (li,a) and the lower bound of g inext with the upper bound of End(g curr ) (li,a) respectively.
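The convexity test and the first way of combining guards can be sketched for the simplest situation, where each guard constrains a single clock and is therefore an interval. This is a simplified illustration only: the `Interval` type and the functions `union_is_convex` and `merge` are hypothetical names, the intervals are assumed to be non-empty, and the timed bisimilarity check that Algorithm 2 performs before accepting a merged guard is deliberately left out.

```python
import math
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """Set of values of one clock: lo (<= or <) x (<= or <) hi."""
    lo: float
    lo_closed: bool
    hi: float
    hi_closed: bool


def union_is_convex(g: Interval, h: Interval) -> bool:
    """True if the intervals overlap or touch without leaving a gap."""
    a, b = sorted([g, h], key=lambda i: (i.lo, not i.lo_closed))
    if b.lo < a.hi:
        return True                      # overlapping, case (iii)
    if b.lo == a.hi:
        return a.hi_closed or b.lo_closed  # touching; adjacent is case (ii)
    return False                         # a gap in between, case (i)


def merge(g: Interval, h: Interval) -> Optional[Interval]:
    """Guard spanning from the smaller lower bound to the larger upper
    bound, or None when the union is not convex."""
    if not union_is_convex(g, h):
        return None
    lo = min((g.lo, not g.lo_closed), (h.lo, not h.lo_closed))
    hi = max((g.hi, g.hi_closed), (h.hi, h.hi_closed))
    return Interval(lo[0], not lo[1], hi[0], hi[1])
```

For example, merging x ≤ 3, written `Interval(0, True, 3, True)`, with x > 3, written `Interval(3, False, math.inf, False)`, gives `Interval(0, True, math.inf, False)`, which holds for every value of the clock and therefore needs no clock at all.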
A zone graph captures the behaviour of the timed automaton and hence timed bisimilarity between two TAs can be checked using their zone graphs [19, 12]. This is the reason for creating the pre-stable zone graph as described in Section 2, as it enables one to directly check timed bisimilarity on this zone graph [12].

Algorithm 2
    ...
 3: for each $a \in sort(l_i)$ do    ▷ $sort(l_i)$ is the set of actions that can be performed from $l_i$
    ...
11:   Let $A'$ be the TA obtained by replacing all occurrences of $g_{curr}$ and $g_{i_{next}}$ with $g'_{curr}$ and $g'_{i_{next}}$ respectively in $A_3$
12:   if $A'$ is timed bisimilar to $A_3$ then
    ...
18:   ▷ There are three ways to combine $g_{curr}$ and $g_{i_{next}}$, and
19:   ▷ resultant new guards should be checked for timed bisimilarity in the following order
    ...
      while $1 \le i \le 3$ do    ▷ Corresponding to the three cases above
27:     Let $A'$ be the TA obtained by replacing all occurrences of $g_{curr}$ and $g_{i_{next}}$ with the $i$-th $g'_{curr}$ and $g'_{i_{next}}$ respectively in $A_3$
28:     if $A'$ is timed bisimilar to $A_3$ then
29:       $A_3 := A'$, $g_{curr} := g_{i_{next}}$, next := ...
    ...

As mentioned above, in this stage, while merging the constraints, timed bisimilarity is checked and the number of bisimulation checks is bounded by the number of zones in the TA obtained after stage 2. Checking timed bisimilarity is done in EXPTIME [15]. The zone graph is constructed prior to every bisimulation check and the construction is done in EXPTIME. Hence this entire stage runs in EXPTIME.
Stage 4: Active clocks, clock replacement and renaming: Given a location $l$, an iterative method for finding the set of active clocks at $l$, denoted $act(l)$, is given in [6]. The method has been modified and is stated below for the situation where clock assignments (of the form $x := y$, where $x, y \in C$) are disallowed.
Determining active clocks: For a location $l$, let $clk(l)$ be the set of clocks that appear in the constraints on the outgoing edges of $l$ after the original timed automaton is modified through the previous three stages. Let $\rho : (2^C \times E) \to 2^C$ be a partial function such that for an edge $e = l \xrightarrow{g,\,a,\,R} l'$, $\rho(act(l'), e)$ gives the set of active clocks of $l'$ that are not reset along $e$. For all $l \in L$, $act(l)$ is the limit of the convergent sequence $act_0(l) \subseteq act_1(l) \subseteq \ldots$ such that $act_0(l) := clk(l)$ and $act_{i+1}(l) := act_i(l) \cup \bigcup_{e = (l, g, a, R, l') \in E} \rho(act_i(l'), e)$.
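The fixed-point computation of active clocks can be sketched as follows. This is an illustrative sketch, not the exact procedure of [6]: edges are assumed to be given as tuples `(source, guard_clocks, resets, target)`, the function name `active_clocks` is hypothetical, and the sets are updated in place edge by edge rather than in synchronous rounds, which reaches the same least fixed point.

```python
def active_clocks(locations, edges):
    """Compute act(l) for every location l.

    edges: iterable of (source, guard_clocks, resets, target), where
    guard_clocks is the set of clocks occurring in the edge's guard and
    resets is the set of clocks reset along the edge.
    """
    edges = list(edges)
    act = {l: set() for l in locations}

    # act_0(l) = clk(l): clocks tested on the outgoing edges of l.
    for src, guard_clocks, _, _ in edges:
        act[src] |= set(guard_clocks)

    # Propagate backwards the clocks that are active at the target
    # and not reset along the edge, until nothing changes.
    changed = True
    while changed:
        changed = False
        for src, _, resets, tgt in edges:
            carried = act[tgt] - set(resets)
            if not carried <= act[src]:
                act[src] |= carried
                changed = True
    return act
```

A clock that is reset on every path before it is next tested never becomes active at the source location, which is what later allows its name to be reused.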
Removing redundant resets: Once we find the active clocks of a location $l$, we remove all resets of clock $x$ on the incoming edges of $l$ if $x \notin act(l)$.
Partitioning active clocks: Using the DBM representation of the zones, one can determine from the set of active clocks in every location whether some of the clocks in the timed automaton can be expressed in terms of other clocks and thus be removed. Here we investigate the presence of such clocks. Any $x, y \in act(l)$ belong to an equivalence class iff the same relation (of the form $x - y = k$, for some integer $k$) is maintained between these clocks across all zones of $l$. This is checked using the DBM of the zones of $l$. In this case either $x$ can be replaced by $y + k$ or $y$ can be replaced by $x - k$. Let $\pi_l$ be the partition induced by this equivalence relation.
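The partitioning step can be sketched as below. The sketch assumes that each zone of the location is available as a dictionary mapping an ordered pair of clocks `(x, y)` to the bound `(m, le)` on x − y taken from a canonical DBM, with `le` true for ≤; this encoding and the names `fixed_difference` and `partition` are assumptions made for illustration.

```python
def fixed_difference(zone, x, y):
    """Return k if the zone forces x - y = k, otherwise None."""
    up = zone.get((x, y))    # x - y <= m
    low = zone.get((y, x))   # y - x <= m'
    if up and low and up[1] and low[1] and up[0] == -low[0]:
        return up[0]
    return None


def partition(active, zones):
    """Group active clocks whose pairwise difference is the same
    constant in every zone of the location."""
    classes = []
    for x in sorted(active):
        for cls in classes:
            rep = cls[0]     # compare against the class representative
            ks = {fixed_difference(z, x, rep) for z in zones}
            if len(ks) == 1 and None not in ks:
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes
```

Comparing only against a representative suffices because constant differences compose: if x − y and y − z are fixed constants in every zone, so is x − z.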
We note that the size of the largest partition does not give the minimum possible number of clocks required to represent a TA while preserving timed bisimulation. An example is shown in Figure 6(a) . Though the automaton in the figure has two active clocks in every location, a timed bisimilar TA cannot be constructed with only two clocks. Assigning the minimum number of clocks to represent the timed automaton so that timed bisimilarity is preserved can be reduced to the problem of finding the chromatic number of a graph as described below.
Clock graph colouring and clock renaming : A clock graph, G
Moreover, if at least one clock, say c, is common in two classes corresponding to two different locations without any intervening reset of c then only one vertex represents these two classes. For example, in Figure 6 , clock x is active in both locations 0 and 1. {x} forms a class in the partition of the active clocks for each of locations 0 and 1. Thus we create vertices x 0 and x 1 corresponding to these two classes. However, since there is no intervening reset of clock x between locations 0 and 1, the vertices x 0 and x 1 are merged together in the clock graph. Thus after merging some classes into one class, the resultant class can have active clocks corresponding to multiple locations. For a class T , let loc(T ) represent the set of locations whose active clocks are members of T .
Finding the minimum number of clocks to represent the TA D is thus equivalent to colouring this graph with the minimum number of colours so that no two adjacent vertices have the same colour. The number of colours gives the minimum number of clocks required to represent the TA. If a colour c is assigned to a vertex r, then all the clocks in the class corresponding to r, say T , are renamed c. The value of c can be chosen to be equal to some clock in T that is considered to be the representative clock for that class. The constraints involving the rest of these clocks in T are adjusted appropriately and any resets of the clocks, different from the representative clock, present on the incoming edges to l such that l ∈ loc(T ) are also removed.
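For small clock graphs the minimum colouring can be found by exhaustive search. The sketch below is a brute-force illustration and not one of the exact algorithms of [16, 8]; the function name `chromatic_number` and the vertex names in the usage note are hypothetical.

```python
from itertools import product


def chromatic_number(vertices, edges):
    """Return (k, colouring) where k is the least number of colours such
    that no edge joins two vertices of the same colour.

    Tries k = 1, 2, ... and enumerates all assignments, so the running
    time is exponential in the number of vertices.
    """
    vertices = list(vertices)
    if not vertices:
        return 0, {}
    for k in range(1, len(vertices) + 1):
        for colours in product(range(k), repeat=len(vertices)):
            col = dict(zip(vertices, colours))
            if all(col[u] != col[v] for u, v in edges):
                return k, col
```

For a clock graph that is a triangle, say with vertices `"x01"`, `"w12"` and `"y02"` that are pairwise adjacent, the search returns three colours, so three clocks are needed even though each location may have only two active clocks.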
For example, suppose vertex r corresponds to a class T having clocks x, y and z such that the valuations of the clocks are related as : x − y = k 1 and y − z = k 2 . If colour c is assigned to vertex r, then the clocks x, y and z in class T are replaced with c. If the value of clock c is chosen to be the same as clock y, then every occurrence of x in T is replaced with y + k 1 , while every occurrence of z in T is replaced with y − k 2 in the constraints involving x and z. The corresponding resets of clocks x and z are also removed.
In Figure 6(a), a TA with three locations is shown. In locations 0, 1 and 2, the sets of active clocks are {x, y}, {w, x} and {w, y} respectively. At every location, in this example, each of the active clocks by itself forms a class of the partition. Since there are six classes in total, we initially draw six vertices. As mentioned earlier, the vertices $x_0$ and $x_1$ are merged into a single vertex. Similarly, $w_1$, $w_2$ and $y_0$, $y_2$ are also merged. We call the resultant vertices $x_{0,1}$, $w_{1,2}$ and $y_{0,2}$. Adding the edges as described previously, we get the clock graph, which is a triangle, as shown in Figure 6.

Algorithm for stage 4
   ...
2:   Determine $act(l)$
3:   Remove resets of clocks on the incoming edges of $l$ that are not active at $l$
4:   Determine partition $\pi(l)$ of $act(l)$
5: end for
6: Construct the clock graph and colour it with the minimum possible number of colours.
7: In each class of the partition, replace all clocks in the class with a representative clock of the class and modify the constraints involving each clock that is not a representative clock appropriately. Remove resets of those clocks that are not representative clocks of a class from the incoming edges of the locations the class corresponds to, to obtain the final TA $A_4$.

Since this stage consists of finding the active clocks and renaming them, we have the following lemma.
Lemma 7. The operations in stage 4 produce a timed automaton A 4 that is timed bisimilar to the TA A 3 obtained after stage 3.
Further, the problem of determining the chromatic number of a graph can be reduced to the problem of clock renaming. Since the reduction also holds in the other direction, as described above, we have the following lemma.
Lemma 8. The problem of clock renaming operation on the clock graph as described above is NP-complete in the size of the input.
We now look at the complexity of the operations in this stage. The sequence of computations of active clocks converges within $n$ iterations and every iteration runs in time $O(|E|)$, where $n$ and $|E|$ are respectively the number of locations and edges in the timed automaton after the first three stages. This is due to the fact that in iteration $i$, for some location $l$, its active clocks are updated so as to include the active clocks of the locations $l'$ that are not reset between $l$ and $l'$, where there exists a path of length at most $i$ between $l$ and $l'$. In each iteration, each edge is traversed once for updating the set of active clocks of the locations. Thus the complexity of finding active clocks is $O(n \times |E|)$. Partitioning the active clocks of each of the locations too requires traversing the zone graph and checking the clock relations from the DBM of the zones. This can be done in time of the order of the size of the zone graph times the size of the DBM, which is in EXPTIME.
Finally, determining the chromatic number of a graph is possible in time exponential in the number of the vertices of the graph [16, 8] . Since the number of locations after the splitting operation in stage 2 is exponential in the number of clocks in A, renaming the clocks using the clock graph runs in time doubly exponential in the number of the clocks of the original timed automaton A. Thus we have the following theorem. Theorem 1. The stages mentioned above run in 2-EXPTIME.
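The doubly exponential bound can be traced with a short calculation. This is a sketch under assumptions that are not spelled out in the text: $p$ is a hypothetical polynomial bounding the exponent of the zone graph size, the constants in the guards are treated as fixed, $V$ denotes the vertex set of the clock graph, and each location is assumed to contribute at most $|C|$ classes to it.

$$
|L_{A_2}| \le 2^{p(|C|)}, \qquad
|V| \le |C| \cdot |L_{A_2}| \le |C| \cdot 2^{p(|C|)}, \qquad
T_{\mathrm{colour}} = 2^{O(|V|)} \le 2^{O\left(|C| \cdot 2^{p(|C|)}\right)}.
$$

Since the other steps are stated to run in time at most exponential in the number of clocks, the colouring step dominates the overall running time.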
In the presence of an invariant condition, considering an edge $l \xrightarrow{g,\,a,\,R} l'$, a zone $Z'$ of $l'$ is initially created from a zone $Z$ of $l$ such that

Proof of minimality of clocks: Given a TA $A$, let $A_4$ be the TA obtained from $A$ through the four stages described earlier. We can show that for each location $l$ in $A_4$, for every clock $x \in act(l)$, there exists at least one constraint involving clock $x$ which is indispensable for any TA that preserves timed bisimilarity.
Lemma 9.
In the TA $A_4$, for each location $l$ and clock $x \in act(l)$, there exists at least one constraint involving $x$ on some outgoing edge of $l$ or on the outgoing edge of another location $l'$ reachable from $l$ such that there is at least one path from $l$ to $l'$ without any intervening reset of $x$. Moreover, the TA obtained by removing the constraint is not timed bisimilar to the given timed automaton $A$.
Proof sketch: In stage 1, the construction of the zone graph removes from the given TA $A$ the edges and the constraints that never enable a transition, producing the TA $A_1$. In stage 2, a location $l$ in $A_1$ is split corresponding to every base zone of $l$. This causes some guards $g$ on the outgoing edges of location $l$ in $A_1$ to be modified while creating $A_2$. In $A_2$, the locations created by splitting location $l$ retain only those elementary constraints of $g$ that affect the behaviour of all the valuations in the base zone of the newly created locations. Thus further splitting of the locations does not eliminate any of the elementary constraints on the guards.
In stage 3, some of the consecutive zones connected by ε edges, such that an action $a$ is enabled from each of them, are merged. This results in some of the guards being combined, as specified in Lines 9 and 10 of Algorithm 2, or in some of the elementary constraints being removed, as done in cases (ii) and (iii) (Lines 22 to 25), whenever such merging of guards or removal of elementary constraints preserves timed bisimilarity. Thus after stage 3, none of the elementary constraints appearing in the guards on the edges can be removed any further. Removing any of them changes the behaviour, and the resultant TA is no longer timed bisimilar.
In stage 4, clocks are renamed and thus the lemma holds for the TA $A_4$. A more rigorous proof that uses an induction on the structure of the TA $A_4$ is given in the appendix. One can show that the clocks of a minimal bisimilar TA can replace the clocks of $A_4$, which gives us the following lemma.
Lemma 10. The timed automaton A 4 has the same number of clocks as a minimal bisimilar TA for A.
Proof sketch: As stated in the proof of Lemma 9, all the elementary constraints in each of the guards are necessary. In stage 4, clock renaming is further done in such a way that the constraints can be specified using the minimum number of clocks, i.e. no other renaming can lead to a smaller number of clocks than used in $A_4$.
Since all the elementary constraints are necessary to preserve timed bisimilarity, any TA timed bisimilar to $A$ will have constraints that affect the behaviour of the timed automaton in the same way as the constraints of $A_4$. Formally, we can show that the clocks of a minimal bisimilar TA can replace the clocks of $A_4$. Replacing the clocks can be considered to be a renaming operation, and since after stage 4 no renaming can reduce the number of clocks of $A_4$ any further, the number of clocks in $A_4$ is equal to the number of clocks in the minimal bisimilar TA.
Corollary 1. The timed automaton $A_4$ obtained by applying the four stages described above has the minimum possible number of clocks and it is timed bisimilar to the original automaton $A$.
Theorem 2.
There exists an algorithm to construct a TA A 4 that is timed bisimilar to a given TA A such that among all the timed automata that are timed bisimilar to A, A 4 has the minimum number of clocks. Further the algorithm has a time complexity which is doubly exponential in the number of clocks of A.
Conclusion
Since model checking of timed automata uses a region graph or a zone graph whose size increases exponentially with the number of clocks, it is desirable to consider the problem of finding an equivalent automaton with a smaller number of clocks.
While the problem of checking whether there exists a timed automaton accepting the same timed language but with a smaller number of clocks is known to be undecidable [9], in this paper we have described an algorithm which, given a timed automaton $A$, produces another timed automaton $A_4$ with the smallest number of clocks that is timed bisimilar to $A$. It also follows trivially that $A_4$ accepts the same timed language as $A$. If we find such an $A_4$ with fewer clocks than $A$, this also implies the existence of a timed automaton with fewer clocks accepting the same timed language.
For reducing the number of clocks of the timed automaton, we rely on a semantic representation of the timed automaton rather than its syntactic form as in [6]. This helps us to reason about the behaviour of the timed automaton more effectively. Besides, the zone graph we use in our approach is usually much smaller in size than the region graph and its size is independent of the constants used in the timed automaton. Also, our method excludes the intermediate step of constructing a characteristic formula as done in [14]. There is an exponential increase in the number of locations while producing the TA with the minimal number of clocks. However, this does not increase the underlying state space of the timed automaton, since the splitting of a location $l$ involves distributing the zones of $l$ across the locations into which $l$ is split.
A Proofs of Lemmas

A.1 Proof of Lemma 1
Proof. We show the proof for two clocks; the same argument holds for an arbitrary number of clocks. Consider a $|C|$-dimensional zone $Z$. Since $|C| = 2$, a zone $Z$ that is bounded above is of the form
$$k_{x1} \prec x \prec k_{x2} \;\wedge\; k_{y1} \prec y \prec k_{y2} \;\wedge\; k_{xy1} \prec x - y \prec k_{xy2},$$
where $x$ and $y$ are the two clocks and $\prec \in \{<, \le\}$. Let $Z$ be bounded above by two hyperplanes $x = k_{x2}$ and $y = k_{y2}$. As can be seen from Figure 7, there are two zones $Z_1$ and $Z_2$ such that $Z_1$ is defined by the inequations $k_{xy1} \prec x - y$, $k_{x2} \prec x$ and $y \prec k_{y2}$, while $Z_2$ is defined by $x - y \prec k_{xy2}$, $x \prec k_{x2}$ and $k_{y2} \prec y$. Making $Z$ pre-stable divides it into two parts, $Z'$ and $Z''$: $Z'$ is bounded fully from above by the hyperplane $x = k_{x2}$ while $Z''$ is bounded fully from above by the hyperplane $y = k_{y2}$. Note that the inequations defining the zones $Z'$ and $Z''$ may vary depending on the relation among the various constants $k_{xy1}$, $k_{xy2}$, $k_{x1}$, $k_{x2}$, $k_{y1}$ and $k_{y2}$. In all cases, pre-stability ensures that there exists a single hyperplane that fully bounds a zone from above.
Fig. 7. For every zone, there exists a hyperplane that bounds it fully from above in a pre-stable zone graph.
A.2 Proof of Lemma 2
Proof. During the construction of the zone graph, an edge is entirely removed if the corresponding transition is never enabled. Removing those edges corresponding to which no transition takes place does not affect the behaviour of the TA. Hence after stage 1, the resultant TA remains timed bisimilar to the original TA.
A.3 Proof of Lemma 4
Proof. In the second stage, a location l is split such that corresponding to every base zone of l, new locations l 1 , . . . , l n are created. Consider valuations v 1 , . . . , v n in l 1 , . . . , l n respectively. Since the zones of l are distributed over locations l 1 , . . . , l n , before the split, all the valuations v 1 , . . . , v n are reachable in l. For all v i , 1 ≤ i ≤ n, (l, v i ) (before l is split) is timed bisimilar to (l i , v i ) after the split. Hence the timed automaton after stage 2 remains timed bisimilar to the original TA A.
A.4 Proof of Lemma 6
Proof. In the third stage, a zone of some location $l$ may be merged with another zone if their union is convex, and this process may be repeated a finite number of times. This operation is performed only if the resultant zone graph is still timed bisimilar to the original zone graph, i.e. the initial states of the two zone graphs are timed bisimilar. The changes in the zone graph are reflected in the timed automaton in the following way. If from a location $l_i$ there are edges of the form $l_i \xrightarrow{a, g_1, R_1} l_{i1}, \ldots, l_i \xrightarrow{a, g_n, R_n} l_{in}$ in the timed automaton $A$ such that they are labelled with the same action $a$, then some of the constraints may be replaced with a constraint equivalent to $g_i \cup g_j \cup \ldots$ if the resultant timed automaton is timed bisimilar to the original timed automaton. Hence the operations in this stage create a timed bisimilar TA.
A.5 Proof of Lemma 7
Proof. In stage 4, clocks are renamed. Also, in every zone of each location, the active clocks are identified and partitioned such that all clocks belonging to a class in the partition are represented by a single clock, say $c$. If there is a clock $x$ in the class such that $x - c = m$, where $m$ is an integer, then a constraint of the form $x \sim k$ is replaced with $c \sim k - m$. The clock renaming and the replacement of the integer constants in this way do not change any bisimulation property of a timed state. Hence the operations in stage 4 preserve the timed bisimulation property of the original TA $A$.
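As a small worked instance of this replacement, with values chosen here purely for illustration (they do not come from the paper), suppose $x - c = 2$ holds throughout a class and a guard tests $x \le 5$:

```latex
x - c = 2
\;\Longrightarrow\;
\bigl( x \le 5 \iff c + 2 \le 5 \iff c \le 3 \bigr)
```

The guard $x \le 5$ can therefore be written as $c \le 3$, which is the rule above with $k = 5$ and $m = 2$. Both guards are satisfied by exactly the same reachable states.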
A.6 Proof of Lemma 8
Proof. Consider an arbitrary graph $G$. We show that there exists a TA whose clock graph is the same as $G$. Thus coloring the vertices of $G$ is reduced to assigning a minimum number of clocks to the clock graph.
The reduction is the following. Let the vertices of $G$ be $x_1, x_2, \ldots, x_n$. Corresponding to every edge $e$ between $x_i$ and $x_j$, create a location $l$ of the timed automaton whose active clocks are $x_i$ and $x_j$. For the reduction to be correct, we also need to show that $x_i$ cannot be replaced with $x_j$ in location $l$. Let the locations of the timed automaton be numbered $l_1, l_2, \ldots, l_m$, where $m$ is the number of edges in $G$. Note that every location $l_1, \ldots, l_m$ has exactly two active clocks. We create three additional locations $l_0$, $l_{m+1}$ and $l_{m+2}$ and draw edges from $l_0$ to $l_1$, from $l_m$ to $l_{m+1}$ and from $l_{m+1}$ to $l_{m+2}$. If the active clocks of $l_1$ are $x_i$ and $x_j$, then we draw an edge from $l_0$ to $l_1$ with the constraint $x_i \le 1$ and reset $x_j$. The other edges in the timed automaton are drawn in the following manner.
An edge is drawn between two locations $l_u$ and $l_v$ if $u < v$ and $act(l_u) \cap act(l_v) \neq \emptyset$. Note that there can be at most one active clock common to two distinct locations $l_u$ and $l_v$. Also note that since we add edges from $l_u$ to $l_v$ only when $u < v \le m$, the timed automaton thus produced does not have any cycle. Now we describe the constraints and the resets on the edges of the timed automaton. Consider two locations $l_u$ and $l_v$ connected by an edge from $l_u$ to $l_v$, and let the active clocks of $l_u$ be $x_i$ and $x_j$ while the active clocks of $l_v$ are $x_j$ and $x_t$. On this edge we add a constraint $x_i \le k$, where $k \in \mathbb{N}$, and reset $x_t$. The value of $k$ is chosen in the following way. Suppose clock $x_i$ was reset on an incoming edge of $l_w$ such that there is a path from $l_w$ to $l_u$ without any reset of $x_i$. We define the weight of clock $x_i$ on the path from $l_w$ to $l_u$, denoted $wt(x_i)_{w,u}$, to be the sum of the integers used in the constraints on the path from $l_w$ to $l_u$. We then assign to $k$ the value $1 + \max_w wt(x_i)_{w,u}$, i.e. one more than the maximum weight of clock $x_i$ at location $l_u$ computed over all incoming paths. This value of $k$ ensures that the locations $l_1, \ldots, l_m$ are not split any further.
If the active clocks at $l_m$ are $x_r$ and $x_s$, then we add an edge from $l_m$ to $l_{m+1}$ with the constraint $x_r \le k_r$ (wlog) and an edge from $l_{m+1}$ to $l_{m+2}$ with the constraint $x_s \le k_s$. The values $k_r$ and $k_s$ are calculated in the same way as $k$ above.
Clearly, the timed automaton thus constructed has a clock graph that is the same as $G$. Figure 8(a) shows a graph $G$ and Figure 8(b) shows the TA constructed following the procedure described above. Note that there are 6 edges in $G$, corresponding to which there are 6 locations in the TA, namely $l_1$ to $l_6$. There are three additional locations $l_0$, $l_7$ and $l_8$. For every location among $l_1$ to $l_6$, its active clocks are written in blue inside parentheses. For example, the active clocks of $l_1$ are $x_1$ and $x_2$, denoting that the edge in $G$ corresponding to location $l_1$ connects vertices $x_1$ and $x_2$ in $G$.
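The structural part of this reduction can be sketched as follows. The sketch builds only the locations, their active clocks and the edge skeleton; guards, resets and the constants $k$ are omitted. The function names, the integer encoding of locations, and the example graph (a triangle, not the graph of Figure 8) are assumptions made for illustration, and $G$ is assumed to be a simple graph without self-loops.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def build_reduction(
    graph_edges: List[Tuple[str, str]],
) -> Tuple[Dict[int, FrozenSet[str]], List[Tuple[int, int]]]:
    """One location l_u per edge of G (u = 1..m), plus l_0, l_{m+1}, l_{m+2}."""
    m = len(graph_edges)
    # act[u] holds the two active clocks of location l_u.
    act = {u: frozenset(e) for u, e in enumerate(graph_edges, start=1)}
    ta_edges: List[Tuple[int, int]] = [(0, 1)] if m else []
    # l_u -> l_v whenever u < v and the two locations share an active clock.
    ta_edges += [
        (u, v) for u, v in combinations(range(1, m + 1), 2) if act[u] & act[v]
    ]
    if m:
        ta_edges += [(m, m + 1), (m + 1, m + 2)]
    return act, ta_edges


def clock_graph(act: Dict[int, FrozenSet[str]]) -> Set[FrozenSet[str]]:
    """Two clocks are adjacent if they are active together in some location."""
    return {
        frozenset(pair)
        for clocks in act.values()
        for pair in combinations(sorted(clocks), 2)
    }


# Hypothetical input graph G: a triangle on x1, x2, x3.
g_edges = [("x1", "x2"), ("x2", "x3"), ("x1", "x3")]
act_map, ta_edges = build_reduction(g_edges)
recovered = clock_graph(act_map)
```

Because every edge of the skeleton goes from a lower-numbered to a higher-numbered location, the construction is acyclic, and `recovered` contains exactly the edges of the input graph.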
A.7 Proof of Lemma 9
Proof. We prove this lemma by induction on the structure of the timed automaton. We prove that the statement of the lemma holds for a location l if it holds for all the locations reachable from l through a single transition. The proof proceeds by considering the exhaustive set of cases shown in Figure 9 . In the rest of the proof, wlog, we consider the label of an edge as the action a whenever not specified otherwise.
Let us suppose that we have edges in the timed automaton A 4 from location l to locations l 1 , . . . , l m . By induction hypothesis, the lemma holds for l 1 , . . . , l m .
Consider a clock $c_0 \in act(l)$. Also note that a clock $c_0$ will be replaced by some clock in $act(l_i)$ to ensure minimality while coloring the clock graph. We show that if $c_0 \in act(l)$, then there is a constraint involving $c_0$ that cannot be removed while preserving timed bisimilarity. If there exists some location $l_i$, $1 \le i \le m$, such that $c_0$ has not been reset on the edge from $l$ to $l_i$, then by the induction hypothesis, $c_0 \in act(l_i)$ and thus the lemma holds at $l$ too.
Consider two edges $l \xrightarrow{g_i, a, R_i} l_i$ and $l \xrightarrow{g_j, a, R_j} l_j$ such that, with reference to Algorithm 2, $g_{curr}$ is the same as $g_i$ and $g_{next}$ is the same as $g_j$, while $g'_{curr}$ is the same as $g$ appearing below. Let the hyperplane bounding $End(g_i)_{(l,a)}$ be $c_0 = k$, induced by the elementary constraint $c_0 \sim k$ of $UB(End(g_i)_{(l,a)})$. If $Ran(g_i)_{(l,a)} \cup Ran(g_j)_{(l,a)}$ is not convex, then the constraints $g_i$ and $g_j$ cannot be merged so that the resulting TA remains timed bisimilar.
For the following cases, we consider $Ran(g_i)_{(l,a)} \cup Ran(g_j)_{(l,a)}$ to be convex, with $Ran(g_i)_{(l,a)}$ and $Ran(g_j)_{(l,a)}$ not overlapping. We show that if $g_i$ and $g_j$ could not be merged to produce a constraint $g$ as shown in Line 9 of Algorithm 2, then there exists a clock constraint involving $c_0$ that cannot be removed while preserving timed bisimilarity. The case where $Ran(g_i)_{(l,a)}$ and $Ran(g_j)_{(l,a)}$ overlap can be reasoned about similarly.
Suppose $c_0$ is reset on all of the edges from $l$ to $l_i$, $1 \le i \le m$, where it appears as part of some constraint. Since the resets of $c_0$ exist on the edges from $l$ to $l_i$ and from $l$ to $l_j$, we have $c_0 \in act(l_i)$ and $c_0 \in act(l_j)$; otherwise, the resets on these edges would have been removed in stage 4. Since $c_0$ has been reset on the edges to both $l_i$ and $l_j$, the clock constraints involving $c_0$ from $l_i$ and from $l_j$ have to be the same for replacing $g_i$ and $g_j$ with $g$. Since it was not replaced in the TA $A_4$, it implies that the constraint $c_0 \sim k$ cannot be removed while preserving timed bisimilarity.
[Figure 9: tree of cases. Node labels: $g_i \cup g_j$ is not convex; constraints having $c_0$ on edges from $l_i$ and $l_j$ are not the same; constraints having $c_0$ on edges from $l_i$ and $l_j$ are the same; $c_1 \in act(l_j)$ and $c_1 \in act(l)$ but $c_1 \notin act(l_i)$, $c_1 \neq c_0$; $c_1 \in act(l_j)$, $c_1 \in act(l)$ and $c_1 \in act(l_i)$, $c_1 \neq c_0$; upper bounds of the constraints involving $c_1$ on edges from $l_i$ and $l_j$ are different; upper bounds of the constraints involving $c_1$ on edges from $l_i$ and $l_j$ are the same; $(l_j, v)$ is a state and $v(c_1)$ is less than the lower bound of a constraint involving $c_1$ on an edge reachable from $l_j$.]
The following cases include those where $c_0$ is reset on all of the edges from $l$ to $l_i$, $1 \le i \le m$, and the constraints involving $c_0$ in the paths from $l_i$ and $l_j$, $l_i \neq l_j$, are the same for that part of the path where there is no intervening reset of $c_0$. Now, wlog, we consider a clock $c_1 \in act(l_j)$, different from $c_0$, such that $c_1$ has not been reset on the edge from $l$ to $l_j$. This implies that $c_1 \in act(l)$ as well. If $c_1 \notin act(l_i)$ and $g_i$ is replaced with $g$, then the behaviour of the states in zone $Z_{i[R_i \leftarrow 0]}$ of $l_i$ will depend on an additional constraint involving clock $c_1$, which was not the case in the original automaton. Thus $g_i$ and $g_j$ cannot be replaced with $g$, which implies that the constraint $c_0 \sim k$ cannot be removed while preserving timed bisimilarity.
Now if $c_1 \in act(l_i)$ and $c_1 \in act(l_j)$, then the upper bounds of the constraints involving clock $c_1$ on the edges from both $l_i$ and $l_j$ have to be the same for replacing $g_i$ and $g_j$ with $g$. The upper bound is $k$ if the constraint is of the form $c_1 < k$ or $c_1 \le k$, $k \in \mathbb{N}$. Otherwise, the states in the zone $Z_{i[R_i \leftarrow 0]}$ will behave differently if $g_i$ is replaced with $g$, in which case the constraint $c_0 \sim k$ cannot be removed while preserving timed bisimilarity.
Now consider the case where, in the constraints involving $c_1$ on the edges reachable from $l_i$ and $l_j$ without any intervening reset of $c_1$, the upper bounds of $c_1$ are the same but the lower bounds of $c_1$ are not the same. Let the lower bound of $c_1$ in the constraint on the edge reachable from $l_j$ be $b_{low}$, i.e. the constraint is of the form $c_1 > b_{low}$ or $c_1 \ge b_{low}$, $b_{low} \in \mathbb{N}$. We note that if every reachable state $(l_j, v_j)$ of location $l_j$ is such that $v_j(c_1) > b_{low}$, then the constraint is eliminated in stage 2, since all reachable valuations in $l_j$ satisfy the constraint implicitly.
If there exists a state $(l_j, v_j)$ that is reachable from some state $(l, v)$ such that $b_{low} > v_j(c_1)$, then we argue that $g_i$ and $g_j$ cannot be merged. Consider a state $(l_i, \hat{v}_i)$ reachable from $(l, \hat{v})$ such that $\hat{v}_i(c_1) < b_{low}$. Such a state $(l_i, \hat{v}_i)$ exists since zone $Z_j$ of $l$ is an immediate delay successor of zone $Z_i$. We have $\hat{v}_i = \hat{v}_{[R_i \leftarrow 0]}$. If $g_i$ and $g_j$ are merged, then both $g_i$ and $g_j$ are replaced with $g$ and we have a state $(l_j, \hat{v}_j)$ reachable from $(l, \hat{v})$ such that $\hat{v}_j = \hat{v}_{[R_j \leftarrow 0]}$. Thus with the merging of $g_i$ and $g_j$, we have a transition $(l, \hat{v}) \xrightarrow{a} (l_j, \hat{v}_j)$ such that the behaviour of $(l_j, \hat{v}_j)$ is affected by a constraint of the form $c_1 \sim b_{low}$, which was not the case in the original timed automaton. Since the merging of $g_i$ and $g_j$ does not preserve timed bisimulation, we cannot merge them. This further implies that the constraint $c_0 \sim k$ cannot be removed while preserving timed bisimilarity.
Now we consider the case where the lower bounds of $c_1$ are not the same and the lower bound in the constraint on the edge reachable from $l_i$ is greater than $v(c_1)$, where $v$ is some valuation in the zone $Z_{l_j[R_j \leftarrow 0]}$ of location $l_j$. Arguing analogously to the previous case, we can show that $g_i$ and $g_j$ cannot be merged, i.e. $g_i$ and $g_j$ cannot be replaced with $g$ while preserving timed bisimilarity. This further implies that the constraint $c_0 \sim k$ cannot be removed while preserving timed bisimilarity.
Otherwise, if the lower bounds in the constraints involving $c_1$ on the edges reachable from $l_i$ and $l_j$ are not the same, and both lower bounds are less than or equal to $v(c_1)$ for every valuation $v$ reachable in $l_j$, then the pre-stabilization operation replaces the constraints involving $c_0$ on the edges from $l$ to $l_i$ and from $l$ to $l_j$ with constraints involving $c_1$ and other active clocks in $l$, as shown in Figure 10. In the figure, $x$ represents $c_0$ while $y$ represents $c_1$. Thus this particular case cannot occur in the automaton $A_4$. Also, $l_i$ and $l_j$ as described in the previous cases can be considered to be $l_2$ and $l_3$ in the TA on the left. Note that the value of clock $y$ in every state reachable from $l_3$ is greater than 3, while the lower bounds in the constraints on $y$ reachable from $l_2$ and $l_3$ are 3 and 0 respectively.
Hence the lemma holds for location $l$.
Towards defining the clock mapping from $D$ to $A_4$, we first define a corner point trace (cp-trace) in the zone graph.
Definition 3.
A corner point of a zone in the zone graph is a state where corresponding to each clock x, the hyperplanes of the form x = c that define the zone intersect each other. Considering M to be the largest constant appearing in the constraints or the location invariants of the timed automaton, each coordinate of the corner point is of the form n, n + δ or n − δ where n ∈ {0, 1, . . . , M } and δ is any infinitesimally small value. In a zone Z, there can be two kinds of corner points, an entry corner point and an exit corner point. For an entry corner point, there is a unique exit corner point, that can be reached from the entry corner point by performing a delay d such that any delay more than d will lead to a state that is in the immediate delay successor zone of Z.
For a zone, it is possible to have a pair of such entry and exit corner points that are not distinct. For example, in Figure 1, in zone $Z_4$, the corner point $(x = 5 + \delta, y = 7)$ is both an entry and an exit corner point, while the corner point $(x = 5 + \delta, y = 5 + \delta)$ is another entry corner point for the same zone $Z_4$ and the corresponding exit corner point is $(x = 7, y = 7)$. If the entry and the exit corner points are distinct, then a delay that is neither zero nor infinitesimally small can be performed from the entry corner point to reach the corresponding exit corner point, whereas an infinitesimally small delay from the exit corner point causes it to evolve into an entry corner point of the immediate delay successor zone. A zone that is not bounded above does not have any exit corner point. For an entry corner point, the corresponding exit corner point is the next delay corner point, whereas for an exit corner point, the entry corner point in the immediate delay successor zone is the next delay corner point.
We consider a minimal TA $D_1$ timed bisimilar to the given TA $A$ and apply the operations of the four stages to it to produce a TA $D$. The application of these stages to the minimal TA does not add to the number of clocks, and since $D_1$ is already minimal, $D$ has the same number of clocks as $D_1$. The transformation from $D_1$ to $D$ ensures that in the zone graph of $D$, for each zone in every location, there exists a hyperplane that fully bounds it. Similarly, in the zone graph of $A_4$ too, for each zone of every location, there exists a hyperplane that fully bounds it.
We use the zone graphs of $A_4$ and $D$ for mapping the clocks of $A_4$ to the clocks of $D$. From Lemma 9, for a location $l_{A_4}$ of $A_4$, corresponding to every clock $x \in act(l_{A_4})$, there is a hyperplane of the form $x = k$ corresponding to a constraint $x \sim k$ which cannot be removed if the modified TA is to remain timed bisimilar. Since $A_4$ and $D$ are timed bisimilar, there is a corresponding hyperplane, say $y = k'$, induced by a constraint $y \sim k'$ which also cannot be removed from $D$ while preserving timed bisimulation.
The cp-traces we consider below are the ones that start from the initial state of the zone graph and whose delays are such that the trace moves from an entry corner point to the exit corner point of the same zone, and from an exit corner point to an entry corner point of the immediate delay successor zone. We consider multiple finite cp-traces in the zone graph such that every corner point in the zone graph is traversed by some cp-trace at least once. Now we consider a part of a cp-trace in the zone graph of $A_4$ from the initial state to an exit corner point $(l_{A_4}, v)$ of zone $Z_{A_4}$. Suppose the hyperplane $x = k$ bounds the zone $Z_{A_4}$ and the constraint $x \sim k$ inducing the hyperplane cannot be removed while preserving timed bisimulation. Since $A_4$ and $D$ are timed bisimilar, there exists a state $(l_D, v')$ in the zone graph of $D$ that is timed bisimilar to $(l_{A_4}, v)$, and $(l_D, v')$ is an exit corner point of some zone $Z_D$. Continuing this way, we consider all the different exit corner points in $Z_{A_4}$ and the corresponding bisimilar corner points in $Z_D$ to find the clock involved in the hyperplane bounding the zone $Z_{A_4}$, namely $x$, and the clock in the corresponding hyperplane bounding $Z_D$, say $y$. We can thus replace clock $x$ in $A_4$ with clock $y$ of $D$. This is repeated until all the occurrences of all the clocks in $A_4$ are replaced with clocks in $D$.
As mentioned above, here x ∈ act(l A4 ) and y ∈ act(l D ). Since an active clock of a location cannot be replaced with another active clock of the same location, every active clock in every location in A 4 , can be mapped uniquely to a clock in D which can replace the clock in 
