Abstract
We consider one of the weakest variants of cost register automata over a tropical semiring, namely copyless cost register automata over N with updates using min and increments. We show that this model can simulate, in some sense, the runs of counter machines with zero-tests. We deduce that a number of problems pertaining to that model are undecidable, in particular equivalence, disproving a conjecture of Alur et al. from 2012. To emphasize how weak these machines are, we also show that they can be expressed as a restricted form of linearly-ambiguous weighted automata.
Introduction
Cost register automata (CRA) [2] encompass a wealth of computation models for functions from words to values (herein, integers). In their full generality, a CRA is simply a DFA equipped with registers that are updated upon taking transitions. The updates are expressions built using a prescribed set of operations (e.g., +, ×, min, . . .), constants, and the registers themselves.
In this work, we will focus on CRA computing integer values, where the updates may only use "+c", for any constant c, and min. For instance:
[Figure: example CRA over {a, #}; on reading a, it applies r1 ← r1 + 1 and r2 ← r2.]
With r1 initialized to 0 and r2 to ∞, this CRA computes the length of the minimal nonempty block of a's between two #'s. This model has the same expressive power as weighted automata (WA) over the structure (Z, min, +), but the use of registers can simplify the design of functions.
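To make the example concrete, here is a minimal Python sketch of a CRA computing the same function. The three control states (`start`, `after_hash`, `block`) and the updates on # are assumptions made for this illustration, chosen so that every update reads each register at most once; only the update on a and the initial values are taken from the text.

```python
INF = float("inf")

def min_block(word):
    """Length of the shortest nonempty block of a's enclosed by two #'s."""
    state = "start"      # assumed control state: no '#' read yet
    r1, r2 = 0, INF      # initial register values, as in the text
    for letter in word:
        if letter == "a":
            if state != "start":
                r1 = r1 + 1          # r1 <- r1 + 1, r2 <- r2
                state = "block"
        elif letter == "#":
            if state == "block":
                # Assumed update: r2 <- min{r1, r2}, r1 <- 0.
                # Each register is read once, so no value is copied.
                r1, r2 = 0, min(r1, r2)
            else:
                r1 = 0
            state = "after_hash"
        else:
            raise ValueError("letters must be 'a' or '#'")
    return r2
```

For instance, `min_block("#aa#a#aaa#")` inspects the blocks of lengths 2, 1, and 3, while a word with no block enclosed by two #'s keeps the initial value ∞.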
The example above enjoys an extra property that can be used to restrain the model (since a lot of interesting problems are undecidable on WA [1] ). Indeed, no register is used twice in any update function; this property is called copylessness. This syntactic restriction, introduced by Alur et al. [2] and studied by Mazowiecki and Riveros [6] , provably weakens the model. It was the hope of Alur et al. that this would provide a model for which equivalence is decidable.
Semilinearity and decidability of equivalence. Recall that a set R ⊆ Z^k is semilinear if it is expressible in first-order logic with addition: FO[<, +]. This latter logic being decidable [8] , semilinearity is a useful tool to show decidability results. For instance, let f, g : A* → Z be expressible in some model for which the images of functions are effectively semilinear. Suppose further that the function h : w ↦ min{2 × f (w), 2 × g(w) + 1} is also in that model. Since the image h(A*) is effectively semilinear, one can check whether it contains only even values: this is the case iff f (w) ≤ g(w) for all w. A first natural question is thus, is copyless CRA (CCRA) such a model?
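The parity argument can be checked numerically. The sketch below works on integer values rather than on word functions; the names `h` and `parity_encodes_order` are hypothetical helpers introduced for this illustration.

```python
def h(f_val, g_val):
    """min{2f, 2g + 1}: even exactly when the first argument wins."""
    return min(2 * f_val, 2 * g_val + 1)

def parity_encodes_order(bound=20):
    """Brute-force check that h(f, g) is even iff f <= g on a finite range."""
    values = range(-bound, bound + 1)
    return all((h(f, g) % 2 == 0) == (f <= g) for f in values for g in values)
```

The check succeeds because 2f < 2g + 1 holds exactly when f ≤ g over the integers, so the minimum is the even candidate in precisely that case.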
Iterating min breaks semilinearity. Deterministic automata equipped with copyless registers with only "+c" updates are quite well-behaved [3, Section 6] ; in particular, the set R = {r | r are the values of the registers at the end of an accepting run} is semilinear. Naturally, min{x, y} is expressible in FO[<, +], hence FO[<, +] = FO[<, +, min] (even, and this is not immediate, when the extra value ∞ is added [4] ). This entails that if we were to give to these automata the ability to do a constant number of min, we would still have that R is semilinear. In this paper, it is shown that if the number of min is unbounded along runs, then the set is not semilinear (see the proof of Theorem 3 for a simple construction), and that it is undecidable to check whether R is semilinear.
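As a worked instance of the claim that min is expressible in FO[<, +], one possible defining formula (a sketch over Z, ignoring the extra value ∞) is:

```latex
\[
  z = \min\{x, y\}
  \iff
  \bigl(z = x \wedge \neg (y < x)\bigr) \vee \bigl(z = y \wedge y < x\bigr)
\]
```

Any bounded nesting of min can be eliminated by substituting such formulas, which is why a constant number of min operations preserves semilinearity.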
Contributions. Beyond considerations on semilinearity, we show that CCRA over N can simulate the runs of counter machines with zero-tests (Theorem 1). Intuitively, the only words mapped by the CCRA to an even value are the correct executions of the counter machine. This construction is then used to show that equivalence is undecidable for CCRA over N and that upper-boundedness is undecidable for WA. To better gauge the expressiveness of CCRA, we show that they are a weak form of linearly-ambiguous WA, that is, WA for which no word w has more than k × |w| accepting runs, for some constant k (see the drawing preceding Proposition 2). Since the problems we tackle are decidable for finitely-ambiguous WA, CCRA are arguably the simplest generalization of deterministic WA for which equivalence is undecidable.
Preliminaries
We assume familiarity with automata theory, for which we settle some notations. We write N for {0, 1, 2, . . .}, Z for the integers, and define N ∞ = N ∪ {∞} and Z ∞ = Z ∪ {∞}. Naturally, min{. . . , ∞, . . .} stays the same when removing the ∞ value, and we set min ∅ = ∞. For any k ≥ 1, we write [k] for {1, 2, . . . , k}. We write ε for the empty word.
Automata. An automaton (NFA) is a tuple (Q, A, δ, q 0 , F ), where Q is the set of states, A the alphabet, δ ⊆ Q × (A ∪ {ε}) × Q the transition relation, q 0 the initial state, and F ⊆ Q the set of final states. We rely on the usual vocabulary pertaining to automata: a run is a word in δ * starting in q 0 , and such that each transition is consistent with the next; it is accepting if the last reached state is in F ; a word w ∈ A * is accepted if there is an accepting run labeled by w. If δ is a function from Q × A to Q, the automaton is deterministic (DFA). If there is a k ∈ N such that each accepted word w is the label of at most k × |w| accepting runs, the automaton is linearly-ambiguous.
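The notion of linear ambiguity can be explored by counting accepting runs. The following sketch assumes an NFA without ε-transitions given as a set of triples; the function name and the encoding are choices made for this illustration.

```python
from collections import defaultdict

def count_accepting_runs(delta, q0, finals, word):
    """Number of accepting runs over `word`; delta is a set of (p, a, q) triples."""
    runs = defaultdict(int)   # runs[q] = number of runs reaching q so far
    runs[q0] = 1
    for letter in word:
        nxt = defaultdict(int)
        for (p, a, q) in delta:
            if a == letter and runs[p]:
                nxt[q] += runs[p]
        runs = nxt
    return sum(n for q, n in runs.items() if q in finals)

# Toy NFA over {a}: state 0 loops, jumps once to state 1, which loops.
TOY_DELTA = {(0, "a", 0), (0, "a", 1), (1, "a", 1)}
```

On the toy automaton, a word of n letters has one accepting run per position of the jump, so the bound k × |w| holds with k = 1.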
Tropicalities. The only semirings (i.e., algebraic structures) that we will use are (Z ∞ , min, +) and (N ∞ , min, +), often dubbed "tropical semirings." When the discussion is not specific to one of the two semirings, we simply write K for both. As with rings, matrix multiplication is well-defined in semirings; e.g., if (b ij ) and (c ij ) are 2 × 2 matrices and (a ij ) = (b ij ) · (c ij ), then:
\[ a_{ij} = \min\{\, b_{i1} + c_{1j},\; b_{i2} + c_{2j} \,\}. \]
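A short sketch of matrix multiplication in the tropical semiring, where the semiring sum is min and the semiring product is +. Matrices are represented as lists of rows and ∞ as a float infinity; both are representation choices for this example.

```python
INF = float("inf")

def minplus_matmul(B, C):
    """Product of B (n x m) and C (m x p) in the (min, +) semiring."""
    n, m, p = len(B), len(C), len(C[0])
    return [
        [min(B[i][l] + C[l][j] for l in range(m)) for j in range(p)]
        for i in range(n)
    ]
```

Note that ∞ behaves as the semiring zero (absorbing for +, neutral for min), and 0 as the semiring one.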
Weighted automata. Weighted automata will only be used in Section 3 and Theorem 4. A weighted automaton W over K (K-WA) is a tuple (A, λ, µ, ν) where A = (Q, A, δ, q 0 , F ) is an NFA, and λ ∈ K, µ : δ → K, and ν : F → K. Given a run t 1 · t 2 · · · t n ∈ δ * ending in a state q ∈ F in A, its weight is λ + µ(t 1 ) + µ(t 2 ) + · · · + µ(t n ) + ν(q). The weight W(w) of a word w ∈ A * is the minimum weight for all accepting runs over w in the NFA (hence it is ∞ if the word is not accepted). The K-WA is deterministic (resp. linearly-ambiguous) if A is. We use K-DetWA and K-LinWA for these restrictions.
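The weight of a word can be computed without enumerating runs, by keeping the best weight per reachable state. The sketch below assumes no ε-transitions; the dictionary encoding and the toy automaton (which computes the minimum of the numbers of a's and b's) are illustrative assumptions.

```python
INF = float("inf")

def wa_value(lam, mu, nu, q0, word):
    """Minimum weight of an accepting run over `word` (INF if none).

    mu maps transitions (p, a, q) to weights; nu maps final states to weights.
    """
    best = {q0: lam}          # best[q] = least weight of a run reaching q
    for letter in word:
        nxt = {}
        for (p, a, q), weight in mu.items():
            if a == letter and p in best:
                cand = best[p] + weight
                if cand < nxt.get(q, INF):
                    nxt[q] = cand
        best = nxt
    return min((best[q] + nu[q] for q in best if q in nu), default=INF)

# Toy WA: branch "ca" pays 1 per a, branch "cb" pays 1 per b.
TOY_MU = {
    ("s", "a", "ca"): 1, ("s", "a", "cb"): 0,
    ("s", "b", "ca"): 0, ("s", "b", "cb"): 1,
    ("ca", "a", "ca"): 1, ("ca", "b", "ca"): 0,
    ("cb", "a", "cb"): 0, ("cb", "b", "cb"): 1,
}
TOY_NU = {"s": 0, "ca": 0, "cb": 0}
```

Keeping only the minimum per state is sound because + distributes over min in the tropical semiring.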
Registers and counters. A central goal of this work is to present a simulation of some counter machine with zero-tests by a register machine without zero-test but with more complicated update functions. To avoid confusion, we will stick to that vocabulary, and use c i for counters and r i for registers.
Cost register automata. In this work, we only consider cost register automata over K ∈ {Z ∞ , N ∞ } where the registers are updated using expressions that use min and "+c" for c ∈ K. A precise, formal definition of the model will only be needed for Proposition 2; to present the main constructions, we will simply rely on the following more intuitive definition.
A K-CRA C of dimension k is a DFA equipped with k registers r 1 , r 2 , . . . , r k taking values in K. The initial values of the registers are specified by a vector in K^k, and each transition further induces a transformation of the form:
\[ r_j \leftarrow \min\{\, r_1 + m_{1,j},\; r_2 + m_{2,j},\; \ldots,\; r_k + m_{k,j},\; m_{k+1,j} \,\} \quad \text{for each } j \in [k], \]
where each m i,j is in K (hence it can be ∞, making the subexpression irrelevant). Each final state is paired with an output function of the shape:
\[ \min\{\, r_1 + m_1,\; r_2 + m_2,\; \ldots,\; r_k + m_k,\; m_{k+1} \,\}, \]
where again the m i 's are in K. Given a word w ∈ A * , the value of C on w, written C(w), is ∞ if w is not accepted by the underlying DFA, and otherwise computed in the obvious way: the registers are initialized, then updated along the (single) run in the DFA, and the output is determined by the output function at the final state.
The K-CRA is said to be copyless (K-CCRA) if all the update functions satisfy, using the notations above, that for all i ∈ [k], |{j | m i,j ≠ ∞}| ≤ 1; in words, for each i, at most one of the subexpressions "r i + m i,j " will evaluate to a non-∞ value: the value of r i impacts at most one register.
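The copylessness condition is easy to test mechanically. In this sketch, an update is encoded as a table `m` with k + 1 rows and k columns: `m[i][j]` is the constant added to register i when it contributes to the new value of register j, and the last row holds the constant terms. This 0-indexed encoding is an assumption of the example.

```python
INF = float("inf")

def is_copyless(m, k):
    """Each register (row i < k) feeds at most one target register."""
    return all(
        sum(1 for j in range(k) if m[i][j] != INF) <= 1
        for i in range(k)
    )

def apply_update(regs, m):
    """New value of register j: min over i of regs[i] + m[i][j], and m[k][j]."""
    k = len(regs)
    return [
        min(min(regs[i] + m[i][j] for i in range(k)), m[k][j])
        for j in range(k)
    ]

# r1 <- 0, r2 <- min{r1, r2}: copyless.
RESET_AND_MIN = [[INF, 0], [INF, 0], [0, INF]]
# r1 <- r1, r2 <- r1: r1 is used twice, hence not copyless.
COPYING = [[0, 0], [INF, INF], [INF, INF]]
```

Applying `RESET_AND_MIN` to the register values `[3, 5]` moves the smaller value into the second register and resets the first.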
Vector addition systems with states and zero-tests. The main construction of this paper focuses on simulating counters with zero-tests. The precise formalism for our counter machines is a variant of vector addition systems with states (VASS) over Z^k, equipped with transitions that can only be fired if a designated counter is zero. For any k, we define the update alphabet C k as:
\[ C_k = \{\, inc_i,\; dec_i,\; chk_i \mid i \in [k] \,\}, \]
the intended meaning being that inc i will increment the i-th counter, dec i will decrement it, and chk i will check that it is zero.
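A Z-VASS with zero-tests (Z-VASS z ) of dimension k is an automaton V = (Q, C k , δ, q 0 , F ). Its configurations are the pairs (q, c) ∈ Q × Z^k, and a transition (q, a, q′) ∈ δ leads from (q, c) to (q′, c′) where, writing (e i ) ∈ Z^k for the standard basis:
\[ c' = \begin{cases} c + e_i & \text{if } a = inc_i, \\ c - e_i & \text{if } a = dec_i, \\ c & \text{if } a = chk_i \text{ and } c_i = 0. \end{cases} \]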
We say that the Z-VASS z reaches a state q if (q 0 , 0) reaches, by a sequence of configurations, (q, c) for some c. We write L V,q ⊆ (C k )* for the reachability language of q, that is, the language of updates along the runs reaching q.
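The counter semantics can be sketched as follows. The code only replays a word of updates on k integer counters; it deliberately ignores the state structure, so it checks the zero-tests but not membership in a reachability language. The encoding of letters as `(op, i)` pairs is an assumption of this example.

```python
def run_counters(word, k):
    """Replay updates on k counters starting from 0.

    `word` is a list of (op, i) pairs with op in {"inc", "dec", "chk"} and
    1 <= i <= k. Returns the final counters, or None if a zero-test fails.
    """
    counters = [0] * k
    for op, i in word:
        if op == "inc":
            counters[i - 1] += 1
        elif op == "dec":
            counters[i - 1] -= 1      # counters range over Z: may go negative
        elif op == "chk":
            if counters[i - 1] != 0:
                return None           # the transition cannot be fired
        else:
            raise ValueError("unknown update: %r" % (op,))
    return counters
```

A word belongs to the reachability language of q when it both labels a path to q in the automaton and passes all zero-tests in this sense.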
Proposition 1. The following problem is undecidable:
Given: A Z-VASS z V and a state q
Question: Is L V,q empty?
The problem stays undecidable even if |L V,q | ≤ 1 is guaranteed.
Proof. We define an extension of Z-VASS z that can implement classical Minsky machines to streamline the reduction. Define C′ k as C k extended, for each i ∈ [k], with a nonzero-test letter, written \overline{chk} i here. A Minsky machine is an automaton over C′ k , with the Z-VASS z semantics, augmented with the property that a transition labeled \overline{chk} i can only be taken if the i-th counter is nonzero.
Minsky [7] showed that the emptiness of reachability languages is undecidable for these machines, even if it is assumed that there is at most one run reaching the given state. To show the same for Z-VASS z , we need only remove the transitions labeled \overline{chk} i , while preserving the reachability languages. To do so, it suffices to replace them with the following gadget, where j is a new counter and some states are omitted:
It is easily checked that upon reaching state q 2 , the i-th counter is restored to its value in q 1 , the j-th is 0, and the state can only be reached if the i-th counter were strictly positive.
CCRA and weighted automata
With the plethora of models computing functions from words to values in modern literature, it is imperative to justify studying the seemingly artificial CCRA. In this section, we provide a normal form that will demonstrate that these machines are but deterministic weighted automata with a small dose of nondeterminism.
In particular, all the problems we show to be undecidable in Section 6 turn out to be decidable for deterministic (or even finitely-ambiguous) weighted automata; this gives credence to the assertion that N ∞ -CCRA is one of the weakest models for which equivalence, for instance, is undecidable. In the following proposition, it is shown that any K-CCRA can be expressed as a DFA making nondeterministic jumps into a K-DetWA; graphically, every K-CCRA is equivalent to:
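Proposition 2. Let C be a K-CCRA. There are a DFA A with state set Q and initial state q 0 , a K-DetWA W with state set Q′, and a function η : Q → P(Q′) such that, for every word w ∈ A*:
\[ C(w) = \min\{\, W_q(v) \mid w = uv \text{ and } q \in \eta(q_0.u) \,\}, \]
where W q is W with the initial state set to q, and q 0 .u is the state reached by reading u in A.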
Proof. We first sketch the proof idea. Consider a nondeterministic variant of a given K-CCRA C where updates of the form r 1 ← min{r 2 , r 3 } become nondeterministic jumps between the updates r 1 ← r 2 and r 1 ← r 3 . The final value of this variant is set to be the minimum output of any run. Then this variant has the same output value as the original CRA, by distributivity of addition over min. We implement that strategy using a DFA A which, on resets (r 1 ← 0), starts a new run within a DetWA that follows the increments (r 1 ← r 1 + m) and movements (r i ← r 1 ) of the register.
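We now formalize the definition of CRA. A K-CCRA C of dimension k is a tuple (A′, λ, µ, ν) where A′ = (Q, A, δ, q 0 , F ) is a DFA, λ ∈ K^{1×k} is the initial value of the k registers, ν : F → K^{(k+1)×1} gives the output function for each final state, and µ : Q × A → K^{(k+1)×(k+1)} provides the update functions. To compare with the informal definition given in the Preliminaries, using the notation therein, µ(q, a) is:
\[ \mu(q, a) = \begin{pmatrix} m_{1,1} & \cdots & m_{1,k} & \infty \\ \vdots & \ddots & \vdots & \vdots \\ m_{k,1} & \cdots & m_{k,k} & \infty \\ m_{k+1,1} & \cdots & m_{k+1,k} & 0 \end{pmatrix} \]
It can be readily checked that (r′, 0) = (r, 0) · µ(q, a) indeed satisfies, for all i ∈ [k], that:
\[ r'_i = \min\{\, r_1 + m_{1,i},\; \ldots,\; r_k + m_{k,i},\; m_{k+1,i} \,\}. \]
(Recalling that the multiplication is made in the semiring (K, min, +).) Note that the (k + 1)-th component is a virtual register that will be maintained to 0. Given a word w = a 1 · · · a n ∈ A* whose run in A′ visits the states q 0 , q 1 , . . . , q n with q n ∈ F , the output value is then defined as:
\[ C(w) = (\lambda, 0) \cdot \mu(q_0, a_1) \cdot \mu(q_1, a_2) \cdots \mu(q_{n-1}, a_n) \cdot \nu(q_n). \]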
Now that the precise definition of K-CRA is settled, we present the construction. We will assume that λ ∈ {0, ∞} k and that the updates are in one of two possible forms:
In symbols, this means that if µ(q, a) = (m i,j ), then for any i ∈ [k], either m k+1,i is ∞ or all m j,i , for j ∈ [k], are ∞. Any K-CCRA can be put under that form using standard techniques.
The automaton A is the underlying automaton of C, augmented with the information of which registers were reset by the previous transition. More precisely,
; note that the final states are irrelevant. The transition function δ A is defined by:
The K-DetWA W consists of k copies of C, one for each register. Formally,
here, the initial valuation is irrelevant. We now define the transition function δ B and the weight function µ W . Let (q, x) be a state of B and a ∈ A. By copylessness, there is at most one y such that µ(q, a) x,y is not ∞. If one such y exists, then:
The output function of W is then, for any
Consider a word w ∈ A * , and a factorization w = uv. The word u reaches a state q in C, and a state (q, E) in A. The last transition taken in C reading u updated all the registers r i , i ∈ E, with the value 0. For each of these i's, there will be a run over v in W, starting at (q, i), which follows the updates applied to r i . This process thus simulates the nondeterministic variant of C described above, showing the Proposition.
Corollary 1. Any K-CCRA is equivalent to a linearly-ambiguous K-WA.
Proof. With the notations of Proposition 2, let us see A, η, and W as a single K-WA, where the weights in the A part are set to 0. For any word w, each run on w consists of a run over a prefix u within A, and a run over the leftover suffix v within W starting in some state q ∈ η(q 0 .u). Thus there are at most |w| × |Q′| runs, hence the WA is linearly ambiguous.
As an application of this specific form, it is not hard to show that some specific functions are not expressible using a Z ∞ -CCRA. Let minblock (resp. lastblock) be the function from {a, #} * to N which, given w = #a
Proposition 3. The following functions are not expressible by a Z ∞ -CCRA:
Proof (sketch). In both cases, one has to reason about when the nondeterministic jump, given by η in Proposition 2, is made in the minimal run, bearing in mind that neither minblock nor lastblock is computable by a DetWA. For the first example, the jump has to be made at the beginning of the minimal block of a's, after reading a #; thus the number of c's cannot be taken into account. For the second example, if the jump is made just before the last block of a's in v, then the value of the last block in u is disregarded. If it is made just before the last block in u, then the DetWA part has to compute lastblock on v, which is not possible.
Remark 1.
Note that the first function of Proposition 3 is expressible by a LinWA, and the second by an unambiguous WA (i.e., at most one run per accepted word). Moreover, since minblock is not expressible by an unambiguous WA but is by a CCRA (see the Introduction), the classes of functions expressed by the two models are incomparable. We also remark that Proposition 2 and Corollary 1 hold for any semiring.
Simulation of Z-VASS z using Z ∞ -CCRA
Let V be a Z-VASS z and q a state of V. Recall that C k is the update alphabet of symbols inc i , dec i , and chk i , and that L V,q ⊆ (C k ) * is the reachability language of q. In this section, we devise a simulation of V using Z ∞ -CCRA in the following sense: Given a word w ∈ (C k ) * , the Z ∞ -CCRA will output 0 iff w ∈ L V,q . Compared to the simulation by N ∞ -CCRA of the forthcoming Section 5, the Z case is quite straightforward, and reminiscent of the methodology of [1] ; it however provides some intuition for the construction for N.
We present how the counter increments (inc), decrements (dec), and zero-tests (chk) are implemented for a single counter before showing how multiple counters can be handled. The automaton structure of the source Z-VASS z , with accepting state q, can then be followed by the CRA while simulating the counters.
Simulation of a single counter
Since we are working with a single counter, we drop the indices of the letters in C 1 . A single counter c will be simulated by 3 registers: r + and r − , carrying the values of c and −c, respectively, and r z which shall be 0 if each time the letter chk was read, c was 0. If at any time chk was read while c was nonzero, then r z will be strictly smaller than 0. This is implemented as follows:
inc :
If r z becomes strictly smaller than 0, it will stay so after reading any word in (C 1 ) * .
Observation 2. Assume r + = r − = r z = 0. After reading i letters inc and j letters dec, in any order, then reading a final chk, the new values of the registers satisfy:
This simulates the original counter in the following sense:
Proposition 4. Let V be a Z-VASS z of dimension 1 and q a state of V. There is a Z ∞ -CCRA C with C(w) ≤ 0 for any w and such that:
\[ C(w) = 0 \iff w \in L_{V,q}. \]
Proof. Let V = (Q, C 1 , δ, q 0 , F ) and q ∈ Q. The Z ∞ -CCRA C with 3 registers is defined as having (Q, C 1 , δ, q 0 , {q}) as the automaton structure, and the updates are dictated by the letter being read, as above. On state q, C outputs r z .
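The register updates for a single counter can be sketched in Python. The exact update table is an assumption here: the updates below are one choice consistent with the description (r + carries c, r − carries −c, r z drops below 0 on a failed zero-test, and both counters' registers are reset on chk). The sketch ignores the automaton structure and only tracks the registers.

```python
def simulate_single_counter(word):
    """Return r_z after reading `word`, a list of "inc" / "dec" / "chk"."""
    r_plus = r_minus = r_z = 0
    for letter in word:
        if letter == "inc":
            r_plus, r_minus = r_plus + 1, r_minus - 1
        elif letter == "dec":
            r_plus, r_minus = r_plus - 1, r_minus + 1
        elif letter == "chk":
            # Assumed copyless update: each register is read exactly once,
            # then r_plus and r_minus are reset to constants.
            r_z = min(r_z, r_plus, r_minus)
            r_plus, r_minus = 0, 0
        else:
            raise ValueError("unknown letter: %r" % (letter,))
    return r_z
```

If the counter is nonzero when chk is read, one of r + and r − is negative, so r z becomes negative and, being only ever updated through min with itself, stays negative.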
Simulation of multiple counters
It is quite straightforward to combine multiple r z registers into one. Indeed, if k counters are simulated using registers r
Proposition 5. Let V be a Z-VASS z of dimension k and q a state of V. There is a Z ∞ -CCRA C with C(w) ≤ 0 for any w and such that:
\[ C(w) = 0 \iff w \in L_{V,q}. \]
Remark 2. Here, we were mostly interested in having a specific output if the simulated execution was correct. If we wanted, by contrast, to output one of the counters on correct executions, we would need one more idea; we present it here since it is similar to the techniques of the next section.
Suppose that we wish to output the register r iff flag is 0; recall that flag may only be 0 or negative. We will do so by repeatedly reading a new letter, and having r be the only possible even output value, provided flag is 0; no even value is produced if flag is negative.
We may assume that, by construction, flag is even and r is a multiple of 4; we further assume that we have a register r 1/2 that contains half of r's value. We add the letter z to our alphabet, to be read at the end of the simulation; reading z increases r 1/2 by 2 and flag by 4. The output value is then set to:
Write s for the value of r before reading the z's, and f for the value of flag. After reading i letters z, the new values of the registers are:
For an even output to be produced, r
Simulation of Z-VASS z using N ∞ -CCRA
Let V be a Z-VASS z and q a state of V. In this section, we devise a simulation of V using N ∞ -CCRA in the following sense: Given a word w ∈ (C k ) * , the N ∞ -CCRA will output an even value iff w ∈ L V,q .
Translating the strategy for Z to the N setting turns out to be a nontrivial matter. Indeed, one might expect that it would be enough to increase the updates so that no negative number appears therein. This would contribute a linear blowup to the values, but does not seem to change the overall behavior. However, the resets made while reading chk i would have to be equal to that blowup, and this would require copying.
The simulation will thus follow two phases. First, one that corresponds to the strategy for Z with the updates tweaked to be positive; second, after reading a chk i , a climb-back phase that puts the registers back in a manageable state (called "ready" later on). For this latter phase, the N ∞ -CCRA will read a word in cb i * · chkcb i , the letter cb standing for climb-back. Further, combining the acceptance conditions of multiple counters will also require some new letters; the alphabet of the automaton is thus:
Simulation of a single counter
Again, since we are working with a single counter, we drop the indices of the letters in C 1 . A single counter in the Z-VASS z will be simulated by 7 different registers, each with a simple intended meaning:
- r + and r − should respectively count the number of increments and decrements of the counter;
- r u increases each time the counter is either incremented or decremented; it counts the number of updates to the counter;
- the register r u/2 should be half of r u ;
- r z will be a witness that the chk letter has always been read when the simulated counter was zero and that the climb-back phases were done correctly;
- finally, we will need two internal registers r cb and r 2cb , used solely in the climb-back phase.
To simplify the discussion, we give names to some register configurations:
In the first two configurations, we also assume that r cb = r 2cb = 0.
Goal of the construction. We will show that if the registers are ready and we read an equal number of inc's and dec's followed by a chk, then the registers become to-climb. There is then a precise number i such that reading cb i · chkcb will put the registers back in ready mode. Crucially, if the numbers of inc's and dec's are not equal, or an incorrect number of cb's is read, then the registers become dead.
The updates are as follows, where the registers not shown are simply preserved. As we saw in Remark 2, we will require that the values of the registers be divisible by some values, hence rather than incrementing with 1, we increment by a value e ∈ N (for Einheit, unit) to be determined later. Note that these are indeed copyless updates.
If the registers are dead, they will stay so after reading any word in (C 1 ) * .
Lemma 1. Assume the registers are ready. After reading i letters inc and j letters dec, in any order, then reading a final chk, the new values of the registers satisfy:
1. If i = j, then they are to-climb; 2. Otherwise, they are dead.
Proof. Suppose r
2 × r u , and let us name that value s. After reading i letters inc and j letters dec, the new values are:
2 × r u , thus reading chk will indeed make the registers to-climb. Otherwise, one of r + or r − is smaller than r z , and reading chk will make the registers dead. 
Reading chkcb thus makes the registers ready. If i is smaller than (2 × s)/e then r cb < r z ; if it is greater, then r u < r z : reading chkcb thus makes the registers dead.
Simulation of multiple counters
We just saw how to simulate a single counter in the sense that the registers are not dead iff the input word describes a correct run (i.e., one in which chk is only read if the counter is 0). Let us now exhibit a method that combines multiple such simulations, and outputs an even value iff none of the simulations is dead.
To do so, we will repeatedly read new letters z 1 , z 2 , . . . , z k at the very end of the execution, in a similar fashion as Remark 2.
Let us suppose we have k simulated counters, hence k sets of 7 registers. For this phase, we will only use r
Applications
We draw a number of undecidability results as consequences of these simulations.
Theorem 2 (Equivalence). The following problem is undecidable:
Proof. Let V be a Z-VASS z and q a state of V, and consider the N ∞ -CCRA C that simulates L V,q . We reduce deciding if that language is empty (which is undecidable by Proposition 1) to the problem at hand. Equation (1), defining the output of C, is such that r avg is the minimum iff the execution was correct. Thus replacing this output function by:
changes the output value of a word iff it was a correct run. Calling C′ this modified version, it holds that C and C′ are equivalent iff L V,q = ∅.
Clearly, it is undecidable whether the image of an N ∞ -CCRA is always odd. Further, that image may be nonsemilinear (see the following proof), and:
Theorem 3 (Semilinearity). The following problem is undecidable:
Given: An N ∞ -CCRA C over A *
Question: Is C(A * ) semilinear, i.e., an eventually periodic set?
Proof. We provide an independent construction which bears some similarities to the "climb-back" method. It doubles a register r in the following sense: if r is a register with starting value s, then reading inc s/2 · chk doubles the value of r; if any other number of inc's is read (which happens in particular when s is odd), the new value of r will be some odd number.
Consider a register r with initial value s, and suppose we have an additional register r′ holding 2 × s. We introduce two new registers, r cb and r 2cb initialized with 0. Upon reading a word cb i · chkcb, we apply the updates:
After reading cb i , it holds that r = s + 2 × i, r cb = 4 × i, and r 2cb = 8 × i. . In both cases, after reading chkcb, r becomes odd, and will stay so after reading any other word.
As a side note, consider the N ∞ -CCRA with the above updates and r initialized to 2, that reads words in (cb * · chkcb) * . Then the only even outputs of this machine are the powers of two, a nonsemilinear set.
This concludes the construction, and we now present the reduction. Let V be a Z-VASS z and q a state of V, and consider the N ∞ -CCRA C that simulates L V,q . We assume that |L V,q | ≤ 1, and again reduce deciding L V,q = ∅ to the problem at hand.
First we note that we may assume that C outputs all the odd numbers, for instance by adding a fresh letter and, upon reading n copies of it, outputting 2 × n + 1. Also recall that if L V,q is nonempty, then there is a unique w such that C(w) is even.
We now modify C into C′ to incorporate the above machinery. We simply store in a new register r the output value of C, and proceed by reading words of the form cb i · chkcb with the updates as above. If L V,q = ∅, then C′((C k ) * ) is all the odd numbers, a semilinear set. Otherwise, there is one (and only one) even value s in the image of C, and it holds that:
a nonsemilinear set.
Theorem 4 (Upperboundedness). The following problem is undecidable:
Proof. Let V be a Z-VASS z and q a state of V, and consider the Z ∞ -CCRA C that simulates L V,q . Relying on Proposition 2, let W be a Z ∞ -WA equivalent to C. Tweak W to output the same as C plus one, hence W(w) is 1 iff w ∈ L V,q . Now let W′ be W with an added letter # that jumps from the final states of W to its initial state; formally, let W = (A, λ, µ, ν) with A = (Q, A, δ, q 0 , F ), then W′ is (A′, λ, µ′, ν) where A′ = (Q, A ∪ {#}, δ ∪ {(q, #, q 0 ) | q ∈ F }, q 0 , F ), and µ′ agrees with µ on δ and is extended by µ′(q, #, q 0 ) = ν(q) + λ.
In essence, W′ is iterating W: for all words w 1 , . . . , w n ∈ A*,
\[ W'(w_1 \# w_2 \# \cdots \# w_n) = \sum_{i=1}^{n} W(w_i). \]
From this, we see that if W is always negative or zero, W′ is bounded; otherwise, if W(w) = 1, then W′((w#)^c · w) = c + 1, hence W′ is unbounded.
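The effect of the added letter # can be illustrated on values alone. The sketch assumes that the value of the iterated automaton on a #-separated word is the sum of the values of the blocks, which follows from # returning to the initial state with weight ν(q) + λ; `toy_W` is a hypothetical stand-in for a function that is at most 1 and equal to 1 on a single word.

```python
def iterated_value(W, word, sep="#"):
    """Value on w1#w2#...#wn, assuming block values add up."""
    return sum(W(block) for block in word.split(sep))

def toy_W(block):
    # Hypothetical stand-in: 1 on the single "correct" word, 0 elsewhere.
    return 1 if block == "ok" else 0
```

Repeating the positive-valued word c + 1 times yields the value c + 1, so the iterated function is unbounded as soon as the original one takes the value 1 somewhere.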
Conclusion
Deceptively powerful, copyless cost register automata with increments and min operations were shown to be able to simulate and check runs of counter machines. The constructions show that the repeated use of min enables behaviors that appear outside the scope of copylessness, e.g., an N ∞ -CCRA can double the value of a register (or, more precisely, can attempt to do so while knowing when it failed). As a main consequence, equivalence of N ∞ -CCRA is undecidable.
We wish to highlight two open questions. First, Theorem 4 comes short of telling us anything about the decidability of upper-boundedness for Z ∞ -CCRA (the same being decidable for N ∞ -CCRA and N-WA in general [5] ). Note that it cannot be decided whether a Z ∞ -CCRA is upper-bounded by a given constant (from Proposition 5).
Second, the normal form of Proposition 2 hints at the possibility that linearly ambiguous WA can be put into a similar form. More precisely, it seems that any such WA can be decomposed into two unambiguous WA, the first one making nondeterministic jumps into the second. Does this hold?
