Abstract. We study the synthesis problem for external linear or branching specifications and distributed, synchronous architectures with arbitrary delays on processes. External means that the specification only relates input and output variables. We introduce the subclass of uniformly well-connected (UWC) architectures for which there exists a routing allowing each output process to get the values of all inputs it is connected to, as soon as possible. We prove that the distributed synthesis problem is decidable on UWC architectures if and only if the output variables are totally ordered by their knowledge of input variables. We also show that if we extend this class by letting the routing depend on the output process, then the previous decidability result fails. Finally, we provide a natural restriction on specifications under which the whole class of UWC architectures is decidable.
Keywords: synthesis problem, distributed systems, synchronous architectures.
Introduction
Synthesis is an essential problem in computer science introduced by Church [6]. It consists in translating a system property, which relates input and output events, into a low-level model which computes the output from the input, so that the property is met. The property may be given in a high-level specification language (such as monadic second order logic) while the low-level model can be a finite state machine. More generally, the problem can be parametrized by the specification language and the target model.
The controller synthesis problem, in which a system is also part of the input, extends the synthesis problem. The goal is to synthesize a controller such that the system, synchronized with the controller, meets the given specification. Thus, the synthesis problem corresponds to the particular case of the controller synthesis problem with a system having all possible behaviors. Both problems have a classical formulation in terms of games. See for instance [27, 28] for a presentation of relationships between two-player infinite games in an automata-theoretic setting, and the synthesis problem. Both problems also have several variants. Let us review some of them, in order to relate the contribution of the present paper to existing work.
1.1. Some variants of the synthesis problem.
Closed vs. open systems. Early approaches consider closed systems, in which there is no interaction with an environment [7]. Synthesis has later been extended to open systems [22, 1], that is, to systems interacting with an unpredictable environment. The goal is to enforce the specification no matter how the environment acts. In this work, we consider open systems.
Centralized vs. distributed systems. A solution to Church's problem for centralized systems has been presented by Büchi and Landweber [5], for monadic second order specifications. A distributed system is made up of several communicating processes. The additional difficulty showing up with distributed systems is that the information acquired by each individual process about the global state of the system is only partial. Indeed, data exchanges between processes are constrained by a given communication architecture. For controller synthesis, the controller itself is required to be distributed over the same communication architecture, so that each of its components cannot have a complete knowledge of what happens. In this paper we also consider distributed systems.
• Full specifications are the most general ones: they may refer to any variable.
• External specifications only refer to input and output variables, but not to internal ones.
• Local specifications are Boolean combinations of p-local specifications, where p denotes a process. For a given process p, a specification is said p-local if it only refers to variables read or written by process p.
In this work, we use external specifications. Before discussing this choice and presenting our contributions, let us review the most salient existing results on the synthesis problem.
1.2. Synthesis for distributed systems: related work. For asynchronous systems, synthesis was first studied in [23] for single-process implementations and linear-time specifications.
In [17], the synthesis problem in the distributed setting is proved decidable for trace-closed specifications, yet for a quite specific class of controllers. This result has been strengthened in [18], where restrictions on the communication patterns of the controllers have been relaxed. Another subclass of decidable systems, incomparable with the preceding one, has been identified in [10], using an enhanced memory for controllers. The synthesis of asynchronous distributed systems in the general case of µ-calculus specifications was studied in [9]. Also, the theory of asynchronous automata has been applied in [26] to solve the synthesis problem of closed distributed systems.
For synchronous systems, undecidability is the common point of most existing results. This question was first studied in [24], where synthesis was proved undecidable for LTL specifications and arbitrary architectures. For pipeline architectures (where processes are linearly ordered and each process communicates to its right neighbor), synthesis becomes decidable for LTL specifications, with nonelementary complexity. The lower bound follows from a former result on multiplayer games [21]. Even for local specifications, constraining only variables local to processes, the problem is still undecidable for most communication architectures [16]. Synthesis has been shown decidable for the pipeline architecture and CTL* full specifications [15]. A decision criterion for full specifications has then been established in [8]. It implies that the problem is undecidable for the architecture of Figure 1. The reason is that full specifications make it possible to enforce a constant value on variable t, breaking the communication link between processes p 0 and p 1 .
1.3. Contributions. We address the synthesis problem for open distributed synchronous systems and temporal logic specifications. In contrast to the situation in the asynchronous setting, most decidability results for synthesis of synchronous systems are negative. The goal of this paper is to investigate relevant restrictions to obtain decidability. Undecidability often arises when dealing with full specifications. For the rare positive statements, as for the pipeline architecture, allowing full specifications strengthens the decidability result [15]. On the other hand, for the undecidability part of the criterion obtained in [8], allowing full specifications weakens the result by yielding easy reductions to the basic undecidable architecture of Pnueli and Rosner [24] (see Figure 1), for instance by breaking communication links at will. In the seminal paper [24], specifications were assumed to be external, or input-output: only variables communicating with the environment could be constrained. The way processes of the system communicate was only restricted by the communication architecture, not by the specification. This is very natural from a practical point of view: when writing a specification, we are only concerned with the input/output behavior of the system and we should leave to the implementation all freedom on its internal behavior. For that reason, solving the problem for external specifications is more relevant and useful, albeit more difficult, than a decidability criterion for arbitrary specifications. We will show that the synthesis problem is decidable for the architecture of Figure 1 and external specifications, that is, if we do not constrain the internal variable t.
Figure 1. Architecture decidable/undecidable for external/full specifications.
Results. We consider the synthesis problem for synchronous semantics, where each process is assigned a nonnegative delay. The delays can be used to model latency in communications, or slow processes. This model has the same expressive power as the one where delays sit on communication channels, and it subsumes both the 0-delay and the 1-delay classical semantics [24, 15].
To rule out unnatural properties yielding undecidability, the specifications we consider are external, coming back to the original framework of [24, 6]. In Section 3, we first determine a sufficient condition for undecidability with external specifications, which generalizes the undecidability result of [24]. We next introduce in Section 4 uniformly well-connected (UWC) architectures. Informally, an architecture is UWC if there exists a routing allowing each output process to get, as soon as possible, the values of all inputs it is connected to. Using tree automata, we prove that for such architectures and external specifications, the sufficient condition for undecidability becomes a criterion. (As already pointed out, synthesis may be undecidable for full specifications while decidable for external ones.) We also propose a natural restriction on specifications for which synthesis, on UWC architectures, becomes decidable. We call such specifications robust specifications. Finally, we introduce in Section 5 the larger class of well-connected architectures, in which the routing of input variables to an output process may depend on that process. We show that, for this larger class, our sufficient condition for undecidability is no longer a necessary one. The undecidability proof highlights the surprising fact that in Figure 1, blanking out a single information bit in the transmission of x 0 to p 1 through t suffices to yield undecidability. This is a step forward in understanding decidability limits for distributed synthesis. It remains open whether the problem is decidable for robust external specifications and well-connected architectures.
An extended abstract of this work appeared in [11] .
Preliminaries
Trees and tree automata. Given two finite sets X and Y , a Y -labeled X-tree (also called full tree) is a total function t : X * → Y where elements of X are called directions, and elements of Y are called labels. A word σ ∈ X * defines a node of t and t(σ) is its label. The empty word ε is the root of the tree. A word σ ∈ X ω is a branch. In the following, a tree t : X * → Y will be called an (X, Y )-tree. A non-deterministic tree automaton (NDTA) A = (X, Y, Q, q 0 , δ, α) runs on (X, Y )-trees. It consists of a finite set of states Q, an initial state q 0 , a transition function δ : Q × Y → 2^(Q^X) and an acceptance condition α ⊆ Q ω . A run ρ of such an automaton over an (X, Y )-tree t is an (X, Q)-tree ρ such that ρ(ε) = q 0 , and for all σ ∈ X * , (ρ(σ · x)) x∈X ∈ δ(ρ(σ), t(σ)). A run is accepting if the sequence of states along each of its branches belongs to α.
The specific acceptance condition chosen among the classical ones is not important in this paper.
Architectures. An architecture A = (V ⊎ P, E, (S v ) v∈V , s 0 , (d p ) p∈P ) is a finite directed acyclic bipartite graph, where V ⊎ P is the set of vertices, and E ⊆ (V × P ) ∪ (P × V ) is the set of edges, such that |E −1 (v)| ≤ 1 for all v ∈ V . Elements of P will be called processes and elements of V variables. Intuitively, an edge (v, p) ∈ V × P means that process p can read variable v, and an edge (p, v) ∈ P × V means that p can write on v. Thus, |E −1 (v)| ≤ 1 means that a variable v is written by at most one process. An example of an architecture is given in Figure 2, where processes are represented by boxes and variables by circles. Input and output variables are defined, respectively, by V I = {v ∈ V | E −1 (v) = ∅} and V O = {v ∈ V | E(v) = ∅}.
Variables in V \ (V I ∪ V O ) will be called internal. We assume that no process is minimal or maximal in the graph: for p ∈ P , we have E(p) ≠ ∅ and E −1 (p) ≠ ∅. Each variable v ranges over a finite domain S v , given with the architecture. For U ⊆ V , we denote by S U the set ∏ v∈U S v . A configuration of the architecture is given by a tuple s = (s v ) v∈V ∈ S V describing the value of all variables. For U ⊆ V , we denote by s U = (s v ) v∈U the projection of the configuration s to the subset of variables U . The initial configuration is s 0 = (s v 0 ) v∈V ∈ S V . We will assume that |S v | ≥ 2 for all v ∈ V , because a variable v for which |S v | = 1 always has the same value and may be ignored. It will be convenient in some proofs to assume that {0, 1} ⊆ S v and that s v 0 = 0 for all v ∈ V . Each process p ∈ P is associated with a delay d p ∈ N that corresponds to the time interval between the moment the process reads the variables v ∈ E −1 (p) and the moment it will be able to write on its own output variables. Note that delay 0 is allowed. In the following, for v ∈ V , we will often write d v for d p where E −1 (v) = {p}.
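The graph-theoretic part of this definition can be illustrated by a minimal Python sketch. The representation (sets of names and a set of edge pairs) and all function names are assumptions made for illustration only. In particular, View(v) is read here as the set of input variables from which v is reachable, that is (E −1 ) * (v) ∩ V I , and the example graph is a hypothetical two-process architecture.

```python
def predecessors(edges, node):
    """E^{-1}(node): sources of the edges pointing to node."""
    return {a for (a, b) in edges if b == node}


def successors(edges, node):
    """E(node): targets of the edges leaving node."""
    return {b for (a, b) in edges if a == node}


def input_variables(variables, edges):
    """V_I: variables written by no process."""
    return {v for v in variables if not predecessors(edges, v)}


def output_variables(variables, edges):
    """V_O: variables read by no process."""
    return {v for v in variables if not successors(edges, v)}


def view(variables, edges, v):
    """Assumed reading of View(v): input variables from which v is reachable."""
    seen, stack = set(), [v]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(predecessors(edges, node))
    return seen & input_variables(variables, edges)


def is_well_formed(variables, processes, edges):
    """Each variable has at most one writer; no process is minimal or maximal."""
    one_writer = all(len(predecessors(edges, v)) <= 1 for v in variables)
    no_extremal = all(
        predecessors(edges, p) and successors(edges, p) for p in processes
    )
    return one_writer and no_extremal


# Hypothetical example: process p reads x0 and writes xn, process q reads y0 and writes ym.
V = {"x0", "xn", "y0", "ym"}
P = {"p", "q"}
E = {("x0", "p"), ("p", "xn"), ("y0", "q"), ("q", "ym")}
```

On this example, the input variables are x0 and y0, the output variables are xn and ym, and each output variable views exactly one input variable.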
Runs.
A run of an architecture is an infinite sequence of configurations, i.e., an infinite word over the alphabet S V , starting with the initial configuration s 0 ∈ S V given by the architecture. 
Specifications. Specifications over a set U ⊆ V of variables can be given, for instance, by a µ-calculus, CTL*, CTL, or LTL formula, using atomic propositions of the form (v = a) with v ∈ U and a ∈ S v . We then say that the formula is in L(U ) where L is the logic used. Specifications over U are external if U ⊆ V I ∪ V O . The validity of an external formula on a run tree t (or simply a run) only depends on its projection on V I ∪ V O .
Programs
The F -run tree is the run tree t : (S V I ) * → S V such that each branch is labeled by a word
. This shows that the F -run tree is unique.
Distributed synthesis problem. Let L be a specification language. The distributed synthesis problem for an architecture A is the following: given a formula ϕ ∈ L, decide whether there exists a distributed program F on A such that every F -run (or the F -run tree) satisfies ϕ. We will then say that F is a distributed implementation for the specification ϕ. If for some architecture the synthesis problem is undecidable, we say that the architecture itself is undecidable (for the specification language L).
Memoryless strategies. The strategy f v is memoryless if it does not depend on the past, that is, if there exists
In case d v = 0, this corresponds to the usual definition of a memoryless strategy.
Summaries. For a variable
i , corresponding to the composition of all local strategies used to compute v.
Smallest cumulative delay. Throughout the paper, the notion of smallest cumulative delay of transmission from u to v will extensively be used. It is defined by
, if there is no path from u to v in the architecture, and for
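As an illustration, the smallest cumulative delay can be computed by a shortest-path search. The sketch below is written under explicit assumptions, since it does not reproduce the exact definition: d(u, v) is taken to be the minimum, over all paths from u to v, of the sum of the delays of the processes on the path, with d(u, u) = 0 and d(u, v) = +∞ when there is no path. The function name, the graph representation and the example are hypothetical.

```python
import heapq
import math


def smallest_cumulative_delay(edges, delays, u, v):
    """Dijkstra search from variable u to variable v.

    edges  : set of pairs (variable, process) or (process, variable)
    delays : dict mapping each process to its delay d_p
    Assumption: reading a variable costs nothing, and the delay d_p is
    paid each time process p writes on one of its variables.
    """
    dist = {u: 0}
    heap = [(0, u)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == v:
            return d
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for (a, b) in edges:
            if a != node:
                continue
            nd = d + delays.get(a, 0)  # a is a process iff it has a delay
            if nd < dist.get(b, math.inf):
                dist[b] = nd
                heapq.heappush(heap, (nd, b))
    return math.inf  # no path from u to v


# Hypothetical example: two routes from x0 to x2, through x1 or through t.
E = {
    ("x0", "p1"), ("p1", "x1"), ("x1", "p2"), ("p2", "x2"),
    ("x0", "p3"), ("p3", "t"), ("t", "p2"),
}
D = {"p1": 1, "p2": 2, "p3": 0}
```

With these delays, the route through t (cumulative delay 0 + 2) is faster than the route through x1 (cumulative delay 1 + 2).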
d-compatibility for summaries. The compatibility of the strategies F = (f v ) v∈V \V I with the delays extends to the summaries F̂ = (
Figure 3. Architectures A and A ′
Architectures with incomparable information
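In this section, we state a sufficient condition for undecidability; this relies on an easy generalization of the undecidable architecture presented in [24]. An architecture has incomparable information if there exist output variables x, y ∈ V O such that View(x) \ View(y) ≠ ∅ and View(y) \ View(x) ≠ ∅; otherwise, it has linearly preordered information. For instance, the architectures of Figure 3 have incomparable information. The following proposition extends the undecidability result of [24, 8].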
Proposition 3.2. Architectures with incomparable information are undecidable for LTL or CTL external specifications.
In [24], the architecture A ′ shown in Figure 3 is proved undecidable, both for LTL and CTL specifications. We will reduce the synthesis problem of A ′ to the synthesis problem of an architecture with incomparable information. This reduction is rather natural but not completely straightforward: for instance, the specification needs to be changed in the reduction. For the sake of completeness, we give a precise proof of the reduction in the rest of this section.
Let A = (V ⊎ P, E, (S v ) v∈V , s 0 , (d p ) p∈P ) be an architecture with incomparable information. Without loss of generality, we assume that s v 0 = 0 for all v ∈ V . By definition, we find x 0 , y 0 ∈ V I and x n , y m ∈ V O such that x 0 ∉ View(y m ) and y 0 ∉ View(x n ). Consider paths
Note that the sets of variables {x 0 , . . . , x n } and {y 0 , . . . , y m } are disjoint.
be the architecture of Figure 3 with
with unchanged domains for output variables: S ′xn = S xn and S ′ym = S ym ; with S ′x 0 = S ′y 0 = {0, 1} as domain for input variables; and with s ′ 0 = s V ′ 0 . The delays for x n and y m are the smallest cumulative delays of transmission from x 0 to x n and y 0 to y m as defined earlier: d ′ x n = d(x 0 , x n ) and d ′ y m = d(y 0 , y m ).
The architecture A ′ is undecidable for LTL or CTL specifications (it suffices to adapt the proofs of [24, 8] taking into account different delays on processes). We reduce the distributed synthesis problem for A ′ to the same problem for A. We first consider CTL specifications.
Note that we do need to modify the specification when reducing the distributed synthesis problem from A ′ to A. Indeed, observe that the specification
is not implementable over A ′ whereas it is implementable over A, provided View(x n ) \ {x 0 } ≠ ∅ and assuming no delays.
To define an implementation F ′ over A ′ given an implementation F over A, we simulate the behavior of F when all variables in V I \ V I ′ are constantly set to 0. This will be enforced when defining the reduction of the specification from A ′ to A, using the formula χ = (x 0 ∈ {0, 1}) ∧ (y 0 ∈ {0, 1}) ∧ ⋀ v∈V I \V I ′ (v = 0). We define a reduction that maps a formula ψ of
that ensures ψ only on the subtree of executions respecting χ. This reduction is defined by
We use the following notation: for r ∈ S ′V I ′ , we define r̄ ∈ S V I by r̄ V I ′ = r and r̄ v = 0 for all v ∈ V I \ V I ′ , and we extend this definition to words (with ε̄ = ε). This allows us to fix the
The reduction of the formula is correct in the following sense: 
Proof. By an easy induction on ψ. Let t : (S
The cases of Boolean connectives are trivial. So let ψ = E ψ 1 U ψ 2 and assume that t, ρ̄ |= ψ. Then we find
′ . By induction we obtain t̃, ρ · r 1 · · · r n |= ψ 2 and t̃, ρ · r 1 · · · r i |= ψ 1 for all 0 ≤ i < n. Therefore, t̃, ρ |= ψ. The converse implication can be shown similarly.
The cases EX and EG are left to the reader. Now we prove the reduction:
Proof. Let F ′ = (f ′xn , f ′ym ) be a distributed implementation for ψ over A ′ . We will define a distributed strategy F = (f v ) v∈V for ψ over A so that the projection on V ′ of any F -run will be an F ′ -run. More precisely, if σ ∈ (S V ) + is a prefix of an F -run with σ x 0 ∈ {0, 1} + , then the following will hold:
and similarly for (y 0 , y m ).
To do so, we use the variables x 1 , . . . , x n−1 to transmit the value of x 0 through the architecture. Formally, at each step, f x k copies the last value of x k−1 it can read, namely the one that was written d x k steps before: for 0 < k < n and τ ∈ (S R(x k ) ) + , we define
By definition, f x k is clearly compatible with the delay d x k . It is easy to check that if we provide ρ ∈ {0, 1} ω as input on x 0 and follow the strategies (f x k ) above then we get on x n−1 the outcome
In order to satisfy (1), the last strategy f xn simulates f ′xn taking into account the shift by
Note that f ′xn (σ x 0 ) only depends on the prefix of σ x 0 of length |σ 2 | which is precisely σ x n−1 2 due to the shift induced by the strategies (f x k ) 0<k<n (see Figure 4) . Hence, in this case, we define f xn (σ R(xn) ) = f ′xn (σ
By definition, f xn is clearly compatible with the delay d xn . Also, we have explained that (1) holds with this definition of (f x k ) 0<k≤n . For 0 < k ≤ m, we define similarly f y k and for every other variable v, we set f v = 0. The resulting distributed strategy F = (f v ) v∈V is indeed compatible with the delays. It remains to show that F is a distributed implementation for ψ over A.
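The effect of the copy strategies (f x k ) 0<k<n can be illustrated by a small simulation. This is only a sketch under simplifying assumptions: values are indexed from step 0, every variable holds the initial value 0 until a copied value reaches it, and the helper names are hypothetical. It shows that chaining copies with delays d x 1 , . . . , d x n−1 shifts the input sequence by the sum of these delays.

```python
def delayed_copy(stream, delay, initial=0):
    """Value at step i is the source value at step i - delay (initial before that)."""
    return ([initial] * delay + list(stream))[: len(stream)]


def transmit(stream, delays):
    """Chain delayed copies: x_k copies x_{k-1} with the k-th delay."""
    for d in delays:
        stream = delayed_copy(stream, d)
    return stream


# Input on x0 and two intermediate copies with delays 1 and 2.
rho = [1, 0, 1, 1, 0]
outcome = transmit(rho, [1, 2])
```

Here the outcome is the input shifted by 3 steps, which is the cumulative delay 1 + 2 of the chain.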
Let t : (S V I ) * → S V be the F -run tree over A. We show below that t̃ : (S ′V I ′ ) * → S V ′ is in fact the F ′ -run tree over A ′ . Then, since F ′ is a distributed implementation of ψ, we deduce t̃, ε |= ψ, and Lemma 3.3 implies t, ε |= ψ. Hence F is a distributed implementation of ψ.
First, it is easy to see that t̃ is a run-tree over
. Using the same arguments, we also obtain that t̃(ρ) ym = f ′ym (ρ y 0 ), and that t̃ is the F ′ -run tree.
Proof. Suppose F = (f v ) v∈V \V I is a distributed implementation of ψ over A. We need to define the strategies f ′xn : (S ′x 0 ) + → S xn and f ′ym : (S ′y 0 ) + → S ym of the variables in A ′ . The difficulty here is that f ′xn may have fewer input variables than f xn so it cannot simply simulate it. To overcome this, we use the fact that, due to the special form of ψ, the F -run tree t satisfies ψ if and only if the sub-tree restricted to branches where all input variables other than x 0 and y 0 are always 0 also satisfies ψ. So the processes of A ′ will behave like the processes of A writing respectively on x n and y m in the special executions where the values of input variables other than x 0 and y 0 are always 0.
Formally, for ρ ∈ (S ′V I ′ ) + , we set f ′xn (ρ x 0 ) = f̂ xn (ρ̄ View(xn) ). Observe that, due to incomparable information, f̂ xn does not depend on ρ̄ y 0 . Hence f ′xn only depends on ρ x 0 and is a correct strategy for variable x n in the architecture A ′ . Moreover, f̂ xn is d-compatible and so f ′xn is d ′ -compatible. We define f ′ym similarly. It is easy to check that F ′ = (f ′xn , f ′ym ) is a distributed implementation of ψ over A ′ : let t be the F -run tree and t ′ be the F ′ -run tree. We
Hence t ′ = t̃ and since t, ε |= ψ, Lemma 3.3 implies that t̃, ε |= ψ and F ′ is a distributed implementation of ψ on A ′ .
We now consider the reduction for LTL specifications. In this case, the specification over A only needs to ensure ψ when the input values on x 0 and y 0 are in the domain allowed by A ′ . We use the reduction that maps ψ to (G ξ) → ψ, where the formula ξ is defined by ξ = (x 0 ∈ {0, 1}) ∧ (y 0 ∈ {0, 1}).
The same constructions as the ones described in the proofs of Lemma 3.4 and Lemma 3.5 yield the reduction. Indeed, let F ′ be a distributed implementation of ψ over A ′ , and let F be defined as in the proof of Lemma 3.4. Let ρ ∈ (S V I ) ω be an input sequence and
Then σ V ′ is an F ′ -run, and σ V ′ , ε |= ψ. Since ψ ∈ LTL(V ′ ) we deduce σ, ε |= ψ. We obtain that any F -run σ is such that σ, ε |= (G ξ) → ψ, and F is a distributed implementation of (G ξ) → ψ over A.
Conversely, given a distributed implementation F of (G ξ) → ψ over A, define F ′ as in the proof of Lemma 3.5. Let ρ ∈ (S ′V I ′ ) ω be an input sequence and σ = s 0 s 1 s 2 · · · ∈ (S V ) ω be the F -run induced by ρ̄. By definition of ρ̄, we have σ, ε |= G ξ, and since F is a distributed implementation of (G ξ) → ψ we get σ, ε |= ψ. Again, ψ ∈ LTL(V ′ ) implies that σ V ′ , ε |= ψ. Given that σ V ′ is in fact the F ′ -run induced by ρ (this is immediate from the definition of f ′xn and f ′ym ), F ′ is a distributed implementation of ψ over A ′ .
We have defined a reduction from the distributed synthesis problem over the architecture A ′ to the distributed synthesis problem over an architecture with incomparable information, for LTL or CTL specifications. Since the synthesis problem is undecidable both for LTL and CTL specifications over A ′ , we obtain its undecidability for architectures with incomparable information.
Uniformly well-connected architectures
This section introduces the new class of uniformly well-connected (UWC) architectures and provides a decidability criterion for the synthesis problem on this class. It also introduces the notion of robust specification and shows that UWC architectures are always decidable for external and robust specifications.
Definition. A routing for an architecture
Observe that a routing does not include local strategies for output variables. Informally, we say that an architecture is uniformly well connected if there exists a routing Φ that makes it possible to transmit with a minimal delay to every process p writing to an output variable v, all the values of the variables in View(v). 
In case there is no delay, the uniform well-connectedness refines the notion of adequate connectivity introduced by Pnueli and Rosner in [24], as we no longer require each output variable to be communicated the value of all input variables, but only of those belonging to its view. In fact, this gives us strategies for internal variables, which simply route the inputs to the processes writing on output variables.
Observe that, whereas the routing functions are memoryless, memory is required for the decoding functions. Indeed, consider the architecture of Figure 5. The delays are written next to the processes, and all variables range over the domain {0, 1}. Observe first that this architecture is UWC: process p writes to t the xor of u 1 and u 2 with delay 1. This could be written t = Y u 1 ⊕ Y u 2 where Y x denotes the previous value of variable x. In order to recover (decode) Y u 2 , process q 1 memorizes the previous value of u 1 and makes the xor with t: Y u 2 = t ⊕ Y u 1 . But if we restrict to memoryless decoding functions, then we only know u 1 and t and we cannot recover Y u 2 .
Figure 5. A uniformly well-connected architecture
4.2. Decision criterion for UWC architectures. We first show that distributed programs are somewhat easier to find in a UWC architecture. As a matter of fact, in such architectures, to define a distributed strategy it suffices to define a collection of input-output strategies that respect the delays given by the architecture.
u∈View(v) be respectively the routing and the decoding functions giving the uniform well-connectedness of the architecture A. We use the routing functions f v as memoryless strategies for the internal variables
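The xor routing of Figure 5 and its decoding with one bit of memory can be replayed on finite sequences. The following sketch is illustrative only: the function names are hypothetical, sequences are indexed from step 0, and the initial value of every variable is assumed to be 0.

```python
def previous(stream, initial=0):
    """Y x: the sequence of previous values of x (initial value at step 0)."""
    return [initial] + list(stream[:-1])


def route(u1, u2):
    """Memoryless routing by process p with delay 1: t = Y u1 xor Y u2."""
    return [a ^ b for a, b in zip(previous(u1), previous(u2))]


def decode_u2(u1, t):
    """Decoding by process q1: Y u2 = t xor Y u1, using one bit of memory."""
    memory = 0  # previous value of u1, assumed to be 0 initially
    decoded = []
    for current_u1, current_t in zip(u1, t):
        decoded.append(current_t ^ memory)
        memory = current_u1  # remember u1 for the next step
    return decoded


u1 = [1, 0, 1, 1]
u2 = [0, 1, 1, 0]
t = route(u1, u2)
```

The decoder recovers Y u 2 at every step because it stores the previous value of u 1 . A decoder that only sees the current values of u 1 and t cannot do so, since the same pair (u 1 , t) is compatible with both values of Y u 2 .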
Lemma 4.2. Let A = (V ⊎ P, E, (S v ) v∈V , s 0 , (d p ) p∈P ) be a UWC architecture. For each v ∈ V O , let h v : (S View(v) ) + → S v be an input-output mapping which is d-compatible. Then there exists a distributed program
. We need to verify that this is well-defined. Let i > 0 and ρ, ρ ′ ∈ (S V I ) i . Let σ, σ ′ ∈ (S V \V O ) i be the corresponding Φ-compatible sequences, and assume σ R(v) 
By the above, f v is well-defined and obviously it depends only on
Thus, it is indeed d-compatible. Now, let ρ ∈ (S V I ) + , and let σ be the F -run induced by ρ. We get, by definition of summaries,
We now give a decision criterion for this specific subclass of architectures.
Theorem 4.3. A UWC architecture is decidable for external (linear or branching) specifications if and only if it has linearly preordered information.
We have already seen in Section 3 that incomparable information yields undecidability of the synthesis problem for LTL or CTL external specifications. We prove now that, when restricted to the subclass of UWC architectures, this also becomes a necessary condition.
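The criterion of Theorem 4.3 is a simple check on the views of the output variables. The sketch below assumes that linearly preordered information means that the sets View(v), for v ∈ V O , are totally ordered by inclusion, and that incomparable information is the negation of this property. The views are given directly as a dictionary, and all names are hypothetical.

```python
from itertools import combinations


def has_incomparable_information(views):
    """True if two output variables have views that are incomparable for inclusion."""
    return any(
        not (a <= b or b <= a) for a, b in combinations(views.values(), 2)
    )


def order_outputs_by_view(views):
    """Return the outputs ordered so that View(v1) ⊇ View(v2) ⊇ ..., or None.

    Sorting by decreasing size and checking consecutive inclusions is
    enough: in a chain for inclusion, larger sets contain smaller ones.
    """
    ordered = sorted(views, key=lambda v: len(views[v]), reverse=True)
    for larger, smaller in zip(ordered, ordered[1:]):
        if not views[smaller] <= views[larger]:
            return None
    return ordered


# Hypothetical views: z1 sees both inputs, z2 sees only u.
preordered = {"z1": {"u", "v"}, "z2": {"u"}}
# Views of the two outputs of a two-process architecture with separate inputs.
incomparable = {"xn": {"x0"}, "ym": {"y0"}}
```

The first example admits the ordering z1, z2, starting with the variable that views the largest set of inputs, while the second example has incomparable information and admits no such ordering.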
We assume that the architecture A is UWC and has linearly preordered information, and therefore we can order the output variables
In the following, in order to use tree-automata, we extend a local strategy f : (S X ) + → S Y by letting f (ε) = s Y 0 , so that it becomes an (S X , S Y )-tree. We proceed in two steps. First, we build an automaton accepting all the global input-output 0-delay strategies implementing the specification. A global input-output 0-delay strategy for A is an (
0 . This first step is simply the program synthesis for a single process with incomplete information (since we may have View(v 1 ) ⊊ V I ). This problem was solved in [13] for CTL* specifications.
To check the existence of such trees (h v ) v∈V O , we will inductively eliminate the output variables following the order v 1 , . . . , v n . It is important that we start with the variable that views the largest set of input variables, even though, due to the delays, it might get the information much later than the remaining variables. Let V k = {v k , . . . , v n } for k ≥ 1. The induction step relies on the following statement.
trees, such that a tree t is accepted by A k+1 if and only if there exists an (S
The proof of Proposition 4.5 is split in two steps. Since
is the projection of t on U ). So one can first transform the automaton A k into A ′ k that accepts the trees t ∈ L(A k ) such that t v k is d-compatible (Lemma 4.6). Then, one can build an automaton that restricts the domain of the directions and the labeling of the accepted trees to S View(v k+1 ) and S V k+1 respectively.
Proof. Intuitively, to make sure that the function t v is d-compatible, the automaton A ′ will guess in advance the values of t v and then check that its guess is correct. The guess has to be made K = max{d(u, v) | u ∈ View(v)} steps in advance and consists in a d-compatible function g : (S View(v) ) K → S v that predicts the values that variable v will take K steps later. During a transition, the guess is sent in each direction r ∈ S View(v) as a function r −1 g defined by (r −1 g)(σ) = g(rσ) which is stored in the state of the automaton. Previous guesses are refined similarly and are also stored in the state of the automaton so that the new set of states is Q ′ = Q × F where F is the set of d-compatible functions f : (S View(v) ) <K → S v , where Z <K = ⋃ i<K Z i . The value f (ε) is the guess that was made K steps earlier and has to be checked against the current value of v in the tree.
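The operation r −1 g that specializes a guess to a direction r can be made concrete on finite functions. In this minimal sketch, a guess is represented as a Python dictionary from words (tuples of directions) to values; this representation and the function name are assumptions made for illustration.

```python
def left_quotient(g, r):
    """Return r^{-1} g, defined by (r^{-1} g)(sigma) = g(r . sigma).

    g is a dict whose keys are words, encoded as tuples of directions.
    Only the words starting with r contribute, with r stripped off.
    """
    return {
        word[1:]: value
        for word, value in g.items()
        if word and word[0] == r
    }


# A guess over words of length K = 2 on the directions {0, 1}.
g = {(0, 0): "a", (0, 1): "b", (1, 0): "c", (1, 1): "d"}
g_after_0 = left_quotient(g, 0)
```

After moving in direction 0, the guess is specialized to words of length 1. After K moves, it is reduced to a single value on the empty word, which is the value f (ε) checked against the label of the tree.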
To formalize this, we define the (transition) function ∆ :
Intuitively, if we are in state (q, f ) ∈ Q × F at some node τ and move in direction r ∈ S View(v) then ∆(f, r) computes the set of functions in F that could label the node τ · r. Observe that f ′ (σ) is determined by f and r for any σ such that |σ| < K − 1 and corresponds to the specialization of f according to the new direction r. The functions f ′ ∈ ∆(f, r) differ only on values f ′ (σ) for |σ| = K − 1 which correspond to the new guesses. Now, the transition function of A ′ is defined for (q, f ) ∈ Q ′ and s ∈ S U only if s v = f (ε) (this ensures that the guess made K steps earlier was correct) and sends in each direction r ∈ S View(v) of the tree a copy of the automaton in the state (q r , g r ) where q r corresponds to the simulation of a run of A and g r ∈ ∆(f, r).
Finally, the set of initial states of A ′ is I ′ = {q 0 } × F and α ′ = π −1 (α) where π : (Q × F) ω → Q ω is the projection on Q, i.e., a run of A ′ is successful if and only if its projection on Q is a successful run of A.
Let t be an (S View(v) , S U )-tree accepted by A and suppose that t v is d-compatible. Let ρ : (S View(v) ) * → Q be an accepting run of A over t. There is a unique way to extend ρ to a run ρ ′ : (S View(v) ) * → Q × F of A ′ over t. The only possibility is to label a node τ ∈ (S View(v) ) * by the map f τ : (S View(v) ) <K → S v defined by f τ (σ) = t v (τ · σ), so that all guesses are correct. Since t v is d-compatible, we deduce that f τ is also d-compatible, hence it belongs to F. Then we can define the run ρ ′ by ρ ′ (τ ) = (ρ(τ ), f τ ) for τ ∈ (S View(v) ) * . We show that it is an accepting run of A ′ over t. First, we prove that at each node τ ∈ (S View(v) ) * the transition function δ ′ is satisfied. Let (q,
for all r ∈ S View(v) . This is obvious from the definitions:
Finally, the run ρ ′ is successful since its projection on Q is ρ which is successful.
Conversely, suppose there is a successful run ρ ′ of A ′ over t. We need to show that t v is d-compatible and that t ∈ L(A). Let ρ ′ : (S View(v) ) * → Q × F be such a run. We have ρ ′ = (ρ, H) with ρ : (S View(v) ) * → Q and H : (S View(v) ) * → F. By definition of δ ′ , we immediately get that ρ is a run of A, which is successful since ρ ′ is successful.
It remains to prove that t v is d-compatible. Since ρ ′ is a run and the transition function δ ′ is only defined on ((q, f ), s) when s v = f (ε), we deduce that t v (τ ) = H(τ )(ε) for all τ ∈ (S View(v) ) * . Hence, we need to show that the map
. We can show, by successive applications of the transition function δ ′ and by definition of ∆, that the value of H(τ 1 τ 2 )(ε) is indeed the guess made at node τ 1 for the direction defined by τ 2 , i.e.,
Proof of Proposition 4.5. We consider the NDTA compat v k (A k ). It remains to project away the S v k component of the label and to make sure that the S V k+1 component of the label only depends on the S View(v k+1 ) component of the input. The first part is the classical projection on S V k+1 of the automaton and the second part is the narrowing construction introduced in [13]. The automaton A k+1 fulfilling the requirements of Proposition 4.5 is therefore given by narrow View(v k+1 ) (proj V k+1 (compat v k (A k ))). Note that, even when applied to a NDTA, the narrowing construction of [13] yields an alternating tree automaton. Here we assume that the narrowing operation returns a NDTA using a classical transformation of alternating tree automata into NDTA [20]. The drawback is that this involves an exponential blow-up. Unfortunately, this is needed since Lemma 4.6 requires a NDTA as input.
We can now conclude the proof of Theorem 4.3. Using Proposition 4.5 inductively starting from the NDTA A 1 of Proposition 4.4, we obtain an NDTA A n accepting an (S View(vn) , S vn )-tree h vn if and only if for each 1 ≤ i < n, there exists an (S View(v i ) , S v i )-tree h v i which is d-compatible and such that h v 1 ⊕ · · · ⊕ h vn is accepted by A 1 . Therefore, using Lemma 4.2, there is a distributed implementation for the specification over A if and only if L(compat vn (A n )) is nonempty. The overall procedure is non-elementary, due to the exponential blow-up incurred at each inductive step of Proposition 4.5. We currently do not know a lower bound for the complexity of this problem.
Decidability for UWC architectures and robust specifications.
We now show that we can obtain decidability of the synthesis problem for the whole subclass of UWC architectures by restricting ourselves to specifications that only relate output variables to their own view.
Note that a robust formula is always external.
Proposition 4.8. The synthesis problem for robust CTL* specifications is decidable over UWC architectures.
Since F implements ϕ, we have t |= ϕ and then t |= ϕ v . We can prove by structural induction on the formula that for any ψ ∈ CTL * (View(v) ∪ {v}), any branch σ ∈ (S V I ) ω and any position i we have t, σ, i |= ψ if and only if t ′ , σ View(v) 
) and we obtain as above that t |= ϕ v . Therefore, t |= ϕ and F implements ϕ on A.
Well-connected architectures
It is natural to ask whether the decision criterion for UWC architectures can be extended to a larger class. In this section, we relax the property of uniform well-connectedness and show that, in that case, linearly preordered information is no longer a sufficient condition for decidability.
Definition 5.1. An architecture is said to be well-connected if, for each output variable v ∈ V O , the sub-architecture consisting of (E −1 ) * (v) is uniformly well-connected.
Intuitively, this means that for each output variable v there is a routing making it possible to transmit the values of the input variables in View(v) to the process that writes on v. Such a routing may vary from one output variable to another, in contrast with the case of UWC architectures, where a single routing is used for all output variables. For instance, the architecture of Figure 2 is well-connected. Indeed, to transmit the values of u and v to z ij , it is enough to write u on z i and v on z j . Note that this does not give a uniform routing. Actually, the architecture of Figure 2 is not UWC when variable values range over {0, 1} (as shown by Proposition 5.3 below). Hence, the subclass of UWC architectures is strictly contained in the subclass of well-connected architectures.
In the proof of Proposition 5.3, we use the following lemma, established in [25] for solving the network information flow problem introduced in [2] .
We say that two functions $f$ and $g$ from $S^2$ to $S$ are independent if $(f, g) : S^2 \to S^2$ is invertible. This lemma asserts that over a small alphabet, one cannot build a large set of pairwise independent functions. In our setting, it implies the following result:
Proof. It is easy to see that the architecture $A$ of Figure 2 is well-connected. However, it is not uniformly well-connected. Indeed, suppose it is. Then there exist a routing $\Phi = (f^{z_1}, f^{z_2}, f^{z_3}, f^{z_4})$ consisting of four memoryless strategies and, for each $v \in V_O$, a decoding function $g^v : \{0, 1\}^2 \to \{0, 1\}^2$. Since the process writing on $z_{ij}$ must recover the values of both inputs from what is written on $z_i$ and $z_j$, uniform well-connectedness of $A$ implies that every pair $(f^{z_i}, f^{z_j})$ is invertible, with $g^{z_{ij}}$ as inverse. The four functions of the routing are therefore pairwise independent. This contradicts Lemma 5.2, which implies that for Boolean variables, there are at most three pairwise independent functions. Hence the architecture is not uniformly well-connected.
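The bound of three used in this argument can be checked by brute force in the Boolean case. The following Python sketch is only an illustration and not part of the original proof: it enumerates the sixteen functions from {0, 1}^2 to {0, 1} as truth tables and searches for the largest pairwise independent family. The names `independent` and `max_pairwise_independent` are hypothetical.

```python
from itertools import combinations, product

# The four points of {0,1}^2, in a fixed order used to index truth tables.
INPUTS = list(product((0, 1), repeat=2))


def independent(f, g):
    """f and g are truth tables (4-tuples of bits indexed like INPUTS).

    They are independent when (f, g) : {0,1}^2 -> {0,1}^2 is a bijection,
    i.e. when the four output pairs are pairwise distinct.
    """
    return len({(f[i], g[i]) for i in range(len(INPUTS))}) == len(INPUTS)


def max_pairwise_independent():
    """Size of the largest pairwise independent family of Boolean functions."""
    functions = list(product((0, 1), repeat=4))  # all 16 truth tables
    best = 0
    for k in range(1, len(functions) + 1):
        found = any(
            all(independent(f, g) for f, g in combinations(family, 2))
            for family in combinations(functions, k)
        )
        if not found:
            break  # no family of size k, hence none of larger size either
        best = k
    return best


if __name__ == "__main__":
    print(max_pairwise_independent())  # prints 3
```

A family of size three is given, for instance, by the two projections and their xor, while no fourth function can be added. This is why a routing made of four memoryless Boolean strategies cannot be pairwise invertible.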
Interestingly enough, the size of the alphabet has an influence on the possibility to have a uniform routing and Lemma 5.2 helps to understand why. In our setting, this means that by enlarging the domains of internal variables, we may obtain uniform well-connectedness from a well-connected architecture.
The following theorem asserts that, unfortunately, the decision criterion cannot be extended to well-connected architectures.
Theorem 5.4. The synthesis problem for LTL specifications and well-connected architectures with linearly preordered information is undecidable.
Let $A$ be the architecture of Figure 6, in which all the delays are set to 0, and which is clearly well-connected and linearly preordered. To show its undecidability, fix a deterministic Turing machine $M$ with tape alphabet $\Gamma$ and state set $Q$. We reduce the non-halting problem of $M$ starting from the empty tape to the distributed implementability of an LTL specification over $A$. Let $S^z = \{0, 1\}$ for $z \in V \setminus \{x, y\}$ and $S^x = S^y = \Gamma \uplus Q \uplus \{\#\}$, where $\#$ is a new symbol. As usual, the configuration of $M$ defined by state $q$ and tape content $\gamma_1 \gamma_2$, where the head scans the first symbol of $\gamma_2$, is encoded by the word $\gamma_1 q \gamma_2 \in \Gamma^* Q \Gamma^+$ (we require that $\gamma_2 \neq \varepsilon$ for technical reasons, including in it some blank symbols if necessary). An input word $u \in 0^* 1^p 0 \{0, 1\}^\omega$ encodes the integer $n(u) = p$, and similarly for $v$. We construct an LTL specification $\varphi_M$ forcing any distributed implementation to output on variable $x$ the $n(u)$-th configuration of $M$ starting from the empty tape. Processes $p_0$ and $p_6$ play the role of the two processes of the undecidable architecture of Pnueli and Rosner ($A'$ in Figure 3). The difficulty is to ensure that process $p_6$ cannot receive relevant information about $u$.
Figure 6. Undecidable, well-connected and linearly preordered architecture.
The specification $\varphi_M$ is a conjunction of five properties described below that can all be expressed in $\mathrm{LTL}(V_I \cup V_O)$.
(1) The processes p i for 1 ≤ i ≤ 5 have to output the current values of (u, w) on (u i , w i ) until (including) the first 1 occurs on w. Afterwards, they are unconstrained. Process p 6 must always output the value of w on w 6 . Moreover, after the first 1 on w, it also has to output the current value of u on u 6 . Formally, this is defined by the LTL formula α:
This is expressed by β = β u,x ∧ β v,y , where
(3) We next express with a formula $\gamma_M$ that if $n(u) = 1$ then $x$ has to output the first configuration $C_1$ of $M$ starting from the empty tape. That is, if the input is in $0^q 1 0 \{0, 1\}^\omega$, then the corresponding output is $\#^{q+1} C_1 \#^\omega$. The LTL formula is
where $(x \in C_1 \#^\omega)$ can be expressed easily.
(4) We say that the input words are synchronized either if $u, v \in 0^q 1^p 0 \{0, 1\}^\omega$ or else if $u \in 0^q 1^{p+1} 0 \{0, 1\}^\omega$ and $v \in 0^{q+1} 1^p 0 \{0, 1\}^\omega$. We use a formula $\delta$ to express the fact that if $u$ and $v$ are synchronized and $n(u) = n(v)$, then the outputs on $x$ and $y$ are equal. We first define the LTL formula
to express the fact that the input words u and v are synchronized and n(u) = n(v). Then the formula δ is defined by:
(5) Finally, one can express with an LTL formula $\psi_M$ that if the input words are synchronized and $n(u) = n(v) + 1$, then the configuration encoded on $x$ is obtained by a computation step of $M$ from the configuration encoded on $y$. We use the LTL formula $(n(u) = n(v) + 1)$ defined by
to express the fact that u and v are synchronized and n(u) = n(v) + 1. The formula ψ M is defined by
where Trans(y, x) expresses the fact that the factor of length 3 of x is obtained from the one of y by a transition of the Turing machine M . We have
Furthermore, is the blank symbol of the tape and T is the set of transitions of M (the transition (p, a, q, b, dir ), taken when M is in state p and scans symbol a, switches the state to q, writes symbol b and moves the head according to the direction dir ∈ {←, →}).
We first show that there exists a distributed implementation of ϕ M over A. Let ⊕ be the addition modulo 2 (xor). Process p 0 forwards u to z 0 . Process q forwards u to z 1 , u ⊕ w to z 2 and w to z 3 . The strategy for z 4 is not memoryless. Process q forwards w to z 4 until (including) the first 1 on w and then it forwards u ⊕ w to z 4 . Formally, f z 4 (u, 0 q b) = b and f z 4 (ua, 0 q 1wb) = a ⊕ b. We also use memoryless strategies for the processes p i so that α is satisfied. For instance, the strategy for p 1 is f 1 (b 1 , b 2 ) = (b 1 , b 1 ⊕ b 2 ) and the strategy for p 6 (y excluded) is f 6 (b 3 , b 4 ) = (b 3 ⊕ b 4 , b 3 ). It is easy to see that with these strategies, the first property α of the specification is satisfied. Note that, until the first 1 on w, p 6 outputs 0 on u 6 , and after this first 1, p 5 cannot decode u and w anymore.
The strategy f x (respectively f y ) is to output the p-th configuration of M starting from the empty tape when u (respectively v) encodes p. Then, the rest of the specification, β∧γ M ∧δ∧ψ M , is satisfied.
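The routing part of this implementation can be simulated on finite prefixes. The Python sketch below is a hedged illustration, not the authors' construction verbatim: inputs are finite lists of bits, all delays are 0, and we assume that p 1 reads (z 1 , z 2 ) and p 6 reads (z 3 , z 4 ), as suggested by the arguments of f 1 and f 6 . The function names `route`, `p1`, `p6` and `check_alpha` are hypothetical.

```python
def route(u, w):
    """Values written by process q on z1..z4 for finite input prefixes u, w."""
    z1 = list(u)                               # z1 carries u
    z2 = [a ^ b for a, b in zip(u, w)]         # z2 carries u xor w
    z3 = list(w)                               # z3 carries w
    z4 = []
    seen_one = False  # has a 1 occurred on w strictly before the current step?
    for a, b in zip(u, w):
        # w until (and including) its first 1, then u xor w
        z4.append(a ^ b if seen_one else b)
        seen_one = seen_one or b == 1
    return z1, z2, z3, z4


def p1(z1, z2):
    """f_1(b1, b2) = (b1, b1 xor b2): recovers (u, w) from (u, u xor w)."""
    return [(b1, b1 ^ b2) for b1, b2 in zip(z1, z2)]


def p6(z3, z4):
    """f_6(b3, b4) = (b3 xor b4, b3): the pair written on (u6, w6)."""
    return [(b3 ^ b4, b3) for b3, b4 in zip(z3, z4)]


def check_alpha(u, w):
    """Check the constraints of property (1) that concern p1 and p6."""
    z1, z2, z3, z4 = route(u, w)
    first_one = w.index(1) if 1 in w else len(w)
    out1, out6 = p1(z1, z2), p6(z3, z4)
    for t in range(len(u)):
        if t <= first_one and out1[t] != (u[t], w[t]):
            return False  # p1 must copy (u, w) up to the first 1 on w
        if out6[t][1] != w[t]:
            return False  # p6 must always copy w on w6
        if t > first_one and out6[t][0] != u[t]:
            return False  # after the first 1 on w, p6 must copy u on u6
        if t <= first_one and out6[t][0] != 0:
            return False  # before that, p6 only sees w xor w = 0
    return True


if __name__ == "__main__":
    print(check_alpha([1, 1, 0, 1], [0, 1, 0, 0]))
```

In this simulation, p 6 learns nothing about u up to and including the step of the first 1 on w, which matches the remark that p 6 outputs 0 on u 6 until then.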
Remark 5.5. Actually, one can define another distributed implementation by changing only the strategy f z 4 : at each step, process q transmits to p 6 the value of u at the preceding step as the mod 2 difference between z 3 and z 4 , until the first 1 occurs on w. Formally, f z 4 (a, b) = b, f z 4 (u · a 1 · a 2 , 0 q b) = a 1 ⊕ b and f z 4 (ua, 0 q 1wb) = a ⊕ b. We also adapt the strategies of p 1 , . . . , p 6 so that α is satisfied. Note that these strategies are no longer memoryless, they have to remember the last bit of u. By xoring its two arguments, process p 6 can then recover the whole history of u, except the bit occurring simultaneously with the first 1 of w. Hence, we are almost in the situation of the decidable architecture of Figure 1 , but surprisingly, missing only one bit of information suffices to yield undecidability.
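The claim of this remark, that p 6 recovers every bit of u except the one occurring simultaneously with the first 1 of w, can be checked on finite prefixes. The Python sketch below is an illustration under assumptions: inputs are finite lists of bits, z 3 carries w, and the names `z4_remark` and `recovered_by_p6` are hypothetical.

```python
def z4_remark(u, w):
    """Alternative strategy for z4 described in Remark 5.5.

    Before any 1 has occurred on w: send the previous bit of u xor the
    current bit of w (just the current bit of w at the very first step).
    Once a 1 has occurred strictly earlier: send u xor w for the current step.
    """
    z4, seen_one = [], False
    for t, (a, b) in enumerate(zip(u, w)):
        if seen_one:
            z4.append(a ^ b)
        elif t == 0:
            z4.append(b)
        else:
            z4.append(u[t - 1] ^ b)
        seen_one = seen_one or b == 1
    return z4


def recovered_by_p6(u, w):
    """Bits of u that p6 can compute from z3 = w and z4, as {position: bit}."""
    z4 = z4_remark(u, w)
    k = w.index(1) if 1 in w else len(w)  # p6 knows k because it sees w
    known = {}
    for t in range(len(u)):
        bit = w[t] ^ z4[t]        # xor of the two values read by p6
        if t > k:
            known[t] = bit        # current bit of u
        elif t >= 1:
            known[t - 1] = bit    # bit of u from the preceding step
    return known


if __name__ == "__main__":
    u, w = [1, 0, 1, 1, 0], [0, 0, 1, 0, 1]
    print(recovered_by_p6(u, w))  # every position of u except position 2
```

The missing position is exactly the index of the first 1 on w: the value of u at that step never influences z 4 .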
Let now F = (f v ) v∈V \V I be a distributed implementation of ϕ M on the architecture A of Figure 6 . We prove that f x must simulate the computation of M starting from the empty tape.
Step 1: relating the strategies for $z_3$ and $z_4$.
Lemma 5.6. Let $g_1, g_2, g_3 : \{0, 1\}^2 \to \{0, 1\}$ be pairwise independent functions. Then, there exists $\varepsilon \in \{0, 1\}$ such that for all $a, b \in \{0, 1\}$:
Step 3: enforcing output of the $n(u)$-th configuration of $M$ on $x$.
Proof. The proof is by induction on $p$. The case $p = 1$ follows from the specification $\gamma_M$. Let now $p \geq 1$ and assume that $u \in 0^q 1^{p+1} 0 \{0, 1\}^\omega$. Let $v = 0^{q+1} 1^p 0^\omega$ and $w = 0^q 1^\omega$. By induction, for $u_0 \in 0^{q+1} 1^p 0 \{0, 1\}^\omega$ the output is $x = \#^{q+1+p} C_p \#^\omega$. Using $\delta$, we deduce that on the input triple $(u_0, w, v)$ the output is $y = x = \#^{q+1+p} C_p \#^\omega$. Now, by Lemma 5.8, on the input pairs $(u_0, w)$ and $(u, w)$, the outputs on $z_3$ and $z_4$ are the same. Hence, the outputs on $y$ coincide on the input triples $(u_0, w, v)$ and $(u, w, v)$, so that $y = \#^{q+1+p} C_p \#^\omega$ on $(u, w, v)$ as well. Using $\psi_M$, we deduce that on the input triple $(u, w, v)$ the output on $x$ must be $x = \#^{q+1+p} C_{p+1} \#^\omega$. This concludes the proof since $x$ only depends on $u$.
By hiding one bit of u from p 6 , we cause uncertainty with respect to the value of n(u), preventing this process from "cheating". In turn, process p 0 , which has no information about the other input values, only knows that p 6 is not always able to cheat, and therefore always has to output the correct Turing machine configuration.
Proof of Theorem 5.4. Starting from a Turing machine $M$, we have shown that any distributed implementation of the specification $\varphi_M$ is forced to output on $x$ the $n(u)$-th configuration of $M$. Therefore, there is a distributed implementation on this architecture for the formula $\varphi_M \wedge \mathsf{G}(x \neq \mathit{halt})$ if and only if $M$ does not halt starting from the empty tape. We have thus reduced the non-halting problem of a Turing machine on the empty tape to the LTL distributed synthesis problem over a well-connected architecture with linearly preordered information, proving that this latter problem is undecidable (more precisely, not RE).
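The configuration encoding used in the reduction can be sketched as follows. This Python fragment is an illustration under assumptions and not the construction of the paper: it represents an encoded configuration as a list of symbols (tape symbols with exactly one state symbol, placed just before the scanned cell), takes C 1 to be the initial configuration on the empty tape, uses a hypothetical blank symbol `"_"`, and lets the head stay in place when a left move is requested on the leftmost cell. The names `step`, `nth_configuration` and the toy machine are hypothetical.

```python
BLANK = "_"  # hypothetical blank symbol


def step(config, states, delta):
    """One computation step on a configuration encoded as gamma1 + [q] + gamma2.

    delta maps (state, scanned symbol) to (new state, written symbol, "L"/"R").
    Returns None when no transition applies (the machine has halted).
    """
    i = next(k for k, s in enumerate(config) if s in states)
    q, a = config[i], config[i + 1]  # gamma2 is kept non-empty
    if (q, a) not in delta:
        return None
    q2, b, direction = delta[(q, a)]
    left, right = config[:i], config[i + 2:]
    if direction == "R":
        if not right:
            right = [BLANK]  # pad with a blank so that gamma2 stays non-empty
        return left + [b, q2] + right
    if not left:
        return [q2, b] + right  # assumption: no move at the left end
    return left[:-1] + [q2, left[-1], b] + right


def nth_configuration(n, q0, states, delta):
    """Configuration C_n, where C_1 is the initial one on the empty tape."""
    config = [q0, BLANK]
    for _ in range(n - 1):
        successor = step(config, states, delta)
        if successor is None:
            break  # halted: keep the last configuration
        config = successor
    return config


if __name__ == "__main__":
    # Toy machine: write 1, move right, write 1, move left, then halt.
    states = {"s", "t", "halt"}
    delta = {("s", BLANK): ("t", "1", "R"), ("t", BLANK): ("halt", "1", "L")}
    for n in range(1, 4):
        print(n, "".join(nth_configuration(n, "s", states, delta)))
```

On the toy machine, the state `halt` appears in the third configuration, so the additional conjunct requiring that x never shows the halting state would be violated for any input u with n(u) = 3.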
Conclusion
In this paper, we have shown that every decidable architecture must have linearly preordered information, and that this condition is sufficient for deciding external specifications on UWC architectures. On the other hand, we have exhibited a well-connected architecture with linearly preordered information, yet undecidable for external LTL specifications, by simulating the loss of a single information bit on the UWC architecture of Figure 1 .
Finally, we have shown that all UWC architectures are decidable for robust specifications, i.e., specifications constraining external variables which are causally related by a communication path. A challenging problem is to find whether this still holds for well-connected architectures.
