Verifying that a network configuration satisfies a given boolean predicate is a fundamental problem in distributed computing. Many variations of this problem have been studied, for example, in the context of proof labeling schemes (PLS), locally checkable proofs (LCP), and non-deterministic local decision (NLD). In all of these contexts, verification time is assumed to be constant. Korman, Kutten and Masuzawa [PODC 2011] presented a proof-labeling scheme for MST, with poly-logarithmic verification time, and logarithmic memory at each vertex.
Introduction
A fundamental problem in distributed computing is to determine whether a network configuration satisfies some predicate. In the distributed setting, a network configuration is represented by an underlying graph, where each vertex represents a processor, edges represent communication links between processors, and each vertex has a state. For example, the state of every vertex can be a color, and the predicate signifies that the coloring is proper, i.e., that every edge has its endpoints colored differently. Processors learn about the network by exchanging messages along the edges. Some properties are local by nature and easy to verify, yet many natural problems (for example, testing whether the network contains cycles) cannot be tested in less than diameter time, even if message size and local computational power are unbounded.
In order to cope with strong time lower bounds, Korman, Kutten, and Peleg introduced in [13] a computational model, called proof-labeling schemes (PLS), where vertices are given auxiliary global information in the form of labels. This auxiliary information may allow vertices to verify that a property is satisfied more efficiently than could be achieved without the aid of labels. Specifically, a PLS consists of two components, a prover and a verifier. The prover is an oracle which assigns labels to vertices. The verifier is a distributed algorithm which runs on the labeled configuration and outputs true or false at each vertex as a function of its state, its label, and the labels it receives. A PLS is complete if for every legal configuration (satisfying the predicate), the prover can assign labels such that all vertices output true. The PLS is sound if for every illegal configuration (which does not satisfy the predicate) and for every labeling, some vertex outputs false.
Schemes for verifying a predicate are useful in many applications. One such application is checking the output of a distributed algorithm [3, 8] . For example, if a procedure is meant to output a spanning-tree of the network, it may be useful to periodically verify that the output does indeed not contain cycles. If the original procedure which finds the spanning-tree can additionally produce labels, verification may be achieved substantially faster than diameter time required without the aid of labels. A simple procedure for checking the legality of the current state is very useful in the construction of self stabilizing algorithms [1, 12, 5, 2] . Other applications include estimating the complexity of logics required for distributed run-time verification [8] , establishing a general distributed complexity theory [7] , and proving lower bounds on the time required for distributed approximation [6] .
Distributed verification has been formalized in various models to suit its myriad applications. These models include proof-labeling schemes (PLS) [13], locally checkable proofs (LCP) [9], and non-deterministic local decision (NLD) [7]. All three of these verification schemes are local in the sense that they require a constant number of rounds, independent of the size of the graph. While a fast procedure is certainly a desirable feature in verification algorithms, other computational resources, such as space or communication, may also need to be considered. For example, in the case of PLS, deterministically verifying that a sub-graph is acyclic requires labels of size Ω(log n) per vertex [13]. However, specifying a sub-graph only requires O(∆) space per vertex, where ∆ is the maximum degree of a vertex. Thus, if we restrict attention to local verification algorithms, the space requirement to store labels may be unboundedly larger than the space required to specify the instance.
Korman, Kutten and Masuzawa [12] presented a PLS for minimum spanning-tree with polylogarithmic verification time and logarithmic memory at each vertex. In the present work we also consider super-constant time verification and address tradeoffs between computational resources in distributed verification algorithms: label size, communication, computation space, and time. Specifically, we address the following questions: If verification algorithms are allowed to run for super-constant time, can labels be significantly shorter? What are the tradeoffs between label size and verification time? Can verification be achieved using (per processor) space which is linear in the label size? We focus on the acyclicity problem and prove that labels can indeed be shortened by a factor of t (the run-time of the algorithm) compared to constant-round verification. Moreover, computation space for each vertex can be made linear in the label size. Note that in this model it does not trivially hold that each message contains exactly one label, since in each round every vertex receives a (potentially different) label from each neighbor, and the scheme should specify the message to be sent in the following round. We show that in our schemes messages are small enough so that the total communication is the same as in one-round verification.
Our Contributions
In this paper we consider proof-labeling schemes with super-constant verification time, and analyze tradeoffs between time, label size, message size, and computation space. In Subsection 3.1, we describe a universal scheme which can verify any property P. Suppose the configuration $G_s$ has n vertices and m edges, and each state can be represented using s bits. Then for every $t \in O(\mathrm{diam}(G_s))$, our scheme verifies P in t rounds using labels and messages of size $O((ns + \min\{n^2, m \log n\})/t)$. For t = 1 this is the known universal scheme [13, 9, 4]. When $t \in \Omega(n)$, we get labels and messages of size $O(s + \min\{n, (m/n) \log n\})$. Overall, labels are significantly smaller, and total communication is the same. Subsection 3.2 presents a general technique for proving lower bounds on the label size of t-round schemes.
In Section 4 we consider the problem of determining whether a graph is acyclic. Using the lower bound technique of Subsection 3.2, we prove in Subsection 4.1 that labels of size $\Omega((\log n)/t)$ are required for the acyclic problem. Subsection 4.2 shows that this lower bound is tight. Our scheme for acyclic additionally uses optimal space and messages of size $O((\log n)/t)$. The verifier for acyclic assumes that vertices are given some truthful information about the round number, for example, by being told when (a multiple of) t rounds have elapsed. We prove in Appendix A that such information is necessary for any super-constant and sub-linear time distributed algorithm. In Subsection 4.3, we describe a recursive scheme for acyclic which uses space $O(\log^* n)$ and constant communication per vertex per round. The recursive verifier runs in time O(n) in the worst case, but there are always correct labels which will be accepted in time $O(\log \mathrm{diam}(G))$. We note that in order to break the logarithmic space barrier, our schemes in Subsections 4.2 and 4.3 crucially do not rely upon unique identifiers for the vertices. Conversely, the lower bounds of Subsections 3.2 and 4.1 hold for a stronger model where vertices have unique identifiers, and labels may depend on the unique identifiers.
Related Work
Distributed verification has been studied extensively. It was studied and used in the design of self-stabilizing algorithms, first in [1], where the notion of local detection was introduced, and recently in [12], where a super-constant time verification scheme was presented. Both papers use verification in the design of a self-stabilizing algorithm for constructing a minimum spanning-tree. Verification has also received attention of its own. For example, [11] presented tight bounds for minimum spanning-tree verification. In [13], Korman, Kutten, and Peleg formalized the concept of local verification and introduced the notion of proof-labeling schemes. In their paper, verification is defined to use one communication round, and among other results they show a Θ(log n) bound on the complexity (label size and communication) for acyclic. Recently, [4] suggested using randomization in order to break the lower bounds of deterministic schemes, and among other results they show a Θ(log log n) bound on the communication complexity of acyclicity. In this paper, we show that if we use super-constant verification time, we can break the lower bound on space consumption (label size and computation space), while the total amount of communication is the same as in one deterministic verification round. Proof-labeling schemes with constant verification time greater than one were studied in [9], and a scheme with super-constant verification time was presented in [12]. The question of which properties can be verified in constant verification time was studied in [7], and several complexity classes were presented. These include LD (local decision), which contains all properties that can be decided using a constant number of rounds and no additional information, and NLD (non-deterministic local decision), which contains all properties that can be decided in a constant number of rounds with additional information in the form of a certificate given to each vertex.
While NLD and PLS are closely related, they differ in that NLD certificates are independent of vertex identifiers. Since PLS labels may depend on vertex identifiers, there is a PLS for every sequentially decidable property on ID based networks, while not all sequentially decidable properties are in NLD. Our lower bounds in Subsections 3.2 and 4.1 allow labels to depend on unique vertex identifiers, so our arguments give identical lower bounds for certificate sizes in the weaker NLD model. Nonetheless, the schemes for acyclic in Subsections 4.2 and 4.3 do not require unique identifiers.
Model and Definitions

Computational Framework
A graph configuration $G_s$ consists of an underlying graph G = (V, E), and a state assignment function $\varphi : V \to S$, where S is a state space. The state of a vertex includes all of its local information. It may include the vertex's identity (in an ID based configuration), the weight of its adjacent edges (in a weighted configuration), or the result of an algorithm executed on the graph, for example, its color according to a coloring algorithm.
In a proof-labeling scheme, an oracle assigns labels ℓ : V → L. Verification is performed by a distributed algorithm on the labeled configuration in synchronous rounds. In each round every vertex receives messages from all of its neighbors, performs local computation, and sends a message to all of its neighbors. At the beginning of each round, a vertex scans its messages in a streaming fashion, and the computational space is the maximum space required by a vertex in its local computation. Each vertex may send different messages to different neighbors in a round. When a vertex halts, it outputs true or false. If the vertex labels contain unique identifiers, then we require that an algorithm has the same output for all legal assignments of unique IDs.
Proof-Labeling Schemes and t-PLS
We start with a short description of proof-labeling schemes (PLS) as introduced in [13]. Given a family F of configurations, and a boolean predicate P over F, a PLS for (F, P) is a mechanism for deciding $P(G_s)$ for every $G_s \in F$. A PLS consists of two components: a prover p, and a verifier v. The prover is an oracle which, given any configuration $G_s \in F$, assigns a bit string ℓ(v) to every vertex v, called the label of v. The verifier is a distributed algorithm running concurrently at every vertex. The verifier v at each vertex outputs a boolean. If the outputs are true at all vertices, v is said to accept the configuration, and otherwise (i.e., v outputs false in at least one vertex) v is said to reject the configuration. For correctness, a proof-labeling scheme (p, v) for (F, P) must be (1) complete and (2) sound. Formally, for every $G_s \in F$, we say (p, v) is
1. complete if, whenever $P(G_s)$ = true, the verifier v accepts $G_s$ using the labels assigned by p, and
2. sound if, whenever $P(G_s)$ = false, the verifier v rejects $G_s$ for every label assignment.
The verification complexity of a proof-labeling scheme (p, v), according to [13] , is the maximal label size-the maximal length of a label assigned by the prover p on a legal configuration (satisfying P). A PLS is defined to use one verification round, in which neighbors exchange labels. In this case, label size and message size are the same.
In this paper we consider proof-labeling schemes with more than one verification round, and in particular schemes with super-constant verification time. Hence we define the message size of the scheme (p, v) to be the size of the largest message a vertex sends during the execution of v on a legal configuration with the labels assigned by p. We denote a proof-labeling scheme with t-round verification by t-PLS.
General Space-Time Tradeoff Results
If there exists a PLS for (F, P) with label size κ (and hence, message size κ), then there exists a t-PLS for (F, P) with label size κ and message size κ/t. Indeed, vertices can communicate their κ-bit label in t different shares of size κ/t. In this section we give general results for label size reduction, along with message size, in a t-PLS. The idea is to take a 1-PLS, and break it into smaller shares where vertices are assigned only a single share of the original label. We refer to this technique as label sharing . In particular, we present a universal scheme and provide a tool for obtaining lower bounds.
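The simple observation above, that a κ-bit label can be communicated in t shares of size κ/t, can be sketched as follows. This is only an illustration: the function names `split_label` and `reassemble` are hypothetical, labels are modeled as Python strings of '0'/'1' characters, and shares are taken to have ⌈κ/t⌉ bits so that lengths not divisible by t are handled.

```python
def split_label(label: str, t: int) -> list[str]:
    """Split a bit-string label into t consecutive shares of at most ceil(len/t) bits."""
    if t < 1:
        raise ValueError("t must be positive")
    size = -(-len(label) // t)  # ceiling division
    # Trailing shares may be shorter (or empty) when t does not divide len(label).
    return [label[i * size:(i + 1) * size] for i in range(t)]


def reassemble(shares: list[str]) -> str:
    """Concatenate shares, in order, to recover the original label."""
    return "".join(shares)


label = "1011001110"            # a 10-bit label (kappa = 10)
shares = split_label(label, 4)  # one share is sent in each of t = 4 rounds
```

Sending one share per round keeps the total communication at κ bits per edge while each message carries only about κ/t bits; the label-sharing technique of this section goes further and also stores only one share per vertex.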
Universal t-PLS
A universal scheme is a scheme that verifies every sequentially decidable property. In this subsection we assume that every vertex has an identifier, and identifiers in the same configuration are pairwise distinct. We give an upper bound on the label and message size of a universal scheme that uses t communication rounds.

Theorem 1. Let F be a family of configurations with state set S and diameter at least D, let P be a boolean predicate over F and suppose that every state in S can be represented using s bits. For every $t \in O(D)$ there exists a t-PLS for (F, P) with label and message size $O((ns + \min\{n^2, m \log n\})/t)$, where n is the number of vertices and m is the number of edges in the graph.
In the proof of this theorem we use a known universal PLS [13, 9, 4]. Labels consist of the entire representation of the graph configuration. Vertices then verify that they all have the same representation, and that it is consistent with their local views. Finally, they verify individually that the label represents a legal configuration. Since every configuration can be represented using $O(ns + \min\{n^2, m \log n\})$ bits, by listing the state of each vertex together with an adjacency matrix or an edge list, this is the label (and message) size of this scheme.
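To make the size bound concrete, the following sketch computes a representation size and the resulting share size. It is an illustration under stated assumptions: the helper names are hypothetical, an edge list is charged $2\lceil \log_2 n \rceil$ bits per edge (a constant factor the O-notation hides), and the share size assumes the representation is cut into t/4 parts as in the universal t-PLS.

```python
import math


def representation_bits(n: int, m: int, s: int) -> int:
    """Bits to list all states plus the cheaper of adjacency matrix and edge list."""
    adjacency_matrix = n * n
    id_bits = math.ceil(math.log2(n)) if n > 1 else 1
    edge_list = m * 2 * id_bits  # two endpoint identifiers per edge (assumed encoding)
    return n * s + min(adjacency_matrix, edge_list)


def share_bits(n: int, m: int, s: int, t: int) -> int:
    """Size of one share when the representation is cut into t // 4 parts."""
    parts = t // 4
    if parts < 1:
        raise ValueError("this sketch assumes t >= 4")
    return math.ceil(representation_bits(n, m, s) / parts)


R = representation_bits(8, 7, 2)      # a tree on 8 vertices with 2-bit states
per_vertex = share_bits(8, 7, 2, 8)   # t = 8 rounds, hence 2 parts
```

For sparse graphs the edge list dominates the choice, which is where the $m \log n$ term in the bound comes from.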
The idea of the universal t-PLS is to disperse the configuration representation into shares such that each vertex can collect the purported graph configuration from its t-neighborhood.
Proof. Let F be a family as described in the statement, let P be a boolean predicate over F and $G_s = (V, E, \varphi : V \to S) \in F$. We first describe the scheme. Consider some fixed vertex $v \in V$. For every vertex $u \in V$, let $d = \mathrm{dist}(u, v)$ and define $j \equiv d \bmod (t/4)$. Denote $R = ns + \min\{n^2, m \log n\}$. The universal label of u, denoted by c(u), consists of:
• an orientation label $a(u) \in \{0, 1, 2\}$ which encodes $d \bmod 3$, and
• a share of representation $r(u) \in \{0, 1\}^{(4R)/t}$ which encodes the j-th part (out of t/4 parts, of length $(4R)/t$ each) of the representation of $G_s$.
In the first round, each vertex sends its label to all of its neighbors. In the first t/2 rounds we use the orientation indicated by the orientation label of each neighbor for an efficient pipelining of labels in two directions. The message of every vertex in each of the first t/2 rounds is composed of two parts, one for pipelining of labels towards v and the other for pipelining of labels away from v. For every vertex $u \in V$, let $Y^{(-1)}$ be the set of all neighbors y of u with $a(y) \equiv a(u) - 1 \pmod 3$, and let $Y^{(+1)}$ be the set of all neighbors y of u with $a(y) \equiv a(u) + 1 \pmod 3$. The pipelining towards v is done by receiving labels only from $Y^{(+1)}$ and sending labels only to $Y^{(-1)}$. Let $L_i^{(+1)}$ be the set of labels u received in round i from all its $Y^{(+1)}$ neighbors. The vertex u verifies that all non-empty labels in $L_i^{(+1)}$ are equal, and sends this label to $Y^{(-1)}$. The pipelining away from v is done similarly, with the roles of $Y^{(-1)}$ and $Y^{(+1)}$ reversed. The distinguished vertex v verifies that it has only $Y^{(+1)}$ neighbors, and that in each round all non-empty labels in $L_i^{(+1)}$ are equal, and sends this label to all its neighbors. Every vertex $u \neq v$ verifies that during the first t/2 rounds it has received from $Y^{(-1)}$ two labels (in two different rounds) with 'first in block' indication, f = 1. If the first had also 'v-indication' then u concatenates all 'shares of representation' of these labels, in order, excluding the last. Otherwise (the first had no 'v-indication'), u concatenates all 'shares of representation' of these labels, in reverse order, excluding the first. The distinguished vertex v verifies that it has 'v-indication', 'first in block' indication, and 'orientation label' 0, and concatenates the first t/4 'shares of representation' it sees, in order (including r(v)). Every vertex $u \in V$ considers its concatenation, denoted by g(u), as a representation of a configuration, and verifies that it is consistent with its local view.
In the last t/2 rounds u verifies that for every neighbor w it holds that g(w) = g(u), by sending g(u) in t/2 disjoint shares. Finally, if all verifications succeed, the output of u is whether the configuration represented by g(u) satisfies P.
The label size is O(R/t). In the first t/2 rounds, every message contains exactly two labels, and hence message size is also O(R/t). For every u, by definition, g(u) is the concatenation of at most t/2 'shares of representation' (t/2 rounds, and at most one 'share of representation' is concatenated in each round). Therefore, in the last t/2 rounds every message size is not more than the size of one 'share of representation', which is also O(R/t). So, the label and message size requirements hold.
We now prove the correctness of the scheme. If all vertices output true, by the last part of the scheme we know that they all have the same representation, and that it is consistent with their local view. Therefore, it must be the case that all vertices hold the correct representation of $G_s$. Since all vertices output true, by construction of the scheme, $P(G_s)$ = true. If $P(G_s)$ = true and labels are assigned according to the scheme, we have the following. Denote by $c_j$ the label of a vertex at distance j from v. Let $u \in V$ be a vertex and let $\mathrm{dist}(u, v) = d$. In round i, by construction of the scheme, u receives from $Y^{(-1)}$ (and v from $Y^{(+1)}$) the label $c_{|d-i|}$. If d < t/4, by construction, the first label u receives with 'first in block' indication (after less than t/4 rounds) is $c_0$. Afterwards it receives $c_1, c_2, \ldots, c_{t/4-1}$ and $c_{t/4}$, which is the second with 'first in block' indication. If d ≥ t/4, the first label u receives with 'first in block' indication (after less than t/4 rounds) is not $c_0$, and hence has no 'v-indication'. By construction, it must be $c_Z$, where $Z = (t/4) \cdot k$ for some natural number k > 0. Afterwards it receives $c_{Z-1}, c_{Z-2}, \ldots, c_{Z-t/4+1}$ and $c_{Z-t/4}$, which is the second with 'first in block' indication. It is easy to see that in both cases u constructs the correct representation of $G_s$. Therefore, the equality and local view verifications succeed, and since $P(G_s)$ = true, all vertices output true.
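The role of the orientation labels in the scheme can be illustrated with a short sketch. It assumes the graph is given as an adjacency dictionary and that the labels are honest (computed by breadth-first search from the distinguished vertex); the function names are hypothetical.

```python
from collections import deque


def orientation_labels(adj: dict, root) -> dict:
    """Return a(u) = dist(u, root) mod 3 for every vertex reachable from root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for y in adj[u]:
            if y not in dist:
                dist[y] = dist[u] + 1
                queue.append(y)
    return {u: d % 3 for u, d in dist.items()}


def split_neighbors(adj: dict, a: dict, u):
    """Classify the neighbors of u into Y(-1) (towards root) and Y(+1) (away from root)."""
    toward = [y for y in adj[u] if a[y] == (a[u] - 1) % 3]
    away = [y for y in adj[u] if a[y] == (a[u] + 1) % 3]
    return toward, away


# Path 0 - 1 - 2 - 3 - 4 with distinguished vertex 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
a = orientation_labels(path, 0)
toward, away = split_neighbors(path, a, 3)
```

Three values suffice because, with honest labels, the distances of adjacent vertices differ by at most one; a neighbor at the same distance carries the same label and falls into neither set.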
Lower Bound Tool
We start with some definitions. Although we consider only networks represented by undirected graphs, we will define an orientation on an edge to indicate a specific ordering of its endpoints. We denote by H(e) the head of a directed edge e, and by T(e) the tail of e.

Definition 2 (Edge Crossing). Let $e_1$ and $e_2$ be two directed edges of a graph G. The edge crossing of $e_1$ and $e_2$ in G, denoted $C(e_1, e_2, G)$, is the graph obtained from G by replacing $e_1$ and $e_2$ by the edges $(T(e_1), H(e_2))$ and $(T(e_2), H(e_1))$.
Edge crossings were used many times before, and were formalized as a tool for proving lower bounds of verification complexity in [4] . We now show how to use edge crossing in order to prove lower bounds for label size of t-PLS.
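The crossing operation can be sketched in a few lines. The code below is an illustration only: it assumes directed edges are written as (tail, head) pairs, the undirected graph is stored as a set of two-element frozensets, and the helper names are hypothetical.

```python
def edge_crossing(edges: set, e1: tuple, e2: tuple) -> set:
    """Replace e1, e2 by (T(e1), H(e2)) and (T(e2), H(e1)); edges are (tail, head)."""
    (t1, h1), (t2, h2) = e1, e2
    crossed = set(edges)
    crossed.discard(frozenset(e1))
    crossed.discard(frozenset(e2))
    crossed.add(frozenset((t1, h2)))
    crossed.add(frozenset((t2, h1)))
    return crossed


def components(vertices, edges: set) -> list:
    """Connected components, each returned as a sorted list of vertices."""
    adj = {v: set() for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)
    seen, comps = set(), []
    for v in vertices:
        if v in seen:
            continue
        seen.add(v)
        stack, comp = [v], []
        while stack:
            x = stack.pop()
            comp.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(sorted(comp))
    return comps


# Crossing two edges of the path 0 - 1 - ... - 7 splits off a cycle.
path_edges = {frozenset((i, i + 1)) for i in range(7)}
crossed = edge_crossing(path_edges, (2, 3), (5, 6))
parts = components(range(8), crossed)
```

In this example the crossing keeps a path on the vertices 0, 1, 2, 6, 7 and closes the vertices 3, 4, 5 into a cycle, which is the effect the lower bound argument relies on.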
Proposition 4. Let (p, v) be a deterministic t-PLS for (F, P) with label size |ℓ|. Suppose that there is a configuration $G_s \in F$ which satisfies P and contains r directed edges $e_1, \ldots, e_r$, whose t-neighborhoods $N_t(e_1, G_s), \ldots, N_t(e_r, G_s)$ are pairwise disjoint, contain q vertices each, and there exist r state preserving isomorphisms $\{\sigma_i : V(N_t(e_1, G_s)) \to V(N_t(e_i, G_s)),\ i = 1, \ldots, r\}$ such that $\sigma_i(H(e_1)) = H(e_i)$ and $\sigma_i(T(e_1)) = T(e_i)$. If $|\ell| < (\log r)/q$, then there exist i, j with $1 \le i < j \le r$ such that every connected component of $C(e_i, e_j, G_s)$ is accepted by (p, v).

Proof. Let (p, v) and $G_s$ be as described above, and assume that $|\ell| < (\log r)/q$. Consider a collection $\{\sigma_i : V(N_t(e_1, G_s)) \to V(N_t(e_i, G_s)),\ i = 1, \ldots, r\}$ of r state preserving isomorphisms, such that $\sigma_i(H(e_1)) = H(e_i)$ and $\sigma_i(T(e_1)) = T(e_i)$. Order the vertices of $N_t(e_1, G_s)$ arbitrarily. For every i, consider the concatenation of labels given by p to the vertices of $N_t(e_i, G_s)$, in the order induced by the ordering of $N_t(e_1, G_s)$ and $\sigma_i$. Denote this concatenated string $L_i$. By the label size assumption, it holds that $|L_i| < \log r$ for every i, and thus there are fewer than r different options for $L_i$. Therefore, by the pigeonhole principle, there are $i \neq j$ such that $L_i = L_j$. Denote $C(e_i, e_j, G_s)$ by $G'_s$, and consider the labels provided by p to $G_s$. For every vertex $v \notin N_t(e_i, G_s) \cup N_t(e_j, G_s)$, its t-neighborhood is the same in $G_s$ and in $G'_s$. The neighborhoods $N_t(e_i, G_s)$ and $N_t(e_j, G_s)$ are disjoint, isomorphic, and have the same states and labels according to some isomorphism which maps $H(e_i)$ to $H(e_j)$ and $T(e_i)$ to $T(e_j)$. Thus, for every vertex $v \in N_t(e_i, G_s) \cup N_t(e_j, G_s)$, its t-neighborhood in $G_s$ is the same as in $G'_s$. Since the output of the verifier v at each vertex in $G_s$ is only a function of the states and labels in its t-neighborhood, if the output of v in $G_s$ is true at all vertices, then the output of v in every connected component of $G'_s$ must be true, and the proposition follows.
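The counting step behind the pigeonhole argument can be written out as follows, under the simplifying assumption (made here only for illustration) that every label is padded to exactly $|\ell|$ bits, so that each concatenated string has exactly $q|\ell|$ bits.

```latex
\[
  |L_i| = q\,|\ell| < q \cdot \frac{\log r}{q} = \log r
  \quad\Longrightarrow\quad
  \#\{\text{possible values of } L_i\} = 2^{q|\ell|} < 2^{\log r} = r .
\]
```

Since there are r strings $L_1, \ldots, L_r$ and fewer than r possible values, two of them must coincide.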
The following theorem, which is a consequence of Proposition 4, is the tool we use to prove lower bounds of label size in a t-PLS.
Theorem 5. Let F be a family of configurations, and let P be a boolean predicate over F. Suppose that there is a configuration $G_s \in F$ which satisfies
1. $P(G_s)$ = true,
2. $G_s$ contains r directed edges $e_1, \ldots, e_r$, whose t-neighborhoods $N_t(e_1, G_s), \ldots, N_t(e_r, G_s)$ are pairwise disjoint, contain q vertices each, and there exist r state preserving isomorphisms $\{\sigma_i : V(N_t(e_1, G_s)) \to V(N_t(e_i, G_s)),\ i = 1, \ldots, r\}$ such that $\sigma_i(H(e_1)) = H(e_i)$ and $\sigma_i(T(e_1)) = T(e_i)$, and
3. for every $i \neq j$, there exists a connected component $H_s$ of $C(e_i, e_j, G_s)$ such that $P(H_s)$ = false.
Then the label size of any t-PLS for (F, P) is Ω((log r)/q).
Acyclicity
In this section we focus on the acyclicity property, and give tight t-PLS lower and upper bounds.

Definition 6 (Acyclicity). Let F be the family of all connected graphs. Given a graph configuration $G_s \in F$, acyclic($G_s$) = true if and only if the underlying graph G is cycle free.
Lower Bound for acyclic
Theorem 7. Every scheme which verifies acyclic in t communication rounds requires labels of size Ω ((log n)/t).
Proof. We will show a configuration as described in Theorem 5, with $r = \Omega(n/t)$ and $q = O(t)$, to derive the stated lower bound on the label size of any scheme that verifies acyclic. Let $G_s$ be the n-vertex path $v_0 - v_1 - \cdots - v_{n-1}$ where all states are the empty string. Obviously acyclic($G_s$) = true. Let $r = \lfloor n/(2t + 2) \rfloor - 1$, and consider the set of directed edges $\{e_i = (v_{(2t+2)i}, v_{(2t+2)i+1}) : i = 1, \ldots, r\}$. For every i, the t-neighborhood $N_t(e_i, G_s)$ contains exactly 2t + 2 vertices, and thus q = 2t + 2. Every pair of t-neighborhoods $N_t(e_i, G_s)$ and $N_t(e_j, G_s)$, for $i \neq j$, is disjoint since the distance between $e_i$ and $e_j$ is at least 2t + 1. For every i < j, $C(e_i, e_j, G_s)$ contains exactly two connected components. One of them is the cycle $H_s = v_{qi+1} - v_{qi+2} - \cdots - v_{qj} - v_{qi+1}$ where all its edges are marked. By definition, $P(H_s)$ = false. Hence, the conditions of Theorem 5 are satisfied, and the lower bound follows.
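The parameters of this construction can be checked numerically on small instances. The sketch below assumes, as an illustration, that the chosen edges are $e_i = (v_{qi}, v_{qi+1})$ for i = 1, ..., r with q = 2t + 2, which is consistent with the cycle $v_{qi+1} - \cdots - v_{qj} - v_{qi+1}$ described in the proof; vertices are represented by their indices and the function name is hypothetical.

```python
def path_instance(n: int, t: int):
    """Return q, r, the chosen edges, and their t-neighborhoods on the path v_0..v_{n-1}."""
    q = 2 * t + 2
    r = n // q - 1
    edges = [(q * i, q * i + 1) for i in range(1, r + 1)]
    # Vertices within distance t of either endpoint of e_i.
    hoods = [set(range(q * i - t, q * i + t + 2)) for i in range(1, r + 1)]
    return q, r, edges, hoods


q, r, edges, hoods = path_instance(40, 3)
sizes = [len(h) for h in hoods]
disjoint = all(hoods[i].isdisjoint(hoods[j])
               for i in range(r) for j in range(i + 1, r))
in_range = all(0 <= min(h) and max(h) <= 39 for h in hoods)
```

With n = 40 and t = 3 this gives q = 8 and r = 4, and the four neighborhoods occupy the consecutive index ranges 5 to 12, 13 to 20, 21 to 28, and 29 to 36.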
Upper Bound for acyclic
In this section, we describe a t-PLS for acyclic which matches the lower bound presented in Theorem 7.
Theorem 8. Suppose G = (V, E) is a graph with diameter diam(G). For every $t \le \min\{\log n, \mathrm{diam}(G)\}$, there exists an O(t)-PLS for acyclic with labels and messages of size $O((\log n)/t)$. Further, the verifier v uses space of size $O((\log n)/t)$.
Remark 9. In this subsection, we assume that each vertex has access to some means of deciding (correctly) when t communication rounds have elapsed. This can be achieved either by allowing each vertex a log t bit counter, or by giving each vertex access to an oracle which alarms when (an integer multiple of) t rounds have elapsed. We discuss the necessity of this assumption in Subsection 4.3, and prove that such information is necessary for any distributed algorithm with super-constant and sub-linear run-time in Appendix A.
The following scheme can be used to verify that the graph contains no cycles using labels of size O(log n) in a single round.

Remark 11. An s-cyclic labeling induces an orientation on G where an edge (u, v) is oriented such that u = P(v). That is, each edge is oriented away from the parent.
Lemma 12. Suppose G = (V, E) is a connected graph and ℓ is an s-cyclic labeling. Then either G is acyclic or G contains a unique cycle of length k, where s divides k. Further, if G contains a cycle, C, then C is an oriented cycle in the orientation induced by ℓ, and all oriented paths in G are oriented away from vertices in C.
is a cycle in G. In the orientation described in Remark 11, every vertex has in-degree at most 1. Let deg in (v i ) denote the in-degree of v i in C and similarly
In the latter case, ℓ(v k−1 ) − ℓ(v 0 ) ≡ −k ≡ 0 mod s, and the desired result holds.
Since every vertex $v_i \in C$ has in-degree 1 in C, all edges that leave C must be oriented away from vertices in C. Similarly, any path $w_0, w_1, \ldots, w_k$ with $w_0 \in C$ and $w_i \notin C$ for i ≥ 1 must be oriented away from C. Thus no such path may lead to another cycle C′, nor could another cycle C′ share a path with C. Thus, since G is connected, C must be the unique cycle.
To achieve labels of length O((log n)/t) for acyclic, we simulate the "distance-to-root" scheme described above. The idea is to break the O(log n)-bit labels indicating the distance to the root into shares of size O((log n)/t). Unlike the universal scheme described in Subsection 3.1, vertices do not reconstruct the (log n)-bit distance-to-root labels directly, but check the labeling is correct distributively. Thus the verifier v only uses space linear in the label size.
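As an illustration of the one-round "distance-to-root" idea, the following sketch assumes the classical form of the check: a vertex with label 0 (the root) accepts if all its neighbors have label 1, and any other vertex with label d accepts if exactly one neighbor has label d − 1 and all remaining neighbors have label d + 1. The function names are hypothetical, and the code simulates the local checks centrally rather than as a distributed algorithm.

```python
from itertools import product


def verify_vertex(d: int, neighbor_labels: list) -> bool:
    """Local check at a vertex with label d, given the labels of its neighbors."""
    parents = sum(1 for x in neighbor_labels if x == d - 1)
    children = sum(1 for x in neighbor_labels if x == d + 1)
    if parents + children != len(neighbor_labels):
        return False  # some neighbor is neither one closer nor one farther
    return parents == (0 if d == 0 else 1)


def accepts(adj: dict, labels: dict) -> bool:
    """The configuration is accepted if every vertex accepts."""
    return all(verify_vertex(labels[v], [labels[u] for u in adj[v]]) for v in adj)


# A tree with honest distance labels (root 0) is accepted.
tree = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
tree_ok = accepts(tree, {0: 0, 1: 1, 2: 2, 3: 2})

# On a 4-cycle, every labeling with values in {0, 1, 2, 3} is rejected.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
cycle_ok = any(accepts(cycle, dict(enumerate(ls)))
               for ls in product(range(4), repeat=4))
```

The cycle is rejected because a vertex of maximum label on it would need two neighbors with a smaller label, which violates the "exactly one parent" condition.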
Formally, for a vertex v, an acyclicity label consists of:
• an orientation label a(v) ∈ {0, 1, 2} which defines an orientation on edges away from the root of the tree,
• a block label b(v) ∈ {head, mid, tail} which indicates v's position within a block,
• a block color c(v) ∈ {0, 1}, and
• a distance label $d(v) \in \{0, 1\}^{(\log n)/t}$ which encodes a share of a distance to the root.
It is clear that an acyclicity label can be recorded in $O((\log n)/t)$ bits. The semantics of acyclicity labels are described below.

the block containing $w_0$ and contained in C. Inductively define blocks $B_1, B_2, \ldots \subseteq C$ such that $B_{i+1}$ is a child of $B_i$. By the pigeonhole principle, we must have $B_i = B_j$ for some i < j. However, the correctness of the distance labels implies that
Correct orientation labels
In order to prove Theorem 8, by Proposition 16, it suffices to show there is a verifier v for acyclicity labels which runs in time O(t) using messages and memory of size O((log n)/t). Verification of the correctness of the orientation labels a, block coloring c, and conditions 1 and 3 in the correctness of the block labels b can be accomplished in a single communication round with constant communication. Thus, we must verify conditions 2 and 4 in the correctness of the block labels as well as the correctness of distance labels.
After the initial sharing of labels with neighbors in the first round, the verification algorithm Verify(v, a, b, c, d) continues as follows (see Algorithm 1 for pseudo-code). For t − 1 steps, each vertex relays the message from its parent to all of its children. At the end of t rounds, each vertex verifies that at some point, it received a message from a head vertex.

[Algorithm 1: Verify(v, a, b, c, d); pseudo-code fragment]
...
is_zero ← true
6: end if
7: for i = 1 to t − 1 do
...
if b(w) = head then
10: head_check ← true
...

In each iteration of the algorithm, each vertex stores at most a constant number of labels, hence the memory usage is O((log diam(G))/t) as well. Finally, the overall run-time is 3t. The label sending procedure in lines 7-21 is accomplished in t rounds, while the incrementation procedure in lines 25-7 requires at most 2t rounds: t rounds where the head vertices increment, and another t to propagate carries. In particular, the run-time is O(t).
Recursive Acyclicity Checking
The scheme described in Subsection 4.2 gives asymptotically optimal label size for t ≤ log n. Further, the communication per round and local memory usage are linear in the label size. However, the scheme above crucially requires each vertex to be given a truthful representation of the parameter t. In fact, for $t \in \omega(1) \cap o(n)$, it is necessary for the vertices to be given some truthful information about t (see Appendix A). In this subsection, we describe a verifier for acyclic that only assumes that the space provided to each processor is $O(\log^* n)$. The tradeoff is that our algorithm runs in time which may be linear in n in the worst case.
Theorem 18. There exists an O(n)-PLS for acyclic which uses labels and space of size $O(\log^* n)$. In each round, the communication per edge is O(1).
Remark 19. While verification time in Theorem 18 is O(n) in the worst case, the actual time depends on the labels given to the vertices. In particular, for every acyclic graph G there exists a correct labeling which will be accepted in time O(log diam(G)). Thus there is a tradeoff between the time of the algorithm and the amount of truthful information about t given to the vertices.
The idea of the algorithm is to simulate the verifier Verify (Algorithm 1) without the benefit of truthful information about t. As before, the labels designate blocks of length t. Within each block, the vertices store shares of the distance of that block to the root, where in this case, the shares consist of a single bit. Since t (the length of the block) is not known to the vertices in advance, they must first compute t. However, storing t requires log t bits, so the computed value of t is stored in shares in sub-blocks of length log t. In order to verify the correctness of the sub-blocks, the vertices must count to log t using log log t bits of memory. This value is again stored in shares in sub-sub-blocks of length log log t. This process of recursively verifying the lengths of blocks continues until the block length is constant. Thus $\log^* n$ levels of recursion suffice.

Formally, in our recursive scheme, recursive acyclicity labels closely resemble those in Subsection 4.2. For each vertex v and each level $i = 1, 2, \ldots, k = \log^* n$, we have an associated block label $b_i(v)$ and block color $c_i(v)$. We refer to the labels associated to each i as a level, denoted $L_i$. The top level $L_1$ additionally contains orientation labels a(v) and distance labels d(v) for each vertex. Each level i has an associated length, denoted by $t_i$. We emphasize that the $t_i$ are not initially known to the vertices at the beginning of an execution. The semantics and correctness of the block labels $b_i$ and block colors $c_i$ are precisely the same as those described in Subsection 4.2, where blocks at level i have length $t_i$. As before, the distance labels d(v) encode (a share of) the purported distance of the $L_1$ block containing v to the root.
Definition 20. Suppose $\ell$ is a family of recursive acyclicity labels for a graph $G = (V, E)$. We say that $\ell$ is correct if the $L_1$ labels are correct as in Definition 14, and for $i \ge 2$ the block labels $b_i$ and block colors $c_i$ are correct as in Definition 14 with $t_i = \lfloor \log t_{i-1} \rfloor$.
Remark 21. For simplicity of presentation, we assume that $t_i$ divides $t_{i-1}$ for all $i \ge 2$. Thus, each block in $L_{i-1}$ contains an integral number of sub-blocks. The general case can be obtained by allowing "overlap" of the last sub-block of $B$ in level $i$ with the first sub-block of $B'$ in level $i$, where $B$ is the parent block of $B'$.
Analogously to Proposition 16, we obtain the following result.
Proposition 22. Let G = (V, E) be a graph. Then G is acyclic if and only if it admits a correct family C of recursive acyclicity labels.
It is clear that recursive acyclicity labels are of length $O(\log^* n)$. Indeed, each of the labels in the $\log^* n$ recursive levels has length $O(1)$.
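The growth of the number of levels can be illustrated with a minimal sketch. It assumes logarithms are base 2 and that the recursion stops once a block length is at most a small constant; the function name `level_lengths` and the parameter `stop_at` are hypothetical and do not appear in the scheme itself.

```python
def level_lengths(t1, stop_at=2):
    """Return [t_1, t_2, ...] with t_i = floor(log2(t_{i-1})).

    Assumptions (not fixed by the text): logarithms are base 2, t1 >= 1,
    and the recursion stops once the current length is at most `stop_at`.
    """
    lengths = [t1]
    while lengths[-1] > stop_at:
        # For a positive integer x, floor(log2(x)) equals x.bit_length() - 1.
        lengths.append(lengths[-1].bit_length() - 1)
    return lengths


if __name__ == "__main__":
    # Each level contributes O(1) bits per vertex, so the label length is
    # proportional to the number of levels printed here.
    for t1 in (1000, 65536, 2 ** 65536):
        print(len(level_lengths(t1)))
```

Even for a top-level block length of $2^{65536}$ the list has only five entries, which is the behaviour that the $O(\log^* n)$ bound on label length describes.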
Lemma 23. Let $G = (V, E)$ be a graph, and $C$ a family of recursive acyclicity labels on $G$. Suppose that for some $i$, the labels in $L_{i+1}$ are correct. Then there exists a verifier $v_i$ for the labels in $L_i$ with run-time $O(2^{t_{i+1}})$, constant communication per round, and constant space.
The proof that $d$ is correct when $i = 1$ if and only if no vertex rejects in lines 11-15 in RVerify($i$, $L_i$) is analogous to the argument in Lemma 17. Finally, it is clear that the per-round communication is constant, as is the space requirement (assuming that only levels $L_i$ and $L_{i+1}$ are stored). As for the run-time, notice that Count(ctr, $m$, $i$) always terminates in time at most $2^{m t_{i+1}}$ by the verification at lines 11-13 of Increment. Further, if no vertex fails during the call to Count, then Add and Send will similarly halt after $2^{t_{i+1}} \le t_i$ rounds.
Proof of Theorem 18. By Proposition 22, it suffices to prove the existence of a verifier $v$ of recursive acyclicity labels with the claimed communication, space, and time. We show by induction on $k - i$ (where $k = \log^* n$) that the correctness of $L_i$ can be verified in the desired run-time, using constant communication and space. When $i = k$, the correctness of labels is a local property (independent of the size of the network 
We describe a verifier RVerify (Algorithm 2) for $L_i$ assuming $L_{i+1}$ is correct. Suppose $B$ is a block in level $i$, and $B_1, B_2, \ldots, B_s$ are its sub-blocks for $s = t_i / t_{i+1}$, with $B_j$ the parent of $B_{j+1}$. By assumption, the block labels for the $B_j$ are correct. The head $v_0$ of $B$ verifies that it is also the head of $B_1$, and sends a token $T_{\mathrm{count}}$ to all of its children. The vertices in $B$ bounce $T_{\mathrm{count}}$ to the tail, which then bounces $T_{\mathrm{count}}$ back up to $v_0$. Meanwhile, the vertices of each $B_j$ hold shares of a counter $\mathit{tcount}_j$, which computes $t_i$ by incrementing itself until $T_{\mathrm{count}}$ returns to the head. If the counter $\mathit{tcount}_j$ ever exceeds $2^{t_{i+1}}$ (i.e., if the bit held by the tail of $B_j$ is ever incremented twice), then the vertices in $B_j$ halt and reject the label. It is clear that this step of the verification always halts in time $O(2^{t_{i+1}})$. After counting, the blocks in $L_{i+1}$ verify that they agree on $\mathit{tcount}_j$. Further, the tails of the $B_j$ verify that their share of $\mathit{tcount}_j$ is 1, implying that $2^{t_{i+1}-1} < t_i \le 2^{t_{i+1}}$.
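The shared counter used in this step can be sketched as follows. This is a centralized toy model rather than the distributed procedure: it assumes the head of a sub-block holds the least significant bit and the tail the most significant bit, and it counts one tick per unit of block length. The names `increment` and `count_block_length` are hypothetical. With these conventions the sketch accepts exactly when $2^{t_{i+1}-1} \le t_i \le 2^{t_{i+1}} - 1$, so the boundary cases may differ slightly from the inequality stated in the text.

```python
def increment(bits):
    """Add 1 to a counter stored as one bit per vertex of a sub-block.

    Assumption: bits[0] is the share at the head (least significant) and
    bits[-1] is the share at the tail (most significant).
    Returns False on overflow, i.e. when a carry leaves the tail.
    """
    for j in range(len(bits)):
        if bits[j] == 0:
            bits[j] = 1
            return True
        bits[j] = 0  # carry moves one vertex toward the tail
    return False


def count_block_length(t_i, t_next):
    """Count t_i ticks using t_next one-bit shares.

    Returns (accepted, bits): rejected on overflow, otherwise accepted
    exactly when the tail's share is 1 after counting.
    """
    bits = [0] * t_next
    for _ in range(t_i):
        if not increment(bits):
            return False, bits
    return bits[-1] == 1, bits


if __name__ == "__main__":
    print(count_block_length(12, 4))  # 12 needs exactly 4 bits
    print(count_block_length(5, 4))   # 5 leaves the tail's share at 0
    print(count_block_length(16, 4))  # 16 overflows a 4-bit counter
```

Each vertex touches only its own bit and a carry from its neighbor, which is why the per-vertex memory stays constant in this model.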
There is a slight complication in the verification algorithm described above that arises when a block $B$ terminates prematurely in a leaf (a vertex of degree 1) which is not a tail. In correct block labels, if $v_0$ is the head of overlapping complete blocks (i.e., all have tails at distance $t_i$ from the head), then $v_0$ should receive $T_{\mathrm{count}}$ from all of its children at the same time, $2t_i$. However, if some block containing $v_0$ is incomplete (terminates prematurely with a leaf), then $v_0$ may receive messages from its children in different rounds. To avoid this problem, leaves which are not labeled tail respond with a token $T_{\mathrm{leaf}}$ to their parent upon receiving $T_{\mathrm{count}}$. The parent then knows not to expect a $T_{\mathrm{count}}$ from this child. Similarly, if an internal vertex receives $T_{\mathrm{leaf}}$ from all of its children (perhaps in different rounds), it sends $T_{\mathrm{leaf}}$ to its parent. Vertices then check that they receive $T_{\mathrm{count}}$ from all children at the same time, except from those which have sent $T_{\mathrm{leaf}}$ in a previous round.
Finally, if $i = 1$, the vertices must additionally verify the correctness of the distance labels $d(v)$.
Remark 24. We can modify the recursive scheme described here to use only finitely many levels of recursion, but with the tradeoff of using more memory per vertex. In particular, if only the labels of $L_1$ are given, but each vertex has access to a counter with $\log t$ bits of memory, we recover precisely the scheme of Subsection 4.2 in the case where $t = \Omega(\log n)$. If we give labels in $L_1$ and $L_2$, and each vertex has a counter with $\log\log t$ bits of memory, then the scheme will still be correct. However, we get a greater degradation of run-time due to round-off errors in $\log\log t$. Specifically, if we have $m - 1 < \log\log t \le m$, then we obtain
Thus, even if $\log\log t$ is given truthfully as the size of the counter, the run-time of RVerify may be quadratic in $t$ if the $L_1$ labels are improperly formed. Finally, given labels $L_1$, $L_2$, and $L_3$, and counters of size $\log^{(3)} t$, the run-time may vary exponentially from $\log n$. Thus, our worst-case run-time is already only $O(n)$. The fully recursive scheme thus achieves the same worst-case run-time with $\log^* n$ memory per vertex.
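One way to see where the quadratic factor in the two-level case comes from is the following short calculation. It assumes logarithms are base 2 and that $m$ denotes the number of bits of the counter, as in the remark; it is a sketch of the arithmetic and not necessarily the exact display used in the original argument.

```latex
\[
  m - 1 < \log\log t \le m
  \;\Longrightarrow\;
  2^{2^{m-1}} < t \le 2^{2^{m}} = \bigl(2^{2^{m-1}}\bigr)^{2} < t^{2}.
\]
```

In words, a counter of $m$ bits can only bound the block length by $2^{2^m}$, which may be as large as (but is strictly less than) $t^2$.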
