Abstract. Path checking, the special case of the model checking problem where the model under consideration is a single path, plays an important role in monitoring, testing, and verification. We prove that for linear-time temporal logic (LTL), path checking can be efficiently parallelized. In addition to the core logic, we consider the extensions of LTL with bounded-future (BLTL) and past-time (LTL+Past) operators. Even though both extensions improve the succinctness of the logic exponentially, path checking remains efficiently parallelizable: our algorithm for LTL, LTL+Past, and BLTL+Past is in $\mathrm{AC}^1(\mathrm{logDCFL}) \subseteq \mathrm{NC}$.
1. Introduction
Linear-time temporal logic (LTL) is the standard specification language to describe properties of computation paths. The problem of checking whether a given finite path satisfies an LTL formula plays a key role in monitoring and runtime verification [14, 12, 7, 2, 5] , where individual paths are checked either online, during the execution of the system, or offline, for example based on an error report. Similarly, path checking occurs in testing [3] and in several static verification techniques, notably in Monte-Carlo-based probabilistic verification, where large numbers of randomly generated sample paths are analyzed [31] .
Somewhat surprisingly, given the widespread use of LTL, the complexity of the path checking problem is still open [26] . The established upper bound is P: The algorithms in the literature traverse the path sequentially (cf. [12, 26, 14] ); by going backwards from the end of the path, one can ensure that, in each step, the value of each subformula is updated in constant time, which results in running time that is quadratic in the size of the formula plus the length of the path. The only known lower bound is NC 1 [9] , the complexity of evaluating Boolean expressions. The large gap between the bounds is especially unsatisfying in light of the recent trend to implement path checking algorithms in hardware, which is inherently parallel. For example, the IEEE standard temporal logic PSL [15] , an extension of LTL, has become part of the hardware description language VHDL, and several tools [7, 5, 11] are available to synthesize hardware-based monitors from assertions written in PSL. Can we improve over the sequential approach by evaluating entire blocks of path positions in parallel?
Parallelizing LTL path checking. We show that LTL path checking can indeed be parallelized efficiently. Our approach is inspired by work in the related area of evaluating monotone Boolean circuits [13, 10, 20, 4, 25, 6] . Rather than sequentially traversing the path, we consider the circuit that results from unrolling the formula in positive normal form over the path using the expansion laws of the logic. Using the positive normal form of the formula ensures that the resulting circuit is monotone. The size of the circuit is quadratic in the size of formula plus the size of the path. For logarithmic measures, the circuit thus is of the same order as the input. Figure 1 shows such a circuit for the formula ((a U b) U (c U d)) U e and a path of length 5. Yang [30] and, independently, Delcher and Kosaraju [8] have shown that monotone Boolean circuits can be evaluated efficiently in parallel if the graph of the circuit has a planar embedding. Unfortunately, this condition is already violated in the simple example of Figure 1 as shown in Figure 2 . Individually, however, each operator results in a planar circuit: for example, d U e results in e 0 ∨ (d 0 ∧ (e 1 ∨ (d 1 ∧ . . .) · · · ). The complete formula thus defines a tree of planar circuits.
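The unrolling of a single Until operator can be sketched in a few lines of Python. This is only an illustration of the expansion described above, not the construction used later in the paper; the function name `unroll_until` and the textual gate syntax (`|` for or, `&` for and) are assumptions made for this example.

```python
def unroll_until(left, right, n):
    """Unroll `left U right` over a path of length n (n >= 1).

    Gate names are formed as <proposition><position>, e.g. "d0" or "e1".
    At the last position the strong Next operator is false, so only the
    right argument remains there.
    """
    expr = f"{right}{n - 1}"
    # Apply the expansion law once per position, from the end of the path.
    for i in range(n - 2, -1, -1):
        expr = f"({right}{i} | ({left}{i} & {expr}))"
    return expr


if __name__ == "__main__":
    print(unroll_until("d", "e", 3))
```

The nesting depth grows by one per path position, which mirrors the chain shape of the circuit that makes each individual operator planar.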
Our path checking algorithm works on this tree of circuits. We perform a parallel tree contraction [1, 19, 18] to collapse a parent node and its children nodes into a single planar circuit. Simple paths in the tree immediately collapse into a planar circuit; the remaining binary tree is contracted incrementally, until only a single planar circuit remains. The key insight for this construction is that a contraction can be carried out as soon as one of the children has been evaluated. Initially, all leaves correspond to atomic propositions. During the contraction all leaves are evaluated. Because no leaf has to wait for the evaluation of its sibling before it can be contracted with its parent, we can contract a fixed portion of the nodes in every sequential step, and therefore terminate in at most a logarithmic number of steps. The path checking problem can, hence, be parallelized efficiently. The key properties of LTL that are exploited in the construction are the existence of a positive normal form of linear size and expansion laws that, when iteratively applied, increase the size of the Boolean circuit only linearly in the number of iteration steps. The combinatorial structure of the resulting circuit allows for an efficient reduction of the evaluation problem to the evaluation problem of planar Boolean circuits. In addition to planarity, our construction maintains some further technical invariants, in particular that the circuits have all input gates on the outer face. Analyzing this construction, we obtain the result that the path checking problem is in AC 1 (logDCFL):
Theorem 1. The LTL path checking problem is in $\mathrm{AC}^1(\mathrm{logDCFL})$.
The $\mathrm{AC}^1(\mathrm{logDCFL})$ complexity results from a reduction that consists of an outer algorithm, which can be implemented as a uniform family of Boolean circuits of logarithmic depth, and an inner operation, which is represented by unbounded fan-in gates that are embedded in the circuits and serve as logDCFL oracles. The $\mathrm{AC}^1$ complexity of the outer reduction is due to the tree contraction algorithm. Throughout the contraction, each atomic contraction operation processes a sub-circuit whose size is of the same order as the overall input circuit. Hence, the oracle gates have non-constant fan-in. The whole tree contraction is performed in a logarithmic number of parallel steps. Thus, the contraction algorithm implements an $\mathrm{AC}^1$ reduction. Within the $\mathrm{AC}^1$ contraction circuits, the oracle gates evaluate a certain class of monotone planar Boolean circuits, an operation that is in logDCFL. In summary, the overall path checking algorithm consists of two consecutive reduction steps: First, the LTL path checking problem is reduced (in logarithmic space) to the problem of evaluating a certain class of monotone Boolean circuits. Second, the problem of evaluating those circuits is $\mathrm{AC}^1$-reduced to the problem of evaluating a certain class of monotone planar Boolean circuits. The latter problem is solved by a logDCFL oracle.
The LTL path checking problem is closely related to the membership problems for the various types of regular expressions: the membership problem is in NL for regular expressions [16], in logCFL for semi-extended regular expressions [28], and P-complete for star-free regular expressions and extended regular expressions [27]. Of particular interest is the comparison to the star-free regular expressions, since they have the same expressive power as LTL on finite paths [24]. With $\mathrm{AC}^1(\mathrm{logDCFL})$ vs. P, our result demonstrates a computational advantage for LTL.
LTL with past and bounded-future operators. Practical temporal logics like PSL extend LTL with additional operators that help the user to write shorter and simpler specifications. Such extensions often come at a price: adding extended regular expressions, for example, makes the path checking problem P-complete [27] . We show that this is not always the case: past-time and bounded operators are two major extensions of LTL, which both improve the succinctness of the logic exponentially, and whose path checking problems remain efficiently parallelizable.
Past-time operators are the dual of the standard modalities, referring to past instead of future events. Past-time operators greatly simplify properties like "b is always preceded by a", which, in the core logic, require an unintuitive application of the Until operator, as in G ¬(¬a U b ∧ ¬a). Furthermore, Laroussinie, Markey and Schnoebelen [23] proved that the property "all future states that agree with the initial state on propositions p 1 , p 2 , . . . p n , also agree on proposition p 0 ," which can obviously be expressed as a simple past-time formula, requires an exponentially larger formula if only future-time operators are allowed. However, since past operators are the dual of future operators, they also result in planar circuits; hence, the construction for LTL can directly be applied to the tree of circuits that results from LTL formulas with unbounded past and future operators, and we obtain the following result:
Theorem 2. The LTL+Past path checking problem is in $\mathrm{AC}^1(\mathrm{logDCFL})$.
Bounded operators express that a condition holds for at least a given, fixed number of steps, or must occur within such a number of steps. Bounded specifications are especially useful in monitoring applications [11], where unbounded modalities are problematic: if only the finite prefix of a computation is visible, it is impossible to falsify an unbounded liveness property or validate an unbounded safety property. The succinctness of the bounded operators is due to the fact that expanding the bounded operators into a formula tree replicates subformulas, causing an exponential blow-up in the formula size. Another exponential blow-up is due to the logarithmic encoding of the bounds compared to a unary encoding in the form of nested next-operators.
EFFICIENT PARALLEL PATH CHECKING FOR LTL WITH PAST AND BOUNDS
Figure 3. The circuit for the bounded formula $\varphi\, U_3\, \psi$. Since the red colored subgraph is a $K_{3,3}$, the circuit has no planar embedding. However, if the $\varphi_i$-gates are constants, then propagating the constants eliminates the edges that prevent the shown embedding from being planar.
A naive solution for the path checking problem of the extended logic would be to simply unfold the formula to the core fragment and then apply the construction described above
for the LTL operators. Because of the doubly exponential blow-up, however, such a solution would no longer be in NC. If we instead apply the expansion laws for the bounded operators to the original formula, we obtain a circuit of polynomial size, but with a more complex structure. Because of this more complex structure, the path checking construction that we described above for the core logic is no longer applicable. Consider the circuit corresponding to the bounded formula φ U 3 ψ, shown in Figure 3 : Since the graph of the circuit contains a K 3,3 subgraph, it has no planar embedding. Translating a formula with bounded operators to a tree of circuits would thus include non-planar circuits, which in general cannot be evaluated efficiently in parallel. The key insight of the construction for the extended logic is that, although the circuit for the bounded operators is not planar a priori, an equivalent planar circuit can be constructed as soon as one of the direct subformulas has been evaluated. Suppose, for example, that the φ i -gates in the circuit shown in Figure 3 are constants. Propagating these constants eliminates all edges that prevent the shown embedding from being planar! In general, simple propagation is not enough to make the circuit planar. This is illustrated in Figure 4 , where the same formula is analyzed under the assumption that the ψ i -gates are constant. While the propagation of the constants replaces parts of the circuit (identified by the dotted lines) with constants, there remain references to φ i -gates, e.g., the two references to φ 2 , that prevent the shown embedding from being planar. However, an equivalent planar circuit exists: This circuit, shown in Figure 4 as a gray overlay, replaces the disturbing references to the φ i -gates by vertical edges to subcircuits. For example, the first occurrence of φ 2 in
) is replaced with an edge to the subcircuit φ 2 ∧ (0 ∨ (ψ 3 ∧ 1)). The resulting circuit is equivalent, because the additional conjunct is redundant. Based on these observations, we present a translation from bounded temporal formulas to circuits that is guaranteed to produce planar circuits, but requires that one of the direct subformulas has already been evaluated. To meet this requirement, our path checking algorithm generates the circuits on-the-fly: a circuit for a subformula φ is constructed only when a direct subformula of φ is already evaluated. In this way, we avoid the construction of circuits that cannot be evaluated efficiently in parallel. As in the algorithm for LTL, we evaluate a fixed portion of the subformulas in every sequential step and thus terminate in time logarithmic in the size of the formula (bounds are encoded in O(1)) plus the length of the path. We prove the following result:
Theorem 3. The BLTL+Past path checking problem is in $\mathrm{AC}^1(\mathrm{logDCFL})$.
The remainder of the paper is structured as follows. After preliminaries in Section 2, Section 3 discusses the evaluation of monotone Boolean circuits. We introduce transducer circuits, which are circuits with a defined interface of input and output gates, and show that the composition of two planar transducer circuits can be computed in logDCFL. In Section 4, we describe the on-the-fly translation of BLTL+Past-formulas to planar circuits. In Section 5, we present the parallel path checking algorithm. We conclude with pointers to open questions in Section 6.
2. Preliminaries
2.1. Linear-Time Temporal Logic. We consider linear-time temporal logic (LTL) with the usual finite-path semantics, which includes a weak and a strong version of the Next operator [24] . Let P be a set of atomic propositions. The LTL formulas are defined inductively as follows: every atomic proposition p ∈ P is a formula. If φ and ψ are formulas, then so are
The size of a formula $\varphi$ is denoted by $|\varphi|$. LTL formulas are evaluated over computation paths. A path $\rho = \rho_0, \dots, \rho_{n-1}$ is a finite sequence of states where each state $\rho_i$ for $i = 0, \dots, n-1$ is a valuation $\rho_i \in 2^P$ of the atomic propositions. The length of $\rho$ is $n$ and is denoted by $|\rho|$. Given an LTL formula $\varphi$, a nonempty path $\rho$ satisfies $\varphi$ at position $i$ ($0 \le i < |\rho|$), denoted by $(\rho, i) \models \varphi$, if one of the following holds:
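An LTL formula $\varphi$ is satisfied by a nonempty path $\rho$ (denoted by $\rho \models \varphi$) iff $(\rho, 0) \models \varphi$. By $\varphi(\rho)$ we denote the Boolean sequence $s \in \mathbb{B}^{|\rho|}$ with $s_i = 1$ if and only if $(\rho, i) \models \varphi$ for $0 \le i < |\rho|$.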
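As an illustration of the sequence φ(ρ), the following Python sketch computes it with the sequential backward traversal mentioned in the introduction. The tuple encoding of formulas and the operator tags (`"ap"`, `"not"`, `"and"`, `"or"`, `"X"` for strong Next, `"Xw"` for weak Next, `"U"`) are assumptions made for this example, and the finite-path semantics used is the usual one; it is a baseline sketch, not the parallel algorithm of this paper.

```python
def evaluate(formula, path):
    """Return phi(rho) as a list of Booleans, one entry per path position.

    `path` is a nonempty list of sets of atomic propositions.
    """
    n = len(path)
    op = formula[0]
    if op == "ap":
        return [formula[1] in state for state in path]
    if op == "not":
        return [not v for v in evaluate(formula[1], path)]
    if op in ("and", "or"):
        left = evaluate(formula[1], path)
        right = evaluate(formula[2], path)
        if op == "and":
            return [a and b for a, b in zip(left, right)]
        return [a or b for a, b in zip(left, right)]
    if op in ("X", "Xw"):
        sub = evaluate(formula[1], path)
        # Shift by one position; the last position is true only for weak Next.
        return sub[1:] + [op == "Xw"]
    if op == "U":
        left = evaluate(formula[1], path)
        right = evaluate(formula[2], path)
        out = [False] * n
        successor = False  # strong Next is false beyond the end of the path
        for i in range(n - 1, -1, -1):  # traverse the path backwards
            out[i] = right[i] or (left[i] and successor)
            successor = out[i]
        return out
    raise ValueError(f"unknown operator: {op}")
```

Each subformula is processed once per path position with constant work, so for a formula tree the running time is proportional to the product of formula size and path length.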
An LTL formula $\varphi$ is said to be in positive normal form if in $\varphi$ only atomic propositions appear in the scope of the symbol $\neg$. The following dualities ensure that each LTL formula $\varphi$ can be rewritten into a formula $\varphi'$ in positive normal form with $|\varphi'| = O(|\varphi|)$.
The semantics of LTL implies the expansion laws, which relate the satisfaction of a temporal formula in some position of the path to the satisfaction of the formula in the next position and the satisfaction of its subformulas in the present position:
We now extend LTL with the past-time operators Y ∃ (strong Yesterday), Y ∀ (weak Yesterday), S (Since), and T (Trigger) with the following semantics:
We call the resulting logic linear-time temporal logic with past (LTL+Past). The following dualities ensure that each LTL+Past formula $\varphi$ can be rewritten into a formula $\varphi'$ in positive normal form with $|\varphi'| = O(|\varphi|)$.
The expansion laws for the past operators are
To obtain linear-time temporal logic with past and bounds (BLTL+Past) we further add the bounded temporal operators U b , R b , S b , and T b , where b ∈ N is any natural number.
(For technical reasons, the size of a formula is defined using unary encoding for the bounds. However, our results are actually independent of the encoding of the bounds.) The semantics of the bounded operators is defined as follows:
The following dualities apply to the BLTL+Past operators:
The expansion laws for the bounded operators are defined as follows for b ∈ N:
φ r for b = 0, and
We are interested in determining if a formula is satisfied by a given path. This is the path checking problem.
Definition 2.1 (Path Checking Problem). The path checking problem for LTL (LTL+Past, BLTL+Past) is to decide, for an LTL (LTL+Past, BLTL+Past) formula φ and a nonempty path ρ, whether ρ |= φ.
Later in this paper we will present a path checking algorithm for BLTL+Past. The algorithm constructs a circuit that is of polynomial size in the length of the input computation path and in the size of the input formula including the sum of the bounds. However, we do not want the complexity of the algorithm to depend on the encoding of the bounds. The following lemma allows us to prune the size of the bounds that occur in a BLTL+Past formula to the length of the computation path.
Lemma 2.2. Given a BLTL+Past formula $\varphi$ and a finite computation path $\rho$, let $\varphi'$ be the BLTL+Past formula obtained from $\varphi$ by setting each bound $n$ in $\varphi$ to $\min(n, |\rho|)$. Then $\rho \models \varphi$ if and only if $\rho \models \varphi'$.
Proof. By induction over φ.
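The effect of the lemma can be observed on a single bounded Until operator. The sketch below assumes the following semantics, chosen so that a bound of 0 reduces the formula to its right argument: the right argument must hold at some position at most b steps ahead, and the left argument must hold at all positions before it. The function name `bounded_until` and the list encoding of subformula values are assumptions made for this example.

```python
def bounded_until(left, right, b):
    """Evaluate `left U_b right` at every position of a finite path.

    `left` and `right` are Boolean lists holding the values of the two
    subformulas at each path position.
    """
    n = len(right)
    out = []
    for i in range(n):
        holds = False
        # Positions beyond the end of the path can never contribute.
        for j in range(i, min(i + b, n - 1) + 1):
            if right[j]:
                holds = True
                break
            if not left[j]:
                break
        out.append(holds)
    return out
```

Because the inner loop never looks past the last position, any bound larger than the path length yields the same result as the path length itself, which is what allows bounds to be pruned before the circuit is built.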
2.2. Complexity classes within P.
We assume familiarity with the standard complexity classes within P. NC is the class of decision problems decidable in polylogarithmic time on a parallel computer with a polynomial number of processors. L is the class of problems that can be decided by a logspace-restricted deterministic Turing machine. logDCFL is the class of problems that can be decided by a logspace- and polynomial-time-restricted deterministic Turing machine that is additionally equipped with a stack. $\mathrm{AC}^i$, $i \in \mathbb{N}$, denotes the class of problems decidable by polynomial-size unbounded fan-in Boolean circuits of depth $O(\log^i n)$. AC is defined as $\bigcup_{i \in \mathbb{N}} \mathrm{AC}^i$. Throughout the paper, all circuits are assumed to be uniform. Often we use functional versions of complexity classes. Since in our case the output size of the functions is always polynomially bounded, we can use a polynomial number of circuits for the corresponding class of decision problems, each for computing a single bit of the output. Thus, in the following we do not explicitly distinguish between decision problems and functional problems [17]. It holds that
Further details can be found in the survey paper by Johnson [17]. Given a problem $P$ and a complexity class $C$, $P$ is $\mathrm{AC}^1$ Turing reducible to $C$ (denoted as $P \in \mathrm{AC}^1(C)$) if there is a family of $\mathrm{AC}^1$ circuits with additional unbounded fan-in $C$-oracle gates that decides $P$. It holds that
For further details on $\mathrm{AC}^1$ reductions, we refer to [29].
2.3. Parallel Tree Contraction. The path checking algorithm presented in this paper relies on efficient parallel tree contraction. Here we follow the approach of [1] and [19]. A rooted binary tree is called regular if all inner nodes have exactly two children. Let $T_0 = \langle V_0, E_0 \rangle$ be an ordered, rooted, regular, binary tree. A contraction step on $T_i$ takes a leaf $l$ of $T_i$, its sibling $s$, and its parent $p$ and contracts these nodes into a single node $s$ in the tree $T_{i+1} = \langle V_{i+1}, E_{i+1} \rangle$ with
, and
Using the fact that a contraction step is a local operation, it is possible to perform contraction steps in parallel on non-overlapping subtrees. A tree contraction on an ordered, rooted, regular, binary tree T is a process that iteratively applies contraction steps on the tree T until it is contracted into a singleton tree. Algorithm 2.3 from [18] performs a tree contraction in log n stages of parallel contraction steps.
Algorithm 2.3.
Input: an ordered, rooted, regular, binary tree T with n leaves.
Effect: contracts T into a singleton tree.
    Number the leaves in order from left to right as 1, . . . , n.
    for log n iterations do
        Apply the contraction step to all odd numbered leaves that are the left child of their parent.
        Apply the contraction step to all odd numbered leaves that are the right child of their parent.
        Update the numbering of the remaining (even numbered) leaves by dividing each leaf number by two.
    end for
The algorithm can be implemented on an exclusive read exclusive write random access memory machine (EREW PRAM) such that it runs in time O(log n) with a total work of O(n) [18] . It is well known that problems that can be solved on an EREW PRAM in time O(log n) with polynomial total work are contained in AC 1 [29] . Figure 5 shows a tree contraction process for an example tree.
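The stages of Algorithm 2.3 can be simulated sequentially to see how the leaf numbering drives the contraction. The following Python sketch is such a simulation under simplifying assumptions: the `Node` class, the helper names, and the sequential inner loops (which stand in for the parallel contraction steps of one stage) are all hypothetical and not taken from [18].

```python
class Node:
    """Node of a regular binary tree: a leaf, or an inner node with two children."""

    def __init__(self, left=None, right=None):
        self.left, self.right, self.parent = left, right, None
        self.number = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def leaves_in_order(node):
    if node.left is None:
        return [node]
    return leaves_in_order(node.left) + leaves_in_order(node.right)


def contract_leaf(leaf, root):
    """Contraction step: drop `leaf` and its parent, the sibling moves up."""
    parent = leaf.parent
    sibling = parent.right if parent.left is leaf else parent.left
    grandparent = parent.parent
    sibling.parent = grandparent
    leaf.parent = None  # marks the leaf as removed
    if grandparent is None:
        return sibling  # the sibling becomes the new root
    if grandparent.left is parent:
        grandparent.left = sibling
    else:
        grandparent.right = sibling
    return root


def contract_tree(root):
    """Contract the tree to a single node; return that node and the stage count."""
    leaves = leaves_in_order(root)
    for number, leaf in enumerate(leaves, start=1):
        leaf.number = number
    stages = 0
    while len(leaves) > 1:
        for side in ("left", "right"):  # left children first, then right children
            for leaf in leaves:
                parent = leaf.parent
                if leaf.number % 2 == 1 and parent is not None \
                        and getattr(parent, side) is leaf:
                    root = contract_leaf(leaf, root)
        leaves = [leaf for leaf in leaves if leaf.number % 2 == 0]
        for leaf in leaves:
            leaf.number //= 2
        stages += 1
    return root, stages


def comb(n):
    """Build a left-deep regular binary tree with n leaves (test helper)."""
    node = Node()
    for _ in range(n - 1):
        node = Node(node, Node())
    return node
```

Every stage removes all odd-numbered leaves, so the number of leaves is at least halved per stage even for a degenerate tree such as the one built by `comb`.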
In order to use the parallel tree contraction algorithm to compute some function f on a labeled tree, the contraction step is piggybacked with a local operation on the labels of the nodes involved in the contraction step. In order for f to be in AC 1 , the individual contraction steps must be performed in constant time. For our constructions this is not the case. However, by piggybacking the contraction step with operations that for some complexity class C are solvable with C-oracle gates, the problem of computing f is AC 1 -reduced to C. Hence, by showing that the complexity of the contraction step is in C, the overall complexity of f is proven to be in AC 1 (C).
3. Monotone Boolean Circuits
A monotone Boolean circuit $\langle \Gamma, \gamma \rangle$ consists of a set $\Gamma$ of gates and a gate labeling $\gamma$. For a circuit $G = \langle \Gamma, \gamma \rangle$, $\mathrm{const}(G)$ denotes the set of all constant gates in $\Gamma$. If $\Gamma = \mathrm{const}(G)$, we call $G$ constant. By $\mathrm{var}(G)$ the set of all variable gates of $\Gamma$ is denoted. Finally we define $\mathrm{src}(G)$ to be the set of all variable gates and all constant gates that are not sink gates in $\Gamma$. In the following, we assume that all circuits are monotone Boolean circuits. We omit the labeling whenever it is clear from the context and identify the circuit with its set of gates.
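3.1. Circuit evaluation. The evaluation of a circuit $\langle \Gamma, \gamma \rangle$ is the (unique) circuit $\langle \Gamma, \gamma' \rangle$ where for each gate $g \in \Gamma$ the following holds:
• $\gamma'(g) = 0$ iff $\gamma(g) = \langle \mathrm{and}, l, r \rangle$ and $\gamma'(l) = 0$ or $\gamma'(r) = 0$,
• $\gamma'(g) = 1$ iff $\gamma(g) = \langle \mathrm{and}, l, r \rangle$ and $\gamma'(l) = 1$ and $\gamma'(r) = 1$,
• $\gamma'(g) = \langle \mathrm{id}, l \rangle$ iff $\gamma(g) = \langle \mathrm{and}, l, r \rangle$ and $\gamma'(l) \notin \{0, 1\}$ and $\gamma'(r) = 1$,
• $\gamma'(g) = \langle \mathrm{id}, r \rangle$ iff $\gamma(g) = \langle \mathrm{and}, l, r \rangle$ and $\gamma'(r) \notin \{0, 1\}$ and $\gamma'(l) = 1$,
• $\gamma'(g) = 0$ iff $\gamma(g) = \langle \mathrm{or}, l, r \rangle$ and $\gamma'(l) = 0$ and $\gamma'(r) = 0$,
• $\gamma'(g) = 1$ iff $\gamma(g) = \langle \mathrm{or}, l, r \rangle$ and $\gamma'(l) = 1$ or $\gamma'(r) = 1$,
• $\gamma'(g) = \langle \mathrm{id}, l \rangle$ iff $\gamma(g) = \langle \mathrm{or}, l, r \rangle$ and $\gamma'(l) \notin \{0, 1\}$ and $\gamma'(r) = 0$,
• $\gamma'(g) = \langle \mathrm{id}, r \rangle$ iff $\gamma(g) = \langle \mathrm{or}, l, r \rangle$ and $\gamma'(r) \notin \{0, 1\}$ and $\gamma'(l) = 0$,
• $\gamma'(g) = \gamma'(s)$ iff $\gamma(g) = \langle \mathrm{id}, s \rangle$ and $\gamma'(s) \in \{0, 1\}$, and
• $\gamma'(g) = \gamma(g)$ in all remaining cases.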
A circuit is evaluated if all constant gates are sink gates. In an evaluated circuit, all gates that do not depend on variable gates are constant. Hence, a circuit without any variable gates evaluates to a constant circuit; for a circuit that contains variable gates, a subset of the gates is relabeled: some and-, or-, and id-gates are relabeled as constant gates or id-gates.
The problem of evaluating monotone planar circuits has been studied extensively in the literature [13, 10, 20, 4, 25, 6] . Our construction is based on the evaluation of one-input-face planar circuits: Given a circuit G = Γ, γ with variable gates X, the graph gr(G) of G is the directed graph Γ, E , where E = { a, b ∈ Γ × Γ | a · b}. A circuit C is planar if there exists a planar embedding of the graph of C. A planar circuit G is one-input-face if there is a planar embedding such that all gates of src(G) are located on the outer face. Note that an evaluated planar circuit with all variable gates on the outer face is one-input-face. The evaluation of one-input-face planar circuits can be parallelized efficiently. We make use of a result by Chakraborty and Datta [6] : Using standard techniques [20] , the theorem generalizes to circuits that contain variable gates:
Corollary 3.2. The problem of evaluating a one-input-face planar circuit is in logDCFL.
Proof. We first assign the Boolean constant 1 to all variable gates. Each gate that evaluates to 0 is turned into a 0 constant gate. Next, we assign 0 to all variable gates. Each gate that evaluates to 1 is turned into a constant gate with value 1. Since the values of the remaining gates depend on the variables, they are simply copied. If one of the latter gates depends on a constant gate, the dependency is removed by changing such a gate into an id -gate.
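The two-assignment argument of the proof can be traced on a small circuit. The Python sketch below is a plain sequential illustration, not a logDCFL procedure; the dictionary encoding of circuits, the label `"?"` for variable gates, and the function names are assumptions made for this example.

```python
def evaluate_with(circuit, value):
    """Evaluate every gate with all variable gates set to `value` (0 or 1)."""
    cache = {}

    def val(g):
        if g not in cache:
            label = circuit[g]
            if label == "?":
                cache[g] = value
            elif label in (0, 1):
                cache[g] = label
            elif label[0] == "id":
                cache[g] = val(label[1])
            elif label[0] == "and":
                cache[g] = val(label[1]) & val(label[2])
            else:  # "or"
                cache[g] = val(label[1]) | val(label[2])
        return cache[g]

    return {g: val(g) for g in circuit}


def partial_evaluate(circuit):
    """Relabel gates following the proof of Corollary 3.2."""
    high = evaluate_with(circuit, 1)  # all variables set to 1
    low = evaluate_with(circuit, 0)   # all variables set to 0
    result = {}
    for g, label in circuit.items():
        if high[g] == 0:      # false even with all inputs true: constant 0
            result[g] = 0
        elif low[g] == 1:     # true even with all inputs false: constant 1
            result[g] = 1
        elif label == "?" or label[0] == "id":
            result[g] = label
        else:
            _, l, r = label
            # A gate is constant exactly if both assignments agree on it.
            if high[l] == low[l]:
                result[g] = ("id", r)
            elif high[r] == low[r]:
                result[g] = ("id", l)
            else:
                result[g] = label
    return result
```

Monotonicity is what makes the two extreme assignments sufficient: a gate that is 0 under the all-ones assignment is 0 under every assignment, and dually for the all-zeros assignment.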
3.2. Transducer Circuits. The central construction in our path checking algorithm is circuit composition: circuits for larger subformulas are built from circuits for smaller subformulas by connecting variable gates of one circuit to gates of another circuit. To facilitate this operation, we introduce transducer circuits, which are circuits with a defined interface of input and output gates that allow the circuit to transform a sequence of Boolean input values, for example the values of a subformula at different positions of the path, into a sequence of output values.
A transducer circuit is a tuple $T = \langle \Gamma, \gamma, I, O \rangle$ where $G = \langle \Gamma, \gamma \rangle$ is a circuit, $I$ is a (strict) ordering of $\mathrm{var}(G)$, and $O$ is a (strict) ordering of a subset of $\Gamma$. $I$ is called the input of $T$ and $O$ is called the output of $T$. The input and output arity is the length of the input and output, denoted as $|I|$ and $|O|$, respectively. We denote the $i$-th element of $I$ and $O$ by $I(i)$ and $O(i)$, respectively. The transducer circuit $T$ is planar if $G$ has a planar embedding such that the gates of $I$ appear counter-clockwise ordered on the outer face, the gates of $O$ appear clockwise ordered on the outer face, and between any two gates of $I$ on the outer face there are either no or all gates of $O$, i.e., the gates of $I$ and $O$ do not appear interleaved on the outer face.
Given two planar transducer circuits
δ(g), for g ∈ ∆ \ var(∆), and
The composition G • D can be computed by a logspace-restricted deterministic Turing machine. A transducer circuit $T$ represents a function $f_T : \mathbb{B}^{|I|} \to \mathbb{B}^{|O|}$, where $f_T(s)$ for some sequence $s \in \mathbb{B}^{|I|}$ is computed by evaluating the composition of $T$ with the constant circuit that represents $s$. The values of the output gates of the resulting constant circuit define the sequence $f_T(s)$.
4. Constructing Circuits On-The-Fly
We now describe the translation of BLTL+Past-formulas in positive normal form to planar circuits. As discussed in the introduction, the translation is not done as a preprocessing step, but rather delayed until one of the direct subformulas has been evaluated. We guarantee that the resulting circuit is planar, one-input-face, and evaluated. The path checking algorithm, which will be presented in the next section, composes the evaluated one-input-face planar circuits in order to represent larger partially evaluated subformulas.
Given a path $\rho$ and a BLTL+Past formula $\varphi$ in positive normal form with at most one unevaluated direct subformula, the following construction provides a function $\mathrm{cir}_\rho$ that maps the top-level operator of $\varphi$ and its evaluated subformulas to an evaluated one-input-face planar transducer circuit that represents a partial evaluation of $\varphi$ on $\rho$. The output arity of the circuit is $|\rho|$; the input arity is $|\rho|$ for all formulas except for atomic propositions, where the circuit has input arity 0. The circuit can be constructed by a logspace-restricted Turing machine. The full details of the construction are provided in the appendix.
Atomic propositions.
For an atomic proposition $p$, the circuit is a set of constant gates, one for each path position. The value of a gate is the value of $p$ at the respective position of $\rho$: $\mathrm{cir}_\rho(p) = \langle \{o_0, \dots, o_{n-1}\}, l, \varepsilon, O \rangle$, where $n = |\rho|$, $O = \langle o_0, \dots, o_{n-1} \rangle$, $l(o_i) = 1$ iff $p \in \rho_i$, and $\varepsilon$ denotes an empty input sequence. Clearly, a set of constant gates is an evaluated one-input-face planar transducer circuit.
Unary operators.
For the unary operators $X^\exists$, $X^\forall$, $Y^\exists$, and $Y^\forall$, the circuit shifts the value of the input by one position in the respective direction. The first (respectively last) position of the output is a constant with value 0 for strong operators and value 1 for weak operators. Again, the circuits are obviously planar, one-input-face, and evaluated, and of input and output arity $|\rho|$. E.g., $\mathrm{cir}_\rho(X^\exists) = \langle G, l, I, O \rangle$, where $n = |\rho|$, $I = \langle v_0, \dots, v_{n-1} \rangle$, $O = \langle o_0, \dots, o_{n-1} \rangle$, $G = \{v_0, \dots, v_{n-1}\} \cup \{o_0, \dots, o_{n-1}\}$, and
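On the level of Boolean sequences, the function computed by these shift circuits can be sketched as follows. The Python function names and the `weak` flag are assumptions made for this example; the sketch shows the input/output behaviour only, not the gate-level circuit.

```python
def shift_next(values, weak):
    """Function of the Next circuits: output i is input i + 1.

    The last output position is the constant 1 for weak Next and 0 for
    strong Next.
    """
    return values[1:] + [1 if weak else 0]


def shift_yesterday(values, weak):
    """Function of the Yesterday circuits: output i is input i - 1.

    The first output position is the constant 1 for weak Yesterday and 0
    for strong Yesterday.
    """
    return [1 if weak else 0] + values[:-1]
```

Both functions preserve the length of the sequence, matching the input and output arity of the circuits.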
Binary operators.
The binary operators require two constructions, one for the case where the left argument has been evaluated and one for the case where the right argument has been evaluated. For each operator op, we define two logspace-computable functions cir ρ (s, op) and cir ρ (op, s), which compute the circuit given an evaluation s ∈ B ρ of the left and right subformula, respectively. For the Boolean operators, the two functions are the same, e.g., cir
id , v i for s i = 0 and 0 ≤ i < n, 1 for s i = 1 and 0 ≤ i < n, and l(v i ) = ? for 0 ≤ i < n.
For the unbounded temporal operators, the constructions are derived from the expansion laws of the logic, such as φ l U φ r ≡ φ r ∨(φ l ∧X ∃ (φ l U φ r )) for the unbounded Until operator. The expanded formula is transformed into a transducer circuit by substituting constants for evaluated subformulas and variable gates for unevaluated subformulas. E.g. cir ρ (U, s) = G, l, I, O , where
and , v i , o i+1 for 0 ≤ i < n − 1 and s i = 0, 1 for 0 ≤ i < n − 1 and s i = 1, s n−1 for i = n − 1, and
and cir ρ (s, U) = G, l, I, O , where
for i = n − 1, and
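For the case where the right argument of Until has already been evaluated to a sequence s, the gate labels can be sketched as below. The sketch follows the case distinction given above for the output gates (constant 1 where s is 1, an and-gate over the input and the next output where s is 0, and the constant s at the last position); the dictionary encoding, the gate names `v0`, `o0`, and so on, and the helper `apply_inputs` are assumptions made for this example.

```python
def cir_until_right(s):
    """Gate labels of the Until circuit when the right argument evaluates to s."""
    n = len(s)
    labels = {}
    for i in range(n):
        labels[f"v{i}"] = "?"  # input gate: value of the left argument at i
        if i == n - 1:
            labels[f"o{i}"] = s[i]
        elif s[i] == 1:
            labels[f"o{i}"] = 1
        else:
            labels[f"o{i}"] = ("and", f"v{i}", f"o{i + 1}")
    return labels


def apply_inputs(labels, inputs):
    """Compute the output sequence once the input values are known."""
    n = len(inputs)
    out = [None] * n
    for i in range(n - 1, -1, -1):
        label = labels[f"o{i}"]
        if label in (0, 1):
            out[i] = label
        else:  # and-gate over v_i and o_{i+1}
            out[i] = inputs[i] & out[i + 1]
    return out
```

The output gates form a single chain, which is the reason the circuit of an individual Until operator is planar with all inputs on the outer face.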
The most difficult part of the construction is the translation for the bounded operators, which we now present in detail for the bounded Until operator U b . Figure 6 illustrates the construction of cir ρ (s, U b ) for a valuation s = 0, 1, 0, 1, 1, 1, 0, 1 of the left subformula. The gates indexed by i, j compute the value of the formula at position i and "remaining" bound b − j. If, at some position, the left subformula evaluates to 0, then the formula simplifies to the right subformula, independently of the remaining bound. This results in vertical edges in the circuit. If the left subformula evaluates to 1, then the formula is true if it is either true for bound j − 1 in position i + 1 or for bound j − 1 in position i. In the circuit, this is computed as a disjunction of the vertical and the diagonal neighbor. We define cir ρ (s,
for 0 ≤ i < n − 1 and j < b and s i = 0, or , v i,j+1 , v i+1,j+1 for 0 ≤ i < n − 1 and j < b and
for i = n − 1 and j < b, and ?
for j = b.
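The grid construction for the case where the left argument of the bounded Until has been evaluated can be checked against a direct evaluation. The Python sketch below rests on several assumptions that are not spelled out in the text above: gates are indexed by pairs (i, j), the gates in the last column of positions (i = n − 1) are taken to be vertical id-edges, the output gates are the gates with j = 0, and the reference semantics is the one where the right argument must hold at most b steps ahead. All function names are hypothetical.

```python
def bounded_until_reference(left, right, b):
    """Direct evaluation of `left U_b right` under the assumed semantics."""
    n = len(right)
    out = []
    for i in range(n):
        holds = 0
        for j in range(i, min(i + b, n - 1) + 1):
            if right[j]:
                holds = 1
                break
            if not left[j]:
                break
        out.append(holds)
    return out


def cir_bounded_until_left(s, b):
    """Grid of gates for U_b when the left argument evaluates to s."""
    n = len(s)
    labels = {}
    for i in range(n):
        for j in range(b + 1):
            if j == b:
                labels[(i, j)] = "?"  # input: right argument at position i
            elif s[i] == 0 or i == n - 1:
                labels[(i, j)] = ("id", (i, j + 1))  # vertical edge
            else:
                # disjunction of the vertical and the diagonal neighbor
                labels[(i, j)] = ("or", (i, j + 1), (i + 1, j + 1))
    return labels


def run_grid(labels, n, b, inputs):
    """Evaluate the grid for given input values and return the outputs."""
    value = {}
    for j in range(b, -1, -1):  # start with the row of input gates
        for i in range(n):
            label = labels[(i, j)]
            if label == "?":
                value[(i, j)] = inputs[i]
            elif label[0] == "id":
                value[(i, j)] = value[label[1]]
            else:
                value[(i, j)] = value[label[1]] | value[label[2]]
    return [value[(i, 0)] for i in range(n)]
```

Edges only run to the vertical or the diagonal neighbor, so the grid can be drawn without crossings once the left argument is known.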
The construction of cir ρ ( U 3 , s) for the valuation s = 0, 1, 0, 0, 0, 0, 0, 1 of the right subformula is illustrated in Figure 7 . Here, the gates indexed by i compute the value of the formula at position i. for 0 ≤ i < n and s i = 1, 0 for 0 ≤ i < n and ∀j, i ≤ j < min(i + b, n).s j = 0, and , v i , o i+1 for 0 ≤ i < n − 1, s i = 0, ∃j, i < j < min(i + b, n).s j = 1, and l(v i ) = ? for 0 ≤ i < n.
We conclude the section with a lemma that formally states the existence of the logspace-computable function $\mathrm{cir}_\rho$ with the required properties. The complete construction of $\mathrm{cir}_\rho$ is provided in the appendix.
Lemma 4.1. Let $\varphi$ and $\psi$ be formulas and $p$ an atomic proposition. Let $\rho$ be a path and $s, t \in \mathbb{B}^{|\rho|}$ with $s = \varphi(\rho)$ and $t = \psi(\rho)$. There is a logspace-computable function $\mathrm{cir}_\rho$ mapping its arguments to evaluated one-input-face planar transducer circuits such that
5. Parallel Tree Contraction for Path Checking
The parallel path checking algorithm for BLTL+Past formulas is based on a bottom-up evaluation of the formula converted to positive normal form starting with the atomic propositions. The central data structure is a binary tree, called the contraction tree, that keeps track of the dependencies between the different evaluation steps. Initially, the contraction tree corresponds to the formula tree where each unary node (due to X ∃ , X ∀ , Y ∃ , and Y ∀ operators) has been merged into a single node with its unique child node. The evaluation of the formula is performed by contraction steps, which contract a node that has already been evaluated with its parent into a new edge from its sibling to its parent. The resulting edge is labeled by a planar circuit that represents the partially evaluated subformula.
Since no child needs to wait for the evaluation of its sibling before it can be contracted with its parent, a constant portion of the nodes can be contracted in parallel, and, within logarithmic time, the tree is evaluated to a single constant circuit. We now describe and analyze this process in more detail.
5.1. Contraction tree. Given a formula φ in positive normal form and a path ρ, let φ_0, …, φ_{m−1} be the subformulas of φ with φ_0 = φ. A contraction tree is an edge-labeled tree 𝒯 = ⟨T, t, l⟩, where T ⊆ {φ_0, …, φ_{m−1}} ∪ {root}, t ⊆ {⟨φ_i, φ_j⟩ | φ_j is a subformula of φ_i} ∪ {⟨root, φ⟩}, and l is a mapping that labels each edge of 𝒯 with an evaluated one-input-face planar transducer circuit, such that the following conditions hold: (1) 𝒯 \ {root} is an ordered, rooted, regular, binary tree, (2) all edge labels of 𝒯 are evaluated one-input-face planar transducer circuits of arity |ρ|, and all leaves are atomic propositions, and (3) for τ = ⟨φ_i, φ_j⟩ ∈ t it holds that f_{l(τ)}(φ_j(ρ)) = ψ(ρ), where ψ is the direct subformula of φ_i that has φ_j as a subformula. Further, for the unique edge τ = ⟨root, φ_j⟩ ∈ t it holds that f_{l(τ)}(φ_j(ρ)) = φ(ρ).
The special node root and the corresponding edge ⟨root, φ⟩ were added solely for technical reasons.
The first condition ensures that the overall contraction process completes in a logarithmic number of parallel steps. The second condition provides the preconditions for a single contraction step, namely the compositionality of the constructed circuits and the logDCFL complexity of evaluating them. The third condition states the induction hypothesis for the soundness of the whole algorithm: When a transducer circuit is attached to an edge of the contraction tree, it encodes the semantics of all partially evaluated subformulas contracted into that edge.
5.2. Initialization step. The initial contraction tree 𝒯 \ {root} is the formula tree of φ where each unary node (due to X^∃, X^∀, Y^∃, and Y^∀ operators) has been merged with its unique child node. The corresponding new parent edge is labeled by the transducer circuit that results from composing the circuits produced by applying cir_ρ to the corresponding BLTL+Past operators of the eliminated nodes.
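The merging of unary nodes into edges can be sketched as follows. This is a minimal illustration under stated assumptions: formulas are encoded as nested Python tuples, the operator tags (`"X_strong"`, `"X_weak"`, `"Y_strong"`, `"Y_weak"`, `"ap"`) and the function `collapse` are hypothetical names, and an edge label is kept as the tuple of unary operators on the collapsed chain rather than as a composed transducer circuit. An empty chain plays the role of the identity label.

```python
# Hypothetical tags for the four unary temporal operators.
UNARY = {"X_strong", "X_weak", "Y_strong", "Y_weak"}


def collapse(formula, parent="root", chain=()):
    """Return the edges (parent, node, unary_chain) of the initial tree.

    Unary nodes do not become tree nodes. Their operators are collected on
    the edge between the nearest non-unary ancestor and the nearest
    non-unary descendant. The chain lists the outermost operator first, so
    the corresponding label applies the operators from right to left.
    """
    op = formula[0]
    if op in UNARY:
        # Merge the unary node into the edge and continue with its child.
        return collapse(formula[1], parent, chain + (op,))
    edges = [(parent, formula, chain)]
    if op != "ap":
        # Binary operator: each child starts a fresh edge with an empty chain.
        for child in formula[1:]:
            edges += collapse(child, formula, ())
    return edges


# Example: (strong next (weak next p)) and q.
f = ("and", ("X_strong", ("X_weak", ("ap", "p"))), ("ap", "q"))
for parent, node, chain in collapse(f):
    print(node[0], chain)
```

In this encoding nodes are identified by their formula tuple, which is sufficient for a sketch but would need explicit node identifiers if the same subformula occurred more than once.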
Lemma 5.1. Given a formula φ in positive normal form and a path ρ, a contraction tree 𝒯 can be constructed from φ and ρ by a logspace-restricted Turing machine.
Proof. Define parent(χ) to be the subformula ψ of φ such that χ is the maximal subformula of ψ in φ. Let 𝒯 = ⟨T, t, l⟩ with T = {φ_i | φ_i is not of the form X^∃ ψ,
where id is the identity transducer circuit of arity |ρ|. In 𝒯, all simple paths (due to X^∃, X^∀, Y^∃, and Y^∀ operators) have been collapsed into single edges. This ensures that 𝒯 \ {root} is an ordered, rooted, regular, binary tree. The circuits cir_ρ(X^∃), cir_ρ(X^∀), cir_ρ(Y^∃), and cir_ρ(Y^∀) are evaluated one-input-face planar transducer circuits that do not contain any constants. Hence, any number of these circuits can be composed, resulting in an evaluated one-input-face planar transducer circuit. The composition of planar transducer circuits can be performed in logarithmic space. The mapping c is defined recursively above. However, it is easy to see that the whole procedure can be performed iteratively in logarithmic space in the size of ρ plus the size of φ. From the above, the first and the second condition for a contraction tree are clear. The third condition is obtained by applying Lemma 4.1 to the construction.
5.3. Contraction step. In the following, we describe the contraction of the tree 𝒯. During a contraction step, a node that is labeled by a constant circuit is merged with its parent node. The resulting node is contracted into the edge from its sibling to its grandparent.
Lemma 5.2. Let φ_i be a node of a contraction tree 𝒯 with child nodes φ_j and φ_k and parent node p. Assume φ_j to be a leaf. Let s be the evaluation of cir_ρ(φ_j) • l(⟨φ_i, φ_j⟩). Let 𝒯′ = ⟨T′, t′, l′⟩, where
𝒯′ is a contraction tree and can be computed in logDCFL.
Proof. First, note that by construction of 𝒯 it holds that φ_i, φ_j, φ_k ≠ root. Clearly, if 𝒯 \ {root} is an ordered, rooted, regular, binary tree, then 𝒯′ \ {root} is an ordered, rooted, regular, binary tree as well. By construction of 𝒯′, a leaf in 𝒯′ is a leaf in 𝒯 as well. Thus, because 𝒯 is a contraction tree, each leaf in 𝒯′ is an atomic proposition. Since φ_j is a leaf, φ_j is an atomic proposition. Due to Lemma 4.1, cir_ρ(φ_j) can be composed with l(⟨φ_i, φ_j⟩), resulting in a one-input-face planar circuit that can be evaluated in logDCFL. Thus s is a constant circuit of arity |ρ|, and cir_ρ(f_s(), φ_i) (respectively cir_ρ(φ_i, f_s())) is well defined and of arity |ρ|. Because 𝒯 is a contraction tree and by Lemma 3.3, l′(⟨p, φ_k⟩) is an evaluated one-input-face planar transducer circuit. By the definition of •, the input arity of l′(⟨p, φ_k⟩) is the input arity of l(⟨φ_i, φ_k⟩) and the output arity of l′(⟨p, φ_k⟩) is the output arity of l(⟨p, φ_i⟩). Because 𝒯 is a contraction tree, this arity is |ρ| in both cases. All remaining edge labels of 𝒯′ inherit their arities from 𝒯. For the edge ⟨p, φ_k⟩ ∈ t′, the third condition for a contraction tree holds because 𝒯 is a contraction tree, by the definition of •, and by Lemma 4.1. For all other edges, the property is directly inherited from 𝒯. The computation of 𝒯′ is in logDCFL because of Lemma 3.3 and Lemma 4.1.
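The contraction step of Lemma 5.2 can be sketched in Python as follows. The sketch makes two simplifying assumptions that are not part of the construction: edge labels are plain Python functions on valuation vectors instead of evaluated one-input-face planar transducer circuits, and leaves are contracted one at a time instead of in parallel. The names `Node`, `contract`, `evaluate`, and `BINARY` are hypothetical, and the top formula node (with parent `None`) carries the label of the edge to the special root node.

```python
from typing import List

Vec = List[bool]


def strong_next(s: Vec) -> Vec:
    return s[1:] + [False]


def until(s: Vec, t: Vec) -> Vec:
    out, later = [False] * len(s), False
    for i in range(len(s) - 1, -1, -1):
        later = t[i] or (s[i] and later)
        out[i] = later
    return out


BINARY = {
    "and": lambda s, t: [a and b for a, b in zip(s, t)],
    "or": lambda s, t: [a or b for a, b in zip(s, t)],
    "until": until,
}


class Node:
    def __init__(self, op, left=None, right=None, value=None):
        self.op = op                      # "leaf" or a key of BINARY
        self.value = value                # valuation vector (leaves only)
        self.children = [] if op == "leaf" else [left, right]
        self.parent = None
        self.label = lambda v: v          # identity label on the edge to the parent
        for child in self.children:
            child.parent = self


def contract(leaf):
    """Merge an evaluated leaf and its parent into the edge sibling -> grandparent."""
    node = leaf.parent
    left_is_leaf = node.children[0] is leaf
    sibling = node.children[1] if left_is_leaf else node.children[0]
    s = leaf.label(leaf.value)            # value of the leaf seen through its edge
    op = BINARY[node.op]
    partial = (lambda v: op(s, v)) if left_is_leaf else (lambda v: op(v, s))
    up, down = node.label, sibling.label
    # New label: old sibling edge, then the partially evaluated operator,
    # then the old edge from the parent to the grandparent.
    sibling.label = lambda v: up(partial(down(v)))
    sibling.parent = node.parent
    if node.parent is not None:
        slots = node.parent.children
        slots[0 if slots[0] is node else 1] = sibling
    return sibling


def evaluate(top):
    """Sequential stand-in for the parallel schedule: contract until one leaf is left."""
    while top.op != "leaf":
        node = top
        while node.children[0].op != "leaf":
            node = node.children[0]
        sibling = contract(node.children[0])
        if sibling.parent is None:
            top = sibling
    return top.label(top.value)[0]        # value at the first path position


# Example: ((strong next p) until q) on a path of length 4.
leaf_p = Node("leaf", value=[False, True, True, False])
leaf_p.label = strong_next                # collapsed unary operator on the edge
leaf_q = Node("leaf", value=[False, False, True, False])
print(evaluate(Node("until", leaf_p, leaf_q)))  # True
```

The point of the edge labels is visible in `contract`: the sibling need not be evaluated when the leaf is removed, because the partially applied operator is stored on the new edge and applied later.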
5.4. The path checking algorithm. Applying Lemma 5.1 and Lemma 5.2 to φ and ρ, we can use Algorithm 2.3 to obtain an AC^1(logDCFL) solution to the path checking problem. This proves our main theorem: Theorem 3.
Given a BLTL+Past formula φ and a path ρ, convert φ into positive normal form using only logarithmic space. A contraction tree 𝒯 is initialized from φ in logarithmic space by use of Lemma 5.1, and then Algorithm 2.3 is applied to 𝒯 with the contraction step defined in Section 5.3. Note that the extra root node and the edge ⟨root, φ⟩ in 𝒯 do not influence the performance of Algorithm 2.3. The algorithm terminates when only a single leaf node n and a single edge ⟨root, n⟩ are left in the contraction tree. By Lemma 5.2, the contraction algorithm runs in AC^1(logDCFL). The value of the first output gate of the evaluation of the circuit c = cir_ρ(n) • l(⟨root, n⟩) is the result. By Lemma 5.2, c is evaluated and one-input-face and can hence be evaluated in logDCFL. The whole construction can be executed within AC^1(logDCFL) in |φ| + |ρ|. Using Lemma 2.2, we can assume that any bound occurring in φ has size at most |ρ|. The sum of the bounds is thus polynomial in the size of φ (without bounds) and the length of ρ. Thus, the overall complexity of AC^1(logDCFL) is independent of the encoding of the bounds in φ.
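Why bounds larger than the path length can be ignored is easy to see for bounded until. The following Python sketch is an illustration only: it assumes the window reading i ≤ j < min(i + b, n) that appears in the case conditions of the bounded-until circuit in Section 4, and the names `bounded_until` and `clamp_bound` are hypothetical. Under this reading, a bound b and the clamped bound min(b, n) give the same valuation on a path of length n, so after clamping the sum of all bounds is at most the number of bounded operators times the path length.

```python
from typing import List


def bounded_until(s: List[bool], t: List[bool], b: int) -> List[bool]:
    """Bounded until: t must hold at some j with i <= j < min(i + b, n),
    and s must hold at every position from i up to (excluding) j."""
    n = len(s)
    out = []
    for i in range(n):
        holds = False
        for j in range(i, min(i + b, n)):  # the window never leaves the path
            if t[j]:
                holds = True
                break
            if not s[j]:
                break
        out.append(holds)
    return out


def clamp_bound(b: int, n: int) -> int:
    """Replace a bound by one that is at most the path length n."""
    return min(b, n)


# Example with bound 3 and the right-operand valuation 0, 1, 0, 0, 0, 0, 0, 1.
t = [False, True, False, False, False, False, False, True]
s = [True] * 8
print(bounded_until(s, t, 3))
print(bounded_until(s, t, 10**9) == bounded_until(s, t, clamp_bound(10**9, 8)))  # True
```

Because the clamped bound is at most the path length, its size no longer depends on whether bounds are written in unary or binary in the formula.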
Since LTL and LTL+Past are both subsets of BLTL+Past, Theorem 1 and Theorem 2 are obtained as corollaries of Theorem 3.
Conclusions
We have presented a positive answer to the question whether LTL can be checked efficiently in parallel on finite paths by giving an AC^1(logDCFL) algorithm for checking formulas of the extended logic BLTL+Past over finite paths. This result is a significant step forward in the research program towards a complete picture of the complexities of the path checking problems across the spectrum of temporal logics, which was started in 2003 by Markey and Schnoebelen [26]. While other extensions of LTL, for example with Chop or Past+Now, immediately render the path checking problem P-complete and, hence, inherently sequential [26], LTL with past and bounds can be checked efficiently in parallel.
There is a growing practical demand for efficient parallel algorithms, driven by the increasing availability of powerful (and inherently parallel) programmable hardware. For example, tools that translate PSL assertions to hardware-based monitors [7, 5, 11] can immediately apply our construction to evaluate subformulas consisting of bounded and past operators in parallel rather than sequentially. Similarly, monitoring tools based on LTL+Past can buffer constant chunks of the input and then evaluate the buffered input in parallel using our construction.
The capability of our algorithm to absorb the exponential succinctness of past and bounds is due to the use of planar circuits as a representation of partially evaluated subformulas, which allows the evaluation of the formula to efficiently stop and resume, as dictated by the dependencies between the subformulas. We expect that the use of planar circuits as a data structure in parallel verification algorithms, following the pattern of our construction, will find applications in other model checking problems as well.
There are several open questions that deserve further attention. There is still a gap between the upper bound AC^1(logDCFL) and the known lower bound NC^1. There is some hope to reduce the upper bound further towards NC^1: our construction relies on the algorithm by Chakraborty and Datta (cf. Theorem 3.1) for evaluating monotone Boolean planar circuits with all constant gates on the outer face, whereas the circuits that appear in our construction actually exhibit much more structure. However, we are not aware of any algorithm that takes advantage of this structure and performs better than logDCFL. An intriguing question along the way is whether the path checking complexities of LTL and BLTL+Past are actually the same: while they are both in NC, the circuits resulting from BLTL+Past formulas seem to be combinatorially more complex. Finally, an interesting challenge is to exploit the apparent "cheapness" of the BLTL+Past path checking problem beyond parallelization, for example in memory-efficient algorithms.
