# Axiomatizing proof tree concepts in Bounded Arithmetic 

Satoru Kuroda<br>Gunma Prefectural Women's University<br>1395-1, Kaminote, Tamamura, Gunma, 370-1193, Japan<br>satoru@gpwu.ac.jp


#### Abstract

We construct theories of Cook-Nguyen style two-sort bounded arithmetic whose provably total functions are exactly those in LOGCFL and LOGDCFL. Axiomatizations of both theories are based on the proof tree size characterizations of these classes. We also show that our theory for LOGCFL proves a certain formulation of the pumping lemma for context-free languages.


1998 ACM Subject Classification F.4.1 Mathematical Logic
Keywords and phrases Bounded Arithmetic, LOGCFL, LOGDCFL, Proof tree

Digital Object Identifier 10.4230/LIPIcs.CSL.2012.440

## 1 Introduction

The complexity classes LOGCFL and LOGDCFL are subclasses of P which gain a large popularity in the complexity theory community during the last two decades. The class LOGCFL consists of predicates logspace reducible to context free languages and LOGDCFL is defined in the same manner as LOGCFL but with deterministic context free languages.

LOGCFL lies between NL and $A C^{1}$ and LOGDCFL contains $L$ but we do not know whether any of these inclusions is proper or not. Furthermore, we do not know the relationship between NL and LOGDCFL.

One of the main feature of LOGCFL is that it has several natural alternative characterizations such as NAuxPDA [9], alternating Turing machines [8] and semi-unbounded fan-in circuits [10]. Also LOGCFL has natural complete problems such as word problems for finite groupoids and acyclic conjunctive queries [4]. Moreover, it is worth noting that LOGCFL is closed under many operations including the closure under complementation [2].

On the contrary, much less is known about the class LOGDCFL. In particular, no natural complete problem is found so far.

In this paper we construct theories of bounded arithmetic for LOGCFL and LOGDCFL using their proof tree size characterizations. To author's knowledge, no such theory for LOGDCFL has been proposed so far. For LOGCFL, the author [6] defined a theory $V-Q^{S A C}$ which axiomatize the concept of $S A C^{1}$ circuits. However, the theory has an augmented language which represents generalized quantifier expressing that a $S A C^{1}$ circuit has an accepting tree on an input.

Our approach is very similar to the one developed by Cook and Nguyen [3], namely augmenting the theory $V^{0}$ by axioms expressing the concept of a given complexity class.

The concepts we use for constructing theories for LOGCFL and LOGDCFL are somewhat similar. That is, both concepts are based on circuits having polynomial proof tree size.

For LOGDCFL, McKenzie et.al. [7] proved that the class is identical to the class of predicates decidable by polynomial size Multiplex-Select circuits with polynomial proof tree

size. For LOGCFL, a similar and simpler characterization is known by Venkateswaran [10] as the class of polynomial size Boolean circuits having polynomial proof tree size.

Although these two characterizations have a similar nature, we encounter with a slightly different circumstance when axiomatizing them. Since LOGDCFL is a deterministic class, there exists a unique witness, namely a proof tree for an accepting input to a circuit, whereas, there may exist more than one polynomial size proof trees for a general circuit which witnesses LOGCFL predicates. Such a difference causes a difficulty in the witnessing argument of the corresponding theories. However, we show that a slight modification of the argument of Cook and Nguyen implies witnessing theorems for both theories.

We also show that our theory VLCFL for LOGCFL is finitely axiomatizable by formalizing LOGCFL complete problems.

Cook proposed a research program called Bounded Reverse Mathematics whose aim is to decide how much computational concepts are needed to prove mathematical theorems. This is achieved by deciding which theorems of mathematics are provable in a given bounded arithmetic theory (See [3] for an exposition). For this direction, we show that VLCFL proves a form of the pumping lemma for context-free languages.

The essential part of formalizing the standard proof of the pumping lemma in VLCFL is to translate parse trees into proof trees of Boolean circuits so that we can argue about context-free languages in terms of circuits.

By carefully examining the textbook style proof of the pumping lemma, it turns out that proof also uses combinatorial arguments including the pigeonhole principle and PATH problem which is NL complete. Moreover, the inductive argument in the proof can be expressed as a number induction for LOGCFL decidable predicate.

## 2 Preliminaries

Throughout the paper we deal with two-sort theories and complexity classes. So first we briefly review their basic notions.

Two-sort logic uses two types of variables; number variables denoted by lower case letters $x, y, z \cdots$ represent natural numbers while string variables denoted by upper case letters $X, Y, Z \cdots$ represent binary strings.

### 2.1 Two-sort complexity classes

A two-sort function is a function with two sort of arguments $\bar{x}$ and $\bar{X}$. A function $F(\bar{x}, \bar{X})$ is a string function if its range is in strings. A function $f(\bar{x}, \bar{X})$ is a number function if its range is in numbers. As usual, we denote strings functions by uppercase letters and number functions by lower case letter. Predicates are identified with 0-1 valued number functions.

In the two-sort formalization, number variables play subsidiary roles, so in defining complexity classes, their measures are with respect to the length of string parameters.

Let $R\left(x_{1}, \ldots, x_{k}, X_{1}, \ldots, X_{n}\right)$ be a two-sort relation. As in [3], number variables $x_{1}, \ldots, x_{k}$ are presented in unary and string variables $X_{1}, \ldots, X_{n}$ are in binary form.

We define two sort complexity classes LOGCFL and LOGDCFL as follows:

- Definition 1. LOGCFL is the class of predicates which are logspace reducible to some context-free languages. LOGDCFL is the class of predicates which are logspace reducible to some deterministic context-free languages.


### 2.2 Two-sort theories

The language $L_{2}$ of two sort theories contains the symbols $Z(x), x+y, x \cdot y,|X|, x=y$ and $x \in Y$. We also write $x \in Y$ as $Y(x)$. We use two sort of quantifiers; number quantifiers $(\forall x)$ and $(\exists x)$ and string quantifiers $(\forall X)$ and $(\exists X)$. Bounded number quantifiers are of the form $(\forall x<t)$ and $(\exists x<t)$. String bounded quantifiers are of the form

$$
(\forall X<t) \varphi(X) \equiv(\forall X)(|X|<t \rightarrow \varphi(X)), \text { and }(\exists X<t) \varphi(X) \equiv(\exists X)(|X|<t \wedge \varphi(X)) .
$$

An $L_{2}$ formula is called bounded if all its quantifiers are either bounded number quantifiers or bounded string quantifiers. The class $\Sigma_{0}^{B}$ consists of $L_{2}$ formulae whose quantifiers are bounded quantifiers only. The class $\Sigma_{1}^{B}$ consists of $L_{2}$ formulae whose quantifiers are bounded quantifiers, positive occurrences of existential bounded string quantifiers, and negative occurrences of universal bounded string quantifiers.

Our base theory is the following theory:

- Definition 2. The $L_{2}$ theory $V^{0}$ consists of the following axioms:
- $B A S I C_{2}$ : finite number of defining axioms for symbols in $L_{2}$,
- extensionality axiom : $X=Y \leftrightarrow(|X|=|Y| \wedge(\forall i<|X|)(X(i) \leftrightarrow Y(i))$,
- $\Sigma_{0}^{B}$-COMP : $(\exists X)(\forall y<t)(X(y) \leftrightarrow \varphi(y))$, where $\varphi \in \Sigma_{0}^{B}$ does not contain $X$ as a free variable.

Definition 3. A function $F(\bar{x}, \bar{X})$ is $\Sigma_{1}^{B}$ definable in a $L_{2}$-theory $T$ if there exists a $\Sigma_{1}^{B}$ formula $\varphi(\bar{x}, \bar{X}, Y)$ such that

- $T \vdash(\forall \bar{x})(\forall \bar{X})(\exists!Y) \varphi(\bar{x}, \bar{X}, Y)$, and
- $Y=F(\bar{x}, \bar{X}) \leftrightarrow \varphi(\bar{x}, \bar{X}, Y)$ holds in the standard model.
- Theorem 4. A function is $\Sigma_{1}^{B}$ definable in $V^{0}$ if and only if it is computable in $A C^{0}$.

The following property is also well-known and useful:

- Theorem 5. Let $F$ be a $\Sigma_{1}^{B}$ definable function of $V^{0}$ and $V^{0}(F)$ be the theory $V^{0}$ extended by the function symbol for $F$ together with its defining axioms. Then $V^{0}(F)$ is an conservative extension of $V^{0}$.

So we can use any $\Sigma_{1}^{B}$ definable function in $V^{0}$ without increasing its strength. In particular, the following functions are $\Sigma_{1}^{B}$ definable in $V^{0}$ which will be used in elsewhere in this paper:

- The number pairing function : $\langle x, y\rangle=\frac{(x+y+1)(x+y)}{2}$.
- The number $\mu$-operator :

$$
\mu_{z<y} \varphi(x)= \begin{cases}\text { the least } x<y \text { with } \varphi(x) & \text { if exists } \\ 0 & \text { otherwise } .\end{cases}
$$

- The value of $X$ at $y:(X)^{y}=\mu_{y<a} X(\langle y, z\rangle)$ for $|X|<\langle a, b\rangle$.
- The row function : $Z^{[x]}(i) \leftrightarrow(i<|Z| \wedge Z(x, i))$.

The main tool of this paper is the method for constructing theories for subclasses of $P$ using complete problems which is developed by Cook and Nguyen. Details of this method can be found in [3], so we briefly overview the argument.

Suppose a function $F$ has a $\Sigma_{0}^{B}$ graph as

$$
Y=F(X) \leftrightarrow\left(|Y| \leq t \wedge \delta_{F}(X<Y)\right)
$$

for some $L_{2}$ term $t$ and $\Sigma_{0}^{B}$ formula $\delta_{F}$. Suppose further that $V^{0}$ proves the uniqueness of the value of $F$. Let C be the class of relations which are $A C^{0}$ reducible to $F$. In this case, we define a theory for C as follows:

## S. Kuroda

- Definition 6. The $L_{2}$-theory VC is axiomatized by the axioms of $V^{0}$ and the following axiom:

$$
(\exists Y \leq b)(\forall i<b) \delta_{F}\left(X^{[i]}, Y^{[i]}\right)
$$

Below we define theories with defining axiom for the function $F$ as above and the equivalence to the theory VC is ensured by showing that the aggregate function is $\Sigma_{1}^{B}$ definable within the theory.

- Definition 7 (Aggregate Function). Suppose that $F(\bar{x}, \bar{X})$ is such that for some $L_{2}$ term $t$

$$
|F(\bar{x}, \bar{X})| \leq t(\bar{x}, \bar{X})
$$

Then $F^{*}(b, \bar{Z}, \bar{X})$ is a function such that

$$
\left|F^{*}(b, \bar{Z}, \bar{X})\right| \leq\langle b, t(|\bar{Z}|, \bar{X})\rangle
$$

and

$$
F^{*}(b, \bar{Z}, \bar{X})(w) \leftrightarrow(\exists i<b)(\exists v<t)\left(w=\langle i, v\rangle \wedge F^{*}\left(b,\left(Z_{1}\right)^{i}, \ldots,\left(Z_{k}\right)^{i}, X_{1}^{[i]}, \ldots, X_{n}^{[i]}\right)(v)\right.
$$

Finally, a direct connection between VC and FC is established:

- Theorem 8 (Cook-Nguyen). A function is $\Sigma_{1}^{B}$ definable in VC if and only if it is in FC.

We also use universal conservative extension $\widehat{\mathbf{V C}}$ of VC which is defined as follows: let $\overline{V^{0}}$ be the theory $V^{0}$ extended by function symbols for all $A C^{0}$ functions together with their defining axioms. Let $\delta_{F}^{\prime}(X, Y)$ be the quantifier free defining axiom for $F$ which is equivalent to $\delta_{F}(X, Y)$ in $\overline{V^{0}}$ and define the function

$$
Y=F^{\prime}(X) \leftrightarrow\left(|Y| \leq t \wedge \delta_{F}^{\prime}(X, Y)\right)
$$

- Definition 9. $\widehat{\mathbf{V C}}$ is the universal theory over language of $\overline{V^{0}}$ plus $F^{\prime}$ whose axioms are those for $\overline{V^{0}}$ and (*).

Remark. Our theories defined below are slightly different from VC in that we add axiom scheme rather than a single axiom. Nevertheless, the witnessing and definability works by an almost similar argument.

## 3 A theory for LOGDCFL

We will define the theory Vldcfl by formalizing the concept of polynomial proof tree of polynomial size multiplex-select circuits.

A multiplex select circuit is a circuit whose only gates are multiplex select gates. Inputs to a multiplex select gate are grouped into a bundle of $k \in O(\log n)$ steering bits and $2^{k}$ equal size bundles of $l \in O(\log n)$ data bits. The gate outputs the value of the data input bundle $d_{i}$ if the value of the steering input bundle is the binary expression of $i$. For detailed definition, see [7].

A proof tree of a circuit $C$ on input $X$ is a tree $P T(C, X)$ define inductively as follows:

- if $C$ consists of a single gate then $P T(C, X)$ consists of its steering input bundle and the data bundle selected by the steering input on $X$.
- if $C$ consists of more than two gates and $g$ is the output gate of $C$ then $P T(C, X)$ is defined as the tree rooted at $g$ with subtrees $P T\left(C_{s}, X\right)$ and $P T\left(C_{d}, X\right)$ where $C_{s}$ is the subcircuit with output at the steering input of $g$ and $C_{d}$ is the subcircuit with output at the data input selected by $s$ on $X$.

Note that the computation of multiplex select circuits are deterministic in the sense that there exists an unique proof tree for each input.

McKenzie et.al. [7] showed that multiplex select circuits characterize the class LOGDCFL.

- Theorem 10 (McKenzie et.al.). A predicate is in LOGDCFL if and only if it is decidable by $A C^{0}$-uniform poly-size family of multiplex select circuits with polynomial proof tree size.

We code a multiplex select gate by a triple $\langle g, c, k\rangle$ where $g$ is the index of its steering input, $k$ is the number of data inputs, and $c+j$ is the index of $j$-th data input for $j \leq k$.

As usual, a circuit is coded by a two-dimensional array but with semantics which differs from that of Boolean circuits. Recall that a non-input gate in a multiplex-select circuit has two types of inputs; one steering input and $2^{l}$ data inputs where $l$ is the length of the steering input bundle. So we define $A C^{0}$ functions which specifies these inputs.

Let $E$ be a two-dimensional array with $|E|=\langle n, n\rangle$. A gate $x<n$ with no input is regarded as an input data bundle with the data $x$. Otherwise, $x$ is a non-input gate and we define its steering input $s(x, E)$ and the first data input $c(x, E)$ as follows:

$$
\begin{gathered}
s(x, E)=\mu y<x E(x, y), \\
c(x, E)= \begin{cases}\mu y<x(E(x, y) \wedge s(x, E))<y & \text { if it exists }, \\
s(x, E) & \text { otherwise }\end{cases}
\end{gathered}
$$

We omit the parameter $E$ if it is clear from the context.
The output of a multiplex select gate is given by the following rule:
for a gate $x$, if it receives an input $j$ from its steering input $s(x)$ then it outputs the input from the data input of index $\min (c(x)+j, x-1)$.

We use the notion of degrees of gates in a multiplex-select circuit which is defined as:

- Definition 11. Let $E$ be a circuit such that $|E|=\langle n, n\rangle$ and $x<n$. We define the degree $\operatorname{deg}(x, E)$ recursively as

$$
\operatorname{deg}(x, E)=\left\{\begin{array}{l}
1 \quad \text { if } x \text { is an input gate } \\
\operatorname{deg}(s(x))+\operatorname{deg}(\min (c(x)+j, x-1)) \\
\quad \text { if } x \text { is an non-input and } s(x) \text { outputs } j .
\end{array}\right.
$$

It is not difficult to see that a circuit family has polynomial size proof trees if and only if it has polynomial degrees.

We define an axiom expressing the concept of multiplex-select circuits with polynomial degrees. The idea is to code the computation of a given circuit by a string $Z$ such that a

- if degree $(x)$ is bounded by a given polynomial $p(n)$ then $(Z)^{x}=\left\langle v_{x}, \operatorname{deg}(x)\right\rangle$ where $v_{x}$ is the output of $x$, and
- otherwise $(Z)^{x}=\langle n, p(n)\rangle$.

Formally, define the $L_{2}$ formula $\operatorname{MSCDeg}(n, x, E, Z)$ as the conjunction of the following formulae:

- $x$ is an input gate: $(\forall y<x) \neg E(x, y) \rightarrow(Z)^{x}=\langle x, 1\rangle$,
- $x$ is a non-input gate having a degree less than $p(n)$ :

$$
\begin{aligned}
& (\exists y<x) E(x, y) \rightarrow(Z)_{1}^{s(x)}+(Z)_{1}^{\min \left(c(x)+(Z)_{0}^{s(x)}, x-1\right)}<p(n)-1 \wedge \\
& (Z)^{x}=\left\langle(Z)_{0}^{\min \left(c(x)+(Z)_{0}^{s(x)}, x-1\right)},(Z)_{1}^{s(x)}+(Z)_{1}^{\min \left(c(x)+(Z)_{0}^{s(x)}, x-1\right)}\right\rangle,
\end{aligned}
$$

## S. Kuroda

- $x$ is an non-input gate having a degree greater than $p(n)-1$ :

$$
(\exists y<x) E(x, y) \rightarrow(Z)_{1}^{s(x)}+(Z)_{1}^{\min \left(c(x)+(Z)_{0}^{s(x)}, x-1\right)} \geq p(n) \wedge(Z)^{x}=\langle n, p(n)\rangle
$$

For a $L_{2}$-term $p(n)$ we define the formula

$$
M S C D e g-C O M P_{p}:(\forall n)(\forall E<\langle n, n\rangle)(\exists Z\langle n,\langle n, p(n)\rangle\rangle)(\forall x<n) M S C D e g(n, x, E, Z) .
$$

- Definition 12. The $L_{2}$-theory Vldcfl is $V^{0}$ extended by $M S C D e g-C O M P_{p}$ for each $\left.p \in L_{2}\right\}$.

Remark. We have made several simplifications in the formalization of multiplex-select circuits which is still general enough to capture the original ones in the sense that we can effectively compute a code $E(C, X)$ from the original circuit code $C$ and an input $X$. Firstly, we can convert an $O(\log |X|)$ input bit bundle into a number representing a gate index using $A C^{0}$ function. The original definition allows each bundle to be extended by constant bits which also can be computed in $A C^{0}$.

Finally, the restriction of date inputs bit to consecutive ones as $c(x), \ldots c(x)+j$ can express the original circuit as follows. Let $d$ be the $i$-th data input to $x$. We introduce a "copy" gate $C_{d}$ of $d$ having a single bit steering input and a single data input from $d$. Then we let the gate position of $C_{d}$ to be $c(x)+i$.

- Theorem 13. A function is $\Sigma_{1}^{B}$ definable in VLDCFL if and only if it is LOGDCFLcomputable.

To prove Theorem 13, we use the formalization as in Cook-Nguyen's Book [3]. First we show the uniqueness of proof trees.

- Lemma 14. VldCFL proves the following:

$$
\begin{aligned}
& \left(\forall Z_{0}, Z_{1}<n\right)\left[\left((\forall x<n) M S C D e g_{p}\left(n, x, E, Z_{0}\right) \wedge(\forall x<n) M S C D e g_{p}\left(n, x, E, Z_{1}\right)\right)\right. \\
& \left.\rightarrow Z_{0}=Z_{1}\right] .
\end{aligned}
$$

(Proof). Since $M S C D e g_{p}$ is a $\Sigma_{0}^{B}$ formula, this follows from an $\Sigma_{0}^{B}$-IND instance which is available in $V^{0}$. The most essential part is to show that the aggregate function for $M S C D e g_{p}$ is $\Sigma_{1}^{B}$-definable in Vldcfl.

Lemma 15. The aggregate function for $M S C D e g-C O M P_{p}$ defined by the formula

$$
(\forall n)(\forall E<\langle\langle n, n\rangle, b\rangle)(\exists Y)(\forall i<b)(\forall x<n) M S C D e g_{p}\left(n, x, E^{[i]}, Y^{[i]}\right)
$$

is $\Sigma_{1}^{B}$-definable in $V_{\text {LDCFL }}$.
The proof is more or less identical to that for $V P$ (Lemma VIII 1.10 in [3]).
Now, Theorem 13 follows from the following:

- Lemma 16. The $F A C^{0}$ closure of functions $F_{p}$ with defining axiom

$$
Y=F_{p}(n, E) \Leftrightarrow|Y| \leq t_{p} \wedge(\forall x<n) M S C D e g_{p}(n, x, E, Y)
$$

for $p \in L_{2}$, denoted by $F A C^{0}(\operatorname{Deg} P)$, is identical to $\mathcal{F}$ LOGDCFL.
(Proof). To show that $F A C^{0}(\operatorname{Deg} P) \subseteq \mathcal{F}_{\text {LOGDCFL }}$ we remark that $F_{p} \in \mathcal{F}_{\text {LOGDCFL }}$ for all $p \in L_{2}$.

For the converse inclusion, it suffices to show that any bit in $F \in \mathcal{F}$ LOGDCFL can be computed using some $F_{p}$ together with $A C^{0}$ operations.

Using the technique developed by Cook and Nguyen [3], we can prove Theorem 13 from Lemmata 14,15 and 16.

Since LOGDCFL contains L, we expect that the same inclusion holds for corresponding theories. The theory VL is axiomatized by $V^{0}$ together with the following axiom:

$$
P A T H \equiv \operatorname{Unique}(a, E) \rightarrow(\exists P \leq\langle a, a\rangle) \delta_{P A T H}(a, E, P),
$$

where

$$
\begin{aligned}
& \operatorname{Unique}(a, E) \equiv(\forall x<a)(\exists!y<a) E(x, y) \\
& \delta_{P A T H}(a, E, P) \equiv(P)^{0}=0 \wedge(\forall v<a)\left(E\left((P)^{v},(P)^{v+1}\right) \wedge(P)^{v+1}<a\right) .
\end{aligned}
$$

## - Theorem 17. Vldcfl contains VL.

(Proof). It suffices to show that Vldcfl proves PATH. Let $E$ satisfy $\operatorname{Unique}(a, E)$. For each $x<a$, we denote the unique $y$ such that $E(x, y)$ by $d(x)$. We first design a multiplex select gate such that if the steering input is $x$ then it outputs $d(x)$. This gate is easily constructed by setting its data inputs as $d(0), \ldots, d(a-1)$. Then we connect copies of this gate in a sequential manner, that is the first gate receives its steering input from $x$, the second gate from the first gate, and so on. Furthermore, all gates receive data inputs from $d(0), \ldots, d(a-1)$.

Suppose a node $b<n$ is reachable from $a$ by a path of length $k \leq n$. Then it is easy to see that there exists a gate in this circuit with label $\langle b, 2 k+1\rangle$.

The above argument can be formalized in VLDCFL, that is, for $p(n) \leq 2 n+1$, we have

$$
(\exists Z)(\forall x<n) M S C D e g_{p}\left(n, x, E^{\prime}, Z\right)
$$

where $E^{\prime}$ is the code for the above circuit. Furthermore, the witness $P$ in $P A T H$ can be constructed from $Z$ using $\Sigma_{0}^{B}$-COMP.

## 4 A theory for LOGCFL

We now turn to show that the axiomatization of proof tree size concept also captures the class LOGCFL. First we recall the following circuit characterization of the class:

- Theorem 18 (Venkateswaran [10]). LOGCFL is the class of predicates which are computable by polynomial size circuits with polynomial proof trees. This equivalence is true even if we restrict circuits to semi-unbounded fan-in.
(Proof). The former statement is due to [10]. So we prove the latter part.
It is readily seen that a circuit can be transformed to an equivalent semi-unbounded fan-in circuit with polynomial increase in size by the transformation of unbounded fan-in AND gates by a fan-in two subcircuits of $O(\log n)$ depth. It remains to show that the degree of the resulting circuit also has polynomial increase. For this, it suffices to show that the replacement of unbounded fan-in AND-gates by subcircuits yields polynomial increase in the degree, that is the number of nodes in a proof tree.

Let $g$ be an AND-gate in the original circuit having inputs from $h_{0}, \ldots, h_{k-1}$ and $g^{\prime}$ be the gate on the top of the subcircuit which is a substitute for $g$. It is not difficult to see that $\operatorname{deg}\left(g^{\prime}\right)=\operatorname{deg}(g)+k$. More strictly, this is proved by induction on the depth of gates.

## S. Kuroda

The advantage of the semi-unbounded fan-in restriction is that we do not need vector summation to compute degrees. Thus we can axiomatize our theory based on $V^{0}$ rather than VTC $^{0}$.

Now we define an axiom expressing polynomial degrees in a similar manner as VP of Cook-Nguyen [3]. First we will give an intuitive idea: Let $E$ be a two-dimensional array coding a circuit. The input-output relation of $E$ differs from that for multiplex select circuits.

Let $G(x)$ be an unary predicate determining the type of a gate $x$ so that $x$ is an AND gate if $G(x)$ and an OR gate otherwise. If $G(x)$ holds then $x$ receives inputs from two gates defined by

$$
\begin{aligned}
& c_{0}(x)=\mu y<x E(x, y) \\
& c_{1}(x)=\mu y<x\left(E(x, y) \wedge c_{0}(x)<y\right) .
\end{aligned}
$$

If $\neg G(x)$ then $x$ receives inputs from all $y<x$ such that $E(x, y)$.
The computation of a circuit given by $E$ and $G$ as above is coded by a string $Z$ such that

$$
\begin{aligned}
& (Z)^{0}=p(n),(Z)^{1}=1 \\
& (Z)^{x}= \begin{cases}\operatorname{deg}(x) & \text { if } x \text { outputs } 1 \text { and } \operatorname{deg}(x)<p(n), \\
p(n) & \text { otherwise }\end{cases}
\end{aligned}
$$

for $x \geq 2$.
Putting these altogether, we formally define

$$
\begin{aligned}
& M C V \operatorname{Deg}_{p}(n, E, G, Z) \Leftrightarrow \\
& (Z)^{0}=p(n) \wedge(Z)^{1}=1 \wedge \\
& (\forall x<n)(x \geq 2 \rightarrow \\
& \left(G ( x ) \wedge \left(\left((Z)^{c_{0}(x)}+(Z)^{c_{1}(x)}<p(n)-1 \wedge(Z)^{x}=(Z)^{c_{0}(x)}+(Z)^{c_{1}(x)}+1\right) \vee\right.\right. \\
& \left.\left((Z)^{c_{0}(x)}+(Z)^{c_{1}(x)} \geq p(n)-1 \wedge(Z)^{x}=p(n)\right)\right) \vee \\
& \left(\neg G ( x ) \wedge \left((\exists y<x)\left((Z)^{y}<p(n)-1 \wedge E(x, y) \wedge(Z)^{x}=(Z)^{y}+1\right) \vee\right.\right. \\
& \left.\left((\forall y<x)\left(E(x, y) \rightarrow(Z)^{y} \geq p(n)-1 \wedge(Z)^{x}=p(n)\right)\right)\right)
\end{aligned}
$$

where $p \in L_{2}$.

- Definition 19. We define the $L_{2}$-theory Vlcfl as

$$
V^{0}+\left\{(\forall n)(\forall E<\langle n, n\rangle)(\forall G<n)\left(\exists Z<\langle n, p(n)\rangle M C V D e g_{p}(n, E, G, Z): p \in L_{2}\right\} .\right.
$$

It is possible to define our theory using $S A C^{1}$ circuits which can be formalized by a single axiom. Nevertheless, we choose the above axiomatization as it is easier to formalize other computational concepts within the theory. First we show that our theories preserve the inclusion relation of corresponding complexity classes:

Theorem 20. VlCfl proves $C O N N$, thus it contains VNL.
(Proof). We argue in Vlcfl. Let $E<\langle n, n\rangle$ be given. We need to show that

$$
(\exists Y<\langle a, a\rangle)\left(\delta_{C O N N}(a, E, Y) \wedge Y(a, 1)\right)
$$

To this end, we construct a graph $C<\langle\langle a, a\rangle,\langle a, a\rangle\rangle$ such that

$$
C(0, x, 1, y) \leftrightarrow x=0 \wedge E(0, y)
$$

and

$$
C(n, x, m, y) \leftrightarrow m=n+1 \wedge E(x, y)
$$

hold. That is, $C$ has layers such that each column contains copies of all nodes in $E$. The edge relation of $C$ is given so that the node $x$ in $i$-th layer has an edge to the node $y$ in $(i+1)$-st layer if $E(x, y)$ holds for $i>0$. Furthermore, on 0 -th layer, only the node 0 has an edge to the node $y$ in the first layer if $E(0, y)$.

Now we show that a small modification of $C$ yields a circuit deciding the axiom CONN for VNL. In the new circuit $C^{\prime}$, we have the first column containing 0 and 1 , say $\langle 0,0\rangle$ and $\langle 1,0\rangle$, and the number of columns of $C$ are incremented by 1 in $C^{\prime}$. Thus the gate $\langle i, j\rangle$ in $C$ corresponds to $\langle i, j+1\rangle$ in $C^{\prime}$. Thus we define
$C^{\prime}(n, x, m, y) \leftrightarrow(n=0 \wedge x=1 \wedge m=1 \wedge y=0) \vee(C(n-1, x, m-1, y) \wedge n>0 \wedge m>0)$.
Furthermore, we let $G$ be such that $(\forall n)(\forall x) G(n, x)$, that is, all gates in $C^{\prime}$ are AND gates.
Now let $p(n)=n$. Then we have $(\exists Z) M C V \operatorname{Deg}_{p}\left(n, C^{\prime}, G, Z\right)$ and we have an $A C^{0}$ function $F$ such that

$$
M C V D e g_{p}\left(n, C^{\prime}, G, Z\right) \rightarrow \delta_{C O N N}(n, E, F(Z)) .
$$

Also note that a proof tree of a gate $\langle i, x\rangle$ is a path from $\langle 0,1\rangle$ to $\langle i, x\rangle$. Thus we are done.

- Theorem 21. VlCfl contains Vldcfl.
(Proof). It suffices to show that VLCfl proves $M S C D e g-C O M P_{p}$ for any $p \in L_{2}$. This is achieved by giving circuits simulating multiplex-select circuits which have polynomial size and polynomial proof size. Let $g$ be an MS-gate with the steering input $s$ and data inputs $d_{0}, \ldots, d_{l}$. The Boolean circuit $C_{g}$ simulating $g$ can be obtained by building the following subcircuits:
- $C_{\text {sel }}^{i}$ outputs 1 if the steering input $s$ selects the data input $d_{i}$, and
- $C_{\text {out }}^{j}$ outputs the bits of the $j$-th output bit of $g$.

Then $C_{g}$ is defined as

$$
\bigvee_{1 \leq i \leq l}\left(C_{s e l}^{i} \wedge C_{o u t}^{i}\right)
$$

The subcircuit $C_{\text {sel }}^{i}$ is defined as $\bigwedge_{1 \leq j \leq l} l_{j}$ where

$$
l_{i}= \begin{cases}s_{j} & \text { if the } j \text {-th bit of } i \text { is } 1 \\ \neg s_{j} & \text { otherwise. }\end{cases}
$$

Then $C_{\text {out }}^{j}$ is computed by the circuit

$$
\bigvee_{1 \leq i \leq l}\left(C_{s e l}^{i} \wedge d_{i}^{j}\right)
$$

It is not difficult to see that the above construction can be described by $\Sigma_{0}^{B}$ relation. So we have the $\Sigma_{0}^{B}$ description of the translation of the whole MS-circuit $C$ by an equivalent Boolean circuit.

Although the above construction does not yields semi-unbounded fan-in circuits, it can be converted into an equivalent $S A C^{1}$ circuit.

It is also true that VLCFL is contained in the theory for $A C^{1}$ which is axiomatized by $V^{0}$ and the axiom

$$
\left(\exists Y \leq\langle | n|+1, n\rangle \delta_{L M C V}(n,|n|, E, G, I, Y)\right.
$$

where

$$
\begin{aligned}
\delta_{L M C V}(n, d, E, G, I, Y) \equiv & (\forall x<n)(\forall z<d)((Y(0, x) \leftrightarrow I(x)) \wedge \\
(Y(z+1, x) \leftrightarrow & (G(z+1, x) \wedge(\forall u<n)(E(z, u, x) \rightarrow Y(z, u))) \vee \\
& (\neg G(z+1, x) \wedge(\exists u<n)(E(z, u, x) \wedge Y(z, u)))) .
\end{aligned}
$$

## S. Kuroda

- Theorem 22. VlCFL is contained in VAC ${ }^{1}$.

Actually, we prove a stronger theorem stating that any polynomial size circuit family with polynomial proof tree size has an equivalent $S A C^{1}$ circuits.

- Theorem 23. Let $p, q \in L_{2} . \mathbf{V T C}^{0}$ proves that for any circuit $C$ with size $p$ and proof tree size $q$ there exists a semi-unbounded circuit $C^{\prime}$ such that for any input $X$,

$$
C \text { accepts X in proof tree size } q \Leftrightarrow C^{\prime} \text { accepts } X \text {. }
$$

(Proof). We use the idea similar to realizable pairs of Ruzzo [8]. Let $g$ be a gate of $C$ and $\Gamma$ be a set of nodes in $C$. W define $C_{g, \Gamma}$ to be a directed acyclic subgraph of $C$ whose root is $g$ and sinks are $\Gamma$. We say that $C_{g, \Gamma}$ is realizable within size $p$ and proof tree size $q$ if $\operatorname{size}\left(C_{g, \Gamma}\right) \leq p$ and $g$ has proof tree size $\leq q$ provided that all nonleaves of $\Gamma$ are assigned the value 1 .

It suffices to construct a semi-unbounded fan-in circuit deciding whether $C_{g, \Gamma}$ is realizable with size $p$ and proof tree size $q$ since $C=C_{g_{0}, \emptyset}$ is realizable if and only if $C$ accepts the input where $g_{0}$ is the output gate of $C$. So we construct a recursive procedure realize ( $C, p, q$ ) which works as follows:

In each recursive step, realize $(C, p, q)$ splits $C$ into two parts $C_{0}$ and $C_{1}$ each having approximately half size of $C$ and recursively check whether realize $\left(C_{0}, p / 2, i\right)$ and $\operatorname{realize}\left(C_{1}, p / 2, j\right)$ where $i+j=q$. This procedure is given as follows:
procedure realize(C,p,q);
begin
if $C$ consists of a single gate $g$ then if $g$ has the value 1 and $p, q \geq 1$
then ACCEPT else REJECT
else
guess $i$ with $0<i<q$ and a gate $s$ in $C$;
$C_{0}:=$ subcircuit of $C$ rooted at $s ; C_{1}:=C \backslash C_{0}$ with $s$ replaced by 1 ;
check in parallel
realize $\left(C_{0}, p / 2, i\right)$ AND realize $\left(C_{1}, p / 2, q-i\right)$
end

We will show that this algorithm can be converted into logarithmic depth semi-unbounded fan-in circuits.

First we remark that checking whether $\operatorname{size}(C)<p$ can be done by an $N C^{1}$ circuit using threshold circuits.

Since the number of the nestings of recursive calls is logarithmic in $p$, the total number of subcircuits $C_{0} \mathrm{~S}$ and $C_{1} \mathrm{~s}$ created along the execution of the procedure is bounded by polynomial. Thus the number of gates used in checking circuit sizes is polynomially bounded.

Now each recursive steps can be expressed by an AND of $\operatorname{size}(C)<p$ and the OR of the following subroutines:

- the base case with a single node $g$,
- realize $\left(C_{0}, q / 2, i\right)$ AND realize $\left(C_{1}, p / 2, q-i\right)$ for $0<i<q$ and all nodes in $C$.

The fan-in of this $O R$ gate is $O(p \cdot q)$ which is polynomial. Furthermore, all AND gates required is bounded fan-in.

By recalling that the number of the nestings of recursive calls is $\log p=O(\log n)$, we conclude that the procedure realize can be transformed into an $S A C^{1}$ circuit family.

It is not hard, though tedious, to show that the above construction can be demonstrated by $\Sigma_{0}^{B}$ relation. To check the correctness of the construction, we need counting argument for the evaluation of circuit sizes. So the whole argument can be formalized in $\mathbf{V T C}^{\mathbf{0}}$.

- Corollary 24. Vlcfl and $V-Q^{S A C}$ proves the same $L_{2}$ sentences.

Now we show that Vlcfl corresponds to LOGCFL.

- Theorem 25. A function is $\Sigma_{1}^{B}$ definable in $V_{\text {LCFL }}$ if and only if it in $\mathcal{F}_{\text {LOGCFL }}$.

Note that the witness $Z$ for $M C V D e g_{p}$ axiom is not always optimal in a sense that the degree of the output of $Z$ is larger than $p(n)$ while there is another witness $Z^{\prime}$ whose output degree is less than or equal to $p(n)$. Nevertheless, we can avoid such undesirable cases by considering $S A C^{1}$ circuits.

- Theorem 26 (Definability for VLCFL). If a function of polynomial growth is bitwise computable in LOGCFL then it is $\Sigma_{1}^{B}$ definable in VLCFL.
(Proof Sketch). In [6], it is shown that $V-Q^{S A C}$ proves that $S A C^{1}$ is closed under complementation. By Corollary 24, this is also provable in VLCFL.

Moreover, any $S A C^{1}$ circuit $C$ is a polynomial size circuit having polynomial proof tree size such that for any input $X$ and any $Z$ witnessing $M C V P$ axiom for $C, C$ outputs true if and only if the degree of the output gate is less than $p(|X|)$ for a given proof tree bound $p$.

Let $F(X)$ be a LOGCFL function with polynomial growth. To show that VLCFL $\Sigma_{1}^{B}$ defines $F$, it suffices to show that circuits can be combined into a multi output circuit. This can be done using circuits and their complementary circuits which is also an $S A C^{1}$ circuit provably in Vlcfl.

For witnessing we use the argument using the conservative universal extension $\widehat{\text { VLCFL }}$ of Vlcfl.

- Theorem 27 (Witnessing for VLCfL). If a function is $\Sigma_{1}^{B}$ definable in VLCfl then it is of polynomial growth and bitwise computable in LOGCFL.
(Proof). We define the universal conservative extension $\widehat{\widehat{V L C F L}}$ by introducing for each $p \in L_{2}$ a function symbol $F_{p}$ such that

$$
F_{p}(n, E, G)=Z \leftrightarrow M C V D e g_{p}(n, E, G, Z)
$$

Using Herbrand-style argument, we can show that provably total functions of $\widehat{\text { VLCFL }}$ are in the $A C^{0}$ closure of $\left\{F_{p}: p \in L_{2}\right\}$. As LOGCFL is closed under $A C^{0}$ operations, it suffices to show that $F_{p}$ is witnessed by a LOGCFL algorithm. In order to compute $F_{p}$ we modify NAuxPDA machine simulating polynomial size circuits with polynomial proof trees as in [7]. The machine makes an depth-first search through a circuit and at each step, it computes the degree of the currently visiting gate using the work tape. This can be done in logarithmic space by expressing degrees in binary. Therefore, this gives an NAuxPDA computing $F_{p}$.

## 5 Finite axiomatizability

In this section we show that the theory VLCFL is finitely axiomatizable. The idea is to formalize LOGCFL complete problems such as acyclic conjunctive query problem (Gottlob et.al. [4]) or word problem for finite groupoids (Bedard et.al. [1]).

## S. Kuroda

First we modify the theory VLCFL so that it can argue about circuit families which have $\Sigma_{0}^{B}$ direct connection languages. For an $L_{2}$-formula $\varphi(x, y)$ we define the string

$$
E_{\varphi}(n)(x, y) \Leftrightarrow x<n \wedge y<n \wedge \varphi(x, y) .
$$

Similarly for an $L_{2}$-formula $\psi(x)$ we define the string

$$
G_{\psi}(n)(x) \Leftrightarrow x<n \wedge \psi(x) .
$$

Then we define the theory

$$
\mathbf{V L C F L}^{\prime} \equiv V^{0}+\left\{(\forall n)\left(\exists Z<\langle n, p(n)\rangle M C V \operatorname{Deg}_{p}\left(n, E_{\varphi}(n), G_{\psi}(n), Z\right): \varphi, \psi \in \Sigma_{0}^{B}, p \in L_{2}\right\} .\right.
$$

It is not difficult to see that

- Lemma 28. VLCFL $^{\prime}=$ VLCFL .

Using VLCfL' we have

- Theorem 29. Vlcfl' and hence Vlcfl is finitely axiomatizable.
(Proof Sketch). We can formalize the acyclic conjunctive query problem in $L_{2}$ and define the direct connection language $\varphi$ and $\psi$ of the polynomial size circuit family deciding it in proof tree size $p(n)$. Then in VLCFL' we can show that any circuit family with polynomial proof tree size can be reduced to it. Thus VLCFL $^{\prime}$ is finitely axiomatizable.
Remark. Although it is likely that the same argument holds for Vldcfl, we do not know whether there exists a complete problem for LOGDCFL.


## 6 Provability of the pumping lemma for CFLs

We now turn to the problem of the provability of the pumping lemma. Intuitively, the pumping lemma states that a sufficiently long word in a context-free language can be iterately "pumped up" to another word in the language. More precisely,

- Theorem 30 (The pumping lemma for CFLs). Let $G=(N, T, P, S)$ be a context-free grammar in Chomsky normal form. Suppose $W \in L(G)$ is such that $|W| \geq 2^{m-1}+1$ where $m=|N|$. Then there exist $W_{1}, \ldots, W_{5}$ such that $W=W_{1} \cdots W_{5},\left|W_{2} W_{4}\right| \geq 1$, $\left|W_{2} W_{3} W_{4}\right| \leq 2^{m}$ and $W_{1} W_{2}^{i} W_{3} W_{4}^{i} W_{5} \in L(G)$ for all $i \in \omega$.

In order to formalize Theorem 32, we need expressions which refer to the exponentiation like $y \geq 2^{x}$. Recall that the relation $y=2^{x}$ is $\Delta_{0}$ definable in $I \Delta_{0}$ although the function is not total in the theory. Likewise, in $L_{2}$ we can define the exponentiation relation using a $\Sigma_{0}^{B}$ formula. In particular, the assumption $|W| \geq 2^{s-1}+1$ in Definition 31 below is expressible in the language of VLcfl.

Next we show how to code context-free grammars. Let $G=(N, T, P, S)$ be a CFG in Chomsky normal form. Since $N$ and $T$ are finite sets, We code it as $T=\{0, \ldots n-1\}$, $N=\{n, \ldots, n+s\}, S=n$. Recall that rules of context-free grammars in Chomsky normal form are either of the form $a \rightarrow b c$ or $a \rightarrow l$ for $a, b, c \in N$ and $l \in T$. We let $P \subseteq N \times\left(T \cup N^{2}\right)$ so that

$$
P(a, x) \Leftrightarrow a \rightarrow x \text { is a rule in } P .
$$

We also write $a \rightarrow_{P} x$ instead of $P(a, x)$.
A parse tree $T$ of height $l$ is a layered binary tree such that each row $T^{[i]}$ is obtained from $T^{[i-1]}$ by applying rules in $P$ to nonterminal symbols simultaneously. Each symbol $T^{[i]}[k]$
is of the form $\left\langle n, k^{\prime}\right\rangle$ denoting that $u \leq n+s$ is a symbol assigned to the slot and $k^{\prime}$ is the index of its parent in $T^{[i-1]}$, where $X[y]$ is the value of $X$ at $y$ (denoted by $(X)^{y}$ in [3]).

More precisely, $T$ is a parse tree if the following conditions hold:

- $T^{[0]}$ consists of a single nonterminal symbol $a$ with $n \leq a \leq n+s$ and for all $i>0$ and $k$, $T^{[i]}[k]=\left\langle u, k^{\prime}\right\rangle$ for some $u \leq n+s$ and $k^{\prime}$.
- If $T^{[i]}[k]=\left\langle u, k^{\prime}\right\rangle$ with $u \geq n$ then $T^{[i]}[k+j]=\left\langle v, k^{\prime}\right\rangle$ for $v \geq n$ and exactly one $j \in\{-1,1\}$ such that $w \rightarrow_{P} u v$ for $w=\left(T^{[i-1]}\left[k^{\prime}\right]\right)_{0}$. Moreover, either one of the following holds:
= there exists a unique $k^{\prime \prime}$ such that $T^{[i+1]}\left[k^{\prime \prime}\right]=\langle v, k\rangle, v<n$ and $u \rightarrow_{P} v$,
- there exists a unique $k^{\prime \prime}$ such that $T^{[i+1]}\left[k^{\prime \prime}\right]=\langle v, k\rangle, T^{[i+1]}\left[k^{\prime \prime}+1\right]=\left\langle v^{\prime}, k\right\rangle, n \geq$ $v, v^{\prime} \geq n+s$ and $u \rightarrow_{P} v v^{\prime}$.
- If $T^{[i]}[k]=\left\langle u, k^{\prime}\right\rangle$ with $u<n$ then there exists a unique $k^{\prime \prime}$ such that $T^{[i+1]}\left[k^{\prime \prime}\right]=\langle v, k\rangle$ whenever $i \leq l$ where $l$ is the number of rows in $T$. In addition, one of $\left(T^{[i-1]}\left[k^{\prime}\right]\right)_{0}=u$ or $\left(T^{[i-1]}\left[k^{\prime}\right]\right)_{0} \rightarrow u$ holds.
- For any $i$ and $k<k^{\prime}$ if $T^{[i+1]}[c]$ and $T^{[i+1]}[c]$ are children of $T^{[i]}[k]$ and $T^{[i]}\left[k^{\prime}\right]$ then $c<c^{\prime}$.
- $\left.\left(T^{[l]}[k]\right)\right)_{0}<n$ for all $k$.

Based on the above codings, it is easy, though tedious, to see that there exists a $\Sigma_{0}^{B}$ formula
$\operatorname{Parse}(a, l, n, s . P, T, X) \equiv$
$T$ is a parse tree of length $l$ starting from $a$ which generates $X$.

- Definition 31. We define $P L(n, s, P)$ to be the following formula:

$$
\begin{aligned}
& (\forall X)(\exists T) \operatorname{Parse}(s, l, n, n, T, P, X) \wedge|X| \geq 2^{s-1}+1 \\
& \rightarrow\left(\exists X_{1}, X_{2}, X_{3}, X_{4}, X_{5}\right)\left(X=X_{1} X_{2} X_{3} X_{4} X_{5} \wedge\left|X_{2} X_{4}\right| \geq 1 \wedge\left|X_{2} X_{3} X_{4}\right| \leq 2^{s} \wedge\right. \\
& \left.(\forall i)\left(\exists l^{\prime}\right)\left(\exists T^{\prime}\right) \operatorname{Parse}\left(a, l^{\prime}, n, n, P, T^{\prime}, X_{1} X_{2}^{i} X_{3} X_{4}^{i} X_{5}\right)\right) .
\end{aligned}
$$

As previously stated, this is provable in Vlcfl, namely,

- Theorem 32. Vlcfl proves $(\forall n)(\forall s)(\forall P) P L(n, s, P)$.

The main idea of the proof is to simulate CFGs by circuits so that the transformation of parse trees as in the standard proof of the pumping lemma as in [5] can be interpreted in terms of proof tree. Let $(E, G)$ be a code of a circuit as in the axiom $M S C D e g_{p}$ but with $n$ input gates whose values are unspecified and $X$ be a string with $|X|=n$. Then we define $(E, G)$ with input $X$ by $\left(E_{X}, G_{X}\right)$. Our main tool is the following:

- Lemma 33. There exists an $A C^{0}$ function $F(m, n, s, P)=(a, E, G)$ which is $\Sigma_{1}^{B}$ definable in $V^{0}$ and an $L_{2}$-term $r(n, s, P, X)$ such that $(E, G)$ is a code of a semi-unbounded fan-in circuit with $m$ unspecified input gates and VLCFL proves

$$
\begin{aligned}
& (\forall X)[(\exists T) \operatorname{Parse}(s, l, n, n, P, T, X) \\
& \left.\leftrightarrow(\exists Z)\left(M C V P_{r}\left(a, E_{X}, G_{X}, Z\right) \wedge Z[a]<r(|X|, n, s, P)\right)\right] .
\end{aligned}
$$

Moreover, there exists an $A C^{0}$ function $T(Z)$ such that $V^{0}$ proves the following:
$(\forall X)\left[\left(M C V P_{r}\left(a, E_{X}, G_{X}, Z\right) \wedge Z[a]<r(|X|, n, s, P)\right) \rightarrow \operatorname{Parse}(s, l, n, n, T(Z), P, X)\right]$.
(Proof Sketch). As in Example 1 in page 220 of Ruzzo [8], we construct a circuit checking whether an input string belongs to a given context-free language.

## S. Kuroda

We construct a circuit which simulates a given CFG $G=(n, s, P)$ on input $X$. The idea is to introduce a gate which checks whether $a \Rightarrow^{*} x_{i} \cdots x_{j}$ for each nonterminal symbol $a$ and a substring $x_{i} \cdots x_{j}$ of $X$.

In particular, the output gate is an OR gate checking whether $n \Rightarrow{ }^{*} x_{1} \cdots x_{n}$ (Recall that the start symbol is coded by $n$ ). This is done by introducing an AND gate for each rule $n \rightarrow a b$ and a partition $x_{1} \cdots x_{i} / x_{i+1} \cdots x_{n}$. This AND gate gives an output to the output OR gate and receives inputs from gates checking $a \Rightarrow^{*} x_{1} \cdots x_{i}$ and $b \Rightarrow^{*} x_{i+1} \cdots x_{n}$.

Gates which check whether $a \Rightarrow^{*} x_{i} \cdots x_{j}$ are defined similarly. Furthermore, at the input level, we connect subcircuits deciding that a given binary string is a code of a terminal symbol.

It is easy to see that the edge relation of this circuit is $A C^{0}$ computable. Also it can be seen that it has polynomial size and polynomial proof size and semi-unbounded fan-in.

We also remark that the above construction of the circuit ensures that its proof tree corresponds to a parse tree of $n \Rightarrow^{*} X$, so it is easy to construct an $A C^{0}$ function transforming a proof tree into a parse tree.
(Proof of Theorem 32) It suffices to show that the transformation of parse trees in the standard proof of the pumping lemma can be paraphrased in terms of proof trees in the circuit $F(n, s, P, X)$ of Lemma 33.

First we note that by Lemma 33, it follows that the predicate $(\exists T) \operatorname{Parse}(s, l, n, n, T, P, X)$ is equivalent to an open formula in the universal conservative extension $\widehat{\mathbf{V L C F L}}$. So in the following we argue in $\widehat{\text { VLCFL }}$.

Let $X \in L(G)$ be such that $|X| \geq 2^{s-1}+1$. First we claim that there exists a partition $X_{1}, \ldots, X_{5}$ of $X$ as in the proof of the pumping lemma.

By assumption, $(\exists T) \operatorname{Parse}(s, l, n, n, P, T, X)$ holds for some $l$. So by Lemma 33 we have

$$
(\exists Z)\left(M C V P_{r}\left(a, E_{x}, G_{x}, Z\right) \wedge Z[a]<r(m, n, s, P)\right.
$$

Since the circuit ( $E_{X}, G_{X}$ ) is semi-unbounded fan-in, we can extract a proof tree from $X$ which is binary branching.
Claim. There exists a path $R$ from the output to some input in $R$ which is longer than $s$.
(Proof of Claim). The construction of the circuit $\left(E_{X}, G_{X}\right)$ ensures that any proof tree contains all bits of $X$. So by a combinatorial argument formalized in VLCFL we have the claim.

From the above claim and the pigeonhole principle for $l(R) \rightarrow s$ where $l(R)$ is the length of the path $R$, we can choose two occurrences of some nonterminal symbol $c$.

In order to choose the position of such occurrences, we slightly modify the construction of the circuit in Lemma 33 so that the OR gate checking $a \Rightarrow^{*} x_{i} \cdots x_{j}$ is assigned a label $a$. Note that this modification is also $A C^{0}$ computable.

Let $X^{\prime}$ and $X^{\prime \prime}$ be substrings of $X$ generated by the first and the second occurrences of $c$ respectively. We can execute these substrings using the PATH axiom. Moreover, the subtree starting from a given node is also computable by PATH.

Let $X_{1}, X_{2}, X_{3}, X_{4}, X_{5}$ be such that $X=X_{1} X^{\prime} X_{5}, X^{\prime}=X_{2} X^{\prime \prime} X_{4}$ and $X_{3}=X^{\prime \prime}$. Define

$$
\operatorname{pump}(X, i)=\operatorname{pump}\left(X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, i\right)=X_{1} X_{2}^{i} X_{3} X_{4}^{i} X_{5}
$$

It is easy to see that pump is $\Sigma_{1}^{B}$ definable in $V^{0}$.
Now we show that $\widehat{\text { VLCFL }}$ proves the following which we denote by $(\forall i) P T(i, X)$ :

$$
\begin{aligned}
& (\forall i<|X| m)(\exists Z)\left(M C V P_{p}\left(a_{\text {pump }(X, i)}, E_{\text {pump }(X, i)}, G_{\text {pump }(X, i)}, Z\right)\right. \\
& \wedge Z[r(n,|s|, P, \operatorname{pump}(i, X))]<p(|\operatorname{pump}(i, X)|)) .
\end{aligned}
$$

Let $T^{\prime}$ and $T^{\prime \prime}$ be the parse tree of $X^{\prime}$ and $X^{\prime \prime}$ respectively. From a parse tree $T_{i}$ of $\operatorname{pump}(X, i)$ we can construct a parse tree $T_{i+1}$ of $\operatorname{pump}(X, i+1)$ by replacing the parse tree of $T^{\prime \prime}$ by $T^{\prime}$. Again we note that this operation is $A C^{0}$ computable. So we have

$$
(\forall i)(P T(i, X) \rightarrow P T(i+1, X)) .
$$

It is easy to see that $P T(0, X)$. Recall that $P T(i, X)$ is equivalent to a $\Sigma_{0}^{B}$ formula $\varphi(i, X)$ in the extended language and $\widehat{\text { VLCFL }}$ proves $\Sigma_{0}^{B}$ induction. So we conclude that $(\forall i) P T(i, X)$. Thus again by Lemma 33 we have

$$
\left(\forall i<|X|_{m}\right)(\exists T) \operatorname{Parse}(|s|, l, n,|s|, P, T, X) .
$$

which proves the theorem.

## 7 Concluding Remarks

It is an interesting problem to determine the lower bound on the provability of the pumping lemma for context-free languages. For our theories one might conjecture that Vldcfl does not prove the theorem $(\forall n)(\forall s)(\forall P) P L(n, s, P)$.

As we have pointed out in the proof of Theorem 32, we need several combinatorial tools which lie within the theory for NL in order to execute the proof of the pumping lemma. Also it is essential in the proof that context free languages are definable in the theory.

Based on these observations, we come to the problem of whether VLDcFl $+P A T H$ proves $(\forall n)(\forall s)(\forall P) P L(n, s, P)$. Note that if it is not the case then VLDCFL $+P A T H$ cannot define (nondeterministic) context-free languages which gives a strong evidence implying that both NL and LOGDCFL is strictly contained in LOGCFL.

## References

1 F. Bédard, F. Lemieux and P. McKenzie, Extensions to Barrington's M-program model. Theoretical Computer Science, 107(1), 1993, pp.31-61.
2 A.Borodin, S.A.Cook, P.W.Dymond, W.L. Ruzzo, and M.Tompa, Two Applications of Inductive Counting for Complementation Problems. SIAM J. Comput. 18, 1989, pp. 559578
3 S.A.Cook and P.Nguyen, Logical Foundations of Proof Complexity. Cambridge University Press 2010.
4 G. Gottlob, N. Leone and F. Scarcello, The complexity of acyclic conjunctive queries. Journal of the ACM 48(3) 2001, pp. 431-498.
5 J. E. Hopcroft and J. D. Ullman, Introduction to automata theory, languages and computation. Addison-Wesley, 1979.
6 S. Kuroda, Generalized Quantifier and a Bounded Arithmetic theory for LOGCFL. Archive for Mathematical Logic, 46(5-6), 2007, pp489-516.
7 P. McKenzie, K. Reinhardt and V. Vinay, Circuits and Context-Free languages. Computing and Combinatorics. Lecture Notes in Computer Science, 1627, 1999, pp.194-203
8 W.L. Ruzzo, Tree-size bounded alternation. Journal of Computer and System Sciences, 21(2), 1980 pp.139-154.
9 I. Sudborough, On the tape complexity of deterministic context-free language, Journal of ACM, 25(3), 1978, pp.405-414.
10 H. Venkateswaran, Properties that characterize LOGCFL. Journal of Computer and System Sciences, 43(2), 1991, pp.380-404.

