We prove that $T_{NC^1}$, the true universal first-order theory in the language containing names for all uniform $NC^1$ algorithms, cannot prove that for sufficiently large $n$, SAT is not computable by circuits of size $n^{4kc}$ where $k \ge 1$, $c \ge 2$, unless each function $f \in \mathrm{SIZE}(n^k)$ can be approximated by formulas $\{F_n\}$ of size $2^{O(n^{1/c})}$ with subexponential advantage.
Introduction
We investigate the provability of polynomial circuit lower bounds in weak fragments of arithmetic including Buss's [1] theory $S^1_2$. Behind the study of these theories is the general question whether the existential quantifiers in complexity-theoretic statements can be witnessed feasibly, and moreover so that deriving the witnessing does not require exceeding feasible reasoning.
Informally, our formalization of n k -size circuit lower bounds for SAT, denoted by LB(SAT, n k ), has the following form:
$$\forall n > n_0\ \ \forall \text{ circuit } C \text{ with } n \text{ inputs and size } n^k\ \ \exists y, a \text{ such that } (C(y) = 0 \wedge SAT(y, a)) \vee (C(y) = 1 \wedge \forall z\, \neg SAT(y, z))$$

where $n_0$, $k$ are constants and $SAT(y, z)$ means that $z$ is a satisfying assignment to the propositional 3CNF formula $y$, see Section 2.
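The informal condition above can be checked by brute force on toy instances. The following is a minimal sketch, assuming a 3CNF is given as a list of clauses of signed integers and a "circuit" is any Python function from formulas to 0/1; the names `sat`, `num_vars` and `is_lb_witness` are hypothetical and are not the formal encoding of Section 2.

```python
from itertools import product

# Assumed toy encoding: a 3CNF is a list of clauses, a clause is a tuple of
# nonzero ints (+v for variable v, -v for its negation), variables are 1..m.

def sat(y, a):
    """SAT(y, a): the 0/1 tuple a (a[v-1] is the value of variable v) satisfies y."""
    return all(any((a[abs(l) - 1] == 1) == (l > 0) for l in clause) for clause in y)

def num_vars(y):
    """Number of variables of y (largest variable index occurring in y)."""
    return max((abs(l) for clause in y for l in clause), default=0)

def is_lb_witness(circuit, y, a):
    """True iff (C(y) = 0 and SAT(y, a)) or (C(y) = 1 and no z satisfies y)."""
    m = num_vars(y)
    if circuit(y) == 0:
        return len(a) == m and sat(y, a)
    # Exhaustive search over all assignments; exponential, for illustration only.
    return not any(sat(y, z) for z in product((0, 1), repeat=m))

always_yes = lambda y: 1   # toy "circuit" claiming every formula is satisfiable
always_no = lambda y: 0    # toy "circuit" claiming every formula is unsatisfiable

unsat = [(1, 1, 1), (-1, -1, -1)]       # x1 and not x1
satisfiable = [(1, 2, 2), (-1, 2, 2)]   # forces x2 = 1
```

Here `is_lb_witness(always_yes, unsat, ())` and `is_lb_witness(always_no, satisfiable, (0, 1))` both hold, while a pair on which the circuit answers correctly is not a witness.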
If S 1 2 proves the formula LB(SAT, n k ) for some constant n 0 , then by the usual kind of witnessing, Buss's witnessing [1] or the KPT theorem [12] , for any n k -size circuit with n inputs we can efficiently find a formula of size n on which the circuit fails to solve SAT, see Proposition 4.1.
One could hope to use the p-time algorithm to derive a contradiction with some established hardness assumption, however, Atserias and Krajíček noticed that the same p-time algorithm follows from standard cryptographic conjectures, see Proposition 4.2. (Actually, as discussed in Section 4, a randomized version of such observations appeared already in Buss [3, Section 4.4] and Cook-Mitchell [6, Section 6] .) It is an interesting question to ask how strong theories are needed to derive these conjectures.
We do not know how to obtain the unprovability of SAT circuit lower bounds in S 1 2 but we can do it basically for any weaker theory with stronger witnessing properties. We present it in the case of theory T N C 1 which is the true universal first-order theory in the language containing names for all uniform N C 1 algorithms.
In theories weaker than S 1 2 , like the theory T N C 1 , the situation is less natural because they cannot fully reason about p-time concepts. In particular, some universal quantifiers in LB(SAT, n k ) can be replaced by existential quantifiers without changing the intuitive meaning of the sentence. The resulting formula LB ∃ (SAT, n k ) (defined in Section 5) is equivalent to LB(SAT, n k ) in S 1 2 but not necessarily in T N C 1 . This is because LB ∃ (SAT, n k ) asserts among other things the existence of computations of general n k -size circuits, a fact which may not be T N C 1 -provable. Therefore, it is essentially trivial to obtain a conditional unprovability of LB ∃ (SAT, n k ) in T N C 1 , see Proposition 6.1. This is not the case with the formalization LB(SAT, n k ) and in this sense it is easier and more suitable for the theory T N C 1 to reason about LB(SAT, n k ).
The main result of this paper is that we can obtain a conditional unprovability of $LB(SAT, n^k)$ as well. We show that $LB(SAT, n^{4kc})$ for $k \ge 1$, $c \ge 2$ is unprovable in $T_{NC^1}$ unless each function $f \in \mathrm{SIZE}(n^k)$ can be approximated by formulas $F_n$ of size $2^{O(n^{1/c})}$ with subexponential advantage: $P_{x \in \{0,1\}^n}[F_n(x) = f(x)] \ge 1/2 + 1/2^{O(n^{1/c})}$. The proof will be quite generic. In particular, using known lower bounds on the PARITY function, we will obtain that, unconditionally, $V^0$ cannot prove quasipolynomial ($n^{\log n}$-size) circuit lower bounds on SAT. Here, $V^0$ is a second-order theory of bounded arithmetic such that its provably total functions are computable in $AC^0$, see Section 5.
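To make the notion of approximation with an advantage concrete, the quantity $P_x[F_n(x) = f(x)]$ can be computed exactly for small $n$. This is only an illustrative sketch: the functions `majority`, `first_bit` and `parity` are toy stand-ins chosen here, not the functions or formulas of the main theorem.

```python
from fractions import Fraction
from itertools import product

def agreement(F, f, n):
    """Exact value of P_{x in {0,1}^n}[F(x) = f(x)] by enumerating all inputs."""
    hits = sum(F(x) == f(x) for x in product((0, 1), repeat=n))
    return Fraction(hits, 2 ** n)

def advantage(F, f, n):
    """How far the agreement probability exceeds 1/2."""
    return agreement(F, f, n) - Fraction(1, 2)

parity = lambda x: sum(x) % 2
first_bit = lambda x: x[0]
majority = lambda x: int(2 * sum(x) > len(x))
```

On three bits, `majority` agrees with `first_bit` on 6 of the 8 inputs, so its advantage is 1/4, whereas `first_bit` has no advantage in predicting `parity`.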
To prove our main theorem we firstly observe that by the KPT theorem [16] the provability of LB(SAT, n 4kc ) in universal theories like T N C 1 gives us an O(1)-round Student-Teacher (S-T) protocol finding errors of n 4kc -size circuits attempting to compute SAT. Then, in particular, it works for n 4kc -size circuits encoding Nisan-Wigderson (NW) generators based on any function f ∈ SIZE(n k ) and any suitable design matrix [17] . The interpretation of NW-generators as p-size circuits comes from Razborov [20] . In this situation we apply Krajíček's proof from [15] showing that certain NW-generators are hard for the true universal theory T P V in the language containing names for all p-time algorithms. This is the main technique we use. We show that it works in our context as well and allows us to use the S-T protocol to compute f by subexponential formulas with a subexponential advantage.
Perhaps the most significant earlier result of this kind was obtained by Razborov [19] . Using natural proofs he showed that theory S 2 2 (α) cannot prove superpolynomial circuit lower bounds on SAT unless strong pseudorandom generators do not exist. In fact, his proof works even for sufficiently big polynomial circuit lower bounds. The second-order theory S 2 2 (α) is however quite weak with respect to the formalization Razborov used. As far as we know his technique does not imply the unprovability of circuit lower bounds (formalized as here, see Section 2) even for V 0 . In this respect, our proof applies to much stronger theories, basically to any theory weaker than S 1 2 in terms of provably feasible functions.
The paper is organized as follows. In Section 2 we formalize circuit lower bounds in the language of bounded arithmetic. In Section 3 we define a conservative extension of the theory S 1 2 denoted S 1 2 (bit) and state its properties. In Section 4 we discuss the provability of circuit lower bounds in S 1 2 (bit). Section 5 defines subtheories of S 1 2 (bit) for which we prove our main unprovability results in Section 6.
Formalization
The usual language of arithmetic contains well known symbols: $0, S, +, \cdot, =, \le$. To encode reasoning about computation it is natural to consider also the symbols $\lfloor x/2 \rfloor$, $|x|$ for the length of the binary representation of $x$, and $\#$ with the intended meaning $x \# y = 2^{|x| \cdot |y|}$. Theories of bounded arithmetic are typically defined using the language $L = \{0, S, +, \cdot, =, \le, \lfloor x/2 \rfloor, |x|, \#\}$, cf. Buss [1]. We will also consider the language $L_{bit}$ which contains in addition the symbol $x_i$ for the $i$-th bit of the binary representation of $x$. The basic properties of symbols from $L_{bit}$ are captured by a set of basic axioms BASIC(bit) which we will not spell out, cf. [1, 13]; e.g. chapter 5.2 in Krajíček [13] states the axioms for symbols in $L$ and chapter 5.4 in [13] gives a construction of a formula in the language $L$ defining the $i$-th bit of the binary representation of $x$, which we use here as an axiom.
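The intended meaning of the non-standard symbols can be illustrated on ordinary integers. This is a small sketch with hypothetical function names; in particular, indexing bits from the least significant one is an assumption made here, not a convention fixed by the text.

```python
def length(x):
    """|x|: length of the binary representation of x (so |0| = 0)."""
    return x.bit_length()

def half(x):
    """Floor of x/2."""
    return x // 2

def smash(x, y):
    """x # y = 2^(|x| * |y|)."""
    return 1 << (length(x) * length(y))

def bit(x, i):
    """x_i: the i-th bit of x, counted from the least significant bit (assumption)."""
    return (x >> i) & 1
```

For example, `length(5)` is 3 and `length(3)` is 2, so `smash(5, 3)` equals $2^6 = 64$. The smash function is what lets terms grow to size $2^{|x|^c}$, i.e. polynomially in the length of $x$.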
We say that a quantifier is sharply bounded if it has the form $\exists x, x \le |t|$ or $\forall x, x \le |t|$ where $t$ is a term not containing $x$. A quantifier is bounded if it is existential bounded: $\exists y, y \le t$, or universal bounded: $\forall y, y \le t$, where $y$ does not occur in $t$. The classes $\Sigma^b_i$ and $\Pi^b_i$ of bounded formulas are defined as in Buss [1]. In words, the complexity of bounded formulas in the language $L$ (formulas with all quantifiers bounded) is defined by counting the number of alternations of bounded quantifiers, ignoring the sharply bounded ones.
All NP resp. coNP properties are representable by Σ b 1 resp. Π b 1 formulas, cf. [11, 21, 22] . formulas in the theory called P V 1 , cf. [4, 13] , see also Section 3.
We will now express circuit lower bounds in L bit .
Firstly, denote by Comp(C, y, w) a Σ b 0 (bit)-formula saying that w is a computation of circuit C on input y. Such a formula can be constructed in many ways and our results work for any Σ b 0 (bit) formalization. For simplicity, we present here a less efficient one where C represents a directed graph on |w| vertices.
Let the bit $C_{[i,j]}$, where $[i, j]$ is a pairing function, be 1 if and only if there is an edge in circuit $C$ going from the $i$-th vertex to the $j$-th vertex. For $k < |w|$, let $N_C(k)$ be the pair of bits $(C_{[|w|,|w|]+2k}, C_{[|w|,|w|]+2k+1})$ encoding the connective in the $k$-th node of circuit $C$, say $(0, 1)$ is $\wedge$, $(1, 0)$ is $\vee$, and $(1, 1)$ and $(0, 0)$ are $\neg$. Therefore, $|C| = [|w|, |w|] + 2|w|$. Then let $Circ(C, y, w)$ be the formula stating that $C$ encodes a $|w|$-size circuit with $|y|$ inputs:
which means that if the j-th node of C is ∧ or ∨, there are exactly two previous nodes i, k of C with edges going from i and k to j, if the j-th node of C is ¬, there is exactly one previous node i with an edge going from i to j. Comp(C, y, w) says that for each i < |y| the value of w i is the value of the i-th input bit of y and each w j is an evaluation of the j-th node of circuit C given w k 's evaluating nodes connected to the j-th node:
The formula $C(y; w) = 1$, stating that $w$ is an accepting computation of circuit $C$ on input $y$, is $Comp(C, y, w) \wedge w_{|w|-1} = 1$. Similarly for $C(y; w) = 0$.
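The content of $Circ$ and $Comp$ can be sketched as a local check of a claimed computation. The sketch below assumes a simplified circuit representation (a dictionary with the number of inputs and a gate list) instead of the bit-level encoding via a pairing function; all names are hypothetical.

```python
def comp(circuit, y, w):
    """Comp(C, y, w): w is the computation of circuit C on input y.

    Assumed representation: circuit = {"n": number of inputs,
    "gates": [(op, preds), ...]} where gate number t is node n + t and
    preds are indices of earlier nodes.
    """
    n, gates = circuit["n"], circuit["gates"]
    if len(y) != n or len(w) != n + len(gates):
        return False
    # The first n nodes carry the input bits.
    if any(w[i] != y[i] for i in range(n)):
        return False
    for j, (op, preds) in enumerate(gates, start=n):
        # Well-formedness (the role of Circ): edges come from previous nodes,
        # two of them for "and"/"or", one for "not".
        if any(p < 0 or p >= j for p in preds):
            return False
        vals = [w[p] for p in preds]
        if op == "and" and len(vals) == 2:
            expected = vals[0] & vals[1]
        elif op == "or" and len(vals) == 2:
            expected = vals[0] | vals[1]
        elif op == "not" and len(vals) == 1:
            expected = 1 - vals[0]
        else:
            return False
        if w[j] != expected:
            return False
    return True

def accepts(circuit, y, w):
    """C(y; w) = 1: w is a computation of C on y whose last node evaluates to 1."""
    return comp(circuit, y, w) and len(w) > 0 and w[-1] == 1

# x0 XOR x1 as (x0 or x1) and not (x0 and x1); nodes 0, 1 are the inputs.
xor = {"n": 2, "gates": [("or", (0, 1)), ("and", (0, 1)), ("not", (3,)), ("and", (2, 4))]}
```

Each node is checked against its predecessors only, which is why a computation can be verified by a formula with sharply bounded quantifiers once $w$ is given.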
Next, let SAT (y, z) be a Σ b 0 (bit)-formula saying that z is a satisfying assignment to the propositional 3-CNF formula y.
To define it explicitly for each i, j, k < 2m we let y We use m implicitly given by y in the formula SAT (y, z):
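Finally, for any $k$, hardness of SAT for $n^k$-size circuits can be expressed as the following $\forall\Sigma^b_2(\mathrm{bit})$ formula $LB(SAT, n^k)$:

$$\forall 1^n > n_0\ \forall C\ \exists y, a\ (|a| < |y| = n)\ \forall w, z\ (|w| \le n^k,\ |z| < |y|)\ \big[Comp(C, y, w) \to (C(y; w) = 1 \wedge \neg SAT(y, z)) \vee (C(y; w) = 0 \wedge SAT(y, a))\big]$$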
Here $n_0$ is a fixed constant which is not indicated in $LB(SAT, n^k)$. This should not cause any confusion. Whenever we say that $LB(SAT, n^k)$ is provable in a theory $T$ we mean that it is provable in $T$ for some $n_0$. Further, $\forall 1^n > n_0$ is a shortcut for $\forall m, n$ such that $|m| = n \wedge m > n_0$. Therefore, $y$ is feasible in $m$ and for each $n_0$ and $k$, $LB(SAT, n^k)$ is the universal closure of a $\Sigma^b_2(\mathrm{bit})$ formula.
We use the formalization of circuit lower bounds which is essentially a family of statements parametrized by n 0 instead of the formalization of the form ∃n 0 , LB(SAT, n k ) because the latter would result in a formula with higher quantifier complexity and the witnessing necessary in our proofs would not work. A similar problem would arise if we used lower bounds of the form "∀1 n 0 , ∃1 n > 1 n 0 , ∀C, ∃y, a ...". Moreover, it seems natural to avoid situations in which ∃n 0 , LB(SAT, n k ) is provable but not for any specific n 0 .
Note also that, strictly speaking, for fixed $k$, $LB(SAT, n^k)$ might not be equivalent to lower bounds with different encodings of SAT formulas. For instance, our encoding of 3CNFs makes the formula size (the $n$) always cubic in the number of variables. However, the choice of our encoding is rather arbitrary and our results apply analogously to any efficient encoding of 3CNFs. On the other hand, if we used general SAT formulas instead of 3CNFs, the predicate $SAT(x, y)$ would no longer be in $AC^0$, which would cause problems in results concerning provability in the theory $V^0$. We would then need to decide what the right formalization of circuit lower bounds is in the case of $V^0$ and modify the proof accordingly, which we want to avoid.
Feasible Mathematics
If we obtain $n^k$-size circuit lower bounds for SAT but do not find any efficient method for witnessing errors of potential $n^k$-size circuits for SAT, some of these circuits might in practice behave like correct ones. We will now define theories of feasible mathematics in which provability of an $n^k$-size circuit lower bound for SAT implies the existence of such an error witnessing.
Perhaps the most prominent one is $S^1_2$, introduced by Buss [1]. We will use its conservative extension $S^1_2(\mathrm{bit})$. Buss's witnessing [1] holds for it: if $S^1_2(\mathrm{bit}) \vdash \exists y\, A(x, y)$ for a $\Sigma^b_1(\mathrm{bit})$-formula $A$, then there is a p-time function $f$ such that $A(x, f(x))$ holds for any $x$. $S^1_2(\mathrm{bit})$ also admits a useful kind of witnessing for $\Sigma^b_2(\mathrm{bit})$-formulas, which was obtained by a direct method in Pudlák [18] and by using Herbrand functions in Krajíček [12].

Theorem 3.2 (Pudlák [18], Krajíček [12]). If $S^1_2(\mathrm{bit}) \vdash \exists y\, \forall z \le t\, A(x, y, z)$ for a $\Sigma^b_0(\mathrm{bit})$-formula $A$ and a term $t$ depending only on $x, y$, then there is a p-time algorithm $S$ such that for any $x$ either $\forall z \le t\, A(x, S(x), z)$ or for some $z_1$, $\neg A(x, S(x), z_1)$. In the latter case, either $\forall z \le t\, A(x, S(x, z_1), z)$ or there is $z_2$ such that $\neg A(x, S(x, z_1), z_2)$. However, after $k \le poly(|x|)$ rounds of this kind, $\forall z \le t\, A(x, S(x, z_1, \dots, z_k), z)$ holds for any $x$.
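The interactive reading of Theorem 3.2 can be sketched as a loop between a student $S$ and a teacher supplying counterexamples. The example below is a toy instance chosen for illustration: the predicate $A(x, y, z)$ is taken to be "$x[y] \ge x[z]$", so the student is searching for the index of a maximum. The function names are hypothetical.

```python
def student_teacher(student, teacher, x, max_rounds):
    """Run the protocol. The student proposes y from x and the counterexamples
    seen so far; the teacher returns some z with not A(x, y, z), or None if
    there is none. Returns (y, number of rounds used)."""
    counterexamples = []
    for round_no in range(1, max_rounds + 1):
        y = student(x, counterexamples)
        z = teacher(x, y)
        if z is None:
            return y, round_no
        counterexamples.append(z)
    raise RuntimeError("student did not succeed within the round bound")

# Toy instance: A(x, y, z) says x[y] >= x[z].
def student(x, counterexamples):
    # Start at index 0; afterwards move to the index the teacher last pointed to.
    return counterexamples[-1] if counterexamples else 0

def teacher(x, y):
    # Return the first z with x[z] > x[y], or None if y already indexes a maximum.
    for z in range(len(x)):
        if x[z] > x[y]:
            return z
    return None
```

On the list `[3, 1, 4, 1, 5]` the student proposes indices 0, 2, 4 and succeeds in the third round. The teacher here is computationally unrestricted in spirit; only the student is required to be efficient.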
Another theory with similar witnessing properties is P V 1 which is an extension of a theory P V introduced by Cook [4] , see also [13] . The language of P V 1 consists of symbols for all functions given by a Cobham-like inductive definition of p-time functions (hence it contains L bit ). P V 1 defined in Krajíček-Pudlák-Takeuti [16] is then a first-order theory axiomatized by equations defining all the function symbols and a derivation rule similar to polynomial induction for open formulas. It is a universal theory, i.e. it has an axiomatization by purely universal sentences, and since all function symbols of P V 1 have well-behaved Σ 
$S^1_2(\mathrm{bit})$ and $PV_1$ are weak fragments of arithmetic, but they are sufficiently strong to prove many things. We can interpret provability in $PV_1$ and $S^1_2$ as capturing the idea of what can be demonstrated when our reasoning is restricted to manipulations of p-time objects.
More formalizations of circuit lower bounds for SAT
LB(SAT, n k ) is not the only way to express circuit lower bounds for SAT. For example, for given n 0 and k, we can define formula SCE(SAT, n k ) stating that for each 1 n > n 0 and each n k -size circuit there is a satisfiable formula of size n such that the circuit will not find its satisfying assignment.
where $C(y; w) = z$ means that $w$ is a computation of circuit $C$ on input $y$ with output bits $z$. Formally, $Comp(C, y, w) \wedge \forall i < |z|\, (w_{|w|-i-1} = 1 \leftrightarrow z_i = 1)$. SCE in $SCE(SAT, n^k)$ refers to "search SAT counterexample".
A different formalization of circuit lower bounds is given by the following formula DCE(SAT, n k ) where DCE refers to "decision SAT counterexample". In DCE(SAT, n k ) circuits C attempting to solve SAT have again just one output but using self-reducibility they are used to search for satisfying assignments of propositional formulas: If C says that a formula y is satisfiable, we can set the first free variable in y firstly to 1 and then to 0, and use C to decide in which of these cases the resulting formula is satisfiable and in the same manner continue searching for the full satisfying assignment. DCE(SAT, n k ) states that for each n k -size circuit C there is a formula y and a possibly partial assignment to its variables a such that either 1.) SAT (y, a) and C says that y is unsatisfiable, or 2.) ¬SAT (y, a) for a full assignment a of y and C says that a satisfies y, or 3.) it happens that C gets into a local inconsistency: for a partial assignment a of y C claims that y under the assignment a is satisfiable but when we extend a by setting the first of the remaining free variables by 1 and 0 in both cases C claims that the resulting formula is unsatisfiable. Formally,
where $y(a)$ encodes formula $y$ under the assignment $a$, $FA(a, y)$ resp. $PA(a, y)$ means that $a$ is a full resp. partial assignment to the variables in $y$, and $y(a1)$ resp. $y(a0)$ is $y$ under the assignment which extends $a$ by setting the first unassigned variable in $y$ to 1 resp. 0. We leave details of these encodings to the reader.
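The three kinds of error distinguished in $DCE(SAT, n^k)$ can be sketched as follows. This is an illustration under simplifying assumptions: a "circuit" is a Python function `circuit(y, a)` giving its claim about $y(a)$, a partial assignment is a tuple of values for the first variables, and $m$ is the number of variables of $y$. All names are hypothetical.

```python
from itertools import product

def sat(y, a):
    """SAT(y, a) for a full 0/1 assignment a; clauses are tuples of signed ints."""
    return all(any((a[abs(l) - 1] == 1) == (l > 0) for l in clause) for clause in y)

def dce_error(circuit, y, a, m):
    """Return 1, 2 or 3 for the error type witnessed by (y, a), or 0 if none."""
    full = len(a) == m
    if full and sat(y, a) and circuit(y, ()) == 0:
        return 1  # y is satisfied by a, but the circuit claims y is unsatisfiable
    if full and not sat(y, a) and circuit(y, a) == 1:
        return 2  # the circuit claims that a satisfies y, but it does not
    if (not full and circuit(y, a) == 1
            and circuit(y, a + (1,)) == 0 and circuit(y, a + (0,)) == 0):
        return 3  # local inconsistency at the partial assignment a
    return 0

def honest(m):
    """A correct (brute-force) decider for formulas with m variables."""
    return lambda y, a: int(any(sat(y, a + r) for r in product((0, 1), repeat=m - len(a))))

always_no = lambda y, a: 0
always_yes = lambda y, a: 1
root_only = lambda y, a: 1 if len(a) == 0 else 0

y = [(1, 2, 2), (-1, 2, 2)]  # forces x2 = 1, two variables
```

Cases 2 and 3 can be checked without knowing whether $y$ is satisfiable, since they only compare the circuit's own answers with each other or with a given full assignment.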
The formalizations LB(SAT, n k ), SCE(SAT, n k ), DCE(SAT, n k ) are (essentially) equivalent modulo slight changes to the size parameter. For example, SCE(SAT, Kn k+1 ) → LB(SAT, n k ) and LB(SAT, n k + Kn) → SCE(SAT, n k ), where SCE(SAT, Kn k+1 ) is defined as SCE(SAT, n k ) but with |w| bounded by Kn k+1 . Similarly for LB(SAT, n k + Kn). Here, K is a sufficiently big constant and n 0 is arbitrary but the same constant in the assumption and in the conclusion of each implication. We claim that this is provable already in P V 1 .
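Proposition 3.1. $PV_1$ proves the following implications:

$SCE(SAT, Kn^{k+1}) \to LB(SAT, n^k)$,
$LB(SAT, n^k + Kn) \to SCE(SAT, n^k)$,
$LB(SAT, n^k) \to DCE(SAT, n^k)$,
$DCE(SAT, n^k) \to LB(SAT, n^k)$,

where $K$ is a sufficiently big constant and $n_0$ is arbitrary but the same constant in the assumption and the conclusion of each implication.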
Proof: The first implication was observed in [5] : Assume ¬LB(SAT, n k ), i.e. for a big enough n there is an n k -size circuit C deciding SAT on instances of size n. Then there is a p-time function which given a circuit C witnessing ¬LB(SAT, n k ) produces a Kn k+1 -size circuit sC which outputs a satisfying assignment sC(y) for every satisfiable formula y of size n. For each i, the circuit sC finds the i-th bit of the satisfying assignment by asking C whether y remains satisfiable if the i-th variable is set to 1, given the values it has previously found for the first i − 1 variables. Then (assuming ¬LB(SAT, n k ) and SAT (y, a)) P V 1 proves by Σ b 0 (P V ) induction on i that y instantiated by the first i truth values is satisfiable according to C and hence ¬SCE(SAT, Kn k+1 ).
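The self-reduction used to build $sC$ from $C$ can be sketched as follows. The sketch assumes a decision procedure `decide(y, a)` that answers whether $y$ remains satisfiable under the partial assignment `a`; here it is a brute-force stand-in for the circuit $C$, and all names are hypothetical.

```python
from itertools import product

def sat(y, a):
    """SAT(y, a) for a full 0/1 assignment a; clauses are tuples of signed ints."""
    return all(any((a[abs(l) - 1] == 1) == (l > 0) for l in clause) for clause in y)

def search_from_decision(decide, y, m):
    """Find an assignment bit by bit: set the next variable to 1 if the decider
    says y stays satisfiable, and to 0 otherwise. Uses m calls to the decider."""
    a = ()
    for _ in range(m):
        a = a + (1,) if decide(y, a + (1,)) else a + (0,)
    return a

def brute_force_decider(m):
    """Stand-in for the circuit C: a correct decider for formulas with m variables."""
    return lambda y, a: any(sat(y, a + r) for r in product((0, 1), repeat=m - len(a)))

y = [(1, 2, 3), (-1, -2, 3), (-3, -3, 2)]
a = search_from_decision(brute_force_decider(3), y, 3)
```

Since the search makes one call to the decider per variable, a decider of size $n^k$ gives a search procedure of size roughly $n \cdot n^k$, which is consistent with the bound $Kn^{k+1}$ in the proof.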
Concerning the second implication: If ¬SCE(SAT, n k ), i.e. for a big enough n there is an n k -size circuit C which outputs a satisfying assignment C(y) for every satisfiable formula of size n, then there is a p-time function which given any such circuit C produces an (n k + Kn)-size circuit dC which decides SAT on instances of size n. Given a formula y, dC outputs 1 if and only if C(y) satisfies y. Assuming ¬SCE(SAT, n k ) it follows in P V 1 that (SAT (y, a) → dC(y; w) = 1) ∧ (dC(y; w) = 1 → SAT (y, C(y))) for any y, a of size |a| < |y| = n, hence ¬LB(SAT, n k + Kn).
Next, in P V 1 , if circuit C witnesses ¬DCE(SAT, n k ), then it witnesses also ¬LB(SAT, n k ): for any y, a of size |a| < |y| = n for a big enough n, C(y; w) = 0 → ¬SAT (y, a) and if C(y; w) = 1 then by Σ b 0 (P V )-induction (as in the first implication) C(y(b); w) = 1 for a full assignment b of y for which SAT (y, b) holds.
Finally, in P V 1 , if circuit C witnesses ¬LB(SAT, n k ), then it witnesses ¬DCE(SAT, n k ): for any y, a of size |a| < |y| = n for a big enough n, (C(y; w) = 0 → ¬SAT (y, a)), C(y(a); w) = 1 ∧ F A(a, y) → SAT (y, a) and if C(y(a); w) = 1 ∧ P A(a, y) then for some b extending a SAT (y, b) and thus C(y(a1); w) = 1 ∨ C(y(a0); w) = 1.
Witnessing errors of p-size circuits
Using LB(SAT, n k ), SCE(SAT, n k ) and DCE(SAT, n k ) we can define several types of error witnessing of p-size circuits claiming to solve SAT.
We say somewhat informally that LB(SAT, n k ) ∈ P if there is a p-time algorithm A which for any sufficiently big n and any n k -size circuit C with n inputs finds out y, a such that LB(C, y, a):
Intuitively, A witnesses the important existential quantifiers in LB(SAT, n k ).
We say that LB(SAT, n k ) has an S-T protocol with l rounds if there is a p-time algorithm S such that for any function T and any sufficiently big n, whenever S is given n k -size circuit C, S outputs y 1 , a 1 such that either LB(C, y 1 , a 1 ) or otherwise T sends to S w 1 , z 1 certifying ¬LB(C, y 1 , a 1 ). Then S uses C, w 1 , z 1 to produce y 2 , a 2 and the protocol continues in the same way, S possibly using all counter-examples T sent in earlier rounds. But after at most l rounds S outputs y, a such that LB(C, y, a).
Analogously, DCE(SAT, n k ) ∈ P if there is a p-time algorithm A which for any sufficiently big n and any n k -size circuit C with n inputs finds out y, a such that DCE(C, y, a):
Finally, SCE(SAT, n k ) ∈ P if there is a p-time algorithm A which for any sufficiently big n and any n k -size circuit C with n inputs and n outputs finds out y, a such that SAT (y, a) ∧ ¬SAT (y, C(y)).
The phrase that DCE(SAT, n k ) resp. SCE(SAT, n k ) has an S-T protocol with l rounds could be defined similarly but notice that in this case T's advice would consist only of computations w of given circuit C which can be produced by S itself as it has C as input.
In practice, if we want to witness that no small circuit solves SAT, it does not seem sufficient to have a p-time algorithm for $LB(SAT, n^k)$, because such an algorithm could output a tautology but we would not have an a priori way to certify that it is indeed a tautology and hence a correctly witnessed error. Therefore, it seems that a practically more appropriate error witnessing is defined by $DCE(SAT, n^k)$ or $SCE(SAT, n^k)$, in which we actually force given circuits to claim inconsistent statements. We discuss it in more detail in the next section.
Circuit Lower Bounds in $S^1_2(\mathrm{bit})$
In this section we observe that the provability of circuit lower bounds in $S^1_2(\mathrm{bit})$ would give us an efficient witnessing of errors of p-size circuits for SAT described in the previous section. Then we show that certain hardness assumptions imply the same efficient witnessing of errors. Consequently, it seems that the first result itself cannot be used to show the unprovability of $LB(SAT, n^k)$ in $S^1_2(\mathrm{bit})$. Similar observations appeared already in Buss [3]. More precisely, Proposition 4.1 is folklore, and Buss [3, Section 4.4] also described a witnessing of $SCE(SAT, n^k)$ by non-uniform p-size circuits based on the existence of strong pseudorandom generators, which is analogous to the one from Proposition 4.2.
Proof: $LB(SAT, n^k)$, $DCE(SAT, n^k)$ and $SCE(SAT, n^k)$ are universal closures of $\Sigma^b_2(\mathrm{bit})$-formulas, so the first implication follows directly from Theorem 3.2. In the case of $SCE(SAT, n^k)$ and $DCE(SAT, n^k)$, T's advice in the resulting S-T protocol consists just of computations of the given circuit $C$. These can, however, be produced by S itself as it has $C$ as input.
Alternatively, one could show in S 1 2 (bit) that SCE(SAT, n k ) and also DCE(SAT, n k ) can be stated in a ∀Σ b 1 (bit) way and apply directly Buss's witnessing.
An efficient witnessing of errors of p-time SAT algorithms can be performed in the following way.
If f is a one-way function, we can secretly produce a ∈ {0, 1} n and ask the algorithm claiming to solve SAT whether the statement f (a) = f (x) encoded as a poly(|a|)-size formula with free variables x = x 1 , ..., x n is satisfiable (the formula might also contain some auxiliary variables used to express computation of f such that their value can be efficiently determined given any assignment to x), see Cook-Mitchell [6] . The algorithm is forced to say that the formula is satisfiable and by the choice of f , with high probability it will not find its satisfying assignment.
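The challenge game described above can be sketched in a few lines. This is an illustration only: SHA-256 is used as a stand-in for $f$ and is merely conjectured to be hard to invert, the challenge is kept as a target value rather than encoded as a propositional formula, and all function names are hypothetical.

```python
import hashlib
import secrets

def f(x: bytes) -> bytes:
    # Stand-in for a one-way function (assumption: SHA-256 is hard to invert).
    return hashlib.sha256(x).digest()

def make_challenge(n_bytes):
    """Secretly pick a and publish the target f(a); the implicit formula is f(x) = f(a)."""
    a = secrets.token_bytes(n_bytes)
    return a, f(a)

def is_satisfying(target, x):
    """x satisfies the challenge iff f(x) equals the published target."""
    return f(x) == target

def witnesses_error(claimed_satisfiable, proposed_x, target):
    """The solver errs if it denies satisfiability (the challenge is satisfiable
    by construction) or if it fails to produce a satisfying assignment."""
    if not claimed_satisfiable or proposed_x is None:
        return True
    return not is_satisfying(target, proposed_x)
```

The party that generated the challenge always holds a certificate, the secret `a`, that the formula is satisfiable, so an error of the solver can be verified efficiently.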
Atserias (private communication) suggested to derandomize this construction and Krajíček made the following observation.
Proposition 4.2. If there exists a one-way permutation $f$ computable in p-time and secure against p-size circuits, i.e. for any p-size circuits $C_n$ there is a function $\epsilon(n) = n^{-\omega(1)}$ such that for large enough $n$,
and if there exists h ∈ E hard on average for subexponential circuits, i.e. there is δ > 0 such that for all circuits C n of size ≤ 2 δn and large enough n,
Proof: If there is $h \in E$ hard on average for subexponential circuits, by [17] for each $l$ there is $c$ and an NW-generator $g : \{0, 1\}^{c \log n} \to \{0, 1\}^n$ such that $g$ is poly($n$)-time computable and for any $n^l$-size circuits $D_n$,
This generator allows us to derandomize the construction above: Let f be a one-way permutation secure against p-size circuits. Take l such that for each
n can be computed by n l -size circuits. Here, f (x) = f (y) is a 3CNF formula expressing the fact that f (x) = f (y). The formula has free variables y = y 1 , ..., y n together with auxiliary variables used to express the computation of f . On the other hand, x's in f (x) = f (y) are constants denoting x ∈ {0, 1} n . The size of f (x) = f (y) is n d for an absolute constant d (but f (x) = f (y) can be seen also as a formula of size (n + 1) d ). For the chosen l there is c and N W -generator g as mentioned above. Now, we will describe the algorithm witnessing SCE(SAT, n k ) ∈ P
The failure of C m is thus witnessed in p-time by the formula f (g(b)) = f (y) and its assignment g(b).
Proposition 4.2 says that under certain hardness assumptions we can witness circuit lower bounds for SAT in p-time. It is natural to ask now for a p-time witnessing of these assumptions. What we already know is that by Jeřábek [9, Corollary 3.6] the existence of a function $h \in E$ hard for subexponential circuits in S could formalize randomized algorithms as described in Jeřábek [10]. Krajíček observed that a witnessing of $LB(SAT, n^k)$ is also possible assuming just that $LB(SAT, n^k)$ holds, but the witnessing is non-constructive and only by nonuniform p-size circuits, see Proposition 4.4.

Proposition 4.2 seems to imply that for proving $S^1_2(\mathrm{bit}) \vdash SCE(SAT, n^k)$ we need to use other properties than $SCE(SAT, n^k) \in P$. Moreover, the assumptions of Proposition 4.2 give us an S-T protocol for $LB(SAT, n^k)$ too. Informally, any $n^k$-size circuit $C$ claiming to decide SAT can be used to search for satisfying assignments of propositional formulas. Using the algorithm from Proposition 4.2, S can produce $y, a$ such that $SAT(y, a)$ but $C$ will not find any satisfying assignment of $y$. This means that either $C$ claims that $y$ is unsatisfiable, or the assignment it finds does not satisfy $y$, or while searching for a satisfying assignment it gets into a local inconsistency. The last case is the only one in which S needs to ask T for advice, namely for a satisfying assignment of $y$ extending the partial assignment found by $C$.

Proposition 4.3. If the same hardness assumption as in Proposition 4.2 holds, then $LB(SAT, n^k)$ has an S-T protocol with 1 round (i.e. 1 advice of T) where S is a p-time algorithm, and $LB(SAT, n^k)$ also has an S-T protocol with poly($n$) rounds where S is in uniform $AC^0$. Here, "S in uniform $AC^0$" means that for each $n$ there are poly($n$) circuits $S^n_1, \dots, S^n_{poly(n)}$, one for each round of the interaction of the S-T protocol, and the uniformity means that there is a p-time algorithm which produces $S^n_j$ given $1^n$ and $1^j$ without knowing the interaction before round $j$.
Proof: By Proposition 4.2 we have a p-time algorithm A solving SCE(SAT, n 2k ). Firstly, we show that LB(SAT, n k ) has an S-T protocol with 1 round and p-time S.
For each n k -size circuit C with one output bit, there is a circuit sC of size ≤ Kn k+1 , for a sufficiently big K, searching for satisfying assignments of given formulas as in Proposition 3.1. Here we give a more detailed description: For each formula y, let a be a partial assignment of y produced by sC so far (empty at the beginning) and denote by y(a) the formula y under the assignment a. If C(y(a)) = 0, sC outputs an assignment of y full of zeros. If C(y(a)) = 1, it assigns y 1 a , the first free variable in y(a), firstly by 1 and then by 0. Denote the resulting formula y(a1) resp. y(a0). If C(y(a1)) = C(y(a0)) = 1, sC sets y Given C, S can produce sC in p-time and use A to find y, a 1 such that SAT (y, a 1 ) but ¬SAT (y, sC(y)).
If C(y) = 0, S outputs y, a 1 . Else, S simulates sC on input y. If it never happens that C(y(a1)) = C(y(a0)) = 0 for any partial assignment a produced by sC, S outputs y(sC(y)). Otherwise, for some partial assignment a of y, C(y(a)) = 1 and C(y(a1)) = C(y(a0)) = 0. In such case S outputs y(a), a 2 where a 2 is a full assignment of y extending a with all zeros. If this is not a correct answer, T replies with a 3 extending a and satisfying y. Then S outputs y(ab), a 3 where b ∈ {0, 1} such that ab is consistent with a 3 .
In all cases S succeeds after asking for at most 1 advice of T.
To get S in uniform AC 0 note that A actually produces a set B of ≤ n c propositional formulas of the form f (Y ) = s and their satisfying assignments such that each Kn k+1 -size circuit fails on at least one of them. It suffices to use instead of A the set B, i.e. AC 0 S will try all of the formulas f (Y ) = s with their satisfying assignments in place of y, a 1 . Recall that the AC 0 S is actually a sequence of polynomially many uniform AC 0 circuits in the sense that every reply of T is managed by a different AC 0 circuit. Given C, S will firstly try some y, a 1 from B (it does not produce sC). If y, a 1 does not witness that C does not solve SAT as in LB(SAT, n k ), T replies with the computation of C witnessing that C(y) = 1. S then finds out if C(y(1)) = C(y(0)) = 0 using the following general protocol. Whenever S needs to simulate given circuit C on input z, it outputs z with its arbitrary assignment r. If z, r does not witness that C fails to solve SAT, T replies either with a satisfying assignment d of z or with the computation of C on input z which can be verified by a uniform constant-depth formula. In the former case, S (but a different AC 0 circuit than the one which produced z, r) outputs z, d and this time it either witnesses that C fails to solve SAT or it gets the computation of C. In this way S finds out if C(y(1)) = C(y(0)) = 0 and continues to simulate sC and the S-T protocol with p-time S.
If the protocol above using y, a 1 does not witness failure of C, S tries another element from B in place of y, a 1 . By the definition of B, at least one of them works.
Note that the uniformity of the AC 0 S-T protocol described in Proposition 4.3 is not DLOGTIME because to produce the respective AC 0 circuits we need to compute a function h ∈ E on log-sized inputs which is hard for subexponential circuits.
Further, while Proposition 4.3 says that uniform AC 0 S-T protocols for LB(SAT, n k ) with poly(n) rounds are likely to exist, in Theorem 6.1 we will show that under a hardness assumption LB(SAT, n k ) has no AC 0 S-T protocols with O(1) rounds.
The proof of Proposition 4.3 shows also that if SCE(SAT, n k ) ∈ P, then DCE(SAT, n k ) ∈ P. All in all, Buss's witnessing does not seem to help us to obtain the unprovability of LB(SAT, n k ) in P V 1 or S 1 2 (bit). Maybe it could work for intuitionistic S 1 2 where the witnessing holds for arbitrarily complex formulas, cf. Buss [2] . The situation is different in case of weaker theories where we have more efficient witnessing. This will allow us to reduce to some hardness assumptions.
Before considering weaker theories let us also mention that in order to show SCE(SAT, n k ) ∈ P/poly, it suffices to assume that for any sufficiently big n, SAT restricted to instances of length n has no circuit of size n 2k . This was observed by Krajíček in [14] but unlike Buss's [3, Section 4.4] proof of SCE(SAT, n k ) ∈ P/poly which assumes the existence of strong pseudorandom generators, this method is not constructive in the sense that it does not tell us what could be the hard SAT instances.
Krajíček's observation uses a well known combinatorial principle
Now take as $X$ the set of all $n^{k/2}$-size circuits and interpret $E(x, y)$ as "$y$ is a satisfiable formula of size $n$ and circuit $x$ does not find a satisfying assignment of $y$". Assume $n$ is big enough. If SAT restricted to instances of size $n$ does not have $n^k$-size circuits, then for every $n$ circuits $C_1, \dots, C_n$ of size $n^{k/2}$ there is $y$ such that $\bigwedge_{i=1,\dots,n} E(C_i, y)$. Else, there is a specific sequence of $n$ circuits such that for any satisfiable $y$ at least one of these $n$ circuits finds a satisfying assignment of $y$, and this yields a single $n^k$-size circuit solving SAT at length $n$, contradicting the assumption. By the principle above, there are then $y_1, \dots, y_{n^k}$ such that for each $n^{k/2}$-size circuit $C$, $\bigvee_{i=1,\dots,n^k} E(C, y_i)$. Therefore there is an $n^{2k}$-size circuit which for each $x \in X$ finds $y$ such that $E(x, y)$ by trying $E(x, y_i)$ for $i = 1, \dots, n^k$, and thus, using additional satisfying assignments $a_1, \dots, a_{n^k}$ of the respective $y$'s as advice, solves $SCE(SAT, n^{k/2})$.
Analogously, we can show that DCE(SAT, n k ) ∈ P/poly by considering E(x, y) = "circuit x rejects formula y which is satisfiable or circuit x accepts y but if it is used to find a satisfying assignment of y it ends up in the same inconsistent situation as in DCE(x, y, a) for some a". Such E(x, y) is a p-time relation.
It is not clear how to apply this technique in the case of LB(SAT, n k ). Straightforwardly defining E(x, y) as "circuit x rejects formula y which is satisfiable or circuit x accepts unsatisfiable y" does not work because then for each y, ¬E(1, y) ∨ ¬E(0, y) where 1 resp. 0 is a trivial circuit which outputs always 1 resp. always 0.
Therefore, we have the following proposition.
Proposition 4.4 (Krajíček [14] ). If for any sufficiently big n, SAT restricted to instances of length n has no circuit of size n 2k , then SCE(SAT, n k ) and DCE(SAT, n k ) are in P/poly.
Theories weaker than P V 1
We will now present some theories weaker than P V 1 like T N C 1 for which we will show the unprovability of circuit lower bounds. We could however similarly define a general theory T C corresponding to a standard complexity class C and our results would work analogously. T N C 1 is a universal theory so it admits the KPT theorem from [16] :
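If $T_{NC^1} \vdash \exists y \forall z A(x, y, z)$ for an open formula $A$, then there are finitely many functions $f_1, \dots, f_k$ in uniform $NC^1$ such that
$$T_{NC^1} \vdash A(x, f_1(x), z_1) \vee A(x, f_2(x, z_1), z_2) \vee \dots \vee A(x, f_k(x, z_1, \dots, z_{k-1}), z_k).$$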
There are also two-sorted theories of Bounded Arithmetic corresponding to uniform AC 0 , N C 1 and other complexity classes, cf. Cook-Nguyen [7] . The first-sort (number) variables are denoted by lower case letters x, y, z, ... and the second-sort (set) variables by capital letters X, Y, Z, ... The underlying language includes the symbols +, ·, =, ≤, 0, 1 of first-order arithmetic. In addition it contains symbol = 2 interpreted as equality between bounded sets of numbers, |X| for the function mapping an element X of the set sort to the largest number in X plus one, and ∈ for the relation n ∈ X meaning that n is an element of X.
Bounded quantifiers for sets have the form ∃X ≤ t φ which stands for ∃X (|X| ≤ t ∧ φ) or ∀X ≤ t φ for ∀X (|X| ≤ t → φ). Here t is a number term which does not involve X. Σ B 0 formulas are formulas without bounded quantifiers for sets but may have bounded number quantifiers. Each bounded set X ≤ t can be seen also as a finite binary string of size ≤ t which has 1 in the i-th position iff i ∈ X. When we say that a function f (x, X) mapping bounded sets and numbers to bounded sets is in AC 0 or N C 1 we mean that the corresponding function on finite binary strings X and unary representation of x is in AC 0 or N C 1 .
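The identification of bounded sets with binary strings can be illustrated by a minimal Python sketch. The function names are hypothetical, positions are 0-indexed, and the convention that the empty set has length 0 is an assumption made here for the sketch.

```python
def set_length(X):
    """|X|: the largest element of X plus one (assumed to be 0 for the empty set)."""
    return max(X) + 1 if X else 0


def set_to_bits(X):
    """Encode a finite set X of naturals as a bit list of length |X|.

    Position i holds 1 iff i is an element of X.
    """
    return [1 if i in X else 0 for i in range(set_length(X))]


def bits_to_set(bits):
    """Decode a bit list back into the set of positions holding 1."""
    return {i for i, b in enumerate(bits) if b == 1}
```

For instance, the set {0, 2, 3} corresponds to the string 1011 read from position 0, and decoding that string recovers the set.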
The base theory we will consider is V 0 consisting of a set of basic axioms capturing the properties of symbols in the two-sorted language and a comprehension axiom schema for Σ B 0 -formulas stating that for any Σ B 0 formula there exists a set containing exactly the elements that satisfy the formula, cf. [7] . Further, Cook and Nguyen define theory V N C 1 as V 0 extended by the axiom that every monotone formula has an evaluation, see [7] .
Analogously for V 0 with the resulting functions in uniform AC 0 .
LB(SAT, n k ) translates to the two-sorted language as follows
where k, n 0 are constants as before and Comp(C, Y, W ), C(Y ; W ) = 0/1, SAT (Y, Z) are defined as their first-order counterparts but function x i is replaced by i ∈ X.
Similarly, we obtain the two-sorted SCE(SAT, n k ), DCE(SAT, n k ).
Let us also specify the formalization of LB(SAT, n k ) in T N C 1 . L N C 1 contains symbols for SAT (y, z), Comp(C, y, w) and all the predicates we explicitly defined as Σ b 0 (bit)-formulas because they are not just p-time but in fact uniform N C 1 . For simplicity, whenever we speak about LB(SAT, n k ) in T N C 1 we mean its formalization where instead of the Σ b 0 (bit)-formulas we have the respective symbols of L N C 1 . Similarly for SCE(SAT, n k ), DCE(SAT, n k ).
Therefore, LB(SAT, n k ), SCE(SAT, n k ) and DCE(SAT, n k ) in T N C 1 have the form ∃y∀z A(x, y, z) for an open formula A (i.e. A has no quantifiers).
The situation with the provability of polynomial circuit lower bounds in weak theories like T N C 1 is less natural because they cannot fully reason about p-time concepts. In particular, there is a formula LB ∃ (SAT, n k ) which is equivalent to LB(SAT, n
is like LB(SAT, n k ) but with LB(C, y, a) (defined in Section 3.2) expressed positively:
Analogously define DCE ∃ (SAT, n k ), SCE ∃ (SAT, n k ) and their two-sorted and L N C 1 formulations.
By the witnessing theorem above, if $T_{NC^1}$ proves $LB(SAT, n^k)$, then $LB(SAT, n^k)$ has an $NC^1$ S-T protocol with $O(1)$ rounds, that is, an S-T protocol with $O(1)$ rounds in which S is in uniform $NC^1$. If $T_{NC^1} \vdash LB_\exists(SAT, n^k)$, then $LB_\exists(SAT, n^k)$ has an $NC^1$ S-T protocol with $O(1)$ rounds which is defined analogously as for $LB(SAT, n^k)$ but with S producing also computations $w$ of given circuits. As $DCE_\exists(SAT, n^k)$ has the form ∃yA(
but with the witnessing algorithm producing also computations w of given circuits. Analogously for theories V 0 , V N C 1 .
6. Unprovability of circuit lower bounds in subtheories of P V 1
To prove that V N C 1 or T N C 1 do not prove LB(SAT, n k ) it suffices to show that LB(SAT, n k ) has no S-T protocol with O(1) rounds where S is in uniform N C 1 . For the unprovability of LB ∃ (SAT, n k ) it however suffices to refute the existence of S-T protocols with O(1) rounds where S ∈ N C 1 produces w's (computations of given circuits) itself. This is essentially trivial since in such case, N C 1 circuits could produce computations of general circuits of similar size:
) has no $NC^1$ S-T protocol with poly(n) rounds unless $SIZE(n^k) \subseteq NC^1$. Unconditionally, for any sufficiently big $k$, $LB(SAT, n^k) \notin AC^0$, $DCE_\exists(SAT, n^k) \notin AC^0$ and $LB_\exists(SAT, n^k)$ has no $AC^0$ S-T protocol with poly(n) rounds.
Proof: Assume first that LB(SAT, n k+1 ) ∈ N C 1 , i.e. there are N C 1 circuits D m (x) such that for sufficiently big n whenever x ∈ {0, 1} m for m = poly(n) encodes an n k+1 -size circuit C n with n inputs, D m (x) outputs y, a such that
Now any n k -size circuits B n with n inputs can be simulated by N C 1 circuits: For b ∈ {0, 1} n and z = (z 1 , ..., z n ) denote R[B n , b, z] the circuit with n inputs z but computing as B n on b, i.e. it does not use inputs z at all. The size of
. Let E n (b) be an AC 0 circuit which uses description of B n 's as advice and maps b ∈ {0,
For each b ∈ {0, 1} n , use D m (E n (b)) to find y, a and output 0 iff SAT (y, a).
Deciding SAT (y, a) is by our formalization doable by constant-depth formulas. Therefore, for each b, we predict B n (b) with an N C 1 circuit.
If LB(SAT, n k ) ∈ AC 0 for sufficiently big k, we would obtain AC 0 circuits for PARITY, which is impossible. This construction works analogously for DCE ∃ (SAT, n k+1 ) and as well for LB ∃ (SAT, n k+1 ). If LB ∃ (SAT, n k+1 ) has an N C 1 S-T protocol, then for given n k+1 -size circuit C, S does not have to produce w, y, a such that w is a computation of C on input y but then T can reply 0 and S is thus eventually forced to produce a computation of circuit C which means that N C 1 S can simulate any n k -size circuit as in the case of LB(SAT, n k+1 ).
This simple observation does not work if we want to refute that LB(SAT, n k ) has N C 1 S-T protocols because T can send to S a computation of the artificially attached circuit. Indeed by Proposition 4.3, LB(SAT, n k ) has a uniform AC 0 S-T protocol with poly(n) rounds under a plausible assumption.
We can however show that LB(SAT, n k ) has no N C 1 S-T protocols with O(1) rounds under a hardness assumption. To show this we will use an interpretation of suitable NW-generators as p-size circuits which is due to Razborov [20] and Krajíček's proof of a hardness of certain NW-generators for theory T P V which is defined as T N C 1 but in the language containing names for all p-time algorithms, cf. [15] . Actually, the proof of the following theorem seems to be a natural modification of the proof of Proposition 6.1.
To prove the theorem we will use Nisan-Wigderson (NW) generators with specific design properties. Let $A = \{a_{i,j}\}_{i=1,\dots,m}^{j=1,\dots,n}$ be an $m \times n$ 0-1 matrix with $l$ ones per row, let $J_i(A) := \{j \in \{1, \dots, n\};\ a_{i,j} = 1\}$ and let $f : \{0,1\}^l \to \{0,1\}$. Then define the NW-generator based on $f$ and $A$, $NW_{f,A} : \{0,1\}^n \to \{0,1\}^m$, as
$$(NW_{f,A}(x))_i = f(x|J_i(A))$$
where $x|J_i(A)$ are the $x_j$'s such that $j \in J_i(A)$. For any $c \ge 2$, Nisan and Wigderson [17] constructed a $2^n \times n^{2c}$ 0-1 matrix $A$ with $n^c$ ones per row which is also an $(n, n^c)$-design, meaning that for each $i \neq j$, $|J_i(A) \cap J_j(A)| \le n$. Moreover, for big enough $n$ there are $n^{2c}$-size circuits which given $i \in \{0,1\}^n$ compute the set $J_i(A)$; more precisely, given input $i \in \{0,1\}^n$ they output the $n^c$ indices in $J_i(A)$, where each index is described by $2c \log n$ output bits. Therefore, as observed by Razborov [20], if $f$ is in addition computable by $n^k$-size circuits, then for any $x \in \{0,1\}^{n^{2c}}$, $(NW_{f,A}(x))_y$ is a function of the $n$ inputs $y$ which is, for sufficiently big $n$, computable by circuits of size $n^{4kc}$. To see this, note that for any given $y \in \{0,1\}^n$ an $n^{2c}$-size circuit produces the $n^c$ indices of $J_y(A)$, where the $r$-th index is described by $2c \log n$ bits $J_{r,1}, \dots, J_{r,2c \log n}$. Then a circuit of size $\le n^c n^{2c}(2Kc \log n + K)$, with an absolute constant $K$, which has the form
$$\bigwedge_{r \in \{1,\dots,n^c\}} \bigwedge_{s \in \{0,1\}^{2c \log n}} \Big(\big(\bigwedge_{t \in \{1,\dots,2c \log n\}} (J_{r,t} \leftrightarrow s_t)\big) \to (r\text{-th output bit} \leftrightarrow x_s)\Big)$$
specifies the $n^c$ bits of $x$ on which an $n^{ck}$-size circuit computes $f(x|J_y(A))$. As $n^{2c} + n^{kc} + n^c n^{2c}(2Kc \log n + K) < n^{4kc}$ for $k \ge 1$ and big enough $n$, the whole circuit computing $(NW_{f,A}(x))_y$ has size $< n^{4kc}$.
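The definition of the NW-generator and the design property can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the rows $J_i(A)$ are given directly as 0-indexed lists, the toy design `toy_rows` and the choice of parity as $f$ are hypothetical, and no attempt is made to reproduce the Nisan-Wigderson construction of $A$ or the circuit-size analysis above.

```python
from itertools import combinations


def is_design(rows, max_overlap):
    """Check that any two distinct rows J_i, J_j share at most max_overlap indices."""
    return all(len(set(a) & set(b)) <= max_overlap
               for a, b in combinations(rows, 2))


def nw_generator(f, rows, x):
    """Return the list of output bits f(x|J_i), one bit for each row J_i."""
    return [f([x[j] for j in J]) for J in rows]


def parity(bits):
    """Toy base function f (an assumption of this sketch)."""
    return sum(bits) % 2


# Hypothetical toy design: 3 rows over 6 seed bits, pairwise overlap at most 1.
toy_rows = [[0, 1, 2], [2, 3, 4], [0, 4, 5]]
seed = [1, 0, 1, 1, 0, 0]
output = nw_generator(parity, toy_rows, seed)
```

Each output bit depends only on the seed bits indexed by its row, which is the property the proof exploits when most of the seed is fixed.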
Proof(of Theorem 6.1): Let f ∈ SIZE(n k ) and A be a 2 n × n 2c (n, n c )-design defined above so for any sufficiently big n and any x, (N W f,A (x)) y can be computed from y by an n 4kc -size circuit. Assume that LB(SAT, n 4kc ) has an N C 1 S-T protocol with O(1) rounds. In particular, for sufficiently big n and each n 4kc -size circuit C(y) computing (N W f,A (x)) y , S either finds out the value of C(y 1 ) by deciding (in AC 0 ) SAT (y 1 , a 1 ) for y 1 , a 1 it produced itself or T will send to S counterexamples w 1 , b 1 such that
In the latter case, S continues with its second try y 2 , a 2 . After at most t ≤ l rounds for some fixed constant l, S will successfully predict C(y t ).
Let $E_{n^{2c}}(x)$ be $AC^0$ circuits mapping $x \in \{0,1\}^{n^{2c}}$ to a description of an $n^{4kc}$-size circuit with $n$ inputs $y$ computing the function $(NW_{f,A}(x))_y$, so $E_{n^{2c}}$ just substitutes the given $x$ into a description of $(NW_{f,A}(x))_y$ which is otherwise fixed. Moreover, without loss of generality, for any $y$ and $x_1, x_2$ such that $x_1|J_y(A) = x_2|J_y(A)$, the computation of $E_{n^{2c}}(x_1)$ on input $y$ is the same as the computation of $E_{n^{2c}}(x_2)$ on input $y$ up to the specific bits where $x_1$ and $x_2$ differ. We call the invariant part of the computation of $E_{n^{2c}}(x)$ on input $y$ its relevant part. To be precise, it is the computation of $E_{n^{2c}}(x)$ on input $y$ with the bits $x_j$, $j \notin J_y(A)$, replaced by 0's. We will consider our S-T protocol only on inputs of the form $E_{n^{2c}}(x)$. Krajíček [15] showed that if $f$ is in NP∩coNP with unique witnesses, such an S-T protocol allows us to approximate $f$ by a p-size circuit. We will check that his proof works also for $f$ in P/poly and $NC^1$ S-T protocols. In addition we will assume that T in our S-T protocol operates as follows: whenever S outputs $y$ with some $a$, T answers with the lexicographically first assignment $b$ satisfying $y$ and the unique relevant part $w$ of the computation of the given circuit on input $y$. If there is no such $b$, T replies with a string of zeroes instead of $b$ (and the unique relevant part $w$ of the computation of the given circuit on input $y$). This should replace the uniqueness property assumed in [15]. Note that S can recover the full computation of the given circuit on input $y$ just from its relevant part.
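The canonical choice of T's assignment $b$ described above can be sketched in Python as follows. This is a brute-force illustration only: the clause encoding (signed, 1-indexed literals) and the function names are assumptions of the sketch, and the relevant part $w$ of the circuit computation is not modelled.

```python
from itertools import product


def satisfies(clauses, assignment):
    """Check a CNF: literal k > 0 means variable k is 1, k < 0 means variable |k| is 0."""
    return all(
        any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        for clause in clauses
    )


def teacher_assignment(clauses, n_vars):
    """Return the lexicographically first satisfying assignment, or all zeroes if none exists."""
    # product([0, 1], repeat=n) enumerates 0/1 tuples in lexicographic order.
    for b in product([0, 1], repeat=n_vars):
        if satisfies(clauses, b):
            return list(b)
    return [0] * n_vars
```

Because the reply is a function of the formula alone, two runs of the protocol that reach the same $y$ receive the same $b$, which is the role uniqueness plays in the argument.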
For u ∈ {0, 1} n c and v ∈ {0, 1} n 2c −n c define r y (u, v) ∈ {0, 1} n 2c by putting bits of u into positions J y (A) and filling the remaining bits by v (in the natural order). For each x there is a trace tr(x) = y 1 , a 1 , ..., y t , a t , t ≤ l of the S-T communication.
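The map $r_y(u, v)$ is a simple interleaving, and a short Python sketch may help fix the indexing. The names `r` and `restrict` are hypothetical, positions are 0-indexed, and the set $J_y(A)$ is passed in directly as a list `J`.

```python
def r(J, u, v, total):
    """Place the bits of u at the positions in J and fill the rest with v in natural order."""
    J = sorted(J)
    assert len(u) == len(J) and len(u) + len(v) == total
    placed = dict(zip(J, u))   # position -> bit taken from u
    rest = iter(v)             # remaining positions consume v left to right
    return [placed[j] if j in placed else next(rest) for j in range(total)]


def restrict(x, J):
    """x|J: the bits of x at the positions in J, in increasing order of position."""
    return [x[j] for j in sorted(J)]
```

By construction `restrict(r(J, u, v, total), J)` returns `u`, so fixing `v = p` leaves exactly the bits on which the row $y$ depends free.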
Claim 1.
There is a trace $Tr = y_1, a_1, \dots, y_t, a_t$, $t \le l$, and $p \in \{0,1\}^{n^{2c} - n^c}$ such that $Tr = tr(r_{y_t}(u, p))$ for at least a fraction $2/(3(2^{2n}))^t$ of all $u$'s.
$Tr$ and $p$ can be constructed inductively. There are at most $2^{2n}$ pairs $y_1, a_1$, hence there is $y_1, a_1$ such that at least a $1/2^{2n}$ fraction of traces begin with it. Either there is $p \in \{0,1\}^{n^{2c} - n^c}$ such that $y_1, a_1 = tr(r_{y_1}(u, p))$ for at least $2/(3(2^{2n}))$ of all $u$'s, or we can find $y_2, a_2$ such that at least a $1/(3(2^{2n})^2)$ fraction of traces begin with $y_1, a_1, y_2, a_2$. For the induction step assume we have a trace $y_1, a_1, \dots, y_i, a_i$ such that at least a $1/(3^{i-1}(2^{2n})^i)$ fraction of traces begin with it. Either there is $p \in \{0,1\}^{n^{2c} - n^c}$ such that $y_1, a_1, \dots, y_i, a_i = tr(r_{y_i}(u, p))$ for at least $2/(3^i(2^{2n})^i)$ of all $u$'s, or we can find $y_{i+1}, a_{i+1}$ such that at least a $1/(3^i(2^{2n})^{i+1})$ fraction of traces begin with $y_1, a_1, \dots, y_{i+1}, a_{i+1}$. This proves the claim.
Fix now $Tr$ and $p$ from the previous claim. Because $A$ is an $(n, n^c)$-design, for any row $y \neq y_t$ at most $n$ of the $x_j$'s with $j \in J_y(A)$ are not set by $p$. Hence there are at most $2^n$ assignments $z$ to the $x_j$'s with $j \in J_y(A)$ not set by $p$. For each such $z$ let $w_z, b_z$ be T's advice after S outputs $y, a_i$ on any $x$ containing the assignment given by $z$ and $p$. By our choice of T, $b_z$ depends only on $y$ and $w_z$ is uniquely determined by $z$ (and $p$, which is fixed). Let $Y_y$, $y \neq y_t$, be the set of all these witnesses $w_z, b_z$ for all possible $z$'s. The size of each such $Y_y$ is $2^{O(n)}$ (including the sizes of the witnesses $w_z, b_z$).
Now we define a formula $F$ that attempts to compute $f$ and uses as advice $Tr$, $p$ and some $t$ sets $Y_y$. For each $u \in \{0,1\}^{n^c}$ produce $r_{y_t}(u, p)$ (this is in $AC^0$). Let $V$ be the set of those inputs $u$ for which $tr(r_{y_t}(u, p))$ either is $Tr$ or extends $Tr$, and let $U$ be the complement of $V$. Define $d_0$ to be the majority value of $f$ on $U$. Then use S to produce $y_1, a_1$. If $y_1, a_1$ is different from $Tr$, output $d_0$. Otherwise, find the unique T's advice in $Y_{y_1}$. Again, this is doable by a constant-depth formula of size $2^{O(n)}$ which has poly(n) output bits. It has the form $\bigwedge_{z \in \{0,1\}^n} (z = r_{y_t}(u, p)|(J_{y_1}(A) \cap J_{y_t}(A)) \to \text{output} = w_z \in Y_{y_1})$. In the same manner continue until S produces $y_t, a_t$. If $y_t, a_t$ differs from $Tr$, output $d_0$. Otherwise, output 0 iff $SAT(y_t, a_t)$.
$F$ is a formula with $n^c$ inputs and size $2^{O(n)}$ because producing $r_{y_t}(u, p)$ is in $AC^0$, searching for T's advice in the $Y_{y_i}$'s is doable by constant-depth $2^{O(n)}$-size formulas, S is in $NC^1$ and the structure of the S-T protocol can be described by a constant-depth formula of size $n^{O(1)}$:
$$(S(x) \notin Tr \to \text{output} = d_0) \wedge (S(x) \in Tr \to ((S(x, w_z, b_z) \notin Tr \to \text{output} = d_0) \wedge (\dots (S(x, w_1, b_1, \dots, w_t, b_t) \notin Tr \to \text{output} = d_0) \wedge (S(x, w_1, b_1, \dots, w_t, b_t) \in Tr \to (\text{output} = 0 \leftrightarrow SAT(y_t, b_t))) \dots)))$$
By the choice of $Tr$, for at least a fraction $2/(3(2^n))^t$ of all $u \in \{0,1\}^{n^c}$ we have that $u \in V$ and $F$ will successfully predict $f(u)$. Moreover, by the choice of $Tr$ in the proof of Claim 1, at most $1/(3(2^n))^t$ of all traces $tr(r_{y_t}(u, p))$ properly extend $Tr$. Since $d_0$ is the correct value on at least half of $u \in U$, $F$ will successfully predict $f(u)$ on at least half of $U$, half of $V$ and $\frac{1}{2} \cdot \frac{1}{3^t 2^{nt}}$ of all $u$'s. That is, $P_{u \in \{0,1\}^{n^c}}[F(u) = f(u)] \ge 1/2 + 1/(3^t 2^{nt+1})$. Since $F$ has $n^c$ inputs and $t$ is bounded by a constant, rescaling the input length yields formulas $F_n$ of size $2^{O(n^{1/c})}$ such that for sufficiently big $n$'s, $P_{x \in \{0,1\}^n}[F_n(x) = f(x)] \ge 1/2 + 1/2^{O(n^{1/c})}$.
To obtain an unconditional unprovability of circuit lower bounds we can use Håstad's lower bound for constant-depth circuits computing the parity function.
Theorem 6.2 (Håstad [8]). For any depth $d$ circuits $C_n$ of size $2^{n^{1/(d+1)}}$ and large enough $n$, $P_{x \in \{0,1\}^n}[C_n(x) = PARITY(x)] \le 1/2 + 1/2^{n^{1/(d+1)}}$.
If $V^0 \vdash LB(SAT, n^k)$, then $LB(SAT, n^k)$ has an $AC^0$ S-T protocol with $O(1)$ rounds, so the resulting formula $F$ in the proof of Theorem 6.1 would actually be a constant-depth circuit and PARITY could be approximated by constant-depth circuits of size $2^{O(n^{1/c})}$ with advantage $1/2^{O(n^{1/c})}$. This is not enough for a contradiction with Håstad's theorem. Nevertheless, it is sufficient if we replace the polynomial circuit lower bounds $LB(SAT, n^k)$ by the quasi-polynomial lower bounds $LB(SAT, n^{\log n})$:
$$\forall m > n_0,\ \forall C,\ \exists y, a,\ |a| < |y| = n,\ \forall w,\ |w| \le n^{\log n} = m,\ [Comp(C, y, w) \to (C(y; w) = 0 \wedge SAT(y, a)) \vee (C(y; w) = 1 \wedge \forall z \neg SAT(y, z))]$$
where $n$ is the number of inputs to $C$ and $m$ represents $n^{\log n}$ (or simply $|m| = |n|^2$).
If $V^0 \vdash LB(SAT, n^{\log n})$, then in the proof of Theorem 6.1 we can use, instead of $n^{4kc}$-size circuits of the form $(NW_{f,A}(x))_y$ with $x \in \{0,1\}^{n^{2c}}$, say $n^{4k \log\log n}$-size circuits $(NW_{f,A}(x))_y$ with $x$ of size $n^{2 \log\log n}$ and big enough $k$. The proof works for big enough $n$ even if $c = \log\log n$. The size of the resulting constant-depth circuit $F$ is then $2^{O(n^{1/\log\log n})}$ with advantage $1/2^{O(n^{1/\log\log n})}$, contradicting Håstad's theorem.
Corollary 6.3. $V^0 \nvdash LB(SAT, n^{\log n})$.
Acknowledgement
I would like to thank Jan Krajíček, Albert Atserias, Sam Buss and an anonymous reviewer for many useful discussions, comments and suggestions.
