Proving circuit lower bounds in uniform classes is one of the grand challenges in computational complexity theory. In particular, it is well known that proving superpolynomial circuit lower bounds in NP resolves the longstanding conjecture $\mathrm{NP} \neq \mathrm{P}$. Towards this final goal, much work has been dedicated to approaches that prove circuit lower bounds in higher classes. This tutorial article overviews the proof techniques developed for circuit lower bounds in classes higher than NP, such as PH, $\mathrm{ZPP}^{\mathrm{NP}}$, $\mathrm{MA}_{\mathrm{EXP}}$, PP, Promise-MA, NEXP, and so forth.
Introduction
Nonuniform computation models such as circuit families are allowed to switch algorithms depending on the length of the given input, while uniform models such as Turing machines are required to run a single algorithm independently of the input length. Due to this property, nonuniform models are known to have much stronger computational power than uniform ones in some cases. (A Turing machine works on inputs of arbitrary length, but a single circuit $C_n : \{0,1\}^n \to \{0,1\}$ can only work on a fixed input length $n$. Thus, we need to consider a circuit family $C = \{C_1 : \{0,1\} \to \{0,1\},\ C_2 : \{0,1\}^2 \to \{0,1\},\ \ldots\}$ that contains a single $n$-input circuit for every input length $n$ in order to deal with inputs of arbitrary length.)
For example, any decision problem $L : \{0,1\}^* \to \{0,1\}$ can be solved by a circuit family of size $O(2^n/n)$ [35], where $n$ is the input length, but the so-called halting problem cannot be solved by any Turing machine no matter how long it runs.
(Throughout this article, we focus on decision problems of deciding whether $x \in L$ or not on a given input $x \in \{0,1\}^*$, or equivalently, of computing $L : \{0,1\}^* \to \{0,1\}$, and so "a problem" means a decision problem unless otherwise specified.)
If computational resources are bounded in these models, little is known about how strong circuits are compared with uniform complexity classes such as $\mathrm{P} := \mathrm{TIME}(\mathrm{poly}(n))$ and $\mathrm{NP} := \mathrm{NTIME}(\mathrm{poly}(n))$. ($\mathrm{TIME}(t(n))$ and $\mathrm{NTIME}(t(n))$ are classes of problems solved by $O(t(n))$-time deterministic and nondeterministic Turing machines, respectively.) For example, it is widely believed that some problem in the exponential-time class $\mathrm{EXP} := \mathrm{TIME}(2^{\mathrm{poly}(n)})$ cannot be solved by polynomial-size circuits, i.e., $\mathrm{EXP} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$ ($\mathrm{SIZE}(s(n))$ is a class of problems solved by $O(s(n))$-size circuit families), but we have no technique to prove this separation at the present time.
Therefore, it is one of the most fundamental and important issues in computational complexity theory to prove lower bounds of circuit size, i.e., circuit lower bounds, for a problem that belongs to a uniform class. In other words, our task is then to show that a specific problem computed by some uniform model (for example, the satisfiability problem SAT, which can be solved by a polynomial-time nondeterministic Turing machine) cannot be solved by any circuit family of small size (for example, polynomial-size circuit families).
Proving high circuit lower bounds in uniform classes, namely, discovering a problem in a uniform class that is hard against small circuit families, not only enriches our understanding of the power of nonuniform computation but also clarifies limits of uniform computation, such as separations of uniform classes and derandomization of randomized uniform classes. Therefore, proving some circuit lower bounds directly resolves longstanding open problems in computational complexity theory.
For example, proving superpolynomial circuit lower bounds in NP implies the separation of NP from P. More specifically, by showing that some specific problem in NP such as SAT is not in the class $\mathrm{SIZE}(\mathrm{poly}(n))$, i.e., $\mathrm{SAT} \notin \mathrm{SIZE}(\mathrm{poly}(n))$ (and thus $\mathrm{NP} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$), we can obtain the consequence $\mathrm{NP} \neq \mathrm{P}$. Obviously, proving the $\mathrm{NP} \neq \mathrm{P}$ conjecture is one of the ultimate goals in computer science. Proving circuit lower bounds in NP establishes an approach towards the goal.

2010 Mathematics Subject Classification: Primary 68Q17, Secondary 68Q15. This work is supported in part by the ELC project (Grant-in-Aid for Scientific Research on Innovative Areas MEXT Japan, KAKENHI No. 24106009).
Also, proving exponential circuit lower bounds in the class $\mathrm{E} := \mathrm{TIME}(2^{O(n)})$ implies the full derandomization of BPP, which is a class of problems solved by polynomial-time probabilistic Turing machines with bounded errors [24]. More precisely, if there exists a problem in E that requires circuits of size $2^{\Omega(n)}$ for every sufficiently large $n$, we have $\mathrm{BPP} = \mathrm{P}$. The $\mathrm{BPP} = \mathrm{P}$ conjecture is also important in computer science because it clarifies a limit of the computational power of randomized computation, and so proving circuit lower bounds in E establishes an approach to resolving the conjecture. If $\mathrm{BPP} = \mathrm{P}$, randomized algorithms for decision problems cannot gain exponential speedups in the bounded-error setting.
As seen in the above examples, proving circuit lower bounds in uniform classes would have a great impact on computer science. However, it is too challenging to prove them directly. In fact, the best known lower bound in NP is $5n - o(n)$, shown by Iwama, Lachish, Morizumi and Raz [25, 26, 33], which is still far from the goal of superpolynomial circuit lower bounds in NP with currently known techniques. So, a number of papers in computational complexity theory have intensively and broadly studied relaxed settings towards the challenging goals.
One approach is to prove lower bounds in NP for restricted circuit classes, such as constant-depth circuits, monotone circuits, and so forth, and then gradually relax the restrictions towards unrestricted circuits. Another approach, on which we focus in this article, is to first prove high circuit lower bounds in high uniform classes, and then bring the uniform classes down towards NP.
The former approach has established excellent techniques to prove lower bounds for restricted circuit classes since the early 80's [3, 21, 23, 39, 40, 46]. For example, it was shown by Ajtai [3] and Furst, Saxe and Sipser [21] that no $\mathrm{AC}^0$ circuit family, which consists of polynomial-size circuits of constant depth with unbounded fan-in gates, solves the parity problem, which is the problem of deciding whether $\sum_i x_i$ is odd or not, and is obviously in the class P. However, the obstacle of natural proofs prevents most of the successful techniques in this approach from being extended to unrestricted circuit lower bounds in uniform classes such as NP. Proofs of circuit lower bounds often make use of some property $\mathcal{P}$ (equivalently, a subset of functions from $\{0,1\}^*$ to $\{0,1\}$) that a specific problem $L$ (e.g., the parity problem) in a uniform class does not satisfy (i.e., $L \notin \mathcal{P}$) but every problem in some circuit class $\mathcal{C}$ (e.g., $\mathrm{AC}^0$) does (i.e., $\mathcal{C} \subseteq \mathcal{P}$). Razborov and Rudich demonstrated that, under a widely believed cryptographic assumption, we cannot prove $\mathrm{NP} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$ by using a property that satisfies certain natural conditions [41]. For example, it is known that these conditions are satisfied by the property used in the proofs that the parity problem is not in $\mathrm{AC}^0$.

As mentioned above, the latter approach, which we will study in this article, is to focus on uniform classes higher than NP. We can imagine that it would be easier to prove a superpolynomial circuit lower bound in the nondeterministic exponential-time class $\mathrm{NEXP} := \mathrm{NTIME}(2^{\mathrm{poly}(n)})$, i.e., $\mathrm{NEXP} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$, than $\mathrm{NP} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$ since $\mathrm{NP} \subseteq \mathrm{NEXP}$. So, in this approach, we first try to prove a circuit lower bound in higher classes (e.g., NEXP), and then find powerful techniques to bring it down into NP later.
This approach has already been studied for more than three decades. In the early 80's, Kannan proved fixed polynomial lower bounds in the polynomial-time hierarchy PH (precisely, $\Sigma_2^P \cap \Pi_2^P$). Specifically, he proved that for every constant $k > 0$ there is a problem $L_k \in \Sigma_2^P \cap \Pi_2^P$ such that $L_k \notin \mathrm{SIZE}(n^k)$. Subsequent to the result of Kannan, several papers improved the classes that have fixed polynomial lower bounds within the framework of Kannan's argument, such as $\mathrm{ZPP}^{\mathrm{NP}} \not\subseteq \mathrm{SIZE}(n^k)$ by Köbler and Watanabe [32] and $\mathrm{S}_2^P \not\subseteq \mathrm{SIZE}(n^k)$ by Cai [14] (the argument in [14] is based on Sengupta's observation), towards high circuit lower bounds in NP. Simply put, $\mathrm{ZPP}^{\mathrm{NP}}$ is a class of problems solved by a probabilistic Turing machine in expected polynomial time with zero error probability using a special subroutine (called an oracle) that solves NP problems in a single step, and $\mathrm{S}_2^P$ is a class of problems solved by a polynomial-time deterministic Turing machine (called a verifier) with witnesses sent from two competing all-powerful entities (called provers). See Definitions 4.6 and 4.12.
A series of results within the framework of Kannan's argument can also be converted to superpolynomial circuit lower bounds by lifting the uniform classes up to their exponential-time versions, as discussed by Miltersen, Vinodchandran and Watanabe [36]. For example, Kannan's argument directly provides $\Sigma_2^{\mathrm{EXP}} \cap \Pi_2^{\mathrm{EXP}} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$, where $\Sigma_2^{\mathrm{EXP}} \cap \Pi_2^{\mathrm{EXP}}$ is an exponential-time version of $\Sigma_2^P \cap \Pi_2^P$. These arguments heavily rely on diagonalization, which is the most standard technique for separations among classes, and this was known to be almost the only hopeful argument for circuit lower bounds that avoids the obstacle of natural proofs. However, diagonalization itself is trapped by another barrier, the relativization barrier, discovered by Baker, Gill and Solovay [9].
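We say a separation $\mathcal{C} \not\subseteq \mathcal{D}$ relativizes if $\mathcal{C}^A \not\subseteq \mathcal{D}^A$ for every oracle $A$. ($\mathcal{C}^A$ is then a class of problems that are computable in $\mathcal{C}$ with access to $A$. In the case of circuits, the oracle is implemented as a gate.) Separations shown by diagonalization relativize, and thus, if $\mathcal{C}^A \subseteq \mathcal{D}^A$ for some oracle $A$, we cannot prove the separation $\mathcal{C} \not\subseteq \mathcal{D}$ by diagonalization. Non-relativizing techniques such as arithmetization were later developed [7, 8, 34, 45], which suggested high potential for broad development of proof techniques in computational complexity theory, including circuit lower bounds.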
In fact, Buhrman, Fortnow and Thierauf demonstrated a non-relativizing technique for circuit lower bounds derived from the arithmetization technique [7]; they proved superpolynomial circuit lower bounds in $\mathrm{MA}_{\mathrm{EXP}}$, which is a randomized version of NEXP, and further, they showed that such lower bounds cannot be obtained by any relativizing argument, namely, their proof of circuit lower bounds actually avoids the relativization barrier. Vinodchandran also showed fixed polynomial lower bounds in PP, i.e., the unbounded-error probabilistic polynomial-time class, through arithmetization [52]. Later, Aaronson proved that Vinodchandran's result indeed avoids the relativization barrier [1], similarly to Buhrman et al.'s result [13]. Santhanam showed a striking improvement of Vinodchandran's result [43]; he proved a fixed polynomial circuit lower bound in Promise-MA (a randomized version of NP), which is known to be contained in PP [51].
As shown by these results, the arithmetization pushed up proof techniques for circuit complexity beyond the relativization barrier and made significant steps forward towards resolving the challenging open problems.
However, Aaronson and Wigderson discovered a new obstacle called the algebrization barrier [2], which is an algebraic extension of the relativization barrier. Their result clarified limits of many known non-relativizing techniques, including arithmetization, for proving circuit lower bounds and separating uniform classes, such as circuit lower bounds in $\mathrm{MA}_{\mathrm{EXP}}$.

Recently, Williams achieved a surprising breakthrough in proving circuit lower bounds beyond these obstacles; he proved that NEXP does not have non-uniform $\mathrm{ACC}^0$ circuits of polynomial size, where $\mathrm{ACC}^0$ consists of circuit families with constant depth over unbounded fan-in modulo gates in addition to the standard gates. Most recently, he further improved the class to $\mathrm{NEXP} \cap \mathrm{coNEXP}$ (an exponential-time version of $\mathrm{NP} \cap \mathrm{coNP}$), namely, he proved that $\mathrm{NEXP} \cap \mathrm{coNEXP}$ does not have $\mathrm{ACC}^0$ circuits of size $n^{\log n}$ [57].

As briefly described above, there have been important results with innovative ideas for proving circuit lower bounds in high uniform classes. In this tutorial article, we will overview these results and the proof techniques developed for circuit lower bounds.
Starting with basic notions and notation in Section 2, we will observe basic differences between the computational power of Turing machines and that of circuit families in Section 3, and then we will study significant results on circuit lower bounds in Sections 4, 5 and 6. In particular, we will overview Kannan's argument and its extensions, which prove fixed polynomial circuit lower bounds in the polynomial-time hierarchy, in Section 4. In Section 5, we will overview circuit lower bounds proved by arithmetization techniques that overcome the relativization barrier. In Section 6, we will study a new technique recently discovered by Williams that provides circuit lower bounds in the nondeterministic exponential-time class from fast algorithms for circuit satisfiability problems. More specifically, we will study the following results in Sections 4, 5 and 6:
• fixed polynomial circuit lower bounds in $\Sigma$ … [52],
• fixed polynomial circuit lower bounds in MA/1 and in Promise-MA (Section 5.4) [43],
• superpolynomial $\mathrm{ACC}^0$ circuit lower bounds in NEXP and in $\mathrm{NEXP} \cap \mathrm{coNEXP}$ (Section 6.1) [56, 57].

As mentioned earlier, see Figure A·1 in the Appendix for relationships among uniform complexity classes. We conclude this article in Section 7 with the next steps for proving circuit lower bounds in uniform classes.
Basic Notions and Notation
In this article, we consider Turing machines and families of circuits as devices to compute a decision problem $L \subseteq \{0,1\}^*$ (we sometimes identify the decision problem $L \subseteq \{0,1\}^*$ with its characteristic function $L : \{0,1\}^* \to \{0,1\}$, as mentioned earlier). We will not explain the details of Turing machines here; we just give a pointer to an excellent introduction, Chapters 1-3 of Arora and Barak's textbook [5], for starters.
We assume that readers have basic (undergraduate-level) knowledge of computational complexity theory (including the definitions of the basic complexity classes NP, P, PSPACE, the well-known problem SAT, and others). For example, we defined $\mathrm{NP} := \mathrm{NTIME}(\mathrm{poly}(n))$ earlier, but we often switch to another definition of NP: for $L \in \mathrm{NP}$ there exist a polynomial $p$ and a polynomial-time computable function $R_L : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}$ such that $x \in L \Leftrightarrow \exists w \in \{0,1\}^{p(|x|)},\ R_L(x, w) = 1$. Readers who are unfamiliar with such an idea are advised to first read Arora and Barak's textbook [5]. As for advanced complexity classes such as $\Sigma_k^P$, $\mathrm{ZPP}^{\mathrm{NP}}$, $\mathrm{S}_2^P$, MA, we will give their definitions as necessary.

Proving Circuit Lower Bounds in High Uniform Classes
This article principally deals with notions of circuits, and so, we briefly introduce basic notation and notions for circuits. Unless otherwise specified, any circuit consists of the gates from the set {AND, OR, NOT, INPUT} throughout this article, where gate types AND, OR have fan-in 2, NOT has fan-in 1, INPUT has fan-in 0, and they all have unbounded fan-out.
We measure the size of a circuit by the number of gates in the circuit. $|C|$ denotes the size of a circuit $C$. We say $C$ is an $s$-size circuit if $|C| \le s$. For any circuit $C$, we denote by $\mathrm{desc}(C) \in \{0,1\}^*$ a description (by a bit string) of the circuit $C$. Note that the bit length $|\mathrm{desc}(C)|$ is at most $O(|C| \log |C|)$. Conversely, for a string $\sigma \in \{0,1\}^*$ we denote by $\mathrm{ckt}(\sigma)$ the circuit that $\sigma$ represents. (If $\sigma$ is not a well-formed description of a circuit, we suppose $\mathrm{ckt}(\sigma)$ represents a circuit that outputs 0 for every input.)
We say a string $\sigma \in \{0,1\}^{2^n}$ is a truth table of a circuit $C : \{0,1\}^n \to \{0,1\}$ if $\sigma_i = C(i)$, where $\sigma_i$ denotes the $i$-th bit of the string $\sigma$. Thus, we have $\sigma = C(0) \cdots C(2^n - 1)$. We denote by $\mathrm{tt}(C)$ the truth table of a circuit $C$. As briefly described in Section 1, for a function $s(n)$, we denote by $\mathrm{SIZE}(s(n))$ a class of decision problems that some family of $O(s(n))$-size $n$-input circuits $\{C_n : \{0,1\}^n \to \{0,1\}\}_{n \in \mathbb{N}}$ can solve correctly. Namely, $L \in \mathrm{SIZE}(s(n))$ if and only if there exists a family of $s(n)$-size $n$-input circuits $\{C_n\}_{n \in \mathbb{N}}$ such that $L(x) = C_n(x)$ for every sufficiently large $n \in \mathbb{N}$. Also, we define $\mathrm{SIZE}(\mathrm{poly}(n)) := \bigcup_{c > 0} \mathrm{SIZE}(n^c)$. (As for Turing machines, we denote by $\mathrm{TIME}(t(n))$ (and $\mathrm{NTIME}(t(n))$, respectively) a class of decision problems that some $O(t(n))$-time deterministic (and nondeterministic, respectively) Turing machine solves.)
We can hardwire a fixed bit string into a circuit by implementing a constant value 1 (0, respectively) as $x \vee \neg x$ ($x \wedge \neg x$, respectively) using only an arbitrary 1-bit input $x$ and the elementary gates.
Turing Machines versus Circuit Families
As mentioned in Section 1, nonuniform computation models such as families of circuits are considerably stronger than uniform models such as Turing machines because a nonuniform model can switch algorithms depending on the length of inputs. This is one of the reasons why it is extremely difficult to prove circuit lower bounds in uniform classes in general.
Indeed, any decision problem can be computed by a family of circuits, while no Turing machine can solve the so-called halting problem. The strength of circuits can be easily demonstrated by using the Shannon expansion. The Shannon expansion shows that for every Boolean function $f : \{0,1\}^n \to \{0,1\}$ we have
$$f(x_1, x_2, \ldots, x_n) = \bigl(x_1 \wedge f(1, x_2, \ldots, x_n)\bigr) \vee \bigl(\neg x_1 \wedge f(0, x_2, \ldots, x_n)\bigr).$$
From this equation, we can construct a circuit for an $n$-variate function from circuits for two $(n-1)$-variate functions and five gates (two AND gates and INPUT, NOT, OR gates). Applying this expansion recursively, we obtain an $O(2^n)$-size circuit for $f$. This fact immediately gives the following proposition.

Proposition 3.1. For every Boolean function $f : \{0,1\}^n \to \{0,1\}$, there exists a circuit $C$ of size at most $7 \cdot 2^{n-1} - 5$ such that $C(x) = f(x)$ for every $x \in \{0,1\}^n$, namely, $C$ computes $f$ correctly.
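The recursive use of the Shannon expansion can be sketched in a few lines of Python. This is only an illustration: the names `Gate`, `shannon_circuit`, `evaluate` and `size` are hypothetical, and the base case (one remaining variable) is handled naively with up to three gates, so the sketch only guarantees the slightly weaker size bound $8 \cdot 2^{n-1} - 5$ rather than the bound of Proposition 3.1.

```python
from itertools import product

class Gate:
    """A gate of type INPUT, NOT, AND or OR; `var` is used by INPUT gates only."""
    def __init__(self, kind, args=(), var=None):
        self.kind, self.args, self.var = kind, args, var

def shannon_circuit(f, n, prefix=()):
    """Build a circuit for f: {0,1}^n -> {0,1} by expanding on x_1, x_2, ... in turn."""
    i = len(prefix)                      # index of the variable expanded at this level
    x = Gate("INPUT", var=i)
    if i == n - 1:                       # one variable left: the restriction is 0, 1, x or NOT x
        lo, hi = f(prefix + (0,)), f(prefix + (1,))
        if (lo, hi) == (0, 1):
            return x
        not_x = Gate("NOT", (x,))
        if (lo, hi) == (1, 0):
            return not_x
        # constant 1 as (x OR NOT x), constant 0 as (x AND NOT x)
        return Gate("OR" if lo == 1 else "AND", (x, not_x))
    hi_c = shannon_circuit(f, n, prefix + (1,))
    lo_c = shannon_circuit(f, n, prefix + (0,))
    # f = (x AND f|x=1) OR (NOT x AND f|x=0): five new gates per expansion
    return Gate("OR", (Gate("AND", (x, hi_c)),
                       Gate("AND", (Gate("NOT", (x,)), lo_c))))

def evaluate(g, x):
    """Evaluate the circuit rooted at gate g on the input tuple x."""
    if g.kind == "INPUT":
        return x[g.var]
    vals = [evaluate(a, x) for a in g.args]
    if g.kind == "NOT":
        return 1 - vals[0]
    return (vals[0] & vals[1]) if g.kind == "AND" else (vals[0] | vals[1])

def size(g):
    """Number of distinct gate objects reachable from g."""
    seen, stack = set(), [g]
    while stack:
        h = stack.pop()
        if id(h) not in seen:
            seen.add(id(h))
            stack.extend(h.args)
    return len(seen)

# Usage: a circuit for the parity of three bits.
parity = lambda x: sum(x) % 2
c = shannon_circuit(parity, 3)
print(size(c), [evaluate(c, x) for x in product((0, 1), repeat=3)])
```

The function `f` is assumed to take a tuple of `n` bits and return 0 or 1; no gates are shared between the two subcircuits, which is why the size doubles at every level.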
Therefore, for every decision problem $L : \{0,1\}^* \to \{0,1\}$ there exists a family of $O(2^n)$-size circuits that computes $L$. Further, this upper bound on the circuit size can be improved by the result of Lupanov [35].

Theorem 3.2 (Lupanov (1958)). For every Boolean function $f : \{0,1\}^n \to \{0,1\}$, there exists a circuit $C$ of size $O(2^n/n)$ such that $C$ computes $f$ correctly.
It is known that this bound is tight since we can show that some Boolean function requires size $\Omega(2^n/n)$ by the standard counting argument. To see this, we first introduce the notion of consistency between a string and a circuit. This notion will also be used frequently later.

Definition 3.3. Let $\sigma \in \{0,1\}^\ell$ for $\ell \le 2^n$. We say an $n$-input circuit $C : \{0,1\}^n \to \{0,1\}$ is consistent with $\sigma$ if $\mathrm{tt}(C)_{\{1,\ldots,\ell\}} = \sigma$, namely, the first $\ell$ bits of the truth table of $C$ coincide with $\sigma$.

Lemma 3.4. Let $s \ge n$. If $3s \log s + 2s < \ell \le 2^n$, there exists a string $\sigma_{\mathrm{HARD}}$ of length $\ell$ with which no $s$-size $n$-input circuit is consistent.
Proof. We consider all the $\ell$-bit strings, which represent the first $\ell$ bits of the truth tables of $n$-variate Boolean functions. While the number of such strings is $2^\ell$, we show that if $3s \log s + 2s < \ell \le 2^n$ then the number of functions that $s$-size circuits can compute is strictly less than $2^\ell$, and thus, there exists some $\ell$-bit string with which no $s$-size circuit is consistent.
We estimate an upper bound on the number of $s$-size circuits, which is also an upper bound on the number of functions they compute. Since every gate has fan-in at most 2, the number of ways of connecting a gate to others is at most $(s-1)^2$, and thus, … that $(2^n/6n)$-size circuit is consistent. Therefore, the function whose truth table is $\sigma_{\mathrm{HARD}}$ cannot be computed by any $2^n/6n$-size circuit. □

Remark 3.6. We just used Lemma 3.4 to prove the lower bound matching Theorem 3.2 here, but it will be repeatedly used in the subsequent sections to guarantee the existence of strings having high circuit complexity.
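As a numerical sanity check, the condition of Lemma 3.4 can be evaluated directly. The sketch below uses hypothetical helper names and floating-point logarithms; it checks that the size $s = 2^n/6n$ (rounded down) satisfies $3s \log s + 2s < \ell \le 2^n$ with $\ell = 2^n$, and searches for the largest size that the condition rules out.

```python
import math

def counting_condition(s, ell, n):
    """Condition of Lemma 3.4: s >= n and 3 s log s + 2 s < ell <= 2^n."""
    return s >= n and 3 * s * math.log2(s) + 2 * s < ell <= 2 ** n

def max_size_ruled_out(n):
    """Largest integer s >= n meeting the condition for ell = 2^n (None if there is none)."""
    best, s = None, n
    while counting_condition(s, 2 ** n, n):
        best = s
        s += 1
    return best

# Usage: compare the largest admissible s with 2^n / 6n for a few input lengths.
for n in (10, 15, 20):
    print(n, 2 ** n // (6 * n), max_size_ruled_out(n))
```

The linear search is adequate only for small $n$; it relies on $3s \log s + 2s$ being increasing in $s$.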
From the above discussion, we can see that circuit families are considerably stronger than Turing machines in some situations. On the other hand, we can find a correspondence between circuit families and Turing machines by adding special strings called advice to Turing machines. The function $a$ is the advice, which provides helpful information for the computation but can depend only on the input length. Note that the function $f$ bounds the length of the advice.
It is easy to imagine that if the advice provides a design of an $n$-input circuit of polynomial size on input length $n$, we can simulate the circuit in polynomial time. Conversely, it is known that any Turing machine can be efficiently simulated by a circuit family, as the following theorem shows (see, e.g., Theorem 2.8 of [53]): … We say $T$ is time-constructible if there exists a Turing machine that outputs the binary representation of $T(|x|)$ on a given $x$ in $T(|x|)$ time. From this theorem, P/poly is completely characterized by polynomial-size circuit families, namely, we have $\mathrm{P/poly} = \mathrm{SIZE}(\mathrm{poly}(n))$.
Proposition 3.9. $L \in \mathrm{P/poly}$ if and only if there exists a polynomial-size circuit family $\{C_n : \{0,1\}^n \to \{0,1\}\}_{n \in \mathbb{N}}$ such that $C_n$ correctly computes $L \cap \{0,1\}^n$, namely, $x \in L \cap \{0,1\}^n \Leftrightarrow C_n(x) = 1$, for every sufficiently large $n$.
Also, note from Theorem 3.8 that we can simulate polynomial-time deterministic Turing machines by polynomial-size circuit families, i.e., $\mathrm{P} \subseteq \mathrm{SIZE}(\mathrm{poly}(n))$.
Now, we start studying the arguments for circuit lower bounds with the following warm-up proposition, which shows a superpolynomial circuit lower bound in $\mathrm{EEXP} := \mathrm{TIME}(2^{2^{\mathrm{poly}(n)}})$ by using the basic diagonalization argument.

Proposition. $\mathrm{EEXP} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$.

Proof. The main idea is to construct a truth table of a hard function that kills all the polynomial-size circuits by using the power of double-exponential time.
Denote by $N(s)$ the number of $s$-size circuits. As estimated in the proof of Lemma 3.4, we have $N(s) \le 2^{3s \log s + 2s}$. Let $s(n) := n^{\log n}$. Since $N(s(n)) = 2^{O(n^{\log n} \log^2 n)}$, we can lexicographically enumerate all the truth tables of $s(n)$-size circuits of $n$ inputs in double-exponential time. Let $N := N(s(n))$ for short. Define a matrix $T \in \{0,1\}^{N \times 2^n}$ whose $i$-th row is the truth table of the $i$-th circuit, and let $b_0 := \neg\mathrm{Maj}(T_{1,1}, \ldots, T_{N,1})$, so that at most $N/2$ circuits output $b_0$ on the input $0 \cdots 0$. We remove the circuits killed by $b_0$, namely, the rows whose first bit is not $b_0$, from $T$. Denote by $N_0$ the number of rows (or equivalently the number of circuits that survive $b_0$) in the updated $T$. Next, we define $b_1 := \neg\mathrm{Maj}(T_{1,2}, \ldots, T_{N_0,2})$. By the same reasoning, the number of circuits whose outputs on inputs $0 \cdots 0$ and $0 \cdots 01$ coincide with $b_0$ and $b_1$, respectively, is at most $N_0/2 \le N/4$. By repeating the same procedure, we can define $b_0, b_1, \ldots, b_M$ so that every $s(n)$-size circuit is killed by one of them, where $M \le \log N = O(n^{\log n} \log^2 n)$. Note that double-exponential time is enough to perform this procedure.
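The halving step of this diagonalization can be sketched as follows, assuming the truth tables are given explicitly as rows of 0/1 values (in the proof they are enumerated in double-exponential time). The function name `diagonalize` is hypothetical, and ties in the majority are broken towards 1, which is an arbitrary choice.

```python
def diagonalize(tables):
    """Return bits b_0, b_1, ... such that every row of `tables` disagrees with them somewhere.

    Assumes every row is long enough (about log2(len(tables)) + 1 bits suffice).
    """
    alive = list(tables)        # rows (circuits) not yet killed
    bits = []
    j = 0                       # current input, i.e. current column of the matrix T
    while alive:
        ones = sum(row[j] for row in alive)
        majority = 1 if 2 * ones >= len(alive) else 0
        b = 1 - majority        # b_j := NOT Maj(column j of the surviving rows)
        bits.append(b)
        # rows that output b on input j survive; all others are killed by b_j
        alive = [row for row in alive if row[j] == b]
        j += 1
    return bits

# Usage: 52 rows of length 8, taken from the binary expansions of 0, 5, 10, ...
rows = [tuple((i >> j) & 1 for j in range(8)) for i in range(0, 256, 5)]
print(diagonalize(rows))
```

Since at most half of the surviving rows agree with the negated majority, the loop ends after at most $\lfloor \log_2 N \rfloor + 1$ rounds for $N$ rows.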
Setting 
and polynomial $p$ such that $x \in L \Leftrightarrow \forall^p w\,[(w, x) \in L']$. Also, using an NP oracle, we can define $\Sigma$ …
We can define these classes via alternating Turing machines. For $k \ge 1$, we say $L \in \Sigma$ … Also, $\Sigma_k\mathrm{TIME}[f(n)]$ ($\Pi_k\mathrm{TIME}[f(n)]$, respectively) is defined as a class of problems that can be computed by an alternating Turing machine in time $O(f(n))$ with $k - 1$ alternations from an existential state (a universal state, respectively).
Kannan's argument provides fixed polynomial circuit lower bounds in PH, more specifically, $\Sigma$ …

Proof. Recall Lemma 3.4, which shows the existence of hard strings with which no small circuit is consistent. The main idea of Kannan's argument is to guess such a hard string and verify the hardness using the power of alternation. Consider the following $\Sigma_2$ procedure.
(1) Guess a string $\sigma_{\mathrm{HARD}}$ of length $n^{k+1}$.
(2) Verify that no $n^k$-size circuit is consistent with $\sigma_{\mathrm{HARD}}$.
This procedure indeed provides a hard string $\sigma_{\mathrm{HARD}}$ since such a string exists by Lemma 3.4, and thus, given an input $x$, we just output the $x$-th bit of $\sigma_{\mathrm{HARD}}$ to define the hard problem. Note that the consistency check of $\sigma_{\mathrm{HARD}}$ with a single $n^k$-size circuit can be done in polynomial time since the length of $\sigma_{\mathrm{HARD}}$ is bounded by a polynomial. However, such a hard string may not be unique. We cannot then define the $x$-th bit of the hard strings uniquely. So, we define the hard problem by the lexicographically first hard string. The lexicographically first hard string can be defined by the following step.
(3) Verify that for every string $\sigma < \sigma_{\mathrm{HARD}}$ there exists an $n^k$-size circuit that is consistent with $\sigma$.
Combining these steps, we can provide a $\Sigma_3^P$ problem $L_{\mathrm{HARD}}$ that is not computed by any $n^k$-size circuit. Denote by $\ell$ the description length of $n^k$-size circuits (where $\ell = O(n^k \log n)$), and set $\ell_{\mathrm{HARD}} := 3n^k + 3n^k \log n^k + 1$. By Lemma 3.4, there exists a $\sigma_{\mathrm{HARD}}$ of length $\ell_{\mathrm{HARD}}$ with which no $n^k$-size circuit is consistent.
… [$C$ and $C'$ are $n^k$-size circuits, $C$ is not consistent with $\sigma_{\mathrm{HARD}}$; $\sigma < \sigma_{\mathrm{HARD}}$, $C'$ is consistent with $\sigma$, and the $x$-th bit of $\sigma_{\mathrm{HARD}}$ is 1]. □

Kannan also improved this theorem by combining it with the collapse of the polynomial-time hierarchy, which is quite useful for circuit lower bounds.
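The role of steps (1) to (3) in the proof above can be illustrated by brute force on toy data. In the sketch below, `tables` is an explicit (hypothetical) list of truth tables standing in for the set of all $n^k$-size circuits, and the exhaustive loops replace the existential and universal quantifiers of the alternating procedure.

```python
from itertools import product

def is_hard(sigma, tables):
    """Step (2): no truth table in `tables` is consistent with sigma."""
    return all(tuple(t[:len(sigma)]) != tuple(sigma) for t in tables)

def first_hard_string(ell, tables):
    """Steps (1) and (3): the lexicographically first hard string of length ell."""
    for sigma in product((0, 1), repeat=ell):   # candidates in lexicographic order
        if is_hard(sigma, tables):
            return sigma
    return None                                 # every ell-bit string is consistent

def l_hard(x, ell, tables):
    """The hard problem: the x-th bit (0-indexed here) of the first hard string."""
    return first_hard_string(ell, tables)[x]

# Usage: three toy "circuits" on 2 inputs, given by their 4-bit truth tables.
toy_tables = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0)]
print(first_hard_string(3, toy_tables))
```

Returning the first hard string in lexicographic order makes the defined problem unique, which is exactly what step (3) enforces.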
This framework considers two cases: some small circuit family can compute a hard problem, for example SAT, or not. The former case is believed to be unlikely, and so, something strange happens in the sense of computational complexity theory. In the case of SAT, the former case implies that the polynomial-time hierarchy collapses to the second level, although it is strongly conjectured that $\Sigma_{k+1}^P \neq \Sigma_k^P$ for every $k \ge 1$. Thus, we can obtain the circuit lower bounds in the second level of the polynomial-time hierarchy. In the latter case, we directly obtain the circuit lower bounds for SAT in NP.

KAWACHI
The collapse of the polynomial-time hierarchy was shown by Karp and Lipton [30] . Formally, they provided the following theorem.
Theorem 4.3 (Karp and Lipton (1980)). For every
The proof of this theorem essentially makes use of the search-to-decision reduction for SAT using downward self-reducibility (the property that efficient solutions on short input lengths imply those on long ones), which is given in the following lemma. This intuitively shows that a search version of SAT is as easy as the decision problem SAT. … Then, it is easy to see that $\alpha$ is indeed a satisfying assignment for $\varphi$. Since $A_{\mathrm{SAT}}$ is executed at most $m$ times and its single execution is in polynomial time, this algorithm runs in polynomial time. Even in the case of circuits, the idea for the construction is the same. □
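A search-to-decision reduction of this kind can be sketched as follows. The sketch assumes CNF formulas encoded as lists of clauses of nonzero integers (a DIMACS-like convention chosen here for illustration), and `brute_force_sat` is an exponential-time stand-in for the decision procedure $A_{\mathrm{SAT}}$; all names are hypothetical. It fixes the variables one at a time, keeping the formula satisfiable after each step.

```python
from itertools import product

def satisfies(cnf, assignment):
    """True if `assignment` (a list of booleans, variable v at index v-1) satisfies `cnf`."""
    return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in cnf)

def brute_force_sat(cnf, n_vars):
    """Stand-in for the decision procedure A_SAT (exponential time, illustration only)."""
    return any(satisfies(cnf, a) for a in product((False, True), repeat=n_vars))

def fix_literal(cnf, lit):
    """Simplify the CNF after setting the literal `lit` to true."""
    return [[l for l in clause if l != -lit] for clause in cnf if lit not in clause]

def search_sat(cnf, n_vars, decide=brute_force_sat):
    """Find a satisfying assignment using only calls to the decision procedure."""
    if not decide(cnf, n_vars):
        return None
    assignment = []
    for v in range(1, n_vars + 1):
        trial = fix_literal(cnf, v)          # try x_v = true
        if decide(trial, n_vars):
            assignment.append(True)
            cnf = trial
        else:                                # otherwise x_v = false must keep it satisfiable
            assignment.append(False)
            cnf = fix_literal(cnf, -v)
    return assignment

# Usage
phi = [[1, 2], [-1, 3], [-2, -3]]
print(search_sat(phi, 3))
```

This variant calls the decision procedure once per variable plus one initial satisfiability check.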
We now give the proof of Theorem 4.3.
Proof of Theorem 4.3. We first show $\mathrm{SAT} \in \mathrm{SIZE}(n^k) \Rightarrow \Sigma_2^P = \Pi_2^P$, and then $\Sigma$ …
… $\exists^p v\,[R_L(x, u, v) = 1]$ for some polynomial $p$ and polynomial-time computable function $R_L : (\{0,1\}^*)^3 \to \{0,1\}$. Assume $\mathrm{SAT} \in \mathrm{SIZE}(n^k)$, namely, some family of $n^k$-size circuits can compute SAT. Using the circuits, we can construct a polynomial-size circuit $C_{\mathrm{search}}$ that searches for a witness for $R_L$ from the idea of Lemma 4.4 and the Cook-Levin theorem. More precisely, we can efficiently convert the statement "$\exists^p v\,[R_L(x, u, v) = 1]$" for any $x, u$ into an equivalent SAT instance $\varphi_{x,u}$ of polynomial length by the Cook-Levin theorem, and then, we can search for a satisfying assignment for $\varphi_{x,u}$ if it exists. So, we can obtain the following circuit for every $x \in \{0,1\}^n$ and $u \in \{0,1\}^{p(n)}$:
…
By the property of the circuit $C_{\mathrm{search}}$, for every $x \in \{0,1\}^n$ and $u \in \{0,1\}^{p(n)}$, we have
…
where $|\mathrm{desc}(C_{\mathrm{search}})|$ is bounded by some polynomial in $n$ since $C_{\mathrm{search}}$ is a polynomial-size circuit. Therefore, any …
…, and thus we have $\mathrm{PH} = \Sigma$ …
…, we have $\exists^p u\,[(x, u) \in L'] \Leftrightarrow \exists^p u, v\,[(x, u, v) \in L'']$ for some $p$ and $L'' \in \Pi_1^P = \mathrm{coNP}$. Therefore, we have $x \in L \Leftrightarrow \exists^p u, v, \forall^p w\,[(x, u, v, w) \in L''']$ for some $p$ and $L''' \in \mathrm{P}$, and then $\Sigma$ …
The same reasoning inductively works for $\Sigma_k^P = \Sigma_2^P$ for every $k > 2$, and thus, $\mathrm{PH} = \Sigma_2^P$.
Applying the case-analysis framework mentioned earlier, we can obtain the following circuit lower bounds from Theorem 4.2 and Theorem 4.3. … Figure A·1. We now see the definition of the class $\mathrm{ZPP}^{\mathrm{NP}}$.
Definition 4.6. We say $L \in \mathrm{ZPP}^{\mathrm{NP}}$ if there exists a probabilistic Turing machine $M$ with access to an NP oracle such that for every $x \in \{0,1\}^*$ we have
and the expected running time of M is bounded by a polynomial.
Their main technical result is given as follows:
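Proof. As mentioned above, $\mathrm{ZPP}^{\mathrm{NP}} \subseteq \mathrm{NP}^{\mathrm{NP}} = \Sigma_2^P \subseteq \mathrm{PH}$. So, we show $\mathrm{PH} \subseteq \mathrm{ZPP}^{\mathrm{NP}}$ if SAT has a family of polynomial-size circuits.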
The main idea is to perform a binary search in the set of all the $n^k$-size circuits to find a circuit for SAT (on input length $n$) by a $\mathrm{ZPP}^{\mathrm{NP}}$ procedure. If we find the circuit for SAT, we can efficiently replace a quantified formula with an unquantified one using the circuit for SAT, and thus, we can compute any problem in PH by a $\mathrm{ZPP}^{\mathrm{NP}}$ algorithm. For simplicity, suppose that we have an efficient uniform sampler $U$ from a set of circuits, that is, we can sample a circuit uniformly at random from the set of circuits. (In fact, we can construct such a randomized procedure with the help of an NP oracle [11, 27].)

Let $S_0$ be the set of all the $n^k$-size circuits of input length $n$. First, we uniformly sample $100n$ circuits $C_1, \ldots, C_{100n}$ from $S_0$ by $U$. We construct a circuit $C_{\mathrm{Maj}}$ that takes the majority vote of the $100n$ circuits, namely, $C_{\mathrm{Maj}}(x) = \mathrm{Maj}(C_1(x), \ldots, C_{100n}(x))$. Then, we check whether $C_{\mathrm{Maj}}$ computes SAT correctly by using the NP oracle. If it does, we are done. If not, we find a counterexample $\varphi_0$ on which $C_{\mathrm{Maj}}$ fails by using the NP oracle. Next, let $S_1$ be the set of all the $n^k$-size circuits that compute $\varphi_0$ correctly. Then, the expected size of $S_1$ is half of $|S_0|$. Recall that $C_{\mathrm{Maj}}$ fails on $\varphi_0$, and $C_{\mathrm{Maj}}$ is the majority of $C_1, \ldots, C_{100n}$. This implies that at least half of $C_1, \ldots, C_{100n}$, uniformly chosen from $S_0$, fail on $\varphi_0$. Therefore, half of $S_0$ is expected to be killed by $\varphi_0$.
As done in the previous step for $S_0$, we sample $100n$ circuits from $S_1$ and obtain a counterexample $\varphi_1$ using the NP oracle if $C_{\mathrm{Maj}}$ does not compute SAT. Next, let $S_2$ be the set of all the circuits that compute $\varphi_0$ and $\varphi_1$ correctly. By the same discussion, the expected size of $S_2$ is half of $|S_1|$.
Repeating this procedure, the expected size of $S_i$ is at most $2^{-i} \cdot 2^{O(n^k \log n)}$ unless we find the circuit for SAT by the $i$-th repetition. (Recall that the number of $s$-size circuits is at most $2^{3s \log s + 2s}$, as given in the proof of Lemma 3.4.) Therefore, we can expect to find the circuit for SAT within at most $O(n^k \log n)$ repetitions. In fact, this overview is not of Köbler and Watanabe's original argument, but it is an overview of Bshouty, Cleve, Gavaldà, Kannan and Tamon's randomized algorithm that learns circuits with an NP oracle [12]. As noted in [12], their learning algorithm provides an alternative proof of this theorem, and both of them actually stem from the binary search in the set of circuits.
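The sample-vote-refine loop described above can be sketched on toy data. In this sketch the hypothesis class is an explicit list of truth tables rather than the set of $n^k$-size circuits, `rng.choice` plays the role of the uniform sampler $U$, the exhaustive comparison with `target` stands in for the NP-oracle query, and `samples` plays the role of $100n$. All names are hypothetical, and `target` is assumed to belong to `hypotheses`.

```python
import random

def learn_by_counterexamples(hypotheses, target, samples=15, seed=0):
    """Learn `target` by majority votes over sampled survivors and counterexamples."""
    rng = random.Random(seed)
    alive = list(hypotheses)                 # S_0: all candidate truth tables
    counterexamples = []
    while True:
        picked = [rng.choice(alive) for _ in range(samples)]
        # C_Maj: pointwise majority vote of the sampled hypotheses (samples is odd)
        majority = tuple(1 if 2 * sum(h[x] for h in picked) > samples else 0
                         for x in range(len(target)))
        wrong = [x for x in range(len(target)) if majority[x] != target[x]]
        if not wrong:
            return majority, counterexamples
        x = wrong[0]                         # a counterexample on which C_Maj fails
        counterexamples.append(x)
        # S_{i+1}: survivors that are correct on the new counterexample
        alive = [h for h in alive if h[x] == target[x]]

# Usage: all 256 truth tables on 3 inputs, with one of them as the target.
hyps = [tuple((i >> j) & 1 for j in range(8)) for i in range(256)]
learned, cex = learn_by_counterexamples(hyps, hyps[150])
print(learned == hyps[150], len(cex))
```

Each counterexample removes every sampled hypothesis that voted for the wrong value, so the surviving set shrinks strictly and the loop terminates; the halving in expectation discussed in the text is what makes the number of rounds small.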
The overview of their learning algorithm is easy to understand, but it uses an (almost) uniform sampler for circuits like [11, 27] as a black box. On the other hand, Köbler and Watanabe's argument indirectly specifies a circuit for SAT, but it is self-contained. In this article, we discuss the indirect, but self-contained, argument in detail.
Now we move to the formal proof. In the main idea, the algorithm makes use of a sampler for circuits. Instead of this sampler, we exploit the so-called pairwise independent hash functions, defined as follows, to indirectly specify circuits.

Definition 4.8. We say a family of functions $H = \{h : \{0,1\}^n \to \{0,1\}^m\}$ is pairwise independent if for every $x \in \{0,1\}^n$ and every $y \in \{0,1\}^m$
$$\Pr_h[h(x) = y] = 2^{-m},$$
and for every $x \neq x' \in \{0,1\}^n$ and every $y, y' \in \{0,1\}^m$
$$\Pr_h[h(x) = y \wedge h(x') = y'] = 2^{-2m},$$
where $h$ is taken uniformly at random from $H$.
Several implementations of such families are known. For example, $\{h_{T,b}(x) = Tx + b\}$ is a pairwise independent family of hash functions for $m \le n$, where $T \in \{0,1\}^{m \times n}$ is a Toeplitz matrix and $b \in \{0,1\}^m$, and the arithmetic operations are defined over the binary field $\mathbb{F}_2$, namely, the product is $\wedge$ and the sum is exclusive-or $\oplus$. We call a matrix $T = (t_{i,j})_{i,j}$ Toeplitz if $t_{i,j} = t_{i+1,j+1}$ for every $i, j$. Therefore, we can describe the function $h_{T,b}$ with $n + m - 1 + m$ bits. It is easy to see that we can efficiently sample one of the functions uniformly at random and compute $h_{T,b}(x)$ on an input $x$. See, e.g., Section D.2 of [22] for more details.
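The Toeplitz construction is small enough to be checked exhaustively for tiny parameters. The sketch below treats $T$ as a matrix with $m$ rows and $n$ columns so that $Tx$ is defined, and stores it as the $n + m - 1$ bits of its first row and column; the indexing convention and the function names are assumptions made for this illustration.

```python
from itertools import product

def toeplitz_hash(t, b, x):
    """h_{T,b}(x) = Tx + b over F_2, where T[i][j] = t[i - j + n - 1] (constant on diagonals)."""
    n, m = len(x), len(b)
    assert len(t) == n + m - 1
    return tuple(
        (sum(t[i - j + n - 1] & x[j] for j in range(n)) + b[i]) % 2
        for i in range(m)
    )

def collision_counts(n, m):
    """For each (x, x', y, y') with x != x', count the functions h with h(x) = y and h(x') = y'."""
    counts = {}
    for t in product((0, 1), repeat=n + m - 1):
        for b in product((0, 1), repeat=m):
            images = {x: toeplitz_hash(t, b, x) for x in product((0, 1), repeat=n)}
            for x, x2 in product(images, repeat=2):
                if x != x2:
                    key = (x, x2, images[x], images[x2])
                    counts[key] = counts.get(key, 0) + 1
    return counts

# Usage: n = 3, m = 2 gives 2^(n+m-1) * 2^m = 64 functions in the family.
counts = collision_counts(3, 2)
print(len(counts), set(counts.values()))
```

Pairwise independence corresponds to every tuple $(x, x', y, y')$ with $x \neq x'$ being hit by the same number of functions, namely $|H| / 2^{2m}$.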
$L_i$ denotes the list of counterexamples at the $i$-th repetition. $S_i$ denotes the set of $n^k$-size circuits that correctly compute all the SAT instances in $L_i$.
For a formula $\varphi$ and a set $S$ of $n^k$-size circuits, $F_{\varphi,S}$ denotes the set of $n^k$-size circuits in $S$ that fail on $\varphi$, namely, $F_{\varphi,S} := \{C \in S : C(\varphi) \neq \mathrm{SAT}(\varphi)\}$. Now, we can assume without loss of generality that no circuit fails on unsatisfiable formulas. Recall Lemma 4.4: we can construct a polynomial-size circuit for SAT search from a polynomial-size circuit for SAT. From this construction, every circuit $C$ can be assumed (with only a polynomial increase in size) to first try to find a satisfying assignment $\alpha$ for a given formula $\varphi$, and then to check that $\alpha$ satisfies $\varphi$. Therefore, if $\varphi$ is not satisfiable, no $C$ can find a satisfying assignment, and thus $C$ never fails on unsatisfiable formulas. Then, $S$ is a set of $n^{k'}$-size circuits for a constant $k' > k$. In what follows, we implicitly consider $S$ as the set of $n^{k'}$-size circuits constructed from $n^k$-size circuits. Then, $F_{\varphi,S} = \{C \in S : \varphi \in \mathrm{SAT} \wedge C(\varphi) = 0\}$, and the statement ''$\exists \varphi\, \exists C \in F_{\varphi,S}\, [R(\varphi, C) = 1]$'' for a polynomial-time computable $R$, which will be used in the algorithm, can be verified in NP since it is equivalent to ''$\exists \varphi, \exists \alpha, \exists C \in S\, [\varphi(\alpha) = 1 \wedge C(\varphi) = 0 \wedge R(\varphi, C) = 1]$.'' (Note that there are two cases in which a circuit fails on $\varphi$: (i) $\varphi \in \mathrm{SAT}$ but $C(\varphi) = 0$, and (ii) $\varphi \notin \mathrm{SAT}$ but $C(\varphi) = 1$. The former is easily checked in NP, but the latter is not. We can exclude the latter case by assuming that every circuit first searches for an assignment and then verifies it.)
We are now ready to describe the details of the $\mathrm{ZPP}^{\mathrm{NP}}$ algorithm that specifies a circuit for SAT indirectly. In fact, this algorithm does not find a polynomial-size circuit for SAT but a polynomial-length ''advice'' for solving SAT by some $\mathrm{NP} \cap \mathrm{coNP}/\mathrm{poly}$ algorithm. Such advice suffices for showing $\mathrm{PH} = \mathrm{ZPP}^{\mathrm{NP}}$, as seen later. Let $\ell$ be the description size of $n^k$-size circuits (therefore, $\mathrm{desc}(C) \in \{0,1\}^{\ell}$ for every $n^k$-size circuit and $\ell = O(n^k \log n)$).
Advice-finding algorithm $A_{\mathrm{KW}}$ (input: $0^n$)
(1) Let $S_0 \subseteq \{0,1\}^{\ell}$ be the set of descriptions of all the $n^k$-size circuits $C : \{0,1\}^n \to \{0,1\}$ of $n$-bit input and $L_0 := \emptyset$.
(2) Check if every SAT instance (of length $n$) is correctly computed by the majority of circuits specified from $S_i$ via random hash functions as follows:
(2-1) Determine an appropriate output length of the hash functions (which is used in the details of (2-2)) as follows:
(2-2-1) Ask the NP oracle a query $Q$ defined as
where $Q$ is an NP statement as noted above, and the length of this query is at most a polynomial in $|L_i|$ and $n$. (Note that $C \in S_i$ is equivalent to
(3) If every instance is correctly computed by the majority, output hash functions $H_{m_{\max}}$ and a list $L_i$ of counterexamples, which implicitly specify a circuit for SAT. (Note that SAT is computed correctly by the majority of $C_1, \ldots, C_{8n}$, specified by $H_{m_{\max}}$, from $S_i$, specified by $L_i$.) Otherwise, make a new list $L_{i+1}$ of counterexamples, increment $i$, and go back to (2). More precisely, perform the following:
(3-1) If no counterexample exists in (2-2), output $H_{m_{\max}}$ and $L_i$. Otherwise, find the counterexample $\varphi_i$ (by the same idea as in Lemma 4.4) as follows:
(3-1-1) Set $\varphi^0$ to be the null string. For $j = 0, \ldots, n$, perform the following.
(3-1-2) Ask the NP oracle the query ''$Q$ holds and the first $j$ bits of $\varphi_i$ is $\varphi^j 0$,'' where the statement $Q$ is given in (2-2-1). If the oracle returns yes, set $\varphi^{j+1}$ : (2).
We now show that $|S_i|$ decreases fast with a constant probability, as given in Lemma 4.9. From this lemma, it is easy to see that the expected number of repetitions is at most $O(n \log |S_0|) = O(n^{k+1} \log n)$. (Note that $|S_{i+1}| < |S_i|$ if $\varphi_i$ exists at (3-1-2), and $A_{\mathrm{KW}}$ certainly halts at some time since $S_i$ contains a circuit for SAT for every $i$.) Since every repetition takes only polynomial time, the algorithm $A_{\mathrm{KW}}$ runs in expected polynomial time.
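The bound on the expected number of repetitions can be unpacked as follows. This is a sketch under an assumption that is not spelled out at this point of the text: each repetition shrinks the candidate set by the factor $(1 - 1/(64n))$ used in the proof of Lemma 4.9 with probability at least some constant $c > 0$ (the name $c$ is introduced here only for illustration).

```latex
\begin{align*}
|S_{i_t}| &< \Bigl(1 - \tfrac{1}{64n}\Bigr)^{t} |S_0|
          \le e^{-t/(64n)}\, |S_0|
          && \text{after $t$ shrinking repetitions,}\\
e^{-t/(64n)}\, |S_0| &\le 1
          && \text{whenever } t \ge 64\, n \ln |S_0| .
\end{align*}
% Since every S_i contains a circuit for SAT, |S_i| >= 1 always holds, so at most
% 64 n ln|S_0| shrinking repetitions can occur. If each repetition shrinks the set
% with probability at least c, the expected number of repetitions is at most
% (64/c) n ln|S_0| = O(n log|S_0|).
```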
Lemma 4.9. Suppose that Q holds at (3-1-2) at the i-th repetition. We then have
where the probability is taken over the random choice of pairwise independent hash functions at (2-1-1).
Proof. We first show that (2-1-2) gives a good lower bound of $|S_i|$ in Claim 4.10. Without loss of generality, we can assume that $|S_i| \ge 16n$ for every $i$ by adding a redundant part of $O(\log n)$ bits to the original descriptions of circuits, and hence, we have $\lfloor \log |S_i| - \log n - 3 \rfloor \ge 1$ in the claim.
Claim 4.10. We have
where the probability is taken over random choices of pairwise independent hash functions at (2-1-1).
Proof. We will show Pr½9C 2 S i ½h
$8n$ are independent. Therefore, $m_{\max} \ge \lfloor \log |S_i| - \log n - 3 \rfloor$ with probability at least $2^{-16}$. So, the remaining task is to show $\Pr[\exists C \in S_i\, [h$
We used the facts that
We now estimate the probability that $|S_{i+1}| \ge (1 - 1/64n)|S_i|$ in the case when $m_{\max} \ge \lfloor \log |S_i| - \log n - 3 \rfloor$. Since $S_{i+1} = S_i \setminus F_{\varphi_i, S_i}$ from the definition,
If $|F| \le |S_i|/64n$ and $m_{\max} \ge \lfloor \log |S_i| - \log n - 3 \rfloor$, it holds for every $j$
by the union bound.
We have by the Hoeffding bounds (see, e.g., Exercise 4.13 of [37])
Therefore, if $m_{\max} \ge \lfloor \log |S_i| - \log n - 3 \rfloor$, it holds that $|S_{i+1}| \ge (1 - 1/64n)|S_i|$ with probability at most $2^n \exp(-n)$. Since $m_{\max} \ge \lfloor \log |S_i| - \log n - 3 \rfloor$ with probability at least $2^{-16}$ from Claim 4.10, we have
Using the outputs of $A_{\mathrm{KW}}$, a list $L$ of counterexamples and a set $H$ of hash functions, we can immediately construct an $\mathrm{NP} \cap \mathrm{coNP}$ algorithm for SAT. Let $S$ be the set of the $n^k$-size circuits that correctly compute all the counterexamples in $L$, and $H = \{h$
$\ldots, h_{8n} \in H$ (note that it is easy to check if $C_1, \ldots, C_{8n} \in S$ since $L$ contains only positive instances from the assumption that every circuit never fails on negative instances), and then outputs $\mathrm{Maj}(C_1(\varphi), \ldots, C_{8n}(\varphi))$.
Finally, we show the collapse of PH to $\mathrm{ZPP}^{\mathrm{NP}}$. Recall that the zero-error randomized algorithm $A_{\mathrm{KW}}$ outputs $(L, H)$ in expected polynomial time and an $\mathrm{NP} \cap \mathrm{coNP}$ algorithm can solve SAT with $(L, H)$.
It is easy to show that NP ¼ NP
The case-analysis framework (SAT has an $n^k$-size circuit or not) of Theorem 4.5 works with Theorem 4.7 just as with the Karp-Lipton theorem, and thus, we obtain the fixed polynomial circuit lower bounds in $\mathrm{ZPP}^{\mathrm{NP}}$.
Corollary 4.11 (Köbler and Watanabe (1998)). For every $k > 0$, $\mathrm{ZPP}^{\mathrm{NP}} \not\subseteq \mathrm{SIZE}(n^k)$.
Further improvements
The case-analysis framework provides more results on circuit lower bounds as in Theorems 4.5 and 4.7 by improving the collapse of the polynomial-time hierarchy. Another well-known result is circuit lower bounds in $\mathrm{S}_2^{\mathrm{P}}$, which was introduced in the context of derandomization of randomized classes like BPP in the polynomial-time hierarchy [16, 42]. In the setting of interactive proof systems, $L \in \mathrm{S}_2^{\mathrm{P}}$ can be interpreted as a protocol between a verifier and two competing provers. An instance $x$ is given to two competing provers $P_{\mathrm{yes}}, P_{\mathrm{no}}$ and a verifier $V$, where $P_{\mathrm{yes}}$ and $P_{\mathrm{no}}$ have unbounded computational power and $V$ is a polynomial-time deterministic Turing machine; $P_{\mathrm{yes}}$ ($P_{\mathrm{no}}$, respectively) tries to convince $V$ that $x \in L$ ($x \notin L$, respectively), regardless of whether $x \in L$ actually holds. We say $L \in \mathrm{S}_2^{\mathrm{P}}$ if the following holds: If $x \in L$, $P_{\mathrm{yes}}$ can send a witness $y$ of polynomial length such that $V$ is convinced that $x \in L$ with $y$ no matter what $P_{\mathrm{no}}$ sends of polynomial length. (Such a $y$ is often called an irrefutable proof.) If $x \notin L$, $P_{\mathrm{no}}$ can send a witness $z$ of polynomial length such that $V$ is convinced that $x \notin L$ with $z$ no matter what $P_{\mathrm{yes}}$ sends of polynomial length. (In the case that $x \notin L$, the witness $z$ is also called an irrefutable proof.)
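In quantifier form, the two-prover description above corresponds to the following standard formulation. This is a sketch in which the deterministic verifier is written as a three-argument predicate $V(x, y, z)$ and $p$ is the polynomial bounding the witness lengths; both notations are assumptions of this illustration.

```latex
\begin{align*}
x \in L    &\;\Longrightarrow\; \exists y \in \{0,1\}^{p(|x|)}\ \forall z \in \{0,1\}^{p(|x|)}\ \bigl[V(x, y, z) = 1\bigr],\\
x \notin L &\;\Longrightarrow\; \exists z \in \{0,1\}^{p(|x|)}\ \forall y \in \{0,1\}^{p(|x|)}\ \bigl[V(x, y, z) = 0\bigr].
\end{align*}
```

The first line says that $P_{\mathrm{yes}}$ has an irrefutable proof $y$, and the second that $P_{\mathrm{no}}$ has an irrefutable proof $z$.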
It is known that $\mathrm{S}_2^{\mathrm{P}} \subseteq \mathrm{ZPP}^{\mathrm{NP}}$ [14] (see Figure A·1). In [14], Cai gave a new collapse theorem of the polynomial-time hierarchy by the competing provers.
$^{p} u\, [\varphi_{x,u} \in \mathrm{SAT}]$, where $\varphi_{x,u}$ is a Boolean formula generated from $R_L(x, u, \cdot)$ using the Cook-Levin theorem.
We now construct a proof system with two competing provers. Given an instance $x \in \{0,1\}^n$, the provers $P_{\mathrm{yes}}$ and $P_{\mathrm{no}}$ send $y$ and $z$ of length $p(n)$ to the verifier $V$, respectively. The verifier $V$ performs the following procedure:
(1) Regarding $\mathrm{ckt}(y)$ as a circuit for SAT, construct a circuit $C_{\mathrm{search}}$ that tries to search a satisfying assignment for
As demonstrated in several theorems above, the case-analysis framework is quite useful for showing circuit lower bounds. However, it does not reveal which problem is actually hard, while the diagonalization argument gives explicit problems, i.e., problems defined by single machines, having high circuit complexity as in the proofs of Proposition 3.10 and Theorem 4.2. For example, in the case analysis of Theorem 4.5, if $\mathrm{SAT} \in \mathrm{SIZE}(\mathrm{poly}(n))$ the hard problem is $L_{\mathrm{HARD}} \in \mathrm{PH} = \Sigma_2^{\mathrm{P}} \cap \Pi_2^{\mathrm{P}}$ defined in the proof of Theorem 4.2, but otherwise, the hard problem is SAT. So, we cannot specify which problem is hard.
Cai and Watanabe showed an explicit problem in $\Sigma_2^{\mathrm{P}}$ that no small circuit can compute from the collapse arguments [15].
Proof. Set $\ell$ to the description length of $n^k$-size circuits and $\ell_{\mathrm{HARD}} := 3n^k + 3n^k \log n^k + 1$. By Lemma 3.4, there exists a $\sigma_{\mathrm{HARD}}$ of length $\ell_{\mathrm{HARD}}$ with which no $n^k$-size circuit is consistent. The explicit problem $L$ is defined by a combination of two explicit problems PreCKT and HARD: $0x \in L$ if and only if $x \in \mathrm{PreCKT}$, and $1x \in L$ if and only if $x \in \mathrm{HARD}$. Therefore, it suffices to show that both problems are in $\Sigma_2^{\mathrm{P}}$ and at least one of them is not in $\mathrm{SIZE}(n^k)$. The first problem PreCKT is an NP problem that determines whether, on input $(\sigma, u)$, some circuit $C$ that is consistent with $\sigma$ has $u$ as a prefix of its description $\mathrm{desc}(C)$, namely,
$$(\sigma, u) \in \mathrm{PreCKT} \iff \exists\, \text{circuit } C\ (|C| \le n^k)\ \bigl[\, u \text{ is a prefix of } \mathrm{desc}(C) \text{ and } C \text{ is consistent with } \sigma \,\bigr] \quad (|\sigma| \le \ell_{\mathrm{HARD}}).$$
Denote by $\tilde{n}$ the description length of $(\sigma, u)$. (Then, $\tilde{n} = O(n^k)$.) Since PreCKT is obviously in NP, if PreCKT has no $\tilde{n}^k$-size circuit (on input length $\tilde{n}$), we are done. If PreCKT has an $\tilde{n}^k$-size circuit, we consider the second problem HARD, which is a $\Sigma_2^{\mathrm{P}}$ problem. We can then find the $\tilde{n}^k$-size circuit $C_{\mathrm{pre}}$ for PreCKT by an idea similar to Kannan's lower bound (Theorem 4.2).
To understand the argument, we first assume that $C_{\mathrm{pre}}$ is given. How can we use $C_{\mathrm{pre}}$ to define the hard problem HARD? First, we existentially guess the hard string $\sigma_{\mathrm{HARD}}$, and then universally check that (i) $C_{\mathrm{pre}}(\sigma_{\mathrm{HARD}}, \varepsilon) = 0$ ($\varepsilon$ denotes the empty string) and (ii) $C_{\mathrm{pre}}(\sigma, \varepsilon) = 1$ for every $\sigma$ lexicographically smaller than $\sigma_{\mathrm{HARD}}$. Condition (i) implies that no $n^k$-size circuit is consistent with $\sigma_{\mathrm{HARD}}$ (recall that such a string exists by Lemma 3.4), and condition (ii) implies that $\sigma_{\mathrm{HARD}}$ is the lexicographically first such hard string. So, the two conditions define a unique hard string $\sigma_{\mathrm{HARD}}$. The problem HARD is then defined by the $x$-th bit of $\sigma_{\mathrm{HARD}}$ on input $x$.
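The role of conditions (i) and (ii) can be illustrated by a brute-force toy. The sketch below makes several simplifying assumptions that are not part of the proof: the class of small circuits is replaced by an explicit list of truth tables (strings over "0" and "1"), a function counts as consistent with a string when that string is a prefix of its truth table, and `pre_ckt` plays the role of $C_{\mathrm{pre}}(\cdot, \varepsilon)$ with the empty prefix. All names are hypothetical.

```python
from itertools import product

def is_consistent(truth_table, sigma):
    """Toy notion of consistency: sigma is a prefix of the truth table."""
    return truth_table[:len(sigma)] == sigma

def pre_ckt(truth_tables, sigma):
    """Toy analogue of C_pre(sigma, empty string): True iff some function
    in the class is consistent with sigma."""
    return any(is_consistent(tt, sigma) for tt in truth_tables)

def first_hard_string(truth_tables, length):
    """Return the lexicographically first string of the given length that
    no function in the class is consistent with, or None if none exists.

    Returning the first rejected string enforces condition (i) for it and
    condition (ii) for every lexicographically smaller string.
    """
    for bits in product("01", repeat=length):  # lexicographic order
        sigma = "".join(bits)
        if not pre_ckt(truth_tables, sigma):
            return sigma
    return None

def hard_problem(truth_tables, length, index):
    """Toy analogue of HARD: the bit of the hard string at a 0-based index."""
    sigma = first_hard_string(truth_tables, length)
    return int(sigma[index])
```

With the four truth tables `["0000", "0101", "0011", "1111"]` (the two constants and the two projections on two variables), the strings `000`, `001` and `010` are each a prefix of some table, so `first_hard_string(..., 3)` returns `011`. In the actual proof the exhaustive loops are replaced by the existential and universal quantifiers of a $\Sigma_2^{\mathrm{P}}$ computation.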
The remaining task is to find such a $C_{\mathrm{pre}}$ in a $\Sigma_2^{\mathrm{P}}$ computation. We existentially guess the circuit $C_{\mathrm{pre}}$ and check the following two conditions for every $(\sigma, u)$: (1) $C_{\mathrm{pre}}(\sigma, u) = 1 \Longrightarrow (\sigma, u) \in \mathrm{PreCKT}$, (2) $(\sigma, u) \in \mathrm{PreCKT} \Longrightarrow C_{\mathrm{pre}}(\sigma, u) = 1$. Thus, if the guessed circuit passes these conditions, it indeed computes PreCKT.
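It is easy to see that condition (1) can be checked by the following statements: For every $\sigma$ of length $\ell_{\mathrm{HARD}}$ and $u$ of length at most $\ell$, $C_{\mathrm{pre}}(\sigma, u) = 1 \wedge |u| = \ell \Longrightarrow \mathrm{ckt}(u)$ is consistent with $\sigma$, and $C_{\mathrm{pre}}(\sigma, u) = 1 \wedge |u| < \ell \Longrightarrow C_{\mathrm{pre}}(\sigma, u0) = 1 \vee C_{\mathrm{pre}}(\sigma, u1) = 1$.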
To implement condition (2), we give a statement that ''for every positive instance $(\sigma, u)$ of PreCKT, we have $C_{\mathrm{pre}}(\sigma, u) = 1$'': For every $n^k$-size circuit $C$ and every prefix $u$ of $\mathrm{desc}(C)$,
$$C_{\mathrm{pre}}(\sigma_C, u) = 1,$$
where $\sigma_C := \mathrm{tt}(C)_{\{1,\ldots,\ell\}}$, namely, the first $\ell$ bits of the truth table of $C$. Since $(\sigma_C, u)$ is a positive instance of PreCKT, it is easy to see that this achieves (2). To wrap up, HARD is defined as follows:
(1) $\forall \sigma\ (|\sigma| = \ell_{\mathrm{HARD}})$, $\forall u\ (|u| \le \ell)$: $C_{\mathrm{pre}}(\sigma, u) = 1$ and $|u| = \ell \Longrightarrow \mathrm{ckt}(u)$ is consistent with $\sigma$; $C_{\mathrm{pre}}(\sigma, u) = 1$ and $|u| < \ell \Longrightarrow C_{\mathrm{pre}}(\sigma, u0) = 1$ or $C_{\mathrm{pre}}(\sigma, u1) = 1$,
(2) $\forall C\ (|C| \le n^k)$, $\forall u$ (prefix of $\mathrm{desc}(C)$): $C_{\mathrm{pre}}(\sigma_C, u) = 1$,
(3) $C_{\mathrm{pre}}(\sigma_{\mathrm{HARD}}, \varepsilon) = 0$, $\forall \sigma\ (\sigma < \sigma_{\mathrm{HARD}})$: $C_{\mathrm{pre}}(\sigma, \varepsilon) = 1$, and the $x$-th bit of $\sigma_{\mathrm{HARD}}$ is 1.
The hardness of HARD is obvious from the definition. It is easy to check that this can be verified in $\Sigma_2\mathrm{TIME}(n^{k^2} \log^{k+1} n)$. $\Box$
So far, we have focused only on superpolynomial and fixed polynomial circuit lower bounds. It is important to show much higher circuit lower bounds in some applications such as hardness-randomness tradeoffs, which provide $\mathrm{BPP} = \mathrm{P}$ under the assumption that the class $\mathrm{E} = \mathrm{TIME}(2^{O(n)})$ has exponential circuit lower bounds (see also Section 6). Miltersen, Vinodchandran and Watanabe examined how much further we can extend the known arguments, and they proved half-exponential lower bounds in $\mathrm{ZPEXP}^{\mathrm{NP}}$, which is an exponential-time analogue of $\mathrm{ZPP}^{\mathrm{NP}}$ [36]. A function $f : \mathbb{N} \to \mathbb{N}$ is half-exponential if $f(f(n)) \le 2^{p(n)}$ for some polynomial $p$. For example, $n^2$, $n^{\log n}$ and $2^{2^{(\log \log n)^2}}$ are half-exponential and time-constructible, but $2^{n^{1/2}}$ is not half-exponential.
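A short calculation makes the examples concrete. It is a sketch using base-2 logarithms and the definition above, namely $f(f(n)) \le 2^{p(n)}$ for some polynomial $p$.

```latex
\begin{align*}
f(n) = n^2 &: \quad f(f(n)) = n^4 = 2^{4 \log n},\\
f(n) = n^{\log n} &: \quad f(f(n)) = \bigl(n^{\log n}\bigr)^{\log\left(n^{\log n}\right)}
      = n^{\log^3 n} = 2^{\log^4 n},\\
f(n) = 2^{n^{1/2}} &: \quad f(f(n)) = 2^{\left(2^{n^{1/2}}\right)^{1/2}} = 2^{2^{n^{1/2}/2}}.
\end{align*}
% In the first two cases the exponent is at most n for all sufficiently large n,
% so p(n) = n suffices. In the last case the exponent 2^{sqrt(n)/2} eventually
% exceeds every polynomial, so no polynomial p works.
```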
Remark 4.17. Since $\mathrm{ZPEXP} \supseteq \mathrm{EXP} \supseteq \mathrm{NP}$, one might think that the NP oracle does not strengthen the computational power of ZPEXP, namely, that $\mathrm{ZPEXP}^{\mathrm{NP}} \subseteq \mathrm{ZPEXP}$. However, the underlying exponential-time machine can generate an instance of length $2^{\mathrm{poly}(n)}$ on input length $n$ and submit it to the NP oracle. It seems hard to simulate this interaction by a ZPEXP algorithm alone.
The limitation of the extensions comes from the quantitative performance of the Karp-Lipton theorem. It can be shown that for any superpolynomial time-constructible function $f(n)$ (say, $f(n) = n^{\log n}$) there is a constant $c > 0$ such that $\mathrm{SAT} \in \mathrm{SIZE}[f(n)] \Rightarrow \Sigma_3\mathrm{TIME}[f(n)] \subseteq \Sigma_2\mathrm{TIME}[f(f(n)^c)^c]$. So, at best, we can only obtain half-exponential circuit lower bounds within exponential time bounds. The explicit construction of the hard problem in [15] does not use the Karp-Lipton theorem, but if we generalize the explicit construction, the size of the circuit $C_{\mathrm{pre}}$ is a self-composition of $f(n)$. Therefore, the circuit lower bound is also half-exponential at best, even in the generalization of [15].
On the other hand, they also provided a proof of exponential lower bounds in higher classes with another argument.
Theorem 4.18 (Miltersen, Vinodchandran and Watanabe (1999)). For every
Proof. Define a problem L as whose polynomial-time analogue has fixed polynomial circuit lower bounds.
Non-Relativizable Separations
As described briefly in Introduction, it is known that there is some limitation in the relativizing arguments. Baker Section 3.4 in Arora and Barak's textbook [5] for more details on the diagonalization and relativization.) To overcome this barrier, it is necessary to develop new useful arguments that do not relativize. In the context of circuit lower bounds, several results were indeed derived from such new techniques that do not relativize. In this section, we study several non-relativizable results for circuit lower bounds.
Interactive proof systems and arithmetization
The non-relativizing arguments for circuit lower bounds stem from a strong proof technique called arithmetization, which was developed from the arguments for interactive proof systems in the early 1990s.
Interactive proof systems are a computational model in which the computation is carried out through communication between two parties, a prover $P$ and a verifier $V$, where the prover $P$ has unbounded computational power and the verifier $V$ is modeled as a polynomial-time probabilistic Turing machine.
Let $L \subseteq \{0,1\}^*$ be any decision problem. Given an input $x$ to $P$ and $V$, $P$ tries to convince $V$ that $x \in L$ (whether or not $x \in L$ actually holds) and $V$ tries to verify whether $x \in L$ through exchanges of messages between $P$ and $V$. We say $L$ has an interactive proof system if $V$ can verify whether $x \in L$ for any given $x$ with high probability, say 2/3. Via this computation model, we can define complexity classes. For example, if $L$ has an interactive proof system in which $P$ and $V$ exchange their messages polynomially many times, we say $L$ is in the class IP.
Here, we give formal definitions of two special classes of interactive proof systems, MA and AM, which are frequently referred to in the remainder of this article.
Definition 5.1. For a probabilistic Turing machine $M$, we denote by $M(x, r)$ its output on an input $x$ and random string $r$.
We say $L \in \mathrm{MA}$ if there exist a polynomial-time probabilistic Turing machine $V$ and a polynomial $p$ such that for every $x \in \{0,1\}$
In the setting of interactive proof systems, $L \in \mathrm{MA}$ can be interpreted as a one-round protocol between a prover and a verifier. Given an instance $x$ to a prover $P$ (a.k.a. Merlin) and a verifier $V$ (a.k.a. Arthur), $P$ tries to convince $V$ that $x \in L$ regardless of whether $x \in L$ actually holds, where $P$ has unbounded computational power, but $V$ is a polynomial-time probabilistic Turing machine. $P$ sends $V$ a witness $w$ of polynomial length $p(n)$ for a given $x \in \{0,1\}^n$, and then $V$ verifies with $w$ whether $x \in L$ or not. We say $L \in \mathrm{MA}$ if the following holds: if $x \in L$, then $V$ outputs 1 on an input $(x, w)$ with probability at least 2/3 (namely, $V$ is convinced with high probability by a witness sent from $P$), and otherwise $V$ outputs 0 on an input $(x, w)$ with probability at least 2/3 (namely, $V$ is not convinced with high probability no matter what $P$ sends to $V$). We also say $L \in \mathrm{MATIME}(f(n))$ if the running time of $V$ is at most $O(f(|x|))$ (and thus $|w|$ and $|r|$ are also at most $O(f(|x|))$).
In the setting of interactive proof systems, $L \in \mathrm{AM}$ can be interpreted as a one-round protocol with public randomness. Given an instance $x$, an all-powerful $P$ tries to convince a probabilistic polynomial-time $V$ that $x \in L$ by sending $w$ from $P$ to $V$. In the case of AM, $V$'s random string is public, i.e., $P$ is requested to generate a witness from not only an instance but also a random string.
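The two protocol descriptions correspond to the following standard acceptance conditions. This is a sketch in which the verifier is written as $V(x, w, r)$ with witness $w \in \{0,1\}^{p(|x|)}$ and random string $r$; this three-argument notation is an assumption of the illustration. The only difference between the two classes is the order of the witness quantifier and the probability over $r$.

```latex
\begin{align*}
\textbf{MA:}\quad
 x \in L    &\;\Longrightarrow\; \exists w\ \Pr_r\bigl[V(x, w, r) = 1\bigr] \ge 2/3,
&x \notin L &\;\Longrightarrow\; \forall w\ \Pr_r\bigl[V(x, w, r) = 1\bigr] \le 1/3;\\
\textbf{AM:}\quad
 x \in L    &\;\Longrightarrow\; \Pr_r\bigl[\exists w\ V(x, w, r) = 1\bigr] \ge 2/3,
&x \notin L &\;\Longrightarrow\; \Pr_r\bigl[\exists w\ V(x, w, r) = 1\bigr] \le 1/3.
\end{align*}
```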
The theorem $\mathrm{IP} = \mathrm{PSPACE}$ is one of the seminal results on interactive proof systems [34, 45], and the proof of this theorem offers new powerful tools for proving circuit lower bounds.
It should be noted that Fortnow, Rompel and Sipser demonstrated that no relativizing argument can show the equivalence, by showing that coNP has no multi-prover interactive proof system (an interactive proof system with two provers who cannot communicate with each other) in some relativized world, namely, $\mathrm{coNP}^B \not\subseteq \mathrm{MIP}^B$ [20].
For simplicity, we consider a two-variable case $\Psi := \forall x_1 \exists x_2\, \varphi(x_1, x_2)$. By the arithmetization, we can say that $\Psi \in \mathrm{TQBF}$ if and only if $(p_\varphi(0,0) + p_\varphi(0,1) - p_\varphi(0,0)p_\varphi(0,1))(p_\varphi(1,0) + p_\varphi(1,1) - p_\varphi(1,0)p_\varphi(1,1)) = 1$ (and this generalizes similarly to the $n$-variable case). Note that the number of terms can be exponential in the number of variables. The verifier $V$ checks if this equality holds for $p_\varphi$ over a large field instead of the binary field.
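The two-variable identity can be checked mechanically on a small example. The sketch below is an illustration only: the prime modulus `P`, the helper names, and the example formula (exclusive-or of $x_1$ and $x_2$) are all choices made here, not taken from the text. Conjunction is arithmetized as a product, negation as $1 - a$, and disjunction (hence the existential quantifier) as $a + b - ab$, while the universal quantifier becomes a product.

```python
P = 101  # a prime modulus chosen for this toy example

def arith_and(a, b):
    return a * b % P

def arith_or(a, b):
    return (a + b - a * b) % P

def arith_not(a):
    return (1 - a) % P

def p_phi(x1, x2):
    """Arithmetization of phi(x1, x2) = (x1 OR x2) AND NOT (x1 AND x2)."""
    return arith_and(arith_or(x1, x2), arith_not(arith_and(x1, x2)))

def forall_exists_value(p):
    """Value of 'forall x1 exists x2 phi(x1, x2)' after arithmetization:
    a product over x1 of the arithmetized OR over x2."""
    value = 1
    for x1 in (0, 1):
        value = value * arith_or(p(x1, 0), p(x1, 1)) % P
    return value
```

Here `forall_exists_value(p_phi)` evaluates to 1, matching the fact that every $x_1$ has an $x_2$ with $x_1 \oplus x_2 = 1$, whereas the arithmetization of $x_1 \wedge x_2$ yields 0 because no $x_2$ works for $x_1 = 0$.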
The embedding into a polynomial over a large field allows us to amplify the probability that $V$ can reject invalid polynomials sent from a cheating prover when $\Psi \notin \mathrm{TQBF}$. For example, consider that $V$ requests $P$ to send a univariate polynomial $p_\Psi(y_1) := p_\varphi(y_1, 0) + p_\varphi(y_1, 1) - p_\varphi(y_1, 0)p_\varphi(y_1, 1)$ in the above two-variable case. Let $q$ be a polynomial sent from $P$. It should hold that $p_\Psi(0) \cdot p_\Psi(1) = q(0) \cdot q(1) = 1$ if $\Psi \in \mathrm{TQBF}$, and thus, $V$ checks if $q(0) \cdot q(1) = 1$ or not. However, even if this equality holds, a prover may succeed in finding another polynomial $q \not\equiv p_\Psi$ that satisfies the equality.
Then, $V$ can exclude such an invalid polynomial with high probability by assigning a random value over $\mathbb{F}_p$ (rather than $\{0,1\}$) to $q$ and $p_\Psi$ and comparing their resulting values. Notice that any nonzero polynomial of degree at most $d$ has at most $d$ roots, and thus, if $q$ differs from $p_\Psi$ and they are of degree at most $d$, we have $\Pr_{r \in \mathbb{F}_p}[q(r) \neq p_\Psi(r)] \ge 1 - d/p$. By appropriately choosing a large prime $p$, $V$ can reject invalid polynomials with high probability. The interactive proof system for TQBF is basically constructed from this test combined with other ideas. (For the details, see [34, 45] and Section 8.3 of [5].)
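The root-counting argument can be seen on a tiny instance. In the sketch below, polynomials are coefficient lists over $\mathbb{F}_p$ with the constant term first; the helper names and the two example polynomials $y$ and $y^2$ are choices made here for illustration. The two polynomials agree on $\{0, 1\}$, so a check restricted to Boolean points cannot separate them, but they agree on no other point of the field.

```python
def poly_eval(coeffs, r, p):
    """Evaluate the polynomial with coefficients [c0, c1, ...] at r over F_p
    using Horner's rule."""
    value = 0
    for c in reversed(coeffs):
        value = (value * r + c) % p
    return value

def agreement_points(q, g, p):
    """All r in F_p where the two polynomials take the same value."""
    return [r for r in range(p) if poly_eval(q, r, p) == poly_eval(g, r, p)]

def detection_probability(q, g, p):
    """Probability over a uniform r in F_p that q(r) != g(r)."""
    return 1 - len(agreement_points(q, g, p)) / p
```

With $p = 101$, `agreement_points([0, 1], [0, 0, 1], 101)` contains exactly the two roots of $y^2 - y$, so a uniformly random evaluation point exposes the difference with probability $1 - 2/101$, in line with the bound $1 - d/p$ for $d = 2$.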
The powerful tools for circuit lower bounds are derived from this test by the arithmetization. In this test, the task of the prover $P$ is to construct a univariate polynomial $q$ from $\Psi$. Actually, this task can be done with the computational power of PSPACE rather than unbounded power. Therefore, if PSPACE problems can be computed by polynomial-size circuit families, we can simulate the interactive proof system for TQBF by the following one-round protocol (i.e., MA protocol): $P$ first sends $V$ a polynomial-size circuit $C$ that can generate the polynomials requested by $V$, and then $V$ simulates the interactive proof system with $C$ as the prover in hand. Consequently, we have the following theorem:
We can regard this theorem as a variant of the Karp-Lipton collapse theorem (Theorem 4.3) in the sense that if a high class has a small circuit family then the high class collapses into low ones. The Karp-Lipton theorem and its improvements show the collapse of PH from the assumption that SAT has a small circuit family. Now, Theorem 5.3 shows the collapse of PSPACE into MA from the assumption that any PSPACE problem has a small circuit family.
Similar collapse theorems are known to hold for other classes. For example, Babai, Fortnow and Lund showed that NEXP has multi-prover interactive proof systems (and thus $\mathrm{NEXP} = \mathrm{MIP}$) in [7], and their argument also provides the following collapse theorem.
Theorem 5.4 (Babai, Fortnow and Lund (1991)). $\mathrm{EXP} \subseteq \mathrm{SIZE}(\mathrm{poly}(n)) \Longrightarrow \mathrm{EXP} = \mathrm{MA}$.
A quantitative version of this theorem is also given by Miltersen, Vinodchandran and Watanabe [36] . These collapse theorems would provide a new framework for circuit lower bounds, and in fact, we can obtain new circuit lower bounds from the new collapses as seen later.
Buhrman, Fortnow and Thierauf's argument
The first application of a non-relativizing argument to circuit lower bounds was demonstrated by Buhrman, Fortnow and Thierauf [13]. They proved superpolynomial lower bounds in the class $\mathrm{MA}_{\mathrm{EXP}}$ from the collapse theorem of EXP [7].
The class $\mathrm{MA}_{\mathrm{EXP}} = \bigcup_{c>0} \mathrm{MATIME}(2^{n^c})$ is an exponential-time analogue of MA; the verifier can run in $2^{\mathrm{poly}(n)}$ time on input length $n$ and the prover can also send the verifier a witness up to this exponential length.
Theorem 5.8 (Buhrman, Fortnow and Thierauf (1998) 
The first relation is from Theorem 5.4, the second is from $\mathrm{NP}^{\mathrm{NP}} \subseteq \mathrm{EXP}$, the third is from a padding argument, the fourth is from the assumption $\mathrm{MA}_{\mathrm{EXP}} \subseteq \mathrm{SIZE}(\mathrm{poly}(n))$, and the last inclusion contradicts the superpolynomial circuit lower bounds in $\mathrm{ZPEXP}^{\mathrm{NP}}$. $\Box$
The proof works for $\mathrm{MA}_{\mathrm{EXP}} \cap \mathrm{coMA}_{\mathrm{EXP}}$ instead of $\mathrm{MA}_{\mathrm{EXP}}$ by the same argument, namely, we can obtain $\mathrm{MA}_{\mathrm{EXP}} \cap \mathrm{coMA}_{\mathrm{EXP}} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$.
They also proved the existence of an oracle $A$ satisfying $\mathrm{MA}^A_{\mathrm{EXP}} \subseteq \mathrm{P}^A/\mathrm{poly}$. This implies that no relativizing argument can prove the separation $\mathrm{MA}_{\mathrm{EXP}} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$, and thus their argument essentially contains non-relativizing techniques.
A quantitative version can also be obtained from Theorem 5.5 as given in [36]:
Theorem 5.9 (Miltersen, Vinodchandran and Watanabe (1999)). Let $f$ be any time-constructible half-exponential function. Then, $\mathrm{MA}_{\mathrm{EXP}} \cap \mathrm{coMA}_{\mathrm{EXP}} \not\subseteq \mathrm{SIZE}(f(n))$.
Vinodchandran's argument
Vinodchandran showed fixed polynomial circuit lower bounds in PP by combining powerful complexity-theoretic tools and notions with the collapses of Theorem 5.7 in [52] .
His proof uses a notion of the BP operator introduced by Schöning [44] .
Definition 5.10. For every complexity class $\mathcal{C}$, we say a problem $L$ is in $\mathrm{BP} \cdot \mathcal{C}$ if there exist a polynomial $p$ and a problem $L' \in \mathcal{C}$ such that
The BP operator is useful to automatically produce bounded-error versions of existing classes, and most standard classes with the BP operator naturally coincide with the known bounded-error classes (e.g., $\mathrm{BP} \cdot \mathrm{P} = \mathrm{BPP}$, $\mathrm{BP} \cdot \mathrm{NP} = \mathrm{AM}$ and more).
His circuit lower bound is as follows.
Proof. We consider a case analysis. The first case is $\mathrm{PP} \subseteq \mathrm{SIZE}(\mathrm{poly}(n))$ and the second case is $\mathrm{PP} \not\subseteq \mathrm{SIZE}(\mathrm{poly}(n))$.
In the second case, we are done. So, we consider the first case. 
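where the inclusion $\mathrm{PH} \subseteq \mathrm{BP} \cdot \mathrm{PP}$ is from [47], the inclusion $\mathrm{BP} \cdot \mathrm{PP} \subseteq \mathrm{BP} \cdot \mathrm{MA}$ is from Theorem 5.7, the equalities $\mathrm{BP} \cdot \mathrm{MA} = \mathrm{AM} = \mathrm{MA}$ are from the definitions of MA and the BP operator and the relation $\mathrm{NP} \subseteq \mathrm{SIZE}(\mathrm{poly}(n)) \Rightarrow$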
$\mathrm{AM} = \mathrm{MA}$ [6], the separation $\mathrm{PH} \subseteq \mathrm{MA} \not\subseteq \mathrm{SIZE}(n^k)$ is derived from Theorem 4.2, and the last relation is from $\mathrm{MA} \subseteq \mathrm{PP}$ [51]. $\Box$
The paper [52] gave no formal evidence of non-relativizability of this separation, but Aaronson later demonstrated that the separation does not relativize by proving that $\mathrm{PP}^A \subseteq \mathrm{SIZE}^A(n)$ for some oracle $A$ [1]. Thus, this result also essentially makes use of a non-relativizing argument.
Santhanam's argument
Santhanam also made use of the arguments derived from the arithmetization to prove fixed polynomial circuit lower bounds in the Merlin-Arthur games with 1-bit advice, $\mathrm{MA}/1$ [43]. His lower bound argument utilizes an arithmetized version $L^*$ of TQBF, which is PSPACE-complete, with special complexity-theoretic properties. His argument is based on a case analysis similar to Kannan's argument and its extension of Section 4. The major difference is how we separate the two cases.
Recall that Kannan's argument considered whether $\mathrm{SAT} \in \mathrm{SIZE}(\mathrm{poly}(n))$ or not. If not, we are done (since NP has fixed polynomial circuit lower bounds). Otherwise, the circuit lower bounds in PH get down to lower classes (e.g., $\Sigma_2^{\mathrm{P}} \cap \Pi_2^{\mathrm{P}}$) by the collapse theorems of the polynomial-time hierarchy (e.g., the Karp-Lipton theorem (Theorem 4.3)).
On the other hand, his argument considers if the arithmetized version L
$\mathrm{SIZE}(\mathrm{poly}(n))$ holds. However, this consequence is not enough for our goal since we can only obtain a separation $\mathrm{PSPACE} \not\subseteq \mathrm{SIZE}(n^k)$ by merging these two separations, which is much weaker than the result $\mathrm{PH} \not\subseteq \mathrm{SIZE}(n^k)$. The key idea is to exploit the special properties of $L^*$. These properties enable us to check whether a given instance is in $L^*$ or not by some MA protocol by assuming $L^*$ has a small circuit family and letting the prover send the circuit to the verifier, similarly to the idea of the collapse theorems from the arithmetization such as Theorem 5.3 in Section 5.1. Then, if $x \in L^*$ the verifier accepts $x$ using the correct circuit, and if not, the verifier can reject $x$ with high probability no matter what the prover sends.
Recall our assumption $L^* \notin \mathrm{SIZE}(\mathrm{poly}(n))$ from the case analysis. We consider a padded version $L_{\mathrm{HARD}}$ of $L^*$. An instance of $L_{\mathrm{HARD}}$ has the form $x1^y$, where the padding length $y$ makes the size of the minimum circuit for $L^*$ on input length $|x|$ bounded by some polynomial in $|x| + y$ rather than in $|x|$.
We define $z = x1^y \in L_{\mathrm{HARD}} \iff x \in L^*$ and the size of the minimum circuit is in an appropriate range given by $|x|$ and $y$, so that the size is bounded by some polynomial in $|x| + y$.
It is easy to see that $L_{\mathrm{HARD}}$ has an MA protocol (if we hide the details of the 1-bit advice). The prover guesses the minimum circuit (of polynomial size in $|z| = |x| + y$ rather than in $|x|$) and the verifier can check if $x \in L^*$ with the circuit. On the other hand, we can construct a polynomial-size circuit family for $L^*$ from a polynomial-size circuit family for $L_{\mathrm{HARD}}$. Therefore, $L_{\mathrm{HARD}}$ cannot have such a circuit family from our assumption $L^* \notin \mathrm{SIZE}(\mathrm{poly}(n))$.
In the above, we ignored the details of the 1-bit advice. It is in fact required for the case that $x \in L^*$ but the minimum circuit is not in the range specified by $y$ (for example, the minimum circuit is too large with respect to $y$). In this case, $z = x1^y$ should be rejected, but we cannot lower-bound the probability that the verifier rejects $z$. (Recall that if $x \in L^*$ the prover sends a correct circuit and the verifier can accept $x$, and if $x \notin L^*$ the verifier can reject $x$ with high probability no matter what the prover sends. However, we can say nothing if $x \in L^*$ and the prover sends something the verifier does not expect, say, a circuit whose size is in the expected range but which is not for $L^*$.) So, we use the advice to inform the verifier of whether the given instance has an appropriate padding length with respect to the size of the minimum circuit. Now, we turn to the details of the above argument.
Proof. The proof relies on the following technical lemma, which stems from an arithmetization technique.
Lemma 5.13. There exist a PSPACE-complete problem $L^*$ and a polynomial-time probabilistic oracle Turing machine $M^*$ satisfying the following properties for every input $x$:
(1) $M^*$ makes queries only of length $|x|$ to an oracle,
(2) If $x \in L^*$ and $L^*$ is given as an oracle to $M^*$, $M^*$ accepts $x$ with probability 1,
(3) If $x \notin L^*$, no matter what is given to $M^*$ as an oracle, $M^*$ rejects $x$ with probability at least 1/2.
The problem $L^*$ is defined by appropriately modifying the PSPACE-complete problem with special properties called random self-reducibility (the property that an efficient average-case solution implies an efficient worst-case one) and downward self-reducibility given in [49], which can be obtained by some arithmetization of TQBF.
We first suppose $L^* \in \mathrm{SIZE}(\mathrm{poly}(n))$. Since $L^*$ is PSPACE-complete, we have $\mathrm{PSPACE} \subseteq \mathrm{SIZE}(\mathrm{poly}(n))$ and hence $\mathrm{PSPACE} = \mathrm{MA}$ by Theorem 5.3. Recall Theorem 4.2. Since $\Sigma_3^{\mathrm{P}}$ unconditionally has a hard problem that no $n^k$-size circuit can compute by Theorem 4.2 and we have $\Sigma_2^{\mathrm{P}} \subseteq \mathrm{PH} \subseteq \mathrm{PSPACE}$, $\mathrm{MA} = \mathrm{PSPACE}$ has the hard problem. Thus,
$\mathrm{SIZE}(\mathrm{poly}(n))$. We will then show that a variant of the problem $L^*$ is hard and that it is computable in $\mathrm{MA}/1$ by using Lemma 5.13. We define the variant $L_{\mathrm{HARD}}$ as:
(i) $z = x1^y$, where $|x| \le y$ and $y$ is a power of 2;
(ii) $(|x| + y)^{k+1} \le s(|x|) \le (|x| + 2y)^{k+1}$,
where $s(n) := \min\{|C| : \text{a circuit } C \text{ computes } L^* \text{ on input length } n\}$;
We prove that $L_{\mathrm{HARD}} \in \mathrm{MA}/1$. The 1-bit advice $a(m)$ on input length $m$ is defined as $a(m) = 1$ if (ii) holds on input length $m = n + y$ where $y \ge n$ is a power of 2, and $a(m) = 0$ otherwise.
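The definition of the advice bit can be spelled out as a small sketch. Everything here is illustrative: the function names are hypothetical, $n \ge 1$ is assumed, and the minimum circuit size $s$ is passed in as an arbitrary Python callable, whereas in the proof $s(n)$ is not assumed to be efficiently computable, which is exactly why the bit is supplied as advice. The sketch uses condition (ii) in the form $(n + y)^{k+1} \le s(n) \le (n + 2y)^{k+1}$.

```python
def split_length(m):
    """Decompose m = n + y with y a power of 2 and 1 <= n <= y.

    Returns (n, y), or None when m < 2. The decomposition is unique
    because exactly one power of 2 lies in the interval [m/2, m).
    """
    if m < 2:
        return None
    y = 1
    while 2 * y < m:
        y *= 2
    # Now y < m <= 2y, hence n = m - y satisfies 1 <= n <= y.
    return m - y, y

def advice_bit(m, k, s):
    """a(m) = 1 iff condition (ii) holds for the decomposition of m.

    s is a hypothetical stand-in for the minimum circuit size function.
    """
    split = split_length(m)
    if split is None:
        return 0
    n, y = split
    lower = (n + y) ** (k + 1)
    upper = (n + 2 * y) ** (k + 1)
    return 1 if lower <= s(n) <= upper else 0
```

For example, with $k = 1$ and a stand-in `s = lambda n: 100`, the input length $m = 10$ splits as $n = 2$, $y = 8$, and $10^2 \le 100 \le 18^2$ gives advice 1, while $m = 11$ splits as $n = 3$, $y = 8$ and gives advice 0 because $11^2 > 100$.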
The MA protocol with the 1-bit advice is given as follows:
It is easy to see that this MA protocol computes $L_{\mathrm{HARD}}$ efficiently. If $z \in L_{\mathrm{HARD}}$, the advice is 1 and then the size of the minimum circuit $C$ is bounded by a polynomial in $|z| = m$. So, $V$ can efficiently simulate $M^*$ with the oracle $C$. Then, $V$ accepts it with probability 1 since $x \in L^*$ and Lemma 5.13 applies. If $z \notin L_{\mathrm{HARD}}$, either $z$ does not have the correct form, namely, it violates (i) or (ii), or it has the correct form but $x \notin L^*$. In the former case, $V$ rejects $z$ with probability 1 since (i) can be checked easily and (ii) can be checked from the 1-bit advice. In the latter case, $V$ rejects it with probability at least 1/2 by Lemma 5.13. Thus, $L_{\mathrm{HARD}} \in \mathrm{MA}/1$.
We next prove that $L_{\mathrm{HARD}} \notin \mathrm{SIZE}(\mathrm{poly}(n))$. Assume for contradiction that $L_{\mathrm{HARD}}$ has a family of $m^k$-size circuits $\{C_m : \{0,1\}^m \to \{0,1\}\}_{m \in \mathbb{N}}$. Recall that we now discuss the case when $L^* \notin \mathrm{SIZE}(\mathrm{poly}(n))$. Therefore, there is an infinite sequence $I \subseteq \mathbb{N}$ such that $s(n) > (n+1)^{k+1}$ for every $n \in I$. (Recall that $s(n)$ is the size of the minimum circuit for $L^*$ on input length $n$.) From the assumption, $C_m$ computes $L_{\mathrm{HARD}}$ on input length $m$. We now show that $C_m$ provides a small circuit for $L^*$, which contradicts the definition of $s(n)$.
Let $m = n + y$ for $n \in I$ and $y \ge n$ which is a power of 2, where $(n + y)^{k+1} \le s(n) < (n + 2y)^{k+1}$. Since $(n+1)^{k+1} \le s(n) \le O(2^n)$ by Theorem 3.1 and $y \ge n$ is a power of 2, such a $y$ is uniquely determined from $n$. We then obtain a circuit $C'_n$ that computes $L^*$ on input length $n \in I$ by hardwiring $1^y$ to $C_m$. We have $|C'_n| < (n + y)^{k+1}$, which contradicts the definition of $s(n)$. Therefore, $L_{\mathrm{HARD}} \notin \mathrm{SIZE}(\mathrm{poly}(n))$. $\Box$
From Theorem 5.12, we can show fixed polynomial lower bounds in Promise-MA. A promise problem is defined as a pair of disjoint sets of bit strings $\Pi_{\mathrm{yes}}, \Pi_{\mathrm{no}} \subseteq \{0,1\}^*$. We say a promise problem is in Promise-MA if the following holds: There exists a polynomial-time probabilistic Turing machine $V$ such that if $x \in \Pi_{\mathrm{yes}}$ then $\Pr_r[V(x, r) = 1] \ge 2/3$ and if $x \in \Pi_{\mathrm{no}}$ then $\Pr_r[V(x, r) = 1] \le 1/3$.
Proof. Let L HARD be the problem in ðMA=1Þ n SIZEðn kþ1 Þ given in Theorem 5.12. Recall the 1-bit advice strings faðmÞg m2N and the MA protocol P HARD that computes L HARD with advice strings faðmÞg m2N . We define a promise problem ðÅ yes ; Å no Þ that is in Promise-MA n SIZEðn k Þ from L HARD as follows. For any instance x 2 f0; 1g n , we parse it into yb for y 2 f0; 1g nÀ1 and b 2 f0; 1g for n ! 2. (We exclude 1-bit instances from Å yes [ Å no .) For every input length n ! 2, we set b ¼ aðn À 1Þ and P HARD accepts y ¼) x 2 Å yes ; b ¼ aðn À 1Þ and P HARD rejects y ¼) x 2 Å no ;
It follows directly from Theorem 5.12 that $(\Pi_{\mathrm{yes}}, \Pi_{\mathrm{no}})$ is in Promise-MA. It is also straightforward to prove that this promise problem has a circuit lower bound of $n^k$, since a smaller circuit family would contradict Theorem 5.12. $\Box$
Since it is known that Promise-MA $\subseteq$ PP [51], this separation improves Theorem 5.11 of Vinodchandran. See Santhanam's paper [43] for more detailed comparisons with other circuit lower bounds.
Circuit Lower Bounds from Circuit-SAT Algorithms
Recently, Williams proposed a novel research program for circuit lower bounds in NEXP [55, 56, 57]. He revealed a surprising relation between nontrivially fast algorithms for the Circuit-SAT problem (CKT-SAT) and circuit lower bounds in NEXP. The problem CKT-SAT, specialized to a circuit class $\mathcal{C}$, is defined as follows.
Circuit Satisfiability for Circuit Class $\mathcal{C}$ ($\mathcal{C}$-CKT-SAT)
Input: a description $\mathrm{desc}(C)$ of an $n$-bit circuit $C \in \mathcal{C}$.
Output: 1 if and only if $C$ is satisfiable, namely, there exists an $\alpha \in \{0,1\}^n$ such that $C(\alpha) = 1$.
Following this research program, he succeeded in proving superpolynomial $\mathrm{ACC}^0$ circuit lower bounds in NEXP, namely, some problem in NEXP cannot be computed by any polynomial-size circuit family in $\mathrm{ACC}^0$ [57]. $\mathrm{ACC}^0$ is a class of restricted circuit families. We say that a circuit family $\{C_n\}_{n \in \mathbb{N}}$ is in $\mathrm{ACC}^0$ if $C_n$ is a constant-depth circuit that consists of AND, OR, NOT and $\mathrm{MOD}_m$ gates (for any $m \in \mathbb{N}$) of unbounded fan-in, where $\mathrm{MOD}_m : \{0,1\}^n \to \{0,1\}$ is the modulo-$m$ gate defined by $\mathrm{MOD}_m(x_1, \ldots, x_n) = 0 \iff m \mid \sum_{i=1}^n x_i$. The limitations of $\mathrm{AC}^0$, the restricted circuit class obtained by excluding $\mathrm{MOD}_m$ from the gate set in the definition of $\mathrm{ACC}^0$, have been broadly studied and are well understood. For example, it was shown three decades ago that the parity problem, which determines the parity of the number of 1s in an input string (and hence is trivially in P), is not in $\mathrm{AC}^0(\mathrm{poly}(n))$ [3, 21], which immediately provides the separation $\mathrm{P} \not\subset \mathrm{AC}^0(\mathrm{poly}(n))$. (For a restricted circuit class $\mathcal{C}$, we denote by $\mathcal{C}(s(n))$ the $s(n)$-size $\mathcal{C}$ circuit families, and $\mathcal{C}(\mathrm{poly}(n)) := \bigcup_{k>0} \mathcal{C}(n^k)$. For notational convenience, we sometimes regard a circuit class $\mathcal{C}$ as the class of problems computed by a circuit family in $\mathcal{C}$.) Furthermore, it was shown that even if we add $\mathrm{MOD}_p$ gates for an odd prime $p$ to the gate set, such circuits cannot compute the parity problem [46].
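The gate definition above can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names `mod_gate` and `parity` are ours, not from the cited papers. It shows the convention that $\mathrm{MOD}_m$ outputs 0 exactly when $m$ divides the number of 1s, so a single $\mathrm{MOD}_2$ gate already computes parity.

```python
from itertools import product

def mod_gate(m, bits):
    """MOD_m gate: outputs 0 iff m divides the number of 1s among the inputs."""
    return 0 if sum(bits) % m == 0 else 1

def parity(bits):
    """Parity of the number of 1s in the input string."""
    return sum(bits) % 2

# A single MOD_2 gate agrees with parity on every 4-bit input.
assert all(mod_gate(2, x) == parity(x) for x in product((0, 1), repeat=4))
```

The lower bounds quoted above say that this one-gate computation cannot be replaced by a polynomial-size constant-depth circuit that uses only AND, OR, NOT, and $\mathrm{MOD}_p$ gates for an odd prime $p$.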
However, if we add $\mathrm{MOD}_m$ gates for composite numbers $m$ to $\mathrm{AC}^0$ circuits, no known technique works well, and hence their computational limitations had hardly been analyzed until quite recently. Indeed, we could not exclude even the possibility that $\mathrm{EXP}^{\mathrm{NP}} \subset \mathrm{ACC}^0$ until Williams' results. So the power of $\mathrm{ACC}^0$ circuits was a major open issue in computational complexity theory, as discussed in Section 14.4.2 of Arora and Barak's textbook [5]. Towards the resolution of this issue, Williams' program made substantial progress by proving the separation $\mathrm{NEXP} \not\subset \mathrm{ACC}^0(\mathrm{poly}(n))$. The overview of his argument is as follows. Assume $\mathrm{NEXP} \subset \mathrm{ACC}^0(\mathrm{poly}(n))$ for contradiction. Then we can efficiently reduce any problem $L$ in $\mathrm{NTIME}(2^n)$ to the satisfiability problem for a given $\mathrm{ACC}^0$ circuit ($\mathrm{ACC}^0$-CKT-SAT). So a nontrivially fast algorithm for $\mathrm{ACC}^0$-CKT-SAT implies that $L \in \mathrm{NTIME}(2^{o(n)})$, and thus $\mathrm{NTIME}(2^n) \subseteq \mathrm{NTIME}(2^{o(n)})$. This contradicts the nondeterministic time hierarchy theorem [54], which implies $\mathrm{NTIME}(2^n) \not\subseteq \mathrm{NTIME}(2^{o(n)})$. (See also Section 3.2 of [5] for the nondeterministic hierarchy theorem.)
Before moving to the details of his arguments, we first introduce important technical tools. In several proofs of his results, derandomization of randomized complexity classes plays a crucial role. The hardness-randomness tradeoff states that, from a string of high circuit complexity, we can construct a pseudorandom generator whose outputs suffice for deterministic simulation of randomized computation such as BPP and MA [24, 31, 38]. (See the comprehensive survey [28] for derandomization.) We here present the framework of Umans [50], which achieves a quantitatively good tradeoff.
if there exists no $s^{c'}$-size circuit $C : \{0,1\}^{\lfloor \log |h| \rfloor} \to \{0,1\}$ that is consistent with the hard string $h$.
We now briefly see how to derandomize MA protocols by the pseudorandom generator. Consider a protocol for $L \in \mathrm{MA}$. First, the prover $P$ sends a witness $w_x$ to the randomized verifier $V$ on an input $x$. Then, the randomized verifier $V(x, r)$ decides correctly whether $x \in L$ with high probability over a random string $r$.
Then, applying Theorem 3.8 to $V$ and hardwiring the input $x$ and the witness $w_x$ into the resulting circuit, we obtain a circuit $D_V$ which takes the random string $r$ of $V$ as its input. (Thus, the input length is at most the running time of $V$, and $D_V$ is a circuit of size polynomial in $|x|$.)
We now use Theorem 6.1. Assume that we have a hard string $h$ that has a high circuit lower bound. Then, we have a good pseudorandom generator $G(h, \cdot)$ from Theorem 6.1. We enumerate all the seeds $\sigma$ and run $D_V$ on the outputs $G(h, \sigma)$ of the pseudorandom generator. By taking the majority vote of the outcomes $D_V(G(h, \sigma))$ over all seeds $\sigma$, we can deterministically decide whether $x \in L$. (Note that the input length of $G(h, \cdot)$ is logarithmic in $|h|$.) If $h$ has a sufficiently high circuit lower bound, the seed becomes short and we obtain an efficient deterministic simulation of $V$.
For example, if no circuit of size $2^{.1 \lfloor \log |h| \rfloor} = \mathrm{poly}(|h|)$ is consistent with the hard string $h$, namely, $h$ has an exponential circuit lower bound, then the number of output patterns of the generator $G(h, \cdot)$ is bounded by $\mathrm{poly}(s)$, and we only need to run $D_V$ $\mathrm{poly}(|x|)$ times since $s = \mathrm{poly}(|x|)$. In summary, we can deterministically simulate the randomized verifier $V$ in polynomial time, and hence, we have $\mathrm{MA} = \mathrm{NP}$.
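The enumerate-and-vote step described above can be sketched as follows. This is a minimal illustration, not the construction of [50]: `toy_generator` is a hypothetical stand-in that merely repeats its seed and carries no pseudorandomness guarantee, and `toy_d_v` is a hypothetical verifier circuit. The sketch only shows the control flow of the deterministic simulation.

```python
from itertools import product

def derandomize(d_v, generator, seed_len):
    """Deterministic simulation: majority vote of d_v over generator outputs on all seeds."""
    seeds = list(product((0, 1), repeat=seed_len))
    accepts = sum(d_v(generator(seed)) for seed in seeds)
    return 2 * accepts > len(seeds)

def toy_generator(seed):
    # Hypothetical stand-in for the generator: stretches the seed by repeating it.
    # A real instantiation would be built from a hard string as in Theorem 6.1.
    return seed * 2

def toy_d_v(r):
    # Hypothetical verifier circuit with x and the witness hardwired:
    # accepts unless every random bit is 0.
    return int(any(r))

decision = derandomize(toy_d_v, toy_generator, seed_len=3)
```

The loop runs the verifier circuit once per seed, so the cost is $2^{\text{seed length}}$ evaluations; this is why a logarithmic seed length gives a polynomial-time simulation.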
His arguments also utilize tools developed for the collapse theorems. Impagliazzo, Kabanets and Wigderson demonstrated that another collapse theorem can be derived from Theorem 5.4 with a new proof technique called the easy-witness argument.
Proof Sketch. The easy-witness argument considers two cases: (i) a witness of every NEXP problem can be compressed into a polynomial-size circuit, namely, for any $L \in \mathrm{NEXP}$ and any $x \in L$ we have some polynomial-size circuit $C_x$ such that its truth table $\mathrm{tt}(C_x)$ represents a witness for $x \in L$, and (ii) every witness of some NEXP problem is incompressible.
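In the former case, we can show $\mathrm{NEXP} = \mathrm{EXP}$ by brute-force searching, in exponential time, for the polynomial-size circuit that generates the witness. Assuming $\mathrm{NEXP} \subset \mathrm{SIZE}(\mathrm{poly}(n))$, we then have $\mathrm{NEXP} = \mathrm{EXP} = \mathrm{MA}$.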
In the latter case, we can derive a contradiction by derandomizing MA from the incompressibility of the NEXP witness via the hardness-randomness tradeoff seen around Theorem 6.1. Combining the incompressibility of the NEXP witness with the derandomization of [31], we can simulate MA in nondeterministic subexponential time with advice of subpolynomial length on infinitely many input lengths. Under the assumption $\mathrm{NEXP} \subset \mathrm{SIZE}(\mathrm{poly}(n))$, we can show that this simulation of every problem in MA can be done by an $n^{d'}$-size circuit for some fixed constant $d'$. Thus, it follows that every EXP problem can be computed by a fixed polynomial-size circuit on infinitely many input lengths, but this is impossible by a simple diagonalization argument as in the proof of Proposition 3.10.
$\Box$
This collapse theorem itself shows an interesting relation between circuit lower bounds and derandomization, since the statement implies that no derandomization of MA is possible unless NEXP has superpolynomial circuit lower bounds. (Any nontrivial derandomization of MA, say, $\mathrm{MA} \subseteq \mathrm{NTIME}(2^{O(n)})$, yields a separation of NEXP from MA, and then $\mathrm{NEXP} \not\subset \mathrm{SIZE}(\mathrm{poly}(n))$ follows from this theorem.) Furthermore, this theorem provides the following important tool as a building block in the new paradigm using CKT-SAT algorithms.
Williams' arguments
We are now ready to move into the details of the separation results of Williams [56, 57].
Proof Sketch. The proof has two parts. The first part shows that if we can solve $\mathcal{C}$-CKT-SAT by deterministic algorithms nontrivially faster than brute-force search, then $\mathrm{NEXP} \not\subset \mathcal{C}$ for a natural circuit class $\mathcal{C}$ including $\mathrm{ACC}^0$. The second part constructs such a fast algorithm for the circuit class $\mathrm{ACC}^0$. We now look at each of these parts in turn.
(i) Circuit lower bounds in NEXP from faster CKT-SAT algorithms.
We will first show that $\mathrm{NEXP} \not\subset \mathrm{SIZE}(\mathrm{poly}(n))$ if some fast algorithm solves CKT-SAT for unrestricted circuits, and then we will modify the proof for $\mathrm{ACC}^0(\mathrm{poly}(n))$. We assume that a Turing machine $A_{\mathrm{CKT\text{-}SAT}}$ can solve CKT-SAT in $\mathrm{poly}(m) \cdot 2^n / f(n)$ time for some superpolynomial function $f$, where $m$ is the size of a given circuit and $n$ is the input length of the circuit. (Note that CKT-SAT can be solved in $O(m \cdot 2^n)$ time by brute-force search.) The first important ingredient is an efficient reduction from an $\mathrm{NTIME}(2^n)$ problem to 3-SAT instances of length $2^n \mathrm{poly}(n)$, which follows from the results of Tourlakis [48], and Fortnow, Lipton, van Melkebeek, and Viglas [19]. This reduction allows us to locally access specific clauses of the exponentially long instance in polynomial time.
Lemma 6.5. There exists a constant $c > 0$ such that every problem $L \in \mathrm{NTIME}(2^n)$ can be reduced to 3-SAT instances of size $c 2^n \cdot n^4$. Moreover, there exists a Turing machine that, given an instance of $L$ and an integer $i \in \{1, \ldots, c 2^n \cdot n^4\}$ in binary, outputs the $i$-th clause of the resulting 3-SAT instance in $O(n^4)$ time.
Let $L$ be any problem in $\mathrm{NTIME}(2^n)$. We now construct a faster nondeterministic algorithm for $L$ by using the polynomial-size witness circuit from Theorem 6.3 and the local reduction from Lemma 6.5, which contradicts the nondeterministic time hierarchy theorem.
The second important ingredient is the witness circuit shown in Theorem 6.3, given as a byproduct of Theorem 6.2. Now, we consider a polynomial-time computable function $R_L : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}$ associated with $L$ such that $R_L(x, y) = 1$ if and only if the 3-SAT instance $\varphi_x$ reduced from $x$ by Lemma 6.5 is satisfied by the assignment $y$. Assuming $\mathrm{NEXP} \subset \mathrm{SIZE}(\mathrm{poly}(n))$, a satisfying assignment can be compressed into a polynomial-size witness circuit by Theorem 6.3.
Fast nondeterministic algorithm $N$ for $L$ (input: $x$)
(1) Guess the polynomial-size witness circuit $W_x$ for $R_L$.
(2) Convert the following deterministic procedure $P$ to a circuit $C_P$ by Theorem 3.8:
Procedure $P$ (inputs: $x$, $\mathrm{desc}(W_x)$ and $i \in \{1, \ldots, c 2^n \cdot n^4\}$ in binary)
(a) Compute the $i$-th clause of the 3-SAT instance $\varphi_x$ by the reduction given in Lemma 6.5. Denote the $i$-th clause by $(l_{z_1} \vee l_{z_2} \vee l_{z_3})$ for literals $l_{z_1}, l_{z_2}, l_{z_3}$.
(b) Output 1 if and only if the $i$-th clause is not satisfied by $W_x(z_1), W_x(z_2), W_x(z_3)$.
(3) Check the satisfiability of $C_P(x, \mathrm{desc}(W_x), \cdot)$ by the fast CKT-SAT algorithm $A_{\mathrm{CKT\text{-}SAT}}$. Accept $x$ if and only if $C_P(x, \mathrm{desc}(W_x), \cdot)$ is not satisfiable.
Recall that $A_{\mathrm{CKT\text{-}SAT}}$ runs in $\mathrm{poly}(m) \cdot 2^n / f(n)$ time for a superpolynomial function $f$. Since the size of $C_P(x, \mathrm{desc}(W_x), \cdot)$ is bounded by a polynomial and its input length is $n + O(\log n)$, $N$ runs in $2^{n - \omega(\log n)}$ time. Therefore, every problem in $\mathrm{NTIME}(2^n)$ can be solved nondeterministically in $2^{n - \omega(\log n)}$ time. This contradicts the nondeterministic hierarchy theorem.
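The control flow of algorithm $N$ can be sketched in Python under strong simplifying assumptions. Here `clause(i)` is a hypothetical local-access function standing in for the reduction of Lemma 6.5, a literal is encoded as a pair (variable index, negated?), the guessed witness circuit is an ordinary function, and `brute_force_sat` is a placeholder for the fast algorithm $A_{\mathrm{CKT\text{-}SAT}}$ (so this sketch shows the logic only and gains no speedup).

```python
def make_c_p(clause, witness):
    """Build C_P: maps a clause index i to 1 iff clause i is NOT satisfied by the witness."""
    def c_p(i):
        # A literal (z, neg) is true when the witness value of z differs from `neg`.
        satisfied = any(bool(witness(z)) != neg for z, neg in clause(i))
        return 0 if satisfied else 1
    return c_p

def brute_force_sat(circuit, num_inputs):
    """Placeholder for the fast CKT-SAT algorithm: tries every input index."""
    return any(circuit(i) == 1 for i in range(num_inputs))

def algorithm_n(clause, num_clauses, guessed_witness):
    """Accept iff C_P is unsatisfiable, i.e. no clause is violated by the witness."""
    c_p = make_c_p(clause, guessed_witness)
    return not brute_force_sat(c_p, num_clauses)

# Toy instance (hypothetical): (x0 or x1 or not x2) and (not x0 or x1 or x2).
toy_clauses = [
    [(0, False), (1, False), (2, True)],
    [(0, True), (1, False), (2, False)],
]
good_witness = lambda z: [1, 0, 1][z]
bad_witness = lambda z: [0, 0, 1][z]
```

With `good_witness` every clause is satisfied, so $C_P$ has no satisfying input and the algorithm accepts; with `bad_witness` the first clause is violated and the algorithm rejects.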
Next, we modify this proof for polynomial-size $\mathrm{ACC}^0$ circuits. As in the case of unrestricted circuits, we suppose that $\mathrm{NEXP} \subset \mathrm{ACC}^0(\mathrm{poly}(n))$ and derive a contradiction. The main obstacle is that the circuit $C_P$ in the nondeterministic algorithm $N$ is not necessarily an $\mathrm{ACC}^0$ circuit. The key idea is to use the circuit evaluation problem, which is to decide the output value $C(x)$ of a given circuit $C$ on a given input $x$. It is easy to see that this problem is in P. Recall that we assumed $\mathrm{NEXP} \subset \mathrm{ACC}^0(\mathrm{poly}(n))$. Thus, $\mathrm{P} \subset \mathrm{ACC}^0(\mathrm{poly}(n))$ and the circuit evaluation problem is in $\mathrm{ACC}^0(\mathrm{poly}(n))$. Therefore, we can nondeterministically guess and verify a polynomial-size $\mathrm{ACC}^0$ circuit $C'_P$ that is equivalent to $C_P$ via circuit evaluation in $\mathrm{ACC}^0$. We now discuss this idea in more detail. First, we show that if $\mathrm{P} \subset \mathrm{ACC}^0(\mathrm{poly}(n))$, then every circuit family of polynomial size can be simulated by an $\mathrm{ACC}^0$ circuit family of polynomial size.
Circuit Evaluation
Input: a description $\mathrm{desc}(C)$ of a circuit $C : \{0,1\}^n \to \{0,1\}$ and its input $x \in \{0,1\}^n$.
Output: $C(x)$.
We can easily see that some polynomial-time deterministic Turing machine $T_E$ solves this problem. Since $\mathrm{P} \subset \mathrm{ACC}^0$, we have some polynomial-size family of $\mathrm{ACC}^0$ circuits $C_E$ which outputs $C(x)$ on a given input $(\mathrm{desc}(C), x)$. Then, the polynomial-size $\mathrm{ACC}^0$ circuit $C_E(\mathrm{desc}(C), \cdot)$ is equivalent to $C$. To implement an $\mathrm{ACC}^0$ circuit that describes $C_P$, we define the following problem:
Description of $C_P$
Input: a length parameter $1^n$ and a gate index $j$ of $C_P$ (which takes $x \in \{0,1\}^n$ as its first input).
Output: the gate type (AND, OR, NOT, INPUT) of the $j$-th gate of $C_P$ and the two gate indices $j_1$ and $j_2$ whose outputs are connected to the inputs of the $j$-th gate.
It is easy to see that this problem can be solved in polynomial time since $C_P$ is a polynomial-size circuit, and thus, we have a polynomial-size family of $\mathrm{ACC}^0$ circuits $D$ computing this problem under the assumption that $\mathrm{NEXP} \subset \mathrm{ACC}^0$. Guessing the circuit $D$ for the above problem, we can easily verify its correctness by comparing the original circuit $C_P$ with the outputs of $D$.
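To make the circuit evaluation problem concrete, here is a minimal Python sketch. The encoding of $\mathrm{desc}(C)$ is our own assumption for illustration: a list of gates in topological order, each a triple (gate type, $j_1$, $j_2$) matching the gate description above, with the last gate as the output; for an INPUT gate, $j_1$ is the index of the input bit, and unused fields are `None`.

```python
def evaluate_circuit(desc, x):
    """Evaluate a circuit given as a topologically ordered list of (type, j1, j2) gates."""
    values = []  # values[j] is the output bit of the j-th gate
    for gate_type, j1, j2 in desc:
        if gate_type == "INPUT":
            values.append(x[j1])
        elif gate_type == "NOT":
            values.append(1 - values[j1])
        elif gate_type == "AND":
            values.append(values[j1] & values[j2])
        elif gate_type == "OR":
            values.append(values[j1] | values[j2])
        else:
            raise ValueError(f"unknown gate type: {gate_type}")
    return values[-1]

# Hypothetical example: XOR of two bits as (x0 OR x1) AND NOT (x0 AND x1).
xor_desc = [
    ("INPUT", 0, None),
    ("INPUT", 1, None),
    ("OR", 0, 1),
    ("AND", 0, 1),
    ("NOT", 3, None),
    ("AND", 2, 4),
]
```

The evaluator makes one pass over the gate list, which is the sense in which the problem is easily seen to be in P.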
From the circuit $D$, we also define a problem evaluating the output value of an indicated gate:
Evaluation of $C_P$
Input: an instance $x \in \{0,1\}^n$ of $L$, a description $\mathrm{desc}(W_x)$ of a witness circuit for $x$, an input $i$ to $C_P$, and a gate index $j$ of $C_P$.
Output: the output bit of the $j$-th gate of $C_P(x, \mathrm{desc}(W_x), i)$.
Similarly to the circuit evaluation problem, this problem is computable in polynomial time. Therefore, some polynomial-size family of $\mathrm{ACC}^0$ circuits $E$ can compute this problem. Guessing the circuit $E$, we can reduce verification of the correctness of $E$ to $\mathrm{ACC}^0$-CKT-SAT by building a circuit VALUE from $D$ and $E$ as follows:
Circuit VALUE (inputs: $x$, $\mathrm{desc}(W_x)$, $i$ and $j$)
1. Compute $D(j)$ and obtain the gate type $g$ of the $j$-th gate of $C_P$ and the two gate indices $j_1$ and $j_2$ whose outputs are connected to the inputs of the $j$-th gate.
Applying the $\mathrm{ACC}^0$-CKT-SAT algorithm to the circuit $\mathrm{VALUE}(x, W_x, \cdot, \cdot)$, we can verify the correctness nontrivially fast. Therefore, we can nondeterministically guess and verify a polynomial-size $\mathrm{ACC}^0$ circuit $C'_P := E(\cdot, \cdot, \cdot, j^*)$ which is equivalent to $C_P$, where $j^*$ is the index of the output gate of $C_P$. Plugging this argument into the previous proof technique for the case of unrestricted circuits, we obtain the lower bound $\mathrm{NEXP} \not\subset \mathrm{ACC}^0(\mathrm{poly}(n))$ under the assumption that $\mathrm{ACC}^0$-CKT-SAT can be solved in $\mathrm{poly}(m) \cdot 2^n / f(n)$ time for some superpolynomial function $f$.
(ii) Faster deterministic algorithms for $\mathrm{ACC}^0$-CKT-SAT.
Next, we construct the fast $\mathrm{ACC}^0$-CKT-SAT algorithm. The overview of the algorithm is as follows. We first transform a given $\mathrm{ACC}^0$ circuit $C$ of size $s$ into an equivalent depth-two circuit $C'$ of size $s'$ that has a symmetric function at the top gate and only ANDs as the bottom gates. We can then exploit a fast matrix multiplication algorithm to compute the truth table $\mathrm{tt}(C')$ nontrivially fast. From this, we can check the satisfiability of $C'$ nontrivially fast. We now discuss the transformation from $\mathrm{ACC}^0$ circuits. Following the work of Yao [59], Beigel and Tarui [10], and Allender and Gore [4], we can obtain a lemma for the transformation.
Since SYM+ is a symmetric function, to evaluate the depth-two circuit $C'$ it suffices to count the bottom AND gates that output 1. In order to count these AND gates, we exploit a variant of the well-known matrix multiplication algorithm of [17]:
Lemma 6.7. For every sufficiently large $N \in \mathbb{N}$, multiplication over the integers of two 0-1 matrices of dimensions $N \times N^{.1}$ and $N^{.1} \times N$ can be computed in $O(N^2 \log^2 N)$ arithmetic operations.
We partition the input indices $\{1, \ldots, n\}$ arbitrarily into two sets $A$ and $B$ of (roughly) equal size (namely, $|A| \approx |B| \approx n/2$). Set $n_A := |A|$ and $n_B := |B|$. For $A$ and $B$, we consider two matrices $M_A$ and $M_B$. The entries of these matrices are defined as follows: for $i \in \{0,1\}^{n_A}$, $k \in \{0,1\}^{n_B}$ and a bottom-gate index $j$, $M_A(i, j) = 1$ if and only if the assignment $i$ to $A$ falsifies no input of the $j$-th AND, and $M_B(j, k) = 1$ if and only if the assignment $k$ to $B$ falsifies no input of the $j$-th AND. These matrices can be constructed in $O(2^{n/2} \cdot s' \cdot \mathrm{poly}(n))$ time. Note that $M_A(i, j) M_B(j, k) = 1$ if and only if the output of the $j$-th AND is 1 under the assignments $i$ and $k$ to $A$ and $B$. Therefore, defining $N := M_A M_B$, the entry $N(i, k)$ is the number of ANDs that output 1 under the assignment $(i, k)$ to the $n$-bit input of $C'$. By Lemma 6.7, $N$ can be computed in $2^n \mathrm{poly}(n)$ time if $s' \le 2^{.1n}$. (We can suppose that $s$ is a polynomial in $n$ and hence $s' = O(n^{\mathrm{polylog}\, n})$ from the argument of (i).) Also, we construct the output table $T$ of the top gate as a function of the number of 1s among its inputs. Precisely, $T_i = 1$ if and only if the top gate outputs 1 when exactly $i$ of its inputs are 1, where $i \in \{0, \ldots, s' - 1\}$. We can construct $T$ in $\mathrm{poly}(s')$ time since the top gate can be computed in $\mathrm{poly}(s')$ time. Therefore, we can construct $\mathrm{tt}(C')$ in $2^n \mathrm{poly}(n) + \mathrm{poly}(s')$ time from the matrix $N$ and the output table $T$. Note that the trivial construction of $\mathrm{tt}(C')$ by evaluating $C'$ on every input requires $2^n \mathrm{poly}(s')$ time, and thus this gives a nontrivial speedup for computing $\mathrm{tt}(C')$. Checking the satisfiability of $C'$ from $\mathrm{tt}(C')$ is immediate.
$\mathrm{ACC}^0$-CKT-SAT algorithm (input: $s$-size circuit $C$ of depth $d$ with $n$-bit input)
(1) Construct a $(2^\ell \cdot s + 1)$-size $\mathrm{ACC}^0$ circuit $C^*$ of depth $d + 1$ with $n - \ell$ inputs from $C$ as follows, where $\ell := n^\delta$ for an appropriately chosen constant $0 < \delta < 1$.
(3) Check the satisfiability of $C'$ from $\mathrm{tt}(C')$ constructed by the above technique with Lemma 6.7.
Note that $C$ is satisfiable if and only if so is $C^*$, and the size of $C^*$ is $2^\ell s + 1$.
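The truth-table construction via $N = M_A M_B$ can be sketched in Python as follows. The encoding is an assumption made for illustration: each bottom AND gate is a list of literals (variable index, negated?), `top_table[c]` is the top gate's output when exactly `c` bottom gates output 1 (the sketch keeps one entry for every possible count), and $A$ is taken to be the first half of the variables. The matrix product here is the naive one; the speedup in the text comes from replacing that single step by the fast rectangular multiplication of Lemma 6.7.

```python
from itertools import product

def consistent(gate, variables, assignment):
    """1 iff `assignment` to `variables` falsifies no literal of the AND gate."""
    value = dict(zip(variables, assignment))
    return int(all(value[v] != neg for v, neg in gate if v in value))

def truth_table(and_gates, top_table, n):
    """Truth table of a symmetric-of-ANDs circuit, computed through N = M_A * M_B."""
    a_vars, b_vars = list(range(n // 2)), list(range(n // 2, n))
    a_asg = list(product((0, 1), repeat=len(a_vars)))
    b_asg = list(product((0, 1), repeat=len(b_vars)))
    m_a = [[consistent(g, a_vars, i) for g in and_gates] for i in a_asg]
    m_b = [[consistent(g, b_vars, k) for k in b_asg] for g in and_gates]
    # Naive matrix product: counts[r][c] = number of AND gates that output 1.
    counts = [[sum(m_a[r][j] * m_b[j][c] for j in range(len(and_gates)))
               for c in range(len(b_asg))] for r in range(len(a_asg))]
    return {i + k: top_table[counts[r][c]]
            for r, i in enumerate(a_asg) for c, k in enumerate(b_asg)}

# Hypothetical example: bottom gates (x0 AND NOT x1), (NOT x0 AND x1);
# the top gate outputs 1 iff exactly one bottom gate outputs 1, so the circuit is XOR.
gates = [[(0, False), (1, True)], [(0, True), (1, False)]]
tt = truth_table(gates, top_table=[0, 1, 0], n=2)
satisfiable = any(tt.values())
```

Once the table is built, the satisfiability check is a single scan over its $2^n$ entries.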
Then, the running time of this algorithm is at most $2^{n-\ell} \mathrm{poly}(n) + \mathrm{poly}(s \cdot 2^\ell) \le 2^{n - n^{\Omega(1)}}$, which satisfies the requirement for obtaining $\mathrm{NEXP} \not\subset \mathrm{ACC}^0(\mathrm{poly}(n))$ in the first part. $\Box$
As done in the above proof, Williams first showed in [55] that a fast algorithm for CKT-SAT implies circuit lower bounds in NEXP in the case of general circuits. Following up on this result, Williams himself modified the argument of [55] for classes of restricted circuits, and further provided a fast algorithm for $\mathrm{ACC}^0$-CKT-SAT.
The strategy follows [56], but the main obstacle is that $\mathrm{NTIME} \cap \mathrm{coNTIME}$, unlike NTIME, is strongly believed to have no hierarchy theorems, and thus it is hopeless to obtain circuit lower bounds in $\mathrm{NEXP} \cap \mathrm{coNEXP}$ by reducing to hierarchy theorems with faster CKT-SAT algorithms. Williams' result [57] got around this obstacle by reducing to diagonalization arguments.
Proof Sketch. Assume for contradiction that $[\mathrm{NTIME} \cap \mathrm{coNTIME}](2^{O(n)}) \subset \mathrm{ACC}^0(n^{\log n})$. Then, $\mathrm{TIME}(2^{O(n)}) \subset \mathrm{SIZE}(n^{2 \log n})$, since any gate of fan-in $\ell$ in an $\mathrm{ACC}^0$ circuit can be simulated by an $O(\ell)$-size circuit with AND, OR, NOT gates of bounded fan-in (see, e.g., [18]) and hence $\mathrm{ACC}^0(n^{\log n}) \subseteq \mathrm{SIZE}(n^{2 \log n})$. By Theorem 5.5, we have
Also, since TIME is closed under complement, we have
Next, we use the hardness-randomness tradeoff argument to derandomize MATIME and coMATIME by the circuit complexity of $\mathrm{ACC}^0$ witness circuits. It can be shown that if $\mathrm{P} \subset \mathrm{ACC}^0(\mathrm{poly}(n))$, there exists a unary problem $L_{\mathrm{uni}} \subseteq \{1^n : n \in \mathbb{N}\}$ such that $L_{\mathrm{uni}} \in \mathrm{NTIME}(2^n)$ does not have $2^{n^\varepsilon}$-size $\mathrm{ACC}^0$ witness circuits. (See Corollary 3 in [57].) The proof relies heavily on the idea of the proof of Theorem 6.4. We derive a contradiction with the nondeterministic hierarchy theorem from the assumptions that $\mathrm{NTIME}(2^n)$ has $2^{n^\varepsilon}$-size witness circuits and $\mathrm{P} \subset \mathrm{ACC}^0(\mathrm{poly}(n))$, together with the fact that $\mathrm{ACC}^0$-CKT-SAT can be solved nontrivially fast, as shown in the second part of the proof of Theorem 6.4.
So, we can nondeterministically generate witnesses of $L_{\mathrm{uni}}$ on unary inputs $1^n$ as strings having $\mathrm{ACC}^0$ circuit complexity $2^{n^\varepsilon}$ on infinitely many $n$. Using the idea of the circuit evaluation problem from the first part of the proof of Theorem 6.4, we can also derive an unrestricted circuit complexity of $2^{n^{\varepsilon/2}}$. Plugging $L_{\mathrm{uni}}$, which has subexponentially high unrestricted circuit complexity, into the pseudorandom generator given in Theorem 6.1, we can show that
$\mathrm{MATIME}(2^{O(\log^3 n)}) \cap \mathrm{coMATIME}(2^{O(\log^3 n)}) \subseteq \text{io-}[\mathrm{NTIME} \cap \mathrm{coNTIME}](n^{\log^d n})$
for some constant $d$, where for a class $\mathcal{C}$ we define io-$\mathcal{C}$ as the class of problems $L$ for which there exists $L' \in \mathcal{C}$ such that $L \cap \{0,1\}^n = L' \cap \{0,1\}^n$ for infinitely many $n \in \mathbb{N}$. Combining these arguments, we have
new one called the algebrization barrier [2]. At the same time, they proved that many separations and inclusions algebrize, including all the separations shown in Section 5.
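For two classes $\mathcal{C}$ and $\mathcal{D}$, we say that a separation $\mathcal{C} \not\subset \mathcal{D}$ (an inclusion $\mathcal{C} \subseteq \mathcal{D}$, respectively) algebrizes if for every oracle $A$ and every finite field extension $\tilde{A}$ of $A$ we have $\mathcal{C}^{\tilde{A}} \not\subset \mathcal{D}^A$ ($\mathcal{C}^A \subseteq \mathcal{D}^{\tilde{A}}$, respectively), where $\tilde{A}$, which is defined over a non-binary field, coincides with $A$ over $\{0,1\}^*$, and its truth table (on every input length) represents a polynomial of constant degree, independent of the input length and the order of the field.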
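Moreover, they showed $\mathrm{NEXP}^{\tilde{A}} \subset \mathrm{P}^A/\mathrm{poly}$ and $\mathrm{NP}^{\tilde{A}} \subset \mathrm{SIZE}^A(n)$, and thus, we cannot prove the separations $\mathrm{NEXP} \not\subset \mathrm{SIZE}(\mathrm{poly}(n))$ and $\mathrm{NP} \not\subset \mathrm{SIZE}(\omega(n))$ by algebrizing arguments such as those used in Section 5. Therefore, we require techniques more powerful than the known non-relativizing ones to prove strong circuit lower bounds in lower classes.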
One of the candidates is Williams' research program discussed in Section 6. As commented in his paper [56], his argument relies essentially on the transformation from $\mathrm{ACC}^0$ circuits to the special depth-two circuits, and this transformation does not seem to algebrize. So, the next step would be to improve the arguments for restricted circuit lower bounds along his program (e.g., replacing $\mathrm{ACC}^0$ with $\mathrm{TC}^0$ and/or $\mathrm{NEXP} \cap \mathrm{coNEXP}$ with EXP), and to find another novel approach that circumvents the known barriers.
