On proving circuit lower bounds against the polynomial-time hierarchy by Jin-yi Cai & Osamu Watanabe
SIAM J. Comp. Submission Version: Revised 11/27/03
title:
On Proving Circuit Lower Bounds Against the Polynomial-time Hierarchy
author:
Jin-Yi Cai (1) and Osamu Watanabe (2)
affiliation:
1. Computer Sci. Dept., Univ. of Wisconsin, Madison, WI 53706, USA
2. Dept. of Mathematical and Computing Sciences, Tokyo Institute of Technology,
Tokyo 152-8552, Japan
corresponding author and email:
Osamu Watanabe (watanabe@is.titech.ac.jp)
acknowledgments of financial support:
1. Supported in part by NSF grants CCR-0208013 and CCR-0196197 and
U.S.-Japan Collaborative Research NSF SBE-INT 9726724.
2. Supported in part by the JSPS/NSF Collaborative Research 1999 and
by the Ministry for Education, Grant-in-Aid for Scientific Research (C), 2001.
Abstract: We consider the problem of proving circuit lower bounds against the polynomial-time hierarchy. We first revisit a lower bound given by Kannan [Kan82], and for any fixed integer $k > 0$, we give an explicit $\Sigma_2^p$ language, recognizable by a $\Sigma_2^p$-machine with running time $O(n^{k^2+k})$, that requires circuit size $> n^k$. Next, as our main results, we give relativized results showing the difficulty of proving polynomial-size circuit lower bounds for languages in the polynomial-time hierarchy. To provide fair relativized comparisons, we impose a restriction on a simulating machine: it cannot make queries longer than those the simulated machine can access. Under this stronger relativization setting, we show, for example, an oracle relative to which all languages in the polynomial-time hierarchy can be recognized by polynomial-size circuits. Our proof techniques are based on the decision tree version of the Switching Lemma for constant depth circuits and the Nisan-Wigderson pseudorandom generator. We also take this opportunity to publish some unpublished older results of the first author on constant depth circuits, both straight lower bounds and inapproximability results based on decision tree type Switching Lemmas.
1 Introduction
It is a most basic open problem in Theoretical Computer Science to give circuit lower bounds for various complexity classes. The class P has polynomial size circuits. It is also widely believed that NP does not share this property, i.e., that some specific set such as SAT in NP requires super polynomial circuit size. While this remains the most concrete approach to the NP vs. P problem, we cannot even prove, for any fixed $k > 1$, that any set $L \in NP$ requires circuit size $> n^k$.
If we relax the restriction from NP to the second level of the Polynomial-time Hierarchy (footnote 1), $\Sigma_2^p$, R. Kannan [Kan82] did prove that for any fixed polynomial $n^k$, there is some set $L$ in $\Sigma_2^p$ which requires circuit size $> n^k$. Kannan in fact proved the existence theorem for some set in $\Sigma_2^p \cap \Pi_2^p$. This result has been improved by Köbler and Watanabe [KW98], who showed, based on the technique developed in [BCGKT], that such a set exists in $ZPP^{NP}$. More recently, the work in [Cai01] implies that a yet lower class $S_2^p$ contains such a set. (See [BFT98, MVW99] for related topics.)
However, Kannan's proof for $\Sigma_2^p$, and all the subsequent improvements mentioned above, are not "constructive" in the sense that they do not identify a single $\Sigma_2^p$ machine whose language requires circuit size $> n^k$. In this paper we first remark on this point and give a constructive proof for $\Sigma_2^p$.
At the top level, all these proofs for the above mentioned results are of the same type. Let us review Kannan's proof for $\Sigma_2^p$: Either SAT does not have $n^k$ size circuits, in which case we are done, or SAT has $n^k$ size circuits, in which case we can define some other set, which by the existence of the hypothetical circuit for SAT can be shown to be in $\Sigma_2^p$, and which requires circuit size $> n^k$. Thus, in each case, we have some set $L'$ in $\Sigma_2^p$ that has no $n^k$ size circuits; but this does not give any single $\Sigma_2^p$ machine whose language requires circuit size $> n^k$. Constructively, Kannan gave a set in $\Sigma_4^p \cap \Pi_4^p$. In [MVW99] a set in $\Delta_3^p$ was constructively given. We improve this to $\Sigma_2^p$.
Theorem 1 For any integer $k > 0$, we can construct a $\Sigma_2^p$ machine with $O(n^{k^2} \log^{k+1} n)$ running time that accepts a set with no $n^k$ size circuits.
Notice that $\Sigma_2^p$ has complete languages. Thus, by using any standard complete language $C$ for $\Sigma_2^p$, it is easy to obtain a result like the above. We can argue as follows: Estimate the time complexity of a reduction from $L'$ to $C$, which is possible even from the above "nonconstructive" proof of the existence of $L'$. Then define a padded version $C'$ of $C$ so that $L'$ is reducible to $C'$ in linear time. This $C'$ is a language that requires circuit size $> n^k$; clearly, we can give explicitly a $\Sigma_2^p$ machine recognizing $C'$ and its time bound. Our contribution here is to show a way to construct such a machine directly, which we hope will be of some help when discussing a similar constructive proof that is open for the stronger statements, i.e., the existence of a set with $n^k$ circuit size lower bound in $\Sigma_2^p \cap \Pi_2^p$ (resp., $ZPP^{NP}$, and $S_2^p$).
Our main result in this paper deals with the difficulty in proving a super polynomial circuit size lower bound for any set in the Polynomial-time Hierarchy, PH. While it is possible to prove a lower bound above any fixed polynomial, at least for some sets in $\Sigma_2^p$, the real challenge is to prove a super polynomial circuit size lower bound for a single language. Not only have we not been able to do this for any set in NP, but also no super polynomial lower bound is known for any set in PH. In this paper we prove that it is infeasible to give a relativizable super polynomial lower bound for any set in the Polynomial-time Hierarchy.
Footnote 1: We will use standard notions and notations in complexity theory; see textbooks, e.g., [DK00], for their definitions.
For our relativized argument, we propose a new computation model that gives us more "stringent" relativized results. Relativization results can be generally classified as either separation or collapsing/containment results. The implication of a relativized separation result is that the corresponding collapse is difficult to prove. Similarly, a relativized collapsing result implies that the corresponding separation is difficult to prove. Notice that it is still possible (and in fact, such examples have been shown) to have a separation or collapsing result against the corresponding relativized result; relativized results just suggest some "difficulty" and not "impossibility". Also we should note that the degree of "difficulty" may depend on the relativization type. Here we deal with relativized collapsing results, and we introduce a new relativization notion, stringent relativization, for demonstrating the difficulty of proving circuit lower bounds for PH.
By surveying existing relativized collapsing results, we came to realize that an asymmetry is often present. In almost all of these relativized collapsing results, the proof is achieved by allowing the simulating computation stronger access to the oracle than the simulated computation. For example, in the usual proof of $P^A = NP^A$ or $P^A = PSPACE^A$, we encode QBF in the oracle. When a $P^{QBF}$ machine $M$ simulates an $NP^{QBF}$ or $PSPACE^{QBF}$ computation $M'$ on an input $x$, $M$ will access an oracle location polynomially longer than the corresponding access that $M'$ makes. That is, $P^A$ machines are given more powerful oracle access. One can argue that this asymmetry is within a polynomial factor, but it nonetheless denies the simulated machine access to certain segments of the oracle while affording such access to the simulating machine. In our present study of specific polynomial bounds such as $n^k$ on either circuit size or running time, this arbitrary polynomial stretch in oracle access is not acceptable.
The following example will make this point clear. Hopcroft, Paul, and Valiant [HPV77] proved that any machine with time complexity $t$ can be simulated by some machine with space complexity $s = t/\log t$, which is slightly less than $t$. In this situation, where we deal with specific bounds like $t$, allowing the simulating machine oracle access beyond what is allowed for the simulated machine will cause unacceptable consequences. We can argue as follows: For any time complexity $t$, let $s = t/\log t$ be the space complexity in the simulation. Take an intermediate bound $s'$ that is slightly more than $s$ but slightly less than $t$ (all asymptotically). Then we can construct an oracle $X$ such that $DSPACE^X(s') \subseteq DTIME^X(t)$ holds, with the device of "unequal access". The idea is to encode all $DSPACE^X(s')$ languages (up to a given length $n$) in $X$ at length $t(n)$. Note that any $s'$-space machine has access only up to length $s'(n) < t(n)$, so the encoding part is beyond where the $s'$-space machine can look. On the other hand, if the Hopcroft-Paul-Valiant simulation were relativizable with this "unequal access" to the oracles, then we would have $DTIME^X(t) \subseteq DSPACE^X(s)$. Thus, $DSPACE^X(s') \subseteq DTIME^X(t) \subseteq DSPACE^X(s)$, contradicting the relativized space hierarchy theorem.
We study in this paper some specific circuit size lower bound with respect to some specific time bound, say $n^k$. Then, as the above observation suggests, we must adopt the following more "stringent" oracle computation model. In this more "stringent" oracle access model, we require that, for any input, the simulating machine or circuit does not query the oracle on strings longer than those the simulated machine or circuit can access on this input. We consider circuits consisting of standard AND, OR, and NOT gates and oracle query gates. An oracle query gate takes $m$ input bits $z = b_1 b_2 \cdots b_m$, and has output $[z \in X]$, i.e., it outputs 1 or 0 depending on whether $z \in X$ or not. Now the proviso of "stringent access to an oracle" is stated as follows: To show that, at length $n$, a circuit $C_n$ recognizes the language of a machine $M$ with running time $n^k$, we allow the circuit access only to those strings of length $\le n^k$. For any machine $M$, we say that a family of circuits $\{C_n\}_{n \ge 0}$ simulates $M^X$ under a stringent access to the oracle $X$ if at every length $n$, $C_n$ recognizes $L(M^X)^{=n}$ by stringent access to the oracle $X$.
Though we gave this definition for the comparison between machines and circuits with polynomial-time resource bounds, we believe that the notion of "stringent oracle access" is meaningful in a more general setting. (For more general situations, it might be better to consider a more robust notion of "stringent relativization". We leave this issue and more general investigations of "stringent relativization" for our future work; see, e.g., [CW03].)
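The stringent access proviso can be illustrated with a small sketch. The class name `StringentOracle`, the helper `stringent_bound`, and the use of Python string sets for the oracle are assumptions made only for this illustration; the point is simply that an oracle query gate returns $[z \in X]$ and that any query longer than the simulated machine's time bound $n^k$ is disallowed.

```python
from typing import List, Set


def stringent_bound(n: int, k: int) -> int:
    """Longest query a machine with running time n^k can write down."""
    return n ** k


class StringentOracle:
    """Oracle X whose query gate refuses strings longer than max_len.

    Hypothetical helper: it models the restriction placed on the
    simulating circuit, not any construction from the proofs.
    """

    def __init__(self, members: Set[str], max_len: int) -> None:
        self.members = members
        self.max_len = max_len
        self.queries: List[str] = []  # record of all queries made

    def query(self, z: str) -> int:
        """Oracle query gate: output [z in X] as 1 or 0."""
        if len(z) > self.max_len:
            raise ValueError("query longer than the stringent bound")
        self.queries.append(z)
        return 1 if z in self.members else 0
```

With inputs of length n = 2 and k = 2, for instance, the bound is 4, so a circuit simulating the machine may ask about "101" or "0000" but not about any string of length 5.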
From a more general perspective, the main utility of relativization is to show the inadequacy of certain proof techniques. Certainly the more "stringent" a requirement we place on the type of relativization, the stronger the result will be, and perhaps the more it says about the infeasibility of certain techniques. Imagine there are three possible kinds of proof of a certain lower bound, such as that $\Sigma_2^p$ requires superpolynomial circuit size:
1. A proof totally specific to a specific set in $\Sigma_2^p$ that uses specific properties of the circuit combinatorics.
2. A proof for a general $\Sigma_2^p$ machine that uses properties of the circuit, but that can be carried through if the machine and the circuit are allowed to access any oracle, provided both access the same segment of it, i.e., they can ask queries within the same length bound. (In our case, the length bound is defined as the time bound of the simulated $\Sigma_2^p$-machine.)
3. A more general proof, for a general $\Sigma_2^p$ machine and circuit, that can go through even if we allow the machine and the circuit to access different segments of the oracle.
Any relativization to the contrary says nothing about the first possibility. A relativization to the contrary with the stringent access model rules out possibilities 2 and 3. If we did not have the "stringent access requirement", then we could only rule out 3, but not 2.
In this paper we focus on specific time/size bounds. In the stringent oracle access model, we prove that for any alternating oracle TM $M$ with running time $O(n^k)$, there is an oracle $X$ and a polynomial size circuit family accepting $L(M^X)$. Therefore we rule out possibilities 2 and 3 above.
Theorem 2 (Main Theorem) For any integer $d > 0$ and any real $k > 1$, let $M$ be an oracle $\Sigma_d^p$-machine with running time $O(n^k)$. Then we have an oracle $X$ and a family of Boolean circuits $\{C_n\}_{n \ge 0}$ that recognizes $L(M^X)$ under a stringent access to the oracle $X$. For all sufficiently large $n$, the size of $C_n$ is bounded by $n^{cdk}$, for some universal constant $c > 0$.
From this, we can conclude that, relative to the oracle $X$ given above, every set in PH has some polynomial-size circuits, i.e., $PH^X \subseteq P^X/poly$. Recall that Heller [He84] showed an oracle $Y$ such that $EXP^Y \subseteq P^Y/poly$, which immediately implies that $PH^Y \subseteq P^Y/poly$. But this oracle $Y$ is not used in a stringent way; that is, a circuit simulating a given $\Sigma_d^p$-machine $M$ on inputs of length $n$ makes queries to $Y$ that are longer than the time bound of $M$ on length $n$ inputs. (Notice that our stringency condition is for polynomial time bounds. It would be possible to extend it to exponential time bounds, and we may claim that the oracle $Y$ is used in a stringent way in the relation $EXP^Y \subseteq P^Y/poly$, because any polynomial bound for circuit size is less than the exponential time bounds of the simulated EXP machines. In this sense, our stringency notion is not robust, and we need a more robust notion for general investigations; see, e.g., [CW03].)
Our proof technique for the main theorem is based on the decision tree version of the Switching Lemma for constant depth circuits and the Nisan-Wigderson pseudorandom generator. As these results crucially depend on lower bounds for constant depth circuits, we take this opportunity to publish some unpublished older results of the first author on constant depth circuits, as they fit the theme. These include both straight lower bounds and inapproximability results based on decision tree type Switching Lemmas. We give some better constants in the exponents than previously published lower bounds.
2 Proof of Theorem 1
R. Kannan [Kan82] proved that for any fixed polynomial $n^k$, there is some set $L$ in $\Sigma_2^p \cap \Pi_2^p$ with circuit size $> n^k$. However, in terms of explicit construction, he only gave a set in $\Sigma_4^p \cap \Pi_4^p$. An improvement to $\Delta_3^p$ was stated in [MVW99].
In this section we give a constructive proof of Kannan's theorem for $\Sigma_2^p$.
For any $n \ge 0$, a binary sequence $\chi$ of length $\ell \le 2^n$ is called a partial characteristic sequence; it specifies the membership of the lexicographically first $\ell$ strings of $\{0,1\}^n$. We denote this subset of $\{0,1\}^n$ by $L(\chi)$. We say that $\chi$ is consistent with a circuit $C$ with $n$ input gates iff for all $i$, $1 \le i \le \ell$, $C(x_i)$ outputs the $i$th bit of $\chi$, where $x_i$ is the $i$th string of $\{0,1\}^n$.
We can encode every circuit $C$ of size $\le s$ as a string $u$ of length $\mathrm{len}(s)$, where $\mathrm{len}(s)$ is defined as $\mathrm{len}(s) = c_{\mathrm{circ}} \lfloor s \log s \rfloor$ with some constant $c_{\mathrm{circ}}$. We may consider that every $u$ with $|u| = \mathrm{len}(s)$ encodes some circuit of size $\le s$; if a string $u$ is not a proper code or the encoded circuit has size $> s$, we assume that this $u$ encodes the constant 0 circuit. The following lemma is immediate by counting.
Lemma 3 For any $s > 1$, there exists a partial characteristic sequence of length $\ell = \mathrm{len}(s) + 1$ that is not consistent with any circuit of size $\le s$.
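The counting behind Lemma 3 can be spelled out as follows. This is only a sketch of the standard argument, under the assumption that $n$ is large enough for $\ell = \mathrm{len}(s) + 1 \le 2^n$ to hold. Every circuit of size $\le s$ is encoded by some string of length $\mathrm{len}(s)$, and each circuit is consistent with exactly one sequence of length $\ell$, namely the sequence of its outputs on $x_1, \ldots, x_\ell$. Hence:

```latex
\[
  \#\{\chi : |\chi| = \ell,\ \chi \text{ consistent with some circuit of size} \le s\}
  \;\le\; \#\{u : |u| = \mathrm{len}(s)\}
  \;=\; 2^{\mathrm{len}(s)}
  \;<\; 2^{\mathrm{len}(s)+1}
  \;=\; \#\{\chi : |\chi| = \ell\}.
\]
```

So at least one sequence of length $\ell$ is consistent with no circuit of size $\le s$.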
Our goal is to define a set $L$ that has no $n^k$ size circuit but that is recognized by some explicitly defined $\Sigma_2^p$ machine. Our construction follows essentially the same outline as the one given in [MVW99], which in turn uses ideas given in Kannan's original proof. The further improvement is mainly an even more efficient use of alternation.
For a given $n$, let $\ell = \mathrm{len}(n^k) + 1$. We try to construct a partial characteristic sequence $\chi_{\mathrm{non}}$ of length $\ell$ that is consistent with no circuit of size $\le n^k$. We will introduce an auxiliary set PreCIRC that is in NP. With this PreCIRC, some $\Sigma_2^p$ machine can uniquely determine the desired characteristic sequence $\chi_{\mathrm{non}}$ (on its accepting path). We would like to define our set $L$ (partially) consistent with this sequence $\chi_{\mathrm{non}}$. But a $\Sigma_2^p$ computation using some auxiliary NP set cannot be implemented, in general, by any $\Sigma_2^p$ machine. Suppose here that PreCIRC has $n^k$ size circuits; then some $\Sigma_2^p$ machine can guess such circuits, verify them, and use them for computing $\chi_{\mathrm{non}}$ and recognizing strings according to $\chi_{\mathrm{non}}$. What if there are no such circuits for PreCIRC? We will define $L$ so that one part of $L$ is consistent with PreCIRC (while the other part is consistent with $\chi_{\mathrm{non}}$ if PreCIRC is computable by some $n^k$ size circuits). If PreCIRC has no $n^k$ size circuit, then the part of $L$ that is consistent with PreCIRC can guarantee the desired hardness of $L$.
Now we describe our construction in detail. We fix any sufficiently large $n$, and let $\ell = \mathrm{len}(n^k) + 1$. By "$v \sqsupseteq u$" we mean that $u$ is a prefix of $v$. To compute the "hard" characteristic sequence $\chi_{\mathrm{non}}$, we want to determine, for a given pair of a partial characteristic sequence $\chi$ and a string $u$, whether $u$ can be extended to some description $v$ of a circuit that is consistent with $\chi$. The set PreCIRC is defined for this task. More precisely, for any $n > 0$, and for any strings $\chi$ of length $\ell$ and $u$ of length $\le \mathrm{len}(n^k)$, we define PreCIRC as follows.
$1^n 0 \chi u 0 1^{\mathrm{len}(n^k) - |u|} \in \mathrm{PreCIRC}$
$\iff (\exists v \sqsupseteq u)\ [\, |v| = \mathrm{len}(n^k)$, and the circuit encoded by $v$ is consistent with $\chi \,]$.
Strings of any other form are not contained in PreCIRC. To simplify our notation, we will simply write $(\chi, u)$ for $1^n 0 \chi u 0 1^{\mathrm{len}(n^k) - |u|}$. Since $n$ determines $\ell$, and the length of $\chi$ is $\ell$, $\chi$ and $u$ are uniquely determined from $1^n 0 \chi u 0 1^{\mathrm{len}(n^k) - |u|}$. The length of $(\chi, u)$ is $\tilde{n} = n + 2\ell + 1$. Note that $\tilde{n}$ is $O(n^k \log n)$.
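The encoding of a pair $(\chi, u)$ and the claim that $\chi$ and $u$ can be recovered from it can be checked with a small sketch. The function names and the parameter `code_len` are assumptions for this illustration: `code_len` stands in for $\mathrm{len}(n^k)$, which in the text is determined by $n$, and the sketch takes it as given rather than computing it.

```python
from typing import Optional, Tuple


def encode_instance(n: int, chi: str, u: str, code_len: int) -> str:
    """Build 1^n 0 chi u 0 1^(code_len - |u|), with |chi| = code_len + 1."""
    ell = code_len + 1
    if len(chi) != ell:
        raise ValueError("chi must have length code_len + 1")
    if len(u) > code_len:
        raise ValueError("u must have length at most code_len")
    return "1" * n + "0" + chi + u + "0" + "1" * (code_len - len(u))


def decode_instance(w: str, code_len: int) -> Optional[Tuple[int, str, str]]:
    """Recover (n, chi, u) from an encoded instance, or None if malformed."""
    ell = code_len + 1
    n = w.find("0")                  # the leading block 1^n ends at the first 0
    if n < 0 or len(w) != n + 2 * ell + 1:
        return None
    chi = w[n + 1 : n + 1 + ell]
    rest = w[n + 1 + ell :]          # equals u 0 1^(code_len - |u|)
    stripped = rest.rstrip("1")      # drop the padding of 1's
    if not stripped.endswith("0"):
        return None
    return n, chi, stripped[:-1]     # drop the separator 0 to obtain u
```

The total length is $n + 1 + \ell + |u| + 1 + (\mathrm{len}(n^k) - |u|) = n + 2\ell + 1$, matching $\tilde{n}$ above.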
We now define our machine $M$. Informally we want $M$ to accept an input $x$ if and only if either $x \in 1\{0,1\}^{n-1}$ and $x \in \mathrm{PreCIRC}$, or $x \in 0\{0,1\}^{n-1}$ and $x \in L$, where $L^{=n}$ is a set with no $n^k$ size circuits, for all sufficiently large $n$, if $\mathrm{PreCIRC}^{=n}$ has $n^k$ size circuits for all sufficiently large $n$. Specifically, $M$ is designed so that $L^{=n}$ would be $L(\chi_{\mathrm{non}})$, where $\chi_{\mathrm{non}}$ is lexicographically the first $\chi$ of length $\ell$ with no $n^k$ size circuit, provided $\mathrm{PreCIRC}^{=\tilde{n}}$ has an $\tilde{n}^k$ size circuit for length $\tilde{n}$. Note that $L(\chi_{\mathrm{non}}) \subseteq 0\{0,1\}^{n-1}$ since $|\chi_{\mathrm{non}}| = \mathrm{len}(n^k) + 1 < 2^{n-1}$.
More formally, for any given input $x$ of length $n$, if $x$ starts with 1, then $M$ accepts it iff $x \in \mathrm{PreCIRC}$. Suppose otherwise; that is, $x$ starts with 0. Then first $M$ existentially guesses a partial characteristic sequence $\chi_{\mathrm{non}}$ of length $\ell$ and a circuit $C_{\mathrm{pre}}$ of size $\tilde{n}^k$, more precisely, a string $v_{\mathrm{pre}}$ of length $\mathrm{len}(\tilde{n}^k)$ encoding a circuit for $\mathrm{PreCIRC}^{=\tilde{n}}$ of size $\le \tilde{n}^k$. (Below we use $C_{\mathrm{pre}}$ to denote the circuit that is encoded by the guessed $v_{\mathrm{pre}}$.) After that, $M$ enters the universal stage, where it checks the following items.
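(1) $C_{\mathrm{pre}}$ makes no mistake whenever it says 'yes': $\forall \chi$, $|\chi| = \ell$, and $\forall u$, $|u| \le \mathrm{len}(n^k)$, verify that $C_{\mathrm{pre}}$ is "locally consistent" on $(\chi, u)$ if $C_{\mathrm{pre}}(\chi, u) = 1$, that is, check as follows:
$C_{\mathrm{pre}}(\chi, u) = 1 \wedge |u| = \mathrm{len}(n^k) \implies$ the circuit that $u$ encodes is consistent with $\chi$, and
$C_{\mathrm{pre}}(\chi, u) = 1 \wedge |u| < \mathrm{len}(n^k) \implies$ either $C_{\mathrm{pre}}(\chi, u0) = 1$ or $C_{\mathrm{pre}}(\chi, u1) = 1$.
(2) $C_{\mathrm{pre}}$ says 'yes' for all positive instances: $\forall u$, $|u| = \mathrm{len}(n^k)$, compute the $\chi_u$ of length $\ell$ defined by (the circuit encoded by) $u$, and verify that $C_{\mathrm{pre}}(\chi_u, u') = 1$ for every prefix $u'$ of $u$.
(3) The guessed $\chi_{\mathrm{non}}$ is lexicographically the first string of length $\ell$ such that no circuit of size $\le n^k$ is consistent with it, according to $C_{\mathrm{pre}}$: Check $C_{\mathrm{pre}}(\chi_{\mathrm{non}}, \epsilon) = 0$, and $\forall \chi$ such that $|\chi| = \ell$ and $\chi$ is lexicographically smaller than $\chi_{\mathrm{non}}$, check that $C_{\mathrm{pre}}(\chi, \epsilon) = 1$ holds. (Here $\epsilon$ denotes the empty string.)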
Finally on each universal branch, if $M$ passes the particular test of this branch, then $M$ accepts the input $x \in 0\{0,1\}^{n-1}$ iff $\chi_{\mathrm{non}}$ has bit 1 for the string $x$.
For all $(\chi, u)$ such that $|\chi| = \ell$ and $|u| \le \mathrm{len}(n^k)$, if $C_{\mathrm{pre}}$ passes (1), then
$C_{\mathrm{pre}}(\chi, u) = 1 \implies (\chi, u) \in \mathrm{PreCIRC}$;
and if $C_{\mathrm{pre}}$ passes (2), then
$(\chi, u) \in \mathrm{PreCIRC} \implies C_{\mathrm{pre}}(\chi, u) = 1$.
Of course there is no guarantee that there exists a circuit $C_{\mathrm{pre}}$ (more precisely, $v_{\mathrm{pre}}$) that will pass the tests in items (1) and (2). But if there is such a $C_{\mathrm{pre}}$, then some existential path leads to such a $C_{\mathrm{pre}}$ together with the right $\chi_{\mathrm{non}}$. This $\chi_{\mathrm{non}}$ is the lexicographically first string of length $\ell$ such that no circuit of size $\le n^k$ is consistent with it, which exists by Lemma 3. In particular it is independent of the particular $C_{\mathrm{pre}}$ guessed. Hence, if there is a circuit for $\mathrm{PreCIRC}^{=\tilde{n}}$ of size $\tilde{n}^k$, then $M$ accepts $x \in 0\{0,1\}^{n-1}$ iff $x$ is in $L(\chi_{\mathrm{non}})$, where $\chi_{\mathrm{non}}$ is the unique lexicographically first string of length $\ell$ with no consistent circuit of size $\le n^k$.
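The role of PreCIRC in pinning down the lexicographically first hard sequence can be illustrated on a toy scale. Everything in this sketch is an assumption made for illustration: `toy_decode` replaces real circuit encodings with a made-up rule (the "circuit" coded by $v$ outputs the bits of $v$ followed by the parity of $v$), and `pre_circ` is evaluated by brute force rather than by a guessed circuit $C_{\mathrm{pre}}$. The sketch only shows how the predicate "$u$ extends to a code consistent with $\chi$" identifies the first $\chi$ with no consistent code.

```python
from itertools import product
from typing import Optional


def toy_decode(v: str) -> str:
    """Hypothetical stand-in for 'outputs of the circuit encoded by v'
    on the first len(v) + 1 inputs: the bits of v, then the parity of v."""
    return v + str(v.count("1") % 2)


def pre_circ(chi: str, u: str, code_len: int) -> bool:
    """True iff u extends to some code v of length code_len whose
    decoded outputs equal chi (brute-force version of PreCIRC)."""
    pad = code_len - len(u)
    for tail in product("01", repeat=pad):
        if toy_decode(u + "".join(tail)) == chi:
            return True
    return False


def first_hard_sequence(code_len: int) -> Optional[str]:
    """Lexicographically first chi of length code_len + 1 with no
    consistent code, i.e. with pre_circ(chi, empty string) false."""
    for bits in product("01", repeat=code_len + 1):
        chi = "".join(bits)
        if not pre_circ(chi, "", code_len):
            return chi
    return None
```

In this toy model exactly half of the sequences of length `code_len + 1` have a consistent code, so a hard sequence always exists, in line with the counting of Lemma 3. The local consistency condition of item (1) also holds for the brute-force predicate: whenever `pre_circ(chi, u, code_len)` is true for a proper prefix `u`, it is true for `u + "0"` or `u + "1"`.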
On the other hand, if no circuit of size $\tilde{n}^k$ can accept $\mathrm{PreCIRC}^{=\tilde{n}}$ correctly, then no circuit passes the tests in items (1) and (2), and hence, $M$ simply rejects all $x \in 0\{0,1\}^{n-1}$. But since $\mathrm{PreCIRC}^{=\tilde{n}}$ has no $\tilde{n}^k$ size circuit, the hardness is guaranteed by the PreCIRC part of $L(M)$. More formally, if $\mathrm{PreCIRC}^{=n}$ has no circuit of size $n^k$ infinitely often, then we are done. Otherwise, for all sufficiently large $n$, and hence for all sufficiently large $\tilde{n}$, a circuit $C_{\mathrm{pre}}$ exists for $\mathrm{PreCIRC}^{=\tilde{n}}$ of size $\tilde{n}^k$; then the part $L(M) \cap 0\{0,1\}^{n-1}$ is defined so that no $n^k$ size circuit can accept it correctly, and hence again we are done. Therefore, we can conclude that $L(M)$ has no $n^k$ size circuit (which by definition means that for infinitely many $n$ this is so). This $\Sigma_2^p$ language proves Theorem 1. It can be easily checked that the machine $M$ runs in $O(n^{k^2} \log^{k+1} n)$ steps.
3 Proof of Theorem 2
We first give an outline of the proof.
3.1. Proof Outline
Consider any $\Sigma_d^p$ polynomial time bounded oracle alternating Turing machine $M$, with time bound $n^k$. We want to design an oracle $X$ so that some family of small size circuits can simulate $M^X$ with stringent oracle access. More specifically, fixing any sufficiently large $n$, we want a circuit $C_M$ that simulates $M^X$ on inputs of length $n$, where, since $M$ can only query strings of length at most $n^k$, we require that $C_M$ can also only ask queries of length at most $n^k$.
It is well known from [FSS81] that a $\Sigma_d^p$ machine $M$ bounded in time $n^k$ with oracle $X$, when given an input $x$ of length $n$, gives rise to a bounded depth Boolean circuit $C_x$ of the following type: The inputs are Boolean variables, and their negations, representing membership of a string $z \in \{0,1\}^{n^k}$ in the oracle $X$. The Boolean circuit $C_x$ starts with an OR gate at the top, and alternates between AND's and OR's with depth $d + 1$, where the bottom level gates have fan-in at most $n^k$, and all other AND and OR gates have unbounded fan-in, limited only by the overall circuit size, which is bounded by $n^k 2^{n^k}$. Without loss of generality we may assume the Boolean circuit is tree-like, except for the input level, where each Boolean variable corresponding to $[z \in X]$ is represented by a pair of complemented variables, which we will denote by $z$ and $\bar{z}$.
Our first idea is to use random restrictions to "kill" the circuit. Here is what we mean. For any circuit $C$ over Boolean variables $x_1, \ldots, x_n$, a random restriction $\rho$ (for some specified parameter $p$) is a random function that assigns each $x_i$ either 0, 1, or $*$, with probability $\Pr[\rho(x_i) = *] = p$ and $\Pr[\rho(x_i) = 0] = \Pr[\rho(x_i) = 1] = (1 - p)/2$, for each $i$ independently. Assigning $*$ means leaving it as a variable. Let $C|_\rho$ denote the circuit obtained by this random restriction. It is known that after a random restriction $\rho$ (for a suitably chosen parameter $p$), the circuit $C|_\rho$ is sufficiently weakened so as to have either small min-terms or small max-terms. Results of this type are generally known as Switching Lemmas, and the strongest form known is due to Håstad [Hås86a]. (See also [Ajt83, FSS81, Yao85, Cai86, Hås86b].) However, it turns out that we need a different form, namely a decision tree type Switching Lemma [Cai86]. We want to assign a suitably chosen random restriction $\rho$, after which the circuit admits a small depth decision tree. We in fact will have to consider an aggregate of $2^n$ such Boolean circuits $C_x$ simultaneously, one for each input $x$ of size $n$. We want to assign $\rho$, after which all these circuits have small depth decision trees. We then will proceed to set those variables to ensure that all these circuits are "killed", i.e., they all have a definite value now, either 0 or 1. We need to assign those variables consistently over all $2^n$ small depth decision trees. For decision trees, it is easy to achieve this by always setting "the next variable" asked by the decision tree to 0, say; it is not clear how to maintain this consistency in terms of min-terms and max-terms. If each decision tree has depth bounded by $t$, then we will have assigned at most $2^n t$ many variables corresponding to those strings of length $n^k$ where $\rho$ initially assigned a $*$ (i.e., they are left unassigned by $\rho$). We will argue that there are still plenty of unassigned variables left, where we may try to encode the now-determined computational values of these $2^n$ circuits. We will argue that $t$ is sufficiently small, and yet with high probability all $2^n$ circuits admit decision trees of depth at most $t$.
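The definition of a random restriction can be made concrete with a short sketch. The representation chosen here (a DNF given as a list of terms, each a dictionary from variable name to required bit) and the function names are assumptions for illustration only; the circuits $C_x$ in the text have depth $d + 1$ and are far larger.

```python
import random
from typing import Dict, Hashable, Iterable, List, Union

STAR = "*"  # marks a variable left unassigned by the restriction


def random_restriction(variables: Iterable[Hashable], p: float,
                       rng: random.Random) -> Dict[Hashable, object]:
    """Assign each variable * with probability p, and 0 or 1 with
    probability (1 - p) / 2 each, independently."""
    rho = {}
    for x in variables:
        r = rng.random()
        if r < p:
            rho[x] = STAR
        elif r < p + (1 - p) / 2:
            rho[x] = 0
        else:
            rho[x] = 1
    return rho


def restrict_dnf(terms: List[Dict[Hashable, int]],
                 rho: Dict[Hashable, object]) -> Union[int, List[dict]]:
    """Apply rho to a DNF. Returns 1 or 0 if the DNF becomes constant,
    otherwise the list of simplified surviving terms."""
    surviving = []
    for term in terms:
        new_term = {}
        falsified = False
        for var, bit in term.items():
            val = rho.get(var, STAR)
            if val == STAR:
                new_term[var] = bit      # literal stays a variable
            elif val != bit:
                falsified = True         # literal is false, term dies
                break
        if falsified:
            continue
        if not new_term:
            return 1                     # every literal satisfied
        surviving.append(new_term)
    return surviving if surviving else 0
```

For example, restricting $(a \wedge b) \vee c$ with $a = 1$, $b = *$, $c = 0$ leaves the single term $b$.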
The problem with this idea is that after we have coded the values of all the $2^n$ circuits in $X$, there does not seem to be any easy way to recover this information. Since $X$ has already been "ravaged" by the random restriction $\rho$, it is not clear how to distinguish those "code bits" from those "random bits". Further complicating the matter are those bits assigned during the decision tree settlement. All of this must be sorted out, supposedly, by a polynomial size oracle circuit which is to accept $L(M^X)^{=n}$. Note that, after a random restriction $\rho$, it is probabilistically almost impossible to have an easily identifiable segment of the set $X$ all assigned $*$ by $\rho$ (e.g., all strings in $\{0,1\}^{n^k}$ with a certain leading bit pattern), not to mention the subsequent all 0 assignment to fix the decision trees. On the other hand, we have $2^n$ computations to code. It is infeasible for the final polynomial size oracle circuit to "remember" more than a polynomial number of bits as the address of the coding region. So it appears that we must have an easily identifiable region to code, identified with at most a polynomial number of bits for its address, and, to accommodate $2^n$ computations, this region must be large.
To overcome this difficulty, our idea is to use not true random restrictions, but pseudorandom restrictions via the Nisan-Wigderson generator [Nis91a, Nis91b, NW88]. Nisan and Wigderson designed a pseudorandom generator (which we will call an NW generator) provably indistinguishable from true random bits by polynomial size constant depth circuits. While our circuits are not of polynomial size, this can be scaled up easily. Our idea is then to use the output of some NW generator to perform the "random" restriction, and to argue that all $2^n$ circuits are "killed" with high probability, just as before with true random restrictions. The basic argument is that no constant depth circuit of an appropriate size can tell the difference between a true random assignment and a pseudorandom assignment coming from the NW generator. However, for our purpose in this paper, we wish to say that a certain behavior of these $2^n$ constant depth circuits (namely, that they are likely to possess small depth decision trees after a "random" restriction with 0, 1 and $*$'s) is preserved when "pseudorandom restrictions" are substituted for "random restrictions". It is vitally important that whatever property we wish to claim to have been maintained by the substitution of pseudorandom bits for random bits, the property must be expressible as a constant depth circuit with an appropriate size upper bound. It is not clear that the property of "having a small depth decision tree" can be expressed in this way.
We overcome this difficulty by using a weaker property which is a consequence of "having a small depth decision tree", and which nonetheless is sufficient for our purpose. Namely, we take directly the property that, after a restriction with 0, 1 and $*$'s, every one of the $2^n$ circuits can be determined by additionally assigning 0's to a small number of variables which had been assigned $*$'s. This property is expressible in a constant depth way. Then we will mimic the probability distribution of the 0, 1 and $*$'s under the random restrictions by uniform random bits 0's and 1's, so that we can come up with a constant depth circuit $D$ with the following property: It takes only Boolean inputs $\tilde{\rho}$ of 0's and 1's, and $D$ evaluates to 1 iff, when the restriction with 0, 1 and $*$'s defined by $\tilde{\rho}$ is applied to all $2^n$ circuits $C_x$, every $C_x$ can be set to either 0 or 1 after a small number of additional variables are set to 0. We will design $D$ in such a way that under a uniform bit sequence $\tilde{\rho}$, $D$ will almost certainly evaluate to 1.
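One simple way to mimic the distribution of 0, 1 and $*$ by uniform bits is sketched below. The specific block encoding is an assumption chosen for illustration and is not necessarily the one used in the proof: each variable consumes $q + 1$ uniform bits, is assigned $*$ iff the first $q$ bits are all 1 (probability $2^{-q}$), and otherwise takes the value of the last bit. Decoding a block needs only an AND of $q$ bits and a selection, so it is cheap to express in constant depth.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, List, Sequence

STAR = "*"  # marks a variable left unassigned


def restriction_from_bits(bits: Sequence[int], q: int) -> List[object]:
    """Decode a 0/1 sequence into a restriction over 0, 1 and *.

    Each variable uses a block of q + 1 bits: it gets * iff the first q
    bits are all 1, and otherwise the value of the last bit of the block.
    """
    block = q + 1
    if len(bits) % block != 0:
        raise ValueError("bit sequence must consist of whole blocks")
    rho = []
    for i in range(0, len(bits), block):
        chunk = bits[i : i + block]
        if all(b == 1 for b in chunk[:q]):
            rho.append(STAR)
        else:
            rho.append(chunk[q])
    return rho


def exact_distribution(q: int) -> Dict[object, Fraction]:
    """Exact distribution of one decoded symbol under uniform bits."""
    counts = {STAR: 0, 0: 0, 1: 0}
    for chunk in product((0, 1), repeat=q + 1):
        counts[restriction_from_bits(chunk, q)[0]] += 1
    total = 2 ** (q + 1)
    return {key: Fraction(value, total) for key, value in counts.items()}
```

Under this encoding $\Pr[*] = 2^{-q}$ and $\Pr[0] = \Pr[1] = (1 - 2^{-q})/2$, which matches a random restriction with parameter $p = 2^{-q}$.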
In fact we need more than that. We also need to have the property that a certain segment of the oracle is untouched by the additional setting of 0's in all $2^n$ decision tree settlements. We will argue by the pigeonhole principle that our bounds guarantee a suitable region unspoiled by all these decision tree settlement variables. It is not reasonable to expect that any such region is entirely assigned with $*$'s, but at least there should be many $*$'s.
Assume now we have designed such a $D$ satisfying all these requirements. For this $D$ we apply the NW generator, substituting pseudorandom bits for the true random bits $\tilde{\rho}$ given to $D$ as inputs. We conclude that $D$ still evaluates to 1 with high probability. In particular, there must be some setting of the source bits $\omega$ for the generator such that $D$ evaluates to 1. This implies that we can assign the oracle set $X$ first according to the pseudorandom restriction described by the pseudorandom bits, then according to the $2^n$ small depth decision trees, which are guaranteed by the evaluation of $D$, and set these additional variables all to 0. This settles all the decision trees and thus the values of all $2^n$ circuits $C_x$ are determined. Furthermore, there is a significant segment $T_{y_0}$ of $X$ free from any variables used in any decision tree settlement, where we will code these $2^n$ results of $C_x$.
Even though this segment $T_{y_0}$ is free from any variables used in any decision tree settlement, in order to code the computation results of $C_x$, there must be plenty of $*$'s left, and they must be recoverable by polynomial size circuits. We will show that with high probability over a uniformly chosen random seed $\omega$, the pseudorandom restriction defined by $\omega$ will leave plenty of $*$'s in each segment such as $T_{y_0}$. We then in fact choose a sequence of bits $\omega$ that satisfies both the requirement $D = 1$ and this additional requirement.
Finally, we will show that with a suitable choice of parameters in the combinatorial design used in the NW generator, we will be able to recover in polynomial time the locations where we assigned $*$'s in $X$, in particular from within the coding segment of $X$ given the address of this segment. Now our polynomial size circuit $C_M$ is designed as follows: It remembers (is hardwired with) $y_0$, i.e., the address of $T_{y_0}$, and remembers the seed $\omega$ for the NW generator, which is of polynomial length. Then on any input $x$, it performs the polynomial time computation over a finite field to extract the coded result of $C_x$ from the appropriate location in $X$.
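To indicate why a computation over a finite field suffices to locate the seed positions feeding any given output bit, here is a sketch of the standard polynomial-based combinatorial design used with NW generators. The parameters, the function names, and the use of parity as the hard function are assumptions for illustration; the proof chooses its own parameters. For a prime $p$, each polynomial $f$ of degree less than `deg_bound` over GF($p$) defines the set $S_f = \{(a, f(a)) : a \in \mathrm{GF}(p)\}$ inside a universe of $p^2$ seed positions; two distinct such polynomials agree on at most `deg_bound - 1` points.

```python
from itertools import combinations, product
from typing import Callable, Dict, FrozenSet, Sequence, Tuple


def poly_eval(coeffs: Sequence[int], a: int, p: int) -> int:
    """Evaluate coeffs[0] + coeffs[1]*a + ... modulo the prime p (Horner)."""
    result = 0
    for c in reversed(coeffs):
        result = (result * a + c) % p
    return result


def nw_design(p: int, deg_bound: int) -> Dict[Tuple[int, ...], FrozenSet[int]]:
    """One set S_f per polynomial f of degree < deg_bound over GF(p).

    Position (a, b) of the p-by-p universe is stored as the index a*p + b.
    """
    design = {}
    for coeffs in product(range(p), repeat=deg_bound):
        design[coeffs] = frozenset(
            a * p + poly_eval(coeffs, a, p) for a in range(p)
        )
    return design


def max_intersection(design: Dict[Tuple[int, ...], FrozenSet[int]]) -> int:
    """Largest overlap between two distinct sets of the design."""
    return max(len(s & t) for s, t in combinations(design.values(), 2))


def parity(bits: Sequence[int]) -> int:
    return sum(bits) % 2


def nw_output_bit(seed: Sequence[int], positions: FrozenSet[int],
                  hard_fn: Callable[[Sequence[int]], int]) -> int:
    """Output bit indexed by a design set: hard_fn applied to the seed
    bits at the positions of that set."""
    return hard_fn([seed[i] for i in sorted(positions)])
```

Given the coefficients of $f$, the positions in $S_f$ are obtained by $p$ polynomial evaluations, so the seed bits relevant to any single output bit can be found in time polynomial in $p$ without enumerating the whole design.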
3.2. Proof Detail
Now we specify the parameters and state our proof precisely.
Fix any $\Sigma_d^p$ polynomial time bounded oracle alternating Turing machine $M$, with time bound $n^k$. For notational convenience we will assume $k > 2$ and $d \ge 7$. We assume that $n$ is sufficiently large. On input of length $n$, $M$ can only query strings of length at most $n^k$. We will use $m$ to denote $n^k$ and $M$ to denote $2^m$ throughout this proof.
Assume that membership in the oracle set has already been decided for all strings of length less than $m$. Our task is to fix the membership "$z \in X$?" for strings $z$ of length exactly $m$, so that for each input $x$ of length $n$, membership "$x \in L(M^X)$?" can be decided by a polynomial size circuit $C_M$ with oracle gates that can access $X^{=m}$. Since $X^{<m}$ has already been fixed, membership "$x \in L(M^X)$?" is determined by the set $X^{=m}$. Here we specifically require that the circuit $C_M$ can access only those strings that can possibly be accessed by the simulated machine $M$ on input of length $n$.
There are $2^n$ inputs $x$ of length $n$, and each computation of $M$ on $x$ gives rise to a depth $d + 1$
Boolean circuit $C_x$ with bottom fan-in at most $m$. The inputs to each circuit $C_x$ are the $2M$
literals $z$ and $\bar z$, where $z \in \{0,1\}^m$ corresponds to the truth value of $[z \in X]$. (To simplify
our notation, we will denote by $z$ both a string in $\{0,1\}^m$ as well as the Boolean variable
corresponding to $[z \in X]$.) As stated earlier, we assume that each circuit $C_x$ is a tree, starting
with an OR gate at the top, and alternating with AND's and OR's until inputs $z$'s and $\bar z$'s,
where these inputs are duplicated to keep the tree structure. That is, each circuit $C_x$ is a depth
$d + 1$ tree with size at most $mM$ and bottom fan-in at most $m$.
A Switching Lemma shows that such a constant depth circuit is sufficiently weakened, after
a suitably chosen random restriction $\rho$, so as to have either small min-terms or small max-terms.
The strongest form known is due to Håstad [Hås86a]. For our purpose in this paper, however,
we will require something more.
The decision tree complexity of a Boolean function $f$, denoted by $\mathrm{DC}(f)$, is the smallest
depth of a Boolean decision tree computing the function. It can be shown easily that if $\mathrm{DC}(f) \le t$,
then $f$ can be expressed both as an AND of OR's as well as an OR of AND's, with bottom
fan-in at most $t$. Moreover, clearly, there is a subset of no more than $t$ variables such that, if one
assigns all of them to 0, the function $f$ is determined (follow the all-0 branch of the tree). This is
an important advantage as we will have to assign many non-disjoint subsets of variables for
multiple Boolean functions, and all these assignments need to be consistent.
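The remark about assigning at most $t$ variables to 0 can be illustrated with a small sketch. The tree representation (a leaf is a `bool`, an inner node is a tuple `(var, subtree_if_0, subtree_if_1)`) and the example function are assumptions made for illustration only; they do not come from the paper.

```python
from itertools import product

# Hypothetical representation: a leaf is a bool, an inner node is
# (var, subtree_if_0, subtree_if_1).

def evaluate(tree, assignment):
    """Follow the tree according to a full 0/1 assignment."""
    while not isinstance(tree, bool):
        var, low, high = tree
        tree = high if assignment[var] else low
    return tree

def depth(tree):
    if isinstance(tree, bool):
        return 0
    _, low, high = tree
    return 1 + max(depth(low), depth(high))

def all_zero_path(tree):
    """Variables queried when every answer is 0, and the leaf reached."""
    queried = []
    while not isinstance(tree, bool):
        var, low, _ = tree
        queried.append(var)
        tree = low
    return queried, tree

def forced_by_zeroing(tree, n):
    """Check that zeroing the queried variables fixes the function value."""
    queried, value = all_zero_path(tree)
    free = [i for i in range(n) if i not in queried]
    for bits in product([0, 1], repeat=len(free)):
        assignment = [0] * n
        for i, b in zip(free, bits):
            assignment[i] = b
        if evaluate(tree, assignment) != value:
            return False
    return True

# Example function (an assumption): f = x0 OR (x1 AND x2), as a depth-3 tree.
TREE = (0, (1, False, (2, False, True)), True)
```

Here the all-0 branch queries only `x0` and `x1`, fewer than the depth of the tree, and the function is then constant regardless of `x2`.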
Adapting Håstad's proof to the decision tree model, one can prove the following lemma. In
Section 4 we will discuss these lemmas more thoroughly.
Lemma 4 For any depth $d+1$ Boolean circuit $C$ on $M$ inputs $z_1, z_2, \ldots, z_M$, of size at most $s$
and bottom fan-in at most $t$, we have
$$\Pr[\,\mathrm{DC}(C|_\rho) \ge t\,] \le \frac{s}{2^t},$$
where the random restriction $\rho$ is defined for $p = \frac{1}{(10t)^d}$.
To reduce all circuits $C_x$, $x \in \{0,1\}^n$, to small depth decision trees, we apply a random
restriction $\rho$ with $p = 1/(20m)^d$ to these circuits. Then by the union bound we have,
Claim 1
$$\Pr\Big[\bigvee_{x \in \{0,1\}^n} [\,\mathrm{DC}(C_x|_\rho) \ge 2m\,]\Big] \le 2^n \cdot \frac{mM}{2^{2m}} = \frac{m}{2^{m-n}}.$$
That is, with probability close to 1, a random restriction reduces every circuit $C_x$ to a decision
tree of depth $< 2m$.
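The arithmetic in Claim 1 (Lemma 4 with $s = mM$, $t = 2m$, summed over $2^n$ inputs) can be checked exactly for small parameters. This is only a sanity check with hypothetical function names, using exact rational arithmetic.

```python
from fractions import Fraction

def union_bound(n, k):
    """2^n * s / 2^t with s = m * M, t = 2m, m = n^k, M = 2^m."""
    m = n ** k
    M = 2 ** m
    per_circuit = Fraction(m * M, 2 ** (2 * m))
    return 2 ** n * per_circuit

def closed_form(n, k):
    """The simplified bound m / 2^(m - n)."""
    m = n ** k
    return Fraction(m, 2 ** (m - n))
```

For instance, `union_bound(2, 3)` and `closed_form(2, 3)` both equal `Fraction(1, 8)`; the bound shrinks rapidly as $n$ grows because $m = n^k$ dominates $n$.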
Below we will carry out a sequence of transformations on the circuits $C_x$, $x \in \{0,1\}^n$, with
the ultimate goal of constructing the circuit $D$ which, in some sense, is a test for the success of
a "random restriction".
Step 1 ($C_x^1$): $C_x^1$ takes $2M$ Boolean inputs $(a_z, b_z)$, for $z \in \{0,1\}^m$. The pair $(a_z, b_z)$ will
represent the status of the Boolean variable $z$ to $C_x$ as follows: $a_z = 1$ iff the value of $z$ is set
(to either 0 or 1, i.e., not set to $*$), and $a_z = 0$ otherwise. If $a_z = 1$, then the 0-1 value of $z$ is
represented by $b_z$. If the pair $(a_z, b_z)$ represents the value of $z$, then the pair $(a_z, \bar b_z)$ represents
that of $\bar z$. Clearly, if $z$ is set 0 (resp., 1), then $\bar z$ must be set 1 (resp., 0).
$C_x^1$ is constructed from $C_x$ as follows. Each gate $g$ in $C_x$ will be represented by a pair of
gates $(g_s, g_v)$. $g_s = 1$ iff $g$ is set to either 0 or 1, i.e., it is determined; $g_s = 0$ otherwise. If
$g_s = 1$ then $g = g_v$. Thus, $(g_s, g_v) = (0,0)$ or $(0,1)$ represent the situation where $g$ has not been
determined, and $(g_s, g_v) = (1,0)$, or $(1,1)$ respectively, represent the case where $g$ is set to 0, or
1 respectively.
Suppose $g$ is an OR gate, $g = \bigvee_{i=1}^{s} g^{(i)}$, where $g^{(i)}$ is an input literal or an internal gate. Suppose
$g^{(i)}$ is represented by the pair $(g_s^{(i)}, g_v^{(i)})$. This representation is already defined inductively.
Then we let
$$g_s = \Big[\bigvee_{i=1}^{s} \big(g_s^{(i)} \wedge g_v^{(i)}\big)\Big] \vee \Big[\bigwedge_{i=1}^{s} \big(g_s^{(i)} \wedge \overline{g_v^{(i)}}\big)\Big].$$
That is, $g$ is set iff either some $g^{(i)}$ is set to 1, or else all $g^{(i)}$ are set to 0. Note that the formula
given for $g_s$ is a depth 2 circuit of size $O(s)$. Also let
$$g_v = \bigvee_{i=1}^{s} g_v^{(i)}.$$
Note that $g_v$ is only a "valid" value for $g$ when $g_s = 1$. Also $g_v$ is depth 1 and has size $s$.
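The case $g = \bigwedge_{i=1}^{s} g^{(i)}$ is dual. In this case, $g$ is set iff either some $g^{(i)}$ is set to 0, or else all
$g^{(i)}$ are set to 1. Thus
$$g_s = \Big[\bigvee_{i=1}^{s} \big(g_s^{(i)} \wedge \overline{g_v^{(i)}}\big)\Big] \vee \Big[\bigwedge_{i=1}^{s} \big(g_s^{(i)} \wedge g_v^{(i)}\big)\Big], \quad\text{and}\quad g_v = \bigwedge_{i=1}^{s} g_v^{(i)}.$$
Again they are depth 2, size $O(s)$, and depth 1, size $s$, respectively.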
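The pair encoding $(g_s, g_v)$ can be checked against ordinary three-valued evaluation (0, 1, or undetermined). The sketch below is an illustration under assumptions: the names `encode`, `or_gate`, `and_gate` and the use of `None` for an undetermined value are choices made here, not part of the paper.

```python
from itertools import product

STAR = None  # an undetermined value

def encode(value, filler=0):
    """Map 0, 1 or STAR to a pair (g_s, g_v); g_v is arbitrary when g_s = 0."""
    return (0, filler) if value is STAR else (1, value)

def or_gate(pairs):
    """OR gate: set iff some input is set to 1, or all inputs are set to 0."""
    g_s = int(any(s and v for s, v in pairs) or all(s and not v for s, v in pairs))
    g_v = int(any(v for _, v in pairs))
    return g_s, g_v

def and_gate(pairs):
    """AND gate (dual): set iff some input is set to 0, or all are set to 1."""
    g_s = int(any(s and not v for s, v in pairs) or all(s and v for s, v in pairs))
    g_v = int(all(v for _, v in pairs))
    return g_s, g_v

def three_valued_or(values):
    if any(v == 1 for v in values):
        return 1
    return 0 if all(v == 0 for v in values) else STAR

def three_valued_and(values):
    if any(v == 0 for v in values):
        return 0
    return 1 if all(v == 1 for v in values) else STAR

def consistent(gate, reference, fan_in):
    """Compare the pair encoding with three-valued logic on every input."""
    for values in product([0, 1, STAR], repeat=fan_in):
        for fillers in product([0, 1], repeat=fan_in):
            pairs = [encode(v, f) for v, f in zip(values, fillers)]
            g_s, g_v = gate(pairs)
            expected = reference(values)
            if g_s != int(expected is not STAR):
                return False
            if g_s and g_v != expected:
                return False
    return True
```

The `fillers` loop exercises the point made in the text that $g_v$ is only meaningful when $g_s = 1$: whatever bit an undetermined input carries in its value slot, the output pair is still correct.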
In order to maintain the alternating form of AND's and OR's in the circuit $C_x^1$, with all negations
pushed to the input level, we can represent each gate $g$ by both $g$ and its negated value $\bar g$. This
can introduce at most a factor of 2 in the size. (In fact we will define only three gates $g_s$, $g_v$, and
$\overline{g_v}$ in our construction; we do not need $(\bar g)_s$. But we will omit the detailed analysis of constant
factors.) $C_x^1$ has two output gates $g_s$ and $g_v$ for the output gate $g$ of $C_x$. It follows that
$$\mathrm{size}(C_x^1) = O(\mathrm{size}(C_x)), \quad\text{and}\quad \mathrm{depth}(C_x^1) = 2\,\mathrm{depth}(C_x).$$
We can take the constant in $O(\mathrm{size}(C_x))$ to be 10, say.
Step 2 ($C_x^2$): Let $p = \frac{1}{(20m)^d}$. Let $L = \lceil \log_2 \frac{1}{p} \rceil \le dk \log_2(20n)$. $C_x^2$ takes Boolean inputs
$(a_{z,1}, \ldots, a_{z,L}, b_z)$, for $z \in \{0,1\}^m$. The circuit $C_x^2$ is identical to $C_x^1$, except instead of taking
inputs $a_z$, it has $a_z = \bigvee_{j=1}^{L} a_{z,j}$.
Note that a random restriction $\rho$ with parameter $\Pr[\rho(z) = *] = 1/2^L$ on $C_x$ is simulated
by uniformly and independently assigning all the bits $(a_{z,1}, \ldots, a_{z,L}, b_z)$ to 0 or 1, in $C_x^2$, for
$z \in \{0,1\}^m$. The behavior of $C_x$ is represented in $C_x^2$ exactly. Here we have
$$\mathrm{size}(C_x^2) = \mathrm{size}(C_x^1) + O(ML), \quad\text{and}\quad \mathrm{depth}(C_x^2) = \mathrm{depth}(C_x^1) + 1.$$
Note also that $2^{-L} \le p$. The same upper bound in Lemma 4 and Claim 1 still applies when $\rho$
has parameter $2^{-L}$.
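That uniform bits $(a_{z,1}, \ldots, a_{z,L}, b_z)$ induce a restriction with $\Pr[\rho(z) = *] = 2^{-L}$ can be confirmed by exhaustive enumeration for small $L$. The function names below are hypothetical and the check is only a sketch.

```python
from fractions import Fraction
from itertools import product

def restriction_value(a_bits, b):
    """Value given to z: '*' if all a_{z,j} are 0 (a_z = 0), else the bit b_z."""
    return b if any(a_bits) else "*"

def exact_distribution(L):
    """Distribution of the value of one variable over all 2^(L+1) input bits."""
    counts = {"*": 0, 0: 0, 1: 0}
    for bits in product([0, 1], repeat=L + 1):
        *a_bits, b = bits
        counts[restriction_value(a_bits, b)] += 1
    total = 2 ** (L + 1)
    return {key: Fraction(c, total) for key, c in counts.items()}
```

For `L = 3` this gives probability 1/8 for `'*'` and 7/16 each for 0 and 1, i.e. $(1 - 2^{-L})/2$ for each constant, as a random restriction with parameter $2^{-L}$ requires.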
Step 3 ($C_x^3$): In $C_x^3$ we will check for the existence of a subset $S \subseteq [M]$ of cardinality $|S| = 2m$
such that, first, they are assigned $*$ by $\rho$, and second, if we further set them all to 0, it would
determine the circuit $C_x$. We know from Claim 1 that this is almost certainly true for our
random restriction.
Thus, we let
$$C_x^3 = \bigvee_{S} \Big[\bigwedge_{z \in S} \overline{a_z} \wedge [(C_x^2)_s]_S\Big],$$
where $\bigvee_S$ ranges over all subsets $S \subseteq [M]$ of cardinality $|S| = 2m$, and $(C_x^2)_s$ is the "set bit
output" for $C_x^2$, and $[(C_x^2)_s]_S$ is obtained from $(C_x^2)_s$ by setting all $b_z = 0$ for $z \in S$. Recall that
$a_z = \bigvee_{j=1}^{L} a_{z,j}$. Then we have
$$\mathrm{size}(C_x^3) \le \binom{M}{2m}\big(\mathrm{size}(C_x^2) + O(m)\big), \quad\text{and}\quad \mathrm{depth}(C_x^3) = \mathrm{depth}(C_x^2) + 2.$$
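Step 4 ($D$): Finally, define $D$ by
$$D = \bigwedge_{x \in \{0,1\}^n} C_x^3.$$
Then we have
$$\mathrm{size}(D) = 2^n(\mathrm{size}(C_x^3)), \quad\text{and}\quad \mathrm{depth}(D) = \mathrm{depth}(C_x^3) + 1.$$
This completes the construction of $D$, with
$$\mathrm{size}(D) < 2^{3m^2}, \quad\text{and}\quad \mathrm{depth}(D) \le 2d + 6 \le 3d - 1.$$
Below we will denote $3d - 1$ by $\hat d$.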
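The depth bookkeeping across Steps 1 to 4 is easy to trace mechanically. The following sketch (with a hypothetical function name) just replays the stated depth relations and shows where the assumption $d \ge 7$ is used.

```python
def depth_of_D(d):
    """Depth of D obtained by following the depth relations of Steps 1 to 4."""
    depth_cx = d + 1          # each C_x has depth d + 1
    depth_c1 = 2 * depth_cx   # Step 1 doubles the depth
    depth_c2 = depth_c1 + 1   # Step 2 adds the OR computing a_z
    depth_c3 = depth_c2 + 2   # Step 3 adds two levels
    return depth_c3 + 1       # Step 4 adds the top AND
```

The result is $2d + 6$, which is at most $3d - 1$ exactly when $d \ge 7$.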
From our construction, it follows that (i) the uniform independent distribution on the input
bits of $D$ simulates the random restriction $\rho$ with $p = 2^{-L} \le 1/(20m)^d$, and that (ii) $D$ becomes
true if every $C_x|_\rho$ has decision tree depth at most $2m$. Hence, the following claim follows from
Claim 1.
Claim 2
$$\Pr[D = 1] \ge 1 - \frac{m}{2^{m-n}},$$
where the probability is over uniform input bits of $D$.
Now we apply a NW generator to this circuit $D$. First we recall some basic notions on NW
generators from [NW94].
Let $U$, $M$, $m$ and $q$ be positive integers. Let $[U]$ be some set of cardinality $U$, e.g.,
$\{1, 2, \ldots, U\}$. A collection of subsets $\mathcal{S} = \{S_1, \ldots, S_M\}$ of some domain $[U]$ is called a $(m,q)$-design
if it satisfies the following conditions.
(1) $\forall i$, $1 \le i \le M$ [ $|S_i| = q$ ], and
(2) $\forall i$, $\forall j$, $1 \le i \ne j \le M$ [ $|S_i \cap S_j| \le m$ ].
Based on a given $(m,q)$-design $\mathcal{S} = \{S_1, \ldots, S_M\}$ with domain $[U]$, we define the following
function $g_{\mathcal{S}} : \{0,1\}^U \to \{0,1\}^M$, which we call a (parity based) NW generator.
$$g_{\mathcal{S}}(x_1 \cdots x_U) = y_1 \cdots y_M,$$
where each $y_i$, $1 \le i \le M$, is defined
by $y_i = x_{s_1} \oplus \cdots \oplus x_{s_q}$ (where $S_i = \{s_1, \ldots, s_q\} \subseteq [U]$).
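A minimal sketch of the parity based generator and of the two design conditions follows. The toy design is an assumption chosen for illustration (domain $\{0, \ldots, 5\}$, 0-indexed, with $q = 3$ and intersection bound 1); it is not the design used in the proof.

```python
def nw_generator(design, seed):
    """Parity based NW generator: y_i is the XOR of the seed bits indexed by S_i."""
    output = []
    for subset in design:
        bit = 0
        for position in subset:
            bit ^= seed[position]
        output.append(bit)
    return output

def is_design(design, m, q):
    """Check |S_i| = q for all i and |S_i & S_j| <= m for all i != j."""
    sets = [set(s) for s in design]
    if any(len(s) != q for s in sets):
        return False
    return all(len(sets[i] & sets[j]) <= m
               for i in range(len(sets)) for j in range(i + 1, len(sets)))

# Hypothetical toy design on the domain {0, ..., 5}.
DESIGN = [(0, 1, 2), (0, 3, 4), (1, 3, 5), (2, 4, 5)]
```

With seed `[1, 0, 1, 1, 0, 0]` the generator outputs `[0, 0, 1, 1]`: for example the first output bit is `seed[0] ^ seed[1] ^ seed[2] = 1 ^ 0 ^ 1 = 0`.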
For the pseudorandomness of this generator, we have the following lemma [NW94].
Lemma 5 For any positive integers $U, M, m, q, s$ and $e$, and positive real $\epsilon$, let $g_{\mathcal{S}}$ be the NW
generator defined using an $(m,q)$-design $\{S_1, \ldots, S_M\}$ with domain $[U]$, and suppose for any
depth $e + 1$ circuit $C$ on $q$ input bits and of size at most $s + c_{nw} 2^m M$ (where $c_{nw}$ is some
constant), the $q$ bit parity function has the following bias:
$$\Big|\Pr_{(u_1,\ldots,u_q) \in \{0,1\}^q}[\,C(u_1,\ldots,u_q) = u_1 \oplus \cdots \oplus u_q\,] - \frac{1}{2}\Big| \le \frac{\epsilon}{M}.$$
Then $g_{\mathcal{S}}$ has the following pseudorandomness against any depth $e$ circuit $E$ on $M$ input bits and
of size at most $s$.
$$\Big|\Pr_{y \in \{0,1\}^M}[\,E(y) = 1\,] - \Pr_{x \in \{0,1\}^U}[\,E(g_{\mathcal{S}}(x)) = 1\,]\Big| \le \epsilon.$$
To apply the NW generator to our depth $\hat d$ circuit $D$ constructed above, we set our parameters
and define our $(m,q)$-design, as follows. For the parameters $m$ and $M$, we will use the same
ones that have been used so far, namely $m = n^k$ and $M = 2^m$. We will take a finite field $F$, and
set $q = |F|$ and $U = q^2$. We will take a specific finite field $F = \mathbb{Z}_2[X]/(X^{2 \cdot 3^u} + X^{3^u} + 1)$ [vL91],
where each element $\alpha \in F$ takes $K = 2 \cdot 3^u$ bits, and $q = |F| = 2^K$. We choose $u$ so that
$q \ge (3m^2 + 1)^{\hat d + 2}$. Then $q^{1/(\hat d + 2)} \ge \log_2(2^{3m^2} + c_{nw} 2^m M)$, where $c_{nw}$ is the constant in the
above lemma. Clearly $q \le n^{ckd}$ will do, for some universal constant $c$, for example $c = 7$.
Then $K = O(dk \log n)$. Thus, this field has polynomial size and each element is represented by
$O(\log n)$ bits. All arithmetic operations in this field $F$ are easy.
We will consider precisely $M = 2^m$ polynomials $f_z(\xi) \in F[\xi]$, each of degree at most $m$,
where each $f_z$ is indexed by its coefficients, concatenated as a bit sequence of length exactly $m$.
The precise manner in which this is done is not very important, but for definiteness, we can
take the following. We take polynomials of degree $\delta = \lfloor m/K \rfloor = \Omega(n^k/(dk \log n)) \gg n^2$, with
exactly $\delta + 1$ coefficients,
$$f_z(\xi) = c_\delta \xi^\delta + \cdots + c_1 \xi + c_0,$$
where all $c_j$ vary over $F$, except that $c_\delta$ is restricted to exactly $2^{m - K\delta}$ many values. Note that
$0 \le m - K\delta < K$. The concatenation $z = \langle c_\delta \cdots c_0 \rangle$ has exactly $m$ bits. Each $f_z$ defines a subset
of $F \times F$ of cardinality $q$, $\{(\alpha, f_z(\alpha)) \mid \alpha \in F\}$, which we denote by $S_z$. A $(m,q)$-design that
we will use is defined as $\mathcal{S} = \{S_1, \ldots, S_M\}$, indexed by $z \in \{0,1\}^m$, which we identify with the
index set $\{1, \ldots, M\}$. Note that $F \times F$ is a domain $[U]$ with $U = q^2$. The first condition of a
$(m,q)$-design is immediate, and the second condition, i.e., $|S_z \cap S_{z'}| \le m$, for all $z \ne z'$, is also
easy to see by noting that $\deg(f_z) < m$ and $\deg(f_{z'}) < m$: two distinct polynomials of degree less than $m$
agree on fewer than $m$ points. Note that our NW generator $g_{\mathcal{S}}$
generates a pseudorandom sequence of length $M = 2^m$ from a seed of length $U = q^2$.
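The design condition for graphs of low-degree polynomials can be verified by brute force on a tiny field. As a simplifying assumption, the sketch below uses a small prime field GF(p) with integer arithmetic modulo p, rather than the field of characteristic 2 used in the proof; the intersection argument is the same in any field.

```python
from itertools import product

def poly_eval(coeffs, a, p):
    """Evaluate c_0 + c_1 a + ... over GF(p), p prime; coeffs[i] multiplies a^i."""
    value = 0
    for c in reversed(coeffs):
        value = (value * a + c) % p
    return value

def graph_set(coeffs, p):
    """S_z = {(a, f_z(a)) : a in GF(p)}, the graph of the polynomial."""
    return {(a, poly_eval(coeffs, a, p)) for a in range(p)}

def max_intersection(p, degree):
    """Largest |S_z & S_z'| over all pairs of distinct polynomials of degree <= degree."""
    polys = list(product(range(p), repeat=degree + 1))
    sets = [graph_set(c, p) for c in polys]
    worst = 0
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            worst = max(worst, len(sets[i] & sets[j]))
    return worst
```

Over GF(5) with degree at most 2, every set has exactly 5 elements and two distinct sets share at most 2 points, since the difference of two distinct polynomials has at most 2 roots.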
For showing the pseudorandomness of $g_{\mathcal{S}}$, we use the following lemma which follows from
the decision tree version of the Switching Lemma.
Lemma 6 For any depth $e$, and for all sufficiently large $q$, any depth $e$ circuit $C$ on $q$ inputs and of
size at most $2^{q^{1/(e+1)}}$ satisfies
$$\Big|\Pr_{(u_1,\ldots,u_q) \in \{0,1\}^q}[\,C(u_1,\ldots,u_q) = u_1 \oplus \cdots \oplus u_q\,] - \frac{1}{2}\Big| \le 2^{-q^{1/(e+1)}}.$$
Then the following claim is immediate from Lemma 5 and Lemma 6.
Claim 3 Our NW generator $g_{\mathcal{S}}$ has the following pseudorandomness against any circuit $E$ of
size at most $2^{3m^2}$ and depth $\hat d$:
$$\Big|\Pr_{y \in \{0,1\}^M}[\,E(y) = 1\,] - \Pr_{x \in \{0,1\}^U}[\,E(g_{\mathcal{S}}(x)) = 1\,]\Big| \le 2^{m - 3m^2}.$$
Recall that the circuit $D$ takes $(L + 1)M$ Boolean inputs, i.e., $(a_{z,1}, \ldots, a_{z,L}, b_z)$, for $z \in \{0,1\}^m$,
where $M = 2^m$ and $L = \lceil \log_2 \frac{1}{p} \rceil$. We provide these input values by our NW generator
that produces an $M$ bit pseudorandom string from a $q^2$ bit random seed. Hence, for the seed
to the generator, a random string of length $(L + 1)q^2$ is needed, and we use a sequence of
independently and uniformly distributed bits $\{u^{(0)}_{\alpha,\beta}, u^{(1)}_{\alpha,\beta}, \ldots, u^{(L)}_{\alpha,\beta}\}$, for each $\alpha, \beta \in F$. That is,
for each $j = 1, \ldots, L$, we use $q^2$ bits $\{u^{(j)}_{\alpha,\beta} \mid \alpha, \beta \in F\}$ to generate the $M$ Boolean values of
$a_{z,j}$, for $z \in \{0,1\}^m$. Similarly, the set $\{u^{(0)}_{\alpha,\beta} \mid \alpha, \beta \in F\}$ of $q^2$ bits is used to generate the $M$
Boolean values of $b_z$, for $z \in \{0,1\}^m$. More specifically, for each $z \in \{0,1\}^m$ and $j = 1, \ldots, L$,
we define $a_{z,j}$ and $b_z$ as follows.
$$a_{z,j} = \bigoplus_{\alpha \in F} u^{(j)}_{\alpha, f_z(\alpha)}, \quad\text{and}\quad b_z = \bigoplus_{\alpha \in F} u^{(0)}_{\alpha, f_z(\alpha)}.$$
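Then we have the following claim.
Claim 4 Let $g_{\mathcal{S}}^{(i)}$ denote the pseudorandom output sequence of $g_{\mathcal{S}}$ on random seed bits
$\{u^{(i)}_{\alpha,\beta} \mid \alpha, \beta \in F\}$, for $0 \le i \le L$. Then
$$\Pr[\,D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(L)}, g_{\mathcal{S}}^{(0)}) = 1\,] \ge 1 - o(1),$$
where the probability is over independently and uniformly distributed bits $\{u^{(0)}_{\alpha,\beta}, u^{(1)}_{\alpha,\beta}, \ldots, u^{(L)}_{\alpha,\beta}\}$,
for $\alpha, \beta \in F$.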
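Proof. Let us denote by $a_i$ and $b$ respectively a sequence of $M$ true random bits assigned to $D$'s
input variables $a_{z,i}$ and $b_z$, for $z \in \{0,1\}^m$. Then our goal is to show that
$\Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(L)}, g_{\mathcal{S}}^{(0)}) = 1]$ is close to 1. We claim
$\Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(L)}, g_{\mathcal{S}}^{(0)}) \ne 1] \le 1/2^{n-1}$. For a contradiction suppose
it is $> 1/2^{n-1}$.
Recall that from Claim 2, $\Pr[D(a_1, \ldots, a_L, b) \ne 1] \le m/2^{m-n}$.
Then we have
$$\Big|\Pr[D(a_1, \ldots, a_L, b) \ne 1] - \Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(L)}, g_{\mathcal{S}}^{(0)}) \ne 1]\Big| > \frac{1}{2^{n-1}} - \frac{m}{2^{m-n}} > \frac{1}{2^n}.$$
This implies, by the telescoping argument,
$$\begin{aligned}
\frac{1}{2^n} &< \big|\Pr[D(a_1, \ldots, a_L, b) \ne 1] - \Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(L)}, g_{\mathcal{S}}^{(0)}) \ne 1]\big| \\
&\le \big|\Pr[D(a_1, \ldots, a_L, b) \ne 1] - \Pr[D(g_{\mathcal{S}}^{(1)}, a_2, \ldots, a_L, b) \ne 1]\big| \\
&\quad + \big|\Pr[D(g_{\mathcal{S}}^{(1)}, a_2, \ldots, a_L, b) \ne 1] - \Pr[D(g_{\mathcal{S}}^{(1)}, g_{\mathcal{S}}^{(2)}, a_3, \ldots, a_L, b) \ne 1]\big| \\
&\quad + \cdots \\
&\quad + \big|\Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(L)}, b) \ne 1] - \Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(L)}, g_{\mathcal{S}}^{(0)}) \ne 1]\big|,
\end{aligned}$$
that there exists some $i$ such that
$$\Big|\Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(i-1)}, a_i, \ldots, a_L, b) \ne 1] - \Pr[D(g_{\mathcal{S}}^{(1)}, \ldots, g_{\mathcal{S}}^{(i-1)}, g_{\mathcal{S}}^{(i)}, a_{i+1}, \ldots, a_L, b) \ne 1]\Big| > \frac{1}{L 2^n}.$$
By an averaging argument, this bound still holds by appropriately fixing random bits other
than $a_i$ and the source bits for $g_{\mathcal{S}}^{(i)}$. In other words, for some circuit $D'$ with $M$ input variables,
of size at most $2^{3m^2}$ and depth at most $\hat d$ (the bounds on $\mathrm{size}(D)$ and $\mathrm{depth}(D)$), we have
$$\Big|\Pr[D'(a_i) = 1] - \Pr[D'(g_{\mathcal{S}}^{(i)}) = 1]\Big| > \frac{1}{L 2^n}.$$
This is a contradiction to Claim 3, the pseudorandomness of the generator $g_{\mathcal{S}}$, since $L = O(d \log m)$.
□ (Claim 4)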
This claim states that with high probability, a pseudorandom sequence satisfies $D$, meaning
that the random restriction induced from the pseudorandom sequence reduces every $C_x$ to a
simple function (e.g., a small decision tree) whose value can be fixed by fixing $t = 2m$ additional
variables (for each $C_x$) to 0. Next we will argue that, for such a pseudorandom restriction, one
can find some space to encode the determined value of each $C_x$.
Consider a restriction induced by a pseudorandom sequence satisfying $D$. Apply this restriction
to all variables $z$ of circuits $C_x$, and fix further the value of some set $Y$ of variables to 0 in
order to determine the value of circuits $C_x$ for all $x \in \{0,1\}^n$. We may assume that the size of $Y$
is at most $2m 2^n$, which is guaranteed by the fact that $D = 1$ with our pseudorandom sequence.
Then there exists $y_0$ of length $n^2/2$ such that a segment $T_{y_0} = \{z \in \{0,1\}^m \mid y_0 \text{ is a prefix of } z\}$
has no intersection with $Y$; that is, all variables in $T_{y_0}$ are free from any variables used to fix
the value of circuits $C_x$. This is simply because $2m 2^n < 2^{n^2/2}$. Our plan is to code the results
of $C_x$ by a Boolean variable $z$ of the form $z = y_0 x w$, for some $w$. The key requirements are
that (i) the variable $z$ is assigned $*$ by the pseudorandom restriction, and (ii) it is easy to find
such $z$ (i.e., $w$) from a given $x$. (We may assume that the string $y_0$ and the seed for the chosen
pseudorandom sequence are remembered by being encoded in the target polynomial size circuit
$C_M$.)
Let $u_{\alpha,\beta}$ be a column vector of 0-1 uniform bits $(u^{(1)}_{\alpha,\beta}, u^{(2)}_{\alpha,\beta}, \ldots, u^{(L)}_{\alpha,\beta})^T$. Recall that in $D$'s
simulation of circuits $C_x$, a Boolean variable $z$ (of $C_x$) is assigned $*$ if and only if $a_{z,j} = 0$ for all
$j = 1, \ldots, L$. Hence, $z$ is assigned $*$ by a pseudorandom restriction if and only if
$\sum_{\alpha \in F} u_{\alpha, f_z(\alpha)} = 0$ in $\mathbb{Z}_2^L$. $y_0$ is determined by the pigeonhole principle, and depends on the source bits $u_{\alpha,\beta}$. We
also need to have plenty of $*$'s in the segment $T_{y_0}$. Since we cannot predetermine $y_0$, we demand
that all segments $T_y$ have plenty of $*$'s. So, we want our source bits $u_{\alpha,\beta}$ to satisfy the following
condition.
$$\forall y \in \{0,1\}^{n^2/2},\ \forall x \in \{0,1\}^n,\ \exists z = yxw \in \{0,1\}^m \Big[\sum_{\alpha \in F} u_{\alpha, f_z(\alpha)} = 0\Big]. \tag{1}$$
Furthermore, such a $w$ should be easy to compute from the source bits $u_{\alpha,\beta}$, and the given $y$
and $x$.
Recall that for any $z \in \{0,1\}^m$, $f_z$ is defined by the sequence of the coefficients $\langle c_\delta \cdots c_0 \rangle$
which concatenates to $z$. Let $\nu$ be the largest index such that the binary concatenation $\langle c_\delta \cdots c_\nu \rangle$
becomes longer than $n^2/2 + n$ bits, so $n^2/2 + n < |\langle c_\delta \cdots c_\nu \rangle| \le n^2/2 + n + K$. Then for any
$y \in \{0,1\}^{n^2/2}$ and $x \in \{0,1\}^n$, we have some subsequence of coefficients $c_\delta, \ldots, c_\nu$ such that
$y x 0^v = \langle c_\delta \cdots c_\nu \rangle$, with some $v$ for padding. Note that $\nu > 0$, since $m = n^k$ and $k > 2$. We
will show (see Claim 5 below) that with high probability a sequence of random source bits $u_{\alpha,\beta}$
satisfies the following.
$$\forall c_\delta \in F,\ \ldots,\ \forall c_\nu \in F,\ \exists c_0 \in F \Big[\sum_{\alpha \in F} u_{\alpha, f_z(\alpha)} = 0\Big], \tag{2}$$
where $z$ is a string in $\{0,1\}^m$ that is the concatenation $\langle c_\delta \cdots c_\nu 0 \cdots 0 c_0 \rangle$. Observe that this
condition (2) is sufficient for our requirement (1). Consider any $y \in \{0,1\}^{n^2/2}$ and $x \in \{0,1\}^n$,
and let $c_\delta, \ldots, c_\nu$ be the coefficients corresponding to $y x 0^v$. Then from (2), there exists some $c_0$
by which we can define $z = \langle c_\delta \cdots c_\nu 0 \cdots 0 c_0 \rangle$ satisfying the condition of (1). Furthermore, we
will show that we can easily find such $c_0$ (thus $z$), given $u_{\alpha,\beta}$ and $c_\delta, \ldots, c_\nu$, by checking all $q$
elements of $F$.
We now summarize our oracle construction. Choose any setting of the random bits $\omega = \{u_{\alpha,\beta}\}$,
such that it generates $(L + 1)M$ pseudorandom bits satisfying both $D = 1$ and (2); let $\hat\rho$
be the restriction induced by this pseudorandom sequence. We construct the segment $X^{=m}$ of
our oracle by $\hat\rho$ as follows. Below $z$ denotes a string in $\{0,1\}^m$ whose membership to $X$ has
not been determined yet in the construction. Let $X_{\rm fixed}$ (resp., $\overline{X}_{\rm fixed}$) be the set of strings
in $\{0,1\}^m$ whose membership to $X$ (resp., $\overline{X}$) has been determined. Initially, both $X_{\rm fixed}$ and
$\overline{X}_{\rm fixed}$ are empty. First fix the membership according to $\hat\rho$; that is, $z$ is put into $X_{\rm fixed}$ (resp.,
$\overline{X}_{\rm fixed}$) if and only if $\hat\rho$ sets 1 (resp., 0) to the corresponding variable. Secondly, choose a set
$Y \subseteq \{0,1\}^m - (X_{\rm fixed} \cup \overline{X}_{\rm fixed})$ of at most $2m 2^n$ strings such that adding $Y$ to $\overline{X}_{\rm fixed}$ determines
the value of circuits $C_x$ for all $x \in \{0,1\}^n$. This set $Y$ is guaranteed by $D = 1$. Add $Y$ to $\overline{X}_{\rm fixed}$.
Fix one $y_0$ such that $T_{y_0} \cap Y = \emptyset$. This $y_0$ exists by the pigeonhole principle. Then for any
$x \in \{0,1\}^n$, put any $z$ of the form $y_0 x w$ for some $w$ into $X_{\rm fixed}$ (resp., $\overline{X}_{\rm fixed}$) if and only if the
(already determined) value of $C_x$ is 1 (resp., 0). Then put all remaining $z$ into $\overline{X}_{\rm fixed}$.
Now we explain how to design a polynomial size circuit $C_M$ simulating $M^X$. We may assume
that the information on the seed $\omega$ (of length $(L+1)q^2 = n^{O(kd)}$) and $y_0$ are hardwired into the
circuit and they can be used in the computation. For a given input $x$, the circuit exhaustively
searches for $c_0 \in F$ satisfying the condition of (2) for the coefficients $c_\delta, \ldots, c_\nu$ corresponding to
$y_0 x 0^v$. Since the seed is given, for any $z = \langle c_\delta \cdots c_\nu 0 \cdots 0 c_0 \rangle$, one can compute
$\sum_{\alpha \in F} u_{\alpha, f_z(\alpha)}$
within polynomial time in $n$. Also the size of $F$ is $q = n^{O(kd)}$. Thus, the desired $c_0$ (and
hence, $z$) is computable in polynomial time. When $z$ is obtained, the circuit queries the oracle
whether "$z \in X$?" and accepts the input if and only if $z \in X$. It is easy to check that the
whole computation can be implemented by some circuit of size $n^{ckd}$ for some constant $c > 0$.
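The exhaustive search over the constant coefficient can be sketched as follows. This is an illustration under stated assumptions: the field is a small prime field GF(p) with integer arithmetic (the proof uses a field of characteristic 2), the seed is a nested list `seed[j][a][b]` holding one bit per layer and per pair of field elements, and `g_values[a]` is the value at `a` of the part of the polynomial determined by the hardwired prefix and the input. All names are hypothetical.

```python
def find_c0(seed, g_values, p):
    """Return some c0 in GF(p) such that, in every layer, the XOR over a of the
    seed bit at position (a, g(a) + c0) is 0; return None if no c0 works."""
    for c0 in range(p):
        if all(sum(layer[a][(g_values[a] + c0) % p] for a in range(p)) % 2 == 0
               for layer in seed):
            return c0
    return None

def zero_seed(L, p):
    """An all-zero seed with L layers of p * p bits (for demonstration only)."""
    return [[[0] * p for _ in range(p)] for _ in range(L)]
```

The loop performs at most `p` candidate checks, each summing `L * p` seed bits, which mirrors why the search stays polynomial when the field size is polynomial in `n`.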
We complete the proof by proving the following claim.
Claim 5 Over $q^2 L$ independent and uniform random bits $\{u^{(1)}_{\alpha,\beta}, \ldots, u^{(L)}_{\alpha,\beta} \mid \alpha, \beta \in F\}$, the
condition (2) holds with probability $1 - o(1)$.
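Proof. For any fixed $c_\delta, \ldots, c_\nu$, let $z(c)$ denote $\langle c_\delta \cdots c_\nu 0 \cdots 0 c \rangle$. Then $f_{z(c)}(\xi)$ is expressed as
$f_{z(c)}(\xi) = g(\xi) + c$, where the polynomial $g(\xi) = c_\delta \xi^\delta + \cdots + c_\nu \xi^\nu$ is independent of $c$.
Define $\tilde u_{\alpha,c} = u_{\alpha, g(\alpha) + c}$. Then since $\tilde u_{\alpha,c} = u_{\alpha, f_{z(c)}(\alpha)}$, the condition (2) can be stated as
$$\forall c_\delta,\ \ldots,\ \forall c_\nu,\ \exists c_0 \Big[\sum_{\alpha \in F} \tilde u_{\alpha, c_0} = 0\Big].$$
Notice that for any fixed $c_\delta, \ldots, c_\nu$, for any $\alpha$, $\alpha'$, $c$, and $c'$, the vectors $\tilde u_{\alpha,c}$ and $\tilde u_{\alpha',c'}$ consist
of disjoint sets of bits, unless $\alpha = \alpha'$ and $c = c'$. Hence, if $c \ne c'$, they are (probabilistically)
independent, from which the following bound follows: $\forall c_\delta, \ldots, c_\nu$,
$$\Pr\Big[\forall c_0 \Big[\sum_{\alpha \in F} \tilde u_{\alpha, c_0} \ne 0\Big]\Big] = \prod_{c \in F} \Pr\Big[\sum_{\alpha \in F} \tilde u_{\alpha, c} \ne 0\Big] = \Big(1 - \frac{1}{2^L}\Big)^q < e^{-\Omega(q/(20m)^d)},$$
where the probability is taken uniformly over all the bits $u^{(1)}_{\alpha,\beta}, \ldots, u^{(L)}_{\alpha,\beta}$, for all $\alpha, \beta \in F$. Then
the claim is proved as follows:
$$\Pr\Big[\forall c_\delta,\ \ldots,\ \forall c_\nu,\ \exists c_0 \Big[\sum_{\alpha \in F} \tilde u_{\alpha, c_0} = 0\Big]\Big] \ge 1 - 2^{n^2/2 + n + K} e^{-\Omega(q/(20m)^d)} = 1 - o(1).$$
□ (Claim 5)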
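The step from $(1 - 2^{-L})^q$ to an exponentially small bound rests on the elementary inequality $1 - x \le e^{-x}$. A small numerical sanity check, with hypothetical function names and floating point arithmetic, might look as follows.

```python
import math

def failure_probability(L, q):
    """(1 - 2^-L)^q: chance that none of q independent L-bit sums is all zero."""
    return (1 - 2.0 ** (-L)) ** q

def exponential_bound(L, q):
    """The bound e^(-q / 2^L) obtained from 1 - x <= e^(-x)."""
    return math.exp(-q / 2 ** L)
```

When $q$ is much larger than $2^L$, as in the proof where $q$ dominates $(20m)^d$, both quantities are tiny, which is what allows the union bound over all choices of the high coefficients.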
Remark 1: For convenience we assumed in the proof that $k > 2$ and $d \ge 7$. This is only
to simplify notations. Clearly $d \ge 7$ is unnecessary. We only need to forgo the estimate of
$2d + 6 \le 3d - 1$, and use $2d + 6$. Also any machine $M$ in $\Sigma_d^p$ for $d < 7$ can always be considered
in a higher level. Similarly, $k > 2$ is not necessary. If one traces through the proof, with slight
modification, any real number $k > 1$ is sufficient.
Remark 2: The final computation by the polynomial size circuit can be done in NC$^1$. We only
need to evaluate some arithmetic operations in the finite field $F$. It turns out that since elements
in $F$ are represented by $O(\log n)$ bits, the only step that really requires NC$^1$ is the parity sum
of $n^{O(1)}$ terms, when we evaluate the polynomial $f_z$.
Remark 3: Though the proof is stated for simulating one machine $M$, it is also possible to
construct a single oracle $X$ such that for every $d$ and $k$, and every $\Sigma_d^p$-machine $M$ running in
time $O(n^k)$, the language $L(M^X)$ can be recognized by some polynomial size circuit family with
stringent access to oracle $X$.
4 Some results on constant depth circuits
As lower bound results on constant depth circuits play a crucial role in this work, we take
this opportunity to present some unpublished older results of the first author on these circuits.
In particular we emphasize the decision tree viewpoint, and give some better constants in the
exponents than previously published lower bounds. We give a historical account at the end of
the section.
The decision tree perspective was first proposed in [Cai86] where a weaker version of the
following Lemma 7 was proved. The following proof essentially adapts the techniques from
[Hås86a].
We say a boolean function $G$ on variables $\{x_1, \ldots, x_n\}$ is a $t$-And-Or if $G = G_1 \wedge G_2 \wedge \cdots \wedge G_w$,
where each $G_i$ is the Or of at most $t$ literals (a literal is a variable or its complement). Similarly,
we say $G$ is a $t$-Or-And if $G = G_1 \vee G_2 \vee \cdots \vee G_w$, where each $G_i$ is the And of at most $t$ literals.
A restriction is a partial assignment of some of the variables to $\{0,1\}$. More formally, it is
a map $\rho$ from the set $\{1, 2, \ldots, n\}$ to the set $\{0, 1, *\}$. The restriction of $G$ by $\rho$, denoted by
$G|_\rho$, is the boolean function obtained by setting $x_i$ to be $\rho(i)$ if $\rho(i) \in \{0,1\}$ and leaving $x_i$ as a
variable otherwise. A random $p$-restriction is a restriction $\rho$ picked by independently assigning
$\rho(i) = *$ with probability $p$ and each of 0 and 1 with probability $(1 - p)/2$.
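These definitions translate directly into a short sketch. The representation is an assumption made for illustration: a $t$-And-Or is a list of clauses, a literal is a pair `(index, positive)`, and `'*'` marks an unassigned variable. The simplification is purely syntactic, so it may miss cases where the restricted function is constant for semantic reasons.

```python
import random

STAR = "*"

def random_restriction(n, p, rng):
    """rho(i) = '*' with probability p, and 0 or 1 with probability (1 - p) / 2 each."""
    rho = []
    for _ in range(n):
        r = rng.random()
        rho.append(STAR if r < p else (0 if r < p + (1 - p) / 2 else 1))
    return rho

def restrict_and_or(clauses, rho):
    """Restrict an And of Ors by rho.

    Returns True or False if the formula syntactically becomes constant,
    otherwise the list of remaining (shortened) clauses.
    """
    remaining = []
    for clause in clauses:
        kept, satisfied = [], False
        for index, positive in clause:
            value = rho[index]
            if value == STAR:
                kept.append((index, positive))
            elif (value == 1) == positive:
                satisfied = True      # this Or is already true
                break
        if satisfied:
            continue
        if not kept:
            return False              # an Or with every literal falsified
        remaining.append(kept)
    return remaining if remaining else True
```

For example, with `clauses = [[(0, True), (1, False)], [(2, True)]]` and `rho = ['*', 1, 1]`, the second clause is satisfied and the first shrinks to the single literal `x0`.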
Lemma 7 Let $G$ be a $t$-And-Or formula $G_1 \wedge G_2 \wedge \cdots \wedge G_w$. Let $\rho$ be a random $p$-restriction.
Then, for all $\Delta \ge 0$,
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta\,] \le (5pt)^\Delta. \tag{3}$$
Proof. The Lemma is proved by an induction on $w$. Concerning $G_1$, immediately there are 2
cases, either $G_1|_\rho \equiv 1$ or $G_1|_\rho \not\equiv 1$. By renaming literals, we may assume $G_1 = \bigvee_{i \in T} x_i$. Then
$G_1|_\rho \equiv 1$ is equivalent to $\rho(i) = 1$ for some $i \in T$. If $G_1|_\rho \equiv 1$, we want to prove that the
conditional probability that the rest of $G$ has $\mathrm{DC}(G|_\rho) \ge \Delta$ is no larger. If however $G_1|_\rho \not\equiv 1$,
we want to carefully analyze what happens to the variables in $T$. All of this will accumulate
as some prior condition on $\rho$. It will be seen that the inductive step will carry a condition that
refers to some collection of subsets of variables on each of which $\rho$ has assigned some variable of
it in some definite way. In the earlier proof of Yao [Yao85], as well as in the proof of Cai [Cai86],
these conditions are explicitly carried along in the proof. The following device used in Håstad's
proof [Hås86a] is more elegant.
One makes the stronger claim, that for any boolean function $F$, we have
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1\,] \le \gamma^\Delta, \tag{4}$$
where $\gamma$ will be set to $5pt$, and we agree that the conditional probability is 0 if the condition is
not satisfied. The Lemma follows from (4) by taking $F$ to be the constant function 1.
The statement (4) is trivially true for $\Delta = 0$, since the RHS becomes 1 in this case. Similarly,
if $\gamma \ge 1$, then the statement is true. Thus, we may assume $\Delta > 0$ and $\gamma < 1$.
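We prove (4) by induction on $w$. If $w = 0$, then $G \equiv 1$ by definition and the statement
holds since the LHS is 0. Let $w > 0$. Put $G = G_1 \wedge G'$, where $G' = G_2 \wedge \cdots \wedge G_w$. Now, either
$G_1|_\rho \equiv 1$ or $G_1|_\rho \not\equiv 1$. If $G_1|_\rho \equiv 1$, then we have, by induction
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1, G_1|_\rho \equiv 1\,] = \Pr[\,\mathrm{DC}(G'|_\rho) \ge \Delta \mid (F \wedge G_1)|_\rho \equiv 1\,] \le \gamma^\Delta.$$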
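Now consider the case $G_1|_\rho \not\equiv 1$. We want to prove
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1, G_1|_\rho \not\equiv 1\,] \le \gamma^\Delta \tag{5}$$
as well. We have renamed the variables so that $G_1 = \bigvee_{i \in T} x_i$. Then $G_1|_\rho \not\equiv 1$ means that for
each $i \in T$, $\rho(i) = 0$ or $*$. Moreover, since $\Delta > 0$, it cannot be that $\rho(i) = 0$ for all $i \in T$, or
else $G_1|_\rho \equiv 0$, and $\mathrm{DC}(G|_\rho) = 0$. Thus, the set of restrictions $\rho$ such that $F|_\rho \equiv 1$, $G_1|_\rho \not\equiv 1$ and
$\mathrm{DC}(G|_\rho) \ge \Delta$ is contained in
$$\bigcup_{\emptyset \ne Y \subseteq T} \{\rho : \rho(Y) = *,\ \rho(T - Y) = 0,\ F|_\rho \equiv 1,\ \mathrm{DC}(G|_\rho) \ge \Delta\}.$$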
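Suppose $\rho(Y) = *$ and $\rho(T - Y) = 0$, for some $\emptyset \ne Y \subseteq T$.
First we assume $|Y| < \Delta$. Then there must be some assignment $\sigma_Y : Y \to \{0,1\}$, with
$\sigma_Y \ne 0_Y$, where we denote by $0_Y$ the all 0 assignment on $Y$, such that $\mathrm{DC}(G|_\rho|_{\sigma_Y}) \ge \Delta - |Y|$.
For otherwise, one could obtain some decision tree of depth $< \Delta$ for $G|_\rho$ by first asking all the
variables in $Y$. Note that such a $\sigma_Y \ne 0_Y$ because the all 0 assignment leads to $G_1|_\rho|_{0_Y} \equiv 0$.
For $\sigma_Y \ne 0_Y$, $G_1|_\rho|_{\sigma_Y} \equiv 1$, so that $G|_\rho|_{\sigma_Y} \equiv G'|_\rho|_{\sigma_Y}$. Then
$$\begin{aligned}
&\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1 \wedge \rho(Y) = * \wedge \rho(T - Y) = 0\,] \\
&\quad\le \sum_{\substack{\sigma_Y : Y \to \{0,1\} \\ \sigma_Y \ne 0_Y}} \Pr[\,\mathrm{DC}(G'|_\rho|_{\sigma_Y}) \ge \Delta - |Y| \mid F|_\rho \equiv 1 \wedge \rho(Y) = * \wedge \rho(T - Y) = 0\,].
\end{aligned} \tag{6}$$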
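Set
$0_{T-Y}$ = the all 0 assignment on $T - Y$,
$\tilde F = \bigwedge_{\sigma_Y : Y \to \{0,1\}} F|_{0_{T-Y}}|_{\sigma_Y}$, and
$\tilde\rho$ = $\rho$ restricted to the complement of $T$;
then under the condition $\rho(Y) = * \wedge \rho(T - Y) = 0$ we have
$$F|_\rho \equiv 1 \iff \tilde F|_{\tilde\rho} \equiv 1.$$
Hence, the sum in (6) has the upper bound
$$\sum_{\substack{\sigma_Y : Y \to \{0,1\} \\ \sigma_Y \ne 0_Y}} \Pr[\,\mathrm{DC}(G'|_{0_{T-Y}}|_{\sigma_Y}|_{\tilde\rho}) \ge \Delta - |Y| \mid \tilde F|_{\tilde\rho} \equiv 1\,] \le (2^{|Y|} - 1)\gamma^{\Delta - |Y|}, \tag{7}$$
by induction.
The upper bound (7) holds for
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1 \wedge \rho(Y) = * \wedge \rho(T - Y) = 0\,] \tag{8}$$
for all $Y \ne \emptyset$ with $|Y| < \Delta$. However, for $|Y| \ge \Delta$, the bound in (7) holds trivially for a
probability (8), since in this case the bound in (7) is $\ge 1$, as $|Y| \ge \Delta > 0$ and $\gamma < 1$. Hence in
fact it holds for all $Y \ne \emptyset$.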
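Let
$$a_Y = \Pr[\,\rho(Y) = * \wedge \rho(T - Y) = 0 \mid F|_\rho \equiv 1 \wedge G_1|_\rho \not\equiv 1\,],$$
$$b_Y = \Pr[\,\rho(Y) = * \mid F|_\rho \equiv 1 \wedge G_1|_\rho \not\equiv 1\,].$$
Then
$$b_Y = \sum_{Y \subseteq Z \subseteq T} a_Z,$$
and by the Möbius Inversion Formula,
$$a_Y = \sum_{Y \subseteq Z \subseteq T} (-1)^{|Z - Y|} b_Z.$$
It follows that
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1, G_1|_\rho \not\equiv 1\,] \le \sum_{\emptyset \ne Y \subseteq T} a_Y (2^{|Y|} - 1)\gamma^{\Delta - |Y|} = \sum_{Y \subseteq T} a_Y (2^{|Y|} - 1)\gamma^{\Delta - |Y|}.$$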
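Substituting $b_Z$ for $a_Y$, we have
$$\begin{aligned}
\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1, G_1|_\rho \not\equiv 1\,]
&\le \sum_{Y \subseteq T} \sum_{Y \subseteq Z \subseteq T} (-1)^{|Z - Y|} b_Z (2^{|Y|} - 1)\gamma^{\Delta - |Y|} \\
&= \sum_{Z \subseteq T} b_Z \sum_{Y \subseteq Z} (-1)^{|Z - Y|} (2^{|Y|} - 1)\gamma^{\Delta - |Y|} \\
&= \gamma^\Delta \sum_{Z \subseteq T} b_Z (-1)^{|Z|} \sum_{Y \subseteq Z} \Big[\Big(-\frac{2}{\gamma}\Big)^{|Y|} - \Big(-\frac{1}{\gamma}\Big)^{|Y|}\Big] \\
&= \gamma^\Delta \sum_{Z \subseteq T} b_Z (-1)^{|Z|} \Big[\Big(1 - \frac{2}{\gamma}\Big)^{|Z|} - \Big(1 - \frac{1}{\gamma}\Big)^{|Z|}\Big] \\
&= \gamma^\Delta \sum_{Z \subseteq T} b_Z \Big[\Big(\frac{2}{\gamma} - 1\Big)^{|Z|} - \Big(\frac{1}{\gamma} - 1\Big)^{|Z|}\Big].
\end{aligned}$$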
Concerning $b_Z$, intuitively, under the condition that $F|_\rho \equiv 1 \wedge G_1|_\rho \not\equiv 1$, the probability of
$\rho(Z) = *$ is at most $q^{|Z|}$, where $q = p/(p + \frac{1-p}{2}) \le 2p$, i.e., $b_Z \le q^{|Z|}$. We already saw that
$G_1|_\rho \not\equiv 1$ means that each variable in $Z$ is assigned either 0 or $*$. The additional condition that
$F|_\rho \equiv 1$ can only decrease the probability that some variable is assigned a $*$. We will argue this
point more carefully. For the moment, we accept the upper bound $b_Z \le q^{|Z|}$.
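Then, since the coefficients of $b_Z$ are non-negative, we have
$$\begin{aligned}
\gamma^\Delta \sum_{Z \subseteq T} b_Z \Big[\Big(\frac{2}{\gamma} - 1\Big)^{|Z|} - \Big(\frac{1}{\gamma} - 1\Big)^{|Z|}\Big]
&\le \gamma^\Delta \sum_{Z \subseteq T} q^{|Z|} \Big[\Big(\frac{2}{\gamma} - 1\Big)^{|Z|} - \Big(\frac{1}{\gamma} - 1\Big)^{|Z|}\Big] \\
&= \gamma^\Delta \Big\{\Big(1 + q\Big(\frac{2}{\gamma} - 1\Big)\Big)^{|T|} - \Big(1 + q\Big(\frac{1}{\gamma} - 1\Big)\Big)^{|T|}\Big\} \\
&\le \gamma^\Delta \Big\{\Big(1 - q + \frac{2q}{\gamma}\Big)^{t} - \Big(1 - q + \frac{q}{\gamma}\Big)^{t}\Big\}.
\end{aligned} \tag{9}$$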
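At this point, we can recover the bound $(5pt)^\Delta$ as follows [Hås86a]. Observe that
$$\Big(1 - q + \frac{2q}{\gamma}\Big)^{t} - \Big(1 - q + \frac{q}{\gamma}\Big)^{t} \le \Big(1 + \frac{2q}{\gamma}\Big)^{t} - \Big(1 + \frac{q}{\gamma}\Big)^{t}. \tag{10}$$
If we set $c = 1/\ln \phi \approx 2.078$, where $\phi = \frac{1 + \sqrt{5}}{2} \approx 1.618$ is the golden ratio, then we have
$e^{2/c} - e^{1/c} = 1$. Then setting $\gamma = cqt < 5pt$, we get
$$\Big(1 + \frac{2q}{\gamma}\Big)^{t} - \Big(1 + \frac{q}{\gamma}\Big)^{t} = \Big(1 + \frac{2}{ct}\Big)^{t} - \Big(1 + \frac{1}{ct}\Big)^{t} < e^{2/c} - e^{1/c} = 1.$$
Then
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1, G_1|_\rho \not\equiv 1\,] < \gamma^\Delta.$$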
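The constants in this step can be checked numerically. The sketch below (hypothetical names, floating point arithmetic, natural logarithm as in the value 2.078) confirms that $c = 1/\ln\phi$ satisfies $e^{2/c} - e^{1/c} = 1$ and that $2c < 5$, which is what gives $\gamma = cqt \le 2c \cdot pt < 5pt$.

```python
import math

PHI = (1 + math.sqrt(5)) / 2   # golden ratio
C = 1 / math.log(PHI)          # c = 1 / ln(phi)

def switching_gap(t, c=C):
    """(1 + 2/(ct))^t - (1 + 1/(ct))^t, the quantity bounded by 1 in the text."""
    return (1 + 2 / (c * t)) ** t - (1 + 1 / (c * t)) ** t
```

Since $e^{1/c} = \phi$ and $\phi^2 - \phi = 1$, the identity is exact; `switching_gap(t)` stays below 1 for the small values of `t` tried in the test.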
This completes the proof of
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta \mid F|_\rho \equiv 1\,] < (5pt)^\Delta.$$
Finally, we show that $b_Z \le q^{|Z|}$. Note that for $Z \subseteq T$, we have
$$\Pr[\,\rho(Z) = * \mid G_1|_\rho \not\equiv 1\,] = q^{|Z|}.$$
This is because $G_1|_\rho \not\equiv 1$ is the same as $\rho$ assigning only $*$ or 0 on $T$.
We show that $F|_\rho \equiv 1$ cannot increase the probability of $\rho(Z) = *$. This is trivial if $Z = \emptyset$.
Suppose $Z \ne \emptyset$. Consider any fixed restriction $\rho'$ on the complement of $Z$, $\rho' : Z^c \to \{0, 1, *\}$.
Then, there is a unique extension of $\rho'$ over $Z$, call it $\rho^*$, that satisfies $\rho^*(Z) = *$.
We claim that
$$\Pr[\,\rho(Z) = * \mid F|_\rho \equiv 1, G_1|_\rho \not\equiv 1, \rho|_{Z^c} = \rho'\,] \le q^{|Z|}.$$
The event $\rho(Z) = *$ refers to the unique $\rho^*$, under the condition $\rho|_{Z^c} = \rho'$. If $F|_{\rho^*} \not\equiv 1$, then
the above conditional probability is 0 and the claim trivially holds. Otherwise, $F|_\rho \equiv 1$ for all
extensions $\rho$ of $\rho'$ to $Z$. Hence $F|_\rho \equiv 1$, $G_1|_\rho \not\equiv 1$, $\rho|_{Z^c} = \rho'$ refers to exactly $2^{|Z|}$ assignments $\rho$,
such that $\rho(i) \in \{0, *\}$ for all $i \in Z$. The claim follows. Lemma 7 is proved. □
If we take $p = \frac{1}{10t}$, then we get the following bound: For any $G$ as in Lemma 7, and for all
$\Delta \ge 0$,
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta\,] \le 2^{-\Delta}. \tag{11}$$
Using this bound as the base case, we can inductively prove Lemma 4.
On the other hand, it is possible to obtain a slightly stronger bound from (9). In fact the
use of the inclusion-exclusion formula has been ignored in (10). In the following, we will show
this slightly stronger bound.
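We will set $q = \beta/t$ for some constant $\beta > 0$, to be determined later. Set
$$\gamma = \beta \Big/ \ln\Big[\frac{1 + \sqrt{1 + 4e^\beta}}{2}\Big].$$
Then
$$e^{2\beta/\gamma} - e^{\beta/\gamma} = e^\beta.$$
It follows that
$$\Big(1 - q + \frac{2q}{\gamma}\Big)^{t} - \Big(1 - q + \frac{q}{\gamma}\Big)^{t} = \Big(1 + \Big(\frac{2\beta}{\gamma} - \beta\Big)\frac{1}{t}\Big)^{t} - \Big(1 + \Big(\frac{\beta}{\gamma} - \beta\Big)\frac{1}{t}\Big)^{t} < e^{\frac{2\beta}{\gamma} - \beta} - e^{\frac{\beta}{\gamma} - \beta} = 1.$$
Replacing the analysis after (9) in the above proof, we obtain the following lemma.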
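Lemma 8 Let $G$ be a $t$-And-Or formula $G_1 \wedge G_2 \wedge \cdots \wedge G_w$. For any $\beta$, $0 < \beta < t$, let $\rho$ be
a random $p$-restriction, where $p = \frac{\beta}{t}$, and let $\gamma = \beta / \ln\Big[\frac{1 + \sqrt{1 + 4e^\beta}}{2}\Big]$. Then for all $\Delta \ge 0$, we
have
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta\,] \le \gamma^\Delta.$$
Minimizing $\gamma^\beta$ we find at $\beta_0 = 0.227537$, $\gamma_0 = \gamma(\beta_0) \approx 2^{-1.2638031} \approx 0.4164447$. Let $\alpha_0 =
\beta_0/2 \approx 0.1137685$. Then we have the following bound. This is a strengthening of (11).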
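Lemma 9 Let $G$ be a $t$-And-Or formula $G_1 \wedge G_2 \wedge \cdots \wedge G_w$, and let $\rho$ be a random $\alpha_0/t$-restriction.
Then for all $\Delta \ge 0$, we have
$$\Pr[\,\mathrm{DC}(G|_\rho) \ge \Delta\,] \le \gamma_0^\Delta.$$
Proof. Let $q = \beta_0/t$ and $p = \frac{q}{2 - q}$. Then $q = \frac{2p}{1 + p}$ is the probability of getting a $*$, conditioned on
getting a 0 or a $*$, in a random $p$-restriction.
We have shown that
$$\Pr[\,\mathrm{DC}(G|_{\rho'}) \ge \Delta\,] \le \gamma_0^\Delta,$$
where $\rho'$ is a random $p$-restriction.
Since $p > q/2 = \alpha_0/t$, a random $\alpha_0/t$-restriction $\rho$ can be realized by first applying a random
$p$-restriction $\rho'$, followed by an $\alpha_0/(pt)$-restriction. Note that if $\mathrm{DC}(G|_{\rho'}) < \Delta$ then $\mathrm{DC}(G|_\rho) < \Delta$.
The Lemma follows. □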
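The numerical constants of this section can be reproduced with a few lines. In the sketch below, `ALPHA0` is a name chosen here for $\beta_0/2$, and `exponent_rate` reflects one reading of the optimization, namely that $\beta_0$ (approximately) maximizes $(\beta/2)\log_2(1/\gamma(\beta))$; that reading is an assumption, so the code only checks that $\beta_0$ beats nearby values rather than asserting an exact optimum.

```python
import math

def gamma(beta):
    """gamma(beta) = beta / ln((1 + sqrt(1 + 4 e^beta)) / 2)."""
    return beta / math.log((1 + math.sqrt(1 + 4 * math.exp(beta))) / 2)

BETA0 = 0.227537
GAMMA0 = gamma(BETA0)
ALPHA0 = BETA0 / 2   # hypothetical name for the constant beta_0 / 2

def exponent_rate(beta):
    """(beta / 2) * log2(1 / gamma(beta)), the rate in the exponent of the size bounds."""
    return (beta / 2) * -math.log2(gamma(beta))
```

Evaluating these gives $\gamma_0 \approx 0.4164447$, $\log_2 \gamma_0 \approx -1.2638031$, and a rate of about $0.143781$, matching the constants quoted in the text; the same $\alpha_0$ gives $(1 - \alpha_0)^2/2 \approx 0.3927$.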
Now consider general constant depth circuits. Denote by $C_d(s,t)$ the class of depth $d$ circuits
with bfi $\le t$ (see footnote 2), and the number of gates above the first level $\le s$. Denote by $C_d(s)$ the class
of depth $d$ circuits without a bfi condition but with total size $\le s$. By extending one level
with fan-in 1, clearly $C_d(s) = C_{d+1}(s,1)$. (Here in this notation we suppress the number $n$ of
variables and the depth $d$, where $s$ and $t$ are understood to be functions of one or both of them.)
Lemma 10 For all $C \in C_d(s, \alpha_0 n^{1/d})$, we have
$$\Pr[\,\mathrm{DC}(C|_\rho) \ge \alpha_0 n^{1/d}\,] \le s \cdot \gamma_0^{\alpha_0 n^{1/d}} \le s \cdot 2^{-0.143781\, n^{1/d}},$$
where $\rho$ is a random $1/n^{\frac{d-1}{d}}$-restriction.
Proof. Apply Lemma 9 repeatedly $d - 1$ times, each time with a random $1/n^{\frac{1}{d}}$-restriction. Note
that any function with decision tree depth $\le \Delta$ can be expressed both as a $\Delta$-And-Or as well as
a $\Delta$-Or-And. After switching bottom level And-Or formulas to Or-And's, or vice versa, one can
merge two successive levels of gates and reduce the depth by 1. Then the lemma follows. □
Let $C \in C_d(s)$ with no bfi requirement. By considering $C \in C_{d+1}(s,1)$ we may first apply
Lemma 9 to each of the bottom depth 2 subcircuits with bfi 1, with a random $\alpha_0$-restriction. But
we can actually do slightly better by looking at it directly.
Footnote 2: bfi is the abbreviation of bottom fan-in, the maximum fan-in of the bottom level gates. By a "bfi condition"
we mean a bound of the form bfi $\le t$ that is given in each context.
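Fix any 1-Or-And formula $S$. (The case with any 1-And-Or is dual.) $S$ is just a simple Or; by renaming variables, we may assume $S = \bigvee_{i=1}^{m} x_i$. Fix any $\Delta > 0$. If we apply a random $p$-restriction $\rho$, and if $\rho$ assigns any $x_i = 1$, or if $\rho$ assigns all $x_i$ to 0 or $*$ but fewer than $\Delta$ of them are assigned $*$, then $DC(S|_\rho) < \Delta$. Thus
\[
\Pr[\, DC(S|_\rho) \ge \Delta \,]
\le \sum_{J \subseteq \{1,\ldots,m\},\ |J| \ge \Delta} \Pr[\, \rho(J) = *,\ \rho(J^c) = 0 \,]
= \left(\frac{1+p}{2}\right)^{m} \sum_{j=\Delta}^{m} \binom{m}{j} q^{j} (1-q)^{m-j},
\]
where $\Pr[\rho(i) \ne 1] = p + \frac{1-p}{2} = \frac{1+p}{2}$, and $q = \Pr[\rho(i) = * \mid \rho(i) \ne 1] = \frac{2p}{1+p}$. Hence
\[
\Pr[\, DC(S|_\rho) \ge \Delta \,]
\le q^{\Delta} \left(\frac{1+p}{2}\right)^{m} \sum_{i=0}^{m} \binom{m}{i} (1-q)^{m-i}
= q^{\Delta} < (2p)^{\Delta}.
\]
Here the last sum equals $(2-q)^m = (2/(1+p))^m$, which cancels the factor $((1+p)/2)^m$, and $q < 2p$.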
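The bound for a single Or can be checked by direct computation. The sketch below assumes the standard fact that an Or of $k$ free variables has decision tree depth exactly $k$, so the event in question is "no variable is set to 1 and at least $\Delta$ variables stay free". The function names and the sample triples $(m, p, \Delta)$ are illustrative choices, not values from the paper.

```python
from math import comb

def prob_dc_at_least(m, p, delta):
    """Pr[DC(S|rho) >= delta] for S = x_1 OR ... OR x_m under a random
    p-restriction: no variable is 1 and at least delta variables are free."""
    zero = (1 - p) / 2
    return sum(comb(m, j) * p**j * zero**(m - j) for j in range(delta, m + 1))

def binomial_form(m, p, delta):
    """The same quantity as ((1+p)/2)^m * sum_j C(m,j) q^j (1-q)^(m-j)."""
    q = 2 * p / (1 + p)
    tail = sum(comb(m, j) * q**j * (1 - q)**(m - j) for j in range(delta, m + 1))
    return ((1 + p) / 2)**m * tail

# Illustrative parameters only.
for m, p, delta in [(10, 0.2, 3), (25, 0.1, 4), (6, 0.4, 6)]:
    q = 2 * p / (1 + p)
    exact = prob_dc_at_least(m, p, delta)
    assert abs(exact - binomial_form(m, p, delta)) < 1e-12
    assert exact <= q**delta < (2 * p)**delta
```

In these examples the true probability is far below $q^{\Delta}$, since the factor $((1+p)/2)^m$ alone is already small when $m$ is large.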
So if we first apply a random restriction with $p = \alpha_0/2 \approx 0.2082223$, with probability $> 1 - s_1 \alpha_0^{\gamma_0 n^{1/d}}$, all bottom level 1-Or-And subcircuits are switched to $\gamma_0 n^{1/d}$-And-Or (or all 1-And-Or switched to $\gamma_0 n^{1/d}$-Or-And), where $s_1$ is the total number of level 1 gates in the depth $d$ circuit $C$, which are the depth 2 gates in the depth $d+1$ circuit with bfi 1. After the switching we get a circuit of depth $d+1$ with bfi $\le \gamma_0 n^{1/d}$, but with the same type of gates on the 2 levels just above the bottom level gates. After merging these two levels, we get a circuit in $C_d(s', \gamma_0 n^{1/d})$, where $s' = s - s_1$. Now we apply Lemma 10. This gives the following bound.
Lemma 11 For all $C \in C_d(s)$, we have
\[
\Pr[\, DC(C|_\rho) \ge \gamma_0 n^{1/d} \,] < s \cdot \alpha_0^{\gamma_0 n^{1/d}} \le s \cdot 2^{-0.143781\, n^{1/d}},
\]
where $\rho$ is a random $\alpha_0/(2 n^{\frac{d-1}{d}})$-restriction.
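The two quantities in Lemma 11 come from adding up the two stages of the argument. A short worked step, written with the assumed symbol names $\alpha_0$, $\gamma_0$, $s_1$ and $s' = s - s_1$ for the constants and gate counts discussed above:

```latex
\begin{align*}
\Pr[\text{some switching step fails}]
  &< s_1\,\alpha_0^{\gamma_0 n^{1/d}} + (s - s_1)\,\alpha_0^{\gamma_0 n^{1/d}}
   = s\,\alpha_0^{\gamma_0 n^{1/d}}, \\
\Pr[\text{a given variable stays unassigned}]
  &= \frac{\alpha_0}{2}\cdot\frac{1}{n^{(d-1)/d}}
   = \frac{\alpha_0}{2\,n^{(d-1)/d}} .
\end{align*}
```

The first term of the sum is the failure probability of the initial $\alpha_0/2$-restriction on the $s_1$ bottom gates, and the second is the bound of Lemma 10 applied to the merged circuit with $s - s_1$ gates.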
These results can be used to prove circuit lower bounds for such circuits. Consider any circuit $C$ in $C_d(s, \gamma_0 n^{1/(d-1)})$. Applying $d-2$ rounds of random $1/n^{1/(d-1)}$-restrictions, with probability $> 1 - s\, 2^{-0.143781\, n^{1/(d-1)}}$ we get a circuit in $C_2(1, \gamma_0 n^{1/(d-1)})$ after switching and merging. The number of variables $N$ left has expectation $\mathrm{Exp}[N] = n^{1/(d-1)}$. By the Chernoff bound, we have
\[
\Pr[\, N \le \gamma_0 n^{\frac{1}{d-1}} \,]
= \Pr[\, N - n^{\frac{1}{d-1}} \le -(1-\gamma_0)\, n^{\frac{1}{d-1}} \,]
< e^{-\frac{(1-\gamma_0)^2}{2} n^{\frac{1}{d-1}}}
< e^{-0.3927\, n^{\frac{1}{d-1}}}.
\]
Hence, if $s < 2^{0.143781\, n^{1/(d-1)}}$, the probability is approaching 1 that both $C$ is reduced to a circuit in $C_2(1, \gamma_0 n^{1/(d-1)})$ and $N > \gamma_0 n^{1/(d-1)}$. Therefore $C$ does not compute the parity.
Lemma 12 For all $C \in C_d(s, \gamma_0 n^{1/(d-1)})$, if $C$ computes the parity function, then its size $s$ must satisfy
\[
s \ge 2^{0.143781\, n^{1/(d-1)}}.
\]
Let $C \in C_d(s)$ with no bfi requirements. As in the proof of Lemma 11 we will separate the bottom level gates from the rest. Thus we first apply a random $\alpha_0/2$-restriction followed by $d-2$ rounds of random $(n^{1/(d-1)})^{-1}$-restrictions. Thus altogether we applied a random $\alpha_0 (2 n^{\frac{d-2}{d-1}})^{-1}$-restriction, and with probability $> 1 - (s-1)\, 2^{-0.143781\, n^{1/(d-1)}}$ we end up with a circuit in $C_2(1, \gamma_0 n^{1/(d-1)})$. By the Chernoff bound again, if $N$ is the number of variables left, then $\mathrm{Exp}[N] = \alpha_0 n^{1/(d-1)}/2$, and therefore
\[
\Pr[\, N \le \gamma_0 n^{\frac{1}{d-1}} \,]
= \Pr\left[\, N - \frac{\alpha_0 n^{1/(d-1)}}{2} \le -\left(\frac{\alpha_0}{2} - \gamma_0\right) n^{\frac{1}{d-1}} \,\right]
< e^{-0.021423\, n^{\frac{1}{d-1}}}.
\]
Hence, if $s < 2^{0.143781\, n^{1/(d-1)}}$, $C$ does not compute the parity.
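The two Chernoff exponents used above (0.3927 with the bfi condition, 0.021423 without it) can be recomputed as follows. This is a sketch under the assumption that both estimates use the form $\Pr[N \le (1-\delta)\mathrm{Exp}[N]] < \exp(-\delta^2 \mathrm{Exp}[N]/2)$; the function and variable names are illustrative.

```python
def chernoff_lower_tail_exponent(mean_coeff, threshold_coeff):
    """For Exp[N] = mean_coeff * M, return c such that
    Pr[N <= threshold_coeff * M] < exp(-c * M), using
    Pr[N <= (1 - delta) Exp[N]] < exp(-delta^2 Exp[N] / 2)."""
    delta = 1 - threshold_coeff / mean_coeff
    return delta**2 * mean_coeff / 2

# Decimal values quoted in the text; the names are assumed labels.
gamma0 = 0.1137685
half_alpha0 = 0.2082223

# With the bfi condition: Exp[N] = M, threshold gamma0 * M.
with_bfi = chernoff_lower_tail_exponent(1.0, gamma0)
# Without it: Exp[N] = (alpha0/2) * M, threshold gamma0 * M.
without_bfi = chernoff_lower_tail_exponent(half_alpha0, gamma0)

print(f"{with_bfi:.4f} {without_bfi:.6f}")
```

Here $M$ stands for $n^{1/(d-1)}$. The second exponent is much smaller because the expected number of surviving variables is only about twice the threshold.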
Theorem 13 For all $C \in C_d(s)$, if $C$ computes the parity function, then its size $s$ must satisfy
\[
s \ge 2^{0.143781\, n^{1/(d-1)}}.
\]
Now we consider the inapproximability type lower bound. The decision tree depth lower
bound is ideally suited for deriving the inapproximability type lower bound, and the decision
tree perspective was introduced precisely for this reason.
Denote for the rest of this section $m = n^{1/d}$. Let $C$ be a depth $d$ circuit. Note that after some restriction $\rho$, if $C$ is reduced to a decision tree of depth smaller than the number of variables left, then for exactly half of the 0-1 extensions of $\rho$, $C$ agrees with parity. This is because at every leaf of the decision tree, the circuit $C$ is completely determined, while at least one variable remains unassigned. (This property was called monochromaticity in [Cai86].)
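The "exactly half" property can be confirmed by brute force on small instances. The following sketch builds random decision trees of depth less than the number of variables and counts agreements with parity; the tree encoding (nested tuples) and all function names are choices made for this illustration only.

```python
import itertools
import random

def random_decision_tree(variables, depth, rng):
    """Random decision tree of depth at most `depth` over the given variable
    indices. A leaf is 0 or 1; an inner node is (var, low, high)."""
    if depth == 0 or not variables or rng.random() < 0.2:
        return rng.randint(0, 1)
    var = rng.choice(variables)
    rest = [v for v in variables if v != var]
    return (var,
            random_decision_tree(rest, depth - 1, rng),
            random_decision_tree(rest, depth - 1, rng))

def evaluate(tree, x):
    """Follow the queries of the tree on input x until a leaf is reached."""
    while isinstance(tree, tuple):
        var, low, high = tree
        tree = high if x[var] else low
    return tree

def agreements_with_parity(tree, n):
    """Number of inputs in {0,1}^n on which the tree equals the parity of x."""
    return sum(evaluate(tree, x) == sum(x) % 2
               for x in itertools.product((0, 1), repeat=n))

rng = random.Random(0)
n = 6
for _ in range(50):
    tree = random_decision_tree(list(range(n)), n - 1, rng)
    # Depth < n, so each leaf leaves a free variable and parity is balanced there.
    assert agreements_with_parity(tree, n) == 2 ** (n - 1)
```

Each leaf fixes fewer than $n$ variables, so on the subcube reaching that leaf the tree is constant while parity takes each value equally often.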
Consider $\Pr[\, C(x_1,\ldots,x_n) = \oplus(x_1,\ldots,x_n) \,]$, where $\oplus(x_1,\ldots,x_n)$ denotes the parity function, and the probability is over all $2^n$ assignments. This can be evaluated by first assigning any random restriction, followed by an unbiased 0-1 assignment for all the remaining variables. Let $E = E_1 \wedge E_2$, where $E_1$ denotes the event that after the random restriction, we end up with a decision tree of depth at most $t$, and $E_2$ denotes the event that the number of variables $N$ assigned to $*$ is more than $t$, where $t$ will be specified later as $O(m)$. Let $[C = \oplus]$ denote $[C(x_1,\ldots,x_n) = \oplus(x_1,\ldots,x_n)]$ as a shorthand.
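Note first
\begin{align*}
\Pr[C = \oplus] &= \Pr[E] \cdot \Pr[C = \oplus \mid E] + \Pr[\neg E] \cdot \Pr[C = \oplus \mid \neg E] \\
&= \Pr[C = \oplus \mid E] + \Pr[\neg E]\,(\Pr[C = \oplus \mid \neg E] - \Pr[C = \oplus \mid E]).
\end{align*}
As we noted $\Pr[C = \oplus \mid E] = 1/2$. Then
\[
\left| \Pr[C = \oplus] - \frac{1}{2} \right| \le \frac{1}{2} \Pr[\neg E].
\]
Also since $\Pr[C = \oplus] - \Pr[C \ne \oplus] = 2(\Pr[C = \oplus] - \frac{1}{2})$, we have
\[
|\Pr[C = \oplus] - \Pr[C \ne \oplus]| \le \Pr[\neg E].
\]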
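The passage from the decomposition to the two bounds above uses only that a conditional probability lies in $[0,1]$. Spelled out, with $\oplus$ as the assumed symbol for parity:

```latex
\begin{align*}
\Pr[C=\oplus] - \tfrac12
  &= \Pr[\neg E]\,\bigl(\Pr[C=\oplus \mid \neg E] - \tfrac12\bigr)
     && \text{using } \Pr[C=\oplus \mid E] = \tfrac12, \\
\bigl|\Pr[C=\oplus \mid \neg E] - \tfrac12\bigr| &\le \tfrac12
     && \text{since } 0 \le \Pr[C=\oplus \mid \neg E] \le 1, \\
\bigl|\Pr[C=\oplus] - \tfrac12\bigr| &\le \tfrac12 \Pr[\neg E].
\end{align*}
```

Doubling the last line gives the bound on the discrepancy $|\Pr[C=\oplus] - \Pr[C\ne\oplus]|$.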
Now we specify the parameter of the random restrictions.
First consider any $C \in C_d(s, \gamma_0 m)$. Then let $t = \gamma_0 m$, and we apply Lemma 10. With a random $(n^{\frac{d-1}{d}})^{-1}$-restriction, we have
\[
\Pr[\neg E_1] \le s\, \alpha_0^{t} \le s \cdot 2^{-0.143781\, m}.
\]
Then again by using the Chernoff bound, we estimate $\Pr[\neg E_2] = \Pr[N \le \gamma_0 m]$ as follows.
\[
\Pr[\neg E_2] \le e^{-\frac{(1-\gamma_0)^2}{2} m} < e^{-0.3927\, m}.
\]
Thus $\Pr[\neg E_2]$ is dominated by $\Pr[\neg E_1]$. This analysis gives the following bound.
Lemma 14 For all $C \in C_d(2^{0.07189\, n^{1/d}}, \gamma_0 n^{1/d})$, we have
\[
|\Pr[C = \oplus] - \Pr[C \ne \oplus]| \le 2^{-0.07189\, n^{1/d}}.
\]
Finally we consider $C \in C_d(s)$ with no bfi condition. This time we have to work harder to optimize the exponents. Our strategy is as follows. We will first assign an $\frac{\alpha_0}{2}$-restriction, and this will give us a depth $d$ circuit with bfi $\le \gamma_0 m$. Then we assign $d-2$ rounds of $1/m$-restrictions, each time using Lemma 9 with the same parameters $p = 1/m$, and $t = \Delta = \gamma_0 m$. This will give us a depth 2 circuit with bfi $\le \gamma_0 m$. So far the failure probability is at most $(s-1)\,\alpha_0^{\gamma_0 m}$. Finally we assign another $1/m$-restriction, but this time use the parameters $t = \gamma_0 m$ and $\Delta = x \gamma_0 m$, where $0 < x < 1$ is to be determined later. The overall failure probability is $< s\,\alpha_0^{\gamma_0 m} + \alpha_0^{\Delta} + \Pr[N \le \Delta]$, where $N$ is the number of variables left.
It turns out that if we used the same values for $t$ and $\Delta$ for the estimate in the last round, the bound for $\Pr[N \le \Delta]$ would be too weak. We will use a more exacting form of the Chernoff bound, and then optimize the overall bound by balancing the last two terms with a judicious choice of $x$.
We use the following version of the Chernoff bound. (See, for example, p. 70 of [MR95].)
\[
\Pr[\, N < (1-\delta)\,\mathrm{Exp}[N] \,] < \exp[\, -\mathrm{Exp}[N] \cdot (\delta + (1-\delta) \cdot \ln(1-\delta)) \,].
\]
Here we have
\[
\mathrm{Exp}[N] = \frac{\alpha_0}{2}\, m, \quad \text{and} \quad \delta = 1 - \frac{2 \gamma_0 x}{\alpha_0}.
\]
We balance the two bounds by setting $x$ such that
\[
x \gamma_0 \ln \frac{1}{\alpha_0} = \frac{\alpha_0}{2}\, [\, \delta + (1-\delta) \cdot \ln(1-\delta) \,].
\]
This leads to choosing $x = 0.617945$, and we get both
\[
\alpha_0^{\Delta} < 2^{-0.0888488\, m} \quad \text{and} \quad \Pr[N \le \Delta] < 2^{-0.0888488\, m}.
\]
Then by setting $s = 2^{0.07189\, m}$ we get a balanced discrepancy lower bound.
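The balancing value of $x$ can be recovered numerically. The sketch below solves the balance equation by bisection, using the decimal constants quoted in this section; `alpha0` and `gamma0` are assumed labels for those constants, and bisection is simply one convenient root finder, not necessarily how the value was originally obtained.

```python
import math

# Decimal values quoted in the text; the names are assumed labels.
alpha0 = 2 * 0.2082223
gamma0 = 0.1137685

def switching_exponent(x):
    """Natural-log exponent, per unit m, of alpha0^Delta with Delta = x*gamma0*m."""
    return x * gamma0 * math.log(1 / alpha0)

def chernoff_exponent(x):
    """Natural-log exponent, per unit m, of the bound on Pr[N <= Delta],
    with Exp[N] = (alpha0/2) m and delta = 1 - 2*gamma0*x/alpha0."""
    delta = 1 - 2 * gamma0 * x / alpha0
    return (alpha0 / 2) * (delta + (1 - delta) * math.log(1 - delta))

def balance_point(lo=1e-9, hi=1.0, iterations=100):
    """Bisection: the switching exponent grows with x while the Chernoff
    exponent shrinks, so their difference changes sign once on (0, 1)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if switching_exponent(mid) < chernoff_exponent(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = balance_point()
base2_exponent = switching_exponent(x) / math.log(2)
print(f"x = {x:.6f}, common exponent = {base2_exponent:.7f}")
```

Both of the last two failure terms are then about $2^{-0.0888 m}$, which is smaller than the leading term $s\,\alpha_0^{\gamma_0 m}$ for the stated choice of $s$.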
Theorem 15 For all $C \in C_d(2^{0.07189\, n^{1/d}})$, we have
\[
|\Pr[C = \oplus] - \Pr[C \ne \oplus]| \le 2^{-0.07189\, n^{1/d}}.
\]
Note that the bound in Theorem 15 is the same as that of Lemma 14 with the bfi condition. Lemma 6 follows from Theorem 15 for large input size.
Remark: The original motivation for Furst-Saxe-Sipser [FSS81], where super polynomial lower bounds were proved for parity against constant depth circuits, was to provide an oracle separation of PH and PSPACE. This was achieved in a breakthrough result by Yao [Yao85], who proved a lower bound of the form $2^{n^{\Omega(1/d)}}$ for parity on $n$ bits for depth $d$ circuits. Yao's bound was further improved by Håstad [Hås86a] from $2^{n^{\Omega(1/d)}}$ to $2^{\frac{1}{10} n^{\frac{1}{d-1}}}$, and his proof has become the standard proof. Independently, Yao's work was improved upon in another direction. Cai investigated in [Cai86] whether constant depth circuits of size $2^{n^{\Omega(1/d)}}$ must err on asymptotically 50% of inputs against parity. This was motivated by another long standing open problem, that of random oracle separation of PH and PSPACE (see also [Bab87]). To attack this problem, the decision tree point of view was first adopted in [Cai86], although a different but completely synonymous terminology (Master-Player Game and $t$-monochromaticity) was used. It was proved in [Cai86] that after a suitable random restriction $\rho$, with high probability, the constant depth circuit $C|_\rho$ has decision tree depth smaller than the number of unassigned Boolean variables. In such cases, $\Pr[C = \oplus]$ is exactly $\frac{1}{2}$. Thus the discrepancy
\[
|\Pr[C = \oplus] - \Pr[C \ne \oplus]| \qquad (12)
\]
was shown to be $o(1)$ for circuits of depth $d$ and size $2^{n^{\Omega(1/d)}}$. Implicitly a bound of the form $2^{-n^{\Omega(1/d)}}$ for the discrepancy (12) was proved there as well [Cai86]. The $o(1)$ upper bound for the discrepancy was sufficient for the random oracle separation result which was the purpose of [Cai86], but one needs Håstad's technique to improve the bound from $2^{-n^{\Omega(1/d)}}$ to $2^{-c n^{\frac{1}{d}}}$ as in Theorem 15. However, the weaker bound $2^{-n^{\Omega(1/d)}}$ would have sufficed for our Theorem 2. Ko [Ko89a] also used circuit lower bounds to establish the following: For any $k$, one can construct an oracle with which the Polynomial Hierarchy collapses to exactly $k$ levels.
It was a marvelous application by Nisan and Wigderson [Nis91a, Nis91b, NW88], who turned the inapproximability type of lower bounds based on decision trees on its head, and produced an explicit construction (usually considered an upper bound) of a pseudorandom generator provably indistinguishable from true random bits by polynomial size constant depth circuits. A central ingredient in [Nis91a, Nis91b, NW88] is a suitable combinatorial design. Seen in this way, our proof of Theorem 2 can be viewed as using a lower bound (Switching Lemma), to get an upper bound (the NW pseudorandom generator), to prove a lower bound (to kill all $2^n$ circuits $C_x$ simultaneously with the pseudorandom assignments), to finally prove an upper bound (to be able to code all the computations). And all this is to show that it is impossible to prove a super polynomial circuit lower bound for any fixed language in the Polynomial-time Hierarchy, with a relativizable proof with stringent access to an oracle.
References
[Ajt83] M. Ajtai, $\Sigma^1_1$-formulae on finite structures, Ann. Pure Applied Logic, 24, 1-48, 1983.
[Bab87] L. Babai, Random oracles separate PSPACE from the polynomial-time hierarchy, Information Processing Letters, 26(1), 51-53, 1987.
[BCGKT] N. Bshouty, R. Cleve, R. Gavaldà, S. Kannan, and C. Tamon, Oracles and queries that are sufficient for exact learning, J. Comput. and System Sci. 52(3), 421-433, 1996.
[BFT98] H. Buhrman, L. Fortnow, and T. Thierauf, Nonrelativizing separations, in Proc. 13th IEEE Conference on Computational Complexity (CCC'98), IEEE, 8-12, 1998.
[Cai86] J-Y. Cai, With probability one, a random oracle separates PSPACE from the polynomial-time hierarchy, in Proc. 18th ACM Symposium on Theory of Computing (STOC'86), ACM, 21-29, 1986. See also the journal version, which appeared in J. Comput. and System Sci. 38(1), 68-85, 1989.
[Cai01] J-Y. Cai, $S^p_2 \subseteq \mathrm{ZPP}^{\mathrm{NP}}$, in Proc. 42nd IEEE Symposium on Foundations of Computer Science (FOCS'01), IEEE, 620-628, 2001.
[CW03] J-Y. Cai and O. Watanabe, On proving circuit lower bounds against the polynomial-time hierarchy: positive and negative results, in Proc. 9th International Computing and Combinatorics Conference (COCOON'03), LNCS 2697, 202-211, 2003.
[CW03] J-Y. Cai and O. Watanabe, BPP = PH by polynomially stringent relativization, Research Report C-168, Dept. of Math. and Computing Sciences, Tokyo Inst. of Technology, 2003.
[FSS81] M. Furst, J. Saxe, and M. Sipser, Parity, circuits, and the polynomial time hierarchy, in Proc. 22nd IEEE Symposium on Foundations of Computer Science (FOCS'81), IEEE, 260-270, 1981.
[DK00] D-Z. Du and K-I. Ko, Theory of Computational Complexity, John Wiley & Sons, 2000.
[Hås86a] J. Håstad, Almost optimal lower bounds for small depth circuits, in Proc. 18th ACM Symposium on Theory of Computing (STOC'86), ACM, 6-20, 1986.
[Hås86b] J. Håstad, Computational Limitations for Small-Depth Circuits, MIT Press, 1986.
[He84] H. Heller, On relativized polynomial and exponential computations, SIAM J. Comput. 13(4), 717-725, 1984.
[HPV77] J.E. Hopcroft, W.J. Paul, and L.G. Valiant, On time versus space, J. ACM 24(2), 332-337, 1977.
[Kan82] R. Kannan, Circuit-size lower bounds and non-reducibility to sparse sets, Information and Control, 55, 40-56, 1982.
[KL80] R.M. Karp and R.J. Lipton, Some connections between nonuniform and uniform complexity classes, in Proc. 12th ACM Symposium on Theory of Computing (STOC'80), ACM, 302-309, 1980.
[Ko89a] K. Ko, Relativized polynomial time hierarchies having exactly k levels, SIAM J. Comput., 18, 392-408, 1989.
[KW98] J. Köbler and O. Watanabe, New collapse consequences of NP having small circuits, SIAM J. Comput., 28, 311-324, 1998.
[MVW99] P.B. Miltersen, N.V. Vinodchandran, and O. Watanabe, Super-polynomial versus half-exponential circuit size in the exponential hierarchy, in Proc. 5th Annual International Conference on Computing and Combinatorics (COCOON'99), Lecture Notes in Computer Science 1627, 210-220, 1999.
[MR95] R. Motwani and P. Raghavan, Randomized Algorithms, Cambridge Univ. Press, 1995.
[Nis91a] N. Nisan, Pseudorandom bits for constant depth circuits, Combinatorica 11(1), 63-70, 1991.
[Nis91b] N. Nisan, Using Hard Problems to Create Pseudorandom Generators, MIT Press, 1991.
[NW88] N. Nisan and A. Wigderson, Hardness vs randomness, in Proc. 29th IEEE Symposium on Foundations of Computer Science (FOCS'88), IEEE, 2-12, 1988.
[NW94] N. Nisan and A. Wigderson, Hardness vs randomness, J. Comput. Syst. Sci. 49, 149-167, 1994.
[vL91] J. van Lint, Introduction to Coding Theory, Springer-Verlag, 1991.
[Wil83] C.B. Wilson, Relativized circuit complexity, in Proc. 24th IEEE Symposium on Foundations of Computer Science (FOCS'83), IEEE, 329-334, 1983.
[Yao85] A.C. Yao, Separating the polynomial-time hierarchy by oracles, in Proc. 26th IEEE Symposium on Foundations of Computer Science (FOCS'85), IEEE, 1-10, 1985.