Circuit Bottom Fan-in and Computational Power by Cai, Liming et al.
Circuit Bottom Fan-in
and Computational Power
Liming Cai*
Department of Mathematics
East Carolina University
Greenville, NC 27858-4353
cai@math1.math.ecu.edu
Jianer Chen†
Department of Computer Science
Texas A&M University
College Station, TX 77843-3112
chen@cs.tamu.edu
Johan Håstad
Department of Computer Science
Royal Institute of Technology
Stockholm, Sweden
johanh@nada.kth.se
Abstract
We investigate the relationship between circuit bottom fan-in and
circuit size when circuit depth is fixed. We show that in order to
compute certain functions, a moderate reduction in circuit bottom fan-in
will cause a significant increase in circuit size. In particular, we prove
that there are functions that are computable by circuits of linear size
and depth k with bottom fan-in 2 but require exponential size for
circuits of depth k with bottom fan-in 1. A general scheme is established
to study the trade-off between circuit bottom fan-in and circuit size.
Based on this scheme, we are able to prove, for example, that for any
integer c, there are functions that are computable by circuits of linear
size and depth k with bottom fan-in O(log n) but require exponential
size for circuits of depth k with bottom fan-in c, and that for any
constant $\epsilon > 0$, there are functions that are computable by circuits
of linear size and depth k with bottom fan-in log n but require
superpolynomial size for circuits of depth k with bottom fan-in
$O(\log^{1-\epsilon} n)$.
A consequence of these results is that the three input read-modes of
alternating Turing machines proposed in the literature are all distinct.
Warning: Essentially this paper has been published in SIAM Jour-
nal on Computing and is hence subject to copyright restrictions.
It is for personal use only.

* Supported in part by Engineering Excellence Award from Texas A&M University.
† Supported in part by the National Science Foundation under Grant CCR-9110824.
Key words:
computational complexity, circuit complexity, lower bound, al-
ternating Turing machine
AMS(MOS) subject classification:
68Q05, 68Q10, 68Q15, 68Q25, 68Q30
Abbreviated title:
Circuit Bottom Fan-in
The Corresponding Author
Jianer Chen
Department of Computer Science
Texas A&M University
College Station, TX 77843-3112
chen@cs.tamu.edu
1 Introduction
Proving lower bounds for various computational models remains one of
the most challenging tasks in complexity theory. Much progress has been
made recently in deriving lower bounds for computational models with
limited capabilities, with the hope that these may lead to better lower
bounds for more general computational models and to a better understanding
of the intrinsic complexity of computation.
One of the most successful efforts is the derivation of lower bounds for
constant depth circuits. The first strong lower bounds were given by Furst,
Saxe, and Sipser [12] and, independently, by Ajtai [1], who showed that the
size of a constant depth circuit computing the parity function is
superpolynomial. The results were subsequently sharpened by Yao [18], who
derived an exponential lower bound. Håstad [14, 15] further strengthened the
result and obtained near optimal lower bounds. A direct consequence of these
results is that the logarithmic time hierarchy [17], i.e., the set of
languages accepted by families of circuits of constant depth and polynomial
size, is a proper subset of P.
The logarithmic time hierarchy was further refined by Sipser [17], who
showed that for each integer k > 1, there are functions that are computable
by a circuit of depth k and polynomial size but require superpolynomial size
for circuits of depth $k - 1$. Thus, all levels of the logarithmic time
hierarchy are distinct. Exponential lower bounds for the depth $k$ to $k - 1$
conversion were first claimed by Yao [18] and then fully proved by
Håstad [14, 15].
In this paper, we further sharpen the separation results in the
logarithmic time hierarchy by investigating the relationship between circuit
bottom fan-in and circuit size when circuit depth is fixed. We show that in
order to compute certain functions, a moderate reduction in circuit bottom
fan-in will cause a significant increase in circuit size. In particular, we
prove that there are functions that are computable by circuits of linear size
and depth k with bottom fan-in 2 but require exponential size for circuits of
depth k with bottom fan-in 1. A general scheme is established to study
the trade-off between circuit bottom fan-in and circuit size. Based on this
scheme, we are able to prove, for example, that for any integer c, there are
functions that are computable by circuits of linear size and depth k with
bottom fan-in O(log n) but require exponential size for circuits of depth k
with bottom fan-in c, and that for any constant $\epsilon > 0$, there are
functions that are computable by circuits of linear size and depth k with
bottom fan-in log n but require superpolynomial size for circuits of depth k
with bottom fan-in $O(\log^{1-\epsilon} n)$. Therefore, the computational
power of a constant depth circuit depends not only on its depth, but also
strictly on its bottom fan-in when the depth is fixed.
Another motivation for the present research comes from the study of input
read-modes of a sublinear-time alternating Turing machine, which is an
important computational model in the study of complexity classes. A number
of input read-modes for sublinear-time alternating Turing machines have
appeared in the literature. In the standard model proposed by Chandra,
Kozen, and Stockmeyer [9], a computation path of the machine can read up
to O(log n) input bits in time O(log n). Ruzzo [16] proposed an input
read-mode in which each computation path can read at most one input bit and
the reading must be performed at the end of the path. An input read-mode
studied by Sipser [17] insists that each input reading takes time
$\Omega(\log n)$.
These input read-modes have been carefully studied by Cai and Chen [6],
who have given a precise circuit characterization for each read-mode for
log-time alternating Turing machines of constant alternations. Input
read-modes of log-time alternating Turing machines also find applications in
the study of computational optimization problems [5, 8] and in the study of
limited nondeterminism [7].
Based on Cai and Chen's circuit characterizations and our separation
results in constant depth circuits, we are able to show that the three
proposed input read-modes for alternating Turing machines are all distinct.
More precisely, if we let $\Pi^U_k$ (resp. $\Pi^R_k$, $\Pi^S_k$) be the class
of languages accepted by log-time k-alternation alternating Turing machines
using Chandra, Kozen, and Stockmeyer's (resp. Ruzzo's, Sipser's) input
read-mode, then we can show that for all integers $k \ge 1$
$$\Pi^R_k \subset \Pi^S_k \subset \Pi^U_k \subset \Pi^R_{k+1}$$
where $\subset$ means "proper subset". This gives a very detailed refinement
of the logarithmic time hierarchy and shows its rich structural properties.
The paper is organized as follows. Section 2 introduces necessary
definitions and related previous work. In Section 3, we show that in order
to compute certain functions, an O(log n) factor reduction in circuit
bottom fan-in may cause an exponential increase in circuit size. In
Section 4, we show that for circuits computing certain special functions,
even reducing the circuit bottom fan-in by 1 will result in an exponential
increase in circuit size. A general scheme is established in Section 5 to
study the trade-off between circuit bottom fan-in and circuit size. The
relationship to the input read-modes of alternating Turing machines is given
in Section 6.
2 Preliminaries
We briefly review the fundamentals related to the present paper. For
further discussion on the theory of circuit complexity and alternating Turing
machines, the reader is referred to [3, 11].
An (unbounded fan-in) Boolean circuit $\alpha_n$ with input
$x = x_1 x_2 \cdots x_n$ of length n is a directed acyclic graph. The fan-in
of a node in the circuit is the in-degree of the node. The nodes of fan-in 0
are called inputs and are labeled from the set
$\{0, 1, x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$. The nodes of fan-in
greater than 0 are called gates and are labeled either and or or. One of the
nodes is designated the output node. The size is the number of gates, and the
depth is the maximum distance from an input to the output. Without loss
of generality, we assume the circuits are of the special form where all and
and or gates are organized into alternating levels with edges only between
adjacent levels. Any circuit may be converted to one of this form without
increasing the depth and by at most squaring the size [10]. In this special
form, the gates that are connected to input nodes will be called bottom level
gates, or depth 1 gates. The gates that receive inputs from depth 1 gates
will be called depth 2 gates, and so on. The bottom fan-in of a circuit is
the maximum over the fan-ins of all bottom level gates. The following
notation introduced by Boppana and Sipser [3] will be especially convenient
in our discussion.
Definition [3] A circuit $\alpha$ is a $\Pi^s_k$-circuit (resp.
$\Sigma^s_k$-circuit) if $\alpha$ is a depth $k$ circuit of size at most $s$
with an and-gate (resp. an or-gate) at the output. A circuit $\alpha$ is a
$\Pi^{s,c}_k$-circuit (resp. $\Sigma^{s,c}_k$-circuit) if $\alpha$ is a depth
$k + 1$ circuit of size at most $s$ with bottom fan-in $c$ and an and-gate
(resp. an or-gate) at the output.
A family of circuits is a sequence $\{\alpha_n \mid n \ge 1\}$ of circuits,
where $\alpha_n$ takes inputs of length n. A family of circuits may be used
to define a language. A family $\{\alpha_n \mid n \ge 1\}$ of circuits is
said to be a $\Pi^{poly}_k$-family (resp. $\Pi^{poly,c}_k$-family) if there
is a polynomial p such that for all $n \ge 1$, $\alpha_n$ is a
$\Pi^{p(n)}_k$-circuit (resp. $\Pi^{p(n),c}_k$-circuit).
The Sipser function $f^m_k$, as defined in [14, 15], is given by the circuit
$C^m_k$ shown in Figure 1.

[Figure 1: The circuit $C^m_k$ defining the function $f^m_k$. The figure shows
a tree of alternating and and or gates with an and gate at the output
$f^m_k(x_1, \ldots, x_n)$ and the inputs $x_1, x_2, \ldots, x_{n-1}, x_n$ at
the leaves. The fan-ins, from the output down, are $\sqrt{m/\log m}$, then
$m$ at each intermediate level, and $\sqrt{km\log m/2}$ at the bottom level.]

The circuit $C^m_k$ is a tree of depth k in which every gate in the bottom
level has fan-in $\sqrt{km\log m/2}$, the fan-in of the output gate is
$\sqrt{m/\log m}$, and the fan-in for all other gates is m. Each variable
$x_i$, $1 \le i \le n$, occurs at only one leaf. Note that the number n of
variables of the function $f^m_k$ equals $m^{k-1}\sqrt{k/2}$.
The following theorem is proved by Håstad [14, 15].
Theorem 2.1 ([14, 15]) There is no depth k circuit computing the function
$f^m_k$ with bottom fan-in $\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$ and
fewer than $2^{\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}}$ gates of
depth $\ge 2$, for $m > m_0$, where $m_0$ is an absolute constant.
To further sharpen the separation results and study the relationship
between circuit bottom fan-in and circuit size, we introduce a variation
$f^{m,b}_k$ of the Sipser function by explicitly specifying the bottom fan-in
for the defining circuits.
Definition Let $C^{m,b}_k$ be the tree circuit defining the Sipser function
$f^m_k$, as illustrated in Figure 1, except that each bottom level gate of
$C^{m,b}_k$ has fan-in $b$ instead of $\sqrt{km\log m/2}$. Define $f^{m,b}_k$
to be the function computed by the tree circuit $C^{m,b}_k$. The number n of
variables of the function $f^{m,b}_k$ is $n = b\,m^{k-2}\sqrt{m/\log m}$.
The discussion of the present paper is centered on the complexity of the
function $f^{m,b}_k$.
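The definition above can be made concrete with a small evaluator. The following is a minimal sketch, not taken from the paper: the names `eval_tree` and `f_mb_k` are hypothetical, the output gate is assumed to be an and gate as drawn in Figure 1, and `top_fan_in` is an integer stand-in for $\sqrt{m/\log m}$, which need not be an integer.

```python
from math import prod

def eval_tree(bits, fan_ins, top_is_and=True):
    """Evaluate a levelled tree circuit on a list of 0/1 input bits.

    fan_ins lists the fan-in of each level from the output gate down to the
    bottom level; gate types alternate, starting from the output gate.
    """
    if len(bits) != prod(fan_ins):
        raise ValueError("need exactly one input bit per leaf")
    # The bottom level has the same gate type as the output iff the depth is odd.
    is_and = top_is_and if len(fan_ins) % 2 == 1 else not top_is_and
    values = list(bits)
    for fan_in in reversed(fan_ins):  # process the bottom level first
        combine = all if is_and else any
        values = [int(combine(values[i:i + fan_in]))
                  for i in range(0, len(values), fan_in)]
        is_and = not is_and
    return values[0]

def f_mb_k(bits, k, m, b, top_fan_in):
    """Sketch of f_k^{m,b} for k >= 2: output fan-in top_fan_in (assumed
    integer), k - 2 intermediate levels of fan-in m, bottom fan-in b."""
    return eval_tree(bits, [top_fan_in] + [m] * (k - 2) + [b])
```

For example, `f_mb_k([1, 1, 0, 0, 0, 0, 1, 1], k=3, m=2, b=2, top_fan_in=2)` evaluates a depth 3 tree on 8 variables, each of which occurs at exactly one leaf.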
3 On circuits that compute $f^{m,\Omega(\log m)}_k$
In this section, we consider the complexity of the function $f^{m,b}_k$,
where b is of order $\Omega(\log m)$.
Our main result in this section is that for $b > k\log m$, the function
$f^{m,b}_k$ cannot be computed by any depth k circuit with bottom fan-in
$\frac{b}{c\log m}$, for a particular constant c, unless the circuit has
exponential size. This result requires two different proofs depending on
whether the bottom fan-in b is larger or smaller than the bottom fan-in of
the standard Sipser function circuit $C^m_k$ given in Figure 1.
We first consider the case $b \le \sqrt{km\log m/2}$. For this, we need to
briefly review some notation and results by Håstad [14].
Definition ([14]) Let q be a real number and $(B_i)_{i=1}^r$ a partition of
the variables. Let $R^+_{q,B}$ be the probability space of restrictions
$\rho$ that take values as follows.
For every $B_i$, $1 \le i \le r$, independently
1. With probability q let $s_i = *$ and else $s_i = 0$.
2. For every $x_k \in B_i$ let $\rho(x_k) = s_i$ with probability q and else
$\rho(x_k) = 1$.
Similarly, an $R^-_{q,B}$ probability space of restrictions is defined by
interchanging the roles played by 0 and 1.
The idea behind these restrictions is that a block $B_i$ will correspond to
the variables leading into one of the bottom level gates of the circuit
$C^{m,b}_k$ that defines the function $f^{m,b}_k$. If the bottom level gates
of $C^{m,b}_k$ are ands, we use a restriction from $R^+_{q,B}$; if the bottom
level gates of $C^{m,b}_k$ are ors, we use a restriction from $R^-_{q,B}$.
Definition ([14]) For a restriction $\rho \in R^+_{q,B}$, let $g(\rho)$ be
the restriction defined as follows: For all $B_i$ with $s_i = *$, $g(\rho)$
gives the value 1 to all variables which are given value $*$ by $\rho$ except
the one with the highest index among those variables given value $*$ by
$\rho$, to which $g(\rho)$ gives value $*$.
If $\rho \in R^-_{q,B}$, then $g(\rho)$ is defined similarly but now takes
the values 0 and $*$. Note that for a given restriction $\rho$, the
restriction $g(\rho)$ can be obtained by a deterministic process that makes
each block $B_i$ have at most one $*$.
Let $\rho g(\rho)$ denote the composition of the two restrictions. That is,
$\rho g(\rho)$ is the restriction $g(\rho)$ applied after the restriction
$\rho$.
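The two definitions above describe a sampling procedure followed by a deterministic clean-up. A minimal sketch of both steps for the space $R^+_{q,B}$ follows; the function names, the dictionary representation of a restriction, and the use of the string `"*"` for an unset variable are all assumptions made for illustration.

```python
import random

STAR = "*"

def sample_restriction(blocks, q, rng=random):
    """Draw rho from R^+_{q,B}. blocks is a list of lists of variable indices.

    Returns the restriction (a dict: variable -> 0, 1 or STAR) and the list
    of block values s_i.
    """
    rho, s = {}, []
    for block in blocks:
        s_i = STAR if rng.random() < q else 0   # step 1 of the definition
        s.append(s_i)
        for x in block:
            rho[x] = s_i if rng.random() < q else 1   # step 2
    return rho, s

def g(rho, blocks, s):
    """In every block with s_i = *, keep only the highest-index star and
    give the value 1 to the other starred variables."""
    fix = {}
    for block, s_i in zip(blocks, s):
        if s_i != STAR:
            continue
        starred = [x for x in block if rho[x] == STAR]
        for x in starred:
            if x != max(starred):
                fix[x] = 1
    return fix

def compose(rho, fix):
    """rho g(rho): apply rho, then g(rho) to the variables rho left free."""
    return {x: fix.get(x, value) for x, value in rho.items()}
```

After `compose(rho, g(rho, blocks, s))`, every block contains at most one `"*"`, which is the property the text notes for $g(\rho)$.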
Lemma 3.1 (The Switching Lemma [14]) Let $\phi$ be an and of ors all of
size $\le t$ and $\rho \in R^+_{q,B}$. Then the probability that under the
restriction $\rho g(\rho)$, $\phi$ cannot be written as an or of ands all of
size $< s$ is bounded by $\beta^s$, where
$\beta < \frac{4qt}{\ln 2} < 5.78qt$.
The Switching Lemma is also true if we do either or both of the following
replacements: (1) replacing the probability space $R^+_{q,B}$ by the
probability space $R^-_{q,B}$; and (2) replacing $\phi$ by an or of ands to
be converted to an and of ors.
Now we are ready for our first main result of this section.
Theorem 3.2 For $k\log m < b \le \sqrt{km\log m/2}$, the function
$f^{m,b}_k$ cannot be computed by any depth k circuit of bottom fan-in
$\frac{b}{12k\log m}$ and size bounded by
$2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$, for $m > m_0$, where
$m_0$ is an absolute constant.
Proof. The proof is similar to the induction step in the proof given by
Håstad for Theorem 2.1 (see [14], pages 48-50). Therefore, we only outline
the proof and describe in detail those places that are different.
We set $q = \frac{k\log m}{b}$. Suppose that $\phi_1, \ldots, \phi_r$,
$r = m^{k-2}\sqrt{m/\log m}$, are the bottom level gates of the tree circuit
$C^{m,b}_k$ defining the function $f^{m,b}_k$. Let $(B_j)_{j=1}^r$ be the
partition of the variables of $f^{m,b}_k$ such that block $B_j$ is the set of
variables leading into the bottom level gate $\phi_j$ of $C^{m,b}_k$.
Claim 1: The probability that under the restriction $\rho g(\rho)$ any bottom
level gate $\phi_j$ of the tree circuit $C^{m,b}_k$ does not take value $s_j$
is bounded by $\frac{1}{6m}$.
The proof is identical to the proof by Håstad for Fact 1 in [14] (page 49):
Such a gate $\phi_j$ does not take the corresponding value $s_j$ with
probability $(1 - q)^{|B_j|} < \frac{1}{6} m^{-k}$. Since there are fewer
than $m^{k-1}$ bottom level gates in the tree circuit $C^{m,b}_k$, the
probability in Claim 1 is bounded by $\frac{1}{6m}$.
Claim 2: The probability that under the restriction $\rho g(\rho)$ any depth
2 gate in the tree circuit $C^{m,b}_k$ gets fewer than
$\sqrt{(k-1)m\log m/2}$ $*$'s from the bottom level is bounded by
$\frac{1}{m}$.
Let $p_i = \binom{m}{i} q^i (1 - q)^{m-i}$ be the probability that a fixed
depth 2 gate $\phi$ in the tree circuit $C^{m,b}_k$ gets exactly $i$ $*$'s
from the bottom level. With the condition $b \le \sqrt{km\log m/2}$, we can
show that for $i \le \sqrt{(k-1)m\log m}$, we have
$\frac{p_i}{p_{i-1}} \ge \sqrt{2}$. Thus the probability that the gate $\phi$
gets fewer than $\sqrt{(k-1)m\log m/2}$ $*$'s is bounded by $m^{-k}$ for
sufficiently large m. Since there are fewer than $m^{k-1}$ depth 2 gates in
the tree circuit $C^{m,b}_k$, the probability in Claim 2 is bounded by
$\frac{1}{m}$.
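The ratio bound used in Claim 2 can be spot-checked numerically from the closed form $p_i / p_{i-1} = \frac{m-i+1}{i} \cdot \frac{q}{1-q}$. The sketch below is only a sanity check on sample parameters, not a proof; the function name is hypothetical, and base-2 logarithms are an assumption, since the text does not fix the base.

```python
from math import log2, sqrt

def min_ratio(m, k, b):
    """Smallest p_i / p_{i-1} over 1 <= i <= sqrt((k-1) m log m),
    with q = k log m / b as in the proof of Theorem 3.2."""
    q = k * log2(m) / b
    i_max = int(sqrt((k - 1) * m * log2(m)))
    return min((m - i + 1) / i * q / (1 - q) for i in range(1, i_max + 1))
```

With m = 10000 and k = 3, the value b = 446 satisfies $b \le \sqrt{km\log m/2}$ and the minimum ratio stays above $\sqrt{2}$, whereas a much larger b such as 2000 pushes it below $\sqrt{2}$, which illustrates why the condition on b matters for this claim.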
Now suppose that the theorem is not true. Thus, there is a depth k
circuit $C'$ of size bounded by
$2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$ and bottom fan-in
$t \le \frac{b}{12k\log m}$ that computes the function $f^{m,b}_k$.
Furthermore, assume that the gates in the bottom level of the circuit $C'$
are or gates (the dual case can be proved similarly).
Claim 3: The probability that under the restriction $\rho g(\rho)$ any depth
2 gate in the circuit $C'$ cannot be written as an or of ands of size
$\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}$ is bounded by
$\frac{1}{2}$.
Let $\phi$ be a fixed depth 2 gate in the circuit $C'$. By the Switching
Lemma, under the restriction $\rho g(\rho)$, the probability that $\phi$
cannot be written as an or of ands of size
$\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}$ is bounded by
$(5.78qt)^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$. Since the
circuit $C'$ has at most
$2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$ depth 2 gates, the
probability in Claim 3 is bounded by
$$(5.78qt)^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}} \cdot 2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}
< \left(\frac{5.78}{12}\right)^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}} \cdot 2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}
\le \left(\frac{5.78}{6}\right)^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$$
which is smaller than $\frac{1}{2}$ when m is sufficiently large.
Therefore with a probability larger than
$1 - (\frac{1}{6m} + \frac{1}{m} + \frac{1}{2}) > \frac{1}{3}$, the tree
circuit $C^{m,b}_k$ becomes a circuit that computes a function $f$ at least
as hard as the Sipser function $f^m_{k-1}$, and the circuit $C'$ becomes a
depth $k - 1$ circuit that computes the function $f$ and has bottom fan-in
$\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}$ and fewer than
$2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$ gates of depth
$\ge 2$. But this contradicts Theorem 2.1.
Letting $b = \sqrt{km\log m/2}$ in Theorem 3.2, we obtain Theorem 2.1. Note
that the size bound is slightly improved.
Note that the condition $b \le \sqrt{km\log m/2}$ in Theorem 3.2 is essential
in the proof for Claim 2. For larger bottom fan-in b, we have the following
theorem.
Theorem 3.3 For $b \ge 2\sqrt{km\log m/2}$, the function $f^{m,b}_k$ cannot
be computed by any depth k circuit of bottom fan-in $\frac{b}{25ke\log m}$
and size bounded by $2^{\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}}$, for
$m > m_0$, where $m_0$ is an absolute constant and $e$ is the base of the
natural logarithm.
Proof. Let $q = \frac{1.04\sqrt{km\log m/2}}{b}$. Consider the following
probability space $R^+_q$ of restrictions:
For each variable $x_k$ of the function $f^{m,b}_k$, let $\rho^+(x_k) = *$
with probability q and else $\rho^+(x_k) = 1$.
The probability space $R^-_q$ is defined similarly except that the value 1 is
replaced by value 0.
From now on, we assume that the bottom level gates of the tree circuit
$C^{m,b}_k$ defining $f^{m,b}_k$ are and gates. The case when the bottom
level gates of $C^{m,b}_k$ are or gates can be proved similarly by using the
probability space $R^-_q$ instead of the probability space $R^+_q$.
We first show that under a restriction $\rho^+ \in R^+_q$, with very large
probability, the tree circuit $C^{m,b}_k$ computes a function at least as
hard as the Sipser function $f^m_k$. The proof is similar to that for Claim 2
in Theorem 3.2. Thus, we only describe the differences.
Let $\phi$ be a bottom level gate in the tree circuit $C^{m,b}_k$. The gate
$\phi$ is an and gate of fan-in b. Let
$p_i = \binom{b}{i} q^i (1 - q)^{b-i}$ be the probability that the gate
$\phi$ gets exactly $i$ $*$'s under a restriction $\rho^+ \in R^+_q$. First
we consider the ratio
$$\frac{p_i}{p_{i-1}} = \frac{b - i + 1}{i} \cdot \frac{q}{1 - q} > \frac{b - i}{i} \cdot \frac{q}{1 - q}$$
For $i \le 1.02\sqrt{km\log m/2}$, we have
$$\frac{b - i}{i} \ge \frac{b - 1.02\sqrt{km\log m/2}}{1.02\sqrt{km\log m/2}}$$
and
$$\frac{q}{1 - q} = \frac{(1.04\sqrt{km\log m/2})/b}{1 - (1.04\sqrt{km\log m/2})/b} = \frac{1.04\sqrt{km\log m/2}}{b - 1.04\sqrt{km\log m/2}}$$
Thus, we have
$$\frac{p_i}{p_{i-1}} > \frac{b - 1.02\sqrt{km\log m/2}}{1.02\sqrt{km\log m/2}} \cdot \frac{1.04\sqrt{km\log m/2}}{b - 1.04\sqrt{km\log m/2}} \ge \frac{52}{51}$$
This gives $(51/52)^{j-i} p_j > p_i$ for
$i < j \le 1.02\sqrt{km\log m/2}$.
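The chain of inequalities above can be checked on concrete numbers. The following sketch evaluates the exact ratio $p_i/p_{i-1}$ over the stated range of $i$ and returns its minimum; the function name is hypothetical, and base-2 logarithms are again an assumption rather than something the text specifies.

```python
from math import log2, sqrt

def min_ratio_large_b(m, k, b):
    """Smallest p_i / p_{i-1} over 1 <= i <= 1.02 * sqrt(k m log m / 2),
    with q = 1.04 * sqrt(k m log m / 2) / b as in the proof of Theorem 3.3."""
    root = sqrt(k * m * log2(m) / 2)
    q = 1.04 * root / b
    i_max = int(1.02 * root)
    return min((b - i + 1) / i * q / (1 - q) for i in range(1, i_max + 1))
```

For m = 10000 and k = 3, the threshold $2\sqrt{km\log m/2}$ is just under 893, and `min_ratio_large_b(10000, 3, 893)` stays above 52/51 as the derivation predicts.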
Now under a restriction $\rho^+ \in R^+_q$, the probability that the gate
$\phi$ gets fewer than $\sqrt{km\log m/2}$ $*$'s is bounded by
$$\sum_{i=0}^{\sqrt{km\log m/2}} p_i
< \sum_{i=0}^{\sqrt{km\log m/2}} (51/52)^{\sqrt{km\log m/2} - i}\, p_{\sqrt{km\log m/2}}
\le 52\, p_{\sqrt{km\log m/2}}$$
$$< 52\,(51/52)^{0.02\sqrt{km\log m/2}}\, p_{1.02\sqrt{km\log m/2}}
\le 52\,(51/52)^{0.02\sqrt{km\log m/2}}$$
and $52\,(51/52)^{0.02\sqrt{km\log m/2}}$ is smaller than $\frac{1}{m^k}$ for
sufficiently large m.
Since the circuit $C^{m,b}_k$ has fewer than $m^{k-1}$ bottom level gates, we
conclude that under a restriction $\rho^+ \in R^+_q$, the probability that
any bottom level gate of the tree circuit $C^{m,b}_k$ gets fewer than
$\sqrt{km\log m/2}$ $*$'s is bounded by $\frac{1}{m}$.
Now suppose that the theorem is not true. Thus, there is a depth k circuit
$C'$ of bottom fan-in at most $\frac{b}{25ke\log m}$ and size bounded by
$2^{\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}}$ such that the circuit $C'$
computes the function $f^{m,b}_k$. We show that under a restriction
$\rho^+ \in R^+_q$, with very large probability, the circuit $C'$ becomes a
depth k circuit of bottom fan-in at most
$\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$.
Let $\phi$ be a bottom level gate of fan-in $c \le \frac{b}{25ke\log m}$ in
the circuit $C'$, and let $r_i = \binom{c}{i} q^i (1 - q)^{c-i}$ be the
probability that the gate $\phi$ gets exactly $i$ $*$'s under a restriction
$\rho^+ \in R^+_q$. We have
$$\binom{c}{i} q^i (1 - q)^{c-i} \le \frac{c!}{i!\,(c - i)!}\, q^i \le \frac{c^i q^i}{i!} \le \left(\frac{cqe}{i}\right)^i$$
where the last inequality is based on Stirling's approximation [13]
$$i! \ge 0.9\,(i/e)^i \sqrt{2\pi i} \ge (i/e)^i, \quad \text{for } i \ge 1$$
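The Stirling-based bound $r_i \le (cqe/i)^i$ is easy to confirm numerically for small parameters. The sketch below compares the exact binomial term with the bound for every $i$ from 1 to $c$; the function names are hypothetical and the check uses floating-point arithmetic, so it is a sanity check rather than a proof.

```python
from math import comb, e

def binomial_term(c, i, q):
    """r_i: probability of exactly i stars among c independent inputs."""
    return comb(c, i) * q**i * (1 - q)**(c - i)

def stirling_bound(c, i, q):
    """The upper bound (c q e / i)^i used in the text."""
    return (c * q * e / i)**i

def bound_holds(c, q):
    """True if r_i <= (c q e / i)^i for every 1 <= i <= c."""
    return all(binomial_term(c, i, q) <= stirling_bound(c, i, q)
               for i in range(1, c + 1))
```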
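Let $s = \frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$. Under a restriction
$\rho^+ \in R^+_q$ the probability that the gate $\phi$ gets more than $s$
$*$'s is bounded by
$$\sum_{i=s+1}^{c} r_i \le \sum_{i=s+1}^{c} \left(\frac{cqe}{i}\right)^i \le \sum_{i=s+1}^{c} \left(\frac{\frac{1.04}{25\sqrt{2k}}\sqrt{\frac{m}{\log m}}}{i}\right)^i$$
For $i > s = \frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$, we have
$\left(\frac{1.04}{25\sqrt{2k}}\sqrt{\frac{m}{\log m}}\right)/i < \frac{12.48}{25}$.
Thus under a restriction $\rho^+ \in R^+_q$ the probability that the gate
$\phi$ gets more than $s$ $*$'s is bounded by
$$\sum_{i=s+1}^{c} (12.48/25)^i < (12.48/25)^s$$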
Since the circuit $C'$ has at most $2^s$ bottom level gates, we conclude that
under a restriction $\rho^+ \in R^+_q$, the probability that any bottom level
gate of the circuit $C'$ gets more than $s$ $*$'s is bounded by
$$(12.48/25)^s \cdot 2^s = (24.96/25)^s = (24.96/25)^{\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}}$$
which is smaller than $\frac{1}{m}$ for sufficiently large m.
Thus, under a restriction $\rho^+ \in R^+_q$, with probability
$\ge 1 - \frac{1}{m} - \frac{1}{m} > \frac{1}{2}$, all bottom level gates of
the tree circuit $C^{m,b}_k$ get at least $\sqrt{km\log m/2}$ $*$'s (thus
$C^{m,b}_k$ is converted to a circuit computing a function at least as hard
as the Sipser function $f^m_k$), and all bottom level gates of the circuit
$C'$ get at most $\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$ $*$'s. Note
that if a bottom level gate $\phi$ of the circuit $C'$ gets at most
$\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$ $*$'s, then either the gate
$\phi$ is eliminated from the bottom level (e.g., $\phi$ is an and gate and
gets an input with value 0) or the gate $\phi$ becomes a gate of fan-in at
most $\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$. In any case, we have
derived that there is an assignment that converts the circuit $C'$ into a
depth k circuit of bottom fan-in bounded by
$\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}$ and size bounded by
$2^{\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}}$ that computes a function
at least as hard as the Sipser function $f^m_k$. But this contradicts
Theorem 2.1.
This completes the proof.
4 On circuits that compute $f^{m,2}_k$
In the previous section, we showed that for circuits to compute the function
$f^{m,b}_k$, $b = \Omega(\log m)$, an O(log m) factor reduction in bottom
fan-in may cause an exponential increase in the circuit size. In this
section, we will show that in certain cases, even reducing the circuit bottom
fan-in by 1 will cause an exponential increase in the circuit size. More
precisely, we will show that the function $f^{m,2}_k$ can be computed by a
depth k circuit of linear size and bottom fan-in 2, but requires exponential
size for depth k circuits of bottom fan-in 1. Note that a depth k circuit of
bottom fan-in 1 is actually a depth $k - 1$ circuit.
We prove the above result with a new probability space of restrictions.
We start with the following lemma.
Lemma 4.1 Partition the Boolean variables $\{x_1, \ldots, x_n\}$ into groups
of c variables each. For each group, randomly pick r variables and assign
them 0, and assign the remaining $c - r$ variables $*$. Let $\phi$ be an or
of a subset $S_\phi$ of $\{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$ such
that $S_\phi$ contains at least h negative literals $\bar{x}_i$. Then with
the above random assignment,
$$\Pr[\phi \not\equiv 1] \le ((c - r)/c)^h$$
Proof. Let $s = n/c$ be the number of groups in the partition given in the
statement of the lemma. We first rename the literals in the set $S_\phi$ so
that $\bar{x}^{(d)}_1, \ldots, \bar{x}^{(d)}_{j_d}$, $d = 1, \ldots, s$,
$j_1 + \cdots + j_s = h$, are h negative literals in $S_\phi$, where
$x^{(d)}_1, \ldots, x^{(d)}_{j_d}$ belong to the same group in the partition,
$d = 1, \ldots, s$.
Let $A^{(d)}_t$ be the event that the variable $x^{(d)}_t$ is not assigned 0
by the random assignment, and let $E_{\phi \not\equiv 1}$ be the event that
$\phi$ is not identical to 1. Then
$$E_{\phi \not\equiv 1} \subseteq \bigcap_{d=1}^{s} (A^{(d)}_1 \cap \cdots \cap A^{(d)}_{j_d})$$
Thus
$$\Pr[\phi \not\equiv 1] = \Pr[E_{\phi \not\equiv 1}] \le \Pr\Big[\bigcap_{d=1}^{s} (A^{(d)}_1 \cap \cdots \cap A^{(d)}_{j_d})\Big] = \prod_{d=1}^{s} \Pr[A^{(d)}_1 \cap \cdots \cap A^{(d)}_{j_d}]$$
The last equality holds because the event
$A^{(d)}_1 \cap \cdots \cap A^{(d)}_{j_d}$ and the event
$A^{(d')}_1 \cap \cdots \cap A^{(d')}_{j_{d'}}$ are independent for
$d \ne d'$.
Now consider $\Pr[A^{(d)}_1 \cap \cdots \cap A^{(d)}_{j_d}]$. If
$j_d > c - r$, then by the way we assign the dth group, at least one of the
variables $x^{(d)}_1, \ldots, x^{(d)}_{j_d}$ is assigned 0. Therefore,
$$\Pr[A^{(d)}_1 \cap A^{(d)}_2 \cap \cdots \cap A^{(d)}_{j_d}] = 0$$
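If $j_d \le c - r$, then
$$\Pr[A^{(d)}_1 \cap A^{(d)}_2 \cap \cdots \cap A^{(d)}_{j_d}] = \Pr[A^{(d)}_1] \cdot \Pr[A^{(d)}_2 \mid A^{(d)}_1] \cdot \Pr[A^{(d)}_3 \mid A^{(d)}_1 \cap A^{(d)}_2] \cdots \Pr[A^{(d)}_{j_d} \mid A^{(d)}_1 \cap \cdots \cap A^{(d)}_{j_d - 1}]$$
Note that
$$\Pr[A^{(d)}_i \mid A^{(d)}_1 \cap \cdots \cap A^{(d)}_{i-1}] = \frac{c - r - i + 1}{c - i + 1} \le \frac{c - r}{c}$$
Thus,
$$\Pr[A^{(d)}_1 \cap A^{(d)}_2 \cap \cdots \cap A^{(d)}_{j_d}] \le ((c - r)/c)^{j_d}$$
This gives directly
$$\Pr[\phi \not\equiv 1] \le \prod_{d=1}^{s} \Pr[A^{(d)}_1 \cap \cdots \cap A^{(d)}_{j_d}] \le \prod_{d=1}^{s} \left(\frac{c - r}{c}\right)^{j_d} = \left(\frac{c - r}{c}\right)^{h}$$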
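The per-group step of this proof can be verified exactly for small parameters. Multiplying the conditional probabilities above gives $\binom{c-j}{r} / \binom{c}{r}$ for the probability that $j$ fixed variables of one group all avoid the value 0, and the sketch below compares that value with $((c-r)/c)^j$ using exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def prob_all_avoid_zero(c, r, j):
    """Exact probability that j fixed variables of a group of size c are all
    left unset when r of the c variables are chosen uniformly and set to 0.
    Requires 0 <= j <= c and 0 <= r <= c."""
    return Fraction(comb(c - j, r), comb(c, r))

def lemma_bound(c, r, j):
    """The bound ((c - r) / c)^j from the proof of Lemma 4.1."""
    return Fraction(c - r, c) ** j
```

With c = 2 and r = 1, which is the setting used for the pairs of variables in the proof of Theorem 4.3, both quantities equal 1/2 for a single variable.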
Similarly we can prove
Lemma 4.2 Partition the Boolean variables $\{x_1, \ldots, x_n\}$ into groups
of c variables each. For each group, randomly pick r variables and assign
them 1, and assign the remaining $c - r$ variables $*$. Let $\phi$ be an or
of a subset $S_\phi$ of $\{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$ such
that $S_\phi$ contains at least h positive literals $x_i$. Then with the
above random assignment,
$$\Pr[\phi \not\equiv 1] \le ((c - r)/c)^h$$
Now we are ready for the main theorem of this section.
Theorem 4.3 The function $f^{m,2}_k$ cannot be computed by any depth $k - 1$
circuit of size bounded by
$2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$ for $m > m_0$, where
$m_0$ is an absolute constant.
Proof. To simplify the expressions, we let
$s = \frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}$. Suppose that the
theorem is not true and that there is a depth $k - 1$ circuit C of size $2^s$
that computes the function $f^{m,2}_k$. Furthermore, we assume that the gates
in the bottom level of C are or gates (the case that the bottom level gates
of C are and gates can be proved similarly).
Randomly pick one variable from each pair $x_{2i-1}$ and $x_{2i}$ and assign
it 0. This will reduce the tree circuit $C^{m,2}_k$ defining $f^{m,2}_k$ to
the tree circuit $C^{m,m}_{k-1}$ defining $f^{m,m}_{k-1}$.
Let $\phi$ be an or gate in the bottom level of the circuit C such that
$\phi$ has more than s negative literals in its input. Then by Lemma 4.1,
$$\Pr[\phi \not\equiv 1] \le (1/2)^{s+1}$$
Let $\phi_1, \ldots, \phi_r$, $r \le 2^s$, be all the gates in the bottom
level of the circuit C such that there are more than s negative literals in
their input. Then
$$\Pr[\phi_1 \not\equiv 1 \vee \cdots \vee \phi_r \not\equiv 1] \le \Pr[\phi_1 \not\equiv 1] + \cdots + \Pr[\phi_r \not\equiv 1] \le 2^s (1/2)^{s+1} = 1/2$$
Thus,
$$\Pr[\phi_1 \equiv 1 \wedge \cdots \wedge \phi_r \equiv 1] \ge 1/2$$
Therefore, there is an assignment that converts the circuit $C^{m,2}_k$ to
the circuit $C^{m,m}_{k-1}$ and eliminates all gates in the bottom level of
the circuit C in whose input there are more than s negative literals. Let the
circuit obtained from C by this assignment be $C'$.
Now partition the input of the function $f^{m,m}_{k-1}$ into groups of m
variables each such that each group corresponds to the inputs to a bottom
level gate of the tree circuit $C^{m,m}_{k-1}$. Randomly pick half of the
variables in each group and assign them 1. The circuit $C^{m,m}_{k-1}$ under
such an assignment is converted to the circuit $C^{m,m/2}_{k-1}$ defining the
function $f^{m,m/2}_{k-1}$.
Let $\phi$ be an or gate in the bottom level of the circuit $C'$ with more
than s positive literals in its input. Then by Lemma 4.2,
$$\Pr[\phi \not\equiv 1] \le (1/2)^{s+1}$$
Let $\phi_1, \ldots, \phi_t$, $t \le 2^s$, be all the gates in the bottom
level of the circuit $C'$ with more than s positive literals in their input.
Then
$$\Pr[\phi_1 \not\equiv 1 \vee \cdots \vee \phi_t \not\equiv 1] \le \Pr[\phi_1 \not\equiv 1] + \cdots + \Pr[\phi_t \not\equiv 1] \le 2^s (1/2)^{s+1} = 1/2$$
Thus,
$$\Pr[\phi_1 \equiv 1 \wedge \cdots \wedge \phi_t \equiv 1] \ge 1/2$$
Therefore, there is an assignment that converts the circuit $C^{m,m}_{k-1}$
to the circuit $C^{m,m/2}_{k-1}$ and eliminates all gates in the bottom level
of $C'$ that have more than s positive literals in their input. Let the
circuit obtained from $C'$ by this assignment be $C''$.
Since each gate in the bottom level of the circuit $C''$ has neither more
than s negative literals nor more than s positive literals in its input, the
bottom fan-in of the circuit $C''$ is at most
$2s = \frac{1}{6\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}$, which is smaller than
$\frac{m/2}{25e(k-1)\log m}$ for sufficiently large m. Thus, we have
constructed a circuit $C''$ of depth $k - 1$, bottom fan-in less than
$\frac{m/2}{25e(k-1)\log m}$, and size bounded by
$2^s = 2^{\frac{1}{12\sqrt{2(k-1)}}\sqrt{\frac{m}{\log m}}}$ such that $C''$
computes the function $f^{m,m/2}_{k-1}$. This contradicts Theorem 3.3.
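The phrase "for sufficiently large m" in the last step can be made tangible by comparing the two quantities directly. The sketch below returns $2s$ and the fan-in limit $\frac{m/2}{25e(k-1)\log m}$ required to invoke Theorem 3.3; the function name is hypothetical and base-2 logarithms are an assumption.

```python
from math import e, log2, sqrt

def fan_in_gap(m, k):
    """Return (2s, limit) where s = sqrt(m / log m) / (12 sqrt(2 (k - 1)))
    and limit = (m / 2) / (25 e (k - 1) log m)."""
    s = sqrt(m / log2(m)) / (12 * sqrt(2 * (k - 1)))
    limit = (m / 2) / (25 * e * (k - 1) * log2(m))
    return 2 * s, limit
```

Under these assumptions and with k = 3, the inequality $2s < \text{limit}$ already holds at $m = 2^{20}$ but fails at $m = 2^{10}$, so the requirement on m is not vacuous.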
The following corollary will be used in Section 6.
Corollary 4.4 The function $f^{m,2}_k$ can be computed by a circuit of depth
k, linear size, and bottom fan-in 2, but cannot be computed by any depth
$k - 1$ circuit of polynomial size.
Corollary 4.5 For each pair of integers k, c > 1, there are functions that
are computable by circuits of linear size and depth k with bottom fan-in c
but require exponential size for circuits of depth $k - 1$.
5 Trade-off between bottom fan-in and size
We first summarize the results in the previous two sections in the following
theorem.
Theorem 5.1 For all integers $b \ge 2$ and sufficiently large m, the function
$f^{m,b}_k$ can be computed by a depth k circuit of linear size and bottom
fan-in b, but requires size larger than
$2^{\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}}$ for depth k circuits of
bottom fan-in $\frac{b}{25ek\log m}$.
Proof. For the case $2 \le b \le k\log m$, since
$\frac{b}{25ek\log m} < 1$, the theorem is implied by Theorem 4.3. The case
$k\log m < b \le \sqrt{km\log m/2}$ is proved in Theorem 3.2. For the case
$\sqrt{km\log m/2} < b < 2\sqrt{km\log m/2}$, since
$\frac{b}{25ek\log m} \le \frac{\sqrt{km\log m/2}}{12k\log m}$, the theorem
is implied by Theorem 3.2. Finally, the case $b \ge 2\sqrt{km\log m/2}$ is
proved by Theorem 3.3.
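The case analysis in this proof amounts to a lookup on b. A minimal sketch follows; the function name is hypothetical, base-2 logarithms are an assumption, and the two middle cases are merged because the proof handles both through Theorem 3.2.

```python
from math import log2, sqrt

def theorem_for(m, k, b):
    """Name the earlier result that the proof of Theorem 5.1 invokes
    for the given bottom fan-in b."""
    if b < 2:
        raise ValueError("Theorem 5.1 assumes b >= 2")
    root = sqrt(k * m * log2(m) / 2)
    if b <= k * log2(m):
        return "Theorem 4.3"
    if b < 2 * root:
        return "Theorem 3.2"   # covers b <= root directly and root < b < 2*root
    return "Theorem 3.3"
```

For m = 10000 and k = 3, the boundaries fall at roughly 39.9 and 892.9 under these assumptions.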
A number of important consequences follow directly from Theorem 5.1.
Theorem 5.2 For any integers $k \ge 1$ and $h \ge 1$, and for any real
number r, there are functions that are computable by circuits of linear size
and depth k with bottom fan-in $O(\log^h n)$, but require exponential size
for depth k circuits of bottom fan-in $r\log^{h-1} n$.
Proof. Let $b = 25ek(k-1)^{h-1} r \log^h m$. Note that in this case, the
number of variables in the function $f^{m,b}_k$ is $n \le m^{k-1}$ for m
large enough. Thus, $\log m \le \log n \le (k-1)\log m$. By the definition,
the function $f^{m,b}_k$ can be computed by a depth k and linear size circuit
with bottom fan-in $b = O(\log^h n)$. On the other hand, according to
Theorem 5.1, the function $f^{m,b}_k$ requires exponential size for depth k
circuits whose bottom fan-in is $r(k-1)^{h-1}\log^{h-1} m$. Note that
$r(k-1)^{h-1}\log^{h-1} m$ is at least as large as $r\log^{h-1} n$.
By more careful selections of the bottom fan-in b in Theorem 5.1, combined
with a padding technique, we are able to obtain general results for the
trade-off between circuit size and circuit bottom fan-in. We illustrate this
technique by the following theorem, which can be easily extended to other
cases using the same technique.
Theorem 5.3 For any integer $k \ge 1$ and for any real number
$\epsilon > 0$, there is a function $F^\epsilon_k$ that is computable by a
circuit of linear size and depth k with bottom fan-in log n, but requires
superpolynomial size for depth k circuits of bottom fan-in
$O(\log^{1-\epsilon} n)$.
Proof. Choose h such that
h
h+1
> 1   , and then use Theorem 5.2
to choose a function f
m;b
k
of  m
k 1
variables which can be computed by
a depth k circuit of linear size and bottom fan-in b = 25ek log
h+1
m but
requires size
2
1
12
p
2k
p
m
logm
(1)
when the bottom fan-in is  log
h
m.
Now make the function $f_k^{m,b}$ formally the function $F_k^{\epsilon}$ of $n = 2^{25ek \log^{h+1} m}$
variables by adding dummy variables that are not used. The theorem now
follows for the function $F_k^{\epsilon}$ since the size bound (1) is superpolynomial in $n$
and $c \log^{1-\epsilon} n < \log^{h} m$ for any fixed constant $c$ when $m$ is sufficiently large.
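The two closing claims can be checked directly from the padding, which gives $\log n = 25ek\log^{h+1} m = b$. The following is a sketch of that calculation; it reads the size bound (1) as $2^{\frac{1}{12\sqrt{2k}}\sqrt{m/\log m}}$, and the constants $c$ and $d$ are arbitrary fixed positive constants introduced here only for illustration.

```latex
% (i) Fan-in comparison. The choice h/(h+1) > 1 - \epsilon is equivalent to
%     (h+1)(1-\epsilon) < h, so the exponent of log m on the left is smaller.
\[
c\,\log^{1-\epsilon} n
  = c\,(25ek)^{1-\epsilon}\,(\log m)^{(h+1)(1-\epsilon)}
  < \log^{h} m
  \quad\text{for all sufficiently large } m .
\]
% (ii) Size comparison. For any fixed d, n^d is 2 raised to a polylogarithm of m,
%      while the exponent in (1) grows like sqrt(m / log m).
\[
n^{d} = 2^{\,25ekd\,\log^{h+1} m}
  < 2^{\frac{1}{12\sqrt{2k}}\sqrt{\frac{m}{\log m}}}
  \quad\text{for all sufficiently large } m .
\]
```

Item (i) shows that any bottom fan-in of the form $c\log^{1-\epsilon} n$ falls below the threshold $\log^{h} m$, and item (ii) shows that the resulting size bound exceeds every polynomial in $n$.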
In particular, if we let $\epsilon = 1$ and $h = 1$, then we obtain the following
corollary that will be used in Section 6.
Corollary 5.4 For any integer $k \ge 1$, there is a function $F_k$ that is
computable by a circuit of linear size and depth $k$ with bottom fan-in $\log n$, but
requires superpolynomial size for depth $k$ circuits of bottom fan-in $O(1)$.
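For concreteness, the instantiation behind the corollary can be written out. This only restates the proof of Theorem 5.3 with $\epsilon = 1$ and $h = 1$; the constant $c$ is an arbitrary fixed constant used for illustration.

```latex
% Instance of the proof of Theorem 5.3 with \epsilon = 1 and h = 1.
\begin{align*}
\tfrac{h}{h+1} = \tfrac{1}{2} &> 0 = 1-\epsilon, \\
b = 25ek\log^{2} m, \qquad n &= 2^{25ek\log^{2} m}, \qquad \log n = b, \\
c\,\log^{1-\epsilon} n = c &< \log m \quad\text{for all sufficiently large } m .
\end{align*}
```

So every constant bottom fan-in eventually lies below the threshold $\log^{h} m = \log m$ at which the superpolynomial size bound applies.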
6 Input read-modes of Turing machines
An important application of the above investigation is to the input read-
modes of a sublinear-time alternating Turing machine, which is an important
computational model in the study of complexity classes.
To make sublinear-time Turing machines meaningful, we allow a Turing
machine to have a random access input tape plus a read-write input address
tape, such that the Turing machine has access to the bit of the input tape
denoted by the contents of the input address tape.
An $O(\log n)$-time alternating Turing machine (log-time ATM) is defined
as an extension of the $O(\log n)$-time deterministic Turing machines in the
usual way [9]. Given an input, the computation of a log-time ATM $M$ can
be represented by an $\wedge$-$\vee$ tree. Each computation path in the $\wedge$-$\vee$ tree can
be divided into phases, which are the maximal subpaths in which $M$ does
not make alternations. The first configuration in each phase is called an
alternation (configuration). In particular, the starting configuration of $M$ is
always an alternation.
A number of input read-modes for sublinear-time alternating Turing
machines have appeared in the literature. The standard input read-mode
introduced by Chandra, Kozen, and Stockmeyer [9] allows a computation path
of an $O(\log n)$-time alternating Turing machine to read up to $\Theta(\log n)$ input
bits. Ruzzo [16] proposed an input read-mode in which each computation
path can read at most one input bit and the reading must be performed at
the end of the path. An input read-mode studied by Sipser [17] insists that
the input address tape be always reset to blank after each input reading so
that each input reading takes time $\Omega(\log n)$.
It can be shown that many complexity classes such as $NC^k$ for $k \ge 1$
and $AC^k$ for $k \ge 0$ remain the same for all these input read-modes of
alternating Turing machines. On the other hand, it was unknown whether
these input read-modes affect the classes of lower complexity such as the
levels in the logarithmic time hierarchy. Recently, Cai and Chen [6] have
demonstrated how each level of the logarithmic time hierarchy based on each
of the above input read-modes can be characterized by a uniform family of
circuits. Combining these characterizations with the separation results given
in the previous sections, we are able to show that all these input read-modes
are distinct.
Formally, the logarithmic time hierarchy is defined to be the union of the
following classes:
$$\Pi_1, \Pi_2, \ldots, \Pi_k, \ldots$$
where $\Pi_k$ is the class of languages accepted by a log-time ATM that always
starts with an $\wedge$-state and makes at most $k$ alternations.
The above definition ignores the input read-modes of the log-time ATMs
and thus is not very precise. To be more precise, we will call
$$\Pi_1^U, \Pi_2^U, \ldots, \Pi_k^U, \ldots$$
the logarithmic time hierarchy based on Chandra-Kozen-Stockmeyer's model,
$$\Pi_1^R, \Pi_2^R, \ldots, \Pi_k^R, \ldots$$
the logarithmic time hierarchy based on Ruzzo's model, and
$$\Pi_1^S, \Pi_2^S, \ldots, \Pi_k^S, \ldots$$
the logarithmic time hierarchy based on Sipser's model, where $\Pi_k^U$ (resp. $\Pi_k^R$,
$\Pi_k^S$) is the class of languages accepted by a log-time ATM based on
Chandra-Kozen-Stockmeyer's input read-mode (resp. on Ruzzo's input read-mode,
on Sipser's input read-mode) that always starts with an $\wedge$-state and makes
at most $k$ alternations.
Theorem 6.1 ([6]) For all integers $k \ge 1$,
(1). If a language $L$ is in the class $\Pi_k^R$, then $L$ is accepted by a
$\Pi_k^{\mathrm{poly}}$-family of circuits;
(2). If a language $L$ is in the class $\Pi_k^S$, then $L$ is accepted by a
$\Pi_k^{\mathrm{poly},\,c}$-family of circuits for some constant $c$;
(3). If a language $L$ is in the class $\Pi_k^U$, then $L$ is accepted by a
$\Pi_k^{\mathrm{poly},\,d \log n}$-family of circuits for some constant $d$.
According to the definitions, it is easy to see that $\Pi_k^R \subseteq \Pi_k^S \subseteq \Pi_k^U$. A
proof for the inclusion $\Pi_k^U \subseteq \Pi_{k+1}^R$ can be found in [6]. Our main result for
this section is that all these inclusions are strict, as proved in the following
theorem.
Theorem 6.2 For any integer $k \ge 1$, we have
$$\Pi_k^R \subset \Pi_k^S \subset \Pi_k^U \subset \Pi_{k+1}^R$$
where $\subset$ means "proper subset".
Proof.
(1). $\Pi_k^R \subset \Pi_k^S$.
It has been proved by Cai and Chen in [6] that $\Pi_1^R \subset \Pi_1^S$. Thus, we only
need to prove the strict inclusion for $k \ge 2$.
Without loss of generality, we suppose that the output gate of the tree
circuit $C_{k+1}^{m,2}$ defining the function $f_{k+1}^{m,2}$ is an and gate (otherwise, we
consider the negation of the function $f_{k+1}^{m,2}$). Moreover, to make the operations
such as $m^k$, $\log m$, and $\sqrt{m}$ feasible within $O(\log n)$ deterministic time, we
consider only the case where $m$ is a power of 2.
Let $S_1$ be the language whose characteristic function is given by the
functions $f_{k+1}^{m,2}$, where $m$ is a power of 2 (in particular, a string is not in
the set $S_1$ if its length does not match the number of variables for any such
function $f_{k+1}^{m,2}$). We first construct a log-time ATM $M_1$ that accepts the set
$S_1$ as follows. On input $x$, $M_1$ first computes the length $n$ of $x$. This can
be done in deterministic $O(\log n)$ time by reading $O(\log n)$ input bits [2].
Then $M_1$ verifies that $n = 2m^{k-1}\sqrt{m/\log m}$ for some integer $m$ that is a
power of 2. After this, $M_1$ simply traces the tree circuit $C_{k+1}^{m,2}$ defining the
function $f_{k+1}^{m,2}$, except that in the $k$th phase, $M_1$ reads the two consecutive
input bits for the corresponding bottom level gate and directly computes
the value for the gate. Since the output gate of the circuit $C_{k+1}^{m,2}$ is an and
gate, the log-time ATM $M_1$ starts with an $\wedge$-state and makes at most $k$
alternations.
According to our assumption, $k \ge 2$. Thus, the log-time ATM $M_1$ reads
at most two input bits in its last phase. By Theorem 3.1 in Cai and Chen [6],
$M_1$ can be simulated by a log-time ATM based on Sipser's input read-mode
that always starts with an $\wedge$-state and makes at most $k$ alternations. This
proves that the set $S_1$ is in the class $\Pi_k^S$.
Suppose that $S_1$ is also in the class $\Pi_k^R$. Then by Theorem 6.1(1), $S_1$ is
accepted by a $\Pi_k^{\mathrm{poly}}$-family of circuits. Thus, for any integer $m$ that is a power
of 2, the function $f_{k+1}^{m,2}$ is computable by a depth $k$ circuit of polynomial size.
This contradicts Corollary 4.4.
This shows $\Pi_k^R \subset \Pi_k^S$.
(2). $\Pi_k^S \subset \Pi_k^U$.
The proof is similar to that for Case (1). Consider the function $F_{k+1}$
in Corollary 5.4, which is a function of $n$ variables obtained from the
function $f_{k+1}^{m,b}$, $b = 25e(k+1) \log^{2} m$, by adding dummy variables, where
$n = 2^{25e(k+1) \log^{2} m}$. We also make assumptions similar to those made for
Case (1). Thus, the output gate of the tree circuit $C_{k+1}^{m,b}$ defining the
function $f_{k+1}^{m,b}$ is an and gate, and $m$ is a power of 2. Under these assumptions,
it is easy to see that the set $S_2$ whose characteristic function is given by the
functions $F_{k+1}$ in Corollary 5.4 is in the class $\Pi_k^U$: A log-time ATM $M_2$ first
verifies the length of the input and traces the tree circuit $C_{k+1}^{m,b}$ defining the
corresponding function $f_{k+1}^{m,b}$, except that in the $k$th phase, $M_2$ reads directly
a consecutive block of $b$ input bits that are the inputs to the corresponding
bottom level gate. Note that the $b$ consecutive input bits can be read in
deterministic $O(b + \log n)$ time [4], and that $b$ is logarithmic in the input
length $n$ of the function $F_{k+1}$. This proves that the language $S_2$ is in the
class $\Pi_k^U$.
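The statement that $b$ is logarithmic in $n$ follows from the padding. A short check, under the assumption (made here, not stated in the text) that logarithms are taken base 2:

```latex
% Assumes log denotes the base-2 logarithm; parameters are those of Case (2).
\[
\log n = \log 2^{\,25e(k+1)\log^{2} m} = 25e(k+1)\log^{2} m = b,
\qquad\text{so}\qquad
O(b + \log n) = O(\log n).
\]
```

Reading the whole block of $b$ bottom-level inputs therefore stays within the logarithmic time budget of $M_2$.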
Suppose that $S_2$ is also in the class $\Pi_k^S$. By Theorem 6.1(2), the language
$S_2$ is accepted by a $\Pi_k^{\mathrm{poly},\,c}$-family of circuits for some constant $c$. That is,
the function $F_{k+1}$ is computable by a depth $k+1$ and bottom fan-in $c$
circuit whose size is polynomial. But this contradicts Corollary 5.4. Thus,
$\Pi_k^S \subset \Pi_k^U$.
(3). $\Pi_k^U \subset \Pi_{k+1}^R$.
The proof is similar to those for the other two cases. Let $S_3$ be the
language whose characteristic function is given by the Sipser functions $f_{k+1}^{m}$.
Then the set $S_3$ can be accepted by a log-time ATM $M_3$ that always starts
with an $\wedge$-state and makes at most $k+1$ alternations. Moreover, the last
phase of $M_3$ reads at most one input bit. By Theorem 3.1 in Cai and
Chen [6], $M_3$ can be simulated by a log-time ATM based on Ruzzo's input
read-mode that always starts with an $\wedge$-state and makes at most $k+1$
alternations. Thus, the set $S_3$ is in the class $\Pi_{k+1}^R$. On the other hand, by
Theorem 6.1(3), $S_3 \in \Pi_k^U$ would imply that $S_3$ is accepted by a
$\Pi_k^{\mathrm{poly},\,O(\log n)}$-family of circuits. That would in turn imply that the Sipser
function $f_{k+1}^{m}$ is computable by a depth $k+1$ circuit of polynomial size whose
bottom fan-in is $O(\log n)$, contradicting Theorem 2.1.
This completes the proof.
Corollary 6.3 For each $k \ge 1$, the $k$th levels of the logarithmic time
hierarchy based on Chandra-Kozen-Stockmeyer's input read-mode, Sipser's input
read-mode, and Ruzzo's input read-mode are all distinct.
Acknowledgements.
The second author would like to thank Mike Sipser for an early discussion
that initiated this line of research. He is also grateful to Ken Regan for his
comments and constructive discussions. Finally, the authors are especially
thankful to two anonymous referees for comments and suggestions that have
improved the presentation. In particular, one of the referees pointed out a
technical bug in an earlier version of the present paper.
References
[1] M. Ajtai, $\Sigma_1^1$-formulae on finite structures, Ann. Pure Appl. Logic
24, (1983), pp. 1-48.
[2] D. Barrington, N. Immerman, and H. Straubing, On uniformity
within $NC^1$, J. Comput. System Sci. 41, (1990), pp. 274-306.
[3] R. B. Boppana and M. Sipser, The complexity of finite functions,
in J. van Leeuwen, ed., Handbook of Theoretical Computer Science Vol.
A, Elsevier, Amsterdam, 1990, pp. 757-804.
[4] S. R. Buss, The Boolean formula value problem is in ALOGTIME,
Proc. 19th Annual ACM Symposium on Theory of Computing, (1987),
pp. 123-131.
[5] L. Cai and J. Chen, Fixed parameter tractability and approximability
of NP-hard optimization problems, Proc. 2nd Israel Symposium on
Theory of Computing and Systems, (1993), pp. 118-126. Journal version
submitted.
[6] L. Cai and J. Chen, On input read-modes of alternating Turing
machines, Theoretical Computer Science 148, (1995), pp. 33-55.
[7] L. Cai and J. Chen, On the amount of nondeterminism and the
power of verifying, SIAM Journal on Computing, to appear.
[8] L. Cai, J. Chen, R. G. Downey, and M. R. Fellows, On the
structure of parameterized problems in NP, Information and Computa-
tion 123, (1995), pp. 38-49.
[9] A. K. Chandra, D. C. Kozen, and L. J. Stockmeyer, Alterna-
tion, J. Assoc. Comput. Mach. 28, (1981), pp. 114-133.
[10] J. Chen, Characterizing parallel hierarchies by reducibilities, Infor-
mation Processing Letters 39, (1991), pp. 303-307.
[11] S. Cook, A taxonomy of problems with fast parallel algorithms,
Information and Control 64, (1985), pp. 2-22.
[12] M. Furst, B. Saxe, and M. Sipser, Parity, circuits, and the
polynomial-time hierarchy, Math. Systems Theory 17, (1984), pp. 13-
27.
[13] R. L. Graham, D. E. Knuth, and O. Patashnik, Concrete Mathe-
matics: A Foundation for Computer Science, Addison-Wesley, Reading,
MA, 1989.
[14] J. Håstad, Computational limitations for small-depth circuits, The
MIT Press, Cambridge, MA, 1986.
[15] J. Håstad, Almost optimal lower bounds for small depth circuits,
in S. Micali, ed., Advances in Computing Research 5, JAI Press Inc.,
Greenwich, CT, 1989, pp. 143-170.
[16] W. L. Ruzzo, On uniform circuit complexity, J. Comput. System
Sci. 22, (1981), pp. 365-383.
[17] M. Sipser, Borel sets and circuit complexity, Proc. 15th Annual ACM
Symposium on Theory of Computing, (1983), pp. 61-69.
[18] A. C. Yao, Separating the polynomial-time hierarchy by oracles, Proc.
26th Annual IEEE Symposium on Foundations of Computer Science,
(1985), pp. 1-10.
