{ } : set brackets
[ ] : square brackets, used like parentheses
′ : the "prime" sign
1 : the numeral one
l : the italic letter ell, used in expressions such as $n^l$
| : a vertical bar, used in expressions such as $|x|$
Introduction
In this paper, we present a number of depth-reduction theorems: theorems that state that certain classes of circuits of a given depth can be simulated efficiently by circuits of smaller depth. Before presenting our results, let us first contrast our work with earlier work showing that in a variety of settings, depth reduction cannot be achieved.
Some of the strongest lower bound results in complexity theory are bounds on the size required to compute functions on circuits of AND and OR gates of unbounded fan-in. (Throughout this paper, unless we explicitly say otherwise, negated input bits are available at the input level of the circuit, and thus NOT gates are not needed, by DeMorgan's Laws.) It was shown in Yao (1985) and Håstad (1987) that for all $k$ there is a function computed by such circuits of linear size and depth $k$ that cannot be computed on such circuits of depth $k-1$ and size less than exponential. That is, Yao (1985) and Håstad (1987) show that depth reduction is impossible in this setting.
Another circuit model that is of interest is the threshold circuit model; a threshold circuit is a circuit composed of MAJORITY gates. (A MAJORITY gate is a gate that takes the value 1 iff more than half of its inputs have the value 1. Note that in the literature, the term "threshold circuit" is used to refer to any of a number of closely related models of computation. In particular, some work considers a model in which arbitrary real numbers are allowed as weights on the outputs of subcircuits; this model can be simulated efficiently by MAJORITY circuits by at most doubling the depth (Siu and Bruck (1991)). Throughout this paper, all results will be stated in terms of the model defined using MAJORITY gates.) Threshold circuits are studied in part because MAJORITY gates have roughly the same computational power as integer multiplication gates (Chandra et al. (1984)), and also because the "neural net" model of the brain is computationally equivalent to a threshold circuit (Parberry and Schnitger (1989), Parberry (1990)). Little is known about depth reduction in the context of threshold circuits; the best results in this direction are the results of Hajnal et al. (1987), where it was shown that there is a language recognized by a family of polynomial-size depth three majority circuits that cannot be recognized by polynomial-size depth two majority circuits.
In order to better understand the threshold circuit model, Yao (1989) . That is, in contrast to the monotone case, depth $k$ circuits can be simulated by depth 3 threshold circuits, with only a "modest" increase in size.
Another circuit model that has received considerable attention consists of circuits with AND and OR gates and MODp gates. Smolensky (1987), building on work by Razborov (1987), showed that constant-depth circuits of this type with MODp gates for a fixed prime modulus $p$ can be approximated by low-degree polynomials, and then used this fact to derive lower bounds on the size of such circuits to compute MODq, for $q \neq p$. We show that for circuits of this type of size $2^{\log^{O(1)} n}$, constant depth is no more powerful than depth 4. Our results are closely related to the important work of Toda (1991), showing that the polynomial hierarchy is contained in $P^{PP}$. Connections between the polynomial hierarchy and constant-depth families of circuits were established in Furst et al. (1984), and similar observations regarding threshold circuits and PP were made in Torán (1991). By making use of those connections, our results can be viewed as providing a circuit-based interpretation of some of Toda's work.
The paper is organized as follows. In Section 2 we present the basic definitions and notational conventions. In Section 3 we prove some elementary lemmas using algebraic properties allowing us to convert a circuit into a simpler form. In Section 4 we prove our main depth reduction results in the framework of nonuniform circuit complexity. Section 5 contains the main lemmas needed to make these depth reduction results carry over into the uniform setting, and our uniform depth reduction results are presented in Sections 6 and 7. A summary is found in Section 8.
Definitions and Background
We assume familiarity with the basics of circuit complexity. For additional background, see Boppana and Sipser (1990), Barrington et al. (1990), and Ruzzo (1981).
A family of circuits is a set $\{C_n : n \ge 1\}$ where each $C_n$ is a circuit for inputs of length $n$. In later sections of this paper, we will require that the function $n \mapsto C_n$ be easily computable in some sense (which will be made precise there).
Such circuit families are called uniform. If no such restriction on constructibility is imposed, the circuit families are called nonuniform. It is no loss of generality to consider only circuits that are "leveled" in the sense that each gate can be assigned to a "level" denoting the distance from the gate to the input level, where inputs to a gate at level $i$ come only from gates at level $i-1$. Thus the inputs to a circuit are at level 0, the gates that directly process those inputs are at level 1, and the output gate of a depth $k$ circuit is at level $k$.
We will also have cause to consider circuit complexity classes defined by other size bounds and using other types of gates. That leads to the following definition:
Definition 1 (i) SIZE(s(n))DEPTH(d(n)) GATES(S) denotes the class of languages which can be recognized by circuit families of size s(n) and depth d(n) where the types of gates which can be used are in the set S.
(ii) BPSIZE(s(n))DEPTH(d(n)) GATES(S) denotes the analogous class which is defined in terms of probabilistic circuits. That is, the circuits have some number of probabilistic bits as auxiliary inputs, drawn from the uniform distribution.
Depth Reduction and the Distributive Law
The main tool that we will use in later sections to carry out depth reduction is to view a circuit in a certain form as a polynomial of low degree, and then express that polynomial in standard form. (Of course, it is not a new observation that circuits correspond to polynomials over finite fields. In particular, this connection was used very effectively in Razborov (1987) and Smolensky (1987).) In this section, we derive the particular bounds that we will need later on.
As the $\oplus$-operation is exactly the same as addition in the field GF(2) and the $\wedge$-operation is exactly the same as multiplication, we can view a $\oplus$-gate of fan-in $s$ as a sum over $s$ summands, and similarly an $\wedge$-gate of fan-in $r$ as a product of $r$ factors. In this way, it is clear that a circuit of AND and PARITY gates may be viewed as a polynomial over GF(2). Next we note that similar observations hold for MODp gates for any prime $p$, and use this to show that depth two circuits of AND and MODp gates can efficiently simulate circuits of greater depth. Since we focus on the fan-in of the various gates, we will express this in the abbreviated form
Thus over GF(p) we can rearrange our four level circuit as follows:
The final formula represents the desired circuit.
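The rearrangement above rests on the distributive law: a product of sums over GF(p) expands into a single sum of products. The following is a minimal Python sketch of that one step, not the paper's lemma itself. The function names and the encoding of a circuit as lists of input indices are assumptions made for illustration. For $p = 2$ the product of sums corresponds directly to an AND of PARITY gates; for other primes the sketch works at the level of polynomials only.

```python
from itertools import product

def eval_product_of_sums(sums, x, p=2):
    """Evaluate prod_i (sum_{j in sums[i]} x[j]) over GF(p).

    `sums` is a list of index lists; each inner list is one sum (hypothetical encoding).
    """
    result = 1
    for idxs in sums:
        result = (result * sum(x[j] for j in idxs)) % p
    return result

def expand_product_of_sums(sums):
    """Distribute the product over the sums.

    One monomial is produced per choice of a summand from every factor,
    so the number of monomials is the product of the fan-ins of the sums.
    """
    return [tuple(choice) for choice in product(*sums)]

def eval_sum_of_products(monomials, x, p=2):
    """Evaluate sum_m prod_{j in m} x[j] over GF(p)."""
    total = 0
    for mono in monomials:
        term = 1
        for j in mono:
            term = (term * x[j]) % p
        total = (total + term) % p
    return total

def equivalent_on_all_inputs(sums, n, p=2):
    """Check by brute force that both forms agree on every 0/1 input of length n."""
    monomials = expand_product_of_sums(sums)
    return all(
        eval_product_of_sums(sums, x, p) == eval_sum_of_products(monomials, x, p)
        for x in product(range(2), repeat=n)
    )
```

With $t$ factors of fan-in $s$ each, the expanded form has $s^t$ monomials of degree $t$, which is where bounds of the shape $n^{O(\log n)}$ come from when $t$ is logarithmic.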
To complete the proof, we need only remark that each step in this transformation can easily be carried out by a Turing machine that needs only to store a constant number of indices (i.e., gate names) at any one time. Thus the entire transformation can be done using space logarithmic in the size of the resulting circuit. (Perhaps the only step where uniformity might not be entirely obvious is the step where the product is distributed over the sum. Thus consider a subcircuit of the form 
Depth Reduction for Nonuniform Circuits
In this section, we present our main depth reduction theorems. Unfortunately, the proofs presented in this section work only for nonuniform circuit families. In later sections, we show how to achieve depth reduction for uniform circuit families. Our reasons for including this section are:
The proofs are much simpler in the nonuniform setting.
Certain kinds of depth reduction are not yet known to hold in the uniform setting.
We are able to achieve slightly better size bounds in the nonuniform setting.
The proof outline may be given quite simply:
Show that constant depth circuits can be simulated by probabilistic depth two circuits. (This simulation is very similar to the result of Razborov (1987) (generalized by Barrington (1987) and Smolensky (1987)) showing that circuits with small depth can be approximated by polynomials of small degree. Our proof is also very similar to his. The difference is that we need a probabilistic circuit that works well on all inputs, as opposed to a deterministic circuit that works well on most inputs. It is possible to use the result of Smolensky (1987) to derive the result we need, but we feel it is simpler and more transparent to give a direct proof.)
Use established techniques to make these probabilistic circuits deterministic. (Exactly which techniques are used depends on the type of deterministic circuit being constructed.)
Lemma 4 For any prime $p$ and any constant $k$, there is a family of probabilistic depth-two circuits of size $n^{O(\log n)}$, computing the OR of $n$ bits, with error less than $1/n^k$. The first level of this circuit consists of ANDs of fan-in $O(\log n)$, and the second level consists of a MODp gate.
Proof: In order to simplify the exposition, assume that p = 2. The generalization to other primes is straightforward.
In order to compute the OR of $b_1, b_2, \ldots, b_n$, first consider the circuit $B_n$ with one PARITY gate, where the inputs to the parity gate are $\{1\} \cup \{\mathrm{AND}(b_i, p_i) : 1 \le i \le n\}$, where the $p_i$ are probabilistic bits. It is easy to see that if $\mathrm{OR}(b_1, \ldots, b_n) = 0$, then $B_n$ outputs 1, and if $\mathrm{OR}(b_1, \ldots, b_n) = 1$, then $B_n$ outputs 0 with probability exactly $1/2$. (When carrying out the generalization to other primes $p$, this error will be $1/p$, requiring the expression "$k \log n$" in the next paragraph to be replaced by "$ck \log n$" for some constant $c$. The statement of the lemma does not depend on $p$, however.)
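The behaviour of $B_n$, and of an AND of independent copies of it, can be checked by exhaustive enumeration for small $n$. The sketch below is illustrative only: the function names are hypothetical, it fixes $p = 2$, and it enumerates every probabilistic bit string rather than sampling, so it is practical only for tiny parameters.

```python
from fractions import Fraction
from itertools import product

def B(bits, rand):
    """PARITY of the constant 1 and AND(b_i, p_i) for every i."""
    acc = 1
    for b, r in zip(bits, rand):
        acc ^= b & r
    return acc

def C(bits, rand_blocks):
    """AND of several copies of B, each with its own probabilistic bits."""
    return int(all(B(bits, rand) for rand in rand_blocks))

def prob_C_outputs_one(bits, copies):
    """Exact probability, over all probabilistic bit strings, that C outputs 1."""
    n = len(bits)
    hits = total = 0
    for flat in product((0, 1), repeat=n * copies):
        # Split the flat string into one block of n probabilistic bits per copy.
        blocks = [flat[i * n:(i + 1) * n] for i in range(copies)]
        hits += C(bits, blocks)
        total += 1
    return Fraction(hits, total)
```

When all input bits are 0 the circuit outputs 1 with certainty. When some input bit is 1, a single copy outputs 1 with probability $1/2$, and $t$ independent copies combined by AND output 1 with probability $2^{-t}$.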
Now take $k \log n$ separate copies of $B_n$ (with independent probabilistic inputs for each copy of $B_n$) and AND these $k \log n$ circuits together. Call this new circuit $C_n$. It is immediate that if $\mathrm{OR}(b_1, \ldots, b_n) = 0$, then $C_n$ outputs 1, and if $\mathrm{OR}(b_1, \ldots, b_n) = 1$, then $C_n$ outputs 0 with probability $1 - 1/n^k$. Now by Lemma 3 (letting $s_1 = 1$, $t = k \log n$, $s_2 = n + 1$, and $r = 2$), $C_n$ can be converted into an equivalent circuit $D_n$ consisting of a PARITY gate of $n^{O(\log n)}$ AND gates, where each AND gate has fan-in $O(\log n)$. Let $E_n$ be the circuit that computes the negation of this $D_n$ (i.e., the PARITY gate has an additional 1 input). Then $E_n$ is a circuit with the properties claimed by the lemma.
ANDs of fan-in $O(\log^k n)$, and the second level consists of a MODp gate.
Proof: Again, to simplify the exposition, we assume that p = 2. The generalization to arbitrary primes p is straightforward.
The proof proceeds by induction on $k$. The nontrivial parts of the basis case are proved in Lemma 4 and Corollary 5. For the induction step, let $L$ be accepted by a family of depth $k$ circuits of polynomially-many unbounded-fan-in AND, OR, and PARITY gates. Consider the circuit $C_n$ for inputs of length $n$. Assume without loss of generality that the output gate of $C_n$ is an AND gate (the proof is entirely symmetric when it is an OR gate, it is trivial if it is a PARITY gate, and the slightly more general MODp case proceeds along similar lines, using Lemma 3). Thus $C_n$ is the AND of at most $n^l$ circuits of depth $k-1$ for some $l$. By the inductive hypothesis, each of these $n^l$ circuits may be replaced by a probabilistic depth 2 circuit of size $n^{O(\log^{k-1} n)}$, having error probability at most $1/n^a$ (where $a$ may be any constant). The resulting circuit has error probability at most $n^l/n^a$. Also, the top-level AND in this circuit can be replaced by a probabilistic depth-two circuit of the sort guaranteed by Corollary 5; the resulting circuit is of the form (...) and may be constructed to have error probability less than $1/q(n)$. (The only dependence on the polynomial $q$ is in the choice of the constant $a$ above, and in the constants in the various "$O(\log \ldots)$" terms in this expression.) The proof is completed with an appeal to Lemma 3.
Theorem 6 shows how to simulate a deterministic circuit by a probabilistic circuit of depth two. Now it remains only to simulate the probabilistic circuits by deterministic circuits; this can be done using known techniques. In order to justify the precise bounds we claim, we present the details below in Theorems 7 and 9.
Proof: We use the technique of Ajtai and Ben-Or (1984).
By Theorem 6, $L$ is accepted by a family of probabilistic $\{\wedge,\vee,\mathrm{MOD}p\}$-circuits of depth two with error probability less than $\frac{1}{2n^2}$. Now we take $n^2$ of these circuits and AND them together on level 3. The error in the positive case (the input should be accepted) is still less than $\frac{1}{2}$, while the error in the negative case (the input should be rejected) is now less than $\frac{1}{(2n^2)^{n^2}}$, which is far less than $\frac{1}{2^{n^2}}$. Now take $n$ of these depth 3 circuits and OR them together. This results in a depth 4 circuit with error less than $\frac{1}{2^n}$ in the positive case, and error less than $\frac{n}{2^{n^2}}$ in the negative case. Since the error probability is less than $2^{-n}$ in both directions, one can use the argument of Adleman (1978) and Ajtai and Ben-Or (1984) that there must be values for the random bits that always lead to the correct output. Thus the four level circuit can be made deterministic (in a nonuniform way).
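The error arithmetic in this proof can be traced with exact rational numbers. The sketch below is an illustration under the assumption that the copies use independent probabilistic bits; the function name is hypothetical. It computes upper bounds only, so where the proof says "less than" the values below are the corresponding non-strict bounds.

```python
from fractions import Fraction

def amplified_error_bounds(n):
    """Upper bounds on the two error types after the AND level and the OR level."""
    base = Fraction(1, 2 * n * n)        # bound on the error of each depth-two circuit

    # Level 3: AND of n^2 independent copies.
    pos_after_and = n * n * base         # union bound: some copy wrongly rejects
    neg_after_and = base ** (n * n)      # every copy must wrongly accept

    # Level 4: OR of n independent depth-three circuits.
    pos_after_or = pos_after_and ** n    # every AND must wrongly reject
    neg_after_or = n * neg_after_and     # union bound: some AND wrongly accepts

    return pos_after_or, neg_after_or
```

Because each depth-two circuit has error strictly below $\frac{1}{2n^2}$, the true errors are strictly below the returned values. Multiplying an error strictly below $2^{-n}$ by the $2^n$ inputs of length $n$ gives a value below 1, which is the counting step behind the existence of a single good choice of random bits.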
Theorem 7 shows that depth four suffices for $\{\wedge,\vee,\mathrm{MOD}p\}$-circuits. By using the more powerful MAJORITY gates, we can reduce the depth even further. First, however, we need an easy proposition showing that MODp gates can be replaced by MAJORITY gates in some situations, without increasing circuit depth. (This type of simulation is a more-or-less standard technique; see, for example, Bruck (1990). A more general result of this sort is proved by Bultman (1990).)
Proposition 8 If $C$ is a depth two circuit with one MAJORITY gate as output and $r$ MODp gates on level 1, where no MODp gate has more than $pm$ inputs, then $C$ is equivalent to a depth two threshold circuit with at most $2(p-1)mr + 1$ MAJORITY gates.
Proof: The depth two probabilistic circuit constructed in Theorem 6 can be converted into a deterministic depth three circuit using the technique of Proposition 4.2 in Hajnal et al. (1987) ; this involves (1) building 2n independent copies of the probabilistic depth two circuit and taking the MAJORITY of their outputs, and (2) noting that the resulting circuit has exponentially small error probability, and thus (nonuniformly) there exists a sequence of probabilistic bits that may be hardwired in, yielding a correct circuit. The resulting circuit is only polynomially larger than the original probabilistic circuit, and consists of AND gates on level 1, MODp gates on level 2, and a MAJORITY gate as the output gate.
The theorem now follows by Proposition 8 and the trivial reducibility of AND and OR to MAJORITY.
The depth-reduction theorems are even more striking when stated in terms of circuits of size $2^{\log^{O(1)} n}$.
Proof: Immediate from Theorems 6, 7, and 9, and from the result of Ajtai and Ben-Or (1984) that any probabilistic circuit may be made deterministic (in the nonuniform setting) by increasing the size by a polynomial factor, and increasing the depth by an additive constant.
Most, but not quite all, of these depth-reduction theorems are also known to hold in the setting of uniform circuit complexity. This is taken up in the following sections.
It is natural to wonder if these depth reduction results can be improved. For example, is every set in AC . Thus computing the MOD3 of $\log^{2kr+1} n$ bits in depth $k$ needs size greater than $2^{O(\log^r n)}$. However, it is well-known that this function can be computed by AC$^0$ circuits (Fagin et al. (1985), Denenberg et al. (1986)).
A Uniform Simulation
Although the proofs presented in Section 4 are quite simple, they suffer from the drawback that they are only suitable for nonuniform circuit complexity. That is, if $L$ is accepted by a family of AC$^0$ circuits $\{C_n : n \in \mathbb{N}\}$ such that the function $n \mapsto C_n$ is efficiently computable, then the results of Section 4 tell us that there exists a family $\{D_n\}$ of depth-three threshold circuits of size $2^{\log^{O(1)} n}$ accepting $L$, but we have no guarantee that there is any efficient way to construct the circuits $D_n$.
The reason for this is that the proofs in the preceding section make use of probabilistic constructions. Although we were able to make use of established techniques for turning probabilistic circuits into deterministic circuits (as in Adleman (1978) , Parberry and Schnitger (1988) , and Hajnal et al. (1987) ), these techniques seem to be inherently nonuniform.
In the literature on circuit complexity, a circuit family $C_n$ is called "uniform" if the function $n \mapsto C_n$ is "easy to compute" in some sense. There are many notions of uniformity that are worthy of consideration, each with a different notion of "easy to compute." For example, in one of the first papers to consider uniform circuit complexity, Ruzzo (1981) considers a variety of uniformity notions, and P-uniform circuit complexity is discussed in Allender (1989a). Even more relevant to this paper are the notions of uniformity discussed in Barrington et al. (1990). In that paper, Barrington et al. consider what version of uniform circuit complexity is most appropriate for use in defining classes of languages accepted by circuits of polynomial size and depth $O(1)$.
With some work, it would be possible to adapt the definitions of Barrington et al. (1990) to make them applicable to circuits of size $2^{\log^{O(1)} n}$. In the interest of simplicity, however, we do not choose to do that here. Instead, we use a rather generous notion of uniformity. By doing so, it will be obvious that the circuits we construct are uniform because of their regularity, whereas if we were to use a more stringent notion of uniformity, it would be necessary to argue at length that the circuits satisfy the requirements of the uniformity condition.
Note that any machine that constructs a circuit of size $2^{\log^{O(1)} n}$ must use at least space $\log^{O(1)} n$. That leads us to the following
Convention: Throughout the rest of this paper, a family of circuits $C_n$ will be said to be uniform if the function $n \mapsto C_n$ is computable in space $\log^{O(1)} n$.
Note that the composition of two functions computable in $\log^{O(1)} n$ space is also computable in $\log^{O(1)} n$ space. Thus when we show that a circuit $C_n$ of one sort can be converted into an equivalent circuit $D_n$ of another sort via a constant number of transformations, each of which is computable in $\log^{O(1)} n$ space, we can conclude that the circuit family $\{D_n\}$ is uniform if $\{C_n\}$ is uniform.
In order to achieve depth reduction in the setting of uniform circuits, we will drastically reduce the number of probabilistic bits used by depth two probabilistic circuits computing AND and OR. To do this we make use of the following result by Valiant and Vazirani (1986):
$v_j w_j \bmod 2$). Let $P_n(S)$ be the probability that $|S_i| = 1$ for some $i \in \{0, \ldots, n\}$. Then $P_n(S) \ge \frac{1}{4}$.
Proof: We give the proof only for the computation of OR; for the computation of AND essentially the same proof works, starting with the negated inputs and using de Morgan's laws. For MODp the statement holds trivially.
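The isolation probability $P_n(S)$ from the Valiant and Vazirani result quoted above can be computed exactly for very small $n$ by enumerating all choices of the random vectors. The sketch below assumes the standard form of that result, which is only partly legible here: $S$ is a nonempty subset of $\{0,1\}^n$, the vectors $w_1, \ldots, w_n$ are chosen uniformly from $\{0,1\}^n$, $S_0 = S$, and $S_i$ keeps those $v \in S$ whose inner product with each of $w_1, \ldots, w_i$ is 0 modulo 2. The function names are hypothetical, and bit vectors are encoded as integers.

```python
from fractions import Fraction
from itertools import product

def dot_mod2(v, w):
    """Inner product of two bit vectors encoded as ints, modulo 2."""
    return bin(v & w).count("1") % 2

def isolates(S, ws):
    """True if some S_i (i = 0..len(ws)) has exactly one element."""
    current = list(S)
    if len(current) == 1:
        return True
    for w in ws:
        # Keep only the vectors orthogonal to the next random vector.
        current = [v for v in current if dot_mod2(v, w) == 0]
        if len(current) == 1:
            return True
    return False

def isolation_probability(S, n):
    """Exact P_n(S), enumerating all choices of w_1..w_n in {0,1}^n."""
    hits = sum(isolates(S, ws) for ws in product(range(2 ** n), repeat=n))
    return Fraction(hits, 2 ** (n * n))

def worst_case(n):
    """Minimum of P_n(S) over all nonempty subsets S of {0,1}^n."""
    best = Fraction(1)
    for mask in range(1, 2 ** (2 ** n)):
        S = [v for v in range(2 ** n) if (mask >> v) & 1]
        best = min(best, isolation_probability(S, n))
    return best
```

For the full set $S = \{0,1\}^3$ a single survivor remains only when the three random vectors are linearly independent, which happens with probability $\frac{7}{8} \cdot \frac{6}{8} \cdot \frac{4}{8} = \frac{21}{64}$.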
Let $m$ be the least integer that is strictly greater than $\log_2 n$. We will construct a circuit C
Level 2 consists of $n \cdot m$ MODp gates $P_{a,k}$ ($1 \le a \le n$, $1 \le k \le m$); these gates will compute the value $a \cdot w_k$. To see how to do this, note that the value of $a \cdot w_k$ is computed by taking the PARITY of $O(\log n)$ bits. (Exactly which bits of $w_k$ take part in this computation depends on the constant $a$.) The DNF expression of this PARITY function can be expressed with $n^{O(1)}$ AND gates, with the property that on any given input, either none of the AND gates evaluates to true, or exactly one does. This can clearly be computed by a subcircuit with a MODp gate on level 2, with $n^{O(1)}$ AND gates on level 1, where the AND gates have fan-in $O(\log n)$. The collection of all these subcircuits can be produced in logarithmic space.
Level 3 consists of $n(m+1)$ gates $D_{a,k}$ ($1 \le a \le n$, $0 \le k \le m$) which shall take the value 1 if and only if $x_a = 1$ and $a \cdot w_i = 0$ for all $i \le k$. These gates can obviously be realized as AND gates with fan-in $O(\log n)$, using the $P_{a,j}$ with $j \le k$.
Level 4 consists of $m+1$ gates $E_k$ ($0 \le k \le m$) which shall take the value 1 if and only if $p$ does not divide $(p-1)$ plus the number of $a$'s ($1 \le a \le n$) such that $x_a = 1$ and $a \cdot w_i = 0$ for all $i \le k$. These gates can obviously be realized as MODp gates with fan-in $n + (p-1)$, the inputs being the outputs of the $D_{a,k}$ ($1 \le a \le n$) and $(p-1)$ constants 1. Note that in the case where $x_a = 0$ for all $a \in \{1, \ldots, n\}$, all $E_k$ have value 1.
Level 5 consists of one AND gate $F$ of fan-in $m+1$, with the $E_k$ ($0 \le k \le m$) as inputs.
This circuit will output 1 if $\mathrm{OR}(x_1, \ldots, x_n) = 0$. But if $\mathrm{OR}(x_1, \ldots, x_n) = 1$, the set of all $a$ such that $x_a = 1$ is nonempty. Thus by Theorem 12, with probability at least $\frac{1}{4}$ there is a $k$ such that $D_{a,k}$ has value 1 for exactly one $a$; hence $E_k$ has value 0, and consequently C
To amplify the probability in the case that the OR should be 1, we take $3c \log n$ independent copies of the circuit, each using the same inputs $x_1, \ldots, x_n$ but each with its own set of probabilistic bits. We combine the outputs of these circuits by one AND gate and call the resulting circuit C , and let $c$ be a constant. Then $L$ is accepted by a uniform family of probabilistic circuits of depth two with error less than for inputs of length $n$. Assume without loss of generality that the output gate of C (n) is an AND gate (the proof is entirely symmetric when it is an OR gate, and it is trivial when it is a MODp gate). Thus C (n) is the AND of at most $2^{d \cdot \log^r n}$ circuits of depth $k-1$ for some constant $d$. By the inductive hypothesis, each of these circuits may be replaced by a probabilistic circuit of size $2^{O(\log^{4(k-1)r} n)}$, having error probability at most $\frac{1}{2^{3cd \cdot \log^r n}}$, and using $O(\log^{3r} n)$ probabilistic bits. If we use one sequence of $O(\log^{3r} n)$ probabilistic bits and use this sequence as the probabilistic input to each of these subcircuits, the resulting circuit has error probability at most $\frac{2^{d \cdot \log^r n}}{2^{3cd \cdot \log^r n}} \le \frac{1}{2^{2c \cdot \log^r n}}$. Also, by Lemma 13, the top level AND in this circuit (with fan-in $2^{d \cdot \log^r n}$) can be computed by a formula of the form

with error probability at most $\frac{1}{2^{2c \cdot \log^r n}}$, using at most $O(\log^{3r} n)$ probabilistic bits. Putting these two parts together results in a circuit with error probability at most $\frac{1}{2^{c \cdot \log^r n}}$, of the form
To complete the proof, one can show via an easy induction on $k$ that the circuits produced by this construction are uniform.
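The five-level probabilistic circuit for OR described earlier in this section (the gates $D_{a,k}$, $E_k$ and $F$) can be simulated directly for small $n$. The sketch below is a hedged illustration, not the uniform construction itself: the function names are hypothetical, the index $a$ and each vector $w_k$ are encoded as $m$-bit integers, and the value $a \cdot w_k$ is computed arithmetically rather than by the level 1 and level 2 subcircuits.

```python
from fractions import Fraction
from itertools import product

def dot_mod2(a, w):
    """Inner product modulo 2 of two bit vectors encoded as ints."""
    return bin(a & w).count("1") % 2

def least_m(n):
    """Least integer strictly greater than log2(n)."""
    m = 0
    while 2 ** m <= n:
        m += 1
    return m

def nor_circuit(x, ws, p=2):
    """Output of gate F: 1 means the circuit claims OR(x) = 0.

    x  : tuple of input bits, x[a-1] playing the role of x_a
    ws : probabilistic vectors w_1..w_m, each an m-bit int
    """
    n, m = len(x), len(ws)
    E = []
    for k in range(m + 1):
        # Number of gates D_{a,k} with value 1: x_a = 1 and a.w_i = 0 for all i <= k.
        count = sum(
            1
            for a in range(1, n + 1)
            if x[a - 1] == 1 and all(dot_mod2(a, ws[i]) == 0 for i in range(k))
        )
        # E_k is a MODp gate fed by the D_{a,k} and p-1 constant ones.
        E.append(int((count + p - 1) % p != 0))
    return int(all(E))  # F is the AND of E_0..E_m

def prob_output_one(x, p=2):
    """Exact probability that F outputs 1, over all choices of w_1..w_m."""
    m = least_m(len(x))
    hits = sum(nor_circuit(x, ws, p) for ws in product(range(2 ** m), repeat=m))
    return Fraction(hits, 2 ** (m * m))
```

On the all-zero input every $E_k$ sees only the $p-1$ constants, so $F$ outputs 1 with certainty. On any other input the probability that $F$ outputs 1 is at most $\frac{3}{4}$, which is the one-sided error that the amplification step then drives down.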
Depth Reduction for Uniform Deterministic Circuits
In this section we want to use the results of Section 5 to obtain depth reductions for deterministic circuits. Thus we first have to investigate how probabilistic circuits can be made deterministic using a constant number of levels of AND and OR gates. Our lemma accomplishing this is a circuit-based interpretation of the inclusion $\mathrm{BPP} \subseteq \Sigma_2^p$ (Sipser (1983), Lautemann (1983)), and our proof proceeds along exactly those same lines.
Lemma 15 Let $\{C_n\}$ be a uniform family of probabilistic circuits accepting a language $L$, of size $s(n)$, depth $d(n)$, using $m(n)$ probabilistic bits, and having error probability less than , the inputs to the OR gates are AND gates of fan-in $m(n)$, and the inputs to these AND gates come from $2^{m(n)}$ copies of the original circuit $C_n$, each with a different probabilistic sequence hardwired in.
Proof: Let $x$ be any string of length $n$, let $y$ be a sequence of $m = m(n)$ bits, and let $c_y$ denote the output (zero or one) produced by $C_n$ on input $x$ with probabilistic sequence $y$. Thus . Once this claim is proved, it is easy to see that the formula on the right hand side of (1) can be realized by a circuit with the desired requirements, and that the circuits can be constructed uniformly.
To prove equivalence (1)
)DEPTH(4) GATES($\wedge$,$\vee$,MODp)
Proof: Applying Lemma 15 to the depth two probabilistic simulation of Lemma 14, one obtains a depth 5 circuit, whose lower three levels are AND, MODp and again AND levels. As the third level AND gates have fan-in equal to the number of probabilistic bits in the probabilistic simulation, one can apply Lemma 3 to compress these three levels to two, achieving the desired size bound.
Just as Theorem 16 is a uniform analogue of Theorem 7, the following theorem provides a result similar to Theorem 9 in the uniform setting.
with $O(\log^{3r} n)$ probabilistic bits. This can be done by taking the majority over all sequences of random bits (i.e., a MAJORITY gate with inputs from $2^{O(\log^{3r} n)}$ copies of the depth two circuit, one copy for each probabilistic sequence). The resulting circuit accepts the correct language, and can be converted to a threshold circuit using Proposition 8. (It is easily observed that the conversion presented in the proof of Proposition 8 can be done uniformly.)
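Taking the majority over all sequences of random bits is easy to express in code when the number of probabilistic bits is small. The sketch below is illustrative only: `derandomize_by_majority` and `toy_or` are hypothetical names, and `toy_or` is a made-up probabilistic circuit that errs on exactly one of its random sequences, not a circuit from this paper.

```python
from itertools import product

def derandomize_by_majority(prob_circuit, m):
    """Return a deterministic function computing the MAJORITY of
    prob_circuit(x, y) over all 2^m probabilistic sequences y."""
    sequences = list(product((0, 1), repeat=m))

    def deterministic(x):
        ones = sum(prob_circuit(x, y) for y in sequences)
        # Strict majority: more than half of the copies output 1.
        return int(2 * ones > len(sequences))

    return deterministic

def toy_or(x, y):
    """Toy probabilistic circuit for OR: wrong only when y is all zeros."""
    correct = int(any(x))
    return correct if any(y) else 1 - correct
```

The deterministic circuit is correct whenever the probabilistic circuit errs on fewer than half of its sequences, and its size grows by a factor of $2^m$, which stays within $2^{\log^{O(1)} n}$ when $m$ is polylogarithmic.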
Our main depth reduction results for uniform deterministic circuits are summarized in the next corollary, which follows immediately from Theorems 16 and 17:
, use the technique of Ajtai and Ben-Or (1984) (which we also used in Theorem 7) to reduce the error probability to $\frac{1}{n}$, increasing the size by a polynomial factor, and adding only a constant number of levels to the depth. (Note that this transformation can be carried out uniformly.) Now we simulate that circuit (viewed as a deterministic circuit on the original inputs plus the random bits) via Lemma 14 by a depth two circuit introducing negligible additional error probability. Thus the error probability of the new circuit (now again viewed as a probabilistic circuit on the original inputs) is less than
As we saw in the proof of Theorem 16, it is very easy to change probabilistic circuits into deterministic circuits in a uniform way if only $\log^{O(1)} n$ probabilistic bits are used. Fortunately, in many cases, a probabilistic circuit can be converted into an equivalent one using only a small number of probabilistic bits. The technique for doing this was developed by Nisan and Wigderson. The following paragraphs outline some of the results of Nisan and Wigderson (1988).
Nisan and Wigderson showed how to construct a certain type of pseudorandom generator. They have a very general construction that takes as its starting point a "hard" function. For example, it easily follows from Chapter 8 of Håstad (1987) that:
where all strings of length $\log^k n$ are equally probable.
Using this "hardness" result for PARITY, Nisan and Wigderson then construct a pseudorandom generator that takes input of length $\log^k n$ and produces output that "looks random" to any circuit family in (nonuniform) AC
By Theorem 21, we can assume that $L$ is accepted by a uniform circuit family $\{C_n\}$ using only $\log^{O(1)} n$ probabilistic bits, and with error at most $\frac{1}{8}$. As in the proof of Corollary 19, note that the circuits themselves are deterministic, if we view the probabilistic bits as being part of the input; to be precise let this family be $\{D_m\}$, where for some constant $c$, $C_n = D_{n+\log^c n}$. By Lemma 14, the circuit family $\{D_m\}$ is simulated by a uniform family of probabilistic depth two circuits $\{E_m\}$ using log . That is, $L$ itself is accepted by a uniform family of depth two circuits using only $\log^c n + \log^d n$ probabilistic bits and gates from $\{\wedge, \vee, \mathrm{MOD}p\}$, with error at most $\frac{1}{4}$. By Lemma 15, $L$ is accepted by a uniform family of depth five circuits, and these can be converted into depth four circuits as in the proof of Theorem 16.
Unfortunately, the pseudorandom generators constructed in Nisan and Wigderson (1988) do not allow us to prove anything about circuits that have PARITY gates. On the other hand, the technique of Nisan and Wigderson (1988) is quite general, and if we had an example of a suitable "hard" function, the proof strategy of Theorem 22 would carry over to the setting of circuits with PARITY gates. The following paragraphs make this precise.
where all strings of length $\log^k n$ are equally probable.
It is reasonable to conjecture that suitably hard functions exist. In fact, the results of Razborov (1987) and Smolensky (1987) make it plausible that MAJORITY and MOD3 are suitably hard. Unfortunately, we do not see how to adapt the proof techniques of Razborov (1987) and Smolensky (1987) to show that there are any suitably hard functions. We conjecture that for prime $p$, uniform probabilistic $\{\wedge,\vee,\mathrm{MOD}p\}$-circuits of constant depth can be simulated by uniform deterministic circuits of depth four using the same type of gates, but we were able to prove this only under unproven assumptions (Theorem 23).
In addition, we noted that these results are essentially optimal, in the sense that such a simulation in any constant depth cannot be achieved in size less than $2^{\log^{O(1)} n}$, even if we try only to simulate AC$^0$-circuits (Proposition 11). It remains open whether or not AC$^0$-circuits can be simulated by threshold circuits of fixed depth and polynomial size. A possible first step toward settling this question appears in Bruck and Smolensky (1990).
Recently, a number of other results have been proved concerning small depth threshold circuits of size $2^{\log^{O(1)} n}$. (Barrington has suggested calling this complexity class "Quasi-TC$^0$.") Yao (1990) improved our Theorems 9 and 17, removing the restriction that the modulus $p$ be prime. We note, however, that it is still not known if depth reduction theorems such as Corollaries 10, 18, and 19 can be generalized to composite moduli. Tarui (1991) has shown that AC$^0$ can be simulated by probabilistic depth two threshold circuits of size $2^{\log^{O(1)} n}$ with one-sided error. Related results may also be found in Beigel et al. (1990).
These developments in circuit complexity go hand-in-hand with significant progress being made in understanding the relationships that exist among various subclasses of PSPACE, such as the polynomial hierarchy, PP, and the counting hierarchy, starting with the seminal result of Toda (1991). These connections are surveyed in Allender and Wagner (1990), and similar connections are explored in Kannan et al. (1991).
Amid all of this recent work showing the surprising power of Quasi-TC$^0$ circuits, it has been suggested that even apparently larger complexity classes such as NC
on Complexity Theory, where some of these discussions took place. We thank Ileana Streinu and Walter Hohberg for finding some errors in an earlier version, and we thank one of the anonymous referees for giving the paper a very thorough and careful reading. Finally, we thank Klaus Wagner for making our collaboration possible.
