Abstract. A Threshold Circuit consists of an acyclic digraph of unbounded fanin, where each node computes a threshold function or its negation. This paper investigates the computational power of Threshold Circuits. A surprising relationship is uncovered between Threshold Circuits and another class of unbounded fanin circuits which are denoted Finite Field $Z_{P(n)}$ Circuits, where each node computes either multiple sums or products of integers modulo a prime $P(n)$. In particular, it is proved that all functions computed by Threshold Circuits of size $S(n) \ge n$ and depth $D(n)$ can also be computed by $Z_{P(n)}$ Circuits of size $O(S(n) \log S(n) + n P(n) \log P(n))$ and depth $O(D(n))$. Furthermore, it is shown that all functions computed by $Z_{P(n)}$ Circuits of size $S(n)$ and depth $D(n)$ can be computed by Threshold Circuits of size $O(\frac{1}{\epsilon^2} (S(n) \log P(n))^{1+\epsilon})$ and depth $O(\frac{1}{\epsilon^5} D(n))$.
1. Introduction. A threshold $k$ function is a boolean function whose output is 1 if and only if at least $k$ of its inputs have value 1. For example, a threshold 5 function is defined to be 1 if at least 5 inputs are 1, and 0 otherwise. A Threshold Circuit is a boolean circuit in which each node computes a threshold function or its negation, and the nodes have unbounded fanin.
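As a concrete illustration of the definition above, a threshold gate can be sketched in a few lines of Python. The function names `threshold`, `negated_threshold`, and `majority` are illustrative choices and do not appear in the paper.

```python
def threshold(k, inputs):
    """Return 1 if at least k of the 0/1 inputs are 1, else 0."""
    return 1 if sum(inputs) >= k else 0


def negated_threshold(k, inputs):
    """Negation of the threshold k function (also allowed at a node)."""
    return 1 - threshold(k, inputs)


def majority(inputs):
    """Majority is the threshold function with k = floor(n/2) + 1."""
    return threshold(len(inputs) // 2 + 1, inputs)
```

Because the fanin is unbounded, a single gate of this kind may read all n inputs at once; AND (k = n) and OR (k = 1) are special cases.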
Many basic physical devices such as transistors and neurons can be modeled as threshold devices. Since an individual neuron may have very high fanin, a Threshold Circuit is a natural model for a neural net. For reasons described below, we will be particularly concerned with bounded depth Threshold Circuits.
Certainly any massively parallel computing device that uses a large number of relatively slow components must have small computational depth on a given computation if the overall computation is to be fast. For example, the reaction time of the lower brain for many nontrivial behavioral and recognition responses is less than 0.5 seconds, whereas the synapse response time of most neurons of the brain is at least 0.005 seconds; therefore, the depth of these particular computations can be no more than 100. Nevertheless, in this small depth, many nontrivial functions are computed by the brain. Minsky and Papert were among the first investigators to observe the relationship between the lower brain and constant depth Threshold Circuits [15]. In particular, they developed a model for a learning device, known as a Perceptron, which is essentially a threshold circuit with constant depth. There has been a considerable amount of renewed interest in models for the brain and for learning, and many of the recently proposed models are again essentially constant depth Threshold Circuits. Examples of these models include the Connectionist Models [5] and the Boltzmann Machine [1, 10]. Recently, Parberry and Schnitger proved that Boltzmann Machines can be simulated by constant depth Threshold Circuits [16]. This paper is a further theoretical investigation of bounded depth Threshold Circuits. In particular, we consider the following fundamental computational question:
What class of functions can be computed by bounded depth Threshold Circuits?
This paper is organized as follows: In §2, we give definitions of Threshold and Finite Field Circuits. In §3, we give a precise statement of our results. In §4, we give a simulation of Threshold Circuits by Finite Field Circuits. In §5, we give simulations of polytime uniform Finite Field Circuits by polytime uniform Threshold Circuits, thus characterizing the functions computed by Threshold Circuits of depth $D(n)$ as a certain class of multivariate polynomial functions computed by Finite Field Circuits of depth $D(n)$. In §6, we give similar simulation results for logspace constructible circuits. In §7, we prove a Hierarchy Theorem for size bounded Finite Field Circuits with increasing depth. In §8, we conclude the paper with some open problems, conjectures, and some comments on how our theoretical results on Threshold Circuits might be applied to the construction of parallel arithmetic VLSI chips and to biological studies of learning in neuron nets by interpolation.
2. Circuit Definitions.
2.1. Circuits that Compute Boolean Functions. Fix a value domain $\mathcal{D}$. A function basis is a set $F$ of functions from $\mathcal{D}^k$ to $\mathcal{D}$, for each $k \ge 0$. We assume a binary decoding function $\mathrm{decode}_{n,n'} : \{0,1\}^n \to \mathcal{D}^{n'}$ for decoding length $n$ binary strings into $n'$ values in $\mathcal{D}$, and an encoding function $\mathrm{encode}_{m',m} : \mathcal{D}^{m'} \to \{0,1\}^m$ for binary encoding vectors of $m'$ values in $\mathcal{D}$ into binary strings of length $m$. We will define circuits which take $n$ binary values as input, decode these inputs to an $n'$-tuple of values in $\mathcal{D}$, make a computation using the functions in $F$, and then encode the outputs in binary.
A circuit $C_n$ over function basis $F$ is an oriented, acyclic digraph with a list of input nodes $v_1, \ldots, v_{n'}$, a list of output nodes $u_1, \ldots, u_{m'}$, and a $k$-adic function in $F$ labeling each noninput node with fanin $k \ge 0$. Given a binary input string $(x_1, \ldots, x_n) \in \{0,1\}^n$ we decode the input as $\mathrm{decode}_{n,n'}(x_1, \ldots, x_n) = (y_1, \ldots, y_{n'})$ where $(y_1, \ldots, y_{n'}) \in \mathcal{D}^{n'}$, and assign each input node $v_i$ a value $\mathrm{val}(v_i) = y_i \in \mathcal{D}$, for $i = 1, \ldots, n'$. For each other node $w$, with say $k$ predecessors $w_1, \ldots, w_k$, we recursively assign $w$ a value $\mathrm{val}(w) = f(\mathrm{val}(w_1), \ldots, \mathrm{val}(w_k)) \in \mathcal{D}$, where $f \in F$ is the $k$-adic function that labels node $w$. $C_n$ finally outputs the binary string given by $\mathrm{encode}_{m',m}(\mathrm{val}(u_1), \ldots, \mathrm{val}(u_{m'})) \in \{0,1\}^m$, where the output length $m$ is fixed for the circuit $C_n$. Thus $C_n$ computes a boolean function from $\{0,1\}^n$ to $\{0,1\}^m$.
We shall allow the circuits considered in this paper to have arbitrary fanin. The size of circuit $C_n$ is the number of edges of the circuit. The depth of circuit $C_n$ is the length of the longest path from any input node to an output node. A circuit family is an infinite list of circuits $C = (C_1, C_2, \ldots, C_n, \ldots)$ where $C_n$ has $n$ binary inputs. $C$ computes a family of boolean functions $(f_1, f_2, \ldots, f_n, \ldots)$, where $f_n$ is the function of $n$ binary inputs computed by circuit $C_n$.
In the case of Threshold Circuits the value domain is $\mathcal{D} = \{0,1\}$, so the number of input nodes is always the same as the number of boolean inputs, and decode and encode are simply the identity functions (i.e., no decoding of inputs or encoding of outputs is required). We let $Th(S(n), D(n))$ denote the collection of boolean function families computed by polytime uniform Threshold Circuits of size $O(S(n))$ and simultaneous depth $O(D(n))$. In addition, we will use the notation logspace uniform $Th(S(n), D(n))$ to denote the corresponding function families computed by logspace uniform Threshold Circuits. Note that with this notation, the class of all functions computed by Threshold Circuits having polynomial size and constant depth is $Th(n^{O(1)}, 1)$.
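The recursive evaluation rule and the size and depth measures defined in this section can be sketched as follows. This is a minimal sketch under stated assumptions: the dictionary representation of a circuit, the helper names (`threshold`, `negate`, `evaluate`, `size`, `depth`) and the example circuit `XOR2` are all hypothetical and are not taken from the paper; every noninput node is assumed to have fanin at least 1.

```python
from functools import lru_cache


def threshold(k):
    """Gate that outputs 1 iff at least k of its inputs are 1."""
    return lambda values: 1 if sum(values) >= k else 0


def negate(gate):
    """Negation of a 0/1-valued gate."""
    return lambda values: 1 - gate(values)


def evaluate(circuit, assignment, outputs):
    """circuit maps each noninput node to (gate, list of predecessors);
    assignment maps each input node to its value."""
    @lru_cache(maxsize=None)
    def val(w):
        if w in assignment:              # input node
            return assignment[w]
        gate, preds = circuit[w]
        return gate([val(u) for u in preds])
    return [val(u) for u in outputs]


def size(circuit):
    """Size = number of edges of the circuit."""
    return sum(len(preds) for _, preds in circuit.values())


def depth(circuit, outputs):
    """Length of the longest path from an input node to an output node."""
    @lru_cache(maxsize=None)
    def d(w):
        if w not in circuit:             # input node
            return 0
        _, preds = circuit[w]
        return 1 + max(d(u) for u in preds)
    return max(d(u) for u in outputs)


# Hypothetical example: XOR of two bits as a depth-2 Threshold Circuit.
XOR2 = {
    "or":   (threshold(1), ["x1", "x2"]),
    "nand": (negate(threshold(2)), ["x1", "x2"]),
    "out":  (threshold(2), ["or", "nand"]),
}
```

For instance, `evaluate(XOR2, {"x1": 1, "x2": 0}, ["out"])` evaluates the three gates bottom-up; the circuit has 6 edges and depth 2 under the definitions above.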
2.3. Finite Field Circuits. Let $p$ be a prime number. For finite field circuits, the value domain is $Z_p$, the finite field modulo $p$. We will let $F_{Z_p}$ denote the set of functions consisting of $k$-adic addition and multiplication taken modulo $p$ for each $k \ge 1$, as well as a constant function giving value $y$, for each $y \in Z_p$. A Finite Field $Z_p$ Circuit $C_n$ is a circuit over function basis $F_{Z_p}$. Let $b = \lfloor \log p \rfloor$. Given binary inputs $x_1, \ldots, x_n \in \{0,1\}$, we decode these inputs into $n' = \lceil n/b \rceil$ integer values $\mathrm{decode}_{n,n'}(x_1, \ldots, x_n) = (y_1, \ldots, y_{n'})$, where the value $y_i \in Z_{2^b}$ is the number with binary encoding $x_{(i-1)b+1}, x_{(i-1)b+2}, \ldots, x_{\min(n, ib)}$. Note that the decoding of binary inputs yields only numbers in the range $\{0, 1, \ldots, 2^b - 1\} \subseteq Z_p$. The circuit $C_n$ then makes a computation over $F_{Z_p}$ as described in §2.
3. Statement of Results. In the following we let $P(n)$, $S(n)$, and $D(n)$ be any positive functions of $n$ such that $S(n) \ge n$, and $P(n)$ is prime for all $n$.
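The input decoding of §2.3 can be illustrated with a short sketch. The function name `decode` mirrors the paper's notation, but the bit order within a block (most significant bit first) is an assumption, since the text only says that each block is the binary encoding of $y_i$.

```python
def decode(bits, p):
    """Split n input bits into ceil(n/b) blocks of b = floor(log2 p) bits
    and read each block as an integer in {0, ..., 2^b - 1}, a subset of Z_p."""
    b = p.bit_length() - 1               # floor(log2 p); requires p >= 2
    values = []
    for start in range(0, len(bits), b):
        block = bits[start:start + b]    # the last block may be shorter
        y = 0
        for bit in block:                # assumption: most significant bit first
            y = 2 * y + bit
        values.append(y)
    return values
```

With p = 5 we get b = 2, so five input bits are packed into three values, each below 4 and hence a valid element of $Z_5$.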
We will first give a simulation of polytime uniform Threshold Circuits by polytime uniform Finite Field Circuits: ... $c_n (x - x_0)^n$ over an interval $|x - x_0| \le \epsilon$ where $0 < \epsilon < 1$, and the coefficients are rationals $c_n = a_n / b_n$ where $a_n, b_n$ are integers of magnitude $\le 2^{n^{O(1)}}$. Then polytime uniform Threshold Circuits of polynomial size and simultaneous constant depth (i.e., a function in $Th(n^{O(1)}, 1)$) can compute $f(x)$ over this interval within accuracy $2^{-n^c}$ for any constant $c \ge 1$.
Note that Corollary 3.3 follows directly from Theorem 3.2 since a Finite Field $Z_{P(n)}$ Circuit of size $n^{O(1)}$ and depth $O(1)$ with $P(n) = 2^{n^{O(1)}}$ can simulate the rational arithmetic required to approximately evaluate $f(x)$. Corollary 3.3 implies (see [18]) that $Th(n^{O(1)}, 1)$ contains a surprisingly rich class of elementary functions which can be computed within accuracy $2^{-n^c}$, including: integer reciprocal, sine, cosine, exponential, logarithm, and square root, as well as exact computation of the following:
1. integer and polynomial quotient and remainder,
2. interpolation of rational polynomials,
3. banded matrix inverse, and
4. triangular Toeplitz matrix inverse.
These problems can all be efficiently reduced to integer products; also see [3, 4, 12, 18]. Next, we will give a simulation of logspace uniform Finite Field Circuits by logspace uniform Threshold Circuits.
Theorem 3.5. logspace uniform $Z_{P(n)}(S(n), D(n)) \subseteq$ logspace uniform $Th((S(n) \log P(n))^{O(1)}, D(n) \log\log\log P(n))$.
The proof of Theorem 3.5 uses techniques developed by Reif for integer division by uniform boolean circuits of bounded fanin, polynomial size, and $O(\log n \log\log n)$ depth [18]. Theorem 3.5 implies that logspace uniform $Th(n^{O(1)}, \log\log n)$ contains the various elementary functions listed above.
Finally, we derive some lower bound results for Finite Field Circuits using algebraic degree arguments.
4.1. Proof of Theorem 3.1. Let $C_n$ be a polytime uniform Threshold Circuit of $n$ binary inputs $(x_1, \ldots, x_n) \in \{0,1\}^n$, where $C_n$ has size $S(n)$ and depth $D(n)$. For any prime $p = P(n) > S(n)$, we will construct a $Z_p$ Circuit $C'_n$ which will also take $n$ binary inputs $(x_1, \ldots, x_n) \in \{0,1\}^n$. Let $b = \lfloor \log p \rfloor$. By definition (see §2.3), $C'_n$ must have $n' = \lceil n/b \rceil$ input nodes $v_1, \ldots, v_{n'}$ which are assigned integers $\mathrm{val}(v_1) = y_1, \ldots, \mathrm{val}(v_{n'}) = y_{n'}$, where $\mathrm{decode}_{n,n'}(x_1, \ldots, x_n) = (y_1, \ldots, y_{n'})$.
The first difficulty we must overcome is to compute within $C'_n$ the boolean encoding $x_{(i-1)b+1}, x_{(i-1)b+2}, \ldots, x_{\min(n, ib)} \in \{0,1\}$ of each integer $y_i$ (i.e., these boolean values must be computed by $C'_n$ from the $y_i$ values using only addition and multiplication modulo $p$). By Lemma 4.1, there exists a polynomial $f_j(y)$ of degree $p - 1$ which when evaluated in $Z_p$ gives the boolean value of the $j$th bit of $y \in Z_p$, so each $x_{(i-1)b+j} = f_j(y_i)$ can be computed in $C'_n$ using size $O(p \log p)$ and depth $O(1)$. The total size required here is $O(n p \log p)$.
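One standard way to see that such a bit-extracting polynomial exists uses Fermat's little theorem: $(y - a)^{p-1}$ is 0 when $y = a$ and 1 otherwise, so $1 - (y - a)^{p-1}$ is an indicator of $y = a$ of degree $p - 1$. The sketch below sums these indicators over all $a$ whose $j$th bit is 1. It is not necessarily the construction used in Lemma 4.1; the name `bit_polynomial` and the convention that $j = 0$ is the least significant bit are assumptions.

```python
def bit_polynomial(j, p):
    """Return a function computing f_j(y) = sum over a in Z_p with bit j set
    of (1 - (y - a)^(p-1)) mod p, using only + and * modulo the prime p."""
    support = [a for a in range(p) if (a >> j) & 1]

    def f(y):
        total = 0
        for a in support:
            # By Fermat, (y - a)^(p-1) is 0 if y == a (mod p) and 1 otherwise.
            indicator = (1 - pow((y - a) % p, p - 1, p)) % p
            total = (total + indicator) % p
        return total

    return f
```

Evaluating `bit_polynomial(j, p)` at any $y \in Z_p$ yields exactly the $j$th bit of $y$, since at most one indicator in the sum is nonzero.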
Next we must simulate in $C'_n$ a threshold function of $k$ binary inputs, say ... and the boolean function computed by $C'_n$ is exactly the same as the function computed by $C_n$. The constructed $Z_{P(n)}$ Circuit $C'_n$ has $O(S(n) \log S(n) + n P(n) \log P(n))$ size and $O(D(n))$ depth, and $C'_n$ is polynomial time constructible, thus completing the proof of Theorem 3.1.
Note that if $C_n$ is logspace uniform, then $C'_n$ is also logspace uniform.
5. Simulation of Finite Field Circuits by Threshold Circuits.
5.1. Computing Arithmetic Using Polytime Constructible Threshold Circuits. The problem of finding the sum of a set of numbers is called the iterated sum problem. Pippenger has given a constant depth threshold circuit for multiplication, and the method used is the straightforward reduction to iterated sum (i.e., the "grade-school method" of multiplication) [17]. Looking at just the iterated sum circuit, we see that Pippenger's circuit for adding $m$ values, each of $n$ bits, has size $O(n m^2)$ and depth $O(1)$. In the following lemma, we show how to produce a constant depth circuit for iterated sum with smaller size.
Proof. Since $m \le n^{O(1)}$, it is trivial to show that the result of the iterated sum will have less than $cn$ bits for some constant $c$.
To calculate the iterated sum, we build a computation tree with maximum fanout $\lfloor m^{\epsilon} \rfloor$ and $m$ leaves. Placing the $m$ input values at the leaves, computation proceeds toward the root of the tree with each internal node computing the sum of its children. After all computations, the root contains the sum of all $m$ input values. It is easy to see that the desired tree has $O(m^{1-\epsilon})$ internal nodes, and a height of $O(\frac{1}{\epsilon})$. We use Pippenger's circuit at each internal node for a node size of $O(n m^{2\epsilon})$, so the total circuit size is $O(n m^{1+\epsilon})$. Since the depth of each node in the tree is constant, the total depth of the circuit is proportional to the height of the tree, or $O(\frac{1}{\epsilon})$.
Using this result, we can also construct small size circuits for discrete Fourier transform. Let $\mathrm{DFT}_M$ denote the discrete Fourier transform of an $M$-vector.
Proof. The circuits that we construct are actually for multiplying two $N$-bit numbers modulo $2^N + 1$, where $N$ is a power of 2. For exact (non-modular) multiplication of $N'$ bit numbers, we use the same circuit with $N = 2^{\lceil \log N' \rceil + 1}$. It is easy to show that this will produce the exact answer.
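The shape of the iterated sum tree described above (fanout $\lfloor m^{\epsilon} \rfloor$, height about $1/\epsilon$) can be simulated with a short sketch. This only models the tree structure: each internal node here is an ordinary Python `sum`, standing in for Pippenger's constant depth circuit. The function name `tree_iterated_sum`, the minimum fanout of 2, and the small floating point guard are assumptions made for the illustration.

```python
import math


def tree_iterated_sum(values, eps):
    """Sum the values along a tree of fanout floor(m^eps).
    Returns (total, tree height, number of internal nodes)."""
    m = len(values)
    if m == 0:
        return 0, 0, 0
    # Guard against floating point error in m ** eps; force fanout >= 2.
    fanout = max(2, math.floor(m ** eps + 1e-9))
    level, height, internal = list(values), 0, 0
    while len(level) > 1:
        # Each internal node adds up at most `fanout` children.
        level = [sum(level[i:i + fanout]) for i in range(0, len(level), fanout)]
        internal += len(level)
        height += 1
    return level[0], height, internal
```

For m = 256 and eps = 0.5 the fanout is 16, so the tree has height 2 with 16 + 1 internal nodes, in line with the $O(m^{1-\epsilon})$ and $O(1/\epsilon)$ bounds stated above.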
We will denote the two input numbers by $a$ and $b$, and their product by $c$. Since $N$ is a power of 2, let $N = 2^n$, where $n$ is an integer. Letting $m = 2^{\lfloor \epsilon n \rfloor}$, we can write any $N$-bit number $a$ as an $m$-vector of blocks of $s = N/m$ bits, $a = (a_0, a_1, \ldots, a_{m-1})$; $a_0$ is the block of least significant bits. We can view this vector as a vector of polynomial coefficients, and define the polynomial $A(x) = \sum_{i=0}^{m-1} a_i x^i$. Note that $A(2^s) = a$. Defining a polynomial for $b$ in a similar way, the product polynomial $C(x) = A(x) B(x)$ will be such that $C(2^s) = c$. We use discrete Fourier transforms for the polynomial multiplication, and since the product polynomial will have degree $2m - 2$, we must calculate the transform of $2m$-vectors. (We could actually use wrapped convolutions on $m$-vectors, but nothing is gained over our asymptotic bounds.) Looking at the straightforward method of polynomial multiplication, it is easy to bound $\max_{0 \le i < 2m} \{c_i\} \le m 2^{2s} < m (2^{2s} + 1)$.
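The block decomposition and the identities $A(2^s) = a$ and $C(2^s) = c$ can be checked numerically. In this sketch the polynomial product is computed by the straightforward quadratic convolution rather than by discrete Fourier transforms, purely to keep the example short; the function names (`to_blocks`, `poly_mul`, `evaluate_at_power_of_two`, `block_multiply`) are illustrative and not from the paper.

```python
def to_blocks(a, m, s):
    """Write a as m blocks of s bits; block 0 holds the least significant bits."""
    mask = (1 << s) - 1
    return [(a >> (i * s)) & mask for i in range(m)]


def poly_mul(A, B):
    """Coefficients of A(x) * B(x) by straightforward convolution
    (a stand-in for the DFT-based product used in the proof)."""
    C = [0] * (len(A) + len(B) - 1)
    for i, ai in enumerate(A):
        for j, bj in enumerate(B):
            C[i + j] += ai * bj
    return C


def evaluate_at_power_of_two(C, s):
    """Evaluate C(x) at x = 2^s using shifts."""
    return sum(c << (i * s) for i, c in enumerate(C))


def block_multiply(a, b, N, m):
    """Multiply two N-bit numbers via their m-block polynomials.
    Returns the product and the coefficient vector of C(x)."""
    s = N // m
    C = poly_mul(to_blocks(a, m, s), to_blocks(b, m, s))
    return evaluate_at_power_of_two(C, s), C
```

Each coefficient $c_i$ is a sum of at most $m$ products of $s$-bit numbers, which is where the bound $c_i \le m 2^{2s}$ comes from.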
Since $m$ and $2^{2s} + 1$ must be relatively prime, we can calculate the coefficients of $C(x)$ modulo both $m$ and $2^{2s} + 1$, and combine these results for the final answer modulo $m (2^{2s} + 1)$. This ring includes as a subset the range of all possible results, so the result of these modular calculations is also the exact (non-modular) answer. The calculations modulo $m$ can be done using Lemma 5.1 and "grade-school multiplication", with a total size of $O(N^{1+\epsilon})$ as long as $\epsilon \le 0.6$. We will now concentrate on the cost of the calculations modulo $2^{2s} + 1$. We will again use a divide and conquer tree with the root labeled as level 0. The fanout of the tree is $2m$, and it should be obvious that on level $i$ we are computing 4 ...
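The step of combining a coefficient's residues modulo $m$ and modulo $2^{2s} + 1$ is an instance of two-modulus Chinese remaindering. A minimal sketch follows; the function name `crt_pair` is an assumption, and the modular inverse via `pow(M, -1, m)` requires Python 3.8 or later.

```python
def crt_pair(r_m, r_M, m, M):
    """Recover the unique c in [0, m*M) with c = r_m (mod m) and
    c = r_M (mod M), assuming gcd(m, M) = 1."""
    inv = pow(M, -1, m)              # inverse of M modulo m (Python 3.8+)
    t = ((r_m - r_M) * inv) % m      # choose t so that r_M + M*t = r_m (mod m)
    return r_M + M * t
```

Since $m$ is a power of 2 and $2^{2s} + 1$ is odd, the inverse always exists, and because every coefficient is below $m(2^{2s} + 1)$ the recovered value is the exact coefficient.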
Proof. The method of Chinese Remaindering is taken straight from [8], using the multiplication circuit of Lemma 5.3. The proof of the size and depth of the circuit is also analogous to that found in [8], and is not included in this paper.
The last basic problem we will look at is that of iterated product over a finite field. We will perform the iterated product in a tree similar to the tree used for iterated sum. The tree will have fanout $m^{\epsilon}$, and will perform an iterated product of $m^{\epsilon}$ values in $Z_p$ at each node. The iterated product at each node is computed by performing a Chinese Remainder step, followed by calculating the iterated product over each of the smaller fields using discrete logs, iterated sum, and powering, and finally a Chinese Remaindering step to recover the full result. This produces the exact iterated product, and by multiplying by an $m^{\epsilon} \log p$ bit approximation to $1/p$, we can find the residue modulo $p$.
To ensure there is no loss of information, we must be sure that $\prod p_i$ is greater than the maximum possible result. Specifically, we must ensure that
By basic number theoretic results, we can achieve this with $s \le p_s = O(m^{\epsilon} \log p)$.
Obviously, $s$ is bounded in terms of $\log p$ as Lemma 5.4 requires, so the condition of Lemma 5.4 is satisfied, and we may construct the required Chinese Remaindering circuit with size $O(\frac{1}{\epsilon^2} (m^{2\epsilon} \log p)^{1+\epsilon})$ and depth $O(\frac{1}{\epsilon^4})$.
After performing the initial Chinese Remaindering step, we must perform an iterated product over each of the $p_i$. Since for every prime $p_i$ the multiplicative group of $Z_{p_i}$ is cyclic, there is a (not necessarily unique) generator, call it $g_i$, that generates the entire group. Let $f_i(x) = g_i^x$; due to the fact that $g_i$ is a generator, this function is one-to-one and onto over the nonzero elements of $Z_{p_i}$. We make tables for $f_i(x)$ and $f_i^{-1}(x)$, each of size $O(p_i \log p_i)$. Within a particular field, there must be tables for all $m^{\epsilon}$ input values, so the total size taken up by tables for $p_i$ is $O(m^{\epsilon} p_i \log p_i)$. The iterated product is calculated by taking the discrete logarithm of all input values ($f_i^{-1}(x)$, above), performing the iterated sum of these values modulo $p_i - 1$, then raising the generator to the resulting power in $Z_{p_i}$ (this is just $f_i(x)$, above). This is a fairly common method of performing iterated product (see, for example, [2]).
The only part we haven't examined here is the iterated sum. By Lemma 5.1, we can calculate the exact iterated sum of $m^{\epsilon}$ numbers, each of $\log p_i$ bits, in size $O(m^{\epsilon + \epsilon^2} \log p_i)$ and depth $O(\frac{1}{\epsilon})$. With an $m^{\epsilon} \log p_i$ bit approximation to $1/(p_i - 1)$, we can reduce this exact result to the result modulo $p_i - 1$ with a single multiplication. ... depth. Since this must be done for all $s$ prime fields, the total size complexity of iterated product of $m^{\epsilon}$ numbers is $s$ times the above value, plus the cost of Chinese Remaindering. Using the upper bounds for $s$ and $p_i$, the total size is $O(\frac{1}{\epsilon} m^{5\epsilon} (\log p)^{1+2\epsilon})$, and the total depth is $O(\frac{1}{\epsilon^4})$. With an $m^{\epsilon} \log p$ bit approximation to $1/p$, we can reduce this result (the exact iterated product) modulo $p$. The complexity of this multiplication is negligible compared to the rest of the circuit.
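The discrete logarithm method for iterated product over a single small prime field can be sketched directly: look up the logarithm of each input, add the logarithms modulo $p_i - 1$, and look up the resulting power of the generator. The brute-force generator search, the function names, and the explicit handling of zero inputs (which have no discrete logarithm) are assumptions added for this illustration; the table sizes correspond to the $O(p_i \log p_i)$ tables mentioned above.

```python
def find_generator(p):
    """Brute-force search for a generator of the multiplicative group mod prime p."""
    for g in range(1, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    raise ValueError("p must be prime")


def build_tables(p):
    """Tables for f(x) = g^x and its inverse (the discrete logarithm)."""
    g = find_generator(p)
    exp_table = [pow(g, e, p) for e in range(p - 1)]
    log_table = {v: e for e, v in enumerate(exp_table)}
    return exp_table, log_table


def iterated_product(values, p):
    """Product of the values modulo the prime p via discrete logs."""
    exp_table, log_table = build_tables(p)
    if any(v % p == 0 for v in values):
        return 0                     # zero has no discrete log; handled separately
    total = sum(log_table[v % p] for v in values) % (p - 1)
    return exp_table[total]
```

The point of the reduction is that the only nontrivial arithmetic left is an iterated sum, which is exactly what Lemma 5.1 provides in small depth.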
All the above results are for one node of the tree. Summing over all nodes and rewriting in terms of a rescaled $\epsilon$, the total size is $O(\frac{1}{\epsilon^2} m^{1+5\epsilon} (\log p)^{1+2\epsilon}) = O(\frac{1}{\epsilon^2} (m \log p)^{1+\epsilon})$, and the depth is $O(\frac{1}{\epsilon^5})$.
Note: All Threshold Circuit families considered in this section can easily be seen to be constructible in polynomial time.
5.2. Proof of Theorem 3.2. Now we are ready to prove Theorem 3.2. Consider any polytime uniform $Z_{P(n)}$ Circuit $C_n$ with size $S(n)$ and depth $D(n)$. We wish to simulate $C_n$ by a Threshold Circuit $\hat{C}_n$. We will precompute an $S(n) \log P(n)$ bit approximation of the reciprocal of $P(n)$ so that a residue computation modulo $P(n)$ is reduced to a multiplication; a node of fanin $k$ can then be done by just $O(k)$ additions and multiplications on $O(\log P(n))$ bit binary numbers, followed by a residue computation using an $O(k \log P(n))$ bit approximation to $\frac{1}{P(n)}$; therefore, each iterated sum or iterated product required at a node of $C_n$ can be done by applying Lemmas 5.1 and 5.5 using only size $O(\frac{1}{\epsilon^2} (k \log P(n))^{1+\epsilon})$ and depth $O(\frac{1}{\epsilon^5})$. The total size of the Threshold Circuit $\hat{C}_n$ is $O(\frac{1}{\epsilon^2} (S(n) \log P(n))^{1+\epsilon})$, and the depth is $O(\frac{1}{\epsilon^5} D(n))$; furthermore, the circuit family $\hat{C}$ is constructible in polynomial time, completing the proof of Theorem 3.2.
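The idea of computing residues from a precomputed approximation to the reciprocal can be sketched with integer arithmetic. This is a Barrett-style sketch under stated assumptions rather than the paper's exact circuit: the reciprocal is stored as $r = \lfloor 2^t / P \rfloor$ for a precision parameter $t$ (a hypothetical name), inputs are assumed to satisfy $0 \le x < 2^t$, and a single conditional subtraction corrects the possible off-by-one in the estimated quotient.

```python
def make_reducer(P, t):
    """Return a function computing x mod P for 0 <= x < 2^t using a
    precomputed t-bit approximation r = floor(2^t / P) of 1/P."""
    r = (1 << t) // P

    def reduce(x):
        assert 0 <= x < (1 << t)
        q = (x * r) >> t             # estimated quotient: floor(x/P) or one less
        rem = x - q * P              # hence 0 <= rem < 2P
        if rem >= P:                 # at most one correction is needed
            rem -= P
        return rem

    return reduce
```

The division is thus replaced by one multiplication by the stored reciprocal, one multiplication by $P$, and a subtraction, all of which are covered by the arithmetic circuits of §5.1.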
6. Log Space Uniform Threshold Circuit Simulation of Arithmetic and Finite Field Circuits. Let $a_1, \ldots, a_m \in Z_{2^n}$. Let $D(m, n)$ be the depth required to compute $\prod_{i=1}^{m} a_i \bmod (2^n + 1)$ using a logspace uniform Threshold Circuit of size $(mn)^{O(1)}$.
Lemma 6.1. $D(m, n) \le D(m, O((mn)^{1/2})) + O(1)$.
Proof. We use a reduction of Reif from iterated product to discrete Fourier transform [18]. Assume without loss of generality that $n$ is a power of 2, and let $\bar{n} = O((mn)^{1/2})$ also be a power of 2. Given $a_1, \ldots, a_m \in Z_{2^n}$, we let $a_{i,j}$ for $i = 1, \ldots, m$ and $j = 0, \ldots, \bar{n} - 1$ be integers in $Z_{2^{n/\bar{n}}}$ such that $a_i = \sum_{j=0}^{\bar{n}-1} a_{i,j} 2^{j n / \bar{n}}$.
To calculate an iterated product, we rst compute in the vector g i;0 ; ; g i;n,1 = DFTa i;0 ; 2a i;1 ; ; 2 n , 1 a i;n,1 T for i = 1 ; ; m . By Lemma 5.2, we can easily compute these DFTs in polynomial size and constant depth. by rst computing the dm=n e iterated products of n integers and repeating this d log m log n e times, getting Dm; n d log m log n eDn ; n + O 1. Applying Lemma 6.1 and this recurrence a constant n umber of times, we get Dn ; n D n ; n 1 = 2 + O 1 Dn =2 ; n 1 = 2 + O 1: Finally, applying the above recurrence log logn times, we get Dn ; n O loglog n. Hence Dm; n = O logm logn Ologlog n; which is the bound claimed in the lemma.
6.1. Proof of Theorem 3.5. Note that Lemma 6.2 implies that iterated product of $n^{O(1)}$ integers with $n$ bits each is in logspace uniform $Th(n^{O(1)}, \log\log n)$. Since computing the $n$ bit approximation of the reciprocal of an $n$ bit number reduces to simply computing the iterated sum of $n$ iterated products of size $n$, we can also compute residues modulo a number with $n$ bits in logspace uniform $Th(n^{O(1)}, \log\log n)$. Theorem 3.5 immediately follows, since we must compute residues, iterated sums and iterated products of $n = O(\log P)$ bit numbers.
We make the induction hypothesis that the lemma holds for all polynomials with $k - 1$ variables. Since $f(y_1, \ldots, y_k)$ is nonzero, $\exists (u_1, \ldots, u_k) \in Z_p^k$ such that $f(u_1, \ldots, u_k) \ne 0$. Hence $f(u_1, y_2, \ldots, y_k) = f'(y_2, \ldots, y_k)$ is not a zero polynomial, and by the induction hypothesis $\exists (a_2, \ldots, a_k) \in A^{k-1}$ such that $f'(a_2, \ldots, a_k) = f(u_1, a_2, \ldots, a_k) \ne 0$. Let $g(y_1) = f(y_1, a_2, \ldots, a_k)$. $g(y_1)$ is clearly a nonzero polynomial, so by the basis step there is an $a_1 \in A$ such that $g(a_1) \ne 0$, and we have constructed $(a_1, a_2, \ldots, a_k) \in A^k$ such that $f(a_1, \ldots, a_k) \ne 0$.
Note: A similar lemma for polynomial identity testing in infinite fields was proved by Ibarra and Moran [11].
7.1. Proof of Theorem 3.6. Fix any positive integer functions $S(n)$ and $D(n)$, where $D(n) = O(S(n)^{c_0})$ for some constant $c_0 < 1$, and $S(n) \ge n$. Now consider a sequence of primes $\{P(1), P(2), \ldots, P(n), \ldots\}$ where $6 (S(n)/D(n))^{D(n)} \le P(n) \le 2^n$.
We will construct a family of $Z_{P(n)}$ circuits $C = (C_1, C_2, \ldots, C_n, \ldots)$ of size $S(n)$ and depth $D(n)$. In particular, we let $v_1, \ldots, v_{n'}$ be the input nodes of $C_n$, where $n' = \lceil n/b \rceil$ and $b = \lfloor \log P(n) \rfloor \le n$. We also let $w_0 = v_1$ denote the first input node. Each level $L = 1, \ldots, D(n)$ of $C_n$ consists of a single "product" node $w_L$ with $\lfloor S(n)/D(n) \rfloor$ edges entering $w_L$ from node $w_{L-1}$, so that $\mathrm{val}(w_L)$ is the $\lfloor S(n)/D(n) \rfloor$ power of $\mathrm{val}(w_{L-1})$; $w_{D(n)}$ is the unique output node of $C_n$. Let $y_1 = \mathrm{val}(v_1), \ldots, y_{n'} = \mathrm{val}(v_{n'})$ be the input values, and let $\tilde{y} = (y_1, \ldots, y_{n'})$. We have constructed $C_n$ of size $S(n)$ and depth $D(n)$ so that its output is the $d_n = \lfloor S(n)/D(n) \rfloor^{D(n)}$ degree polynomial $f_n(\tilde{y}) = \mathrm{val}(w_{D(n)})$. The degree of $h_n(\tilde{y})$ is easily seen to be $3 d_n$, and it is also obvious that $h_n(\tilde{y})$ is not identically zero. Let $A = Z_{2^b}$, and since we know that $\mathrm{degree}(h_n(\tilde{y})) = 3 d_n \le 3 (S(n)/D(n))^{D(n)} \le P(n)/2 < |A|$,
we can use Lemma 7.1 to see that $h_n(\tilde{y}) \ne 0$ for at least one $n'$-tuple $(a_1, a_2, \ldots, a_{n'}) \in (Z_{2^b})^{n'}$. Theorem 3.6 follows immediately.
... show that Threshold Circuits of polynomial size and constant depth can compute high accuracy approximations to a large class of multivariate rational polynomials, and furthermore can interpolate rational polynomials with a constant number of variables. Learning by algebraic interpolation appears to be appropriate in certain constrained cases such as low level vision [7], and would likely be much more efficient than previously proposed methods for learning such as found in [1] and [9], which are essentially brute force. Nevertheless, even making the apparently reasonable assumption that certain portions of the lower brain act essentially as Threshold Circuits of constant depth does not necessarily imply that the lower brain is wired so as to compute approximations or interpolations of multivariate polynomials. However, our theoretical results do provide strong evidence of the feasibility of neuron nets which evaluate and interpolate such polynomial functions. A neural biologist might, for example, make experimental tests to verify this by using a computer to monitor input-output response functions of neuron nets. Specifically, the lower brain very rapidly provides feedback control for certain muscles; this control appears to be smooth and nonlinear. Such easily observable responses would appear to be ideal to monitor and to interpolate. By using known randomized multivariate polynomial identity tests, such as those of Ibarra and Moran [11], one can with very high likelihood verify that the input-output response of a neuron net is a specific interpolated multivariate polynomial.
DETERMINANT is not contained in $Th(n^c, 1)$ for any constant $c$.
