Neural circuits for pattern recognition with small total wire length  by Legenstein, Robert A. & Maass, Wolfgang
Theoretical Computer Science 287 (2002) 239–249
Neural circuits for pattern recognition with small
total wire length
Robert A. Legenstein∗, Wolfgang Maass
Institute for Theoretical Computer Science, Technische Universität Graz, Graz, Austria
Abstract
One of the most basic pattern recognition problems is whether a certain local feature occurs
in some linear array to the left of some other local feature. We construct in this article circuits
that solve this problem with an asymptotically optimal number of threshold gates. Furthermore it
is shown that much fewer threshold gates are needed if one employs in addition a small number
of winner-take-all gates. In either case the circuits that are constructed have linear or almost
linear total wire length, and are therefore not unrealistic from the point of view of physical
implementations. © 2002 Elsevier Science B.V. All rights reserved.
Keywords: Neural networks; Total wire length; Circuit complexity; Winner take all; Pattern recognition
1. Introduction
Biological neural circuits can solve a number of complex pattern recognition tasks
very fast, in 100–150 ms, see [8]. Since the computational units of neural circuits are
relatively slow compared with a transistor, this observation gives rise to some optimism
regarding the possibility of building artificial circuits, for example analog VLSI chips,
that solve complex real-world pattern recognition tasks in real time. Classical circuit
complexity theory is of little help in the search for such super-efficient circuit designs.
Apparently there are two reasons for this. The complexity of circuits is usually analyzed
in terms of their number of gates, and much of the existing work focuses on the
derivation of polynomial upper bounds for the number of gates. But most circuits
that appear to be feasible from this point of view cannot be practically implemented,
Research for this article was partially supported by the Fonds zur Förderung der wissenschaftlichen
Forschung (FWF), Austria, project P12153, and the NeuroCOLT project of the EC.
∗ Corresponding author.
E-mail addresses: legi@igi.tu-graz.ac.at (R.A. Legenstein), maass@igi.tu-graz.ac.at (W. Maass).
0304-3975/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0304 -3975(02)00097 -X
especially if the number n of input variables is very large (as, for example, in vision
tasks, where often $n \approx 10^6$). Furthermore, even those circuit designs for which one has been
able to derive linear or almost linear upper bounds for the number of gates can usually
not be implemented in VLSI, because the required number of wires (= edges) or the
required length of wires grows too fast with the number n of input variables. Therefore,
we focus in this article directly on the total wire length (the definition is given below)
as the most salient complexity measure: it is usually the most restricted, and hence arguably
the most relevant, complexity measure for the practical implementation of an abstract circuit
design.
Another obstacle for the application of classical circuit complexity theory to the
design of efficient circuits for pattern recognition arises from the fact that most complexity
studies focus on arithmetic and graph-theoretic problems, rather than on those
computational tasks that typically arise in the context of pattern recognition. Both in
common machine vision approaches and in biological neural circuits for vision, the
raw pixel image is first preprocessed by an array of local feature detectors (e.g. for the
detection of edge segments, line segments, Gabor filters). Hence pattern recognition
problems in vision typically require finding particular spatial arrangements of those
local features that are reported by local feature detectors. The local feature detectors
are typically arranged in a 1- or 2-dimensional array that reflects the geometrical
relationship between their receptive fields in the sensory space. In order to initiate
a computational complexity analysis of algorithmic problems of this type, we investigate
in this article arguably the simplest problem of this type. We assume that there
are two types of local feature detectors with binary output that are linearly arranged at
n positions: detectors $a_0, \ldots, a_{n-1}$ for feature a and detectors $b_0, \ldots, b_{n-1}$ for feature b.
The pattern recognition task is to decide whether feature a is reported at a location i
to the left of some location j where feature b is reported. In other words, we analyze
the circuit complexity of the Boolean function $P^n_{LR}$ from $\{0,1\}^{2n}$ into $\{0,1\}$ with
\[
P^n_{LR}(a_0, \ldots, a_{n-1}, b_0, \ldots, b_{n-1}) =
\begin{cases}
1 & \text{if } \exists i, j \; (i < j \text{ and } a_i = b_j = 1), \\
0 & \text{else.}
\end{cases}
\]
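As a point of reference for the constructions that follow, the definition above can be evaluated directly. The sketch below is only illustrative: the function names `p_lr`, `p_lr_fast` and `agree` are assumptions introduced here, not notation from the article. The second function uses the observation that only the leftmost a and the rightmost b matter.

```python
from itertools import product


def p_lr(a, b):
    """Evaluate the definition directly: some a_i = 1 strictly left of some b_j = 1."""
    n = len(a)
    return int(any(a[i] and b[j] for i in range(n) for j in range(i + 1, n)))


def p_lr_fast(a, b):
    """Same value, computed from the leftmost a and the rightmost b only."""
    if 1 not in a or 1 not in b:
        return 0
    leftmost_a = a.index(1)
    rightmost_b = len(b) - 1 - b[::-1].index(1)
    return int(leftmost_a < rightmost_b)


def agree(n):
    """Check exhaustively that both evaluations coincide on all inputs of length n."""
    inputs = list(product((0, 1), repeat=n))
    return all(p_lr(a, b) == p_lr_fast(a, b) for a in inputs for b in inputs)
```

Note that a and b reported at the same position do not count, since the definition requires i < j.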
We investigate in this article circuits that compute $P_{LR}$ with two types of gates that
are both frequently discussed in models for neural computation: threshold gates and
winner-take-all (WTA) gates. Both of these gates can be implemented very efficiently
in analog VLSI, with an area that grows just linearly with the number k of inputs to
the gate, see [6,2]. A threshold gate computes a Boolean function $T : \{0,1\}^k \to \{0,1\}$
of the form $T(x_1, \ldots, x_k) = 1 \Leftrightarrow \sum_{i=1}^{k} w_i x_i \ge w_0$. A winner-take-all gate with weights
$w_1, \ldots, w_k$ computes a Boolean function $W : \{0,1\}^k \to \{0,1\}^k$ where for input $x_1, \ldots, x_k$
the ith output is 1 if and only if $w_i x_i > w_j x_j$ for all $j \ne i$.
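The two gate types can be written down in a few lines. This is a minimal sketch of the definitions as stated above; the names `threshold_gate` and `wta_gate` are chosen here for illustration, and the threshold $w_0$ is passed as a separate argument.

```python
def threshold_gate(weights, threshold, x):
    """Output 1 iff the weighted sum of the binary inputs reaches the threshold w_0."""
    return int(sum(w * xi for w, xi in zip(weights, x)) >= threshold)


def wta_gate(weights, x):
    """Output i is 1 iff w_i * x_i is strictly larger than every other w_j * x_j."""
    scaled = [w * xi for w, xi in zip(weights, x)]
    k = len(scaled)
    return [int(all(scaled[i] > scaled[j] for j in range(k) if j != i))
            for i in range(k)]
```

Because the comparison is strict, a tie for the largest weighted input produces no winner at all; the weights used later in the article are pairwise distinct, so this case does not arise there.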
We propose the following abstract model for estimating the total wire length required
for the neural implementation of an abstract circuit design (which is formally defined
as a directed graph with nodes labeled by specific types of gates, or by input or output
variables):
Gates, input- and output-ports of a circuit are placed on different nodes of a 2-dimensional
grid (with unit distance 1 between adjacent grid nodes). Connections
between them are represented by (unidirectional) wires that run through the grid-plane
in any way that the designer wants; in particular, wires may cross and need not
run rectilinearly (wires are thought of as running in the 3-dimensional space above
the plane, without charge for vertical wire segments).¹ We define the minimal value
of the sum of all wire lengths that can be achieved by any such arrangement as the
total wire length of the circuit.
We would like to make this model also applicable to cases where, for $k \ge 2$,
threshold or winner-take-all functions of k inputs are computed in analog VLSI
by efficient subcircuits that employ a number of transistors, total wire length and
area that are all linear in k, with a setting time that is independent of k.² We model
such computational modules as “threshold gates” or “winner-take-all gates” of k inputs
that take one unit of time for their computation like all the other gates, but
which each occupy a set of k intersection points of the grid that are all connected
by an undirected wire (whose length contributes to the total wire length) in some
arbitrary fashion.³
The attractiveness of this model lies in its mathematical simplicity and in its generality
(see [3,4] for a more detailed analysis of the complexity measure total wire length,
and results on the total wire length of circuits that solve two other pattern recognition
tasks). It provides a rough estimate for the cost of connectivity both in artificial (basically
2-dimensional) circuits and in neural circuits, where 2-dimensional wire crossing
problems are apparently avoided (at least on a small scale) since dendritic and axonal
branches are routed through 3-dimensional cortical tissue. We also give bounds on the
complexity of our circuit designs in the common abstract model for VLSI.
We refer to Section 12.2 in [7] for the precise definition of the abstract model for
VLSI-area to which the theorems in this article refer. One assumes there that gates,
input- and output-ports and wires cover rectilinear areas with a width and separation of
at least $\lambda$. Areas occupied by different gates, input- and output-ports are not allowed to
intersect with one another. Areas occupied by wires may intersect with areas occupied
by gates, input- and output-ports and also with other wires, but there is a constant
bound $\nu$ on the number of wire areas to which a point of the plane may belong. The
complexity measure induced by this model is the area of the smallest rectangle that
encloses the circuit. We follow [7] in assuming that in the VLSI-model one unit of
time is needed to transmit a bit along a wire (of any length), and also for each gate
switching. However, in contrast to [7], we always assume that all inputs are presented
in parallel.
We will show in Theorem 1 that $P^n_{LR}$ can be computed by a circuit consisting of
$O(\log n)$ threshold gates in depth 2, with a total wire length of $O(n \log n)$. Theorem 2
implies that no feedforward circuit can compute $P^n_{LR}$ with asymptotically fewer threshold gates. Finally,
it is shown in Theorem 3 that $P^n_{LR}$ can be computed by a circuit of depth 2 consisting
¹ We will allow that a wire from a gate or input port may branch and provide input to several other gates.
For reasonable bounds on the maximal fan-out ($10^4$ in the case of neural circuits) this is realistic both for
neural circuits and for VLSI.
² See [2].
³ Any one of these k nodes may be used to provide one of the k inputs or to extract one of the outputs
of the function.
of two winner-take-all gates and one threshold gate, with total wire length O(n). This
result demonstrates that winner-take-all gates can in some contexts be computationally
much more powerful than threshold gates, although they do not require much more
area in analog VLSI (see [5] for some more general results in this direction).
2. Global pattern detection in 1-dimensional maps
We start the analysis of this pattern recognition task by showing that $P^n_{LR}$ can be
computed very fast by a circuit consisting of $O(\log n)$ threshold gates. We also give
bounds on the total wire length of this circuit and the area that it occupies in a VLSI
layout.
Theorem 1. $P^n_{LR}$ can be computed by a feedforward circuit of depth 2, consisting of
$2 \log n + 1$ threshold gates with total wire length $O(n \log n)$ and area $O(n \log n)$ in
a VLSI layout.
Proof. Denote by $a = (a_0, \ldots, a_{n-1})$ and $b = (b_0, \ldots, b_{n-1})$ the two vectors of input
features. It will be convenient to denote the position l of the leftmost occurrence of
feature a by min(a) and the position r of the rightmost occurrence of feature b by
max(b). Note that these quantities are not defined if no feature a, respectively
b, is present. The following precise definition eliminates this ambiguity. We define
$\min(a) = \min\{i \mid a_i = 1\}$ if $a \ne (0, \ldots, 0)$ and $\min(a) = n - 1$ otherwise. Furthermore, we
define $\max(b) = \max\{i \mid b_i = 1\}$ if $b \ne (0, \ldots, 0)$ and $\max(b) = 0$ otherwise. Note that
with this simple definition, $P^n_{LR}(a, b) = 1 \Leftrightarrow \min(a) < \max(b)$. We construct a threshold
circuit which computes the binary encoding of min(a) and max(b) in its first layer.
Let us call the function that maps a onto the binary representation of min(a) MinMux
and the function that maps b onto the binary representation of max(b) MaxMux.
The comparison of their outputs yields the desired output of $P^n_{LR}$.
For convenience, let $n = 2^k$ for some natural number k. The precise definitions of
the functions MinMux and MaxMux are as follows.
$\mathrm{MinMux}_n : \{0,1\}^n \to \{0,1\}^k$ is defined by
\[
\mathrm{MinMux}_n(a) =
\begin{cases}
\text{binary encoding of } \min\{i \mid a_i = 1\} & \text{if } \exists i \, (a_i = 1), \\
\text{binary encoding of } n - 1 & \text{otherwise.}
\end{cases}
\]
$\mathrm{MaxMux}_n : \{0,1\}^n \to \{0,1\}^k$ is defined by
\[
\mathrm{MaxMux}_n(b) =
\begin{cases}
\text{binary encoding of } \max\{i \mid b_i = 1\} & \text{if } \exists i \, (b_i = 1), \\
\text{binary encoding of } 0 & \text{otherwise.}
\end{cases}
\]
The comparison of the two $\log n$-bit binary numbers represented by MinMux and
MaxMux can be carried out by an additional threshold gate with weights linear in n.
In the following, we construct a circuit consisting of $\log n$ threshold gates that
computes MinMux. Note that, for any input assignment, setting $a_{n-1} = 1$ does not
change the value of the function. We will use this trick to make sure that the output
of the circuit is the binary encoding of $n - 1$ if there is no feature a present.
Let $m_j$ denote the jth output bit of $\mathrm{MinMux}_n$ ($0 \le j \le k - 1$), such that
\[
\min(a) = \sum_{j=0}^{k-1} 2^j m_j .
\]
The jth bit of the binary encoding of some natural number x is 1 if
$\lfloor x / 2^j \rfloor \equiv 1 \pmod 2$ and 0 otherwise.
This leads to the following threshold function for $m_j$:
\[
m_j(a_0, \ldots, a_{n-1}) =
\begin{cases}
1 & \text{if } \sum_{i=0}^{n-1} a_i \, 2^{n-i} \, (-1)^{1 + (\lfloor i / 2^j \rfloor \bmod 2)} \ge 1, \\
0 & \text{otherwise.}
\end{cases}
\]
Let $l = \min(a)$ and suppose that $\lfloor l / 2^j \rfloor \equiv 0 \pmod 2$. Since $a_i = 0$ for all $i < l$, it follows that
\[
\sum_{i=0}^{n-1} a_i \, 2^{n-i} \, (-1)^{1 + (\lfloor i / 2^j \rfloor \bmod 2)}
\le 2^{n-l} (-1) + \sum_{i=l+1}^{n-1} 2^{n-i}
\le -2^{n-l} + \sum_{i=1}^{n-l-1} 2^i \le -2 < 1
\]
and the output of the threshold gate is 0. Suppose that $\lfloor l / 2^j \rfloor \equiv 1 \pmod 2$. It follows
that
\[
\sum_{i=0}^{n-1} a_i \, 2^{n-i} \, (-1)^{1 + (\lfloor i / 2^j \rfloor \bmod 2)}
\ge 2^{n-l} (+1) - \sum_{i=l+1}^{n-1} 2^{n-i}
\ge 2^{n-l} - \sum_{i=1}^{n-l-1} 2^i \ge 1
\]
and the output of the threshold gate is 1. Hence, $m_j$ is the jth bit of the binary
representation of min(a).
MaxMux can be constructed in a similar manner. Hence, each output bit of MinMux and MaxMux can be computed
by one threshold gate, and the depth and size of the circuit given in Theorem 1 follow.
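The construction in this proof can be checked on small inputs with the following sketch. The MinMux weights are taken from the threshold function for $m_j$ above. The MaxMux weights ($2^{i+1}$ with the same sign pattern) and the comparison weights ($\pm 2^j$, threshold 1) are assumptions made here to fill in the "similar manner" and the comparison gate; the article does not spell them out. All function names are illustrative.

```python
from itertools import product


def threshold_gate(weights, threshold, x):
    """Output 1 iff the weighted sum of the inputs reaches the threshold."""
    return int(sum(w * xi for w, xi in zip(weights, x)) >= threshold)


def sign(i, j):
    """(-1)^(1 + (floor(i / 2^j) mod 2)): +1 if bit j of i is set, else -1."""
    return 1 if (i >> j) & 1 else -1


def min_mux(a):
    """Bits m_0..m_{k-1} of min(a), one threshold gate per bit (weights from the proof)."""
    n = len(a)
    k = n.bit_length() - 1
    assert k >= 1 and n == 2 ** k, "n is assumed to be a power of two"
    forced = list(a[:-1]) + [1]  # the trick from the proof: set a_{n-1} = 1
    return [threshold_gate([2 ** (n - i) * sign(i, j) for i in range(n)], 1, forced)
            for j in range(k)]


def max_mux(b):
    """Bits of max(b). Assumed mirror image of min_mux: weight 2^(i+1) on b_i."""
    n = len(b)
    k = n.bit_length() - 1
    assert k >= 1 and n == 2 ** k, "n is assumed to be a power of two"
    return [threshold_gate([2 ** (i + 1) * sign(i, j) for i in range(n)], 1, list(b))
            for j in range(k)]


def p_lr_circuit(a, b):
    """Depth-2 circuit: 2 log n gates in layer 1, one comparison gate in layer 2."""
    m, big_m = min_mux(a), max_mux(b)
    k = len(m)
    # Assumed comparison gate: fires iff max(b) - min(a) >= 1; weights are at most n/2.
    weights = [2 ** j for j in range(k)] + [-(2 ** j) for j in range(k)]
    return threshold_gate(weights, 1, big_m + m)


def p_lr_reference(a, b):
    """Direct evaluation of the definition, used only for checking."""
    n = len(a)
    return int(any(a[i] and b[j] for i in range(n) for j in range(i + 1, n)))


def circuit_is_correct(n):
    """Compare the circuit with the definition on all 2^(2n) inputs."""
    inputs = list(product((0, 1), repeat=n))
    return all(p_lr_circuit(a, b) == p_lr_reference(a, b) for a in inputs for b in inputs)
```

Calling `circuit_is_correct(4)` compares the circuit against the definition on all 256 inputs. Note that the first-layer weights grow exponentially in n, whereas the comparison gate only needs weights linear in n.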
The VLSI layout of the circuit for $P^4_{LR}$ is shown in Fig. 1a. We place the gates for
MinMux on rows beneath $a_0, \ldots, a_{n-1}$ and the gates for MaxMux on rows beneath
$b_0, \ldots, b_{n-1}$. Since the circuit consists of $\log n$ gates for $\mathrm{MinMux}_n$ and $\log n$ gates
for $\mathrm{MaxMux}_n$, this occupies $O(\log n)$ rows. The comparison gate can be placed in
the column between those gates. Hence, the layout of the circuit occupies $O(n \log n)$
area. A layout to estimate the total wire length is similar. The layout of the circuit for
$P^4_{LR}$ is shown in Fig. 1b. Simply replace a threshold gate of k inputs by k nodes
that are connected by a common wire to sum up the inputs. This results in a wire
length of $O(n)$ within each gate. The wire from an input port to its successor gates
Fig. 1. (a) The VLSI-circuit layout for $P^4_{LR}$. The gates for MinMux and MaxMux are placed on rows
beneath the inputs. The area used by this layout is $O(n \log n)$. (b) A layout to estimate the total wire length
of the circuit. A threshold gate of k inputs is represented by k nodes that are connected by a wire (wires
without arrows). Such gates are indicated by a dashed rectangle. The total wire length is $O(n \log n)$.
may spread and hence is $O(\log n)$ in length. The comparison gate has a total wire
length of $O(\log n)$. Summing these lengths over the $2 \log n$ first-layer gates and the $2n$ input
ports results in a total wire length of $O(n \log n)$.
The following lower bound result shows that the number of threshold gates used by
the circuit of Theorem 1 is asymptotically optimal:
Theorem 2. Any feedforward circuit consisting of threshold gates needs to have
$\Omega(\log n)$ gates for computing $P^n_{LR}$.
We use the gate-elimination method to prove Theorem 2. The gate-elimination
method has been used widely in classical circuit complexity theory. It was used in the context
of threshold circuits in a paper by Georg Schnitger and Bhaskar DasGupta (see [1]).
In our case we have to exhibit some properties of $P_{LR}$ that allow us to assign constants
to inputs of a circuit $S_n$ that computes $P^n_{LR}$, such that the circuit computes $P_{LR}$ on the
remaining non-constant variables. Furthermore, we use these properties to show that
at least one threshold gate computes a constant after the assignment of constants to at
most 63n/64 of its input variables. We use this restriction to construct a circuit that
computes $P^{n/64}_{LR}$ and has at least one gate less than $S_n$. Hence, the size of $S_n$ is at least
the size of $S_{n/64}$ plus 1, which we use as the induction step. The induction hypothesis is that a circuit
$S_n$ that computes $P^n_{LR}$ consists of at least $\lfloor \log_{64} n \rfloor$ threshold gates.⁴
⁴ $\lfloor x \rfloor$ denotes the floor of x, that is, $\lfloor x \rfloor = \max\{y \in \mathbb{N} \cup \{0\} \mid y \le x\}$.
Proof. We will first exhibit the three properties of $P_{LR}$ that will be the basis for the
proof. Then we will show how to eliminate one threshold gate in a circuit computing
$P_{LR}$ by assigning constants to a fixed fraction of its inputs. Finally, we will use this
gate elimination to give an inductive proof of the lower bound.
The properties of $P_{LR}$ given below are easy to verify.
Property 1.
\[
P^n_{LR}(a_0, \ldots, a_{i-1}, 0, a_{i+1}, \ldots, a_{n-1}, b_0, \ldots, b_{i-1}, 0, b_{i+1}, \ldots, b_{n-1})
= P^{n-1}_{LR}(a_0, \ldots, a_{i-1}, a_{i+1}, \ldots, a_{n-1}, b_0, \ldots, b_{i-1}, b_{i+1}, \ldots, b_{n-1})
\]
for all $i \in \{0, \ldots, n-1\}$.
Property 2. If the first $k+1$ positions carry $a_i = 0$ and $b_i = 1$, then
\[
P^n_{LR}(0, \ldots, 0, a_{k+1}, \ldots, a_{n-1}, 1, \ldots, 1, b_{k+1}, \ldots, b_{n-1})
= P^{n-k-1}_{LR}(a_{k+1}, \ldots, a_{n-1}, b_{k+1}, \ldots, b_{n-1})
\]
for all $k \in \{0, \ldots, n-2\}$.
Property 3. If the last $k$ positions carry $a_i = 1$ and $b_i = 0$, then
\[
P^n_{LR}(a_0, \ldots, a_{n-1-k}, 1, \ldots, 1, b_0, \ldots, b_{n-1-k}, 0, \ldots, 0)
= P^{n-k}_{LR}(a_0, \ldots, a_{n-1-k}, b_0, \ldots, b_{n-1-k})
\]
for all $k \in \{1, \ldots, n-1\}$.
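The three restriction properties can be confirmed by brute force for small n. The sketch below reads Property 2 as fixing the first k+1 positions and Property 3 as fixing the last k positions, which is an interpretation of the index ranges above; `p_lr` and `properties_hold` are names introduced here for illustration.

```python
from itertools import product


def p_lr(a, b):
    """Direct evaluation of the definition of P_LR on two equally long tuples."""
    n = len(a)
    return int(any(a[i] and b[j] for i in range(n) for j in range(i + 1, n)))


def properties_hold(n):
    """Check Properties 1-3 on every input pair of length n."""
    for a in product((0, 1), repeat=n):
        for b in product((0, 1), repeat=n):
            value = p_lr(a, b)
            # Property 1: a position with a_i = b_i = 0 can be deleted.
            for i in range(n):
                if a[i] == 0 and b[i] == 0:
                    if value != p_lr(a[:i] + a[i + 1:], b[:i] + b[i + 1:]):
                        return False
            # Property 2: leading positions 0..k with a = 0, b = 1 can be deleted.
            for k in range(n - 1):
                if not any(a[:k + 1]) and all(b[:k + 1]):
                    if value != p_lr(a[k + 1:], b[k + 1:]):
                        return False
            # Property 3: the trailing k positions with a = 1, b = 0 can be deleted.
            for k in range(1, n):
                if all(a[n - k:]) and not any(b[n - k:]):
                    if value != p_lr(a[:n - k], b[:n - k]):
                        return False
    return True
```

The intuition is the same in all three cases: the fixed positions can never take part in a pair (i, j) with i < j and $a_i = b_j = 1$ that is not already witnessed among the free positions.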
Let $S_n$ be a threshold circuit computing $P^n_{LR}$. We show how to eliminate one gate
in $S_n$ by exploiting the properties of $P_{LR}$ given above. We assume that n is a power
of 64. If it is not, use Property 1 to obtain a threshold circuit such that the number of
non-constant inputs to the circuit is the next lower power of 64.
Let g be a gate in $S_n$ which does not have an output of a gate as one of its inputs.
Then g computes the function
\[
g(a, b) =
\begin{cases}
1 & \text{if } \sum_{i=0}^{n-1} u_i a_i + \sum_{i=0}^{n-1} v_i b_i \ge t, \\
0 & \text{else.}
\end{cases}
\]
First, we need all the weights for a to have the same sign and all the weights for b to have
the same sign, where $\mathrm{sign} : \mathbb{R} \to \{-1, +1\}$ is $+1$ for all $x \in \mathbb{R}^+ \cup \{0\}$ and $-1$ otherwise.
More formally, we want
\[
\mathrm{sign}(u_i) = \mathrm{sign}(u_j) \text{ for all } i, j, \qquad
\mathrm{sign}(v_i) = \mathrm{sign}(v_j) \text{ for all } i, j.
\]
This can be achieved by setting at most 3n/4 variables in a and at most 3n/4 variables
in b to constant zero: each position i falls into one of four classes according to the pair
$(\mathrm{sign}(u_i), \mathrm{sign}(v_i))$, so some class contains at least n/4 positions, and we set the inputs at
all other positions to zero. By Property 1, the circuit computes $P^{n/4}_{LR}$ on the remaining
non-constant variables. We renumber the remaining $m = n/4$ variables in a, the n/4
remaining variables in b (we preserve the order) and the corresponding weights. Let
$\alpha_1 = \sum_{i=0}^{m/2-1} u_i$, $\alpha_2 = \sum_{i=m/2}^{m-1} u_i$, $\beta_1 = \sum_{i=0}^{m/2-1} v_i$ and $\beta_2 = \sum_{i=m/2}^{m-1} v_i$. We consider four
cases:
Case 1: $\mathrm{sign}(u_i) = +1$, $\mathrm{sign}(v_i) = -1$ for all $i = 0, \ldots, m-1$.
Case 1.1: $|\beta_1| > \alpha_2 - t$.
We set $a_0 = \cdots = a_{m/2-1} = 0$ and $b_0 = \cdots = b_{m/2-1} = 1$. By Property 2
of $P_{LR}$, the circuit computes $P^{m/2}_{LR}$ on the remaining non-constant variables.
It follows that
\[
-|\beta_1| + \sum_{i=m/2}^{m-1} |u_i| a_i - \sum_{i=m/2}^{m-1} |v_i| b_i
< t - \alpha_2 + \alpha_2 - \sum_{i=m/2}^{m-1} |v_i| b_i \le t .
\]
Hence $g(a, b) = 0$ for all possible values of $a_{m/2}, \ldots, a_{m-1}$ and $b_{m/2}, \ldots, b_{m-1}$.
Case 1.2: $|\beta_1| \le \alpha_2 - t$.
We set $a_{m/2} = \cdots = a_{m-1} = 1$ and $b_{m/2} = \cdots = b_{m-1} = 0$. By Property 3
of $P_{LR}$, the circuit computes $P^{m/2}_{LR}$ on the remaining non-constant variables.
It follows that
\[
\sum_{i=0}^{m/2-1} |u_i| a_i - \sum_{i=0}^{m/2-1} |v_i| b_i + \alpha_2
\ge \sum_{i=0}^{m/2-1} |u_i| a_i - |\beta_1| + |\beta_1| + t \ge t .
\]
Hence $g(a, b) = 1$ for all possible values of $a_0, \ldots, a_{m/2-1}$ and $b_0, \ldots, b_{m/2-1}$.
In case 1, there remain $2 \cdot m/2 = 2 \cdot n/8$ non-constant variables after the restriction.
Case 2: $\mathrm{sign}(u_i) = -1$, $\mathrm{sign}(v_i) = +1$.
We can treat this case in a similar manner as case 1.
Case 3: $\mathrm{sign}(u_i) = +1$, $\mathrm{sign}(v_i) = +1$.
Case 3.1: $\beta_1 \ge t$.
We set $a_0 = \cdots = a_{m/2-1} = 0$ and $b_0 = \cdots = b_{m/2-1} = 1$. By Property 2
of $P_{LR}$, the circuit computes $P^{m/2}_{LR}$ on the remaining non-constant variables.
Furthermore, it follows that $g(a, b) = 1$ for all possible values of non-constant
inputs.
Case 3.2: $\alpha_2 \ge t$.
We set $a_{m/2} = \cdots = a_{m-1} = 1$ and $b_{m/2} = \cdots = b_{m-1} = 0$. By Property 3
of $P_{LR}$, the circuit computes $P^{m/2}_{LR}$ on the remaining non-constant variables.
It follows that $g(a, b) = 1$ for all possible values of non-constant
inputs.
After any of these restrictions, $2 \cdot n/8$ non-constant variables remain and the circuit
computes $P^{n/8}_{LR}$. For the following restriction, we can assume $\beta_1 < t$ and $\alpha_2 < t$. In
this case, the weights for the second half of the remaining inputs to g are small.
So our aim will be to eliminate variables with large weights in g. Then the sum
of the remaining inputs to g will be too small to reach the threshold, and the gate
will output a constant for all possible values of non-constant inputs. In a first step,
we set all inputs that contribute to $\alpha_1$ and $\beta_1$ to constant zero. The effect is that all
weights of a's are small. We use another restriction to maintain small weights for
non-constant b's. Then we set the inputs that have the largest weights to constant zero.
We need to do this for at most 3/4 of the remaining variables to let the gate output
zero for all possible values of non-constant inputs.
Case 3.3: $\beta_1 < t$, $\alpha_2 < t$.
We set $a_0 = \cdots = a_{m/2-1} = b_0 = \cdots = b_{m/2-1} = 0$. By Property 1 of $P_{LR}$,
the circuit computes $P^{m/2}_{LR}$ on the remaining non-constant variables. Let
$l = m/2 = n/8$. Again, renumber the non-constant variables and the corresponding
weights of g, so that
\[
g(a, b) =
\begin{cases}
1 & \text{if } \sum_{i=0}^{l-1} u_i a_i + \sum_{i=0}^{l-1} v_i b_i \ge t, \\
0 & \text{else.}
\end{cases}
\]
Let $\alpha'_1 = \sum_{i=0}^{l/2-1} u_i$, $\alpha'_2 = \sum_{i=l/2}^{l-1} u_i$, $\beta'_1 = \sum_{i=0}^{l/2-1} v_i$ and $\beta'_2 = \sum_{i=l/2}^{l-1} v_i$.
Since $\alpha'_1 + \alpha'_2 = \alpha_2 < t$, we have $\alpha'_1 < t$. If $\beta'_1 \ge t$, case 3.1 applies and
$l/2 = n/16$ variables remain. Finally, we consider weights such that $\alpha'_1 < t$
and $\beta'_1 < t$. In this case, we set $a_i = b_i = 0$ for $i = l/2, \ldots, l-1$ to eliminate
the second half of the inputs (Property 1 of $P_{LR}$ applies). Then, by
Property 1 of $P_{LR}$, we set those l/4 remaining variables in a to zero that
have maximal weights. We also eliminate those l/8 remaining variables
in b with maximal weights. It follows that the overall sum of the remaining
variables cannot reach t and $g(a, b) = 0$ for all possible values of
non-constant inputs. There will remain at least $2 \cdot l/8 = 2 \cdot n/64$ non-constant
variables.
Case 4: $\mathrm{sign}(u_i) = -1$, $\mathrm{sign}(v_i) = -1$.
We can treat this case in a similar manner as case 3.
We have constructed a threshold circuit that has at least one gate less and computes
$P^{n/64}_{LR}$.
We use this property of $S_n$ to give an inductive proof of the lower bound. The
inductive hypothesis is that $\mathrm{size}(S_n) \ge \lfloor \log_{64} n \rfloor$. Since we use the floor of $\log_{64} n$ in
the bound, we can use induction on n for all n of the form $n = 64^m$ for some natural
number m.
In the basis case, we have n = 64. Use Property 1 to obtain a circuit that computes
$P^2_{LR}$. Since $P^2_{LR}(a_0, a_1, b_0, b_1) = a_0 \wedge b_1$, a circuit that computes $P^{64}_{LR}$ consists of at least
one threshold gate. Hence, the hypothesis holds for the induction basis. For the induction
step, consider a threshold circuit $S_n$ that computes $P^n_{LR}$. We show that, if the size
of $S_n$ is small, then we can construct a circuit $S_{n/64}$ with a smaller size than is possible.
Suppose that $\mathrm{size}(S_n) < \lfloor \log_{64} n \rfloor$. Construct a circuit $S_{n/64}$ that computes $P^{n/64}_{LR}$ by eliminating
one gate in $S_n$. Then $\mathrm{size}(S_{n/64}) \le \mathrm{size}(S_n) - 1 < \lfloor \log_{64} n \rfloor - 1 = \lfloor \log_{64}(n/64) \rfloor$. This
contradicts the induction hypothesis for n/64. Hence, $\mathrm{size}(S_n) \ge \lfloor \log_{64} n \rfloor$.
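The induction can be read as unrolling the recurrence $\mathrm{size}(S_n) \ge \mathrm{size}(S_{n/64}) + 1$ down to the basis case n = 64. The following sketch does exactly that for n a power of 64; the function name `gate_lower_bound` is an assumption made here, and the returned value is only the bound derived in the proof, not the size of any particular circuit.

```python
def gate_lower_bound(n):
    """Unroll size(S_n) >= size(S_{n/64}) + 1 with size(S_64) >= 1.

    n is assumed to be a power of 64, as in the proof; the result equals log_64(n).
    """
    if n < 64:
        raise ValueError("the basis case of the induction is n = 64")
    bound = 1  # basis case: a circuit for P^64_LR needs at least one gate
    while n > 64:
        if n % 64 != 0:
            raise ValueError("n must be a power of 64")
        n //= 64   # one gate-elimination step restricts the circuit to n/64 positions
        bound += 1  # and removes at least one gate
    if n != 64:
        raise ValueError("n must be a power of 64")
    return bound
```

Each loop iteration corresponds to one application of the gate-elimination step, so the number of iterations plus the basis case gives the bound.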
In analog VLSI the area occupied by a subcircuit that implements a winner-take-all
gate is comparable to that for a threshold gate. Hence the next theorem demonstrates
a drastic gain in efficiency if one employs modules for computing winner-take-all in
addition to threshold gates. It combines the minimal possible computation time of 2
with a linear total wire length.
Theorem 3. $P^n_{LR}$ can be computed by a feedforward circuit of depth 2, consisting of
two winner-take-all gates and one threshold gate, with total wire length and area
$O(n)$.
Proof. Denote by $a = (a_0, \ldots, a_{n-1})$ and $b = (b_0, \ldots, b_{n-1})$ the two vectors of input
features. We construct a circuit that consists of two winner-take-all gates in the first
layer and one threshold gate in the second layer. One winner-take-all gate marks the
position of the leftmost occurrence of feature a and the other winner-take-all gate marks
the position of the rightmost occurrence of feature b. In the second layer, a single
threshold gate with linear weights can compute the value of $P^n_{LR}(a, b)$.
Let $a' = (a'_0, \ldots, a'_{n-1}) = \mathrm{WTA}(w_0 \cdot a_0, \ldots, w_{n-1} \cdot a_{n-1})$ denote the output vector of
a winner-take-all gate with the inputs $a_0, \ldots, a_{n-1}$ weighted by integer weights $w_0, \ldots, w_{n-1}$.
Set the weights of the winner-take-all gate such that:
\[
a' = \mathrm{WTA}((n+1) \cdot a_0, \; n \cdot a_1, \; (n-1) \cdot a_2, \; \ldots, \; 2 \cdot a_{n-1}, \; 1).
\]
If $a = (0, \ldots, 0)$, $a'_n$ wins (i.e. $a'_n$ is the only non-zero output of the gate). Otherwise,
$a'_i$ wins if and only if $i = \min\{j \mid a_j = 1\}$, for $0 \le i \le n-1$. Furthermore, set the weights
of the second winner-take-all gate such that:
\[
b' = \mathrm{WTA}(2 \cdot b_0, \; 3 \cdot b_1, \; 4 \cdot b_2, \; \ldots, \; (n+1) \cdot b_{n-1}, \; 1).
\]
If $b = (0, \ldots, 0)$, $b'_n$ wins. Otherwise, $b'_i$ wins if and only if $i = \max\{j \mid b_j = 1\}$, for
$0 \le i \le n-1$. A simple threshold gate with $a'$ and $b'$ as its inputs can be used to
compute the value of $P^n_{LR}(a, b)$:
\[
P^n_{LR} =
\begin{cases}
1 & \text{if } \sum_{i=0}^{n-1} \bigl( b'_i \cdot (i+1) - a'_i \cdot (i+1) \bigr) - b'_n \cdot (n+1) - a'_n \cdot (n+1) \ge 1, \\
0 & \text{otherwise.}
\end{cases}
\]
If there is no feature a present, $a'_n$ wins and the gate outputs 0. The same holds for the
case that no feature b is present. Otherwise, exactly one $a'_i$ and exactly
one $b'_j$ with $i, j \le n-1$ are non-zero, and the weighted sum equals $j - i$. If $i < j$, the
weighted sum reaches the threshold and the gate outputs 1. For $i \ge j$, the sum is below
the threshold and the gate outputs 0.
Any gate can be implemented with linear wire length in our model. So the total
wire length is O(n). A similar VLSI layout uses linear area.
In contrast to the threshold circuit of Theorem 1 just linear size integer weights are
needed for this circuit.
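The circuit from the proof of Theorem 3 can be sketched and checked exhaustively for small n as follows. The gate definitions follow the Introduction and the weights follow the proof above; the function names and the reference implementation used for comparison are assumptions introduced here.

```python
from itertools import product


def threshold_gate(weights, threshold, x):
    """Output 1 iff the weighted sum of the inputs reaches the threshold."""
    return int(sum(w * xi for w, xi in zip(weights, x)) >= threshold)


def wta_gate(weights, x):
    """Output i is 1 iff w_i * x_i is strictly larger than every other w_j * x_j."""
    scaled = [w * xi for w, xi in zip(weights, x)]
    k = len(scaled)
    return [int(all(scaled[i] > scaled[j] for j in range(k) if j != i))
            for i in range(k)]


def p_lr_wta_circuit(a, b):
    """Two WTA gates in layer 1, one threshold gate in layer 2."""
    n = len(a)
    # Weights n+1, n, ..., 2 favour the leftmost a; the extra constant input 1 wins if a = 0.
    a_out = wta_gate([n + 1 - i for i in range(n)] + [1], list(a) + [1])
    # Weights 2, 3, ..., n+1 favour the rightmost b; the extra constant input 1 wins if b = 0.
    b_out = wta_gate([i + 2 for i in range(n)] + [1], list(b) + [1])
    # Second layer: -(i+1) on a'_i (including a'_n), +(i+1) on b'_i, -(n+1) on b'_n.
    weights = ([-(i + 1) for i in range(n + 1)]
               + [i + 1 for i in range(n)] + [-(n + 1)])
    return threshold_gate(weights, 1, a_out + b_out)


def p_lr_reference(a, b):
    """Direct evaluation of the definition, used only for checking."""
    n = len(a)
    return int(any(a[i] and b[j] for i in range(n) for j in range(i + 1, n)))


def wta_circuit_is_correct(n):
    """Compare the circuit with the definition on all 2^(2n) inputs."""
    inputs = list(product((0, 1), repeat=n))
    return all(p_lr_wta_circuit(a, b) == p_lr_reference(a, b)
               for a in inputs for b in inputs)
```

All weights in this sketch have absolute value at most n + 1, which matches the remark that linear-size integer weights suffice, and n need not be a power of two here.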
3. Discussion
We have shown that the basic pattern recognition problem whether a certain local
feature a occurs to the left of some other local feature b can be solved by circuits
that require very little total wire length, and hence can potentially be implemented in
analog VLSI. Furthermore it was shown that a circuit with O(log n) threshold gates can
solve this problem, and that this number of threshold gates is asymptotically optimal.
Finally, it was demonstrated that the same problem can be solved more efficiently if
winner-take-all gates are employed in addition to a threshold gate. This gives rise to the
question of which other concrete computational tasks can be carried out more efficiently
by circuits that use winner-take-all gates besides (or instead of) threshold gates.
References
[1] B. DasGupta, G. Schnitger, Analog versus discrete neural networks, Neural Comput. 8 (1996) 819–842.
[2] J. Lazzaro, S. Ryckebusch, M.A. Mahowald, C.A. Mead, Winner-take-all networks of O(n) complexity,
in: Advances in Neural Information Processing Systems, Vol. 1, Morgan Kaufmann, San Mateo, 1989,
pp. 703–711.
[3] R.A. Legenstein, W. Maass, Foundations for a circuit complexity theory of sensory processing, in:
Advances in Neural Information Processing Systems, Vol. 13, MIT Press, Cambridge, MA, 2001, pp.
259–265.
[4] R.A. Legenstein, W. Maass, Total wire length as a salient circuit complexity measure for sensory
processing, 2001, submitted for publication.
[5] W. Maass, On the computational power of winner-take-all, Neural Comput. 12 (11) (2000) 2519–2536.
[6] C. Mead, Analog VLSI and Neural Systems, Addison-Wesley, Reading, MA, 1989.
[7] J.E. Savage, Models of Computation: Exploring the Power of Computing, Addison-Wesley, Reading,
MA, 1998.
[8] S. Thorpe, D. Fize, C. Marlot, Speed of processing in the human visual system, Nature 381 (1996)
520–522.
