with some fixed constant probability, an N-input multibutterfly can route log N permutations between some set of $\Theta(N)$ inputs and $\Theta(N)$ outputs in O(log N) time. (For a description of these and related results see [6].)
The Leighton-Maggs strategy for tolerating faults consists of two parts. First the network is reconfigured. Reconfiguring a network consists of identifying those parts that contain too many faults to be useful for routing, and removing them from the network. The goal is to leave intact as much of the working hardware as possible, while maintaining the important structural properties of the network. In the case of the multibutterfly, the crucial property is expansion (defined in Section 2). The Leighton-Maggs reconfiguration algorithm reduces the expansion of the network somewhat. However, as long as the reconfigured network has some expansion, it is possible to apply a routing algorithm that was designed to run on a fault-free multibutterfly. Thus, the second part of the strategy is to apply an off-the-shelf multibutterfly routing algorithm, such as Upfal's permutation routing algorithm [12].
One of the drawbacks of the Leighton-Maggs reconfiguration algorithm is that it is performed by an off-line computer with knowledge of the state of the entire routing network. This paper presents an on-line algorithm for reconfiguring the network in O(log N) time. The algorithm is performed entirely by the network, even though many of its switches may be faulty.
The remainder of this paper consists of two sections. In Section 2 we describe butterfly and multibutterfly networks. In Section 3 we review the reconfiguration algorithm of Leighton and Maggs and then describe the on-line algorithm.
Butterflies and Multibutterflies
An example of an 8-input butterfly network is illustrated in Figure 1. The nodes in this graph represent switches, and the edges represent wires. Each node in the network has a distinct label (r, l), where r is the row and l is the level. In a butterfly with N inputs, the row is a log N-bit binary number and the level is an integer between 0 and log N. The nodes on levels 0 and log N are called the inputs and outputs, respectively. For l < log N, a node labeled (r, l) is connected to nodes (r, l + 1) and ($r^{(l)}$, l + 1), where $r^{(l)}$ denotes r with bit l complemented (bit 0 is the most significant, bit log N − 1 the least).
In a butterfly, messages are typically sent from the switches on level 0, called the inputs, to those on level log N, called the outputs. In a one-to-one routing problem, each input is the origin of at most one message, and each output is the destination of at most one message. One-to-one routing is also called permutation routing.
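The connection rule above can be written out as a short Python sketch. The function name `butterfly_neighbors` and the encoding of a row as an integer are assumptions made for illustration; only the rule itself (complement bit l, with bit 0 the most significant) comes from the text.

```python
def butterfly_neighbors(r, l, log_n):
    """Return the two level-(l+1) neighbors of switch (r, l).

    r is the row as an integer with log_n bits, l is the level.
    Hypothetical helper for illustration only.
    """
    if not 0 <= l < log_n:
        raise ValueError("only levels 0 .. log N - 1 have outgoing edges")
    # Bit 0 is the most significant bit, so bit l sits at
    # integer bit position log_n - 1 - l.
    flipped = r ^ (1 << (log_n - 1 - l))
    return [(r, l + 1), (flipped, l + 1)]
```

For the 8-input butterfly (log N = 3), switch (0, 0) connects to rows 0 and 4 on level 1, because complementing the most significant bit of 000 gives 100.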
Dilated butterflies
Because message congestion is a common occurrence in real networks, the wires in butterfly networks are typically dilated, so that each wire is replaced by a channel consisting of 2 or more wires. In a d-dilated butterfly, each channel consists of d wires. Because it is harder to congest a channel than it is to congest a single wire in a butterfly, dilated butterflies are better routing networks than simple butterflies [3, 4, 10, 11].
Splitter networks
Butterfly and dilated butterfly networks belong to a larger class of networks called splitter networks. The switches on each level of a splitter network can be partitioned into blocks. All of the switches on level 0 belong to the [...] Figure 2. In a splitter network, each input is connected to each output by a single logical (up-down) path through the blocks of the network. The path is determined by the address of the output. At the ith level of the network, an up edge is taken if the ith bit of the output's address is 0. Otherwise a down edge is taken. As an example, Figure 3 shows the logical path from any input to output 011, which consists of an up edge followed by two down edges. In a butterfly, this logical path specifies a unique path through the network, since only one up and one down edge emanate from each switch.

Splitters with expansion are known to exist for any $d \ge 3$, and they can be constructed deterministically in polynomial time [2, 9, 12], but randomized wirings typically provide better expansion. A discussion of the tradeoffs between $\alpha$ and $\beta$ in randomly-wired splitters can be found in [7] and [12]. For the purposes of this paper, two facts are needed. First, for fixed d and sufficiently small $\alpha$, the expansion, $\beta$, of a randomly-wired splitter will be close to $d - 1$ with probability close to 1. Second, for fixed $\beta$ and sufficiently large d, $\alpha\beta$ will be close to 1/2 (the best possible) with probability close to 1.
It is not known if it is possible for $\beta$ to be close to $d - 1$ and for $\alpha\beta$ to be close to 1/2 simultaneously.
A multibutterfly with $(\alpha, \beta)$-expansion is good at routing because one must block $\beta k$ splitter outputs in order to block k splitter inputs. In classical networks such as the butterfly, the reverse is true: it is possible to block 2k inputs by blocking only k outputs. When this effect is compounded over several levels, the difference is dramatic. In a butterfly, a single fault can block $2^l$ switches l levels back, whereas in a multibutterfly, it takes $\beta^l$ faults to block a single switch l levels back.
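The logical (up-down) path rule for splitter networks described earlier in this section can be illustrated with a minimal Python sketch. The function name `logical_path` and the integer encoding of the output address are assumptions; the sketch also assumes that "the ith bit" counts from the most significant bit, in line with the bit numbering used for butterfly rows.

```python
def logical_path(output_address, log_n):
    """Return the sequence of 'up'/'down' edges leading to an output.

    output_address is an integer with log_n bits. At level i the path
    takes an up edge if bit i of the address is 0, else a down edge.
    Hypothetical helper for illustration only.
    """
    path = []
    for i in range(log_n):
        # Bit 0 is taken to be the most significant bit (an assumption).
        bit = (output_address >> (log_n - 1 - i)) & 1
        path.append("up" if bit == 0 else "down")
    return path
```

With this convention, `logical_path(0b011, 3)` yields an up edge followed by two down edges, which agrees with the example given for output 011.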
Routing around faults
In this section, we present an O(log N) time on-line algorithm for reconfiguring a multibutterfly network in the presence of faults. We begin in Section 3.1 by describing the fault model. In Section 3.2 we review the off-line algorithm of Leighton and Maggs. Next, in Section 3.3, we describe the on-line algorithm. To simplify the presentation of the algorithm, we augment the multibutterfly with some additional edges. These edges increase the size and VLSI layout area of the network by at most a constant factor. As it turns out, this additional hardware is not really necessary. We conclude in Section 3.4 by explaining how to implement the algorithm without using these extra edges.
The fault model
The reconfiguration algorithm and the routing algorithms in [6] tolerate static, non-malicious faults in the switches. In the static fault model, some faulty switches may be produced by the manufacturing process, but once the network has been manufactured, no working switch ever fails, and no faulty switch ever begins to work. We shall assume that failures are non-malicious in the sense that a working switch can query any one of its neighbors and determine in constant time whether that neighbor is faulty.
The Leighton-Maggs algorithm
The Leighton-Maggs algorithm consists of two parts. The first part, called erasure, removes some of the outputs from the network. The second part, called fault propagation, removes some of the inputs. The goal of the reconfiguration algorithm is to leave intact a large working subnetwork in which every input can reach every output, and in which the splitters have $(\alpha, \beta')$-expansion, where $\beta'$ may be less than $\beta$, but must be greater than one. The proof that a fault-free multibutterfly can route log N permutations in O(log N) time holds for any expansion greater than one [6, 12]. By a similar argument, if $\beta' > 1$, the subnetwork also can route any log N permutations between its inputs and outputs in O(log N) time [6].
The erasure part of the algorithm consists of removing those splitters that contain too many faults. This step requires some off-line computation. Each splitter in the multibutterfly is examined, and if more than an $\varepsilon$ fraction of its input switches are faulty, where $\varepsilon = 2\alpha(\beta' - 1)$ and $\beta' = \beta - \lfloor d/2 \rfloor$, then the splitter is "erased" from the network, as are all of the switches that can be reached from the splitter, including outputs on level log N. In the next section, we will present an on-line algorithm for counting the number of faults in each splitter.
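The splitter test in the erasure step can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `erasure_threshold` and `splitters_to_erase`, and the representation of a splitter as a list of input-switch ids, are assumptions. The sketch only identifies the splitters that exceed the threshold; removing every switch reachable from them is left out.

```python
def erasure_threshold(alpha, beta, d):
    """Fault fraction above which a splitter is erased.

    Uses beta' = beta - floor(d / 2) and epsilon = 2 * alpha * (beta' - 1).
    """
    beta_prime = beta - d // 2
    return 2 * alpha * (beta_prime - 1)


def splitters_to_erase(splitter_inputs, faulty, eps):
    """Return ids of splitters with more than an eps fraction of faulty inputs.

    splitter_inputs: dict mapping a (hypothetical) splitter id to the list
                     of its input switch ids.
    faulty:          set of faulty switch ids.
    """
    erased = []
    for sid, inputs in splitter_inputs.items():
        n_faulty = sum(1 for s in inputs if s in faulty)
        if n_faulty > eps * len(inputs):
            erased.append(sid)
    return erased
```

For instance, with made-up parameters alpha = 0.1, beta = 6 and d = 8, the threshold is 0.2, so a splitter with 10 input switches is erased once 3 of them are faulty.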
The second part of the algorithm, fault-propagation, is executed by the switches themselves. Working from level log N backwards, each switch checks if at least half of its upper output edges lead to faulty switches that have not been erased, or if at least half of its lower output edges lead to faulty switches that have not been erased. If so, then it declares itself to be faulty (but does not erase itself). Such a fault is called a propagated fault.
Finally, all of the remaining faulty switches are erased.

This section presents an algorithm for determining which switches to remove from the network in O(log N) time. The algorithm is on-line in the sense that the computation is performed entirely by the switches, without the aid of any off-line computation. As in Section 3.2, the reconfiguration of the network consists of two parts. First, each splitter must determine if the number of faults in its input block exceeds an $\varepsilon$ fraction, and, if so, it must erase itself. This part is difficult because each splitter must count its own faults and distribute the count to its working switches, even though the splitter itself may contain many faults. Second, faults are propagated from the outputs of the network towards the inputs. A switch is declared faulty if more than d/2 of its upper or lower output edges lead to switches that are faulty, but not erased. After erasure and fault propagation, all of the remaining faulty switches are erased.
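The fault-propagation pass can be sketched in Python as below. This is a hedged illustration under assumed data structures: `levels`, `up`, and `down` are hypothetical containers describing the network, and the function name is invented. The sketch uses the "more than half" threshold from the on-line description; the off-line description in Section 3.2 states the test as "at least half".

```python
def propagate_faults(levels, up, down, faulty, erased=frozenset()):
    """Propagate faults from the outputs back towards the inputs.

    levels: list of lists of switch ids, level 0 first, level log N last.
    up, down: dicts mapping a switch id to the list of switches reached
              by its upper / lower output edges on the next level.
    faulty: initially faulty switch ids.
    erased: switch ids already erased (their faults are not counted).
    Returns the set of faulty switches, including propagated faults.
    """
    faulty = set(faulty)
    # Work backwards from the level just before the outputs.
    for level in reversed(levels[:-1]):
        for s in level:
            if s in faulty or s in erased:
                continue
            for edges in (up[s], down[s]):
                bad = sum(1 for t in edges
                          if t in faulty and t not in erased)
                # More than half of the upper (or lower) edges are bad.
                if edges and 2 * bad > len(edges):
                    faulty.add(s)
                    break
    return faulty
```

Because edges only lead to the next level, updating the fault set in place while sweeping one level at a time does not affect decisions made within the same level. The caller would then erase every switch in the returned set.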
The erasure part consists of two tasks. First, we must identify those blocks that contain too many faults and must be erased. Then, for each splitter, each input switch must be told whether either of the two output blocks in the splitter has been erased. To help with these tasks, we add some edges to the network. In particular, each switch is connected in a random fashion to $d_2$ other switches in the same block. These edges will be used solely for the purpose of counting, and not for routing. They increase the VLSI layout area of the network by at most a constant factor. For sufficiently small, but fixed, $\alpha_2 > 0$, with probability close to 1, every set S of $k \le \alpha_2 M$ switches in a block of size M will have at least $\beta_2 k$ neighbors [7, 12]. We will choose $\alpha_2$ to be small so that $\beta_2$ will be close to $d_2 - 1$.

[...] 1) by induction, we have a contradiction.
Blocks with too many faults fall asleep
The next lemma shows that if any working switch is awake after step [...], then it is connected to many nearby working switches.

Proof: If r is still awake at the end of step i, then r must have had at least $\beta_2' = \beta_2 - \lfloor d_2/2 \rfloor$ neighbors that were awake at the end of step i − 1. In turn, these neighbors must have had at least $\beta_2'^{\,2}$ neighbors that were awake at the end of step i − 2, provided that $\beta_2' \le \alpha_2 M$. Otherwise, these switches had at least $\alpha_2 \beta_2' M$ neighbors that were awake. In general, r can reach a set of $\min\{\beta_2'^{\,j}, \alpha_2 \beta_2' M\}$ switches that were awake at the end of step i − j. Furthermore, there is a path of length j from r to any switch in this set that passes only through working switches. Choosing j such that $\min\{$ [...] and choosing i = [...] completes the proof.
Input blocks count awake switches in output blocks
Now the erasure algorithm must determine, for each splitter, whether its output blocks contain too many faults, and it must inform each input switch if either of the two output blocks must be erased. In order to count the number of faulty switches in the output blocks, the switches in each input block organize themselves into trees. Suppose that some switch r in a block of size M remains awake for [...] steps. We call r a ruler. Each ruler attempts to form a depth-[...] breadth-first spanning tree of the working switches that it can reach, with itself as the root. If r were the only ruler, then the number of steps required to form the spanning tree would be at most [...]. However, since each ruler simultaneously attempts to form a spanning tree, there will be conflicts when the trees overlap. To resolve these conflicts, we will assume that each switch in the block possesses a distinct label. Each time a switch is added to a spanning tree, it is given the label of the spanning tree's ruler. If several trees attempt to add the same switch, then the one with the smallest label succeeds, even if the switch must be removed from another tree. Since the growth of the spanning tree with the smallest label is unimpeded by the other spanning trees, after [...] steps, it will contain at least $\alpha_2 \beta_2' M$ switches.

Next, in [...] steps, each ruler counts the number of switches in its spanning tree. If the total is at least $\alpha_2 \beta_2' M$, then it broadcasts a message to the switches in the tree, telling them that they belong to a large tree. Now each large tree makes an underestimate of the number of switches that are awake in the upper and lower output blocks. In order to perform this task, a third set of edges is added to the graph. For each input switch, $d_3$ edges are added to switches in both the upper and lower output blocks at the next level.
The edges are inserted at random so that each set of $k \le \alpha_3 M$ switches in a block of size M has at least $\beta_3 k$ neighbors in both the upper and lower blocks, where $\alpha_3 \beta_3 < 1/2$. These edges increase the VLSI layout area of the network by at most a constant factor. We will choose $\alpha_3 = \alpha_2 \beta_2'$ so that a tree of size $\alpha_2 \beta_2' M$ will have at least $\alpha_2 \beta_2'$ [...]

Output blocks that are not to be erased are marked

If any large tree in the input block counts more than $(1 - 2(\varepsilon_2 + \alpha_2))M/2$ switches that are awake in an output block of size M/2, then it marks all of those switches. As we shall see, a block is erased unless it contains a marked switch. The following lemma bounds the number of network outputs that are erased.
Lemma 3.5 Let f denote the number of faults in the entire network. Then the total number of erased network outputs is at most $f/\varepsilon_2$.
Proof: If an output block of size M/2 has fewer than $\varepsilon_2 M/2$ faults, then by Lemma 3.3, after [...] steps it will have at most $(\varepsilon_2 + \alpha_2)M/2$ faulty and asleep switches. Since the switches in each large tree have at least $(1 - (\varepsilon_2 + \alpha_2))M/2$ neighbors in each output block, at least $(1 - 2(\varepsilon_2 + \alpha_2))M/2$ of those neighbors must be awake. These neighbors will all be marked and the block will not be erased. Thus, if an output block is erased, then it must have had at least an $\varepsilon_2$ fraction of faulty switches to begin with.
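The tree-forming step used earlier in the erasure algorithm, in which every ruler grows a spanning tree and overlaps are resolved in favor of the smallest label, can be sketched as a synchronous label-flooding loop. This is only an illustration under stated assumptions: the names `grow_trees`, `adj`, `working`, and `rulers` are hypothetical, switch ids double as the distinct labels, the counting edges are treated as undirected, and a ruler with a larger label may itself be absorbed by a tree with a smaller label. Tree depth and parent pointers are not tracked.

```python
from collections import Counter


def grow_trees(adj, working, rulers, steps):
    """Grow one tree per ruler for a fixed number of synchronous steps.

    adj:     dict mapping each switch id to its neighbors in the block.
    working: set of working switch ids (faulty switches never join).
    rulers:  switch ids that act as tree roots.
    Returns a dict mapping each reached switch to its ruler's label.
    """
    owner = {r: r for r in rulers if r in working}
    for _ in range(steps):
        new_owner = dict(owner)
        for s, label in owner.items():
            for t in adj[s]:
                if t not in working:
                    continue
                # The smallest label wins, even if t already belongs
                # to another tree.
                if t not in new_owner or label < new_owner[t]:
                    new_owner[t] = label
        owner = new_owner
    return owner


def tree_sizes(owner):
    """Count the switches carrying each ruler's label."""
    return Counter(owner.values())
```

On a path of six working switches with rulers at both ends, the two trees meet in the middle after two steps, and the tree with the smaller label takes over the whole path after five steps. A ruler would then compare its count from `tree_sizes` against the large-tree threshold.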
Input switches are told whether output blocks are erased
After the large trees have marked switches in the output blocks that are not to be erased, the rest of the input switches that are awake must be informed. First, every working input switch (awake or asleep) queries its [...]

Proof: If any switch in an upper output block of size M/2 is marked, then at least $\alpha_3 \beta_3 M = \alpha_2 \beta_2' \beta_3 M$ of them are marked, which for $\alpha_2 \beta_2' \beta_3 > 1/4$ is more than half of the switches in the block. By Lemma 3.4, each input switch in a block of size M that is awake can reach at least $\alpha_2 \beta_2' M$ other working switches via paths of length at most [...]. These switches have at least $\alpha_2 \beta_2' \beta_3 M$ neighbors in the upper output block, which for $\alpha_2 \beta_2' \beta_3 > 1/4$ is more than half of the switches in the block. If more than half of the switches in the upper output block are marked, and more than half of the switches are neighbors, then at least one neighbor is marked.
The last step before fault propagation is to declare any switch that is asleep to be faulty. The following lemma shows that the blocks that are not erased contain at most a $2(\varepsilon_2 + \alpha_2)$ fraction of faulty and asleep switches.

Proof: If an output block of size M/2 has more than $(\varepsilon_2 + \alpha_2)M$ faulty or asleep switches, then every large tree in the corresponding input block has at most $(1 - 2(\varepsilon_2 + \alpha_2))M/2$ neighbors in the output block that are awake, and none of those neighbors will be marked.
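The marking test that drives this argument is a single comparison, shown here as a small Python sketch. The function name and the sample parameter values are assumptions chosen only to make the arithmetic concrete; the symbols follow the threshold $(1 - 2(\varepsilon_2 + \alpha_2))M/2$ used above.

```python
def tree_marks_block(awake_neighbors, m, eps2, alpha2):
    """Decide whether a large tree marks its awake neighbors.

    awake_neighbors: number of awake switches the tree counts in an
                     output block of size m / 2.
    m:               size of the input block.
    Returns True if the count exceeds (1 - 2 * (eps2 + alpha2)) * m / 2.
    """
    threshold = (1 - 2 * (eps2 + alpha2)) * m / 2
    return awake_neighbors > threshold
```

With the made-up values m = 64 and eps2 = alpha2 = 0.05, the threshold is 25.6, so a tree that counts 26 awake neighbors marks them and a tree that counts 25 does not.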
The algorithm for propagating faults from the outputs to the inputs in O(log N) time is the same as the propagation algorithm from the Leighton-Maggs algorithm. It consists of log N stages, numbered 1 through log N. At stage i, each switch on level log N − i counts the number of faulty neighbors it has on level log N − i + 1. If more than half of its upper or lower outputs lead to faulty switches that have not been erased, then the switch declares itself to be faulty. Otherwise it does nothing. We will choose $2(\varepsilon_2 + \alpha_2) < \varepsilon$, so that each unerased block has at most an $\varepsilon$ fraction of faulty switches. As a consequence, we can apply Lemmas 3.1 and 3.2 to bound the number of faults that propagate to the inputs.

Removing the first type of edges is more problematic. The basic idea is to simulate them using the d routing edges. We begin by observing that a randomly-wired splitter is likely to have expansion both from the inputs to the outputs and from the outputs to the inputs [7, 12]. Let $(\alpha_4, \beta_4)$ be the expansion property from the output blocks (taken together) to their input block. Then $\alpha_4 \beta_4 < 1$, and $\beta_4$ will be close to $2d - 1$, provided that $\alpha_4$ is sufficiently small. A set of $k \le \alpha M$ input switches in a block of size M has at least $2\beta k$ output neighbors (counting those in both the upper and lower blocks). These outputs in turn have at least $2\beta\beta_4 k$ input neighbors, provided $2\beta k \le \alpha_4 M$. Thus, as long as none of the output neighbors are faulty, they can be used to simulate expansion $2\beta\beta_4$ within the block. This expansion can be used in place of $\beta_2$ in the algorithm of Section 3.3.

What if some of these output neighbors are faulty? This problem can be solved by declaring a switch to be faulty if any of its output neighbors are faulty (without propagating any faults) before the reconfiguration process begins. This trick may multiply the number of faults in the network by a factor of 2d, but if a switch survives then all of its output neighbors were initially working.
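The pre-processing trick in the last paragraph can be sketched in Python. The function name and the `output_neighbors` mapping are hypothetical; the point of the sketch is that only the initial fault set is consulted, so newly declared faults do not propagate further.

```python
def predeclare_faults(output_neighbors, faulty):
    """Declare a switch faulty if any of its output neighbors is faulty.

    output_neighbors: dict mapping a switch id to the switches reached
                      by its routing edges on the next level.
    faulty:           set of initially faulty switch ids.
    Only the *initial* fault set is checked, so nothing propagates.
    """
    initial = frozenset(faulty)
    declared = set(initial)
    for s, outs in output_neighbors.items():
        if any(t in initial for t in outs):
            declared.add(s)
    return declared
```

Every switch that is not in the returned set had only working output neighbors to begin with, which is the property the simulation of the counting edges relies on.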
A final look at the constants

