This paper first presents some general properties of product networks pertinent to parallel architectures and then focuses on three case studies. These are products of complete binary trees, shuffle-exchange, and de Bruijn networks. It is shown that all of these are powerful architectures for parallel computation, as evidenced by their ability to efficiently emulate numerous other architectures. In particular, r-dimensional grids and r-dimensional meshes of trees can be embedded efficiently in products of these graphs, i.e., either as a subgraph or with small constant dilation and congestion. In addition, the shuffle-exchange network can be embedded in the r-dimensional product of shuffle-exchange networks with dilation cost 2r and congestion cost 2. Similarly, the de Bruijn network can be embedded in the r-dimensional product of de Bruijn networks with dilation cost r and congestion cost 4. Moreover, it is well known that shuffle-exchange and de Bruijn graphs can emulate the hypercube with a small constant slowdown for "normal" algorithms. This means that their product versions can also emulate these hypercube algorithms with constant slowdown. Conclusions include a discussion of many open research areas.
Introduction
Interconnection networks with small diameter, small vertex degrees, and large bandwidth are well suited for massively parallel computation. The hypercube is a well known example of a network with small diameter and large bandwidth, but the vertex degree of the hypercube grows logarithmically with the number of vertices, making it hard to build scalable architectures. Grids have larger diameter than hypercubes, but due to their fixed and small vertex degrees their popularity has been increasing in recent years. A small vertex degree implies that the system can be implemented with a small hardware cost spent for the communication channels. A fixed vertex degree implies that the system can be expanded without having to modify the individual nodes. Although these are important advantages offered by the grid network, besides its ability to efficiently compute certain classes of algorithms, it is not well suited for other classes of computations, including divide-and-conquer, ascend-descend, parallel merge, etc.
Interconnection networks with small, fixed vertex degrees and logarithmic diameters exist, including for instance binary trees, meshes of trees, shuffle-exchange, and de Bruijn networks, and these networks are good for those computations where the grid is not. However, they are inefficient for other computations where the grid interconnection is efficient. The power of the hypercube is due to its ability to emulate all of these and other architectures efficiently [4, 12, 5, 7]. If we exclude the hypercube from consideration due to its large vertex degree, then we are faced with the challenge of designing a network which performs as well as the hypercube in these areas and which has a small and fixed vertex degree. As shown in this paper, product networks built from fixed degree networks can potentially serve as viable candidates for this purpose.

Figure 1: Two-dimensional product of complete binary trees. The grid points shown top-left are connected in the binary tree pattern for each row (center-left), and for each column (center-right). The final product network is shown top-right.
In simple terms, the r-dimensional product of an N-node graph G is obtained from the r-dimensional $N^r$-node grid by replacing the linear connections of the grid with the interconnection pattern of G. As an example, Figure 1 shows the 2-dimensional product of 7-node binary trees. The notion of "dimension" in product graphs will be made more precise in the next section, but for now it suffices to think of the r-dimensional product of graph G as a generalization of the r-dimensional grid, where each dimension is connected in the pattern of G.
Some special cases of product graphs were studied by other researchers. For example, Rosenberg [16] showed that two-dimensional products of de Bruijn networks can efficiently emulate butterflies, grids, and two-dimensional meshes of trees. Ganesan and Pradhan [11] studied the product graph obtained from crossing hypercubes with de Bruijn networks. The resulting network is analogous to the two-dimensional grid whose connections in the first dimension are replaced by the hypercube connections, while the second-dimension connections are replaced by those of the de Bruijn graph. They showed that the resulting network has better embedding properties than the hypercube. Youssef studied some general properties of product networks, including connectivity, diameter and average distances, permutation routing, etc., and gave examples from crossing the hypercube with various other networks, including tree networks, banyan networks, linear arrays, and rings [19]. Other well known product networks are grids, where G is a linear array of N nodes, and hypercubes, where N = 2. The generalized hypercube [3] can also be considered as the r-dimensional product of complete graphs. This paper focuses on "homogeneous" product networks of r dimensions for $r \ge 2$; homogeneous in the sense that for every dimension, the pattern of interconnection is defined by the same graph G. The more general case of different interconnection patterns at different dimensions is certainly worthy of investigation. The choice of homogeneous products in this paper is not completely arbitrary, however, since it allows certain relationships between a "factor" network and its r-dimensional product versions to be investigated more easily. That is, it is relatively easy to state and prove statements of the form "... if G has the property A, then the r-dimensional product of G has the corresponding property B ..."
Not only does this type of analysis give a clear picture of any improvement (or lack of it, as the case may be) attained by product networks, but certain facts of this type can also be easily generalized to "heterogeneous" products. Three case studies are presented in this paper, including products of complete binary trees, shuffle-exchange, and de Bruijn networks. This selection is based on two reasons: first, these are already known to be powerful networks for parallel computation, and second, they have small and fixed vertex degrees. As shown in this paper, product networks based on these architectures are even more powerful computationally. On the negative side, the maximum vertex degree of the product networks considered here grows faster (by a constant factor) than that of the hypercube or grid with the increasing number of dimensions. However, there is little motivation for implementing them with a large number of dimensions. This is partly because a small number of dimensions can perform most of the known parallel algorithms, and partly because these product networks can grow without increasing the vertex degree. This latter factor gives them an advantage over the hypercube, which cannot grow without increasing the vertex degree.
The specific contributions of this paper are the following: After presenting the basic definitions and notations in the next section, the general properties of product networks are presented in Section 3. These discussions are focused on those properties of product networks which are considered important for parallel computation, including the vertex degrees, partitionability, connectivity, diameter, bisection width, embedding properties, and routing.
The products of complete binary trees are considered in Section 4. Based on the results in Section 3, it is shown that the r-dimensional $N \times N \times \cdots \times N$ product of complete binary trees has diameter $D = 2r(\log(N+1) - 1)$, maximum vertex degree 3r, and bisection width at least $\Omega(N^{r-1})$. It can emulate the r-dimensional $N^r$-node torus with dilation 3 and congestion 2, and contains the r-dimensional mesh of N-node trees as a subgraph.
The products of shuffle-exchange graphs are considered in Section 5. Again from the analyses of Section 3, it is noted that the r-dimensional $N \times N \times \cdots \times N$ product of shuffle-exchange networks has diameter $D = r(2\log N - 1)$, maximum vertex degree 3r, and bisection width $\Theta(N^r/\log N)$. It contains the r-dimensional $N^r$-node grid as a subgraph, and emulates the r-dimensional mesh of $(N-1)$-node trees with dilation cost 2 and congestion cost 2. It is also shown that the $N^r$-node pure shuffle-exchange graph can be embedded in the r-dimensional $N \times N \times \cdots \times N$ product of shuffle-exchange networks with dilation cost 2r and congestion cost 2. This dilation can be considered as constant when r is fixed. Moreover, the reverse embedding of the product of shuffle-exchange graphs on the pure shuffle-exchange graph is shown to require a logarithmic dilation cost, which suggests that the product version of the shuffle-exchange network is computationally more powerful than the pure shuffle-exchange network itself.
Finally, the products of de Bruijn networks are considered in Section 6. The r-dimensional $N \times N \times \cdots \times N$ product of de Bruijn networks has diameter $D = r \log N$, maximum vertex degree 4r, and bisection width $\Theta(N^r/\log N)$. It contains the r-dimensional $N^r$-node torus as well as the r-dimensional mesh of $(N-1)$-node trees as a subgraph. These are significant advantages over the other two product networks examined, for a small increase in the vertex degree. It is further shown that the $N^r$-node de Bruijn graph can be embedded in the r-dimensional $N \times N \times \cdots \times N$ product of de Bruijn networks with dilation cost r and congestion cost 4. Again, this dilation can be considered as constant when r is fixed. Moreover, the reverse embedding of the product of de Bruijn graphs on the pure de Bruijn graph is shown to require a logarithmic dilation cost, which again suggests that the product version of the de Bruijn network is computationally more powerful than the pure de Bruijn network itself.
The conclusion section discusses some of the open research areas.
Definitions and Notations
We mostly use undirected graphs to model interconnection networks, while occasionally taking advantage of directed edges to shorten certain proofs when no loss of generality occurs. It will often be important to indicate the number of vertices, so we use G(N) to denote the N-node graph G. The r-dimensional product of G(N) is denoted $PG_r(N)$, with the subscript r representing the number of dimensions. These notations will be maintained for consistency throughout the paper, with a few exceptions when no confusion can arise due to the context of discussion.
In this paper, we let u, v, w denote the vertices of G(N), and x, y, z denote the vertices of product graphs obtained from G(N). Since G(N) has N vertices, the labels u, v, w take values $0, \dots, N-1$. For the r-dimensional product graph $PG_r(N)$, the vertex labels x, y, z are strings of r symbols where each symbol is drawn from $\{0, \dots, N-1\}$. For example, x is of the form $x = u_{r-1} \cdots u_i \cdots u_0$, where $u_i$ is an N-valued symbol.
Additional notation will be introduced as needed. As a reminder to the reader, the definition of product graphs is provided first, and illustrated in Figure 2. (This particular definition is frequently referred to as the "cross product," as opposed to other product operations in the literature. We just use "product" to mean the cross product.) From the symmetry in this definition, note that the product operator is commutative and associative.
The formal de nition of r-dimensional product graphs is given as follows:
Definition 2 Given a graph G(N), the r-dimensional product, denoted $PG_r(N)$, is
1. a single vertex without any edges and no labels when r = 0,
2. $PG_r(N) = G(N) \otimes PG_{r-1}(N)$, when r > 0.

At a more intuitive level, the construction of $PG_r(N)$ from $PG_{r-1}(N)$ can be described as follows: First, place the vertices of $PG_{r-1}(N)$ along a straight line as shown in Figure 3. Then, draw N copies of $PG_{r-1}(N)$ such that the vertices with identical labels fall in the same column. Next, extend the vertex labels so that vertex label x becomes ux, for $u \in \{0, \dots, N-1\}$. Finally, connect the columns in the interconnection pattern of the labeled graph G(N), such that $ux$ is connected to $u'x$ if and only if $(u, u')$ is an edge in G(N). From this, the edges of $PG_r(N)$ can be characterized as follows.

This can be easily verified. Suppose two labels x and y differ in just one symbol position. From Observation 1, we can reorder the symbols in the labels so that the differing symbols become the leftmost symbol. Then the claim can be verified from Definition 2. If two or more symbols differ, one of them can be made the leftmost symbol and the claim can again be verified from Definition 2.
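The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `product_graph`, the tuple representation of labels, and the 0-based level-order numbering of the 7-node tree are choices made here, not notation from the paper.

```python
from itertools import product

def product_graph(edges, n, r):
    """Edge set of the r-dimensional product of an n-node factor graph.

    `edges` lists the undirected edges (u, v) of the factor graph;
    product vertices are r-tuples of symbols drawn from 0..n-1.
    """
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    result = set()
    for x in product(range(n), repeat=r):
        for i in range(r):                  # change exactly one symbol position
            for w in adj[x[i]]:             # ... along an edge of the factor graph
                y = x[:i] + (w,) + x[i + 1:]
                result.add(frozenset((x, y)))
    return result

# 7-node complete binary tree, 0-based level order: node k has children 2k+1, 2k+2.
TREE7 = [(k, c) for k in range(3) for c in (2 * k + 1, 2 * k + 2)]
pt2 = product_graph(TREE7, 7, 2)            # the network drawn in Figure 1
```

Every edge produced this way joins two labels that differ in exactly one symbol position, and the two differing symbols are adjacent in the factor graph.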
The General Properties of Product Networks
This section presents some computational properties which are common for all product graphs.
The ability to recursively partition a graph into distinct copies of its smaller versions is another important property, since it allows assigning the parts of a recursive computation to different subnetworks, or shows a way to share the system between many users. Product graphs contain a variety of subgraphs which are isomorphic copies of product graphs of lower dimensions. Let $PG_r^{-i}(N)$ denote the subgraph of $PG_r(N)$ induced by removing the symbol at position i. This corresponds to erasing the connections at dimension i. Then, for r > 0, $PG_r^{-i}(N)$ is isomorphic to N disjoint copies of $PG_{r-1}(N)$ for all $i \in \{0, \dots, r-1\}$. For $i = r-1$ (i.e., the leftmost symbol index) the partitionability is immediate from Definition 2 and Figure 3. For arbitrary i, we can use Observation 1 to reorder the symbols in the address labels so that the ith symbol becomes the leftmost symbol, and then refer to Definition 2.
This can be applied recursively and any number of symbols can be removed from the vertex labels to obtain product graphs of smaller dimensions.
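A small check of this partitioning property, written as a sketch: the helper name `components_without_dimension` is hypothetical, positions are counted as tuple indices (an assumption of this example), and the factor graph is assumed to be connected.

```python
from itertools import product

def components_without_dimension(edges, n, r, i):
    """Count connected components of the product graph after erasing
    all connections along tuple position i."""
    adj = {u: set() for u in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, count = set(), 0
    for start in product(range(n), repeat=r):
        if start in seen:
            continue
        count += 1                          # a new component was found
        seen.add(start)
        stack = [start]
        while stack:                        # depth-first search inside the component
            x = stack.pop()
            for d in range(r):
                if d == i:                  # dimension i has been erased
                    continue
                for w in adj[x[d]]:
                    y = x[:d] + (w,) + x[d + 1:]
                    if y not in seen:
                        seen.add(y)
                        stack.append(y)
    return count
```

For a connected N-node factor graph the count is N for every choice of i, matching the N disjoint copies of the (r-1)-dimensional product.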
The diameter of a network is another important property. Several papers were devoted to developing networks with small diameter and small vertex degree, and bounds have been derived on diameter as a function of vertex degree [13]. In general, computation of the exact diameter of a given graph may be difficult, but for homogeneous product graphs we are able to state simple rules to calculate the diameter. We say that a network is "self-routing" if messages can be delivered to their destinations through shortest paths without an external controller. The basic idea in Theorem 2 is to apply the routing algorithm of G(N) in each dimension of $PG_r(N)$ where the source and destination addresses differ. The basic idea in Theorem 1 is to observe that there exist pairs of nodes in $PG_r(N)$ which differ in every symbol position. Moreover, each differing symbol pair may correspond to a distance as large as the diameter of G(N). Finding such a pair yields both a lower bound and an upper bound for the diameter of the product graph.
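The two ideas above can be tried out on a small case. The sketch below is an illustration only: the statements of Theorems 1 and 2 are not reproduced in this text, so the code assumes what the diameters quoted in the introduction suggest, namely that the product diameter is r times the diameter of G(N), and that routing corrects one differing symbol at a time along a shortest path of G(N). All function names are hypothetical.

```python
from collections import deque
from itertools import product

def distances(adj, src):
    """Breadth-first distances from src in a graph given as {node: [neighbors]}."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def product_adj(adj, r):
    """Adjacency lists of the r-dimensional product of the graph `adj`."""
    return {x: [x[:i] + (w,) + x[i + 1:] for i in range(r) for w in adj[x[i]]]
            for x in product(adj, repeat=r)}

def diameter(adj):
    return max(max(distances(adj, s).values()) for s in adj)

def route(adj, x, y):
    """Correct one differing symbol at a time, each along a shortest path in G."""
    path = [x]
    for i in range(len(x)):
        while path[-1][i] != y[i]:
            cur = path[-1]
            to_target = distances(adj, y[i])
            step = min(adj[cur[i]], key=to_target.get)   # neighbor closest to y[i]
            path.append(cur[:i] + (step,) + cur[i + 1:])
    return path

# 7-node complete binary tree, 0-based level order (an assumed labeling).
TREE7 = {k: [] for k in range(7)}
for k in range(3):
    for c in (2 * k + 1, 2 * k + 2):
        TREE7[k].append(c)
        TREE7[c].append(k)
```

On this example the tree has diameter 4 and its two-dimensional product has diameter 8, which agrees with $D = 2r(\log(N+1) - 1)$ for N = 7 and r = 2.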
The embedding results of this research are among the most important results since they show a way of emulating one network by another. In the context of product networks, the utility of embedding results is further emphasized by the fact that many of the existing popular architectures can be modeled as product networks. An embedding of a "guest" graph G in a "host" graph H is a mapping of the vertices of G into the vertices of H and the edges of G into paths in H. The main cost measures used in embedding efficiency are:
Load of an embedding is the maximum number of vertices of G mapped to any vertex of H.
Dilation of an embedding is the maximum path length in H representing an edge of G.
Congestion of an embedding is the maximum number of paths (that correspond to the edges of G) that share any edge of H.

This theorem and its extensions have many significant implications. In particular, the next two results will be used frequently in the following sections. The following is useful when proving embedding results. It has been used quite frequently, and often implicitly, by many researchers, and this makes it difficult to attribute it to a single researcher.
Proposition 1 Let $G'$ be a subgraph of G.

We note that these statements are equally true for the congestion of an embedding. That is, we could replace the word "dilation" with "congestion" and the statements would continue to hold.
The bisection width of a network determines its bandwidth, and has important implications for the VLSI layout complexity bounds [18]. We first give a definition which will be used in the statement and proof of the following theorem.
Definition 3 We say that the "maximal congestion" of a connected graph G(N) is C if, after mapping the vertices of the N-node directed complete graph onto the vertices of G(N) in a one-to-one manner, there is a mapping of the edges of the complete graph into paths in G(N) such that no edge of G(N) has congestion more than C.
Note that the maximal congestion is an intrinsic parameter of a graph just like the chromatic number, crossing number, etc. are intrinsic parameters of a graph. Lower bound: Suppose there is an estimated lower bound value L on the bisection width of G(N).
As a first step, consider how one could verify that L is indeed a lower bound for the bisection width of G(N). We do this by a method due to Leighton [15].

Example 2: To compute a lower bound $L_{PK}$ on the bisection width of $PK_r(N)$ by this method, we first map the nodes of the $N^r$-node directed complete graph, $K(N^r)$, onto the nodes of $PK_r(N)$ one to one. We then map the edges of $K(N^r)$ to shortest paths in $PK_r(N)$. Consider an edge (x, z) of $K(N^r)$ mapped to a path in $PK_r(N)$. Let (x, y) be the first edge of the path from x to z. If y differs from x in dimension $\ell$, then z must also differ from x in dimension $\ell$, and the edge (x, y) must be used for all possible values of z. Since z can take at most $N^{r-1}$ different values (including y itself), the congestion of the edge (x, y) is at most $C_{PK} = N^{r-1}$. Then the lower bound is $L_{PK} = (N^{2r}/2)/C_{PK}$, where $N^{2r}/2$ is the bisection width of $K(N^r)$. This yields $L_{PK} = N^{r+1}/2$, which is equal to $U_{PK}$ above.
These two examples establish that the exact bisection width of $PK_r(N)$ is $N^{r+1}/2$. We now use this to prove the theorem. Above, we showed that each edge of $PK_r(N)$ represents $N^{r-1}$ edges of $K(N^r)$.
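As a quick arithmetic check of the lower bound in Example 2, the division can be written out in full. The numeric instance with N = 4 and r = 2 is an illustration chosen here, not a case taken from the paper.

\[
L_{PK} \;=\; \frac{N^{2r}/2}{C_{PK}} \;=\; \frac{N^{2r}/2}{N^{r-1}} \;=\; \frac{N^{r+1}}{2},
\qquad \text{e.g. } N = 4,\ r = 2:\quad \frac{4^{4}/2}{4^{1}} \;=\; \frac{128}{4} \;=\; 32 \;=\; \frac{4^{3}}{2}.
\]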
Embedding Properties
Despite their simple structures, products of binary trees have very interesting embedding properties. For instance, while tori and meshes of trees are powerful architectures, they have different strengths and weaknesses. It is shown in this section that the product of binary trees can emulate both of these architectures very efficiently. It is further shown that $PT_r(N)$ can emulate a comparable size complete binary tree efficiently, while the reverse emulation of the $PT_r(N)$ architecture by a complete binary tree requires logarithmic dilation cost.

Proof: Due to Corollaries 1 and 2, it suffices to show that the N-node cycle can be embedded in the N-node complete binary tree with dilation cost 3 and congestion cost 2. This follows from a theorem due to Leighton [15] which states that the N-node cycle can be embedded in any N-node connected graph with dilation cost 3 and congestion cost 2. This means that the r-dimensional product of any connected graph can emulate the corresponding r-dimensional torus with the claimed dilation and congestion.
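One standard way to place a cycle on a tree with dilation 3 is to list even-depth nodes before their children and odd-depth nodes after their children. The sketch below applies that ordering to a complete binary tree with heap numbering (root 1, children 2v and 2v+1). It is offered as an illustration and is not necessarily the construction used in [15]; it checks dilation only, not congestion, and the function names are hypothetical.

```python
def prepostorder(n):
    """Cyclic order of the nodes 1..n of a complete binary tree (heap numbering)."""
    order = []

    def visit(v, depth):
        if depth % 2 == 0:
            order.append(v)                 # even depth: emit before the children
        for c in (2 * v, 2 * v + 1):
            if c <= n:
                visit(c, depth + 1)
        if depth % 2 == 1:
            order.append(v)                 # odd depth: emit after the children

    visit(1, 0)
    return order

def tree_distance(a, b):
    """Distance between heap-numbered nodes: move the larger label to its parent."""
    d = 0
    while a != b:
        if a > b:
            a //= 2
        else:
            b //= 2
        d += 1
    return d

def cycle_dilation(n):
    """Largest tree distance between cyclically consecutive nodes of the order."""
    order = prepostorder(n)
    return max(tree_distance(order[i], order[(i + 1) % n]) for i in range(n))
```

For the 7-node tree the order is 1, 4, 5, 2, 6, 7, 3, and the longest hop (from 2 to 6) has length 3.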
The $PT_r(N)$ graph contains not just the mesh of trees, but a hierarchy of meshes of trees as shown next.
Theorem 6 For all $i = 1, \dots, \log(N+1) - 1$, $PT_r(N)$ contains the mesh of $(N+1)/2^i$-leaf trees.

Proof: Figure 5 shows the two-dimensional meshes of trees contained in $PT_2(7)$. Note that in this figure there are two meshes of trees contained: one with $(N+1)/2 = 4$ leaves for each tree (shown in dark nodes), and one with $(N+1)/4 = 2$ leaves for each tree (shown in empty nodes). In general, the

Proof: For r = 2, the embedding of the 5-level complete binary tree in $PT_2(7)$ is shown in Figure 6. Note in particular that the tree in the middle row constitutes the highest 3 levels of the tree. The leaves of this row tree correspond to the roots of column trees. This pattern can be recursively repeated for larger values of N in two dimensions. Assuming that the claim is true for $PT_{r-1}(N)$, the embedding proof for r dimensions follows from the recursive construction of $PT_r(N)$.
Note that for r = 2, the tree embedded by the above method is the largest tree possible. The next result shows that complete binary tree cannot emulate its comparable size product network with less than logarithmic dilation.
Theorem 8 Any embedding of $PT_r(N)$ in a large enough complete binary tree requires dilation cost $\Omega(\log(r \log\log N))$.
Proof: Referring to Proposition 1, we show that $PT_r(N)$ contains a subgraph $G_1$, and there exists a graph $G_2$ which contains the complete binary tree as a subgraph, such that embedding of $G_1$ in $G_2$ requires the claimed amount of dilation.
It is first shown that products of binary trees can be embedded in the products of shuffle-exchange graphs with dilation cost 2 and congestion cost 2. While this result carries all the embedding properties of $PT_r(N)$ to the $PS_r(N)$ graph, it may be better to find direct embeddings for some cases. For instance, r-dimensional grids are subgraphs of $PS_r(N)$. Next, it is shown that the $N^r$-node shuffle-exchange graph can be embedded in the $PS_r(N)$ graph with dilation cost 2r and congestion cost 2. For an implementation with a fixed number of dimensions, this embedding can be considered as constant dilation, particularly because N can be independent of r. Moreover, it is shown that $PS_r(N)$ cannot be embedded in the $N^r$-node shuffle-exchange graph with less than logarithmic dilation cost. This makes the product network more powerful than the shuffle-exchange network itself.
Theorem 9 $PT_r(N-1)$ can be embedded in $PS_r(N)$ with dilation cost 2 and congestion cost 2.
Proof: Due to Corollaries 1 and 2, it suffices to show that the $(N-1)$-node binary tree can be embedded in the N-node shuffle-exchange graph with dilation cost 2 and congestion cost 2. The level order labeling of the $(N-1)$-node complete binary tree as shown in Figure 8 induces the desired embedding. The root is assigned the label 1, and successively lower levels are assigned the remaining labels left-to-right.
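The level order labeling can be checked mechanically. The sketch below assumes the usual conventions, which are not spelled out in this text: labels are n-bit integers with $N = 2^n$, the shuffle edge is a left rotation of the label, and the exchange edge flips the rightmost bit. Function names are hypothetical.

```python
from collections import Counter

def shuffle(k, n):
    """Left-rotate the n-bit label k by one position."""
    return ((k << 1) | (k >> (n - 1))) & ((1 << n) - 1)

def exchange(k):
    """Complement the rightmost bit of the label."""
    return k ^ 1

def tree_edge_paths(n):
    """Paths in the 2**n-node shuffle-exchange graph for the edges of the
    (2**n - 1)-node complete binary tree labeled 1..2**n - 1 in level order."""
    paths = {}
    for k in range(1, 1 << (n - 1)):        # internal tree nodes
        left = shuffle(k, n)                # equals 2k because the top bit of k is 0
        paths[(k, 2 * k)] = [k, left]
        paths[(k, 2 * k + 1)] = [k, left, exchange(left)]
    return paths

def max_congestion(paths):
    """Largest number of paths sharing one undirected host edge."""
    usage = Counter()
    for path in paths.values():
        for a, b in zip(path, path[1:]):
            usage[frozenset((a, b))] += 1
    return max(usage.values())
```

Each left child is reached over one shuffle edge and each right child over the same shuffle edge followed by an exchange edge, which is where dilation 2 and congestion 2 come from.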
The following results are now immediately observed:
Corollary 3 As in Theorem 6, a hierarchy of meshes of trees can be embedded in $PS_r(N)$ with dilation cost 2 and congestion cost 2.
Corollary 4 The r-dimensional $N^r$-node grid is a subgraph of $PS_r(N)$.

Proof: It was shown in [9] that the shuffle-exchange network contains a Hamiltonian path. Hence the result follows from Theorem 3.
The next two results consider the embedding of the shuffle-exchange graph in its product version, and the reverse embedding of the product network in the pure shuffle-exchange graph.
Theorem 10 For r > 1, the $N^r$-node shuffle-exchange graph can be embedded in $PS_r(N)$ with dilation cost 2r and congestion cost 2.
Proof: First consider the case r = 2. Both $S(N^2)$ and $PS_2(N)$ are labeled by $2\log N$-bit strings. For the product graph, the rightmost $\log N$ bits determine the "row address," while the leftmost $\log N$ bits determine the "column address." We show that whenever (u, v) is an exchange edge in $S(N^2)$, it is also an exchange edge in $PS_2(N)$. Write $n = \log N$ and $u = u_{2n-1} u_{2n-2} \cdots u_{n+1} u_n \,|\, u_{n-1} u_{n-2} \cdots u_1 u_0$, where "|" separates the column address from the row address. In the $PS_2(N)$ graph, u has an exchange neighbor w in its row, whose address is obtained by complementing the rightmost bit of the address. That is, w = v. In fact, it is true for arbitrary r that whenever (u, v) is an exchange edge of the $N^r$-node shuffle-exchange graph, it is also an exchange edge in the $PS_r(N)$ graph. Therefore, the rest of this proof only needs to consider the shuffle edges.

Now suppose (u, v) is a shuffle edge in $S(N^2)$. If u is as above, v must be
$v = u_{2n-2} \cdots u_{n+1} u_n u_{n-1} \,|\, u_{n-2} \cdots u_1 u_0 u_{2n-1}$.
For the $PS_2(N)$ graph, the row neighbors of u are
$w^{e,r} = u_{2n-1} u_{2n-2} \cdots u_{n+1} u_n \,|\, u_{n-1} u_{n-2} \cdots u_1 \bar{u}_0$ and
$w^{s,r} = u_{2n-1} u_{2n-2} \cdots u_{n+1} u_n \,|\, u_{n-2} \cdots u_1 u_0 u_{n-1}$,
where the superscripts "e, s, r" stand for "exchange," "shuffle," and "row," respectively. The column neighbors of u, indicated by the superscript "c," are
$w^{e,c} = u_{2n-1} u_{2n-2} \cdots u_{n+1} \bar{u}_n \,|\, u_{n-1} u_{n-2} \cdots u_1 u_0$ and
$w^{s,c} = u_{2n-2} \cdots u_{n+1} u_n u_{2n-1} \,|\, u_{n-1} u_{n-2} \cdots u_1 u_0$.
In the following discussion, the subscripts $\ell$ and $r$ are used to denote the left-hand half and the right-hand half of a label. For example, $w^{s,c}_\ell$ denotes the left-hand half of the vertex $w^{s,c}$ above.
There are two cases to consider.

Case 1: $u_{2n-1} = u_{n-1}$. In this case the reader can easily verify that $v = w^{s,c}_\ell \,|\, w^{s,r}_r$. This means that one can go from u to v in $PS_2(N)$ in two steps: by moving to the shuffle neighbor of u in the column and then to the shuffle neighbor in the row. Alternatively, one can move to the shuffle neighbor in the row first, and then in the column.

Here the "+" sign denotes sequencing of two moves. That is, $w^{s,c}_\ell + w^{e,c}_\ell$ denotes moving to the shuffle neighbor in the column, followed by moving to the exchange neighbor in the column.
Since $v = v_\ell \,|\, v_r$, a sequence of four moves yields the desired vertex label.
To extend these arguments for r > 2, since $PS_r(N) = S(N) \otimes PS_{r-1}(N)$, a vertex of $PS_r(N)$ can be written as $u = u_{rn-1} u_{rn-2} \cdots u_{(r-1)n} \,|\, S'$, where $S'$ is a vertex in $PS_{r-1}(N)$. For the discussion below, only the leftmost bit of $S'$ is relevant, so we can write $S' = sS$. That is, $u = u_{rn-1} u_{rn-2} \cdots u_{(r-1)n} \,|\, sS$. In the $N^r$-node shuffle-exchange graph, the shuffle neighbor is $v = u_{rn-2} \cdots u_{(r-1)n} s \,|\, S u_{rn-1}$.
For the product network, u has a shuffle neighbor $x^s$, where $x^s = u_{rn-2} \cdots u_{(r-1)n} u_{rn-1} \,|\, sS$, which in turn has an exchange neighbor $x^e$, where $x^e = u_{rn-2} \cdots u_{(r-1)n} \bar{u}_{rn-1} \,|\, sS$. Let $x_\ell$ denote the leftmost $\log N$-bit substring of x (i.e., the part to the left of "|" above). Then observe that $v_\ell = x^s_\ell$ if $u_{rn-1} = s$, and $v_\ell = x^e_\ell$ otherwise. That is, $x^e$ is at a distance of two from u, and going from u to $x^e$ corrects just the leftmost $\log N$ bits of the address towards v. Since the next set of $\log N$ bits can be corrected by the same method as above, $2(r-1)$ additional steps are needed to reach v. This completes the proof that the dilation of the embedding is 2r.
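The block-by-block correction described in this proof can be written as a short routine. This is a sketch under stated assumptions: labels are bit strings, block 0 is the leftmost (highest) dimension, and the names `rotate_left` and `shuffle_edge_path` are hypothetical. A move that would not change the label (the shuffle of an all-equal block is a self-loop) is skipped rather than counted.

```python
def rotate_left(bits):
    """Shuffle operation on a bit string: rotate left by one position."""
    return bits[1:] + bits[:1]

def shuffle_edge_path(u, r, n):
    """Path in the product graph from u to its shuffle neighbor in the
    N**r-node shuffle-exchange graph, where u has r blocks of n bits."""
    blocks = [u[j * n:(j + 1) * n] for j in range(r)]
    path = [u]

    def record():
        label = "".join(blocks)
        if label != path[-1]:               # skip self-loops
            path.append(label)

    for j in range(r):                      # correct the blocks from left to right
        # Bit that must end up rightmost in block j: the leading bit of the
        # next block of the original label, wrapping around for the last block.
        incoming = u[((j + 1) * n) % (r * n)]
        blocks[j] = rotate_left(blocks[j])  # shuffle edge in dimension j
        record()
        if blocks[j][-1] != incoming:       # exchange edge in dimension j
            blocks[j] = blocks[j][:-1] + incoming
            record()
    return path
```

For u = 0110 with r = 2 and n = 2 the routine returns 0110, 1010, 1110, 1101, 1100: four moves, ending at the left rotation of u, in line with the dilation bound of 2r.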
To study the congestion, consider two vertices $u, u'$ of $S(N^r)$. For the discussion below, we focus on the leftmost $\log N$ bits and the rightmost $\log N$ bits of these vertices and use $S' = sS$ to denote a vertex in $PS_{r-2}(N)$. is reached. Since no other paths contain these edges, the congestion thus far is 2.
If $u_{n-1} = u_{rn-1}$, then the vertex reached is v and the edge (u, v) has been completely mapped. However, the path from $u'$ to $v'$ still needs to traverse an exchange edge to invert its rightmost bit. The path only shares this edge with the exchange edge $(v, v')$ in $S(N^r)$, and the congestion of the edge is 2.
If $u_{n-1} \ne u_{rn-1}$, then the vertex reached is $v'$, and the path from u to v still needs to traverse an exchange edge to invert its rightmost bit as before. Thus the congestion is again 2 and the proof is complete.
Theorem 11 Any embedding of $PS_r(N)$ in $S(N^r)$ requires dilation $\Omega(\log(r \log N))$.
Proof is deferred until after Theorem 15.
Products of de Bruijn Networks
The $r \log N$ and bisection width is $\Theta(N^r/\log N)$.
6. There exists a shortest path routing algorithm for the $PD_r(N)$ graph based on the de Bruijn routing algorithm.
Compared to the product networks in the previous subsections, the vertex degree increases by 25%, while $PD_r(N)$ has better properties in other respects. The diameter is reduced by 50%, and the minimum number of parallel paths between an arbitrary pair of vertices doubles. It also has better embedding properties, as will be shown below.
Embedding Properties
It is well known that shuffle-exchange and de Bruijn networks are computationally equivalent. That is, every computation which can be performed on one of them can also be performed on the other with constant slowdown. It is therefore reasonable to expect that their product versions would also be computationally equivalent. This result is formally stated by the following lemma.

This means that, by following one of the outgoing edges from u at the highest dimension, we correct the leftmost $\log N$ bits of the address towards v. Since the next set of $\log N$ bits can be corrected by the same method as above, $r - 1$ additional steps are needed to reach v. This completes the proof that the dilation of the mapping is r.
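The one-edge-per-dimension correction can be sketched in the same style. The assumptions are made for this example only: labels are bit strings, a de Bruijn edge drops the leftmost bit and appends a new bit on the right, block 0 is the leftmost (highest) dimension, and the name `de_bruijn_edge_path` is hypothetical. Moves that would leave the label unchanged (self-loops) are skipped.

```python
def de_bruijn_edge_path(u, b, r, n):
    """Path in the product graph from u to the de Bruijn neighbor u[1:] + b
    of the N**r-node graph, where u has r blocks of n bits and b is '0' or '1'."""
    blocks = [u[j * n:(j + 1) * n] for j in range(r)]
    path = [u]
    for j in range(r):                      # one de Bruijn edge per dimension
        # Bit shifted into block j: the leading bit of the next block of the
        # original label, or the new bit b for the rightmost block.
        incoming = u[(j + 1) * n] if j < r - 1 else b
        blocks[j] = blocks[j][1:] + incoming
        label = "".join(blocks)
        if label != path[-1]:               # skip self-loops
            path.append(label)
    return path
```

For u = 0110, b = 1, r = 2 and n = 2 the path is 0110, 1110, 1101, so the neighbor is reached in r = 2 moves.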
To study the congestion, first note that the first edge of the path from u to v and the first edge of the path from u to w are the same (depending on the value of s, the correction of the leftmost $\log N$ bits of u takes both paths to either x or y). Furthermore, the paths from u to v and from u to w share all the edges except for the last one, where the rightmost $\log N$ bits are corrected.
Similarly, there exist edges in $D(N^r)$ from the node $u' = \bar{u}_{rn-1} u_{rn-2} \cdots u_{(r-1)n} \,|\, sS$ to v and w. The paths in $PD_r(N)$ from $u'$ to v and from $u'$ to w have a common first edge that, depending on s, takes the paths to either x or y. From there they share all the remaining edges except for the last one. The paths from u to v and from $u'$ to v share all the edges except for the first one, and the same is true of the paths from u to w and from $u'$ to w. Therefore, the first and last edges of the paths are traversed by two of the four paths, and the internal edges of the paths are traversed by all four paths identified above. Since the edges traversed by these four paths are not traversed by any other path, we can conclude that the congestion of the embedding is at most four.
When r = 2 the paths have length 2 and there are no internal edges, so the congestion in this case is only 2.
Earlier, it was shown in [10] that $D(2^k)$ can emulate $D(2^{k+j})$ with unit dilation cost (the authors of [10] actually called it the "4-pin shuffle graph"). In the resultant emulation, each vertex of $D(2^k)$ is assigned exactly $2^j$ nodes. The proof is based on the observation that, by erasing the rightmost j bits from the vertex labels of $D(2^{k+j})$, we obtain a graph isomorphic to $D(2^k)$. By the same observation the following results can be stated.

Therefore, for fixed r, a small size $PD_r(N)$ architecture can easily emulate larger size machines with proportional slowdown in the running time. These last two results are interesting because a small hypercube cannot emulate a larger hypercube with constant congestion. In the case of the hypercube the congestion increases by the same amount as the load. For de Bruijn graphs and their products, the emulation of larger graphs of their kind requires no increase in the congestion.
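The bit-erasing observation can be verified exhaustively for small sizes. The sketch below assumes the usual directed de Bruijn edges on integer labels (shift left, drop the leading bit, append 0 or 1) and checks edge preservation and load only; the function names are hypothetical.

```python
from collections import Counter

def de_bruijn_successors(u, bits):
    """Out-neighbors of u in the de Bruijn graph on `bits`-bit labels."""
    mask = (1 << bits) - 1
    return {((u << 1) & mask) | b for b in (0, 1)}

def truncation_preserves_edges(k, j):
    """True if erasing the rightmost j bits maps every edge of D(2**(k+j))
    onto an edge of D(2**k)."""
    for u in range(1 << (k + j)):
        for w in de_bruijn_successors(u, k + j):
            if (w >> j) not in de_bruijn_successors(u >> j, k):
                return False
    return True

def truncation_loads(k, j):
    """Number of nodes of D(2**(k+j)) assigned to each vertex of D(2**k)."""
    return Counter(u >> j for u in range(1 << (k + j)))
```

With k = 3 and j = 2, every vertex of the 8-node graph receives exactly 4 nodes of the 32-node graph, and every edge of the larger graph lands on an edge of the smaller one.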
Finally, the next result shows that products of de Bruijn graphs are more powerful than the pure de Bruijn graphs. (This is an extension of a similar result in [16] given for two dimensions.)

Theorem 15 Any embedding of $PD_r(N)$ in $D(N^r)$ requires dilation $\Omega(\log(r \log N))$.

Proof: From Proposition 1 it suffices to show that $PD_r(N)$ contains a subgraph which cannot be embedded in $D(N^r)$ with dilation cost less than $\log(r \log N)$. From Theorem 12, we know that the r-dimensional $N^r$-node array is a subgraph of $PD_r(N)$. It is shown in [2] that any embedding of the M-node k-dimensional array, for $k \ge 2$, requires dilation cost $\Omega(\log\log M)$. Since $M = N^r$, the claim follows.
Proof of Theorem 11: We know from Lemma 1 that $PS_r(N)$ and $PD_r(N)$ are computationally equivalent. If $S(N^r)$ could emulate $PS_r(N)$ with dilation less than $\Omega(\log(r \log N))$, it would imply that $S(N^r)$ could also emulate $PD_r(N)$ with dilation less than this amount, implying that the shuffle-exchange network is more powerful than the de Bruijn network, which contradicts their computational equivalence.
Discussions and Conclusions
Product networks inherently bridge the gap between many useful topologies due to their dimension-oriented definitions. It is interesting too that we are able to cite large classes of computations as being in the domain of product networks, built from a graph G, even without looking at the topology of G. For instance, the r-dimensional product of any connected graph G can emulate the r-dimensional torus with dilation cost 3 and congestion cost 2 or better. Products of all networks which can emulate the complete binary tree with a given level of efficiency can also emulate the mesh of trees with the same level of efficiency. Additional advantages could be offered if G were to have other features which can be exploited at higher dimensions. Three case studies were presented in this paper, and some of the special advantages offered by each were analyzed in detail. Figure 10 compares different product networks with each other as well as with their non-product versions.
Here the columns labeled "product" denote the $PG_r(N)$ graphs built from G(N), while the columns labeled "pure" denote the $G(N^r)$ graph. The binary tree appears to benefit the most from the product definition. Its bisection width increases from 1 to $\Omega(N^{r-1})$, and the number of parallel paths increases from 1 to r. Its product version can efficiently emulate grids and meshes of trees.
In all these cases the diameter of the r-dimensional product graph is comparable with (or the same as) the diameter of the corresponding size pure graph. This is because the diameters of the graphs studied here are logarithmic. From Theorem 1, the reader can easily check that, if $G(N^r)$ has more than logarithmic diameter, the diameter of $PG_r(N)$ must be less than that of its corresponding size pure version. Conversely, if the diameter of $G(N^r)$ is less than logarithmic, the diameter of $PG_r(N)$ will be larger. There are similar relationships for the bisection width. If the bisection width of $G(N^r)$ is $O(N^r)$, then this bisection width is preserved (within a constant factor) in $PG_r(N)$. Larger bisection widths are reduced, while smaller bisection widths are increased. Consideration of these factors can help predict the expected performance improvement from the product definition of a given graph. On the other hand, it was shown in [6] that by crossing certain edges of the hypercube, the diameter can be reduced by half, while the bisection width does not change. A similar crossing method for product networks may also be applicable, and deserves further research. While routing for product networks is briefly addressed (see Theorem 2), other forms of communication are not considered in this paper. A detailed investigation of various forms of data communication in product networks is a rich area awaiting investigation.
VLSI layout of product networks is briefly addressed in [16], where it is shown that two-dimensional products of de Bruijn networks require a VLSI area which is larger than that of the corresponding size de Bruijn network by a modest factor. However, there is no general result which predicts the VLSI area of $PG_r(N)$, given that the VLSI area is known for G(N). Deriving such a relationship would be valuable as a general model of area complexity for product networks.
Finally, product networks do not have to be built with a complete G(N) for each dimension. If G(N) is a partitionable network, or it admits different values of N within its class definition, then it may be possible to build product networks with different sizes at different dimensions. All the investigations of data communication, VLSI area, and other relevant factors could be addressed for these networks also.
