Some studies on the multi-mesh architecture. by Afroz, Nahid
University of Windsor 
Scholarship at UWindsor 
Electronic Theses and Dissertations Theses, Dissertations, and Major Papers 
2004 
Some studies on the multi-mesh architecture. 
Nahid Afroz 
University of Windsor 
Follow this and additional works at: https://scholar.uwindsor.ca/etd 
Recommended Citation 
Afroz, Nahid, "Some studies on the multi-mesh architecture." (2004). Electronic Theses and Dissertations. 
3538. 
https://scholar.uwindsor.ca/etd/3538 
This online database contains the full-text of PhD dissertations and Masters’ theses of University of Windsor 
students from 1954 forward. These documents are made available for personal study and research purposes only, 
in accordance with the Canadian Copyright Act and the Creative Commons license—CC BY-NC-ND (Attribution, 
Non-Commercial, No Derivative Works). Under this license, works must always be attributed to the copyright holder 
(original author), cannot be used for any commercial purposes, and may not be altered. Any other use would 
require the permission of the copyright holder. Students may inquire about withdrawing their dissertation and/or 
thesis from this database. For additional inquiries, please contact the repository administrator via email 
(scholarship@uwindsor.ca) or by telephone at 519-253-3000ext. 3208. 




Submitted to the Faculty of Graduate Studies and Research through the 
School of Computer Science in Partial Fulfillment of the Requirements for 
the Degree of Master of Science at the 
University of Windsor
Windsor, Ontario, Canada 
2004
© 2004 Nahid Afroz
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.







ISBN: 0-612-96116-8 
The author has granted a non-exclusive license allowing the 
Library and Archives Canada to 
reproduce, loan, distribute or sell 
copies of this thesis in microform, 
paper or electronic formats.
The author retains ownership of the 
copyright in this thesis. Neither the 
thesis nor substantial extracts from it 
may be printed or otherwise 
reproduced without the author's 
permission.
In compliance with the Canadian 
Privacy Act some supporting 
forms may have been removed 
from this thesis.
While these forms may be included 
in the document page count, 
their removal does not represent 
any loss of content from the 
thesis.
Abstract
In this thesis, we report our investigations on interconnection network 
architectures based on a recently proposed multi-processor architecture, the 
Multi-Mesh network. This includes the development of a new interconnection architecture, 
a study of its topological properties and a proposal for implementing the Multi-Mesh using 
optical technology.
We present a new network topology, called the 3D Multi-Mesh (3D MM), that is 
an extension of the Multi-Mesh architecture [DDS99]. This network consists of $n^3$ 
three-dimensional meshes (termed 3D blocks), each having $n^3$ processors, interconnected in 
a suitable manner so that the resulting topology is 6-regular with $n^6$ processors and a 
diameter of only $3n$. We have shown that the connectivity of this network is 6. We have 
explored an algorithm for point-to-point communication on the 3D MM. It is expected 
that this architecture will enable more efficient algorithm mapping compared to existing 
architectures.
We have also proposed an implementation of the Multi-Mesh that avoids the electronic 
bottleneck caused by the long copper wires needed for communication between some 
processors. Our implementation considers a number of realistic scenarios based on hybrid 
(optical and electronic) communication. One unique feature of this investigation is our use 
of WDM wavelength routing and a protection scheme. We are not aware of any 
implementation of interconnection networks using these techniques.
Keywords: Multiprocessor Architecture, Interconnection Network, Network Parameters, 
Mesh Network, Multi-Mesh, 3D MM, Diameter, Connectivity, Routing, Optical 
Network, Optical Communication, WDM wavelength routed optical network, Optical 
Implementation of a network, Fault tolerant.
To my parents, brothers and sisters
Acknowledgements
I would like to take this opportunity to thank my thesis advisor Dr. Subir Bandyopadhyay 
for his direction, patience and cooperation throughout my graduate studies. I am grateful 
to him for the countless hours that he spent helping me understand the areas I 
studied during my research. Without his guidance, support, suggestions and pointers this 
work would not have been achieved. Special thanks to Dr. Bhabani P. Sinha, who was a 
research scientist visiting our optical lab in summer 2003. He introduced me to the concept 
of interconnection networks and provided all kinds of help whenever I needed it. It has been 
a privilege to work with Dr. Subir Bandyopadhyay and Dr. Bhabani P. Sinha.
I wish to express my thanks to Dr. Alioune Ngom and Dr. Narayan Kar for their valuable 
time, cooperation and thoughtful suggestions that improved the quality of this thesis.
My special gratitude goes to my husband, Rabiul, for his loving support and continuous 
encouragement. I hope my little son Sindeed will forgive me for the time I 
spent on my studies when I was forced to leave him in day-care.
Last but not least, I wish to thank and dedicate this thesis to my beloved parents and my 
brothers and sisters, who have offered their best wishes throughout my whole life.





List of Tables
List of Figures
Chapter 1
Introduction
1.1 Multiprocessor Architecture
1.2 Interconnection Network
1.3 Optical Communication
1.4 Work Reported in this Thesis
1.5 Thesis Organization
Chapter 2
Literature Review
2.1 Multiprocessor Architecture
2.2 Interconnection Network
2.3 Network Parameters
2.4 Types of Interconnection Network
2.5 Examples of Some Simple Multiprocessor Architecture
2.5.1 Linear array
2.5.2 Ring
2.5.3 Star
2.5.4 Tree
2.5.5 Fully connected network
2.5.6 Hypercube
2.5.7 Mesh network
2.6 Optical Technology and Optical Communication
2.6.1 Optical fiber
2.6.2 Optical couplers
2.6.3 Optical multiplexers and demultiplexers
2.6.4 Passive star coupler
2.6.5 Routers
2.6.6 WDM network
2.6.7 Wavelength routed network
2.6.8 Single-hop network
2.6.9 Multi-hop network
2.6.10 Routing and wavelength assignment
2.6.11 Fault tolerant optical network
2.7 Use of Optical Technology in Interconnection Network Design
2.7.1 Advantages of optical interconnects in multi-computers systems
2.7.2 Free space optical interconnects
Chapter 3
Topology of 3D Multi-Mesh
3.1 Description of a 3D Block
3.1.1 Intra-block connection
3.1.2 Categorization of processors
3.1.3 Inter-block connections
3.1.4 Rules for inter-block connections
3.2 Topological Properties of the 3D Multi-Mesh Network
3.2.1 Diameter
3.2.2 Connectivity of Multi-Mesh network
3.2.3 Connectivity of three dimensional Multi-Mesh (3D MM) network
3.3 Message Routing in the 3D Multi-Mesh
3.4 Summation/Average/Minimum/Maximum in the 3D Multi-Mesh
Chapter 4
Optical Implementation of Multi-Mesh Links
4.1 Physical Topology for Optical Communication in a Multi-Mesh
4.1.1 Physical topology using unidirectional links
4.1.2 Physical topology using bidirectional link
4.2 Logical Topology for a Fault-Free Multi-Mesh
4.2.1 Logical topology using unidirectional links
4.2.2 Logical topology using bidirectional links
4.3 Robust Logical Topology for a Multi-Mesh
Chapter 5
Conclusions and Future Directions
5.1 Summary of Work Done
5.2 Suggestions for Future Works






List of Tables
Table 3.1: Diameter of Hypercube, Multi-Mesh and 3D MM
Table 3.2: An example of Diameter of Hypercube, Multi-Mesh and 3D MM
Table 4.1: Primary lightpaths in the fibers of different groups
Table 4.2: The backup lightpaths
List of Figures
Figure 1.1: A multi-processor architecture
Figure 1.2: Tree interconnection network
Figure 2.1: (a) Communication between CP and DP, (b) A node in a network
Figure 2.2: Types of interconnection topologies
Figure 2.3: Linear array
Figure 2.4: Ring
Figure 2.5: Star network
Figure 2.6: Tree interconnection network
Figure 2.7: Fully connected network
Figure 2.8: Hypercube topologies of different dimensions
Figure 2.9: (a) Two-Dimensional mesh (8x8), (b) 8-connected 2D mesh
Figure 2.10: Torus Network
Figure 2.11: A Multi-Mesh network with 3 x 3 meshes
Figure 2.12: 3D mesh
Figure 2.13: (a) Splitter, (b) Combiner and (c) Coupler
Figure 2.14: (a) Multiplexer (b) Demultiplexer
Figure 2.15: A 4 x 4 passive star coupler
Figure 2.16: Router
Figure 2.17: A 4 x 4 passive Router
Figure 2.18: A wavelength routed network
Figure 2.19: Free-Space: (a) Space-Variant (b) Space-Invariant
Figure 2.20: A (4, 4, 3) OMMH network with 128 nodes
Figure 2.21: An example of OTIS-mesh network with 16 nodes
Figure 3.1: A 3D block of order 3
Figure 3.2: 3D MM network of order 3
Figure 3.3: Interconnections along the y-coordinate
Figure 3.4: 3D MM network of order 2
Figure 3.5: Three imaginary planes divide the source block into 8 octants
Figure 3.6: Possible four disjoint paths from source to destination (Case 1)
Figure 3.7: Possible four disjoint paths from source to destination (Case 2)
Figure 4.1: 4 x 4 Blocks of a Multi-Mesh network of order 4
Figure 4.2: A Block of a Multi-Mesh network of order 4
Figure 4.3: Connections between Routers in a Multi-Mesh network of order 4
Figure 4.4: Outputs of Multiplexers are connected to the inputs of router
Figure 4.5: Inputs of the Demultiplexers are connected to the output of router
Figure 4.6: Connection between router R11 and Block B11
Figure 4.7: A MM network based on unidirectional links
Figure 4.8: A MM network based on bidirectional links
Figure 4.9: Inserting the (N+1)th node in a unidirectional ring
Figure 4.10: A faulty link in a multi-mesh of order 8
Figure 4.11: A faulty multi-mesh of order 8




Chapter 1
Introduction
1.1 Multiprocessor Architecture
One of the prime objectives in designing computers has always been to build faster and 
more powerful machines. An obvious way to solve a problem faster is to use a network of 
a large number of processing units or computers, where the different processors solve a 
problem by working simultaneously on different parts of that problem [Ak89]. With the 
advances in technology, the cost and size of processors have been reduced tremendously, 
so that it is now possible to use several thousand to millions of computers to build up a 
multi-processor system [Ak89]. The challenge on the hardware side is to determine how 
these processors should be connected together for optimum performance.
1.2 Interconnection Network
A crucial part of designing a multi-computer architecture is to ensure fast data 
communication, so that data can be shared efficiently between the processors. Data 
communication is accomplished by passing messages between the computers through 
shared memory or an interconnection network [To94], [Ak89]. In this work, we 
consider only interconnection networks.
The architecture of an interconnection network defines exactly which processors are 
connected to each other.
In an interconnection network, each processor has a memory and is connected to 
other processors according to a given topology. Figure 1.1 shows the general structure 
of a multi-processor architecture, where a number of processors and memory modules are 
connected by an interconnection network.
The architecture of the interconnection network plays a crucial role in the performance 
of a multi-processor system, both in terms of the speed of communication and the time 
to run an application.
Figure 1.1: A multi-processor architecture
Figure 1.2 shows an example of a simple multiprocessor architecture called a tree, where a 
number of processors P1, P2, ..., P7 are connected by an interconnection network.
In the last few decades, there has been a lot of effort in developing efficient 
multiprocessor interconnection architectures. Two-dimensional mesh [HwBr83], [St83], 
[Le92] is one of the most popular architectures due to its inherent simplicity and ease of 
algorithm mapping. Many variants of the two-dimensional mesh structure, e.g., torus, 
Illiac IV [HwBr83], [Le92] and multi-dimensional mesh [Le92], have also been proposed in 
the literature to obtain topologies that support more efficient computation in a 
parallel/distributed environment. Efficient mappings of many fundamental and 
frequently used algorithms on variations of the mesh structure have been reported 




Figure 1.2: Tree interconnection network
In this thesis, we investigate the Multi-Mesh architecture [DDS99] - a recent proposal for 
multi-processor interconnection architecture that uses the 2-dimensional mesh as a basic 
building block. With the same number of processors and the same number of links as in 
the case of a torus, the Multi-Mesh (MM) topology has a much smaller diameter 
[DDS99] so that processors can communicate with each other quickly.
1.3 Optical Communication
Optical networks are used for data communication where the signals carrying the data are 
in the form of light waves [ChKr93]. In an optical network, optical fibre is used as the 
transmission medium. Recently, there has been growing interest in developing optical 
networks to support the increasing bandwidth demands of multimedia applications, such 
as video conferencing and World Wide Web browsing [BBRM97]. According to 
Chamberlain and Krchnavek [ChKr93], optical networks have made significant 
contributions to the state of the art in long-distance communication, including high 
reliability, low interference, improved security and very high bandwidth. Traditionally, 
metal-based electrical connections have been used to realize interconnection networks. 
This approach has a number of limitations, which we review in chapter 2. For 
high-speed communication in interconnection networks, optical technology has been 
proposed as a better alternative to copper-based communication [LoSu94a], [LoSu94b].
1.4 Work Reported in this Thesis
In this thesis, we report our investigations on mesh type interconnection networks based 
on the concept of the Multi-Mesh architecture. The main results are as follows:
1) We have proposed a new architecture that uses the 3-dimensional mesh as its 
building block rather than a 2-dimensional mesh as done in the Multi-Mesh 
[DDS99]. We have shown that our architecture has better topological properties 
compared to the Multi-Mesh architecture and that a number of algorithms can be 
efficiently mapped on the 3D MM network.
2) We have explored a number of possible approaches for implementing the 
Multi-Mesh architecture using opto-electronic technologies. There are two novel 
features of our approach:
a. We have shown that WDM wavelength-routed networks may be used to 
realize some of the links.
b. We have shown that single faults may be handled easily without 
increasing the number of optical paths used.
Our new topology, which we call the 3D Multi-Mesh (3D MM), consists of $n^3$ 
three-dimensional meshes (termed 3D blocks), each having $n^3$ processors, interconnected in 
a suitable manner so that the resulting topology is 6-regular with $N = n^6$ processors and a 
diameter of only $3n$. We have shown that the connectivity of this network is 6 and that the 
diameter is only $O(N^{1/6})$, in contrast to $O(N^{1/3})$ on a 3-dimensional torus with the same 
node degree of 6. In this thesis, we have also proposed an optical implementation for the 
inter-block connections of the Multi-Mesh, where we use the advantages of wavelength 
division multiplexing (WDM).
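As a rough numerical illustration of the diameters quoted above, the following Python sketch tabulates the 3D MM diameter of 3n against the diameter of a 3-dimensional torus holding the same number of processors. The function names are hypothetical. The torus shape (k x k x k with k = n^2) and its diameter of 3 * floor(k / 2) are assumptions based on the standard wraparound torus, not results taken from this thesis.

```python
# Illustrative comparison only; all function names are hypothetical.


def mm3d_size(n: int) -> int:
    """Processors in a 3D MM of order n: n^3 blocks, each with n^3 processors."""
    return (n ** 3) * (n ** 3)


def mm3d_diameter(n: int) -> int:
    """Diameter of the 3D MM as stated in the text: 3n."""
    return 3 * n


def torus3d_diameter(n: int) -> int:
    """Diameter of a k x k x k torus with k = n^2, i.e. the same N = n^6 nodes.

    Assumption: with wraparound links, at most floor(k / 2) hops are needed
    along each of the three dimensions.
    """
    k = n * n
    return 3 * (k // 2)


if __name__ == "__main__":
    print("n, N, 3D MM diameter, 3D torus diameter")
    for n in (2, 4, 8, 16):
        print(n, mm3d_size(n), mm3d_diameter(n), torus3d_diameter(n))
```

Under these assumptions the two diameters coincide for n = 2, and the gap widens as n grows, which is consistent with the $O(N^{1/6})$ versus $O(N^{1/3})$ comparison.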
For effective use in parallel processing, it is essential that the delay along each link is 
small and uniform (O(1)). Since the inter-block links used in the 3D MM are relatively 
long, optical links may be used for such inter-block connections to ensure a small, 
uniform link delay. The intra-block links, however, can always be kept electronic, since 
they are short links of constant length. In recent years, optical interconnection 
systems have also drawn much attention among researchers because of their superior power, 
speed and crosstalk properties compared to electronic links when the interconnection 
distance is more than a few millimeters [ChKr93], [LoSu94b].
1.5 Thesis Organization
In chapter 2, we provide a brief overview of the fields related to our research. First, 
we discuss the concepts of multiprocessor architecture, interconnection networks and network 
parameters, and give some examples of static multiprocessor architectures. Then we present 
the optical technology and optical network components that are used in the optical 
implementation of hybrid networks, together with some examples of hybrid networks.
In chapter 3, we describe our proposed 3D MM network topology, study its various 
topological properties and provide a table comparing the topological 
properties of our proposed network with those of other similar networks. In this chapter we 
also discuss the communication algorithm for routing on the 3D MM network and a 
fundamental algorithm on the 3D MM network.
In chapter 4, we propose how the Multi-Mesh architecture may be implemented using 
optical technology, and we describe a number of possible approaches for designing 
optics-based interconnections for the Multi-Mesh.
In chapter 5, we summarize our work, propose possible future directions and conclude. 
In Appendix A we give some more path calculations for different sources and 
destinations, and in Appendix B we provide a glossary of important terms. Finally, we give 
the bibliography.
Chapter 2
Literature Review
In this chapter we provide a brief overview of the fields related to our research. 
First, we discuss the concepts of multiprocessor architecture, interconnection 
networks and network parameters, and give some examples of static multiprocessor 
architectures. Then we present some optical components that are used in the optical 
implementation of hybrid networks, and some examples of hybrid networks that are similar 
to our proposed network.
2.1 Multiprocessor Architecture
A multiprocessor architecture and/or distributed computer consists of a number of 
processing units or computers, also called nodes, that work simultaneously and/or 
independently to solve a given task. A fundamental problem in any multiprocessor 
system is to maintain efficient data communication between the processors of the 
network. In a multi-processor network, each node consists of a data processor (DP) 
and a communication processor (CP) [Zo96], as shown in Figure 2.1(b). The 
data processor executes algorithms, and the communication processor is responsible for 
routing and for the point-to-point communication mechanism shown in Figure 2.1(a). In the 
network, the hardware used to move messages is known as a router and is 
situated in the CP.
Communication processors also contain a buffer that holds the messages to be sent to 
the next node. To achieve full parallelism, nodes must get the data to the right place 
within a reasonable amount of time [Le92]. Data communication and sharing occur when 
nodes send messages to each other. Since a node is not directly connected to all other nodes, 









Figure 2.1: (a) Communication between CP and DP, (b) A node in a network
2.2 Interconnection Network
In a multi-computer architecture, there can be situations where thousands of processing 
units work simultaneously to solve a given problem. These processors may need to share 
data or to send messages to each other. Since the processors are not directly connected to 
all other processors, it is important to ensure that any processor may communicate with 
any other processor simply and quickly. An important requirement of an interconnection 
network is that any pair of processors should be able to communicate with each other as 
fast as possible.
2.3 Network Parameters
Interconnection networks are characterized by a number of parameters. Some of the most 
important parameters are given below:
• Network size: Total number of nodes in a network
• Node degree: The degree of a node is the total number of incoming and outgoing 
links [Be73]. The node degree represents the cost of a node from the 
communication point of view and hence a network topology with fixed and low 
node degree is favorable [SFK97].
• Diameter: The diameter of a graph G is the maximum of the shortest distance 
(hops) between any two nodes [Be73]. For a multi-processor architecture, the 
diameter is an important attribute and is related to information transfer delay. In 
order to achieve faster data communication, diameter should be kept as small as 
possible.
• Connectivity: Connectivity is the minimum number of arcs that have to be 
removed from the network to cut the network into two disconnected networks 
[Be73], [SFK97]. A graph with a connectivity of C can tolerate up to C-1 edge 
faults, since any pair of fault-free nodes can still find a path between them. 
In other words, a network with a higher connectivity is preferable from the 
point of view of fault tolerance.
• Cost: The total number of communication links required by the network defines 
the network cost [SFK97].
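The parameters listed above can be computed directly for a small network. The following Python sketch is a minimal illustration, assuming the network is given as an undirected adjacency-list dictionary in which each bidirectional link counts once towards the degree and the cost. All function names are hypothetical. Connectivity is computed as edge connectivity using a unit-capacity augmenting-path search, which is our choice of technique for the example and not a method taken from the thesis.

```python
from collections import deque


def mesh_2d(rows, cols):
    """Adjacency lists of a rows x cols mesh without wraparound links."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in adj:
        for dr, dc in ((1, 0), (0, 1)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj


def bfs_distances(adj, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def network_size(adj):
    return len(adj)


def node_degrees(adj):
    return {u: len(nbrs) for u, nbrs in adj.items()}


def diameter(adj):
    """Maximum over all node pairs of the shortest hop distance."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


def cost(adj):
    """Number of links; each undirected link appears in two adjacency lists."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def edge_disjoint_paths(adj, s, t):
    """Number of edge-disjoint s-t paths, found by repeated BFS augmentation."""
    cap = {(u, v): 1 for u in adj for v in adj[u]}
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:  # push one unit back along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


def connectivity(adj):
    """Minimum number of links whose removal disconnects the network."""
    nodes = list(adj)
    return min(edge_disjoint_paths(adj, nodes[0], t) for t in nodes[1:])


if __name__ == "__main__":
    m = mesh_2d(3, 3)
    print("size", network_size(m), "max degree", max(node_degrees(m).values()))
    print("diameter", diameter(m), "cost", cost(m), "connectivity", connectivity(m))
```

The example uses a 3 x 3 mesh only because it is small enough to check by hand; the same functions apply to any connected network expressed in this adjacency-list form.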
2.4 Types of Interconnection Network
There are two basic types of interconnection network, static and dynamic.
• Static interconnection network
In case of a static interconnection network, all connections among the processors are 
fixed meaning that the processors are wired directly [SFK97]. Static interconnection 
networks are better where the problems are uniform and the communication pattern is 
predictable [SFK97].
• Dynamic interconnection network
In a dynamic interconnection network, the connections between the processors can be 
changed, since the processors are connected by switches instead of direct wires. Dynamic 
interconnection is expensive.
Here we are interested only in static interconnection networks.
Static interconnection topologies
The way that nodes are interconnected is called the network topology [To94]. Static 
interconnection topologies can be classified according to their dimensions [SFK97]:
• One-dimensional topologies
• Two-dimensional topologies
• Three-dimensional topologies
• Multidimensional topologies e.g. Hypercube, De Bruijn, Kautz etc.













Figure 2.2: Types of interconnection topologies
2.5 Examples of Some Simple Multiprocessor Architecture
2.5.1 Linear array
A linear array is the simplest and cheapest way to connect the processors of a parallel 
computer. Each processor has a direct connection to two other processors, except the two 
boundary processors, which have only one. Figure 2.3 shows an example of such a network. 
This topology has the worst diameter, n-1, and an arc connectivity of only 1 
[SFK97].
Figure 2.3: Linear array
2.5.2 Ring
If the boundary nodes of a linear array are connected to each other, the resulting network 
topology is called a ring. Figure 2.4 shows a ring network formed from a linear array. 
In a ring network all the nodes have two connections. It improves the connectivity and 




2.5.3 Star
In a star topology there is one central node, to which all other nodes are connected, as 
shown in Figure 2.5. The central node has n - 1 connections, whereas every other node has 
only one connection.
Star network is a simple topology but not suitable for large configuration, as the number 




Figure 2.5: Star network
2.5.4 Tree
Nodes are interconnected in a tree structure as shown in Figure 2.6. It has a smaller 
diameter (log n), and the degree of the nodes is 1 for leaf nodes, 2 for the root node and 




Figure 2.6: Tree interconnection network 
2.5.5 Fully connected network
In this network topology, all the nodes are directly connected to each other. This topology 
is ideal from the point of view of network diameter, which is 1, but the node degree is n-1 
for every node. The cost of this network is therefore extremely high, and it does not scale 
to massively parallel computers.





Figure 2.7: Fully connected network
2.5.6 Hypercube
One of the most popular topologies is the hypercube topology. The hypercube is one 
example of a multidimensional mesh of processors. A d-dimensional binary hypercube 
has n = 2^d nodes. The hypercube topology is attractive for its small diameter (log n) and 
its arc connectivity (log n). In a binary hypercube each node has an address, a number 
between 0 and 2^d - 1. Two processors whose binary addresses differ in exactly one bit are 
connected together. This property greatly facilitates the routing of messages through the 
network [LoSu94a]. The major disadvantage of the hypercube is that its node degree is log n, 
and hence the node degree grows as n increases [SFK97].
In a d-dimensional hypercube, each node is connected to d nodes. Figure 2.8 shows an 
example of a hypercube network. The hypercube and other related networks suffer from a 
lack of scalability [LoSu94a].
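The addressing rule can be made concrete with a short sketch. The code below assumes nodes are labelled with the integers 0 to 2^d - 1; the function names are hypothetical. The routing function corrects the differing address bits one at a time from the lowest bit upwards, which is one simple way to exploit the one-bit-difference property and is not necessarily the scheme used in [LoSu94a].

```python
def hypercube_neighbors(node, d):
    """Addresses that differ from node in exactly one of the d address bits."""
    return [node ^ (1 << bit) for bit in range(d)]


def hamming_distance(a, b):
    """Number of bit positions in which two addresses differ."""
    return bin(a ^ b).count("1")


def route(src, dst, d):
    """Path from src to dst obtained by flipping each differing bit in turn.

    This bit-correction order is an assumption made for illustration.
    """
    path = [src]
    current = src
    for bit in range(d):
        if (current ^ dst) & (1 << bit):
            current ^= 1 << bit
            path.append(current)
    return path


if __name__ == "__main__":
    d = 3
    print(hypercube_neighbors(0b000, d))
    print(route(0b000, 0b101, d), hamming_distance(0b000, 0b101))
```

The number of hops on such a path equals the number of bits in which the two addresses differ, so no path produced this way is longer than d = log n hops.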





Figure 2.8: Hypercube topologies of different dimensions
2.5.7 Mesh network
The term mesh has been used by various investigators in different ways. Following 
Ullman, we will use the term mesh to denote a square grid of processors [Ul84], so that 
the mesh network is a two-dimensional arrangement of nodes in a Manhattan Street 
architecture. Among the static interconnection networks, the two-dimensional mesh is 
one of the most popular architectures, as it has a very regular and simple structure 
[HwBr83], [St83], [Le92]. Due to its constant node degree, the mesh network is highly 
scalable [LoSu94a]. Researchers have proposed several variations of the mesh 
architecture using techniques such as wrap-around and diagonal interconnections among the 
nodes. Some popular networks based on the mesh architecture are the two-dimensional 
(2D) mesh, the two-dimensional wrap-around mesh (also called a torus) and the 
three-dimensional (3D) mesh.
2-D Mesh
The most popular mesh is the so-called two-dimensional (2D) mesh where nodes are 
arranged in a grid pattern as shown in Figure 2.9 [Ra92]. Except for the boundary 
processors every other processor is connected to its neighbours to the left, right, above 
and below through bi-directional links [Ra92]. Mesh networks represent a good 
compromise among the contradictory requirements of static network parameters 
[SFK97]. It has, relatively speaking, a short diameter and arc connectivity [SFK97]. In a 
2-dimensional mesh, it is convenient to identify nodes by the x-y coordinate values of 
their positions. Meshes are easy to implement and extend. Variations of the mesh 
topology are possible, depending on whether there are any wrap-around or diagonal 
interconnections among the nodes [SFK97].
Figure 2.9(a) and 2.9(b) show two different types of two-dimensional mesh network.
Figure 2.9: (a) Two-Dimensional mesh (8x8), (b) 8-connected 2D mesh
Torus
A torus is defined as a mesh with wrap-around links as shown in Figure 2.10. We refer to 
a processor in row i and column j as P(i, j), 0 ≤ i, j < n. Processor P(i, 0) is connected 
to P(i, n-1) and P(0, j) is connected to P(n-1, j) [SFK97]. The 2D mesh and the torus 
network topologies are attractive because of their simplicity, regularity, scalability and 
efficient use of space in their VLSI layouts [Le92], [SFK97], [Ra92].
Figure 2.10: Torus Network
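The difference between the plain 2D mesh and the torus can be sketched as follows, assuming rows and columns are numbered 0 to n-1 as in the torus definition above. The function names are hypothetical.

```python
def mesh_neighbors(i, j, n):
    """Neighbours of P(i, j) in an n x n mesh without wrap-around links."""
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < n and 0 <= c < n]


def torus_neighbors(i, j, n):
    """Neighbours of P(i, j) in an n x n torus.

    The modulo operation supplies the wrap-around links, so that
    P(i, 0) is linked to P(i, n-1) and P(0, j) to P(n-1, j).
    """
    return [((i - 1) % n, j), ((i + 1) % n, j),
            (i, (j - 1) % n), (i, (j + 1) % n)]


if __name__ == "__main__":
    n = 4
    print(mesh_neighbors(0, 0, n))   # a corner of the mesh has two neighbours
    print(torus_neighbors(0, 0, n))  # the same node in the torus has four
```

For n ≥ 3 every node of the torus has four distinct neighbours, which is why the torus is a regular graph while the plain mesh is not.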
Multi-Mesh
The Multi-Mesh (MM) interconnection network topology was proposed by D. Das, M. 
De and B. P. Sinha [DDS99]. The MM has been proposed as an efficient topology for 
optical networks [Le92] and peer-to-peer networks. The MM topology of order n uses 
multiple n x n (two-dimensional) meshes as the basic building blocks. n^2 such meshes are 
in turn arranged in the form of an n x n matrix, and each mesh is termed a block. A 
processor inside a block can be identified by specifying its x and y coordinates within 
the block. Similarly, a block can also be identified by its x and y coordinates. In our 
notation, B(α, β) identifies a block, where α, β (1 ≤ α, β ≤ n) are the x and y coordinates 
identifying the block, and P(α, β, x, y) identifies a processor, where the first two 
coordinates represent the position of its block and the last two coordinates represent the 
location of the processor within the block. If two processors are within the same block and 
are connected by an edge, we will call the two processors neighbors.
Based on the neighborhood, processors within a block are categorized into the following 
three classes:
1) The processors with two neighbors (the processors on the corners of the 
block) have
> x = 1 or x = n,
> y = 1 or y = n.
We will call such processors corner processors. It is obvious that in a 2D mesh 
there are exactly four such processors.
2) The processors with three neighbors - the processors on the sides of the 
block (but not on corners) are characterized by x and y values such that exactly 
one of these coordinates is 1 or n. Such processors have
> (1 < x < n, y = 1 or y = n) or
> (x = 1 or x = n, 1 < y < n).
We will call such processors boundary processors. There are 4(n-2) such 
processors.
3) The processors with four neighbors - the processors each having four 
neighbors are called internal processors. There are exactly (n-2)^2 such processors in 
a 2D block.
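The three classes can be checked by enumeration. The sketch below assumes n ≥ 2 and coordinates running from 1 to n; the function names are hypothetical. Counting in this way gives 4 corner processors, 4(n-2) boundary processors and n^2 - 4 - 4(n-2) = (n-2)^2 internal processors.

```python
from collections import Counter


def classify_2d(x, y, n):
    """Class of processor (x, y) inside an n x n block (coordinates 1..n, n >= 2)."""
    # Number of coordinates lying on the border of the block.
    on_border = sum(c in (1, n) for c in (x, y))
    return ("internal", "boundary", "corner")[on_border]


def count_classes_2d(n):
    """Number of processors of each class in one n x n block."""
    return Counter(classify_2d(x, y, n)
                   for x in range(1, n + 1)
                   for y in range(1, n + 1))


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, dict(count_classes_2d(n)))
```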
D. Das et al. [DDS99] have defined interconnection rules so that the proposed network 
topology of order n contains n^4 processors by interconnecting n^2 two-dimensional 
meshes, each with n^2 processors.
We show an MM network of order 3 in Figure 2.11 [DDS99]. For clarity, we have not shown 
all the interconnections.
Figure 2.11: A Multi-Mesh network with 3 x 3 meshes
The Multi-Mesh topology corresponds to a regular graph¹. With the same number of 
processors and the same number of links as in the case of a torus, the Multi-Mesh 
topology [DDS99] has the advantage of offering a much lower diameter. The authors 
have shown that the time complexities of different basic operations mapped on it are 
considerably less than those for many existing mesh-type topologies [DGS97], [DDS99]. 
Because of this property, it has also been proposed as an efficient topology for optical 
networks [SFK97].
¹ In a graph, if every node has the same node degree, the graph is called a regular graph [Be73].
3D Mesh
The three-dimensional mesh, or 3D mesh, can be thought of as n layers of 2D meshes 
arranged in the third dimension, or z direction [LiFi01]. As a result, there are n x n^2 = 
n^3 processors in a 3D mesh. Figure 2.12 shows an example of a 3D mesh. Compared to the 
2D mesh, the 3D mesh improves the diameter (from N^(1/2) to N^(1/3), for N processors) 
and the arc connectivity (from 2 to 3).
Figure 2.12: 3D mesh
In a 3D mesh, n^3 processors are arranged along three orthogonal dimensions, say x, y 
and z, so that a processor at coordinates (x, y, z) (which we will denote by P(x, y, z)) is 
connected to six other neighbouring processors at P(x+1, y, z), P(x-1, y, z), P(x, y+1, z), 
P(x, y-1, z), P(x, y, z+1) and P(x, y, z-1), when they exist, for all integer values of x, y 
and z, 1 ≤ x, y, z ≤ n. Some processors have 3, 4 or 5 neighbours, depending on their 
position in the 3D mesh, while the remaining processors have all 6 neighbours.
2.6 Optical Technology and Optical Communication
The primary bottleneck in today’s metal-based interconnection networks is the very 
limited bandwidth of long copper lines, which results in limited communication speed 
[LoSu94b]. Optical interconnects offer high-speed computers key advantages over metal 
interconnects, which include: (1) high spatial and temporal bandwidths, (2) high-speed 
transmission, (3) low crosstalk independent of data rates, and (4) high interconnect 
densities [LoSu94b].
In this section, we will describe the following topics on optical devices and optical 





> Wavelength routed Network
> Use of optical technology in interconnection network design
Due to lack of space we will not review details of optical technology such as optical 
amplifiers, receivers and transmitters, filters and gratings [So03], [Mu97], [StBa99], 
[BBRM97].
2.6.1 Optical fiber
Optical fiber is the medium of data transmission in an optical network. Optical fiber is a 
thin filament of glass, which acts as a wave-guide [BBRM97]. Fiber is attractive as a 
communication medium due to the following advantages [So03], [Mu97]:
> High speed,
> Huge bandwidth,
> High security,
> Low bit error rate,
> No electromagnetic interference,
> Low power requirement and
> Low signal attenuation.
2.6.2 Optical couplers
Coupler is a general term that covers all devices that combine light into or split 
light out of a fiber [Mu97]. A splitter is a coupler that divides the optical 
signal on one fiber between two or more fibers. Combiners are the reverse of splitters, 
and when turned around, a combiner can be used as a splitter [Mu97]. Figure 2.13 [Mu97] 
shows an example of these devices.
Figure 2.13: (a) Splitter, (b) Combiner and (c) Coupler
2.6.3 Optical multiplexers and demultiplexers
Optical multiplexers are used to combine several independent signals at different 
wavelengths into one fiber. A demultiplexer works in exactly the opposite way, 
separating the signals at different wavelengths. Figure 2.14 shows an example of a 
multiplexer and a demultiplexer.
Figure 2.14: (a) Multiplexer (b) Demultiplexer
2.6.4 Passive star coupler
One type of optical network that uses multi-wavelength fiber links is built around a 
passive star coupler. The star coupler is a “broadcast” device, so that an optical signal 
transmitted on a given wavelength from a node in the network will be communicated to all 
other nodes in the network. This means that the power of the transmitted signal will be 
equally divided among all the output ports connected to the coupler [Mu97]. Figure 2.15 
[Mu97] shows an example of a passive star coupler where a signal on wavelength λ1 from 
input fiber 1 and another on wavelength λ4 from input fiber 4 are broadcast to all output 
ports. A problem with the star coupler is that a collision may occur when two or 
more signals from the input fibers are simultaneously launched into the star on the same 
wavelength [Mu97].
Figure 2.15: A 4 x 4 passive star coupler
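The two consequences of broadcasting described above, the division of power and the possibility of collisions, can be sketched numerically. The code assumes an ideal coupler with no excess loss, and it represents each transmission as a hypothetical (input fiber, wavelength index) pair; none of these names come from [Mu97].

```python
import math
from collections import Counter


def output_power(input_power_mw, num_ports):
    """Power reaching each output port of an ideal star coupler (no excess loss)."""
    return input_power_mw / num_ports


def splitting_loss_db(num_ports):
    """Splitting loss in decibels caused by dividing the power equally."""
    return 10 * math.log10(num_ports)


def colliding_wavelengths(transmissions):
    """Wavelength indices launched simultaneously by more than one input fiber.

    transmissions is a list of (input_fiber, wavelength_index) pairs.
    """
    counts = Counter(wavelength for _fiber, wavelength in transmissions)
    return sorted(w for w, c in counts.items() if c > 1)


if __name__ == "__main__":
    print(output_power(1.0, 4), splitting_loss_db(4))
    print(colliding_wavelengths([(1, 1), (4, 4)]))          # situation of Figure 2.15
    print(colliding_wavelengths([(1, 1), (2, 1), (4, 4)]))  # fibers 1 and 2 collide
```

Since the splitting loss grows with the number of ports, the sketch also gives a feel for why a broadcast network of this kind becomes demanding in power as the number of nodes increases.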
2.6.5 Routers
In an optical network, a router is a device that is connected to a number of fibers, some
carrying incoming optical signals to the router and the others carrying outgoing optical
signals. A router determines how the incoming signals will be directed to outgoing fibers.
Figure 2.16 shows a router with 3 fibers carrying incoming signals and 3 fibers carrying
outgoing signals.















The control settings on the router determine the actual routing. For example, in Figure 
2.16, the signal at wavelength λi on fiber 1 may need to be directed to fiber 5. Figure 
2.17 [Mu97] shows an example of a passive router, with wavelengths λ1, λ2, λ3 and λ4 
incident on input fibers 1, 2, 3 and 4 respectively. By using this device we can 
reuse the wavelengths.
Figure 2.17: A 4 x 4 passive Router
Figure 2.17 shows how a router can be built from a number of multiplexers and demultiplexers.
2.6.6 WDM network
The huge bandwidth of optical fiber allows a tremendous data transmission rate. It is 
technologically impossible to exploit all of that bandwidth using a single 
high-capacity channel [StBa99]. Because this bandwidth is enormously greater than the speed 
of electronic communication, wavelength-division multiplexing (WDM) is a promising 
approach that can be used to exploit the huge bandwidth of optical fiber [Mu97], 
[BBRM97], [StBa99]. In WDM, the optical transmission spectrum is divided into a 
number of non-overlapping wavelength (or frequency) bands, with each wavelength 
supporting a single communication channel operating at peak electronic speed 
[BBRM97].
2.6.7 Wavelength routed network
A WDM network using a passive coupler is not viable when the network contains a large 
number of nodes, due to the power requirements of such a broadcasting network [Mu97]. 
A wavelength-routed WDM network is a network where each end-node (the source or 
destination of data) is connected to a router and each router is connected to other routers. 
Figure 2.18 shows a small wavelength-routed WDM network where a square represents 
an end-node and an oval represents a router. The advantage of such a network is that the 
data is not broadcast to all the end-nodes. The settings of the routers determine which 
end-nodes will be connected by a lightpath (an all-optical path through which the 
information flows in a wavelength-routed optical network).
Figure 2.18: A wavelength-routed network
2.6.8 Single-hop network
A single-hop network is a network in which a packet is sent directly (in one hop) from its 
source processor to the destination processor, without being routed through any 
intermediate processor [Mu97].
2.6.9 Multi-hop network
A multi-hop network is a network in which a packet may travel through zero or more 
intermediate processors before it reaches its final destination [Mu97].
2.6.10 Routing and wavelength assignment
Given a network topology and a set of lightpaths (to be determined), routing the 
lightpaths in the network and assigning wavelengths to these lightpaths is referred to as 
the routing and wavelength assignment (RWA) problem [Mu97].
2.6.11 Fault tolerant optical network
In a WDM optical network, each physical fiber link is able to support many lightpaths. 
As a network grows in size and complexity, the number of lightpaths increases, so the 
failure of a fiber link may cause significant data loss. In order to have a 
fault-tolerant WDM network, it is very important to handle these types of fiber faults. 
Since single fiber failures are the major form of failure in optical networks, in this 
thesis our focus is on single faults.
2.7 Use of Optical Technology in Interconnection Network Design
In realizing VLSI multiprocessor systems, an obvious approach for creating the links 
between processors is to use VLSI fabrication technology, for example the 
metal1 or metal2 layer; this has been done for some time [Ul84]. It is well known that 
implementing copper-based connections to realize complex interconnections is problematic, 
since long copper wires are needed for such complex topologies [Ul84]. A problem of 
metal interconnect technology is that long copper wires accentuate problems like skin 
effect, crosstalk, interference, wave reflections, electrical noise due to current changes, 
and dielectric imperfections [LoSu94b]. These problems can cause severe pulse 
distortion and attenuation, clock skew, and random propagation delays [StCo91].
According to Louri and Sung [LoSu94b], multiprocessor systems based on metal 
interconnects experience the technological limitations of communication bandwidth 
constraints, low interconnect density, long network latencies, and high power 
requirements. Metal-based communication between subsystems and chips has become the 
limiting factor in high-speed computing; maturing optics-based technologies offer 
advantages that may relieve this bottleneck [LoSu94b].
As optical technology has evolved in the last decade, an obvious approach to this 
bottleneck is to use optical technology.
2.7.1 Advantages of optical interconnects in multi-computers systems
A summary of the advantages of using optical interconnections is given below [LoSu94b], 
[ChKr93], [So03]:
> Optics allows inherent parallelism
> Optical communication has higher bandwidth
> Optical signals propagate in parallel channels without interference
> There is less signal crosstalk in optical communication
> Optical communication is inherently immune to electromagnetic 
interference and ground loops
> There is lower signal and clock skew and lower power dissipation in 
optical communication
> Propagation speed for optical signals is, for short distances, essentially 
independent of communication distance
> There is potential for reconfigurable interconnects
2.7.2 Free space optical interconnects
Free-space optical interconnects exploit air space for optical signal propagation 
[LoSu94b]. In order to provide communication channels for free-space interconnection, 
lenses and holograms are used as optical elements.
Free-space interconnects are classified into two categories [LoSu94b]:
1. Space-variant
2. Space-invariant
A totally space-invariant network has a regular structure where each node has the same 
connection pattern, whereas in a totally space-variant network there is no regular pattern 
(arbitrary interconnection) between the nodes. Figure 2.19 [LoSu94b] shows both kinds of 
interconnection.
Figure 2.19: Free-Space: (a) Space-Variant (b) Space-Invariant
2.8 Interconnection Networks based on Opto-Electronic Technology
In this section we present two recently proposed high-throughput hybrid optical 
multiprocessor architectures.
2.8.1 OMMH
The optical multi-mesh hypercube (OMMH) network topology for multiprocessor networks was 
proposed by Louri and Sung [LoSu94a]. The OMMH network uses meshes and 
hypercubes as its basic building blocks. This network topology combines the advantages 
of meshes (constant node degree and scalability) and hypercubes (small diameter, high 
connectivity, symmetry, simple control, routing and fault tolerance) and avoids the 
lack of scalability of hypercubes and the large diameter of meshes.
This network can maintain a constant node degree regardless of the increase in the network 
size [LoSu94a]. The authors claim that the flexibility of the OMMH network makes it 
well suited for optical implementations. The OMMH network uses a three-dimensional 
optical design based on free-space optics. The analysis and simulation results in 
[LoSu94a] show that the OMMH network is scalable, efficient in communication and 
highly fault-tolerant. Optical implementation of the network is possible with existing 
hardware. Figure 2.20 [LoSu94a] shows an example of an OMMH network.
Figure 2.20: A (4, 4, 3) OMMH network with 128 nodes
2.8.2 OTIS-Mesh
The Optical Transpose Interconnect System (OTIS) was proposed by Marsden et al. 
[MMHE93]. The OTIS architecture is an example of a hybrid architecture in which the 
processors are partitioned into groups, where processors within a group are connected by 
electronic links and processors in different groups are interconnected by 
optical links.
OTIS-Mesh is a type of OTIS computer on which a number of well-known algorithms can 
be efficiently mapped [Os00], [SaWa97], [WaSa00], 
[ZMPE00]. OTIS-Mesh is also a hybrid architecture that uses the same idea as the OTIS 
computer. Figure 2.21 [WaSa01] shows an OTIS-Mesh containing 16 processors, where 
small square boxes denote processors and large square boxes represent groups of 
processors. The groups are arranged in a two-dimensional array.
Figure 2.21: An example of OTIS-mesh network with 16 nodes
Chapter 3 
Topology of 3D Multi-Mesh
In this chapter we introduce a new network topology, called the 3D Multi-Mesh (3D 
MM), for multiprocessor architectures. As discussed in Chapter 2, the 2D Multi-Mesh 
architecture [DDS99] uses an n x n mesh of processors as its basic building block, and 
each processor in an n x n mesh may be identified by specifying an x-coordinate value (say 
x) and a y-coordinate value (say y). In a block, the processors having x = 1, x = n, 
y = 1 or y = n have fewer than 4 connections to other processors within the same 
block. We have seen that these processors are connected to processors in other blocks of 
n x n processors in a particular pattern, resulting in a network with attractive topological 
properties. It is well known that an n x n x n mesh has better diameter and connectivity 
than an n x n mesh [LiFi01], [SFK97]. It is therefore reasonable to extend the idea 
of interconnecting blocks of 2-dimensional (i.e., n x n) meshes of processors to the idea 
of interconnecting 3-dimensional (i.e., n x n x n) meshes of processors. This is the topic 
that we will explore in this chapter. Our proposed network consists of n^3 
three-dimensional meshes, each having n^3 processors, interconnected in a suitable manner so 
that the resulting topology is 6-regular with n^6 processors. We will call such a network a 
3D Multi-Mesh (3D MM) of order n. In this chapter we introduce the 3D MM topology, 
analyze its architectural properties and compare it to other network architectures.
3.1 Description of a 3D Block
The basic building block of the proposed 3D MM of order n is the n x n x n mesh, which 
we will call a 3D block; it consists of n^3 processor nodes. We may visualize a 3D block of 
n^3 processors as consisting of n planes of n x n two-dimensional meshes of processors. 
We show an example of a 3D block of order 3 (n = 3) in Figure 3.1. The 3D MM of 
order n consists of n^3 such 3D blocks arranged in a three-dimensional n x n x n array, so 
that there are altogether N = n^6 processors in a 3D MM network. We show a 3D MM 
network of order 3 in Figure 3.2. An n x n x n 3D block has (n-2) x (n-2) x (n-2) 
processors that each have 6 links to other processors inside the same block. 
Each of the remaining processors lies on one or more of the six faces¹ of the block and 
has 3, 4 or 5 links, depending on the position of the processor in the block. Extending the 
idea used in the Multi-Mesh [DDS99] architecture, we connect the processors on the six 
faces of a 3D block to the processors on the faces of other 3D blocks according to the 
inter-block 






Figure 3.1: A 3D block of order 3
¹ A face of a cube represents the first or the last plane of the 3D mesh. A processor P(x, y, z) on a face of 
the cube has the value 1 or n for at least one of the coordinates x, y or z.
Figure 3.2: 3D MM network of order 3 
3.1.1 Intra-block connection
We arrange the three dimensional mesh (forming the basic building block of the proposed 
3D Multi-Mesh network) consisting of n3 processors along the three orthogonal 
dimensions, say x, y and z, so that a processor within a 3D block is uniquely identified by 
three coordinates x, y, z. A processor identified by the coordinates x, y, z is connected to 
six other neighboring processors (processors within the block that are connected by 
links), when they exist. These neighboring processors are identified by -
> (x+1, y, z),
> (x-1, y, z),
> (x, y+1, z),
> (x, y-1, z),
> (x, y, z+1) and
> (x, y, z-1)
Figure 3.1 shows how neighboring processors are connected by the intra-block links.
3.1.2 Categorization of processors
It is important to note that all 6 neighboring processors may not always exist. We will 
term a processor that has all six of its neighbors in all three dimensions (i.e., 1 < x, y, z 
< n) an internal processor, since all its connections are to other processors within the 
same block. However, the processors on the six faces of the block, identified by x = 1, x 
= n, y = 1, y = n, z = 1 and z = n, will have fewer than six neighbors each. We categorize 
these processors as follows:
1) The processors with three neighbors - the processors on the corners of the 
block have
> x = 1 or x = n,
> y = 1 or y = n,
> z = 1 or z = n.
We will call such processors corner processors. It is obvious that we will have 
exactly eight such processors.
2) The processors with four neighbors - the processors on the sides of the block 
(but not on corners) are characterized by x, y and z values such that exactly two of 
these coordinates are 1 or n. Such processors have
> (x = 1 or x = n, y = 1 or y = n, 1 < z < n) or
> (x = 1 or x = n, 1 < y < n, z = 1 or z = n) or
> (1 < x < n, y = 1 or y = n, z = 1 or z = n).
We will call such processors boundary edge processors. We will have exactly 
12(n-2) such processors, since a cube has 12 edges and each edge contains n-2 
processors that are not corners.
3) The processors with five neighbors - the processors on the faces of the block 
(but not on sides or corners) are characterized by x, y and z values such that 
exactly one of these coordinates is either 1 or n. Such processors have
> (x = 1 or x = n, 1 < y < n, 1 < z < n) or
> (1 < x < n, y = 1 or y = n, 1 < z < n) or
> (1 < x < n, 1 < y < n, z = 1 or z = n).
We will call such processors face-centered processors. We will have exactly 
6(n-2)^2 such processors in a 3D block, since each of the six faces contains (n-2)^2 
processors that are not on a side or a corner.
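The categorization can be verified by enumerating all processors of a small 3D block. The sketch below assumes n ≥ 2 and coordinates running from 1 to n; the function names are hypothetical. A brute-force count gives 8 corner processors, 12(n-2) boundary edge processors, 6(n-2)^2 face-centered processors and (n-2)^3 internal processors, which together account for all n^3 processors of the block.

```python
from collections import Counter
from itertools import product


def intra_block_neighbors(x, y, z, n):
    """Neighbours of (x, y, z) inside one n x n x n block (coordinates 1..n)."""
    candidates = [(x + 1, y, z), (x - 1, y, z), (x, y + 1, z),
                  (x, y - 1, z), (x, y, z + 1), (x, y, z - 1)]
    return [p for p in candidates if all(1 <= c <= n for c in p)]


def classify_3d(x, y, z, n):
    """Class of processor (x, y, z) according to how many coordinates are 1 or n."""
    on_face = sum(c in (1, n) for c in (x, y, z))
    return ("internal", "face-centered", "boundary edge", "corner")[on_face]


def count_classes_3d(n):
    """Number of processors of each class in one 3D block."""
    return Counter(classify_3d(x, y, z, n)
                   for x, y, z in product(range(1, n + 1), repeat=3))


if __name__ == "__main__":
    for n in (3, 4):
        print(n, dict(count_classes_3d(n)))
```

Each coordinate equal to 1 or n removes exactly one intra-block neighbour, so a processor with k such coordinates has 6 - k neighbours inside the block. These are the links that the inter-block connections have to supply.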
3.1.3 Inter-block connections
Our 3D Multi-Mesh is an interconnection of n^3 3D blocks arranged along the three 
orthogonal dimensions as shown in Figure 3.2. We designate the block coordinate values 
along the three orthogonal dimensions with the symbols α, β and γ respectively (to make 
them distinct from x, y and z). Thus, we now have a total of n^6 processors, where each 
processor can be uniquely identified by its six coordinate values α, β, γ, x, y, z, which we 
will denote by P(α, β, γ, x, y, z). We will characterize any particular 3D block by a 
given set of values for the α, β and γ coordinates, and we will denote a block by B(α, β, γ). 
We connect all the processors on the six faces of each 3D block to the processors on the 
faces of other 3D blocks by one or more inter-block links, so that each processor 
eventually has exactly six links to other processors (either in the same 3D block or in 
other 3D block(s)).
We describe below the rules for the inter-block links.
3.1.4 Rules for inter-block connections
Inter-block Rule 1: (Links from the y = 1 and y = n planes)
The processor P(α, β, γ, x, 1, z) is connected to the processor P(α, x, γ, β, n, z) by a 
symmetric link for all α, γ, z, where 1 ≤ α, β, γ, x, z ≤ n. We denote this by 
∀α, γ, z: P(α, β, γ, x, 1, z) ↔ P(α, x, γ, β, n, z)
Such links allow us to interchange only the values of β and x, and we will refer to 
these links using the notation ∀α, γ, z (β ↔ x). We note that the value of z is not 
changed for the processors connected by these links.
Inter-block Rule 2: (Links from the x = 1 and x = n planes)
The processor P(α, β, γ, 1, y, z) is connected to the processor P(z, β, γ, n, y, α) by a 
symmetric link for all β, γ, y, where 1 ≤ α, β, γ, y, z ≤ n. We denote this by 
∀β, γ, y: P(α, β, γ, 1, y, z) ↔ P(z, β, γ, n, y, α)
Such links allow us to interchange only the α and z values, and we will refer to these 
links using the notation ∀β, γ, y (α ↔ z). We note that the value of y is not changed 
for the processors connected by these links.
Inter-block Rule 3: (Links from the z = 1 and z = n planes)
The processor P(α, β, γ, x, y, 1) is connected to the processor P(α, β, y, x, γ, n) by a 
symmetric link for all α, β, x, where 1 ≤ α, β, γ, x, y ≤ n. We denote this by 
∀α, β, x: P(α, β, γ, x, y, 1) ↔ P(α, β, y, x, γ, n)
Such links allow us to interchange only the γ and y values, and we will refer to these 
links using the notation ∀α, β, x (γ ↔ y). We note that the value of x is not changed 
for the processors connected by these links.
From the inter-block connection rules, the following properties follow immediately. 
Property 1:
Starting from a given 3D block, identified by coordinates (α1, β1, γ1), we can always 
find a suitable processor on one of its faces, from which we can reach, using only one 
inter-block link, any other 3D block, identified by (α2, β2, γ2), provided exactly 2 of 
the coordinates of (α1, β1, γ1) are identical to the corresponding coordinates of (α2, 
β2, γ2).
Property 2:
The 3D Multi-Mesh corresponds to a regular graph where each processor is 
connected to exactly 6 other processors.
The connections are somewhat complicated; to simplify the situation, Figure 3.3 shows only the blocks having α = 1 and γ = 1, and only the inter-block connections along the y-axis for the processors having z = 1.
Figure 3.3 Interconnections along the y-coordinate
In Figure 3.4 we show an example of a 3D MM network of order 2 (n = 2), where we show all the inter-block connections for the processors in the block having α = 1, β = 1, γ = 1. All other links are not shown.
Figure 3.4: 3D MM network of order 2
3.2 Topological Properties of the 3D Multi-Mesh Network
3.2.1 Diameter
In a graph G, the diameter is the maximum possible value of the length of the shortest 
path between any two nodes of G [Be73]. This is a very important metric for any 
interconnection network. In this section we show that the diameter of the 3D Multi-Mesh 
of order n is 3n. To prove this we have to show that, in a 3D Multi-Mesh of order n, it is
always possible to define a path from any source to any destination having a length of 3n 
or less.
We consider the source processor S = P(α1, β1, γ1, x1, y1, z1) and the destination processor D = P(α2, β2, γ2, x2, y2, z2), 1 ≤ α1, β1, γ1, x1, y1, z1, α2, β2, γ2, x2, y2, z2 ≤ n, so that the 3D block corresponding to S is B(α1, β1, γ1) and that corresponding to D is B(α2, β2, γ2). There are three situations to consider:
Situation 1: In this case, exactly two of the coordinates of the source block B(α1, β1, γ1) have the same value as those of the corresponding coordinates in the destination block B(α2, β2, γ2). In this case, it may be readily verified, from the interconnection rules given above, that there exists a direct link between the source block and the destination block.
Situation 2: In this case, exactly one of the coordinates of the source block B(α1, β1, γ1) has the same value as that of the corresponding coordinate in the destination block B(α2, β2, γ2). In this case, it may be readily verified, from the interconnection rules given above, that there exists an intermediate block B(α3, β3, γ3), such that there is a direct link between the source block B(α1, β1, γ1) and the intermediate block B(α3, β3, γ3) and a direct link between the intermediate block B(α3, β3, γ3) and the destination block. There are 3 possible choices for the values of (α3, β3, γ3): (α1, β1, γ2), (α1, β2, γ1), (α2, β1, γ1).
Situation 3: In this case, none of the coordinates of the source block B(α1, β1, γ1) have the same value as that of the corresponding coordinate in the destination block B(α2, β2, γ2). In other words, α1 ≠ α2, β1 ≠ β2 and γ1 ≠ γ2. In this case, there exist two intermediate blocks B(α3, β3, γ3) and B(α4, β4, γ4) such that there is a direct link between
- the source block B(α1, β1, γ1) and the intermediate block B(α3, β3, γ3),
- the block B(α3, β3, γ3) and the block B(α4, β4, γ4) and
- the block B(α4, β4, γ4) and the destination block B(α2, β2, γ2).
There are a number of ways in which we may choose the intermediate blocks B(α3, β3, γ3) and B(α4, β4, γ4). For example, we could select B(α2, β1, γ1) and B(α2, β1, γ2) as intermediate nodes.
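The three situations amount to changing one differing block coordinate per inter-block hop. The sketch below illustrates this at the block level only; the function name `block_route` and the `order` parameter are assumptions introduced for illustration, and the choice of order is arbitrary.

```python
def block_route(src, dst, order=(0, 1, 2)):
    """Blocks visited when one differing block coordinate is fixed per hop.

    src and dst are (alpha, beta, gamma) triples; order gives the sequence in
    which the alpha (0), beta (1) and gamma (2) coordinates are corrected.
    """
    route = [tuple(src)]
    cur = list(src)
    for i in order:
        if cur[i] != dst[i]:
            cur[i] = dst[i]          # one inter-block link changes one coordinate
            route.append(tuple(cur))
    return route
```

With `order=(0, 2, 1)` and all three coordinates different, the intermediate blocks are B(α2, β1, γ1) and B(α2, β1, γ2), as in the example given for Situation 3. The number of inter-block hops is `len(route) - 1`, which is 1, 2 or 3 for Situations 1, 2 and 3 respectively.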
Since we use 6 coordinates to denote a processor, it is convenient to consider a 6-dimensional space where we have a point in that space, representing a processor, whenever we specify all the 6 coordinates (α, β, γ, x, y, z). A number of processors that share 5 of these components must lie on a line. In other words, we may visualize a line of processors by specifying any 5 of these 6 components. For example, in Figure 3.5, by specifying (α, β, γ, 2, 2, *) we are specifying the processors (α, β, γ, 2, 2, 1), (α, β, γ, 2, 2, 2), (α, β, γ, 2, 2, 3), which are next to one another and form a line of processors. Extending the idea, if we specify any 4 of these 6 components, we define a plane.
For example, by specifying (α, β, γ, 2, *, *) we are specifying the following lines of processors:
- (α, β, γ, 2, 1, 1), (α, β, γ, 2, 1, 2), (α, β, γ, 2, 1, 3),
- (α, β, γ, 2, 2, 1), (α, β, γ, 2, 2, 2), (α, β, γ, 2, 2, 3),
- (α, β, γ, 2, 3, 1), (α, β, γ, 2, 3, 2), (α, β, γ, 2, 3, 3).
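The wildcard notation can be made concrete with a few lines of code. This is only an illustrative sketch: the helper name `expand` is hypothetical, and the block coordinates are fixed to concrete numbers because the notation above leaves α, β, γ symbolic.

```python
from itertools import product


def expand(pattern, n):
    """All processors matching a 6-tuple pattern in which '*' is a free coordinate."""
    choices = [range(1, n + 1) if c == '*' else [c] for c in pattern]
    return list(product(*choices))
```

For n = 3, `expand((1, 1, 1, 2, 2, '*'), 3)` gives the three processors of one line, and `expand((1, 1, 1, 2, '*', '*'), 3)` gives the nine processors of a plane.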
Theorem 1: There always exists a path of length ≤ 3n from any processor P(α1, β1, γ1, x1, y1, z1) to any other processor P(α2, β2, γ2, x2, y2, z2).
Proof:
If we divide the source block by three imaginary planes (α1, β1, γ1, β2, *, *), (α1, β1, γ1, *, γ2, *) and (α1, β1, γ1, *, *, α2), as we show in Figure 3.5, we get 8 octants in the source block, which we will denote as SO1, SO2, SO3, SO4, SO5, SO6, SO7 and SO8. Similarly, by dividing the destination block by three other imaginary planes (α2, β2, γ2, β1, *, *), (α2, β2, γ2, *, γ1, *) and (α2, β2, γ2, *, *, α1), we get 8 octants DO1, DO2, DO3, DO4, DO5, DO6, DO7 and DO8 in the destination block.
Figure 3.5: Three imaginary planes divide the source block into 8 octants
Since the source (destination) node S = P(α1, β1, γ1, x1, y1, z1) (respectively D = P(α2, β2, γ2, x2, y2, z2)) may be in any one of the 8 octants in the source (destination) block, we have to consider 64 possible octant pair combinations for the source-destination pair (S, D). To illustrate our approach, we will only consider the case where the source (destination) node is in the octant SO1 (DO1). In other words, 1 ≤ x1 ≤ β2, 1 ≤ y1 ≤ γ2, 1 ≤ z1 ≤ α2, 1 ≤ x2 ≤ β1, 1 ≤ y2 ≤ γ1, 1 ≤ z2 ≤ α1. A possible path PT1 from the source node (which is in the block (α1, β1, γ1)) to the destination node (in the block (α2, β2, γ2)) using the intermediate blocks (α2, β1, γ1) and (α2, β2, γ1) may be formulated as follows:
Path PT1:
P(α1, β1, γ1, x1, y1, z1) → ... → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1) → ... → P(α2, β1, γ1, β2, 1, α1) → P(α2, β2, γ1, β1, n, α1) → ... → P(α2, β2, γ1, β1, γ2, n) → P(α2, β2, γ2, β1, γ1, 1) → ... → P(α2, β2, γ2, x2, y2, z2).
The length of this path is L_PT1, where
$$
\begin{aligned}
L_{PT1} &= (x_1 - 1) + (\alpha_2 - z_1) + 1 + (n - \beta_2) + (y_1 - 1) + 1 + (n - \gamma_2) + (n - \alpha_1) + 1 + (\beta_1 - x_2) + (\gamma_1 - y_2) + (z_2 - 1) \\
&= x_1 - 1 + \alpha_2 - z_1 + 1 + n - \beta_2 + y_1 - 1 + 1 + n - \gamma_2 + n - \alpha_1 + 1 + \beta_1 - x_2 + \gamma_1 - y_2 + z_2 - 1 \\
&= 3n + x_1 + y_1 - z_1 - \alpha_1 + \beta_1 + \gamma_1 - x_2 - y_2 + z_2 + \alpha_2 - \beta_2 - \gamma_2 .
\end{aligned}
$$
In a similar way, a possible path PT2 from the source node to the destination node using the intermediate blocks (α1, β1, γ2) and (α1, β2, γ2) may be formulated as follows:
Path PT2:
P(α1, β1, γ1, x1, y1, z1) → ... → P(α1, β1, γ1, x1, γ2, 1) → P(α1, β1, γ2, x1, γ1, n) → ... → P(α1, β1, γ2, β2, n, n) → P(α1, β2, γ2, β1, 1, n) → ... → P(α1, β2, γ2, n, 1, α2) → P(α2, β2, γ2, 1, 1, α1) → ... → P(α2, β2, γ2, x2, y2, z2).
The length of this path is L_PT2, where
$$
\begin{aligned}
L_{PT2} &= (\gamma_2 - y_1) + (z_1 - 1) + 1 + (\beta_2 - x_1) + (n - \gamma_1) + 1 + (n - \beta_1) + (n - \alpha_2) + 1 + (x_2 - 1) + (y_2 - 1) + (\alpha_1 - z_2) \\
&= \gamma_2 - y_1 + z_1 - 1 + 1 + \beta_2 - x_1 + n - \gamma_1 + 1 + n - \beta_1 + n - \alpha_2 + 1 + x_2 - 1 + y_2 - 1 + \alpha_1 - z_2 \\
&= 3n - x_1 - y_1 + z_1 + \alpha_1 - \beta_1 - \gamma_1 + x_2 + y_2 - z_2 - \alpha_2 + \beta_2 + \gamma_2 .
\end{aligned}
$$
It may be readily verified that the sum of these two path lengths is L_PT1 + L_PT2 = 6n. Therefore the shorter of these two paths must have length 3n or less.
The other 63 possible cases of source and destination processor locations in the various octants have been checked in a similar way.
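The identity L_PT1 + L_PT2 = 6n for the SO1/DO1 case can also be checked numerically. The sketch below simply transcribes the two length formulas; the function names and the random sampling scheme are assumptions made for illustration, and the check covers only this one octant pair.

```python
import random


def len_pt1(n, a1, b1, g1, x1, y1, z1, a2, b2, g2, x2, y2, z2):
    """Length of path PT1 for a source in SO1 and a destination in DO1."""
    return ((x1 - 1) + (a2 - z1) + 1 + (n - b2) + (y1 - 1) + 1
            + (n - g2) + (n - a1) + 1 + (b1 - x2) + (g1 - y2) + (z2 - 1))


def len_pt2(n, a1, b1, g1, x1, y1, z1, a2, b2, g2, x2, y2, z2):
    """Length of path PT2 for a source in SO1 and a destination in DO1."""
    return ((g2 - y1) + (z1 - 1) + 1 + (b2 - x1) + (n - g1) + 1
            + (n - b1) + (n - a2) + 1 + (x2 - 1) + (y2 - 1) + (a1 - z2))


def check_sum(n, trials=1000, seed=0):
    """Sample SO1/DO1 instances and confirm L_PT1 + L_PT2 == 6n for each."""
    rng = random.Random(seed)
    for _ in range(trials):
        a1, b1, g1, a2, b2, g2 = (rng.randint(1, n) for _ in range(6))
        x1, y1, z1 = rng.randint(1, b2), rng.randint(1, g2), rng.randint(1, a2)
        x2, y2, z2 = rng.randint(1, b1), rng.randint(1, g1), rng.randint(1, a1)
        args = (n, a1, b1, g1, x1, y1, z1, a2, b2, g2, x2, y2, z2)
        if len_pt1(*args) + len_pt2(*args) != 6 * n:
            return False
    return True
```

Because the two lengths always sum to 6n, at least one of them is at most 3n, which is the bound used in the proof.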
Next we show that there exists at least one source and destination pair in the network whose minimum distance is 3n. We have to consider two situations: n is even and n is odd.
1. If n is even, let us consider the source processor P(1, 1, 1, 1, 1, 1) and the destination processor P(n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1).
2. If n is odd, we consider the source P(1, 1, 1, 1, 1, 1) and the destination P((n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2).
Situation 1: P(1, 1, 1, 1, 1, 1) to P(n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1), n is even
P(1, 1, 1, 1, 1, 1) → P(1, 1, 1, 1, 1, n/2 + 1) → P(n/2 + 1, 1, 1, n, 1, 1) → P(n/2 + 1, 1, 1, n/2 + 1, 1, 1) → P(n/2 + 1, n/2 + 1, 1, 1, n, 1) → P(n/2 + 1, n/2 + 1, 1, 1, n/2 + 1, 1) → P(n/2 + 1, n/2 + 1, n/2 + 1, 1, 1, n) → P(n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1, n/2 + 1)
The cost of this path is
$$
\begin{aligned}
&\left(\tfrac{n}{2} + 1 - 1\right) + 1 + \left(n - \tfrac{n}{2} - 1\right) + 1 + \left(n - \tfrac{n}{2} - 1\right) + 1 + \left(\tfrac{n}{2} + 1 - 1\right) + \left(\tfrac{n}{2} + 1 - 1\right) + \left(n - \tfrac{n}{2} - 1\right) \\
&= \frac{n + 2 + 2n - n - 2 + 2 + 2n - n - 2 + 2 + n + n + 2n - n - 2}{2} \\
&= 3n
\end{aligned}
$$
Situation 2: P(1, 1, 1, 1, 1, 1) to P((n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2), n is odd
P(1, 1, 1, 1, 1, 1) → P(1, 1, 1, 1, 1, (n + 1)/2) → P((n + 1)/2, 1, 1, n, 1, 1) → P((n + 1)/2, 1, 1, (n + 1)/2, 1, 1) → P((n + 1)/2, (n + 1)/2, 1, 1, n, 1) → P((n + 1)/2, (n + 1)/2, 1, 1, (n + 1)/2, 1) → P((n + 1)/2, (n + 1)/2, (n + 1)/2, 1, 1, n) → P((n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2, (n + 1)/2)
The cost of this path is:
$$
\begin{aligned}
&\left(\tfrac{n+1}{2} - 1\right) + 1 + \left(n - \tfrac{n+1}{2}\right) + 1 + \left(n - \tfrac{n+1}{2}\right) + 1 + \left(\tfrac{n+1}{2} - 1\right) + \left(\tfrac{n+1}{2} - 1\right) + \left(n - \tfrac{n+1}{2}\right) \\
&= \frac{n + 1 - 2 + 2 + 2n - n - 1 + 2 + 2n - n - 1 + 2 + n + 1 - 2 + n + 1 - 2 + 2n - n - 1}{2} \\
&= \frac{9n - 3n + 9 - 9}{2} \\
&= 3n
\end{aligned}
$$
The following path shows an example from corner to corner: the source processor is P(1, 1, 1, 1, 1, 1) and the destination processor is P(n, n, n, n, n, n).
Path: P(1, 1, 1, 1, 1, 1) → ... → P(1, 1, 1, 1, 1, n) → P(n, 1, 1, n, 1, 1) → P(n, n, 1, 1, n, 1) → P(n, n, n, 1, 1, n) → ... → P(n, n, n, n, n, n)
The length of this path is (n − 1) + 1 + 1 + 1 + (n − 1) + (n − 1) = 3n.
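The corner-to-corner path can be replayed hop by hop against the interconnection rules of section 3.1.4. The following sketch does so under the assumption that processors are 1-based tuples (α, β, γ, x, y, z); the helper names `is_edge`, `walk` and `corner_path` are hypothetical.

```python
def is_edge(p, q, n):
    """True if p and q are joined by an intra-block or an inter-block link."""
    a, b, g, x, y, z = p
    if p[:3] == q[:3] and sum(abs(u - v) for u, v in zip(p[3:], q[3:])) == 1:
        return True                                    # mesh link inside a block
    partners = []
    if y in (1, n):
        partners.append((a, x, g, b, n + 1 - y, z))    # Rule 1
    if x in (1, n):
        partners.append((z, b, g, n + 1 - x, y, a))    # Rule 2
    if z in (1, n):
        partners.append((a, b, y, x, g, n + 1 - z))    # Rule 3
    return q in partners


def walk(p, target_xyz):
    """Intra-block steps from p to (x, y, z) = target_xyz, one axis at a time."""
    steps, cur = [], list(p)
    for axis, t in zip((3, 4, 5), target_xyz):
        while cur[axis] != t:
            cur[axis] += 1 if t > cur[axis] else -1
            steps.append(tuple(cur))
    return steps


def corner_path(n):
    """The corner-to-corner path listed above, as an explicit list of processors."""
    start = (1, 1, 1, 1, 1, 1)
    path = [start] + walk(start, (1, 1, n))   # n - 1 steps along z
    path.append((n, 1, 1, n, 1, 1))           # Rule 2 link
    path.append((n, n, 1, 1, n, 1))           # Rule 1 link
    path.append((n, n, n, 1, 1, n))           # Rule 3 link
    path += walk(path[-1], (n, n, n))         # (n - 1) + (n - 1) steps
    return path
```

Checking `is_edge` on every consecutive pair of `corner_path(n)` confirms that each hop is a legal link, and `len(corner_path(n)) - 1` reproduces the path length 3n.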
The diameter of the 3D MM is only $O(N^{1/6})$, where $N = n^6$ is the number of processors, in contrast to $O(N^{1/3})$ on a 3-dimensional torus with the same node degree of 6. We note that the Multi-Mesh has a diameter of $O(N^{1/4})$ with a node degree of 4, which was shown to be attractive with respect to other topologies [DDS99], [HwBr83], [Le92].
The node degree and the diameters of the Hypercube, the Multi-Mesh and the 3D MM containing N processors are given in Table 3.1.
Table 3.1: Diameter of Hypercube, Multi-Mesh and 3D MM

Hypercube                Multi-Mesh               3D MM
Degree     Diameter      Degree     Diameter      Degree     Diameter
log2 N     log2 N        4          N^(1/4)       6          N^(1/6)
As an example, Table 3.2 shows a comparison between the diameter of a hypercube, Multi-Mesh and 3D MM network for different total numbers of nodes (N).
Table 3.2: An example of Diameter of Hypercube, Multi-Mesh and 3D MM

# of nodes   Hypercube              Multi-Mesh             3D MM
             Degree    Diameter     Degree    Diameter     Degree    Diameter
64           6         6            4         6            6         6
4096         12        12           4         16           6         12
256K         18        18           4         44           6         24
16M          24        24           4         126          6         48
Thus, for N = 4096, the diameter of both the 3D MM network and the binary hypercube is equal to 12, but the node degree of the corresponding hypercube is 12, while that of the 3D MM network is only 6. In other words, with 4096 processors the 3D MM network matches the diameter of the hypercube and has a smaller diameter than the Multi-Mesh (12 versus 16), while its node degree remains constant.
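The hypercube and 3D MM columns of Table 3.2 follow directly from N. The sketch below recomputes them, assuming N = n^6 is also a power of two so that a hypercube of the same size exists; the function names are hypothetical, and the Multi-Mesh column is left out because its diameter formula is not restated in this section.

```python
import math


def mm3d_order(num_nodes):
    """Order n of a 3D MM with num_nodes = n**6 processors."""
    n = round(num_nodes ** (1 / 6))
    if n ** 6 != num_nodes:
        raise ValueError("number of nodes must be a sixth power")
    return n


def compare(num_nodes):
    """(degree, diameter) of the hypercube and the 3D MM for num_nodes processors."""
    if num_nodes & (num_nodes - 1):
        raise ValueError("a hypercube needs a power-of-two number of nodes")
    n = mm3d_order(num_nodes)
    log_n = int(math.log2(num_nodes))    # hypercube degree equals its diameter
    return {"hypercube": (log_n, log_n), "3d_mm": (6, 3 * n)}
```

For instance, `compare(4096)` yields degree and diameter 12 for the hypercube and degree 6, diameter 12 for the 3D MM, matching the second row of the table.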
3.2.2 Connectivity of Multi-Mesh network
According to D. Sima, T. Fountain and P. Kacsuk [Be73], [SFK97], the connectivity of a graph is defined as the minimum number of arcs of a connected graph that have to be removed in order that the resulting sub-graph consists of two disconnected sub-graphs. It is also well known that if the connectivity of a graph is C, we can always find C node-disjoint paths between any pair of nodes [SBS01].
An interconnection network with a higher connectivity is preferable since higher 
connectivity implies better fault tolerance and higher capability for load balancing.
The Multi-Mesh (MM) network that we described in chapter 2 is a regular graph [Be73] 
where the node degree of each processor in the network is four. As a result, the upper 
bound of connectivity of any MM is four. In this section we will prove that the 
connectivity of MM is exactly four.
As we described in Chapter 2, the two-dimensional mesh is the basic building block in a Multi-Mesh network. In [DDS99], based on the position of a processor within a block, processors are classified as internal, boundary or corner processors.
Within a block, an internal processor has exactly four neighboring processors (connected by intra-block links), a boundary processor has three neighbors and a corner processor has two neighbors. The authors of [DDS99] show how the inter-block links of a MM network ensure that each processor in a MM network has exactly four links.
Theorem 2:
The connectivity of a Multi-Mesh network is 4.
Proof
In order to prove this, we have to show that, regardless of the position of the source and the destination, we can always find 4 edge-disjoint paths ED1, ED2, ED3 and ED4. The source and the destination may be in the same block or in different blocks. We will discuss only the case where they are in different blocks since that is the more challenging task.
We need to consider 9 possible combinations of source and destination processor categories. We will consider the following two cases:
• Case 1: The source and the destination are both internal processors.
• Case 2: The source is a boundary processor and the destination is an internal processor.
The remaining 7 can be handled in a similar way.
Case 1: The source and the destination are both internal processors.
We write the source as P(α1, β1, x1, y1) and the destination as P(α2, β2, x2, y2), and consider the situation where the following conditions hold:
i) 1 < x1 < β2 and 1 < y1 < α2
ii) 1 < x2 < β1 and 1 < y2 < α1
Since both the source node and the destination node are internal processors, they both have 4 neighbors. We now show how we may create four edge-disjoint paths ED1, ED2, ED3 and ED4 from the source to the destination node.
For each path, we
a) first give the path at the block level where we only specify the blocks 
used in the path,
b) then give a short description of the path,
c) finally give a detailed description of the path used.
In giving a short description of a path, we have used the notation X →* Y to denote that we have used a number of intra-block edges to go from processor X to processor Y.
Path ED1:
a) At the block level the path is as follows:
B(α1, β1) → B(α2, β1) → B(α2, β2).
b) A short description of the path is as follows:
P(α1, β1, x1, y1) →* P(α1, β1, 1, α2) → P(α2, β1, n, α1) →* P(α2, β1, β2, n) → P(α2, β2, β1, 1) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, x1, y1) → P(α1, β1, x1 − 1, y1) → ... → P(α1, β1, 1, y1) → P(α1, β1, 1, y1 + 1) → ... → P(α1, β1, 1, α2) → P(α2, β1, n, α1) → P(α2, β1, n, α1 + 1) → ... → P(α2, β1, n, n) → P(α2, β1, n − 1, n) → ... → P(α2, β1, β2, n) → P(α2, β2, β1, 1) → P(α2, β2, β1 − 1, 1) → ... → P(α2, β2, x2, 1) → P(α2, β2, x2, 1 + 1) → ... → P(α2, β2, x2, y2).
Path ED2:
a) At the block level the path is as follows:
B(α1, β1) → B(α2, β1) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, x1, y1) →* P(α1, β1, n, α2) → P(α2, β1, 1, α1) →* P(α2, β1, β2, 1) → P(α2, β2, β1, n) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, x1, y1) → P(α1, β1, x1 + 1, y1) → ... → P(α1, β1, n, y1) → P(α1, β1, n, y1 + 1) → ... → P(α1, β1, n, α2) → P(α2, β1, 1, α1) → P(α2, β1, 1, α1 − 1) → ... → P(α2, β1, 1, 1) → P(α2, β1, 1 + 1, 1) → ... → P(α2, β1, β2, 1) → P(α2, β2, β1, n) → P(α2, β2, β1 − 1, n) → ... → P(α2, β2, x2, y2).










Figure 3.6: Possible four disjoint paths from source to destination (Case 1)
Path ED3:
a) At the block level the path is as follows:
B(α1, β1) → B(α1, β2) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, x1, y1) →* P(α1, β1, β2, 1) → P(α1, β2, β1, n) →* P(α1, β2, n, α2) → P(α2, β2, 1, α1) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, x1, y1) → P(α1, β1, x1, y1 − 1) → ... → P(α1, β1, x1, 1) → P(α1, β1, x1 + 1, 1) → ... → P(α1, β1, β2, 1) → P(α1, β2, β1, n) → P(α1, β2, β1 + 1, n) → ... → P(α1, β2, n, n) → P(α1, β2, n, n − 1) → ... → P(α1, β2, n, α2) → P(α2, β2, 1, α1) → P(α2, β2, 1, α1 − 1) → ... → P(α2, β2, 1, y2) → P(α2, β2, 1 + 1, y2) → ... → P(α2, β2, x2, y2).
Path ED4:
a) At the block level the path is as follows:
B(α1, β1) → B(α1, β2) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, x1, y1) →* P(α1, β1, β2, n) → P(α1, β2, β1, 1) →* P(α1, β2, 1, α2) → P(α2, β2, n, α1) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, x1, y1) → P(α1, β1, x1, y1 + 1) → ... → P(α1, β1, x1, n) → P(α1, β1, x1 + 1, n) → ... → P(α1, β1, β2, n) → P(α1, β2, β1, 1) → P(α1, β2, β1 − 1, 1) → ... → P(α1, β2, 1, 1) → P(α1, β2, 1, 1 + 1) → ... → P(α1, β2, 1, α2) → P(α2, β2, n, α1) → P(α2, β2, n, α1 − 1) → ... → P(α2, β2, n, y2) → P(α2, β2, n − 1, y2) → ... → P(α2, β2, x2, y2).
Case 2: The source processor is a boundary processor and the destination processor is an internal processor. In this case the source has three intra-block links to neighboring processors, and its fourth link connects it to a processor situated in another block.
We will consider the situation where the following conditions hold:
i) x1 = 1, x1 < β2 and 1 < y1 < α2
ii) 1 < x2 < β1 and 1 < y2 < α1
The remaining situations can be solved in a similar way. We now show how we may create four edge-disjoint paths ED1, ED2, ED3 and ED4 from the source to the destination node.
Path ED1:
a) At the block level the path is as follows:
B(α1, β1) → B(α2, β1) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, 1, y1) →* P(α1, β1, 1, α2) → P(α2, β1, n, α1) →* P(α2, β1, β2, n) → P(α2, β2, β1, 1) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, 1, y1) → P(α1, β1, 1, y1 + 1) → ... → P(α1, β1, 1, α2) → P(α2, β1, n, α1) → P(α2, β1, n, α1 + 1) → ... → P(α2, β1, n, n) → P(α2, β1, n − 1, n) → ... → P(α2, β1, β2, n) → P(α2, β2, β1, 1) → P(α2, β2, β1 − 1, 1) → ... → P(α2, β2, x2, 1) → P(α2, β2, x2, 1 + 1) → ... → P(α2, β2, x2, y2).
Path ED2:
a) At the block level the path is as follows:
B(α1, β1) → B(α2, β1) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, 1, y1) →* P(α1, β1, n, α2) → P(α2, β1, 1, α1) →* P(α2, β1, β2, 1) → P(α2, β2, β1, n) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, 1, y1) → P(α1, β1, 1 + 1, y1) → ... → P(α1, β1, n, y1) → P(α1, β1, n, y1 + 1) → ... → P(α1, β1, n, α2) → P(α2, β1, 1, α1) → P(α2, β1, 1, α1 − 1) → ... → P(α2, β1, 1, 1) → P(α2, β1, 1 + 1, 1) → ... → P(α2, β1, β2, 1) → P(α2, β2, β1, n) → P(α2, β2, β1 − 1, n) → ... → P(α2, β2, x2, y2).








Figure 3.7: Possible four disjoint paths from source to destination (Case 2)
Path ED3:
a) At the block level the path is as follows:
B(α1, β1) → B(α1, β2) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, 1, y1) →* P(α1, β1, β2, 1) → P(α1, β2, β1, n) →* P(α1, β2, n, α2) → P(α2, β2, 1, α1) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, 1, y1) → P(α1, β1, 1, y1 − 1) → ... → P(α1, β1, 1, 1) → P(α1, β1, 1 + 1, 1) → ... → P(α1, β1, β2, 1) → P(α1, β2, β1, n) → P(α1, β2, β1 + 1, n) → ... → P(α1, β2, n, n) → P(α1, β2, n, n − 1) → ... → P(α1, β2, n, α2) → P(α2, β2, 1, α1) → P(α2, β2, 1, α1 − 1) → ... → P(α2, β2, 1, y2) → P(α2, β2, 1 + 1, y2) → ... → P(α2, β2, x2, y2).
Path ED4:
There is an inter-block link connecting the processors P(α1, β1, 1, y1) and P(y1, β1, n, α1). If y1 = α1 we have a self loop. Otherwise we will reach another block B(y1, β1). If it is on another block then obviously we will get another distinct path.
So there are two situations:
i) y1 ≠ α1
ii) y1 = α1
For situation 1, if y1 ≠ α1, then in order to get an edge-disjoint path ED4 we will take the following path:
a) At the block level the path is as follows:
B(α1, β1) → B(y1, β1) → B(y1, β2) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, 1, y1) → P(y1, β1, n, α1) →* P(y1, β1, β2, 1) → P(y1, β2, β1, n) →* P(y1, β2, 1, α2) → P(α2, β2, n, y1) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, 1, y1) → P(y1, β1, n, α1) → P(y1, β1, n − 1, α1) → ... → P(y1, β1, β2, α1) → P(y1, β1, β2, α1 − 1) → ... → P(y1, β1, β2, 1) → P(y1, β2, β1, n) → P(y1, β2, β1 − 1, n) → ... → P(y1, β2, 1, n) → P(y1, β2, 1, n − 1) → ... → P(y1, β2, 1, α2) → P(α2, β2, n, y1) → P(α2, β2, n − 1, y1) → ... → P(α2, β2, x2, y1) → P(α2, β2, x2, y1 − 1) → ... → P(α2, β2, x2, y2).
For situation 2, if y1 = α1, then in order to get an edge-disjoint path ED4 we will take the following path:
a) At the block level the path is as follows:
B(α1, β1) → B(α1 − 1, β1) → B(α1 − 1, β2) → B(α2, β2)
b) A short description of the path is as follows:
P(α1, β1, 1, α1) → P(α1, β1, n, α1) → P(α1, β1, n, α1 − 1) → P(α1 − 1, β1, 1, α1) →* P(α1 − 1, β1, β2, 1) → P(α1 − 1, β2, β1, n) →* P(α1 − 1, β2, n, α2) → P(α2, β2, 1, α1 − 1) →* P(α2, β2, x2, y2).
c) A detailed description of the path used is as follows:
P(α1, β1, 1, α1) → P(α1, β1, n, α1) → P(α1, β1, n, α1 − 1) → P(α1 − 1, β1, 1, α1) → P(α1 − 1, β1, 1, α1 − 1) → ... → P(α1 − 1, β1, 1, 1) → P(α1 − 1, β1, 1 + 1, 1) → ... → P(α1 − 1, β1, β2, 1) → P(α1 − 1, β2, β1, n) → P(α1 − 1, β2, β1 − 1, n) → ... → P(α1 − 1, β2, 1, n) → P(α1 − 1, β2, 1, n − 1) → ... → P(α1 − 1, β2, 1, α2) → P(α2, β2, n, α1 − 1) → P(α2, β2, n, α1 − 1 + 1) → ... → P(α2, β2, n, y2) → P(α2, β2, n − 1, y2) → ... → P(α2, β2, x2, y2).
It may be readily verified that these paths are edge disjoint.
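For a concrete choice of n and of the source and destination, edge-disjointness of a set of fully expanded paths can be checked mechanically. The helper below is a generic sketch rather than part of the proof: it assumes each path is given as an explicit list of hashable nodes, and the function names are hypothetical.

```python
def edges_of(path):
    """Undirected edges used by a path given as a list of nodes."""
    return {frozenset(edge) for edge in zip(path, path[1:])}


def are_edge_disjoint(paths):
    """True if no undirected edge is used by more than one of the paths."""
    seen = set()
    for path in paths:
        used = edges_of(path)
        if seen & used:          # some edge already belongs to an earlier path
            return False
        seen |= used
    return True
```

Two paths may share intermediate nodes and still pass this check; only a shared link makes it fail.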
3.2.3 Connectivity of three dimensional Multi-Mesh (3D MM) network
Our proposed network is a three-dimensional Multi-Mesh (3D MM), whose interconnection rules are defined so that we get a regular graph in which the node degree of each processor is six. As a result, the upper bound on the connectivity of the 3D MM is six. In this section we will prove that the connectivity of the 3D MM is exactly six.
As we discussed, the basic building block in a 3D MM is a three-dimensional mesh. Depending on its position within a block, a processor is classified into one of the following categories (in the same way as in section 3.2.2):
1) internal processor,
2) face-centered processor,
3) boundary edge processor,
4) corner processor.
Within a block, an internal processor has exactly six links to other neighbors, a face-centered processor has five neighbors, a boundary edge processor has four links to its neighbors, and a corner processor has three links to its neighbors. By defining the inter-block links we ensure that every processor has exactly six links to other processors.
Theorem 3:
The connectivity of a three dimensional multi-mesh (3D MM) network is 6.
Proof
In order to prove this, we have to show that, regardless of the position of the source and the destination, we can always find 6 edge-disjoint paths EDPT1, EDPT2, EDPT3, EDPT4, EDPT5 and EDPT6. The source and the destination may be in the same block or in different blocks. We will discuss only the case where they are in different blocks since that is the more challenging task.
We need to consider 16 possible combinations of source and destination processor categories. We will only show the case where the source and the destination are both internal processors.
We will consider the situation where the following conditions hold:
i) 1 < x1 < β2, 1 < y1 < γ2 and 1 < z1 < α2
ii) 1 < x2 < β1, 1 < y2 < γ1 and 1 < z2 < α1
Since both the source node and the destination node are internal processors, they both have 6 neighbors. We now show how we may create six edge-disjoint paths EDPT1, EDPT2, EDPT3, EDPT4, EDPT5 and EDPT6 from the source to the destination node.
For each path, we
a) first give the path at the block level where we only specify the blocks used in the path,
b) then give a short description of the path,
c) finally give a detailed description of the path used.
In giving a short description of a path, we have used the notation X →* Y to denote that we have used a number of intra-block edges to go from processor X to processor Y.
Path EDPT1:
a) At the block level the path is as follows:
B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
b) A short description of the path is as follows:
P(α1, β1, γ1, x1, y1, z1) →* P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1) →* P(α2, β1, γ1, β2, 1, α1) → P(α2, β2, γ1, β1, n, α1) →* P(α2, β2, γ1, β1, γ2, n) → P(α2, β2, γ2, β1, γ1, 1) →* P(α2, β2, γ2, x2, y2, z2).
c) A detailed description of the path used is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1 − 1, y1, z1) → ... → P(α1, β1, γ1, 1, y1, z1) → P(α1, β1, γ1, 1, y1, z1 + 1) → ... → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1) → P(α2, β1, γ1, n, y1 − 1, α1) → ... → P(α2, β1, γ1, n, 1, α1) → P(α2, β1, γ1, n − 1, 1, α1) → ... → P(α2, β1, γ1, β2, 1, α1) → P(α2, β2, γ1, β1, n, α1) → P(α2, β2, γ1, β1, n, α1 + 1) → ... → P(α2, β2, γ1, β1, n, n) → P(α2, β2, γ1, β1, n − 1, n) → ... → P(α2, β2, γ1, β1, γ2, n) → P(α2, β2, γ2, β1, γ1, 1) → P(α2, β2, γ2, β1 − 1, γ1, 1) → ... → P(α2, β2, γ2, x2, γ1, 1) → P(α2, β2, γ2, x2, γ1 − 1, 1) → ... → P(α2, β2, γ2, x2, y2, 1) → P(α2, β2, γ2, x2, y2, 1 + 1) → ... → P(α2, β2, γ2, x2, y2, z2).
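The detailed description of EDPT1 can be generated from its short description by expanding every →* segment into single mesh steps. The sketch below does this for a concrete instance and checks each hop against the rules of section 3.1.4. It is an illustration under stated assumptions: processors are 1-based tuples (α, β, γ, x, y, z), the helper names are hypothetical, and the axis order inside each block follows the detailed description above.

```python
def is_edge(p, q, n):
    """True if p and q are joined by an intra-block or an inter-block link."""
    a, b, g, x, y, z = p
    if p[:3] == q[:3] and sum(abs(u - v) for u, v in zip(p[3:], q[3:])) == 1:
        return True                                    # mesh link inside a block
    partners = []
    if y in (1, n):
        partners.append((a, x, g, b, n + 1 - y, z))    # Rule 1
    if x in (1, n):
        partners.append((z, b, g, n + 1 - x, y, a))    # Rule 2
    if z in (1, n):
        partners.append((a, b, y, x, g, n + 1 - z))    # Rule 3
    return q in partners


def edpt1(n, a1, b1, g1, x1, y1, z1, a2, b2, g2, x2, y2, z2):
    """Expand the short description of EDPT1 into an explicit list of processors."""
    path = [(a1, b1, g1, x1, y1, z1)]

    def go(target_xyz, axes):
        # Walk inside the current block, fixing the listed axes in order
        # (axis 3 is x, axis 4 is y, axis 5 is z).
        cur = list(path[-1])
        for axis in axes:
            t = target_xyz[axis - 3]
            while cur[axis] != t:
                cur[axis] += 1 if t > cur[axis] else -1
                path.append(tuple(cur))

    go((1, y1, a2), (3, 5))                  # x down to 1, then z up to alpha2
    path.append((a2, b1, g1, n, y1, a1))     # Rule 2 link
    go((b2, 1, a1), (4, 3))                  # y down to 1, then x down to beta2
    path.append((a2, b2, g1, b1, n, a1))     # Rule 1 link
    go((b1, g2, n), (5, 4))                  # z up to n, then y down to gamma2
    path.append((a2, b2, g2, b1, g1, 1))     # Rule 3 link
    go((x2, y2, z2), (3, 4, 5))              # x, then y, then z to the destination
    return path
```

For n = 6, source P(4, 4, 4, 2, 3, 2) and destination P(5, 5, 5, 2, 3, 2), which satisfy conditions i) and ii), the expanded path has 17 links, in agreement with the formula for L_PT1 derived in section 3.2.1.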
Path EDPT2:
a) At the block level the path is as follows:
B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β1, γ2) → B(α2, β2, γ2)
b) A short description of the path is as follows:
P(α1, β1, γ1, x1, y1, z1) →* P(α1, β1, γ1, n, y1, α2) → P(α2, β1, γ1, 1, y1, α1) →* P(α2, β1, γ1, 1, γ2, n) → P(α2, β1, γ2, 1, γ1, 1) →* P(α2, β1, γ2, β2, n, 1) → P(α2, β2, γ2, β1, 1, 1) →* P(α2, β2, γ2, x2, y2, z2).
c) A detailed description of the path used is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1 + 1, y1, z1) → ... → P(α1, β1, γ1, n, y1, z1) → P(α1, β1, γ1, n, y1, z1 + 1) → ... → P(α1, β1, γ1, n, y1, α2) → P(α2, β1, γ1, 1, y1, α1) → P(α2, β1, γ1, 1, y1, α1 + 1) → ... → P(α2, β1, γ1, 1, y1, n) → P(α2, β1, γ1, 1, y1 + 1, n) → ... → P(α2, β1, γ1, 1, γ2, n) → P(α2, β1, γ2, 1, γ1, 1) → P(α2, β1, γ2, 1, γ1 + 1, 1) → ... → P(α2, β1, γ2, 1, n, 1) → P(α2, β1, γ2, 1 + 1, n, 1) → ... → P(α2, β1, γ2, β2, n, 1) → P(α2, β2, γ2, β1, 1, 1) → P(α2, β2, γ2, β1 − 1, 1, 1) → ... → P(α2, β2, γ2, x2, 1, 1) → P(α2, β2, γ2, x2, 1, 1 + 1) → ... → P(α2, β2, γ2, x2, 1, z2) → P(α2, β2, γ2, x2, 1 + 1, z2) → ... → P(α2, β2, γ2, x2, y2, z2).
Path EDPT3:
a) At the block level the path is as follows:
B(α1, β1, γ1) → B(α1, β2, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
b) A short description of the path is as follows:
P(α1, β1, γ1, x1, y1, z1) →* P(α1, β1, γ1, β2, 1, z1) → P(α1, β2, γ1, β1, n, z1) →* P(α1, β2, γ1, n, n, α2) → P(α2, β2, γ1, 1, n, α1) →* P(α2, β2, γ1, 1, γ2, n) → P(α2, β2, γ2, 1, γ1, 1) →* P(α2, β2, γ2, x2, y2, z2).
c) A detailed description of the path used is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, y1 − 1, z1) → ... → P(α1, β1, γ1, x1, 1, z1) → P(α1, β1, γ1, x1 + 1, 1, z1) → ... → P(α1, β1, γ1, β2, 1, z1) → P(α1, β2, γ1, β1, n, z1) → P(α1, β2, γ1, β1 + 1, n, z1) → ... → P(α1, β2, γ1, n, n, z1) → P(α1, β2, γ1, n, n, z1 + 1) → ... → P(α1, β2, γ1, n, n, α2) → P(α2, β2, γ1, 1, n, α1) → P(α2, β2, γ1, 1, n − 1, α1) → ... → P(α2, β2, γ1, 1, γ2, α1) → P(α2, β2, γ1, 1, γ2, α1 + 1) → ... → P(α2, β2, γ1, 1, γ2, n) → P(α2, β2, γ2, 1, γ1, 1) → P(α2, β2, γ2, 1, γ1 − 1, 1) → ... → P(α2, β2, γ2, 1, y2, 1) → P(α2, β2, γ2, 1, y2, 1 + 1) → ... → P(α2, β2, γ2, 1, y2, z2) → P(α2, β2, γ2, 1 + 1, y2, z2) → ... → P(α2, β2, γ2, x2, y2, z2).
Path EDPT4:
a) At the block level the path is as follows:
B(α1, β1, γ1) → B(α1, β2, γ1) → B(α1, β2, γ2) → B(α2, β2, γ2)
b) A short description of the path is as follows:
P(α1, β1, γ1, x1, y1, z1) →* P(α1, β1, γ1, β2, n, z1) → P(α1, β2, γ1, β1, 1, z1) →* P(α1, β2, γ1, β1, γ2, 1) → P(α1, β2, γ2, β1, γ1, n) →* P(α1, β2, γ2, n, 1, α2) → P(α2, β2, γ2, 1, 1, α1) →* P(α2, β2, γ2, x2, y2, z2).
c) A detailed description of the path used is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, y1 + 1, z1) → ... → P(α1, β1, γ1, x1, n, z1) → P(α1, β1, γ1, x1 + 1, n, z1) → ... → P(α1, β1, γ1, β2, n, z1) → P(α1, β2, γ1, β1, 1, z1) → P(α1, β2, γ1, β1, 1 + 1, z1) → ... → P(α1, β2, γ1, β1, γ2, z1) → P(α1, β2, γ1, β1, γ2, z1 − 1) → ... → P(α1, β2, γ1, β1, γ2, 1) → P(α1, β2, γ2, β1, γ1, n) → P(α1, β2, γ2, β1 − 1, γ1, n) → ... → P(α1, β2, γ2, 1, γ1, n) → P(α1, β2, γ2, 1, γ1 − 1, n) → ... → P(α1, β2, γ2, 1, 1, n) → P(α1, β2, γ2, 1, 1, n − 1) → ... → P(α1, β2, γ2, 1, 1, α2) → P(α2, β2, γ2, n, 1, α1) → P(α2, β2, γ2, n, 1, α1 − 1) → ... → P(α2, β2, γ2, n, 1, z2) → P(α2, β2, γ2, n, 1 + 1, z2) → ... → P(α2, β2, γ2, n, y2, z2) → P(α2, β2, γ2, n − 1, y2, z2) → ... → P(α2, β2, γ2, x2, y2, z2).
Path EDPT5:
a) At the block level the path is as follows:
B(α1, β1, γ1) → B(α1, β1, γ2) → B(α1, β2, γ2) → B(α2, β2, γ2)
b) A short description of the path is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, 1) → P(α1, β1, γ2, x1, γ1, n)
→ P(α1, β1, γ2, β2, n, n) → P(α1, β2, γ2, β1, 1, n) → P(α1, β2, γ2, n, 1, α2)
→ P(α2, β2, γ2, 1, 1, α1) → P(α2, β2, γ2, x2, y2, z2).
c) A detailed description of the path used is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, y1, z1 − 1) → ... → P(α1, β1, γ1, x1, y1, 1)
→ P(α1, β1, γ1, x1, y1 + 1, 1) → ... → P(α1, β1, γ1, x1, γ2, 1) → P(α1, β1, γ2, x1, γ1, n)
→ P(α1, β1, γ2, x1, γ1 − 1, n) → ... → P(α1, β1, γ2, x1, 1, n) → P(α1, β1, γ2, x1 + 1, 1, n)
→ ... → P(α1, β1, γ2, β2, 1, n) → P(α1, β2, γ2, β1, n, n) → P(α1, β2, γ2, β1 + 1, n, n)
→ ... → P(α1, β2, γ2, n, n, n) → P(α1, β2, γ2, n, n, n − 1) → ... → P(α1, β2, γ2, n, n, α2)
→ P(α2, β2, γ2, 1, n, α1) → P(α2, β2, γ2, 1, n, α1 − 1) → ... → P(α2, β2, γ2, 1, n, z2)
→ P(α2, β2, γ2, 1 + 1, n, z2) → ... → P(α2, β2, γ2, x2, n, z2) → P(α2, β2, γ2, x2, n − 1, z2)
→ ... → P(α2, β2, γ2, x2, y2, z2).
Path EDPT6:
a) At the block level the path is as follows:
B(α1, β1, γ1) → B(α1, β1, γ2) → B(α2, β1, γ2) → B(α2, β2, γ2)
b) A short description of the path is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, n) → P(α1, β1, γ2, x1, γ1, 1)
→ P(α1, β1, γ2, 1, n, α2) → P(α2, β1, γ2, n, n, α1) → P(α2, β1, γ2, β2, n, n)
→ P(α2, β2, γ2, β1, 1, n) → P(α2, β2, γ2, x2, y2, z2).
c) A detailed description of the path used is as follows:
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, y1, z1 + 1) → ... → P(α1, β1, γ1, x1, y1, n)
→ P(α1, β1, γ1, x1, y1 + 1, n) → ... → P(α1, β1, γ1, x1, γ2, n) → P(α1, β1, γ2, x1, γ1, 1)
→ P(α1, β1, γ2, x1 − 1, γ1, 1) → ... → P(α1, β1, γ2, 1, γ1, 1) → P(α1, β1, γ2, 1, γ1 + 1, 1)
→ ... → P(α1, β1, γ2, 1, n, 1) → P(α1, β1, γ2, 1, n, 1 + 1) → ... → P(α1, β1, γ2, 1, n, α2)
→ P(α2, β1, γ2, n, n, α1) → P(α2, β1, γ2, n, n, α1 + 1) → ... → P(α2, β1, γ2, n, n, n)
→ P(α2, β1, γ2, n − 1, n, n) → ... → P(α2, β1, γ2, β2, n, n) → P(α2, β2, γ2, β1, 1, n)
→ P(α2, β2, γ2, β1 − 1, 1, n) → ... → P(α2, β2, γ2, x2, 1, n) → P(α2, β2, γ2, x2, 1 + 1, n)
→ ... → P(α2, β2, γ2, x2, y2, n) → P(α2, β2, γ2, x2, y2, n − 1) → ... → P(α2, β2, γ2, x2, y2, z2).
It may be readily verified that these six paths are edge-disjoint. The other cases are similar.
3.3 Message Routing in the 3D Multi-Mesh
The routing problem can be defined as the process of sending messages from a source 
processor to a destination processor. The routing algorithm implemented on the router is 
responsible for determining the path from source to destination. The length of the path in 
the worst case determines the performance of a routing algorithm.
In this section we present a scheme for routing messages from any source processor to any 
destination processor for point-to-point communication. Let the source processor be 
S = P(α1, β1, γ1, x1, y1, z1) and the destination processor be D = P(α2, β2, γ2, x2, y2, z2), 
1 ≤ α1, β1, γ1, x1, y1, z1, α2, β2, γ2, x2, y2, z2 ≤ n, so that the 3D block corresponding 
to S is B(α1, β1, γ1) and that corresponding to D is B(α2, β2, γ2). We will describe routing 
along the restricted paths used in Theorem 3.1 (Diameter). There are three situations to 
consider, as described in Theorem 3.1; among them we will only consider the situation 
where none of the coordinates of the source block B(α1, β1, γ1) has the same value as 
the corresponding coordinate of the destination block B(α2, β2, γ2). The other 
cases are similar.
In this case, none of the coordinates of the source block B(α1, β1, γ1) has the same 
value as the corresponding coordinate of the destination block B(α2, β2, γ2). In
other words, α1 ≠ α2, β1 ≠ β2 and γ1 ≠ γ2. In this case, there exist two intermediate 
blocks B(α3, β3, γ3) and B(α4, β4, γ4) such that there is a direct link between
- the source block B(α1, β1, γ1) and the intermediate block B(α3, β3, γ3),
- the block B(α3, β3, γ3) and the block B(α4, β4, γ4) and
- the block B(α4, β4, γ4) and the destination block B(α2, β2, γ2).
There are a number of ways in which we may choose the intermediate blocks B(α3, β3, 
γ3) and B(α4, β4, γ4). For example, we could select B(α2, β1, γ1) and B(α2, β1, γ2) as 
intermediate blocks. In order to route a message from a source processor S = P(α1, β1, γ1, 
x1, y1, z1) to any destination processor D = P(α2, β2, γ2, x2, y2, z2), we first divide the 
source block by three imaginary planes - (α1, β1, γ1, β2, *, *), (α1, β1, γ1, *, γ2, *) and 
(α1, β1, γ1, *, *, α2) - as shown in Figure 3.5. This gives us 8 octants in the source 
block, which we will denote by SO1, SO2, SO3, SO4, SO5, SO6, SO7 and SO8. 
Similarly we divide the destination block by three other imaginary planes (α2, β2, γ2, β1, 
*, *), (α2, β2, γ2, *, γ1, *) and (α2, β2, γ2, *, *, α1), giving us 8 octants - DO1, DO2, 
DO3, DO4, DO5, DO6, DO7 and DO8 - in the destination block.
We will use the boundary processors on the three planes described as exit/entry points to 
communicate with processors in other blocks. In the proof of Theorem 3.1, we showed that, 
for a suitable choice of the exit point from the source block, we could choose a 
corresponding entry point for the destination block to define a path PT1. Keeping in mind 
the choices for the entry/exit points for PT1, we chose another set of entry/exit points to 
define a path PT2. We showed that one of these paths must be of length less than or equal
to 3n. The algorithm for message routing uses this idea to select the optimum path for 
routing.
The idea used in this algorithm is similar to that used in [DDS99].
Algorithm:
Step 1:
i) Determine the octants of the source and destination blocks.
ii) Calculate the two possible paths PT1 and PT2 from source processor to
destination processor as defined in section 3 and choose the path with the shortest 
length. Let the chosen path from B(α1, β1, γ1) to B(α2, β2, γ2) be through 
blocks B(αi1, βj1, γk1) and B(αi2, βj2, γk2).
iii) Attach, to the data packet, a list consisting of the addresses of the exit/entry
processors of these blocks. This list consists of the following pieces of
information:
Field1: Source block exit processor 
Field2: First intermediate block exit processor 
Field3: Second intermediate block exit processor and 
Field4: Destination processor.
Step 2:
If the value stored in Field1 is the address of the current processor, go to step 3. 
Otherwise send the packet towards the processor specified in Field1 using the 
appropriate intra-block link from the current processor and go back to step 2.
Step 3:
a) Send the packet to the appropriate processor by using the appropriate inter-block link 
from the current processor.
b) Update the list of four address fields as follows:
Field1 ← Field2 
Field2 ← Field3 
Field3 ← Field4 
Field4 ← NULL
c) If Field1 is NULL, stop. Otherwise, go to step 2.
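The field-list mechanism of Steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: processors are plain 6-tuples (α, β, γ, x, y, z), the helper names (shift_fields, intra_block_step, route) are hypothetical, intra-block movement corrects x, then y, then z, and the inter-block links are supplied by the caller as a dictionary from exit processor to entry processor. The last field (the destination) has no entry in that dictionary, so no inter-block hop is taken there.

```python
def shift_fields(fields):
    """Step 3(b): Field1 <- Field2, Field2 <- Field3, Field3 <- Field4, Field4 <- NULL."""
    return fields[1:] + [None]


def intra_block_step(cur, target):
    """Move one hop inside a block toward target (assumed order: x, then y, then z)."""
    block, local = cur[:3], list(cur[3:])
    for axis in range(3):
        goal = target[3 + axis]
        if local[axis] != goal:
            local[axis] += 1 if goal > local[axis] else -1
            break
    return block + tuple(local)


def route(source, fields, inter_block_link):
    """Return the list of processors visited, following Steps 2 and 3."""
    cur, path, fields = source, [source], list(fields)
    while fields[0] is not None:
        # Step 2: walk to the processor named in Field1 over intra-block links.
        while cur != fields[0]:
            cur = intra_block_step(cur, fields[0])
            path.append(cur)
        # Step 3(a): take the inter-block link if this processor is an exit point.
        if cur in inter_block_link:
            cur = inter_block_link[cur]
            path.append(cur)
        # Step 3(b): shift the address list.
        fields = shift_fields(fields)
    return path


# Hypothetical instance with n = 4, using the intermediate blocks of the example below.
n = 4
a1, b1, g1, x1, y1, z1 = 1, 2, 3, 3, 2, 1
a2, b2, g2, x2, y2, z2 = 4, 1, 2, 3, 1, 2
source = (a1, b1, g1, x1, y1, z1)
dest = (a2, b2, g2, x2, y2, z2)
fields = [(a1, b1, g1, 1, y1, a2), (a2, b1, g1, b2, 1, a1), (a2, b2, g1, b1, g2, n), dest]
links = {
    fields[0]: (a2, b1, g1, n, y1, a1),
    fields[1]: (a2, b2, g1, b1, n, a1),
    fields[2]: (a2, b2, g2, b1, g1, 1),
}
path = route(source, fields, links)
```

Passing the links as a dictionary keeps the sketch independent of the full connection rules of the 3D MM; only the three exit/entry pairs of the chosen path are needed.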
Example:
A possible path PT1 from the source processor (which is in the block (α1, β1, γ1)) to the 
destination processor (in the block (α2, β2, γ2)) using the intermediate blocks (α2, β1, 
γ1) and (α2, β2, γ1) may be formulated as follows:
PT1 = P(α1, β1, γ1, x1, y1, z1) → ... → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1)
→ ... → P(α2, β1, γ1, β2, 1, α1) → P(α2, β2, γ1, β1, n, α1) → ... → P(α2, β2, γ1, β1, γ2, n)
→ P(α2, β2, γ2, β1, γ1, 1) → ... → P(α2, β2, γ2, x2, y2, z2).
Initially the following four fields are appended to the data packet:
1. (α1, β1, γ1, 1, y1, α2) as Field1
2. (α2, β1, γ1, β2, 1, α1) as Field2
3. (α2, β2, γ1, β1, γ2, n) as Field3
4. (α2, β2, γ2, x2, y2, z2) as Field4
The source processor first checks Field1. If the address in Field1 is not that of the current 
processor, the message is routed towards the processor P(α1, β1, γ1, 1, y1, α2) via 
intra-block links. If the address in Field1 is that of the current processor, the message is 
routed to the processor P(α2, β1, γ1, n, y1, α1) using the inter-block link and the list of 
four fields is updated as follows:
Field1 ← (α2, β1, γ1, β2, 1, α1)
Field2 ← (α2, β2, γ1, β1, γ2, n)
Field3 ← (α2, β2, γ2, x2, y2, z2)
Field4 ← NULL
Then the processor P(α2, β1, γ1, n, y1, α1) checks Field1 and routes the message to 
the processor named in Field1, P(α2, β1, γ1, β2, 1, α1), using intra-block links. Processor 
P(α2, β1, γ1, β2, 1, α1) has the same address as Field1, so it routes the message to processor 
P(α2, β2, γ1, β1, n, α1) via the inter-block link and updates the four fields as follows:
Field1 ← (α2, β2, γ1, β1, γ2, n)
Field2 ← (α2, β2, γ2, x2, y2, z2)
Field3 ← NULL 
Field4 ← NULL
Processor P(α2, β2, γ1, β1, n, α1) checks Field1 and routes the message to the
processor named in Field1, that is P(α2, β2, γ1, β1, γ2, n), using intra-block links. Processor
P(α2, β2, γ1, β1, γ2, n) sends the message to P(α2, β2, γ2, β1, γ1, 1) using the inter-block 
link and updates the four fields as follows:
Field1 ← (α2, β2, γ2, x2, y2, z2)
Field2 ← NULL 
Field3 ← NULL 
Field4 ← NULL
The processor P(α2, β2, γ2, β1, γ1, 1) checks Field1 and routes the message to the 
address in Field1, that is (α2, β2, γ2, x2, y2, z2), using intra-block links and updates the 
four fields as follows:
Field1 ← NULL 
Field2 ← NULL 
Field3 ← NULL 
Field4 ← NULL
As the value of Field1 is now NULL, the routing process terminates.
3.4 Summation/Average/Minimum/Maximum in the 3D Multi-Mesh
We may use the 3D MM to compute the sum of up to n^6 data values stored in the n^6 
processors of a 3D MM of order n. The same idea may be used to compute the average, 
maximum or minimum of up to n^6 data values. The scheme we use is similar to that used 
in [DDS99] for the Multi-Mesh. We assume that each processor has three registers X, Y 
and Z for data communication along the three axes and will use X(α, β, γ, x, y, z) (Y(α, β, 
γ, x, y, z) and Z(α, β, γ, x, y, z)) to denote the X (respectively Y and Z) register in 
processor P(α, β, γ, x, y, z). The data is initially in register Z of all n^6 processors in the 
3D MM. The main idea of the algorithm is to
i) compute, in parallel, the sum of all numbers in each 3D block,
ii) communicate the partial sums to blocks B(1, β, γ), 1 ≤ β, γ ≤ n,
iii) compute the sum of the partial sums of all numbers in B(1, β, γ), 1 ≤ β, γ ≤ n and 
communicate the partial sums to blocks B(1, 1, γ), 1 ≤ γ ≤ n,
iv) compute the sum of the partial sums of all numbers in B(1, 1, γ), 1 ≤ γ ≤ n and 
communicate the result to block B(1, 1, 1).
The algorithm is as follows:
Algorithm Sum 
Step 1
∀ α, β, γ, x, y, 1 ≤ α, β, γ, x, y ≤ n do in parallel 
for k = n − 1 downto 1 do
Z(α, β, γ, x, y, k) ← Z(α, β, γ, x, y, k + 1) + Z(α, β, γ, x, y, k);
/* Z(α, β, γ, x, y, 1) now contains the partial sum of n values */
Y(α, β, γ, x, y, 1) ← Z(α, β, γ, x, y, 1); 
for j = n − 1 downto 1 do
Y(α, β, γ, x, j, 1) ← Y(α, β, γ, x, j + 1, 1) + Y(α, β, γ, x, j, 1);
/* Y(α, β, γ, x, 1, 1) now contains the partial sum of n^2 values */
X(α, β, γ, x, 1, 1) ← Y(α, β, γ, x, 1, 1); 
for i = n − 1 downto 1 do
X(α, β, γ, i, 1, 1) ← X(α, β, γ, i + 1, 1, 1) + X(α, β, γ, i, 1, 1);
/* X(α, β, γ, 1, 1, 1) now contains the partial sum of n^3 values */
Y(α, β, γ, 1, 1, 1) ← X(α, β, γ, 1, 1, 1);
Y(α, β, 1, 1, γ, n) ← Y(α, β, γ, 1, 1, 1);
/* Using the link (γ ↔ y) the partial sums in blocks B(α, β, *) are transferred to 
blocks B(α, β, 1) */
Step 2
∀ α, β, 1 ≤ α, β ≤ n do in parallel 
for j = n − 1 downto 1 do
Y(α, β, 1, 1, j, n) ← Y(α, β, 1, 1, j + 1, n) + Y(α, β, 1, 1, j, n);
/* Y(α, β, 1, 1, 1, n) now contains the partial sum of n^4 values */
Y(α, β, 1, 1, 1, 1) ← Y(α, β, 1, 1, 1, n); /* Using the link (γ ↔ y) */
X(α, β, 1, 1, 1, 1) ← Y(α, β, 1, 1, 1, 1);
X(α, 1, 1, β, n, 1) ← X(α, β, 1, 1, 1, 1); /* Using the link (β ↔ x) */
Step 3
∀ α, 1 ≤ α ≤ n do in parallel 
for i = n − 1 downto 1 do
X(α, 1, 1, i, n, 1) ← X(α, 1, 1, i + 1, n, 1) + X(α, 1, 1, i, n, 1);
/* X(α, 1, 1, 1, n, 1) now contains the partial sum of n^5 values */
X(α, 1, 1, 1, 1, 1) ← X(α, 1, 1, 1, n, 1); /* Using the link (β ↔ x) */
Z(α, 1, 1, 1, 1, 1) ← X(α, 1, 1, 1, 1, 1);
Z(1, 1, 1, n, 1, α) ← Z(α, 1, 1, 1, 1, 1); /* Using the link (α ↔ z) */
Step 4
for k = n − 1 downto 1 do
Z(1, 1, 1, n, 1, k) ← Z(1, 1, 1, n, 1, k + 1) + Z(1, 1, 1, n, 1, k);
/* Z(1, 1, 1, n, 1, 1) now contains the sum of n^6 values */
Z(1, 1, 1, 1, 1, 1) ← Z(1, 1, 1, n, 1, 1); /* Using the link (α ↔ z) */
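As a sanity check on the data movement in Algorithm Sum, the four steps can be simulated sequentially in Python. This is only a sketch under assumptions: the function name mm3d_sum and the dictionary-based registers are illustrative, the "do in parallel" loops are replaced by ordinary loops, register-to-register copies within one processor are folded into the following transfer, and the final move to P(1, 1, 1, 1, 1, 1) is omitted because it does not change the value. It assumes n ≥ 2.

```python
from itertools import product


def mm3d_sum(n, data):
    """Simulate Algorithm Sum; data maps (alpha, beta, gamma, x, y, z) -> value."""
    R = range(1, n + 1)
    down = range(n - 1, 0, -1)          # n-1 downto 1
    Z, Y, X = dict(data), {}, {}

    # Step 1: sum inside every block along z, then y, then x.
    for a, b, g in product(R, R, R):
        for x, y in product(R, R):
            for k in down:
                Z[a, b, g, x, y, k] += Z[a, b, g, x, y, k + 1]
            Y[a, b, g, x, y, 1] = Z[a, b, g, x, y, 1]
        for x in R:
            for j in down:
                Y[a, b, g, x, j, 1] += Y[a, b, g, x, j + 1, 1]
            X[a, b, g, x, 1, 1] = Y[a, b, g, x, 1, 1]
        for i in down:
            X[a, b, g, i, 1, 1] += X[a, b, g, i + 1, 1, 1]
        # Link (gamma <-> y): block sum goes to block B(a, b, 1).
        Y[a, b, 1, 1, g, n] = X[a, b, g, 1, 1, 1]

    # Step 2: combine the n block sums held in B(a, b, 1), then use link (beta <-> x).
    for a, b in product(R, R):
        for j in down:
            Y[a, b, 1, 1, j, n] += Y[a, b, 1, 1, j + 1, n]
        X[a, 1, 1, b, n, 1] = Y[a, b, 1, 1, 1, n]

    # Step 3: combine along x in B(a, 1, 1), then use link (alpha <-> z).
    for a in R:
        for i in down:
            X[a, 1, 1, i, n, 1] += X[a, 1, 1, i + 1, n, 1]
        Z[1, 1, 1, n, 1, a] = X[a, 1, 1, 1, n, 1]

    # Step 4: final combination along z in B(1, 1, 1).
    for k in down:
        Z[1, 1, 1, n, 1, k] += Z[1, 1, 1, n, 1, k + 1]
    return Z[1, 1, 1, n, 1, 1]


# Usage: one distinct value per processor of a 3D MM of order 2 (64 processors).
cells = list(product(range(1, 3), repeat=6))
values = {cell: index for index, cell in enumerate(cells)}
total = mm3d_sum(2, values)
```

Replacing the addition by min or max in the same data flow gives the minimum or maximum, as noted at the start of this section.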
We will now analyze the time needed for this algorithm. Let tc denote the time for one
communication, assuming that inter-block and intra-block communication take the same
time, and let ta denote the time for one addition. In the steps where we have used an
addition (+) operation, the operation is actually one data communication and one addition,
so that the total time needed for the operation is tc + ta. Step 1 takes n(tc + ta) + tc + n(tc + ta)
+ tc + n(tc + ta) + tc = (3n + 3) tc + 3n ta time units. Step 2 takes n(tc + ta) + 2tc = (n + 2) tc +
n ta time units. Step 3 takes n(tc + ta) + 2tc = (n + 2) tc + n ta time units. Step 4 takes n(tc +
ta) + tc = (n + 1) tc + n ta time units. The total time required is (6n + 8) tc + 6n ta time units. Thus
the algorithm to compute the sum of n^6 numbers on the 3D MM is O(n). This may be
compared to the time O(n) to compute the sum of n^4 numbers on the Multi-Mesh.
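The per-step counts above can be tallied mechanically. The short Python sketch below (the function name sum_time is an assumption made for illustration) adds the communication and addition counts of the four steps and can be used to confirm the stated total of (6n + 8) tc + 6n ta.

```python
def sum_time(n):
    """Return (number of tc units, number of ta units) for Algorithm Sum."""
    steps = [
        (3 * n + 3, 3 * n),  # Step 1: three passes of n(tc + ta) + tc
        (n + 2, n),          # Step 2: n(tc + ta) + 2tc
        (n + 2, n),          # Step 3: n(tc + ta) + 2tc
        (n + 1, n),          # Step 4: n(tc + ta) + tc
    ]
    communications = sum(tc for tc, _ in steps)
    additions = sum(ta for _, ta in steps)
    return communications, additions


cost_order_4 = sum_time(4)
```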
In this chapter we have defined the 3D Multi-Mesh architecture, studied the diameter and 
connectivity of this network, and have developed two important algorithms for the 3D 
MM.
Chapter 4
Optical Implementation of Multi-Mesh Links
In order to improve the performance of multiprocessor systems, the use of very high-speed 
communication technology is crucial. As mentioned in chapter 2, optical technology is a 
promising way to avoid the limitations of copper-based electronic technology for 
inter-processor communication. We have reviewed a number of optics-based 
interconnection schemes as well as hybrid (electronic and optical) schemes in chapter 2. 
To our knowledge there is no research on implementing the Multi-Mesh using optical 
technology. In this chapter we discuss how the Multi-Mesh architecture may be 
implemented using optical technology and describe a number of possible 
approaches for designing optics-based interconnections for the Multi-Mesh. Our results 
may be extended to define the 3D Multi-Mesh using optical technology.
We have already mentioned that the Multi-Mesh (MM) network discussed in Chapter 2 
has attractive topological attributes. In a MM network of order n, there are n^2 blocks 
(where a block is a mesh of processors) arranged in the form of an n × n matrix. In 
chapter 2, Figure 2.11 shows a MM of order 3. In a MM network, the processors within a 
block are connected by intra-block links to other processors in the same block. Some 
processors of different blocks are connected by inter-block links. We now show how we 
can implement the inter-block connections of the MM network by using the wavelength-routed 
WDM technology that we also reviewed in chapter 2. To present our design, it is
convenient for us to separate the intra-block connections from the inter-block 
connections. We show a MM network of order 4 in Figure 4.1 where each square 
represents a two-dimensional mesh and inter-block connections are omitted. We show a 
single 2-dimensional block of order 4 in Figure 4.2 where each circle represents a 
processor and each edge represents an intra-block connection.
Figure 4.1: 4 × 4 blocks of a Multi-Mesh network of order 4 (Bij: a block in the ith row and jth column of the MM)
Figure 4.2: A block of a Multi-Mesh network of order 4
We note that intra-block connections are short as compared to the length of inter-block 
connections and have a constant length. It is convenient to implement such connections 
using VLSI technology or by using free space optical communication. In this thesis we 
will only look at a hybrid approach where we use metal lines when fabricating the array 
of processors using VLSI technology. The alternative of using free space optical 
communication as proposed in [LoSu94b] is quite straightforward.
The more challenging task is to realize the inter-block connections, since the length of 
such a connection varies and becomes very long for large networks.
There are two novel and interesting features of our implementation of inter-block 
connections:
> The first attractive feature is that we have used wavelength-routed WDM 
networks rather than WDM networks based on passive star couplers [Mu97]. It 
is well known that the power requirements for passive star couplers make them 
unsuitable for large networks [Mu97]. In other words, using our approach, we 
can easily define larger networks with a relatively lower power budget.
> The second interesting feature is that we have incorporated fault tolerance using a 
protection scheme [Ge98], [RaMu99b], [SRM02]. The idea is that each pair of 
processors that are connected by an optical link will have 2 edge-disjoint optical 
paths - the primary path and the back-up path. If there is a failure in the primary 
path, only the router settings have to be changed so that the back-up path can
be used. This means that, in the case of a single failure in the optical part of the 
network, the overall routing scheme does not have to be changed and the 
network diameter is not affected.
To our knowledge no other interconnection network has used these two ideas.
To realize the inter-block connections, our tasks are to
> define a physical topology consisting of fibers, routers and end-nodes (the sources 
or destinations of data). In the case of a Multi-Mesh, the end-nodes are the 
boundary processors of each block in the Multi-Mesh.
> define a logical topology on the physical topology such that for every undirected 
inter-block link between x and y in a Multi-Mesh there is a logical edge x → y 
and a logical edge y → x in the logical topology. For economic reasons, we wish 
to use as few wavelengths as possible.
In a wavelength-routed network given a physical topology, in order to define a logical
topology, we have to
>  determine which processors need to be connected by a lightpath,
>  determine a viable route and a wavelength for each lightpath (RWA problem) 
[Mu97], [StBa99] .
Since we are implementing a known pattern of connections (as defined by the inter-block 
connection rules of the Multi-Mesh), the lightpaths we need are already defined. In the 
following sections, we first discuss possible physical topologies for this problem and 
present ways to handle the RWA problem to realize the desired connections for the 
fault-free and faulty situations that we have considered.
4.1 Physical Topology for Optical Communication in a Multi-Mesh
In our scheme we propose to use n^2 routers - one for each of the n^2 blocks. Figure 4.3 
shows part of our physical topology where a square represents a block (which, as 
explained earlier, is a mesh of processors) and an oval represents an optical router. All 
the routers are arranged in the form of a two-dimensional grid. To simplify the diagram 
we have not shown the connections from the boundary processors to the routers. As 
shown in Figure 4.3, the routers are connected in the form of a torus. For 
clarity, we have shown the wrap-around links only for the first and the last rows and 
columns.
Figure 4.3: Connections between Routers in a Multi-Mesh network of order 4 (legend: a router connected to row i and column j; a bidirectional link)
At this point, in Figure 4.3, we have used undirected links. Later on, we will implement 
such links using either unidirectional links or bi-directional links. If there is a 
unidirectional link x → y, it means there is a fiber allowing communication from node x 
to node y. It is not necessarily true that there will be a fiber allowing communication 
from node y to node x. In the case of a bi-directional link x ↔ y, there will always be two 
fibers - one allowing communication from x to y and one for communication from y to x.
Now we will discuss how we propose to connect the boundary processors of a block to a 
router. We will discuss in detail the physical topology corresponding to the connections 
from the boundary processors on the top and the bottom edges of block Bij. The physical 
topology corresponding to the connections from the boundary processors on the right and 
the left edges of block Bij is similar.
Figure 4.4: Outputs of multiplexers are connected to the inputs of router
Router Rij will be connected to the corresponding block Bij by fibers carrying incoming and 
outgoing optical signals as follows:
1) the router Rij will be connected to block Bij with one fiber carrying signals from 
processors P(i, j, 1, k) of block Bij for communication to processor P(k, j, n, i) of 
block Bkj, for all k, 1 ≤ k ≤ n, k ≠ j. This may be easily achieved by using a 
multiplexer M_ij^1, shown in Figure 4.4, with inputs from processors P(i, j, 1, k), for
all k, 1 ≤ k ≤ n. The fiber carrying the output of multiplexer M_ij^1 is connected as
an input to router Rij as shown in Figure 4.4. We will later use this fiber to define 
logical edges corresponding to the inter-block connections from the first row of
the block Bij to the nth row of the other blocks in the same column.
2) the router Rij will be connected to block Bij with one fiber carrying signals from 
processors P(i, j, n, k) of block Bij to processor P(k, j, 1, i) of block Bkj, for all k, 
1 ≤ k ≤ n, k ≠ j. This may be easily achieved by using a multiplexer M_ij^2, shown
in Figure 4.4, with inputs from processors P(i, j, n, k), for all k, 1 ≤ k ≤ n. The 
fiber carrying the output of multiplexer M_ij^2 is connected as an input to router Rij as
shown in Figure 4.4. We will later use this fiber to define logical edges 
corresponding to the inter-block connections from the nth row of the block Bij to 
the first row of the other blocks in the same column.
Figure 4.5: Inputs of the demultiplexers are connected to the output of router
3) the router Rij will be connected to block Bij with one fiber carrying signals from
processors P(k, j, n, i) of block Bkj to processor P(i, j, 1, k) of block Bij, for all k,
1 ≤ k ≤ n, k ≠ j. This may be easily achieved by using a de-multiplexer D_ij^1,
shown in Figure 4.5, with inputs from processors P(k, j, n, i) for all k, 1 ≤ k ≤ n. 
The fiber carrying the input to de-multiplexer D_ij^1 is an output from the router Rij 
as shown in Figure 4.5. We will later use this fiber to define the logical edges
corresponding to the inter-block connections from the nth row of the blocks Bkj to 
the first row of the block Bij in the same column.
4) the router Rij will be connected to block Bij with one fiber carrying signals from
processors P(k, j, 1, i) of block Bkj to processor P(i, j, n, k) of block Bij, for all k,
1 ≤ k ≤ n, k ≠ j. This may be easily achieved by using a de-multiplexer D_ij^2,
shown in Figure 4.5, with inputs from processors P(k, j, 1, i) for all k, 1 ≤ k ≤ n. 
The fiber carrying the input to the de-multiplexer D_ij^2 is an output from router Rij as
shown in Figure 4.5. We will later use this fiber to define the logical edges 
corresponding to the inter-block connections from the first row of other blocks Bkj 
to the nth row of the block Bij in the same column.
The Figure 4.6 only shows the ith column of a Multi-Mesh and the four fiber links 
(Figure 4.6 legend: blocks in the ith column; a router; a unidirectional fiber)
Figure 4.6: Connection between router Rn and block Bn
4.1.1 Physical topology using unidirectional links
The block diagram shown in Figure 4.7 is identical to that shown in Figure 4.3 except 
that the links have directions as shown.
Figure 4.7: A MM network based on unidirectional links (legend: a unidirectional link)
We are discussing here only the implementation of the vertical inter-block connections 
since the horizontal inter-block connections may be achieved in exactly the same way.
4.1.2 Physical topology using bidirectional link
If we use bi-directional links, the only difference is that a link between router x and 
router y actually corresponds to a fiber from x to y and a fiber from y to x. As shown in 
Figure 4.8, we denote a link between x and y by x ↔ y.
Figure 4.8: A MM network based on bidirectional links (legend: ↔ denotes a bidirectional link)
4.2 Logical Topology for a Fault-Free Multi-Mesh
As mentioned earlier, our logical topology must have a directed edge for each inter-block 
connection. Here we only discuss the vertical inter-block links since the case of the 
horizontal inter-block links is identical. In a Multi-Mesh of order n, the boundary 
processors on the top (bottom) edge of block B(α, β) are connected to the boundary 
processors on the bottom (top) edge of blocks B(*, β). In other words, processors P(α, 
β, 1, γ) (P(α, β, n, γ)) are connected to processor P(γ, β, n, α) (P(γ, β, 1, α)), for all γ, 1 
≤ γ ≤ n, γ ≠ α.
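The vertical connection rule just stated can be enumerated directly. The following Python sketch (the function name vertical_links is an assumption made for illustration) lists, for one block column β, the directed logical edges required by the rule, with processors written as tuples (α, β, row, column).

```python
def vertical_links(n, beta):
    """Directed vertical inter-block edges in block column beta of a MM of order n."""
    links = set()
    for alpha in range(1, n + 1):
        for gamma in range(1, n + 1):
            if gamma == alpha:
                continue  # the rule excludes gamma = alpha
            # Top edge of B(alpha, beta) to bottom edge of B(gamma, beta).
            links.add(((alpha, beta, 1, gamma), (gamma, beta, n, alpha)))
            # Bottom edge of B(alpha, beta) to top edge of B(gamma, beta).
            links.add(((alpha, beta, n, gamma), (gamma, beta, 1, alpha)))
    return links


column_links = vertical_links(4, 2)
```

Each undirected link appears as two directed edges, one in each direction, which matches the requirement that the logical topology contain both x → y and y → x.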
In our problem, we need two lightpaths from each block Bα,β to block Bγ,β - one for the 
connection from processor P(α, β, 1, γ) to P(γ, β, n, α) and one for the connection from
processor P(α, β, n, γ) to P(γ, β, 1, α), for all α, γ, 1 ≤ α, γ ≤ n. We now look at the ring 
consisting only of the routers in column number β and the fibers connecting them. We 
may view the block Bα,β as the end node connected, by the multiplexer collecting lightpaths 
from all processors on the top edge of the block, to router Rα,β. The set of lightpaths from 
the top edges of the blocks Bα,β, 1 ≤ α ≤ n, defines a completely connected ring. Similarly 
the set of lightpaths from the bottom edges of the blocks Bα,β, 1 ≤ α ≤ n, defines another 
completely connected ring. In summary our problem is to define complete connectivity 
for a unidirectional ring using a set of wavelengths, say {λ1, λ2, ..., λK}. This constitutes the 
set of connections from all the processors on the top edges of the blocks in column β. Then we 
define an independent second set of complete connections simply by using another set of 
wavelengths {λK+1, λK+2, ..., λ2K}. This second set constitutes the set of connections from 
all the processors on the bottom edges of the blocks in column β. Research has already been 
done on wavelength assignment in bi-directional WDM rings [StBa99] and a recursive 
procedure for wavelength assignment for complete connectivity in bi-directional rings 
has been reported [EBC98].
4.2.1 Logical topology using unidirectional links
We now describe our process for assigning routes and wavelengths to each lightpath to 
define complete connectivity for a unidirectional ring. Due to the symmetric nature of our 
network, we have chosen a straightforward route for our lightpaths - we will use only the 
fibers connecting routers in column β when defining lightpaths from any block in column 
β to any other block in the same column. We will use the following algorithm to assign
wavelengths to each lightpath. This algorithm assumes that there is a unidirectional ring 
with N nodes, 1 < N < n, which are assigned numbers 1, 2, ..., N, with wavelengths 
already assigned to them for complete connectivity. The algorithm simply puts a new 
node (node N + 1) in any desired position on the ring and assigns wavelengths for 
communication from every existing node to the new node and wavelengths for 
communication from the new node to every existing node. We assume that node N + 1 is 
placed after node i in the network, as shown in Figure 4.9. We will use new wavelengths 




Figure 4.9: Inserting the (N+1)th node in a unidirectional ring
Algorithm Assign-wavelength
Step 1) repeat steps 2 and 3 for all j, 1 ≤ j ≤ N
Step 2) assign wavelength λN+1,j for communication from node j to node N + 1 
Step 3) assign wavelength λN+1,j for communication from node N + 1 to node j.
To assign wavelengths for all nodes 1, 2, ..., n, we simply start with 2 nodes, which 
require 1 wavelength for communication, and then keep adding nodes 3, 4, ..., n. The total 
number of wavelengths needed is 1 + 2 + ... + (n − 1) = n(n − 1)/2.
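The incremental assignment can be sketched in Python as follows. This is an illustrative sketch, not the thesis code: wavelengths are represented by integer identifiers, and the helper names (assign_wavelengths, fibers_used, is_conflict_free) are assumptions. The check relies on the observation that, in a unidirectional ring, the lightpaths j → N + 1 and N + 1 → j traverse complementary arcs and therefore may share one wavelength.

```python
def assign_wavelengths(n):
    """Map each ordered node pair (src, dst) to a wavelength id, adding nodes 2..n in turn."""
    wavelength, next_id = {}, 0
    for new in range(2, n + 1):          # insert node `new` into a ring of new-1 nodes
        for j in range(1, new):
            wavelength[(j, new)] = next_id   # Step 2: node j -> new node
            wavelength[(new, j)] = next_id   # Step 3: new node -> node j, same wavelength
            next_id += 1
    return wavelength


def fibers_used(src, dst, ring):
    """Fibers (directed hops) traversed from src to dst following the ring order."""
    i, fibers = ring.index(src), []
    while ring[i] != dst:
        nxt = (i + 1) % len(ring)
        fibers.append((ring[i], ring[nxt]))
        i = nxt
    return fibers


def is_conflict_free(wavelength, ring):
    """True if no fiber carries two lightpaths on the same wavelength."""
    seen = set()
    for (src, dst), w in wavelength.items():
        for fiber in fibers_used(src, dst, ring):
            if (fiber, w) in seen:
                return False
            seen.add((fiber, w))
    return True


assignment = assign_wavelengths(5)
used = len(set(assignment.values()))
```

Because the two lightpaths of a pair never share a fiber, the check holds for any placement of the nodes on the ring, which is why the new node may be put in any desired position.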
4.2.2 Logical topology using bidirectional links
We will use the same route chosen in the previous section so that we will again use only 
the fibers connecting routers in column β when defining lightpaths from any block in 
column β to any other block in the same column. An algorithm already exists for 
assigning routes and wavelengths to each lightpath to define complete connectivity for a 
bidirectional ring [EBC98]. Its authors also chose shortest-path routing and have described a 
recursive algorithm to determine the wavelengths needed for complete connectivity. We 
will use their algorithm. Since we need to define two lightpaths from each end node to 
every other end node, we will need K = (n^2 − 1)/8 wavelengths.
4.3 Robust Logical Topology for a Multi-Mesh
Faults in interconnection networks have been investigated for a long time [Ge98], 
[RaMu99b], [SRM02]. The standard approach in designing fault tolerant interconnection 
networks is that, in the case of faults, we have to determine a path edge (node) disjoint 
from the faulty edge (node). In other words, to avoid faults, in the standard approach, the 
message has to use a different routing algorithm where the message passes through a 
sequence of processors different from that used in the absence of faults.
We propose a scheme for tolerating faults in the optical path that can be included at 
relatively little cost. This scheme uses path protection, so that if a fault affects a 
number of lightpaths, we can use alternate optical paths. In other words, 
even though the optical path used in sending a message changes, the diameter and 
the routing algorithm remain the same.
We now discuss how we may handle the case of faults in the logical topology. In our 
scheme we make the following assumptions:
> our physical topology uses bidirectional links,
> we do not have to deal with more than one fault at a time.
We will use path protection schemes [SRM02], which have been proposed recently for wide 
area optical networks. In a path protection scheme, when defining lightpaths, it is ensured 
that additional optical resources are included in the network so that every lightpath 
affected by any fault in the network may be rerouted to avoid the faulty element. In the 
absence of faults, primary paths are used for all communication [GeRa00]. When a single 
fault occurs, a number of lightpaths passing through the fault (resulting from a cut in the 
fiber, fault in the receiver or transmitter) will no longer be usable. Each of these failed 
lightpaths must be rerouted so that they use a backup path that does not use the faulty 
element affecting its primary path. To achieve this, we have to make sure that the spare 
capacity in the optical part of the network is sufficient to allow the creation of such 
backup lightpaths when needed. At the same time, for reasons of economy, the amount 
of additional resources needed to guarantee that backup paths may be created in all 
possible situations must be kept to a minimum. Shared path protection is used to
minimize the additional overhead needed to create the backup paths [RaMu99a]. In 
shared path protection, the following rules must be followed [RaMu99a]:
>  the primary path and the backup path for a given lightpath must be edge-disjoint,
>  two primary paths sharing a fiber must be assigned different wavelengths,
>  two backup paths may share a fiber as well as have the same wavelength, provided 
the corresponding primary paths are edge-disjoint (since we assume a single fault, 
these two fiber-disjoint primary paths cannot fail at the same time).
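The three rules can be checked mechanically. The Python sketch below is an illustration only: the dictionary layout, the function names, the node names, and the treatment of a fiber as an unordered pair of nodes are assumptions made for the example and do not come from [RaMu99a].

```python
from itertools import combinations


def edges(path):
    """Fibers along a path (a list of nodes), each as an unordered node pair."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def rule_violations(lightpaths):
    """List the shared path protection rules broken by a set of lightpaths.

    Each lightpath is a dict with keys 'primary' and 'backup' (node lists)
    and 'primary_wl' and 'backup_wl' (wavelength labels).
    """
    problems = []
    for name, lp in lightpaths.items():
        # Rule 1: primary and backup of one lightpath must be edge-disjoint.
        if edges(lp["primary"]) & edges(lp["backup"]):
            problems.append(f"{name}: primary and backup share a fiber")
    for (a, la), (b, lb) in combinations(lightpaths.items(), 2):
        primaries_overlap = bool(edges(la["primary"]) & edges(lb["primary"]))
        # Rule 2: primaries sharing a fiber need different wavelengths.
        if primaries_overlap and la["primary_wl"] == lb["primary_wl"]:
            problems.append(f"{a}/{b}: primaries share a fiber and a wavelength")
        # Rule 3: backups may share a channel only if the primaries are edge-disjoint.
        backups_clash = (bool(edges(la["backup"]) & edges(lb["backup"]))
                         and la["backup_wl"] == lb["backup_wl"])
        if backups_clash and primaries_overlap:
            problems.append(f"{a}/{b}: backups share a channel but primaries can fail together")
    return problems


# Hypothetical example: two lightpaths whose backups share fiber B-E and wavelength b1.
example = {
    "L1": {"primary": ["A", "B"], "backup": ["A", "D", "E", "B"],
           "primary_wl": "w1", "backup_wl": "b1"},
    "L2": {"primary": ["B", "C"], "backup": ["B", "E", "F", "C"],
           "primary_wl": "w1", "backup_wl": "b1"},
}
print(rule_violations(example))
```

The example is accepted because the two primary paths are edge-disjoint, so under the single-fault assumption at most one of the two backup lightpaths is ever needed.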
We will now discuss how we may incorporate a path protection scheme into the 
bidirectional optical networks discussed earlier in this chapter. In describing this scheme, 
we have to
> indicate the primary paths for each of the inter-block connections,
> indicate how we can define protection paths to handle every possible fault,
>  calculate the cost of such a scheme.
Our scheme uses, for primary paths, the same paths we used in defining the logical 
topology of fault-free networks using bidirectional links. We will use the same set of K = 
(n^2 - 1)/8 wavelengths used there.
We now consider the case of a single fault. In describing the approach we will use 
addition (+) or subtraction on rows and columns. It should be noted that there is a 
“wrap-around”, so that row (or column) n is followed by row (or column) 1. We only discuss the 
failure of communication in the downward direction, say from router R_{αβ} to router
R_{(α+1)β}. This failure may be due to a failure in the fiber or due to a fault in either 
router R_{αβ} or router R_{(α+1)β}. The case of a failure of communication in the horizontal direction 
or in the upward direction is identical. We note that the fiber from router R_{αβ} to router 
R_{(α+1)β} is used by a primary lightpath from block B_{δβ} to block B_{(δ+m)β} if and only if m ≤ 
n/2 and δ ≤ α < δ + m. Our scheme for setting up backup lightpaths therefore only needs 
to consider the lightpaths that fail due to a fault in communication from router 
R_{αβ} to router R_{(α+1)β}. For convenience, we group these primary lightpaths and 
assign labels to them as follows:
Group 1, consisting of the following lightpaths to block B_{(α+1)β}:
o lightpath PL_{(α+1),1} from block B_{αβ}
o lightpath PL_{(α+1),2} from block B_{(α-1)β}
o ...
o lightpath PL_{(α+1),(n/2)} from block B_{(α-n/2+1)β}
Group 2, consisting of the following lightpaths to block B_{(α+2)β}:
o lightpath PL_{(α+2),1} from block B_{αβ}
o lightpath PL_{(α+2),2} from block B_{(α-1)β}
o ...
o lightpath PL_{(α+2),(n/2-1)} from block B_{(α-n/2+2)β}
...
Group n/2, consisting of the following lightpath to block B_{(α+n/2)β}:
o lightpath PL_{(α+n/2),1} from block B_{αβ}
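The grouping can be enumerated with a short Python sketch. The function names are hypothetical and n is assumed to be even; group i holds the lightpaths that end in row α + i, and the k-th lightpath of a group starts in row α - k + 1, with wrap-around on the rows.

```python
def wrap(row, n):
    """Map any integer row index onto 1..n with wrap-around."""
    return (row - 1) % n + 1


def affected_primary_lightpaths(n, alpha, beta):
    """List (group, k, source_block, destination_block) for a fault on the
    fiber from router (alpha, beta) to router (alpha + 1, beta)."""
    result = []
    for i in range(1, n // 2 + 1):          # group i ends at row alpha + i
        for k in range(1, n // 2 - i + 2):  # k-th lightpath starts at row alpha - k + 1
            src = (wrap(alpha - k + 1, n), beta)
            dst = (wrap(alpha + i, n), beta)
            result.append((i, k, src, dst))
    return result


# Order 8 with the fault between rows 4 and 5 of column 3.
for group, k, src, dst in affected_primary_lightpaths(8, 4, 3):
    print(f"group {group}: PL{dst[0]}{k} from B{src[0]}{src[1]} to B{dst[0]}{dst[1]}")
```

For n = 8, α = 4 and β = 3 this lists ten lightpaths in four groups of sizes 4, 3, 2 and 1.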
Example:
Figure 4.10 shows an example of an MM network of order 8 where a fault has occurred.
Legend: B_ij, a block in the ith row and jth column; R_ij, a router in the ith row and jth column
Figure 4.10: A faulty link in a multi-mesh of order 8
The primary lightpaths in the fibers of different groups are as shown in Table 4.1.

Table 4.1
Group  Primary lightpath  From block  To block
1      PL51               B43         B53
       PL52               B33         B53
       PL53               B23         B53
       PL54               B13         B53
2      PL61               B43         B63
       PL62               B33         B63
       PL63               B23         B63
3      PL71               B43         B73
       PL72               B33         B73
4      PL81               B43         B83

In our scheme for defining backup lightpaths, if n is even, we need n/2 additional 
wavelengths {λ_1, ..., λ_{n/2}}. To define a backup lightpath for
each primary lightpath affected by a fault anywhere in the network, we specify the route 
and the wavelength for each of the backup lightpaths that must replace an affected 
primary lightpath as follows:
a) To replace the primary lightpath PL_{(α+i),k} in group i, we use column β + i to route 
the backup lightpath BL_{(α+i),k}, 1 ≤ i ≤ n/2, 1 ≤ k ≤ n/2 - i + 1.
b) We assign wavelength λ_p to backup lightpath BL_{(α+i),k}, where p = ((1 - i) ⊕ (k - 1)) + 1 
and ⊕ denotes addition modulo n/2.
We note that when calculating p, we use “wrap-around”, so that λ_1 is preceded by λ_{n/2}.
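Rule (b) can be written as a one-line function. The sketch below assumes that the wrap-around in rule (b) means addition modulo n/2, with the result shifted into the range 1..n/2; this reading is an interpretation, chosen because it reproduces the wavelengths listed in Table 4.2. The function name is hypothetical.

```python
def backup_wavelength_index(i, k, n):
    """Index p (1..n/2) of the wavelength for the k-th backup lightpath of group i."""
    half = n // 2
    # Python's % always returns a value in 0..half-1 for a positive modulus,
    # so negative values of (1 - i) wrap around as required.
    return ((1 - i) + (k - 1)) % half + 1


# Order 8: four additional wavelengths, indices 1..4.
for i in range(1, 5):
    indices = [backup_wavelength_index(i, k, 8) for k in range(1, 4 - i + 2)]
    print(f"group {i}: {indices}")
```

Within one group consecutive lightpaths receive consecutive wavelengths, and each new group starts one wavelength earlier than the previous one.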
Example
Figure 4.11 shows the same multi-mesh shown in Figure 4.10 but includes all the routers 
and the horizontal and vertical fibers. We have omitted the wraparound connections to 
simplify the diagram, and we have not shown the connections from a block to the routers. 
Once again we consider the fault to be in the communication from R43 to R53.
Figure 4.11: A faulty multi-mesh of order 8
Table 4.2 describes the details of the backup lightpaths. We will only explain one 
situation.
Table 4.2: The backup lightpaths
Group  From block  To block  Route of backup lightpath  Wavelength
1      B43  B53  R43 → R44 → R54 → R53  λ_1
       B33  B53  R33 → R34 → R44 → R54 → R53  λ_2
       B23  B53  R23 → R24 → R34 → R44 → R54 → R53  λ_3
       B13  B53  R13 → R14 → R24 → R34 → R44 → R54 → R53  λ_4
2      B43  B63  R43 → R44 → R45 → R55 → R65 → R64 → R63  λ_4
       B33  B63  R33 → R34 → R35 → R45 → R55 → R65 → R64 → R63  λ_1
       B23  B63  R23 → R24 → R25 → R35 → R45 → R55 → R65 → R64 → R63  λ_2
3      B43  B73  R43 → R44 → R45 → R46 → R56 → R66 → R76 → R75 → R74 → R73  λ_3
       B33  B73  R33 → R34 → R35 → R36 → R46 → R56 → R66 → R76 → R75 → R74 → R73  λ_4
4      B43  B83  R43 → R44 → R45 → R46 → R47 → R57 → R67 → R77 → R87 → R86 → R85 → R84 → R83  λ_2

Group 2 includes primary lightpath PL62 from block B33 to B63. The primary path passes 
through the faulty link from router R43 to R53. The corresponding backup lightpath BL62 
will use the route B33 → R33 → R34 → R35 → R45 → R55 → R65 → R64 → R63 → B63. The 
wavelength of BL62 will be λ_p, where p = ((1 - i) ⊕ (k - 1)) + 1 and ⊕ denotes addition 
modulo n/2 = 4. Our pool of additional wavelengths consists of wavelengths 
{λ_1, λ_2, λ_3, λ_4}. Since the group number is 2, i = 2, and 1 - 2 = -1, which 
corresponds to λ_4. Here k = 2, so that k - 1 = 1. The wavelength immediately 
after λ_4 is λ_1, so BL62 is assigned wavelength λ_1.
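The route of a backup lightpath follows a fixed pattern that can be read off Table 4.2: along the source row to column β + i, down that column to the destination row, and back along the destination row to column β. The Python sketch below reproduces the route of BL62 under that reading; the function names are hypothetical, and the pattern is inferred from the table rather than stated as an algorithm in the text.

```python
def wrap(v, n):
    """Map any integer row or column index onto 1..n with wrap-around."""
    return (v - 1) % n + 1


def backup_route(n, alpha, beta, i, k):
    """Routers, as (row, column) pairs, visited by the k-th backup lightpath of group i."""
    src_row, dst_row = alpha - k + 1, alpha + i
    # Along the source row from column beta to column beta + i.
    route = [(wrap(src_row, n), wrap(beta + c, n)) for c in range(i + 1)]
    # Down column beta + i to the destination row.
    route += [(wrap(r, n), wrap(beta + i, n)) for r in range(src_row + 1, dst_row + 1)]
    # Back along the destination row to column beta.
    route += [(wrap(dst_row, n), wrap(beta + c, n)) for c in range(i - 1, -1, -1)]
    return route


# BL62: order 8, fault below R43 (alpha = 4, beta = 3), group i = 2, lightpath k = 2.
print(" -> ".join(f"R{r}{c}" for r, c in backup_route(8, 4, 3, 2, 2)))
```

Because the detour leaves column β immediately, it never uses the vertical fiber between rows α and α + 1 of column β, which is the faulty element.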
In this chapter we have considered the implementation of the inter-block connections in a 
Multi-Mesh using optical links. We have considered the cases of using uni-directional as 
well as bi-directional links. We have proposed a new scheme for handling faults affecting 
the lightpaths where the routing algorithm is unaffected by single optical faults.
Chapter 5
Conclusions and Future Directions
5.1 Summary of Work Done
In this thesis we have proposed a new network topology called 3D Multi-Mesh (3D MM) 
for multiprocessor architecture which is an extension of a recently proposed architecture 
named Multi-Mesh. The main results of our investigations are as follows:
1) We have proposed a new architecture that uses the 3-dimensional mesh as its 
building block rather than a 2-dimensional mesh as done in the Multi-Mesh 
[DDS99]. We have shown that our architecture has better topological properties 
compared to the Multi-Mesh architecture and that a number of algorithms can be 
efficiently mapped on the 3D MM network.
2) We have explored a number of possible approaches for implementing the Multi- 
Mesh architecture using opto-electronic technologies. There are two novel 
features of our approach:
a. We have shown that WDM wavelength-routed networks may be used to 
realize some of the links.
b. We have shown that single faults may be handled easily without 
increasing the number of optical paths used.
5.2 Suggestions for Future Work
• We have presented the fundamental algorithms; a number of other basic 
algorithms, such as matrix multiplication, matrix transposition, and sorting, can 
still be mapped efficiently onto the 3D MM network.
• The proposed network topology consists of n^6 processors. In practice the 
number of processors may be smaller or larger than n^6. To accommodate an 
arbitrary number of processors, an incomplete 3D MM can be defined in the 
same way as the incomplete Multi-Mesh.
• Our optical implementation of the Multi-Mesh handles the failure of one fiber 
link; it can be extended to handle two or more failures.
• We have proposed possible approaches of implementing Multi-Mesh using 
optical technology. Our results may be extended to define 3D Multi-Mesh using 
optical technology.
5.3 Concluding Remarks
We have proposed a new architecture for interconnection networks and have shown that 
the proposed network has significantly better topological properties (e.g., diameter, node 
degree) compared to other mesh-based networks, especially the Multi-Mesh network. We 
have established the fundamental algorithms for summation/average/maximum/minimum 
and point-to-point communication and shown that this network outperforms the 
Multi-Mesh network. Our optical implementation has the following novel features:
> we have used wavelength-routed WDM networks, which have lower power 
requirements compared to the passive star coupler based designs used in other optical 
implementations
> we have shown that protection schemes may be used in this network at 
relatively little cost.
Appendix A
Path Computations for Different Source and Destination Pairs 
Assumptions:
1. β2 > x1
2. β1 > x2
3. γ2 > y1
4. γ1 > y2
5. α2 > z1
6. α1 > z2
To compute the paths that determine the diameter of the 3D MM, we consider the six 
assumptions above. Since each assumption may be either true or false, there are 
2^6 = 64 cases altogether, as shown in the table.
Table: 64 possible cases of source and destination
Case  β2 > x1  β1 > x2  γ2 > y1  γ1 > y2  α2 > z1  α1 > z2
1 T T T T T T
2 T T T T T F
3 T T T T F T
4 T T T T F F
5 T T T F T T
6 T T T F T F
7 T T T F F T
8 T T T F F F
9 T T F T T T
10 T T F T T F
11 T T F T F T
12 T T F T F F
13 T T F F T T
14 T T F F T F
15 T T F F F T
16 T T F F F F
17 T F T T T T
18 T F T T T F
19 T F T T F T
20 T F T T F F
21 T F T F T T
22 T F T F T F
23 T F T F F T
24 T F T F F F
25 T F F T T T
26 T F F T T F
27 T F F T F T
28 T F F T F F
29 T F F F T T
30 T F F F T F
31 T F F F F T
32 T F F F F F
33 F T T T T T
34 F T T T T F
35 F T T T F T
36 F T T T F F
37 F T T F T T
38 F T T F T F
39 F T T F F T
40 F T T F F F
41 F T F T T T
42 F T F T T F
43 F T F T F T
44 F T F T F F
45 F T F F T T
46 F T F F T F
47 F T F F F T
48 F T F F F F
49 F F T T T T
50 F F T T T F
51 F F T T F T
52 F F T T F F
53 F F T F T T
54 F F T F T F
55 F F T F F T
56 F F T F F F
57 F F F T T T
58 F F F T T F
59 F F F T F T
60 F F F T F F
61 F F F F T T
62 F F F F T F
63 F F F F F T
64 F F F F F F
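The table is simply the list of all truth assignments to the six assumptions, with the last assumption varying fastest. As a small illustration (the variable names are hypothetical), the same ordering can be generated in Python:

```python
from itertools import product

conditions = ["β2 > x1", "β1 > x2", "γ2 > y1", "γ1 > y2", "α2 > z1", "α1 > z2"]

# product() varies the last position fastest, which matches the row order of the table.
cases = list(product("TF", repeat=len(conditions)))

print(len(cases))
for number, case in enumerate(cases[:3], start=1):
    print(number, " ".join(case))
```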
We have worked through all 64 possible cases; for lack of space, we show only some 
interesting cases here. For each case we first give the path at the block level and then at the processor level.
(Case 1) (All six assumptions are true)
PT1: B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1)
→ P(α2, β1, γ1, β2, 1, α1) → P(α2, β2, γ1, β1, n, α1) → P(α2, β2, γ1, β1, γ2, n)
→ P(α2, β2, γ2, β1, γ1, 1) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(x1 - 1) + (α2 - z1) + 1 + (n - β2) + (y1 - 1) + 1 + (n - γ2) + (n - α1) + 1 + (β1 - x2) + 
(γ1 - y2) + (z2 - 1)
= x1 - 1 + α2 - z1 + 1 + n - β2 + y1 - 1 + 1 + n - γ2 + n - α1 + 1 + β1 - x2 + γ1 - y2 + z2 - 1
= 3n + x1 + y1 - z1 - α1 + β1 + γ1 - x2 - y2 + z2 + α2 - β2 - γ2
PT2: B(α1, β1, γ1) → B(α1, β1, γ2) → B(α1, β2, γ2) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, 1) → P(α1, β1, γ2, x1, γ1, n)
→ P(α1, β1, γ2, β2, n, n) → P(α1, β2, γ2, β1, 1, n) → P(α1, β2, γ2, n, 1, α2)
→ P(α2, β2, γ2, 1, 1, α1) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(γ2 - y1) + (z1 - 1) + 1 + (β2 - x1) + (n - γ1) + 1 + (n - β1) + (n - α2) + 1 + (x2 - 1) + 
(y2 - 1) + (α1 - z2)
= γ2 - y1 + z1 - 1 + 1 + β2 - x1 + n - γ1 + 1 + n - β1 + n - α2 + 1 + x2 - 1 + y2 - 1 + α1 - z2
= 3n - x1 - y1 + z1 + α1 - β1 - γ1 + x2 + y2 - z2 - α2 + β2 + γ2
PT1 + PT2:
PT1: 3n + x1 + y1 - z1 - α1 + β1 + γ1 - x2 - y2 + z2 + α2 - β2 - γ2
PT2: 3n - x1 - y1 + z1 + α1 - β1 - γ1 + x2 + y2 - z2 - α2 + β2 + γ2
= 6n, so the shorter of PT1 and PT2 has length at most 3n
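The Case 1 derivation can be checked numerically. The Python sketch below walks the two processor-level paths listed for Case 1, counting one hop per unit step inside a block and one hop per inter-block link, and compares the totals with the closed-form expressions. The sample coordinates and n = 5 are hypothetical values chosen only because they satisfy all six assumptions. Adding the two closed forms cancels every coordinate term and leaves 6n, so the shorter of the two paths has at most 3n hops.

```python
def hops(path):
    """Count hops along a path of processor addresses (α, β, γ, x, y, z)."""
    total = 0
    for a, b in zip(path, path[1:]):
        if a[:3] == b[:3]:
            # Same block: walk the 3D mesh, one hop per unit step.
            total += sum(abs(p - q) for p, q in zip(a[3:], b[3:]))
        else:
            # Different blocks: a single inter-block link.
            total += 1
    return total


n = 5
a1, b1, g1, x1, y1, z1 = 4, 3, 4, 2, 2, 1  # source block (α1, β1, γ1), processor (x1, y1, z1)
a2, b2, g2, x2, y2, z2 = 3, 4, 3, 2, 2, 2  # destination block and processor

pt1 = [(a1, b1, g1, x1, y1, z1), (a1, b1, g1, 1, y1, a2), (a2, b1, g1, n, y1, a1),
       (a2, b1, g1, b2, 1, a1), (a2, b2, g1, b1, n, a1), (a2, b2, g1, b1, g2, n),
       (a2, b2, g2, b1, g1, 1), (a2, b2, g2, x2, y2, z2)]
pt2 = [(a1, b1, g1, x1, y1, z1), (a1, b1, g1, x1, g2, 1), (a1, b1, g2, x1, g1, n),
       (a1, b1, g2, b2, n, n), (a1, b2, g2, b1, 1, n), (a1, b2, g2, n, 1, a2),
       (a2, b2, g2, 1, 1, a1), (a2, b2, g2, x2, y2, z2)]

closed_pt1 = 3 * n + x1 + y1 - z1 - a1 + b1 + g1 - x2 - y2 + z2 + a2 - b2 - g2
closed_pt2 = 3 * n - x1 - y1 + z1 + a1 - b1 - g1 + x2 + y2 - z2 - a2 + b2 + g2

print(hops(pt1), closed_pt1)
print(hops(pt2), closed_pt2)
print(hops(pt1) + hops(pt2) == 6 * n)
```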
(Case 2) (Assumption 6 is false)
PT1: B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1)
→ P(α2, β1, γ1, β2, 1, α1) → P(α2, β2, γ1, β1, n, α1) → P(α2, β2, γ1, β1, γ2, 1)
→ P(α2, β2, γ2, β1, γ1, n) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(x1 - 1) + (α2 - z1) + 1 + (n - β2) + (y1 - 1) + 1 + (n - γ2) + (α1 - 1) + 1 + (β1 - x2) + 
(γ1 - y2) + (n - z2)
= x1 - 1 + α2 - z1 + 1 + n - β2 + y1 - 1 + 1 + n - γ2 + α1 - 1 + 1 + β1 - x2 + γ1 - y2 + n - z2
= 3n + x1 + y1 - z1 + α1 + β1 + γ1 - x2 - y2 - z2 + α2 - β2 - γ2
PT2: B(α1, β1, γ1) → B(α1, β1, γ2) → B(α1, β2, γ2) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, 1) → P(α1, β1, γ2, x1, γ1, n)
→ P(α1, β1, γ2, β2, n, n) → P(α1, β2, γ2, β1, 1, n) → P(α1, β2, γ2, n, 1, α2)
→ P(α2, β2, γ2, 1, 1, α1) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(γ2 - y1) + (z1 - 1) + 1 + (β2 - x1) + (n - γ1) + 1 + (n - β1) + (n - α2) + 1 + (x2 - 1) + 
(y2 - 1) + (z2 - α1)
= γ2 - y1 + z1 - 1 + 1 + β2 - x1 + n - γ1 + 1 + n - β1 + n - α2 + 1 + x2 - 1 + y2 - 1 + z2 - α1
= 3n - x1 - y1 + z1 - α1 - β1 - γ1 + x2 + y2 + z2 - α2 + β2 + γ2
PT1 + PT2:
PT1: 3n + x1 + y1 - z1 + α1 + β1 + γ1 - x2 - y2 - z2 + α2 - β2 - γ2
PT2: 3n - x1 - y1 + z1 - α1 - β1 - γ1 + x2 + y2 + z2 - α2 + β2 + γ2
= 6n, so the shorter of PT1 and PT2 has length at most 3n
(Case 3) (Assumptions 4, 5 and 6 are false)
PT1: B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1)
→ P(α2, β1, γ1, β2, 1, α1) → P(α2, β2, γ1, β1, n, α1) → P(α2, β2, γ1, β1, γ2, 1)
→ P(α2, β2, γ2, β1, γ1, n) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(x1 - 1) + (z1 - α2) + 1 + (n - β2) + (y1 - 1) + 1 + (n - γ2) + (α1 - 1) + 1 + (β1 - x2) + 
(y2 - γ1) + (n - z2)
= x1 - 1 + z1 - α2 + 1 + n - β2 + y1 - 1 + 1 + n - γ2 + α1 - 1 + 1 + β1 - x2 + y2 - γ1 + n - z2
= 3n + x1 + y1 + z1 + α1 + β1 - γ1 - x2 + y2 - z2 - α2 - β2 - γ2
PT2: B(α1, β1, γ1) → B(α1, β1, γ2) → B(α1, β2, γ2) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, n) → P(α1, β1, γ2, x1, γ1, 1)
→ P(α1, β1, γ2, β2, 1, 1) → P(α1, β2, γ2, β1, n, 1) → P(α1, β2, γ2, n, n, α2)
→ P(α2, β2, γ2, 1, n, α1) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(γ2 - y1) + (n - z1) + 1 + (β2 - x1) + (γ1 - 1) + 1 + (n - β1) + (α2 - 1) + 1 + (x2 - 1) + 
(n - y2) + (z2 - α1)
= γ2 - y1 + n - z1 + 1 + β2 - x1 + γ1 - 1 + 1 + n - β1 + α2 - 1 + 1 + x2 - 1 + n - y2 + z2 - α1
= 3n - x1 - y1 - z1 - α1 - β1 + γ1 + x2 - y2 + z2 + α2 + β2 + γ2
PT1 + PT2:
PT1: 3n + x1 + y1 + z1 + α1 + β1 - γ1 - x2 + y2 - z2 - α2 - β2 - γ2
PT2: 3n - x1 - y1 - z1 - α1 - β1 + γ1 + x2 - y2 + z2 + α2 + β2 + γ2
= 6n, so the shorter of PT1 and PT2 has length at most 3n
(Case 4) (Assumptions 3, 4, 5 and 6 are false)
PT1: B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1)
→ P(α2, β1, γ1, β2, n, α1) → P(α2, β2, γ1, β1, 1, α1) → P(α2, β2, γ1, β1, γ2, 1)
→ P(α2, β2, γ2, β1, γ1, n) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(x1 - 1) + (z1 - α2) + 1 + (n - β2) + (n - y1) + 1 + (γ2 - 1) + (α1 - 1) + 1 + (β1 - x2) + 
(y2 - γ1) + (n - z2)
= x1 - 1 + z1 - α2 + 1 + n - β2 + n - y1 + 1 + γ2 - 1 + α1 - 1 + 1 + β1 - x2 + y2 - γ1 + n - z2
= 3n + x1 - y1 + z1 + α1 + β1 - γ1 - x2 + y2 - z2 - α2 - β2 + γ2
PT2: B(α1, β1, γ1) → B(α1, β1, γ2) → B(α1, β2, γ2) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, n) → P(α1, β1, γ2, x1, γ1, 1)
→ P(α1, β1, γ2, β2, 1, 1) → P(α1, β2, γ2, β1, n, 1) → P(α1, β2, γ2, n, n, α2)
→ P(α2, β2, γ2, 1, n, α1) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(y1 - γ2) + (n - z1) + 1 + (β2 - x1) + (γ1 - 1) + 1 + (n - β1) + (α2 - 1) + 1 + (x2 - 1) + 
(n - y2) + (z2 - α1)
= y1 - γ2 + n - z1 + 1 + β2 - x1 + γ1 - 1 + 1 + n - β1 + α2 - 1 + 1 + x2 - 1 + n - y2 + z2 - α1
= 3n - x1 + y1 - z1 - α1 - β1 + γ1 + x2 - y2 + z2 + α2 + β2 - γ2
PT1 + PT2:
PT1: 3n + x1 - y1 + z1 + α1 + β1 - γ1 - x2 + y2 - z2 - α2 - β2 + γ2
PT2: 3n - x1 + y1 - z1 - α1 - β1 + γ1 + x2 - y2 + z2 + α2 + β2 - γ2
= 6n, so the shorter of PT1 and PT2 has length at most 3n
(Case 5) (Assumptions 2, 3, 4, 5 and 6 are false)
PT1: B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, 1, y1, α2) → P(α2, β1, γ1, n, y1, α1)
→ P(α2, β1, γ1, β2, n, α1) → P(α2, β2, γ1, β1, 1, α1) → P(α2, β2, γ1, β1, γ2, 1)
→ P(α2, β2, γ2, β1, γ1, n) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(x1 - 1) + (z1 - α2) + 1 + (n - β2) + (n - y1) + 1 + (γ2 - 1) + (α1 - 1) + 1 + (x2 - β1) + 
(y2 - γ1) + (n - z2)
= x1 - 1 + z1 - α2 + 1 + n - β2 + n - y1 + 1 + γ2 - 1 + α1 - 1 + 1 + x2 - β1 + y2 - γ1 + n - z2
= 3n + x1 - y1 + z1 + α1 - β1 - γ1 + x2 + y2 - z2 - α2 - β2 + γ2
PT2: B(α1, β1, γ1) → B(α1, β1, γ2) → B(α1, β2, γ2) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, n) → P(α1, β1, γ2, x1, γ1, 1)
→ P(α1, β1, γ2, β2, 1, 1) → P(α1, β2, γ2, β1, n, 1) → P(α1, β2, γ2, 1, n, α2)
→ P(α2, β2, γ2, n, n, α1) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(y1 - γ2) + (n - z1) + 1 + (β2 - x1) + (γ1 - 1) + 1 + (β1 - 1) + (α2 - 1) + 1 + (n - x2) + 
(n - y2) + (z2 - α1)
= y1 - γ2 + n - z1 + 1 + β2 - x1 + γ1 - 1 + 1 + β1 - 1 + α2 - 1 + 1 + n - x2 + n - y2 + z2 - α1
= 3n - x1 + y1 - z1 - α1 + β1 + γ1 - x2 - y2 + z2 + α2 + β2 - γ2
PT1 + PT2:
PT1: 3n + x1 - y1 + z1 + α1 - β1 - γ1 + x2 + y2 - z2 - α2 - β2 + γ2
PT2: 3n - x1 + y1 - z1 - α1 + β1 + γ1 - x2 - y2 + z2 + α2 + β2 - γ2
= 6n, so the shorter of PT1 and PT2 has length at most 3n
(Case 6) (All six assumptions are false)
PT1: B(α1, β1, γ1) → B(α2, β1, γ1) → B(α2, β2, γ1) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, n, y1, α2) → P(α2, β1, γ1, 1, y1, α1)
→ P(α2, β1, γ1, β2, n, α1) → P(α2, β2, γ1, β1, 1, α1) → P(α2, β2, γ1, β1, γ2, 1)
→ P(α2, β2, γ2, β1, γ1, n) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(n - x1) + (z1 - α2) + 1 + (β2 - 1) + (n - y1) + 1 + (γ2 - 1) + (α1 - 1) + 1 + (x2 - β1) + 
(y2 - γ1) + (n - z2)
= n - x1 + z1 - α2 + 1 + β2 - 1 + n - y1 + 1 + γ2 - 1 + α1 - 1 + 1 + x2 - β1 + y2 - γ1 + n - z2
= 3n - x1 - y1 + z1 + α1 - β1 - γ1 + x2 + y2 - z2 - α2 + β2 + γ2
PT2: B(α1, β1, γ1) → B(α1, β1, γ2) → B(α1, β2, γ2) → B(α2, β2, γ2)
P(α1, β1, γ1, x1, y1, z1) → P(α1, β1, γ1, x1, γ2, n) → P(α1, β1, γ2, x1, γ1, 1)
→ P(α1, β1, γ2, β2, 1, 1) → P(α1, β2, γ2, β1, n, 1) → P(α1, β2, γ2, 1, n, α2)
→ P(α2, β2, γ2, n, n, α1) → P(α2, β2, γ2, x2, y2, z2)
Path Length:
(y1 - γ2) + (n - z1) + 1 + (x1 - β2) + (γ1 - 1) + 1 + (β1 - 1) + (α2 - 1) + 1 + (n - x2) + 
(n - y2) + (z2 - α1)
= y1 - γ2 + n - z1 + 1 + x1 - β2 + γ1 - 1 + 1 + β1 - 1 + α2 - 1 + 1 + n - x2 + n - y2 + z2 - α1
= 3n + x1 + y1 - z1 - α1 + β1 + γ1 - x2 - y2 + z2 + α2 - β2 - γ2
PT1 + PT2:
PT1: 3n - x1 - y1 + z1 + α1 - β1 - γ1 + x2 + y2 - z2 - α2 + β2 + γ2
PT2: 3n + x1 + y1 - z1 - α1 + β1 + γ1 - x2 - y2 + z2 + α2 - β2 - γ2
= 6n, so the shorter of PT1 and PT2 has length at most 3n
Appendix B
Glossary of Important Terms
Boundary Processor: The processors on the sides of the block (but not on corners) are 
characterized by x and y values such that exactly one of these coordinates is 1 or n.
Connectivity: The minimum number of arcs that have to be removed from the network 
to cut the network into two disconnected networks.
Corner Processor: The processors situated in the corners of a block, meaning that all of 
the coordinate values are exactly 1 or n.
Cost: Total number of communication links of a network.
Diameter: Diameter of a graph is the maximum of the shortest distance (hops) between 
two nodes.
Dynamic Interconnection Network: Connections among the processors can be
changed; the processors are not directly wired.
Fault Tolerant Optical Network: A network that is capable of working even in the 
presence of faults. In some cases a WDM optical network provides alternate paths to avoid 
faults.
Free Space: Instead of optical fiber, free-space optical interconnection uses air space for 
optical signal propagation.
Inter-Block Links: The links that connect the processors of different blocks.
Interconnection Network: An interconnection network connects different processors in a 
multi-processor system.
Internal Processor: The processors in a block having all their connections (neighbors) 
within the block.
Intra-Block Link: Processors within a block are connected by intra-block links.
Lightpath: The all-optical path through which the information flows in a 
wavelength-routed optical network. A lightpath may be composed of a single wavelength or it may 
consist of multiple wavelengths.
Logical Topology: A graph that is obtained from the physical topology by assigning 
the lightpaths between the nodes. The nodes of the graph are the end-nodes of the 
physical topology, and two nodes are connected by a directed edge if there is a lightpath 
between them.
Multi-Hop Network: A network in which a packet may hop through zero or more 
intermediate nodes before it reaches its final destination.
Multiplexer/Demultiplexer: Optical multiplexers are used to combine several 
independent signals at different wavelengths into one fiber. A demultiplexer works in exactly 
the opposite way, that is, it splits the signals at different wavelengths.
Multiprocessor Architecture: A system consisting of more than one processing unit, 
where processors work simultaneously to solve a given problem.
Neighbors: Processors within the block that are directly connected are called neighbors.
Network Size: Total number of nodes in a network.
Node Degree: Total number of incoming and outgoing links of a node.
Optical Communication: Data communication in a network where data is transmitted 
through optical fiber.
Optical Couplers: Coupler is a general term that covers all the devices that combine 
beams of light into a fiber or split beams of light out of a fiber.
Optical Fiber: Optical fiber is a medium of data transmission where data is transmitted 
in the form of light waves. Optical fiber is a thin filament of glass, which acts as a 
waveguide.
Optical Router: In an optical network, a router is a device that is connected to a number 
of fibers, some carrying incoming optical signals to the router and the others carrying 
outgoing optical signals. A router determines how the incoming signals will be directed 
to the outgoing fibers.
Passive Star Coupler: The passive star coupler is a “broadcast” device, where an optical 
signal transmitted using a given wavelength from a node in the network will be 
communicated to all other nodes in the network.
Physical Topology: Provides the physical connections between the nodes in a network. 
Regular Graph: A graph where each node has the same node degree.
Routing and Wavelength Assignment (RWA): Given a network topology and a set of
lightpaths (to be determined), the problem of routing the lightpaths in the network and
assigning wavelengths to these lightpaths is referred to as the routing and wavelength 
assignment (RWA) problem.
Single-Hop Network: A network in which a packet travels from its source to its 
destination directly (in one hop). The packet does not encounter an electro-optic 
conversion before reaching its final destination.
Static Interconnection Network: All connections among the processors are fixed
meaning that the processors are wired directly.
Wavelength Division Multiplexing (WDM) Network: A promising approach used 
in optical fiber where the optical transmission spectrum is divided into a number of 
non-overlapping wavelength (or frequency) bands, with each wavelength supporting a single 
communication channel operating at peak electronic speed.
Wavelength Routed Network: A wavelength routed WDM network is a network where 
each end-node (the source or destination of data) is connected to a router and each router 
is connected to other routers. The advantage of such a network is that the data is not 
broadcast to all the end-nodes. The settings of the routers determine which end-nodes will 
be connected by a lightpath.
Bibliography
[Ak89] S. G. Akl, The Design and Analysis of Parallel Algorithms, The Prentice-Hall 
Inc., New Jersey, 1989.
[BBRM97] M. S. Borella, J. P. Jue, D. Banerjee, B. Ramamurthy and B. Mukherjee, 
“Optical components for WDM lightwave networks”, Proceedings of the IEEE, Vol. 85, 
Issue 8, pp. 1274-1307, 1997.
[Be73] C. Berge, Graphs and Hypergraphs, American Elsevier Publishing Company, Inc., 
1973.
[ChKr93] R. D. Chamberlain and R. R. Krchnavek, “Architectures for optically 
interconnected multicomputers”, IEEE Global Telecommunication Conference, 
GLOBECOM ’93, Vol. 2, pp. 1181-1186, 1993.
[CrBo98] O. Crochat and J.-Y. Le Boudec, “Design Protection for WDM Optical 
Networks”, IEEE J. Select. Areas Commun., Vol. 16, No. 7, pp. 1158-1165, 1998.
[DDHH+99] B. T. Doshi, S. Dravida, P. Harshavardhana, O. Hauser, and Y. Wang, 
“Optical network design and restoration”, Bell Labs Tech. J., pp. 58-83, 1999.
[DDS99] D. Das, M. De and B. P. Sinha, “A new network topology with multiple 
meshes”, IEEE Transactions on Computers, Vol. 48, No. 5, pp. 536-551, 1999.
[DGS97] M. De, D. Das, M. Ghosh and B. P. Sinha, “An efficient sorting algorithm on 
the Multi-Mesh network”, IEEE Transactions on Computers, Vol. 46, No. 10, pp. 1132-
1137, 1997.
[EBC98] G. Ellinas, K. Bala and G. K. Chang, “Scalability of a novel wavelength 
assignment algorithm for WDM shared protection rings”, Proc. IEEE/OSA ’98 Optical 
Fiber Communication Conference, pp. 363-364, 1998.
[Ge98] O. Gerstel, “Opportunities for optical protection and restoration”, in Optical Fiber 
Communication Conf., Vol. 2, San Jose, CA, pp. 269-270, 1998.
[GeRa00] O. Gerstel and R. Ramaswami, “Optical Layer Survivability: A Services 
Perspective”, IEEE Commun. Mag., pp. 104-113, 2000.
[HKMF+95] W. Hendrick, O. Kibar, P. Marchand, C. Fan, D. V. Blerkom, F. 
McCormick, I. Cokgor, M. Hansen and S. Esener, “Modeling and optimization of the 
optical transpose interconnection system”, in Optoelectronic Technology Center, 
Program Review, Cornell University, 1995.
[HwBr83] K. Hwang and F. A. Briggs, Computer Architecture and Parallel Processing, 
New York: McGraw-Hill, 1983.
[Le92] T. Leighton, Introduction to Parallel Algorithms and Architectures: Arrays · Trees · 
Hypercubes, Morgan Kaufmann Publishers, San Mateo, California, 1992.
[LiFi01] L. L. Ling and A. J. C. Filho, “D-ARM: A new proposal for Multi-Dimensional 
interconnection networks”, SIGCOMM Computer Communication Review, Vol. 31, No. 
1, 2001.
[LoSu94a] A. Louri and H. Sung, “An Optical Multi-Mesh Hypercube: a scalable optical 
interconnection network for massively parallel computing”, Journal of Lightwave 
Technology, Vol. 12, Iss. 4, pp. 704-716, 1994.
[LoSu94b] A. Louri and H. Sung, “3D Optical Interconnects for high-speed interchip and 
interboard communications”, Computer, Vol. 27, pp. 27-37, 1994.
[MMHE93] G. C. Marsden, P. J. Marchand, P. Harvey and S. C. Esener, “Optical 
transpose interconnection system architectures”, Optical Letters, Vol. 18, No. 13, pp. 
1083-1085, 1993.
[MPPM02] G. Maier, A. Pattavina, S. D. Patre, and M. Martinelli, “Optical Network 
Survivability: Protection Techniques in the WDM Layer”, Photonic Network 
Communication, 4:3/4, 251-269, 2002.
[Mu97] B. Mukherjee, “Optical Communication Networks”, McGraw-Hill, New York, 
1997.
[MuSe03] S. Murthy and A. Sen, “A Peer-to-Peer Network Based on Multi-Mesh 
Architecture”, GLOBECOM, 2003.
[Os00] A. Osterloh, “Sorting on the OTIS-Mesh”, Proc. 14th International Parallel and 
Distributed Processing Symposium (IPDPS 2000), pp. 269-274, 2000.
[Ra92] S. Rajasekaran, “Randomized algorithms for packet routing on the Mesh”, 
Advances in Parallel Algorithms, pp. 227-301, 1992.
[RaMu99a] S. Ramamurthy and B. Mukherjee, “Survivable WDM Mesh Networks, Part I - 
Protection”, Proc. of IEEE INFOCOM ’99, pp. 744-751, 1999.
[RaMu99b] S. Ramamurthy and B. Mukherjee, “Survivable WDM mesh networks, Part II 
- Restoration”, ICC ’99, 1999 IEEE International Conference on Communications, Vol. 
3, pp. 2023-2030, 1999.
[RaSa98] S. Rajasekaran and S. Sahni, “Randomized routing, selection and sorting on the 
OTIS-Mesh optoelectronic computer”, IEEE Transactions on Parallel and Distributed 
Systems, Vol. 9, No. 9, pp. 833-840, 1998.
[SaWa97] S. Sahni and C.-F. Wang, “BPC permutations on the OTIS-Mesh 
optoelectronic computer”, Proc. of the 4th International Conference on Massively Parallel 
Processing using Optical Interconnections (MPPOI ’97), Montreal, Canada, pp. 130-135, 
1997.
[SBS01] A. Sen, S. Bandyopadhyay and B. P. Sinha, “A new architecture and a new 
metric for lightwave networks”, IEEE Journal of Lightwave Technology, Vol. 19, No. 7, 
pp. 913-925, 2001.
[ScSe89] I. D. Scherson and S. Sen, “Parallel sorting in two-dimensional VLSI models of 
computation”, IEEE Transactions on Computers, Vol. 38, No. 2, pp. 238-249, 1989.
[SFK97] D. Sima, T. Fountain, P. Kacsuk, “Advanced Computer Architectures”, Addison 
Wesley, 1997.
[So03] V. Soini, “Introduction to Optical Networking”, 2003.
[SRM02] L. Sahasrabudhe, S. Ramamurthy and B. Mukherjee, “Fault management in IP- 
over-WDM networks: WDM protection versus IP restoration”, IEEE J. Select. Areas 
Commun., Vol. 20, No. 1, pp. 21-33, 2002.
[St83] Q. F. Stout, “Mesh connected computers with broadcasting”, IEEE Trans. 
Computers, Vol. 32, pp. 826-830, 1983.
[StBa99] T. E. Stern, K. Bala, Multiwavelength Optical Network, United States of 
America, Addison Wesley Longman, Inc., 1999.
[StCo91] H. S. Stone and J. Cocke, “Computer Architecture in the 1990s”, Computer, Vol. 
24, No. 9, pp. 30-38, 1991.
[Sz95] T. Szymanski, “Hypermeshes: Optical Interconnection Networks for Parallel 
Computing”, 1995.
[To94] M. Tompa, Lecture notes on message routing in parallel machines, Technical 
Report # 94-06-05, Department of Computer Science and Engineering, University of 
Washington, 1994.
[Ul84] J. D. Ullman, Computational Aspects of VLSI, Computer Science Press, 1984. 
[WaSa00] C.-F. Wang and S. Sahni, “Image processing on the OTIS-Mesh optoelectronic 
computer”, IEEE Transactions on Parallel and Distributed Systems, Vol. 11, No. 2, pp. 
97-109, 2000.
[WaSa01] C.-F. Wang and S. Sahni, “Matrix multiplication on the OTIS-Mesh 
optoelectronic computer”, IEEE Transactions on Computers, Vol. 50, No. 7, pp. 635-646, 
2001.
[WaSa02] C.-F. Wang and S. Sahni, “Computational Geometry on the OTIS-Mesh 
optoelectronic computer”, Proc. International Conference on Parallel Processing, Vol. 18, No. 
21, pp. 501-507, 2002.
[WaSa98] C.-F. Wang and S. Sahni, “Basic operations on the OTIS-Mesh optoelectronic 
computer”, IEEE Transactions on Parallel and Distributed Systems, Vol. 9, No. 12, pp. 
1226-1236, 1998.
[ZMPE00] F. Zane, P. Marchand, R. Paturi and S. Esener, “Scalable network 
architectures using the optical transpose interconnection system (OTIS)”, Journal of 
Parallel and Distributed Computing, Vol. 60, No. 5, pp. 521-538, 2000.
[Zo96] P. Zoeteweij, “Multicomputer Routing Algorithms: A simulation study”, M.Sc. 
Thesis, 1996.
VITA AUCTORIS
Nahid Afroz was born in Barisal, Bangladesh. She obtained a B.Sc. with honors in 
Computer Science in 1999 from the University of Dhaka, Bangladesh. She is pursuing 
her graduate studies in Computer Science at the University of Windsor and is currently a 
candidate for the master’s degree, which she hopes to complete in June 2004. Her thesis 
advisor is Dr. Subir Bandyopadhyay.
