Mapping a weak hypercube on an optical slab waveguide by Rengan, Divya
Louisiana State University
LSU Digital Commons
LSU Master's Theses Graduate School
2006
Mapping a weak hypercube on an optical slab
waveguide
Divya Rengan
Louisiana State University and Agricultural and Mechanical College, drenga1@lsu.edu
Recommended Citation
Rengan, Divya, "Mapping a weak hypercube on an optical slab waveguide" (2006). LSU Master's Theses. 4147.
https://digitalcommons.lsu.edu/gradschool_theses/4147
MAPPING A WEAK HYPERCUBE ON AN
OPTICAL SLAB WAVEGUIDE
A Thesis
Submitted to the Graduate Faculty of the
Louisiana State University and
Agricultural and Mechanical College
in partial fulfillment of the
requirements for the degree of
Master of Science in Electrical Engineering
in
The Department of Electrical and Computer Engineering
by
Divya Rengan
Bachelor of Technology in Information Technology
University of Madras, Chennai, 2004
December 2006
Acknowledgements
I would like to express my gratitude to Dr. R. Vaidyanathan, my major professor, for his
guidance and support over the past two years. Without his help this thesis would not have
been possible. I would like to thank Dr. Ahmed El-Amawy and Dr. Jerry Trahan for
agreeing to be part of my thesis defense committee. I would also like to thank my family,
friends and well-wishers.
Table of Contents
Acknowledgements
List of Figures
Abstract
Chapter
1 Introduction
2 Preliminaries
2.1 General Conventions
2.2 Hypercube
2.2.1 A Review of Hypercube Properties
2.3 Interprocessor Communication on $H_d$
2.4 Optical Implementation of Interprocessor Communication
2.4.1 Slab Waveguides
2.4.2 Mapping
2.4.3 Aggregates
3 Dense Mapping
3.1 Construction of the Destination Array, Dst
3.2 Construction of the Dimension Array, Dim
3.3 Construction of Source Array, Src
3.4 Analyzing the Mapping
3.4.1 Correctness
3.4.2 Number of Aggregates
4 Sparse Mapping
4.1 Construction of the Dimension Array, Dim
4.2 Construction of the Destination Array, Dst
4.3 Construction of the Source Array, Src
4.4 Number of Aggregates
4.5 Relation to the Extended Hypercube
5 Lower Bounds
5.1 A General Lower Bound
5.2 Dense Mapping Lower Bound
6 Conclusion
Bibliography
Vita
List of Figures
2.1 Examples of hypercubes
2.2 A 3-dimensional directed hypercube
2.3 Recursive structure of a hypercube
2.4 Recursive decomposition of $H_4$
2.5 An illustration of the proof of Lemma 2.4
2.6 Binary representations of $b$ and $b'$
2.7 Binary representations of $j$ and $j'$
2.8 Weak communication in a 3-dimensional hypercube
2.9 An optical slab
2.10 Slab structure considered in this thesis
2.11 Mapping 1
2.12 Mapping 2
2.13 Mapping 3
2.14 A way to specify the standard mapping
3.1 Structure of the destination array
3.2 Array Dst for $d = 3$
3.3 Array Dst for $d = 4$
3.4 Relation between $h_d$, $\hat{h}_d$ and $\hat{h}^x_d$
3.5 Examples of the array Dim
3.6 Positions of ;  and  in the dimension array
3.7 Array Dim for a 4-dimensional hypercube
3.8 Src array for a 4-dimensional hypercube
3.9 Bit values for the proof of Lemma 3.7 when $k > 0$
3.10 Bit values for the proof of Lemma 3.7 when $k = 0$
3.11 Differing bits in the proof of Lemma 3.7
4.1 Examples of reverse diagonal arrays
4.2 Unit reverse diagonal array, $U_2$
4.3 Recursive construction of $Dim_{3,1}$
4.4 4-dimensional sparse Dim array
4.5 Sparse destination array for $d = 4$
4.6 A 4-dimensional Sparse Src
4.7 Mapping for an extended hypercube $Q_3$
4.8 A 5-dimensional extended hypercube
5.1 An illustration of the proof of Lemma 5.3
5.2 An illustration of the proof of Lemma 5.6
5.3 Illustrating the structure for numbers $u, a, v, b$
5.4 Complete and incomplete aggregates
5.5 An illustration of the proof of Lemma 5.9
5.6 An illustration for the proof of Lemma 5.10
Abstract
The communication fabric of a parallel processing system is represented as a directed graph. In a weak topology a processor uses at most one incoming edge and at most one outgoing edge for communication at any given point in time. Currently, the data rate that can be supported on electronic interconnects is reaching its limits. Optical interconnects have been identified as one of the most promising approaches to meeting the growing demands of today's systems.
The "medium bandwidth" of an optical waveguide is huge (on the order of petabits per second for a 1 mm$^2$ cross-section optical slab). The challenge lies in utilizing as much of this medium bandwidth as possible. We address this problem by exploiting knowledge of the communication patterns. Key to our approach is a method to map communications to optical channels.
This thesis deals with the mapping of a $d$-dimensional weak hypercube onto an optical slab waveguide. The weak topology helps reduce the cost of optical components used by allowing component reuse across different channels. We present two mappings, the dense and sparse mappings. The dense mapping for a $d$-dimensional weak hypercube packs all communication channels into a $d \times 2^d$ array of optical channels and uses $(d-2)2^d + 4$ lasers and $2^d$ detectors (or vice-versa). The sparse mapping uses a $2^{d-1} \times 2^d$ channel array, but does not use all channels to map hypercube edges. We show that this mapping requires $2^d$ lasers and $2^d$ detectors. We also define a supergraph of the hypercube, called the extended hypercube, that maximally utilizes the empty channels in a sparse mapping. We establish that the extended hypercube is the largest supergraph of the hypercube that utilizes all available channels, without increasing the number of lasers and detectors used. The mappings defined for both these sparse cases are optimal. They use $N$ lasers and $N$ detectors, where $N$ is the number of nodes in the topology.
We also derive lower bounds on the number of lasers and detectors needed for a "standard" mapping of a hypercube to an optical slab waveguide. We show that the costs of all the dense and sparse mappings proposed in this thesis match these lower bounds.
Chapter 1
Introduction
The interconnection network between processors in a parallel processing system is one of its
most important components. In fact, today's computing systems are limited more by the
communication fabric connecting the processors than the processors themselves [5, 10]. In
addition, CMOS-transistors are expected to support data rates of around 20 Gbps in the
near future [36]. The data rate possible on traditional metal interconnects is fast reaching
its limits. There is, therefore, a gap between what these traditional interconnects can deliver
and the computing needs of the future [11, 22, 23, 28]. Optical interconnects [8, 13, 16, 20, 35]
show great promise in filling this gap. They support faster data rates, higher bandwidths
and are less prone to noise. Optical interconnects have been well studied in both long-haul
networks [2, 6, 14] and at the board/backplane levels [7, 8, 17, 18]. In this thesis we consider
optical interconnects between modules that are spatially proximate (such as at the board or
backplane levels).
Optical communication can be free-space [4, 27] or waveguided [12, 11, 15, 30]. Free-space systems involve transmitting light in air and directing it using mirrors and prisms. In waveguided systems, light is inserted into a waveguide at one end and detected at the other. Here we consider waveguided systems, specifically, systems with slab waveguides that support transmission of light in several different modes.¹ However, many of the ideas presented are applicable to systems with fibers and free space optics as well.
¹ For this thesis a slab is a waveguide with a sufficiently large cross-section that supports transmission of light in many modes (angles, for our purpose). In contrast, fibers have a much smaller cross-section and allow just a single or a few modes.
A simple slab waveguide can carry a huge bandwidth (easily in the range of petabits per
second). Practically, systems can harness just a fraction of this bandwidth (in the terabits
per second range). There is a wide gap between the "medium bandwidth" and the "system
bandwidth." The main reasons for this gap are the challenges in inserting and detecting light
into and from the slab waveguide and packaging all the required fast lasers and detectors
used within a small space. Section 2.4.1 elaborates on this idea.
One way to bridge this gap between the system and the medium bandwidths is through technological advances; i.e., by using a large number of small and fast lasers. However, we take a different approach (that is independent of and complements any technological advances). We use to our advantage the fact that most interconnection topologies do not use all available channels for communication at the same time. In this context, we consider a "weak" topology [26], which is one in which each node sends information to, and receives information from, at most one of its neighbors at any given instant of time. By mapping a weak topology on a slab waveguide, a single laser or detector can be used over multiple channels to reduce the cost of optical hardware and better utilize the available medium bandwidth. It may also be viewed as an approach that allows a larger instance of the communication topology to be mapped on a slab with fixed hardware and packaging constraints. With a naive mapping, each channel in the communication network corresponds to a laser and detector. On a weak topology, this will force many of these lasers and detectors to be idle most of the time. This idle time of lasers and detectors can be used to roll many of them together, so that a single laser or detector serves many channels. For example, if 64 lasers and detectors can be used, then a naive mapping will admit a 16-node hypercube (4-dimensional hypercube) with 4 lasers/detectors per node. With our method a 64-node hypercube can be mapped.
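The arithmetic behind the 64-laser example can be checked with a short sketch. It assumes, as the example does, that a naive mapping needs one laser per directed edge ($d \cdot 2^d$ in total) and that the shared approach needs one laser per node ($2^d$ in total); the function names are illustrative only and do not come from the thesis.

```python
def largest_dimension(budget, lasers_needed):
    """Largest d >= 0 such that lasers_needed(d) fits in the budget."""
    d = 0
    while lasers_needed(d + 1) <= budget:
        d += 1
    return d


def naive_lasers(d):
    # Assumption: one laser per directed edge (2^d nodes, d outgoing edges each).
    return d * 2 ** d


def one_per_node_lasers(d):
    # Assumption: one laser per node, shared across that node's channels.
    return 2 ** d


budget = 64
d_naive = largest_dimension(budget, naive_lasers)
d_shared = largest_dimension(budget, one_per_node_lasers)
print(d_naive, 2 ** d_naive)    # 4 16
print(d_shared, 2 ** d_shared)  # 6 64
```

With a budget of 64, the naive count admits the 4-dimensional (16-node) hypercube, while one laser per node admits the 6-dimensional (64-node) hypercube, matching the figures quoted above.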
In this thesis we study the mapping of a $d$-dimensional weak hypercube on an optical slab waveguide. A slab waveguide supports multiple modes and wavelengths [12] and each channel in a slab is distinguished by its mode and wavelength. (A doublet, $(\mu, \lambda)$, where $\mu$ is the mode and $\lambda$ is the wavelength at which the channel is operating, describes the channel.) Two channels are said to be adjacent if they use the same mode and have adjacent wavelengths or use the same wavelength and have adjacent modes. This notion of adjacency is important because it allows us to aggregate adjacent channels into a single laser (such as a tunable laser), provided these aggregated channels cannot be used simultaneously. The weak topology provides us the basis for ascertaining simultaneity. Similarly, detectors can also be aggregated. These ideas are formally explained in Section 2.4.3.
Previously, Sethuraman [31] studied the mapping of a weak $k$-ary $n$-cube on a slab. Our work applies similar techniques to a weak hypercube. Note that while a $d$-dimensional hypercube is a special case of a $k$-ary $n$-cube (where $k = 2$ and $n = d$), the results of Sethuraman are not generalizations of ours. Our mapping uses half the number of edges compared to the work of Sethuraman; i.e., each processor has a degree of only $d$, rather than the $2d$ required for a $d$-dimensional torus. Moreover, we also explore the idea of a "sparse" mapping that allows improvements in the number of lasers and detectors used.
We use two approaches to mapping a $d$-dimensional weak hypercube (that has $N = 2^d$ processors) on the slab. The mapping uses an $M \times W$ channel array (where $M$ is the number of modes and $W$ is the number of wavelengths supported by the slab). In our case, we use $W = N$ = number of processors (nodes) in the topology, as this reduces optical hardware. The "dense mapping" employs an array of size $d \times 2^d$ and uses $(d-1)2^d + 4$ lasers and $2^d$ detectors (or $2^d$ lasers and $(d-1)2^d + 4$ detectors). This is the minimum array size for any "standard" mapping of the hypercube topology on a slab. Our second approach is the "sparse mapping" that employs a $2^{d-1} \times 2^d$ channel array and uses $2^d$ lasers and $2^d$ detectors. Obviously, not all channels of this array are used as the $d$-dimensional hypercube only has $d2^d$ edges. We identify the largest supergraph of the hypercube (called the extended hypercube) that can utilize these empty channels without increasing the required numbers of lasers and detectors. The above sparse mapping uses the smallest possible numbers of lasers and detectors (1 per processor) for any strongly connected topology. We formally prove this optimal. In Chapter 5 we derive a non-trivial lower bound that proves the number of lasers in a dense mapping to also be optimal.
The remainder of this thesis is organized as follows. Chapter 2 discusses some preliminary ideas that are needed to follow the remaining chapters. Chapter 3 elaborates on the dense mapping approach. Chapter 4 deals with the sparse mapping and defines the extended hypercube. We also present a possible direction for unifying the dense and sparse mappings into one more-general mapping. Chapter 5 derives lower bounds for mapping a hypercube to an optical slab waveguide. Here, we also show that our results for the dense and sparse mappings match the lower bound. Finally, in Chapter 6 we summarize our results and identify some directions for future work.
Chapter 2
Preliminaries
This chapter discusses some important concepts that are used in the rest of this thesis.
Specifically, we describe the hypercube topology, the idea of interprocessor communication
in point-to-point networks, and issues in an optical implementation of interprocessor
communication.
2.1 General Conventions
Unless obvious from the context, a binary number will be suffixed with a subscripted 2. For example, $11_2$ is 3 (in decimal). A binary $b$-bit number has bits indexed 0 to $b-1$, with bit 0 as the least significant bit (lsb) and bit $b-1$ as the most significant bit (msb). Let $i = i_{b-1} i_{b-2} \cdots i_1 i_0$ be a $b$-bit integer, where $i_j \in \{0, 1\}$ and $0 \le j < b$. Let $S \subseteq \{0, 1, \ldots, b-1\}$. Then $i|_S = \hat{i}_{b-1} \hat{i}_{b-2} \cdots \hat{i}_1 \hat{i}_0$, where for $0 \le j < b$,
\[ \hat{i}_j = \begin{cases} i_j & \text{if } j \notin S \\ i'_j & \text{if } j \in S \end{cases} \]
and $i'_j$ denotes the complement of bit $i_j$. That is, $i|_S$ is the integer obtained from $i$ by complementing the bits whose indices are in $S$. When $S = \{a\}$ is a singleton set we will simply write $i|_a$ instead of $i|_{\{a\}}$. For example, $7|_2 = 111_2|_2 = \overline{1}11_2 = 011_2 = 3$, whereas $7|_3 = 0111_2|_3 = \overline{0}111_2 = 1111_2 = 15$. Also $7|_{\{2,3\}} = 0111_2|_{\{2,3\}} = 1011_2 = 11$.
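The bit-complement notation $i|_S$ amounts to an exclusive-or with a mask that has a 1 in each position of $S$. The following is a minimal sketch of that reading; the name `flip_bits` is an assumption made for illustration and does not appear in the thesis.

```python
def flip_bits(i, positions):
    """Return i with the bits at the given positions complemented (i|_S)."""
    mask = 0
    for p in positions:
        mask |= 1 << p  # set bit p of the mask
    return i ^ mask      # XOR complements exactly the masked bits


print(flip_bits(7, {2}))     # 3
print(flip_bits(7, {3}))     # 15
print(flip_bits(7, {2, 3}))  # 11
```

The three calls reproduce the worked examples $7|_2 = 3$, $7|_3 = 15$ and $7|_{\{2,3\}} = 11$.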
Definition 2.1 Let $A = [a_{ij}]$ be an $x \times y$ array, all of whose cells are either integers or empty. We define the following operations on $A$.
(a) For any real $r$, array $rA = [r \cdot a_{ij}]$ is the array with the non-empty cells of $A$ multiplied by $r$; empty cells remain empty.
(b) For any set $S$ of non-negative integers, $A|_S = [a_{ij}|_S]$ is the array with non-empty cells obtained by complementing the binary representation of the non-empty cells of $A$, according to the bit positions in $S$; empty cells remain empty. □
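The two array operations of Definition 2.1 can be sketched as follows. This is an illustrative reading only: representing an array as a list of lists with `None` for empty cells, and the names `scale` and `flip_array`, are assumptions that do not come from the thesis.

```python
def flip_bits(value, positions):
    """Complement the bits of value at the given positions."""
    mask = sum(1 << p for p in set(positions))
    return value ^ mask


def scale(array, r):
    """rA: multiply every non-empty cell by r; empty cells (None) stay empty."""
    return [[None if cell is None else r * cell for cell in row] for row in array]


def flip_array(array, positions):
    """A|_S: complement the given bits of every non-empty cell."""
    return [[None if cell is None else flip_bits(cell, positions) for cell in row]
            for row in array]


A = [[0, None, 3],
     [None, 5, 6]]
print(scale(A, 2))         # [[0, None, 6], [None, 10, 12]]
print(flip_array(A, {0}))  # [[1, None, 2], [None, 4, 7]]
```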
2.2 Hypercube
The topology considered in this thesis is a $d$-dimensional hypercube that is defined as follows.
Definition 2.2 For any $d \ge 1$, a $d$-dimensional hypercube is an undirected graph $H_d = (V, E)$ where $V = \{0, 1, \ldots, 2^d - 1\}$ is the set of nodes. Two nodes $p$ and $q$ are connected by an edge if and only if the binary representations of $p$ and $q$ differ in exactly one bit position. □
That is, $(p, q)$ is an edge of the hypercube iff there exists $0 \le j < d$ such that $p|_j = q$. If nodes $p$ and $q$ differ in bit $b$ (where $0 \le b < d$), then the edge $(p, q)$ is called a dimension-$b$ edge.
Examples of hypercubes are shown in Figure 2.1.
The "undirected hypercube" described so far readily extends to a directed graph where
each undirected edge is replaced by two oppositely directed edges. We will use this directed
hypercube in this thesis. For simplicity we will still call it a hypercube and denote it by $H_d$.
An example of a 3-dimensional (directed) hypercube is shown in Figure 2.2.
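As a concrete reading of Definition 2.2 and its directed extension, the sketch below lists the neighbors of a node and the directed edges of $H_d$, tagging each edge with its dimension. The function names are illustrative assumptions, not notation from the thesis.

```python
def neighbors(p, d):
    """Nodes that differ from p in exactly one of the d bit positions."""
    return [p ^ (1 << b) for b in range(d)]


def directed_edges(d):
    """All directed edges of H_d as (source, destination, dimension) triples."""
    return [(p, p ^ (1 << b), b) for p in range(2 ** d) for b in range(d)]


print(neighbors(0, 3))         # [1, 2, 4]
print(len(directed_edges(3)))  # 24
```

Each undirected edge appears twice (once per direction), which gives $d \cdot 2^d$ directed edges in total.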
2.2.1 A Review of Hypercube Properties
We now state some well-known properties of the hypercube. More details may be found in
Leighton [19].
FIGURE 2.1: Examples of hypercubes: (a) $d = 2$, (b) $d = 3$, (c) $d = 4$
Basic Topological Properties: A $d$-dimensional (directed) hypercube $H_d$ has $2^d$ nodes, each of in-degree $d$ and out-degree $d$. Hypercube $H_d$ has a diameter of $d$ and is a node-symmetric graph. Each node is incident on a dimension-$b$ edge for each $0 \le b < d$.
FIGURE 2.2: A 3-dimensional directed hypercube
Recursive Structure: A $d$-dimensional hypercube $H_d$ can be expressed in terms of two smaller $(d-1)$-dimensional hypercubes as follows.
Let $H_{d-1}$ and $H'_{d-1}$ be two $(d-1)$-dimensional hypercubes. Number the nodes of $H_{d-1}$ in the usual way from 0 to $2^{d-1} - 1$ and those of $H'_{d-1}$ from $2^{d-1}$ to $2^d - 1$. Now, $H_d$ can be constructed by connecting node $i$ of $H_{d-1}$ to node $i + 2^{d-1}$ of $H'_{d-1}$. Figure 2.3 illustrates the general idea and Figure 2.4 shows an example with $d = 4$.
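The recursive construction can be checked against the bit-difference definition with a small sketch. It builds the undirected edge set from two copies of the smaller cube plus the connecting edges and compares it with the edges obtained directly. Treating a single node with no edges as the base case, and storing each undirected edge as a (smaller, larger) pair, are assumptions made here for convenience.

```python
def recursive_edges(d):
    """Undirected edges of H_d built from two copies of H_(d-1)."""
    if d == 0:
        return set()  # assumed base case: a single node, no edges
    half = 2 ** (d - 1)
    lower = recursive_edges(d - 1)                      # copy H_(d-1)
    upper = {(p + half, q + half) for (p, q) in lower}  # copy H'_(d-1)
    cross = {(i, i + half) for i in range(half)}        # node i to node i + 2^(d-1)
    return lower | upper | cross


def direct_edges(d):
    """Undirected edges of H_d taken straight from the bit-difference rule."""
    return {(p, p | (1 << b))
            for p in range(2 ** d) for b in range(d)
            if not p & (1 << b)}


assert recursive_edges(4) == direct_edges(4)
print(len(recursive_edges(4)))  # 32
```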
Hamiltonian path: A Hamiltonian path (resp., cycle) of a graph is a path (resp., cycle) that traverses every node in the graph exactly once. It is well known that a hypercube $H_d$ with $d \ge 2$ has a Hamiltonian cycle and, hence, a Hamiltonian path. Here we describe one such path and term it the standard Hamiltonian path for a hypercube. Observe that every path (including a Hamiltonian path) can be specified as a string of nodes of the graph.
Definition 2.3 For any string $s$, let $\overline{s}$ denote the reversed string. For strings $s_1$ and $s_2$, let $s_1 \circ s_2$ denote the concatenation of $s_1$ and $s_2$. Let $s = \langle s_0, s_1, \ldots, s_k \rangle$ be a string of reals, and let $r$ be a real number. Then $r + s$ is the string $\langle r + s_0, r + s_1, \ldots, r + s_k \rangle$. □
For example, if $s_1 = \langle a, b \rangle$ and $s_2 = \langle c, d \rangle$, then $\overline{s_1} = \langle b, a \rangle$ and $s_1 \circ s_2 = \langle a, b, c, d \rangle$. If $s = \langle 1, 2, 3, 4 \rangle$, then $4 + s = \langle 5, 6, 7, 8 \rangle$.
FIGURE 2.3: Recursive structure of a hypercube
FIGURE 2.4: Recursive decomposition of $H_4$
Let $h_d$ denote the string representing the standard Hamiltonian path of hypercube $H_d$. We now give a recursive definition of $h_d$. Define $h_0$ to be $\langle 0 \rangle$. For $d > 0$, let hypercube $H_d$ be recursively decomposed into subcubes $H_{d-1}$ and $H'_{d-1}$, and let their standard Hamiltonian paths be $h_{d-1}$ and $h'_{d-1}$, respectively. Recall that $H_{d-1}$ and $H'_{d-1}$ are identical except that node $i$ of $H_{d-1}$ corresponds to node $i + 2^{d-1}$ of $H'_{d-1}$. Similarly $h'_{d-1} = 2^{d-1} + h_{d-1}$ is simply $h_{d-1}$ with $2^{d-1}$ added to each element. Then for $d > 0$, $h_d$ can be defined as follows (the overline denotes string reversal, as in Definition 2.3):
\[ h_d = h_{d-1} \circ \overline{\left(2^{d-1} + h_{d-1}\right)}. \tag{2.1} \]
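For example,
$h_0 = \langle 0 \rangle$
$h_1 = h_0 \circ \overline{h'_0} = \langle 0 \rangle \circ \overline{\langle 0 + 2^0 \rangle} = \langle 0 \rangle \circ \langle 1 \rangle = \langle 0, 1 \rangle$
$h_2 = \langle 0, 1 \rangle \circ \langle 1 + 2^1, 0 + 2^1 \rangle = \langle 0, 1, 3, 2 \rangle$
$h_3 = \langle 0, 1, 3, 2 \rangle \circ \langle 6, 7, 5, 4 \rangle = \langle 0, 1, 3, 2, 6, 7, 5, 4 \rangle$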
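The recursion of Equation (2.1) translates directly into a few lines of code. The sketch below is one way to write it, assuming Python lists stand in for the strings of nodes; the name `standard_path` is illustrative.

```python
def standard_path(d):
    """Standard Hamiltonian path h_d of H_d as a list of node numbers."""
    if d == 0:
        return [0]
    prev = standard_path(d - 1)
    half = 2 ** (d - 1)
    # First half: h_(d-1). Second half: h_(d-1) shifted by 2^(d-1), reversed.
    return prev + [half + node for node in reversed(prev)]


for d in range(4):
    print(d, standard_path(d))
# 0 [0]
# 1 [0, 1]
# 2 [0, 1, 3, 2]
# 3 [0, 1, 3, 2, 6, 7, 5, 4]
```

Consecutive entries differ in exactly one bit, so each step of the path follows a hypercube edge.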
Subsequently, we will denote the $j^{\mathrm{th}}$ element of $h_d$ by $h_d(j)$. Where the dimension $d$ is implicit, we will denote $h_d$ simply by $h$. The following lemma specifies the elements of $h_d$ in a closed form.
Lemma 2.4 For integers $b$ and $j$, let $b_{d-1} b_{d-2} \cdots b_1 b_0$ be the binary representation of $b$ and let $j_{d-1} j_{d-2} \cdots j_1 j_0$ be the binary representation of $j$. For any $0 \le j < 2^d$, let $h_d(j)$, the $j^{\mathrm{th}}$ element of the standard Hamiltonian path $h_d$, be $b$. Then for each $0 \le i < d$,
\[ b_i = j_i \oplus j_{i+1}, \]
where $j_d = 0$.
Proof: We proceed by induction on $d \ge 1$. For $d = 1$, $h_1 = \langle 0, 1 \rangle$ and $b = j$. Here $j_1 = 0$, so $b_0 = j_0 \oplus j_1 = j_0$. This agrees with $b = j$.
Assume the lemma holds for some $d \ge 1$ and consider $h_{d+1} = h_d \circ \overline{h'_d}$, where the overline denotes string reversal. Let $b$ be the $j^{\mathrm{th}}$ element of the standard Hamiltonian path $h_{d+1}$. We now consider two cases.
Case $0 \le j < 2^d$ ($b$ is in the first half of the Hamiltonian path): By the induction hypothesis for $0 \le i < d$, we have $b_i = j_i \oplus j_{i+1}$ as required. Also, since $j_{d+1} = j_d = 0$, we have $b_d = 0 = j_d \oplus j_{d+1}$ as required.
Case $2^d \le j < 2^{d+1}$ ($b$ is in the second half of the Hamiltonian path): We are given that the $j^{\mathrm{th}}$ element of $h_{d+1}$ is $b$. Since $j$ points to an element in the latter half of $h_{d+1}$, which is $\overline{h'_d}$ with $h'_d = h_d + 2^d$, we have $2^d \le b < 2^{d+1}$. Therefore $b_d = j_d = 1$ and $j_{d+1} = 0$, which implies that $b_d = j_d \oplus j_{d+1}$. So we now consider indices $0 \le i < d$. The strategy is to express the $j^{\mathrm{th}}$ element, $b$, of $h_{d+1}$ as the $(j')^{\mathrm{th}}$ element, $b'$, of $h_d$, developing a relationship between $j, b$ and $j', b'$. Once this is done, the induction hypothesis will establish the result.
The $j^{\mathrm{th}}$ element of $h_{d+1}$ is the $(j - 2^d)^{\mathrm{th}}$ element of $\overline{h'_d}$ that makes the second half of $h_{d+1}$ (see Figure 2.5). If the $x^{\mathrm{th}}$ element of $\overline{h'_d}$ is $y$, then the $(2^d - 1 - x)^{\mathrm{th}}$ element of $h_d$ is $(y - 2^d)$. Therefore, if the $(j - 2^d)^{\mathrm{th}}$ element of $\overline{h'_d}$ is $b$, then the $j' = (2^d - 1 - (j - 2^d))^{\mathrm{th}} = (2^{d+1} - 1 - j)^{\mathrm{th}}$ element of $h_d$ is $b' = b - 2^d$. This element satisfies $b'_i = j'_i \oplus j'_{i+1}$ by the induction hypothesis.
FIGURE 2.5: An illustration of the proof of Lemma 2.4
Note that $b'$ and $b$ differ only in bit $d$, where $b'_d = 0$ while $b_d = 1$. For $0 \le i < d$, $b'_i = b_i$. In fact, if $b = X + 2^d$, then $b' = X$ (see Figure 2.6). Let $j = 2^d + Y$. Then $j' = (2^{d+1} - 1) - j = 2^d - 1 - Y$ (see Figure 2.7). For each $0 \le i < d$, the $i^{\mathrm{th}}$ bit of $2^{d+1} - 1$ is a 1. So $j'_i = 1$ iff $j_i = 0$. That is, $j'_i$ is the one's complement of $j_i$ (this also holds for $i = d$, since $j'_d = 0$ and $j_d = 1$). Now for any $0 \le i < d$, $b_i = b'_i = j'_i \oplus j'_{i+1} = \overline{j_i} \oplus \overline{j_{i+1}} = j_i \oplus j_{i+1}$. Also, $b_d = 1 = 1 \oplus 0 = j_d \oplus j_{d+1}$.
FIGURE 2.6: Binary representations of $b$ and $b'$
FIGURE 2.7: Binary representations of $j$ and $j'$
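Lemma 2.4 says that bit $i$ of $h_d(j)$ is the exclusive-or of bits $i$ and $i+1$ of $j$, which is the same as computing $j \oplus \lfloor j/2 \rfloor$. The sketch below checks this closed form against the recursive definition for small $d$; both function names are illustrative assumptions.

```python
def standard_path(d):
    """Standard Hamiltonian path h_d, following the recursion of Equation (2.1)."""
    if d == 0:
        return [0]
    prev = standard_path(d - 1)
    half = 2 ** (d - 1)
    return prev + [half + node for node in reversed(prev)]


def closed_form(j):
    """b with b_i = j_i XOR j_(i+1): XOR j with itself shifted right by one bit."""
    return j ^ (j >> 1)


for d in range(1, 9):
    path = standard_path(d)
    assert all(path[j] == closed_form(j) for j in range(2 ** d))

print([closed_form(j) for j in range(8)])  # [0, 1, 3, 2, 6, 7, 5, 4]
```

The condition $j_d = 0$ holds automatically here because $j < 2^d$, so the shifted value contributes a 0 at bit $d-1$.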
2.3 Interprocessor Communication on $H_d$
A point-to-point, $N$-processor network consists of a set $P = \{p_0, p_1, \ldots, p_{N-1}\}$ of processors connected pairwise by communication links. This system can be represented by a directed graph $G = (P, E)$ where edge set $E$ is the set of communication links between processor pairs. The hypercube outlined in the previous section is an example of such a network.
A communication $C$ in a point-to-point network is simply a set of edges of its graph; that is, $C \subseteq E$. Figure 2.8 shows five communications $C_a$ – $C_e$ for hypercube $H_3$. Notice that the communication in Figure 2.8(e) is a subset of that in Figure 2.8(d). In this thesis we consider weak communications. In a weak communication a processor sends at most one message and receives at most one message at a given step, regardless of the number of neighboring processors in the underlying topology. Figure 2.8 (a)–(c) illustrate weak communications on $H_3$. Here, node 0 has nodes 1, 2, 4 as neighbors. Therefore in a step, node 0 can send to any one of nodes 1, 2 or 4 and receive from any one of nodes 1, 2 or 4. In Figure 2.8 (a) node 0 sends and receives from node 1. In fact, in this figure all communications are along dimension 0 of $H_3$. If all nodes in the hypercube either send or receive a message through a particular dimension-$b$ edge at a given time, then that kind of communication is called a uniform communication [32]. However, a weak communication need not be confined to one dimension as shown in Figure 2.8 (b). Here node 0 sends to node 2 but receives from node 4. In Figure 2.8 (c) node 1 sends to node 3 and receives from node 0. In fact, this figure shows a standard Hamiltonian path of $H_3$ (see Section 2.2.1). Figures 2.8 (d) and (e) do not represent weak communications as, for example, node 0 sends to nodes 2 and 4 simultaneously.
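The weak-communication condition is easy to test mechanically. The sketch below takes a communication as a collection of directed (sender, receiver) pairs and checks that every pair is a hypercube edge and that no node sends or receives more than once. The function names and the sample edge sets are illustrative assumptions; they are not transcriptions of the edge sets drawn in Figure 2.8.

```python
from collections import Counter


def is_hypercube_edge(p, q):
    """True when p and q differ in exactly one bit position."""
    diff = p ^ q
    return diff != 0 and (diff & (diff - 1)) == 0


def is_weak(communication):
    """True when each node sends at most once and receives at most once."""
    edges = set(communication)
    if not all(is_hypercube_edge(p, q) for p, q in edges):
        return False
    sends = Counter(p for p, _ in edges)
    receives = Counter(q for _, q in edges)
    return (all(n <= 1 for n in sends.values())
            and all(n <= 1 for n in receives.values()))


# Edges along the standard Hamiltonian path of H_3.
path = [0, 1, 3, 2, 6, 7, 5, 4]
along_path = list(zip(path, path[1:]))
# Every node exchanges with its dimension-0 neighbor.
dimension_0 = [(p, p ^ 1) for p in range(8)]

print(is_weak(along_path))        # True
print(is_weak(dimension_0))       # True
print(is_weak([(0, 2), (0, 4)]))  # False
```

The last set fails because node 0 would have to send to nodes 2 and 4 in the same step.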
FIGURE 2.8: Weak communication in a 3-dimensional hypercube: (a) $C_a$, (b) $C_b$, (c) $C_c$, (d) $C_d$, (e) $C_e$
2.4 Optical Implementation of Interprocessor Communication
In this section we discuss a general optical slab waveguide and explain how a communication
can be performed on this structure.
2.4.1 Slab Waveguides
An optical slab waveguide is made of any transparent material that can be used to transmit light. It can be viewed as consisting of three sections: (a) the input, (b) the stem and (c) the output (see Figure 2.9). The input section is the portion through which light is inserted into the slab. Depending on the slab, this portion may have different geometries. Figure 2.9 simply shows a flat surface through which light is inserted into the slab. Similarly the output is the portion of the slab through which light exits. Again this can be of different geometries and Figure 2.9 shows one possibility. The portion between the input and the output is the stem and it is simply a conduit for the light that maintains any distinguishing property of the inserted light; for our purpose such distinguishing properties would be the mode and wavelength of the light.
FIGURE 2.9: An optical slab (input, stem, output)
The input light can be in a range of wavelengths. Theoretically, this wavelength can be from 400 nm to 1550 nm [1]. In fact, several different wavelengths can be transmitted simultaneously on the slab. This is called Wavelength Division Multiplexing (WDM) [24, 29, 34]. This multiplexed signal is separated (demultiplexed) at the output. Current technology for fibers (DWDM, Dense Wavelength Division Multiplexing) allows more than 150 wavelengths to be multiplexed [9]. These wavelengths are in the 1550 nm range, for which attenuation is small in fibers. For slabs (that are used over much shorter distances and have insignificant attenuation), the range of useful wavelengths is much bigger and the number of signals that can be multiplexed is much larger.
Unlike fibers that need to avoid dispersion and transmit a single or a few modes, a slab (used over short distances) can support multiple modes. This allows light to be input in different modes (that roughly translate to "angles" for our purpose). If the slab preserves the modes, then signals input at different modes will also exit the slab at different modes. Thus two signals of the same wavelength can be distinguished by their modes. This form of multiplexing is called Mode Division Multiplexing (MDM) [3, 11].
In short, a slab can support both WDM and MDM. As noted above, currently it is possible to have many more than 150 wavelengths, and a slab of cross section 1 mm$^2$ can support over 2000 MDM angles [11]. So this slab can carry in excess of 300,000 signals. The data rate of these signals is limited only by the capabilities of the lasers and detectors (not by the slab itself). Currently, lasers and detectors operating at 40 Gbps are available [21]. Thus the slab can carry in excess of 300,000 channels, each operating at 40 Gbps, or a total of 12 Pbps.
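The medium-bandwidth estimate is a product of three quoted figures, and a short sketch makes the arithmetic explicit. The variable names are illustrative; the numbers are the lower bounds cited in the text (150 wavelengths, 2000 modes, 40 Gbps per channel).

```python
wavelengths = 150  # WDM wavelengths (quoted lower bound)
modes = 2000       # MDM angles for a 1 mm^2 slab (quoted lower bound)
rate_gbps = 40     # per-channel laser/detector data rate in Gbps

channels = wavelengths * modes
total_gbps = channels * rate_gbps
total_pbps = total_gbps / 1_000_000  # 1 Pbps = 10^6 Gbps

print(channels)    # 300000
print(total_pbps)  # 12.0
```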
Figure 2.10 illustrates the optical setting that we consider in this thesis. The input to the slab waveguide is a set of lasers. As mentioned above these lasers can operate in a range of wavelengths and modes. Each input signal occupies a channel in the slab.
FIGURE 2.10: Slab structure considered in this thesis (lasers, input hardware, waveguide, output hardware, detectors)
The input to the slab can be single/multiple lasers operating at different wavelengths, or it could be single/multiple lasers operating at different modes. For the waveguide to maintain the mode and wavelength of the light that passes through it, the input to the slab must be collimated. To input light at different wavelengths, either several lasers, each capable of generating a single wavelength, or a single tunable laser capable of tuning to relatively few adjacent frequencies (one at a time), can be used. If the application uses adjacent wavelengths, it is possible to save on optical hardware by replacing several lasers each operating at a fixed wavelength by a single tunable laser capable of tuning to one of a set of adjacent frequencies.
To input light in multiple modes, either multiple lasers, each transmitting collimated light at a fixed mode (angle), or a single laser transmitting in different modes can be used. To select one out of a range of modes, one could use "spatial light modulators" [25]. As in the case of tunable lasers, it is possible to save on optical hardware by using a spatial light modulator in the place of several single mode lasers.
If the slab can carry signals in $M$ modes and $W$ wavelengths, then it can support $MW$ channels. We use the doublet $(\mu, \lambda)$, where $\mu$ is the mode and $\lambda$ is the wavelength, to represent a channel. The entire set of channels is called the channel array; it is an $M \times W$ array of channels $(\mu, \lambda)$.
Each channel $(\mu, \lambda)$ is available at the output of the waveguide. The channels need to be demultiplexed and separated. The number of detectors used affects the cost of the interconnect. Using the same detector for adjacent channels reduces the number of detectors. Although this means bigger and slower detectors, the number of wires used to connect them can be significantly reduced.
2.4.2 Mapping
To perform interprocessor communication on the slab, each edge of the topology must be mapped to a channel of the channel array. Let the slab support $M$ modes $\mu_0, \mu_1, \ldots, \mu_{M-1}$ and $W$ wavelengths $\lambda_0, \lambda_1, \ldots, \lambda_{W-1}$. As noted earlier, each channel is a mode-wavelength pair $(\mu_i, \lambda_j)$. The mapping we seek here is an injective function $f : E \rightarrow \mathcal{M} \times \mathcal{W}$, where $\mathcal{M} = \{\mu_i : 0 \le i < M\}$, $\mathcal{W} = \{\lambda_j : 0 \le j < W\}$ and $E$ is the edge set of the topology.
This mapping can be specified as two $M \times W$ arrays Src and Dst that contain node indices. For any $0 \le i < M$ and $0 \le j < W$, if the directed edge $e = (p, q)$ is mapped by $f$ to channel $(\mu_i, \lambda_j)$, then $Src(i, j) = p$ and $Dst(i, j) = q$.
In this thesis we consider a hypercube, where an edge $e = (p, q)$ can also be specified as a dimension-$b$ edge from source $p$ or to destination $q$. That is, given $b$, only one of $p$ or $q$ need be specified. Thus we can also use another $M \times W$ array, Dim, along with Src or Dst. If the above edge $(p, q)$ is a dimension-$b$ edge, then $Dim(i, j) = b$. If $Src(i, j) = p$, $Dst(i, j) = q$ and $Dim(i, j) = b$, then $p|_b = q$ and $q|_b = p$. Thus we need only specify any two of Src, Dst and Dim to fully specify the mapping. In most of this thesis we use the arrays Dim and Dst.
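Since $p|_b$ is $p$ with bit $b$ complemented, recovering Src from Dst and Dim is a single exclusive-or per entry. The following Python sketch illustrates this; the function names and the small $1 \times 2$ example (a 1-dimensional hypercube) are our own illustrative assumptions.

```python
def flip_bit(node, b):
    """Return node|_b, the dimension-b neighbor of node (bit b complemented)."""
    return node ^ (1 << b)

def src_from_dst_dim(dst, dim):
    """Derive array Src entry by entry from arrays Dst and Dim (lists of rows)."""
    return [[flip_bit(q, b) for q, b in zip(dst_row, dim_row)]
            for dst_row, dim_row in zip(dst, dim)]

# Hypothetical 1 x 2 channel array for a 1-dimensional hypercube (nodes 0 and 1).
dst = [[0, 1]]
dim = [[0, 0]]
src = src_from_dst_dim(dst, dim)
print(src)
```

Applying `flip_bit` twice with the same dimension returns the original node, which is the statement $p|_b = q$ and $q|_b = p$.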
2.4.3 Aggregates
In this thesis we consider weak hypercubes, in which a node cannot use all its edges simultaneously. The mapping must use this fact to reduce the number of lasers and detectors.
If two channels $(\mu_i, \lambda_j)$ and $(\mu_i, \lambda_{j+1})$ (that share the same mode and have adjacent wavelengths) can never be used simultaneously (possibly because they are mapped to edges incident on the same node), then a single tunable laser positioned at mode $\mu_i$ can switch between wavelengths $\lambda_j$ and $\lambda_{j+1}$, depending on which channel the node wishes to use. Similarly, if channels $(\mu_i, \lambda_j)$ and $(\mu_{i+1}, \lambda_j)$ cannot be used simultaneously, light from a single laser can be spread over the two channels, with a spatial light modulator (SLM) regulating the transmission of light in each. Likewise, at the destination, a single detector can be used to detect the signals on adjacent channels that cannot be used simultaneously.
Vaidyanathan and Sethuraman [33] have captured this idea in the notion of aggregates. A source (resp. destination) aggregate is a maximal stretch of contiguous entries in a row or column of array Src (resp. Dst) such that each entry in this stretch is either empty (unused) or has the same processor index. Figures 2.11, 2.12 and 2.13 show aggregates circled for three mappings. An aggregate containing fewer than two non-empty entries is called a trivial aggregate. Figures 2.11, 2.12 and 2.13 do not circle trivial aggregates. Also, these figures contain no empty entries.
The source aggregate in row 0 and columns 0 and 1 of Figure 2.13 represents a single laser, positioned at mode $\mu_0$, that tunes to wavelength $\lambda_0$ or $\lambda_1$. Since processor 1 uses channels $(\mu_0, \lambda_0)$ and $(\mu_0, \lambda_1)$ to communicate with processors 0 and 3, it cannot do so simultaneously (by the definition of a weak model). Therefore a single tunable laser suffices.
Thus, the number of source aggregates equals the number of lasers needed. In fact, the number of source aggregates labeled with processor index $p$ is the number of lasers needed for processor $p$. Similarly, the number of destination aggregates with entry $q$ is the number of detectors needed for processor $q$. For example, Mapping 1 of Figure 2.11 needs 24 lasers and 19 detectors, Mapping 2 of Figure 2.12 needs 24 lasers and 8 detectors, and Mapping 3 of Figure 2.13 needs 12 lasers and 8 detectors. Thus, mappings of the same hypercube can differ substantially in the hardware they require.
In this thesis we use "standard mappings" [31, 33]. A standard mapping uses an $R \times N$ channel array for an $N$-node topology. In addition, the destination array has wavelength aggregates that span entire columns (as in Figures 2.12 and 2.13). It can be shown that for a standard mapping of a simple graph, all source aggregates are mode aggregates. Note that the destination array of a standard mapping can be specified by giving just $N$ entries, one per column, as each column forms one aggregate. For example, the array Dst for Mapping 3 in Figure 2.13 could be specified as the list $\langle 0, 3, 6, 5, 1, 2, 7, 4 \rangle$.
A detailed description of aggregates and standard mappings appears in Sethuraman [31].
FIGURE 2.11: Mapping 1 (arrays Src, Dst and Dim)
FIGURE 2.12: Mapping 2 (arrays Src, Dst and Dim)
FIGURE 2.13: Mapping 3 (arrays Src, Dst and Dim)
FIGURE 2.14: A way to specify the standard mapping, Mapping 3 of Figure 2.13 (array Dim together with the Dst list: 0 3 6 5 1 2 7 4)
In the rest of this thesis, we assume a standard mapping and aim to reduce the number of source aggregates.
A standard mapping does not specify the number of rows. For a $d$-dimensional hypercube with $2^d$ nodes and $d2^d$ directed edges, a standard mapping requires an $R \times 2^d$ channel array, where $R \ge d$. When $R = d$, we call the mapping dense, as all channels of the channel array are used for hypercube edges. Otherwise it is a sparse mapping.
In this thesis we will map a $d$-dimensional hypercube on an $R \times 2^d$ channel array, where $R = d$ or $R = 2^{d-1}$. We will use a standard mapping and specify it as an $R \times 2^d$ dimension array (Dim) and a $2^d$-element destination list. For example, Mapping 3 of Figure 2.13 will be specified as shown in Figure 2.14. From the destination list and the dimension array, the array Src can be derived: each entry of Src is obtained by taking the destination of its column and complementing the bit position given by the corresponding entry of Dim. In Figure 2.14, consider $Dim(2, 3) = 1$. Complementing bit 1 in the binary representation of the element in position 3 of the Dst list (which is $5 = 101_2$) gives $5|_1 = 111_2 = 7$. This is the value of the entry $Src(2, 3)$ for Mapping 3 (see Figure 2.13).
Chapter 3
Dense Mapping
This chapter deals with mapping a $d$-dimensional weak hypercube onto an optical slab waveguide. As noted in Chapter 2, this involves specifying two of the following three arrays: (1) the source array, Src, (2) the destination array, Dst, and (3) the dimension array, Dim, all of which are of size $d \times N$. (Recall that for a dense mapping the size of the channel array is $d \times N$, where $d$ is the number of dimensions of the hypercube and $N = 2^d$ is the number of its nodes.) The array Dst uses wavelength aggregates (that span entire columns), as a result of which the array Src has mode aggregates (that run along rows). Because the destination array has aggregates spanning entire columns, one can represent Dst as an $N$-element list. In this chapter, we define the mapping by specifying the destination (Dst) and dimension (Dim) arrays.
We begin by defining arrays Dst and Dim in Sections 3.1 and 3.2, respectively; Section 3.3 derives array Src from them. The proof that these definitions work for a weak hypercube follows in Section 3.4. Finally, we determine the number of aggregates (that is, the numbers of lasers and detectors) of the mapping in Section 3.4.2.
3.1 Construction of the Destination Array, Dst
In a standard mapping, each column of the $d \times N$ destination array forms one aggregate (see Figure 3.1); i.e., all entries in a column are the same. The hypercube we consider in this thesis has $N$ nodes and therefore the array Dst has $N$ columns, and hence $N$ aggregates. Since each column has only one distinct element, the array Dst can be specified as an $N$-element list of nodes of the hypercube. We use the standard Hamiltonian path to construct the list that specifies the entries of the destination array.
FIGURE 3.1: Structure of the destination array ($d$ rows and $N$ columns, each column forming one aggregate)
For any $d \ge 1$, let $j$ be a $d$-bit number. The well-known shuffle permutation $\sigma : \{0, 1, \ldots, 2^d - 1\} \rightarrow \{0, 1, \ldots, 2^d - 1\}$ is defined as follows.
$$\sigma(j) = \begin{cases} 2j, & \text{if } 0 \le j < 2^{d-1} \\ 2j + 1 - 2^d, & \text{if } 2^{d-1} \le j < 2^d \end{cases}$$
If $j$ has the binary representation $j_{d-1} j_{d-2} \cdots j_1 j_0$, where $j_\ell \in \{0, 1\}$ for each $0 \le \ell < d$, then $\sigma(j)$ is the number whose binary representation is a left rotation of that of $j$. For example, if $d = 3$, then $\sigma(2) = \sigma(010_2) = 100_2 = 4$, while $\sigma(5) = \sigma(101_2) = 011_2 = 3$.
Coming back to the array Dst, let $h_d = \langle h_d(j) : 0 \le j < 2^d \rangle$ be the standard Hamiltonian path of a $d$-dimensional hypercube $H_d$. Let $\hat{h}_d = \langle h_d(\sigma(j)) : 0 \le j < 2^d \rangle$ be the shuffled Hamiltonian path. For example, with $d = 3$, $h_3 = \langle 0, 1, 3, 2, 6, 7, 5, 4 \rangle$ and $\hat{h}_3 = \langle 0, 3, 6, 5, 1, 2, 7, 4 \rangle$. Assign the list $\hat{h}_d$ to the columns of array Dst. Specifically, for $0 \le i < d$ and $0 \le j < 2^d$, the entry $Dst(i, j) = \hat{h}_d(j) = h_d(\sigma(j))$. Figure 3.2 shows array Dst for $d = 3$.
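The construction of the destination list can be sketched in a few lines of Python. We assume here that the standard Hamiltonian path of Definition 2.3 is the binary-reflected Gray code, which agrees with the lists $h_3$ and $h_4$ quoted in this section; the function names are ours.

```python
def gray(j):
    """h_d(j): j-th node of the binary-reflected Gray-code path (assumed standard path)."""
    return j ^ (j >> 1)

def shuffle(j, d):
    """sigma(j): rotate the d-bit representation of j left by one position."""
    half = 1 << (d - 1)
    return 2 * j if j < half else 2 * j + 1 - (1 << d)

def shuffled_path(d):
    """The destination list hat-h_d = < h_d(sigma(j)) : 0 <= j < 2^d >."""
    return [gray(shuffle(j, d)) for j in range(1 << d)]

print(shuffled_path(3))   # destination list for d = 3
```

For $d = 3$ this reproduces the list $\langle 0, 3, 6, 5, 1, 2, 7, 4 \rangle$ given above.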
As another example, with $d = 4$, $h_4 = \langle 0, 1, 3, 2, 6, 7, 5, 4, 12, 13, 15, 14, 10, 11, 9, 8 \rangle$. The list $\hat{h}_4$ is obtained by taking the even-indexed elements of $h_4$ followed by its odd-indexed elements, which gives $\langle 0, 3, 6, 5, 12, 15, 10, 9, 1, 2, 7, 4, 13, 14, 11, 8 \rangle$. This implies that array Dst has the form in Figure 3.3. The two halves of the array Dst have been separated in Figure 3.3 for reasons that will be clearer later.
FIGURE 3.2: Array Dst for d = 3 (each of the 3 rows is 0 3 6 5 1 2 7 4)
FIGURE 3.3: Array Dst for d = 4 (each of the 4 rows is 0 3 6 5 12 15 10 9 in the first half and 1 2 7 4 13 14 11 8 in the second half)
We now state a technical result that is used later.
Lemma 3.1 For $d \ge 2$, $\hat{h}_d = \hat{h}^0_d \circ \hat{h}^1_d \circ \hat{h}^2_d \circ \hat{h}^3_d$, where for any $0 \le i < 2^{d-2}$,
$$\begin{aligned}
\hat{h}^0_d(i) &= h_{d-1}(2i) \\
\hat{h}^1_d(i) &= 2^{d-1} + h_{d-1}(2^{d-1} - 1 - 2i) \\
\hat{h}^2_d(i) &= h_{d-1}(2i + 1) \\
\hat{h}^3_d(i) &= 2^{d-1} + h_{d-1}(2^{d-1} - 2 - 2i).
\end{aligned}$$
Proof: Figure 3.4 shows the relationship between $h_d$, $\hat{h}_d$ and $\hat{h}^x_d$ for $0 \le x \le 3$ when $d = 4$.
FIGURE 3.4: Relation between $h_d$, $\hat{h}_d$ and $\hat{h}^x_d$
Recall that $h_d = h_{d-1} \circ h'_{d-1}$, where $h'_{d-1} = \overline{(h_{d-1} + 2^{d-1})}$ and $\overline{s}$ denotes the reverse of string $s$ (see Definition 2.3).
From Figure 3.4 it is clear that $\hat{h}^0_d$ consists simply of the even-indexed elements of $h_{d-1}$. This relationship can be proved more formally by induction on $d$. Therefore, for $0 \le i < 2^{d-2}$, $\hat{h}^0_d(i) = h_{d-1}(2i)$. Similarly, $\hat{h}^2_d$ consists of the odd-indexed elements of $h_{d-1}$, which establishes that for $0 \le i < 2^{d-2}$, $\hat{h}^2_d(i) = h_{d-1}(2i + 1)$.
Again from Figure 3.4, $\hat{h}^1_d$ consists of the even-indexed elements of $h'_{d-1}$, as a result of which for each $0 \le i < 2^{d-2}$ we have
$$\hat{h}^1_d(i) = h'_{d-1}(2i). \tag{3.1}$$
The following facts stem from Definition 2.3 (8) applied to the current context.
$$h'_{d-1}(j) = (h_{d-1} + 2^{d-1})(2^{d-1} - 1 - j) \tag{3.2}$$
Note that we have used the fact that $h'_{d-1}$ is a $2^{d-1}$-element string.
$$(h_{d-1} + 2^{d-1})(j) = 2^{d-1} + h_{d-1}(j) \tag{3.3}$$
From Equations 3.1, 3.2 and 3.3 we have
$$\hat{h}^1_d(i) = h'_{d-1}(2i) = (h_{d-1} + 2^{d-1})(2^{d-1} - 1 - 2i) = 2^{d-1} + h_{d-1}(2^{d-1} - 1 - 2i). \tag{3.4}$$
Finally, for $\hat{h}^3_d$ we proceed as for $\hat{h}^1_d$, with the observation that $\hat{h}^3_d$ consists of the odd-indexed elements of $h'_{d-1}$. The expression for $\hat{h}^3_d$ can be obtained by replacing $2i$ in Equation 3.4 by $2i + 1$.
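Lemma 3.1 can also be checked numerically for small $d$. The sketch below is only an illustration, not a substitute for the proof; it again assumes that the standard Hamiltonian path is the binary-reflected Gray code, and the function names are ours.

```python
def gray(j):
    """h(j): binary-reflected Gray code (assumed to be the standard Hamiltonian path)."""
    return j ^ (j >> 1)

def shuffled_path(d):
    """hat-h_d, built directly from the shuffle permutation."""
    n, half = 1 << d, 1 << (d - 1)
    return [gray(2 * j if j < half else 2 * j + 1 - n) for j in range(n)]

def lemma_3_1_pieces(d):
    """Concatenation of the four pieces hat-h^0_d .. hat-h^3_d of Lemma 3.1."""
    quarter, half = 1 << (d - 2), 1 << (d - 1)
    p0 = [gray(2 * i) for i in range(quarter)]
    p1 = [half + gray(half - 1 - 2 * i) for i in range(quarter)]
    p2 = [gray(2 * i + 1) for i in range(quarter)]
    p3 = [half + gray(half - 2 - 2 * i) for i in range(quarter)]
    return p0 + p1 + p2 + p3

# Compare both constructions for a range of small dimensions.
print(all(lemma_3_1_pieces(d) == shuffled_path(d) for d in range(2, 11)))
```

Note that $h_{d-1}(i)$ and $h_d(i)$ coincide for $i < 2^{d-1}$, so a single `gray` function serves for every dimension.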
3.2 Construction of the Dimension Array, Dim
The dimension array contains dimension numbers between $0$ and $d - 1$. For any $0 \le i < d$ and $0 \le j < 2^d$, if $Dst(i, j) = x$ and $Dim(i, j) = y$, then they denote the edge $(x|_y, x)$, where $x|_y$ is the dimension-$y$ neighbor of $x$; that is, the binary representation of $x|_y$ is the binary representation of $x$ with bit $y$ complemented. Recall that all entries within a column of the destination array are the same. Therefore, each column $j$ of the array Dim must contain all $d$ dimensions, so that the processor in column $j$ of array Dst can communicate with all its $d$ neighbors in the hypercube. Figure 3.5 shows the dimension array for $d = 1$ and $d = 2$.
FIGURE 3.5: Examples of the array Dim: (a) d = 1, (b) d = 2
To define the entries of array Dim for $d \ge 3$, we introduce three quantities $\alpha_j$, $\beta_j$ and $\gamma_j$ for each column $0 \le j < N$. These three quantities fix the values of three entries in each column $j$ of the dimension array. The remaining entries in a column will be the remaining dimensions in any order. The motivation for specifying these three entries for every column will be evident later, when we derive the number of aggregates for this mapping.
For any integer $j \ge 0$, define $R_0(j)$ to be the position of the rightmost 0 in the binary representation of $j$; the lsb is in position 0. For example, if $j = 5 = 101_2$ then the rightmost 0 is in position 1, so $R_0(5) = 1$. If $j = 7 = 111_2$ then the rightmost 0 is in position 3, so $R_0(7) = 3$. It is easy to see that for any even integer $j$, $R_0(j) = 0$.
Lemma 3.2 For any $b$-bit number $j$:
(a) $R_0(j) = b$ iff $j = 2^b - 1$
(b) $R_0(j) = b - 1$ iff $j = 2^{b-1} - 1$
(c) $R_0(j) < b - 1$ iff $j \ne 2^b - 1$ and $j \ne 2^{b-1} - 1$
Proof: We consider the three cases.
Case (a): Since $2^b - 1 = 0\,\underbrace{11\cdots1}_{b \text{ bits}}$ in binary, the rightmost 0 is in position $b$; i.e., $R_0(j) = b$.
Case (b): As in the previous case, the binary representation of $2^{b-1} - 1$ is $0\,\underbrace{11\cdots1}_{b-1 \text{ bits}}$. Therefore, the rightmost 0 is in position $b - 1$ (the msb); i.e., $R_0(j) = b - 1$.
Case (c): From the above two cases it is clear that $R_0(j)$ can be $b$ or $b - 1$ only if $j = 2^b - 1$ or $j = 2^{b-1} - 1$. Therefore, $R_0(j)$ has to be less than $b - 1$ if $j \ne 2^b - 1$ and $j \ne 2^{b-1} - 1$.
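As a small illustration, $R_0(j)$ can be computed by shifting out trailing 1 bits. The Python sketch below (the function name is ours) also checks the three cases of Lemma 3.2 exhaustively for a few small values of $b$.

```python
def rightmost_zero(j):
    """R_0(j): position of the rightmost 0 bit of j >= 0 (the lsb is position 0)."""
    pos = 0
    while j & 1:          # skip over trailing 1 bits
        j >>= 1
        pos += 1
    return pos

def lemma_3_2_holds(b):
    """Check cases (a)-(c) of Lemma 3.2 for every b-bit number j."""
    for j in range(1 << b):
        r = rightmost_zero(j)
        if j == (1 << b) - 1:
            ok = (r == b)
        elif j == (1 << (b - 1)) - 1:
            ok = (r == b - 1)
        else:
            ok = (r < b - 1)
        if not ok:
            return False
    return True

print(rightmost_zero(5), rightmost_zero(7))
print(all(lemma_3_2_holds(b) for b in range(1, 12)))
```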
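We now define $\alpha_j$, $\beta_j$ and $\gamma_j$ for each column $j$.
Definition 3.3 For any $d \ge 3$ (where $d$ is the number of dimensions in the hypercube) and $0 \le j < 2^{d-1}$,
$$\alpha^d_j = \alpha^d_{j+2^{d-1}} = 0$$
$$\beta^d_j = \beta^d_{j+2^{d-1}} = \begin{cases} d - 1, & \text{if } j = 0 \\ \gamma^d_{j-1}, & \text{if } j > 0 \end{cases}$$
$$\gamma^d_j = \gamma^d_{j+2^{d-1}} = \begin{cases} d - 1, & \text{if } j = 2^{d-1} - 1 \\ 1 + R_0(j), & \text{otherwise} \end{cases}$$
Remark: If the dimension $d$ is implicit, we write $\alpha^d_j$, $\beta^d_j$ and $\gamma^d_j$ simply as $\alpha_j$, $\beta_j$ and $\gamma_j$.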
Lemma 3.4 For $d \ge 3$ and $0 \le j < 2^d$, we have $0 \le \alpha_j, \beta_j, \gamma_j < d$ and $\alpha_j$, $\beta_j$, $\gamma_j$ are distinct.
Proof: Clearly, $\alpha_j = 0$ is in the right range. Also $\gamma_j > 0$, as it is $d - 1$ (where $d \ge 3$ and hence $d - 1 \ge 2$) or $1 + R_0(j)$. So $\alpha_j \ne \gamma_j$. Likewise $\beta_j > 0$, as $\beta_j$ equals $d - 1$ or $\gamma_{j-1}$, both of which are $> 0$. So $\alpha_j \ne \beta_j$. In the remainder of this proof, we establish that $0 < \beta_j, \gamma_j < d$ and that $\beta_j \ne \gamma_j$.
Since $d \ge 3$, the value $d - 1$ for $\beta_j$ or $\gamma_j$ is in the right range. From Lemma 3.2, $R_0(j) < d - 1$, except when $j = 2^d - 1$ or $j = 2^{d-1} - 1$. So for all other $j$, $\gamma_j = 1 + R_0(j) < d$. For the two cases noted above, $\gamma_j = d - 1$. So for all $0 \le j < 2^d$, we have $\gamma_j < d$. This implies that whenever $\beta_j = \gamma_{j-1}$, we also have $\beta_j < d$.
To prove $\beta_j \ne \gamma_j$ we first observe that $\beta_0 = d - 1 \ne 1 = \gamma_0$. We now show that for all $0 < j < 2^d$, $\gamma_j \ne \gamma_{j-1} = \beta_j$.
Consider the case where $j, j - 1 \notin \{2^d - 1, 2^{d-1} - 1\}$. Here $j$ is odd iff $j - 1$ is even. Also $R_0(j) = 0$ iff $j$ is even. So $R_0(j) \ne R_0(j - 1)$. This implies that $\gamma_j \ne \gamma_{j-1}$.
If $j \in \{2^d - 1, 2^{d-1} - 1\}$, then $\gamma_j = d - 1$ but $\gamma_{j-1} = 1$, as $2^d - 2$ and $2^{d-1} - 2$ are even numbers. Similarly, if $j - 1 \in \{2^d - 1, 2^{d-1} - 1\}$, then $j$ is an even number with $\gamma_j = 1$ and $\gamma_{j-1} = d - 1 \ge 2$.
FIGURE 3.6: Positions of $\alpha$, $\beta$ and $\gamma$ in the dimension array
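The quantities of Definition 3.3 and the claim of Lemma 3.4 are easy to check mechanically. The following Python sketch is illustrative only; the function names are ours, and indices in the second half of the columns are reduced modulo $2^{d-1}$, as in the definition.

```python
def rightmost_zero(j):
    """R_0(j): position of the rightmost 0 bit of j (lsb is position 0)."""
    pos = 0
    while j & 1:
        j >>= 1
        pos += 1
    return pos

def alpha(j, d):
    return 0

def gamma(j, d):
    j %= 1 << (d - 1)                 # second half of the columns repeats the first
    return d - 1 if j == (1 << (d - 1)) - 1 else 1 + rightmost_zero(j)

def beta(j, d):
    j %= 1 << (d - 1)
    return d - 1 if j == 0 else gamma(j - 1, d)

def lemma_3_4_holds(d):
    """alpha_j, beta_j, gamma_j lie in [0, d) and are pairwise distinct for every column."""
    for j in range(1 << d):
        triple = (alpha(j, d), beta(j, d), gamma(j, d))
        if not all(0 <= x < d for x in triple) or len(set(triple)) != 3:
            return False
    return True

print([beta(j, 3) for j in range(4)], [gamma(j, 3) for j in range(4)])
print(all(lemma_3_4_holds(d) for d in range(3, 11)))
```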
The above lemma ensures that each column $j$ of array Dim can contain $\alpha_j$, $\beta_j$ and $\gamma_j$, representing edges of $H_d$ in three distinct dimensions. Figure 3.6 describes the arrangement of $\alpha_j$, $\beta_j$ and $\gamma_j$ in column $j$. More formally, for $0 \le j < 2^{d-1}$ consider entries $Dim(i, j)$ and $Dim(i, j + 2^{d-1})$ of array Dim.
$$Dim(i, j) = Dim(i, j + 2^{d-1}) = \begin{cases} \alpha_j & \text{iff } i \equiv j \pmod d \\ \beta_j & \text{iff } i + 1 \equiv j \pmod d \\ \gamma_j & \text{iff } i - 1 \equiv j \pmod d \\ \text{any unused dimension} & \text{otherwise.} \end{cases} \tag{3.5}$$
Notice in Figure 3.6 that $d - 3$ entries of each column $j$ in the dimension array are filled by the remaining dimensions in any order (the "otherwise" clause in the formal description above). Notice also that the arrangement for columns $0 \le j < 2^{d-1}$ repeats for columns $2^{d-1} \le j < 2^d$. Consequently the array Dim (and the array Dst of Figure 3.3) are shown split into two halves. Figure 3.7 is an example of the 4-dimensional array Dim.
FIGURE 3.7: Array Dim for a 4-dimensional hypercube
3.3 Construction of Source Array, Src
Given the destination list and dimension array, the construction of the source array is straightforward: in the binary representation of each destination list element, complement the bit specified by the corresponding element of the array Dim. Specifically, if $\hat{h}_d(j)$ is the element in $Dst(i, j)$, then $Src(i, j) = \hat{h}_d(j)|_{Dim(i,j)}$, that is, $\hat{h}_d(j)$ with bit $Dim(i, j)$ complemented. Figure 3.8 shows array Src and $\hat{h}_4(j)$ for the array Dim in Figure 3.7. The entries in any given column $j$ of the source array are all different, as each is connected to node $\hat{h}_4(j)$ by an edge of a different dimension. Therefore, there are no wavelength aggregates in Src. (If there were, it would mean that an edge has been mapped more than once.) In the next section we determine the mode aggregates of array Src.
FIGURE 3.8: Src array for a 4-dimensional hypercube
3.4 Analyzing the Mapping
In Sections 3.1 and 3.2 we specified the mapping of the hypercube to the slab by constructing arrays Dst and Dim. In this section we first prove the mapping correct (i.e., all edges of hypercube $H_d$ have been mapped). Then we derive the number of aggregates due to this mapping.
3.4.1 Correctness
Observe that all entries in a column of array Dst are the same and entries in different columns are different (this is obvious from the construction). Next note that in array Src all entries in a column are different (this was established by Lemma 3.4 and Equation (3.5)). Therefore, each pair $\langle Src(i, j), Dst(i, j) \rangle$ represents a different directed edge of $H_d$. Since the $d \times 2^d$ channel array has $d2^d$ entries (which is also the number of directed edges in $H_d$), and since all these entries represent different edges, all edges of $H_d$ are covered.
Lemma 3.5 For any $d \ge 1$, the mapping specified in Sections 3.1-3.3 correctly maps each edge of a $d$-dimensional hypercube $H_d$ onto a distinct channel of a $d \times 2^d$ channel array.
We now develop some technical results before we proceed to derive the number of aggregates (Section 3.4.2).
Lemma 3.6 Let $0 \le i < d$ and $0 \le j < 2^{d-1}$. If $Dim(i, j) = \alpha_j$, then $Dim(i, j - 1) = \gamma_{j-1}$ and $Dim(i, j + 1) = \beta_{j+1}$ (if these cells exist in the same half of the array).
Proof: From Equation (3.5), $\alpha_j$ is in entry $Dim(i, j)$ iff $i \equiv j \pmod d$. Let $\alpha_j$ be in row $r_j$, i.e., $i = r_j$. Then from Equation (3.5) it is clear that $\beta_j$ is in row $(r_j - 1) \bmod d$ and $\gamma_j$ is in row $(r_j + 1) \bmod d$. Similarly, if $j > 0$, then in column $j - 1$ we have $\alpha_{j-1}$ in row $r_{j-1} \equiv (j - 1) \pmod d$, where $r_{j-1} = (r_j - 1) \bmod d$. Hence $\beta_{j-1}$ is in row $(r_j - 2) \bmod d$ and $\gamma_{j-1}$ is in row $r_j$. This establishes one half of the lemma, namely that $\alpha_j$ and $\gamma_{j-1}$ are in the same row. The other half, that $\alpha_j$ and $\beta_{j+1}$ are in the same row, follows similarly.
Lemma 3.7 For any $0 \le j < 2^{d-1}$, if $Dim(i, j) = \alpha_j$, then $Src(i, j - 1) = Src(i, j) = Src(i, j + 1)$, if they exist.
Proof: By Lemma 3.6, $Dim(i, j - 1) = \gamma_{j-1}$ and $Dim(i, j + 1) = \beta_{j+1}$. Therefore,
$$Src(i, j) = Dst(i, j)|_{Dim(i,j)} = \hat{h}(j)|_{\alpha_j} = h(2j)|_0 \tag{3.6}$$
$$Src(i, j - 1) = Dst(i, j - 1)|_{Dim(i,j-1)} = \hat{h}(j - 1)|_{\gamma_{j-1}} = h(2j - 2)|_{\gamma_{j-1}} \tag{3.7}$$
$$Src(i, j + 1) = Dst(i, j + 1)|_{Dim(i,j+1)} = \hat{h}(j + 1)|_{\beta_{j+1}} = h(2j + 2)|_{\beta_{j+1}} \tag{3.8}$$
We need to show that for any $0 \le j < 2^{d-1}$, $h(2j)|_0 = h(2j - 2)|_{\gamma_{j-1}} = h(2j + 2)|_{\beta_{j+1}}$ whenever these entries exist.
We know from the construction of array Dst that $\hat{h}(j) = h(2j)$ for $0 \le j < 2^{d-1}$. Therefore columns $j - 1$, $j$ and $j + 1$ of array Dst have entries $h(2j - 2)$, $h(2j)$ and $h(2j + 2)$, respectively. Let $h(2j) = a$, $h(2j - 2) = b$ and $h(2j + 2) = c$, and represent them in binary as follows.
$$\begin{aligned}
h(2j) &= a = a_{d-1} a_{d-2} \cdots a_1 a_0 \\
h(2j - 2) &= b = b_{d-1} b_{d-2} \cdots b_1 b_0 \\
h(2j + 2) &= c = c_{d-1} c_{d-2} \cdots c_1 c_0
\end{aligned}$$
Let $j = 0 j_{d-2} j_{d-3} \cdots j_1 j_0$; recall that $0 \le j < 2^{d-1}$. Let $k$ and $\ell$ (where $0 \le k, \ell < d$) be the positions of the rightmost 0 and the rightmost 1, respectively, in the binary representation of $j$. Then Figures 3.9 and 3.10 show the bit values in $j$, $j + 1$, $j - 1$, $2j$, $2(j + 1)$ and $2(j - 1)$. In those figures $A = \lfloor j/2^{k+1} \rfloor$ and $B = \lfloor j/2^{\ell+1} \rfloor$. Note that $A$ and $B$ are $d - (k + 2)$ and $d - (\ell + 2)$ bit quantities. In fact, $A = j_{d-2} j_{d-3} \cdots j_{k+1}$ and $B = j_{d-2} j_{d-3} \cdots j_{\ell+1}$. Define $\hat{A} = j_{d-2} \hat{j}_{d-3} \cdots \hat{j}_{k+2} j_{k+1}$ and $\hat{B} = j_{d-2} \hat{j}_{d-3} \cdots \hat{j}_{\ell+2} j_{\ell+1}$, with $\hat{j}_p = j_p \oplus j_{p+1}$.
Observe now that any two of $h(2j)$, $h(2j - 2)$ and $h(2j + 2)$ differ in exactly 2 bits (regardless of the case). Figure 3.11 gives the bits where they differ. Since $\alpha_j = 0$, $h(2j)|_{\alpha_j} = h(2j)|_0$ differs from $h(2j - 2)$ only in bit 1 or bit $\ell + 1$ (depending on whether $k > 0$ or $k = 0$). Similarly, $h(2j)|_{\alpha_j}$ differs from $h(2j + 2)$ only in bit $k + 1$ or bit 1.
Thus for $k > 0$
$$h(2j)|_0 = h(2j - 2)|_1 = h(2j + 2)|_{k+1}, \tag{3.9}$$
and for $k = 0$
$$h(2j)|_0 = h(2j - 2)|_{\ell+1} = h(2j + 2)|_1. \tag{3.10}$$
We now show that the bit numbers for $h(2j - 2)$ and $h(2j + 2)$ in Equations 3.9 and 3.10 are precisely $\gamma_{j-1}$ and $\beta_{j+1}$. We consider two cases.
Case $k > 0$: From Figure 3.9, $R_0(j - 1) = 0$, so $\gamma_{j-1} = 1$. Since $j < 2^{d-1}$, we have $0 \le k \le d - 1$. If $k = d - 1$, then $j = 2^{d-1} - 1$, and $Src(i, j + 1)$ would be outside the half of the array Src that we are considering. So $k < d - 1$. This implies that $\gamma_j = 1 + k = \beta_{j+1}$. Thus Equation 3.9 can be written as:
$$h(2j)|_{\alpha_j} = h(2j - 2)|_{\gamma_{j-1}} = h(2j + 2)|_{\beta_{j+1}}.$$
FIGURE 3.9: Bit values for the proof of Lemma 3.7 when $k > 0$ (binary forms of $j$, $2j$, $h(2j)$, $j + 1$, $2(j + 1)$, $h(2j + 2)$, $j - 1$, $2(j - 1)$ and $h(2j - 2)$)
FIGURE 3.10: Bit values for the proof of Lemma 3.7 when $k = 0$ (binary forms of the same quantities)
                 $h(2j-2)$                     $h(2j+2)$
                 $k > 0$      $k = 0$          $k > 0$       $k = 0$
$h(2j)$          $0, 1$       $0, \ell+1$      $0, k+1$      $0, 1$
$h(2j-2)$                                      $1, k+1$      $1, \ell+1$
FIGURE 3.11: Differing bits in the proof of Lemma 3.7
Case $k = 0$: From Figures 3.9 and 3.10, $j - 1$ has its rightmost 0 in position $\ell$. That is, $R_0(j - 1) = \ell$, so $\gamma_{j-1} = \ell + 1$, which is the bit number for $h(2j - 2)$ in Equation 3.10. Also $R_0(j) = k = 0$ (see Figure 3.10). Then $\gamma_j = \beta_{j+1} = 1$. Thus Equation 3.10 can be written as:
$$h(2j)|_{\alpha_j} = h(2j - 2)|_{\gamma_{j-1}} = h(2j + 2)|_{\beta_{j+1}}.$$
Combining both cases, for all entries in columns $0, 1, \ldots, 2^{d-1} - 1$ of array Src we may write
$$h(2j)|_{\alpha_j} = Src(i, j) = h(2j - 2)|_{\gamma_{j-1}} = Src(i, j - 1) = h(2j + 2)|_{\beta_{j+1}} = Src(i, j + 1).$$
This completes the proof.
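The identity established in this proof can be checked numerically for the first half of the columns. The Python sketch below is illustrative only; it assumes that $h$ is the binary-reflected Gray code and uses $\beta_j$ and $\gamma_j$ as in Definition 3.3 (all function names are ours).

```python
def gray(j):
    """h(j): binary-reflected Gray code (assumed standard Hamiltonian path)."""
    return j ^ (j >> 1)

def rightmost_zero(j):
    pos = 0
    while j & 1:
        j >>= 1
        pos += 1
    return pos

def gamma(j, d):
    """gamma_j of Definition 3.3 for a first-half column 0 <= j < 2^(d-1)."""
    return d - 1 if j == (1 << (d - 1)) - 1 else 1 + rightmost_zero(j)

def beta(j, d):
    """beta_j of Definition 3.3 for a first-half column."""
    return d - 1 if j == 0 else gamma(j - 1, d)

def lemma_3_7_holds(d):
    half = 1 << (d - 1)
    for j in range(half):
        centre = gray(2 * j) ^ 1                      # h(2j)|_0, since alpha_j = 0
        if j > 0 and gray(2 * j - 2) ^ (1 << gamma(j - 1, d)) != centre:
            return False                              # left neighbor disagrees
        if j + 1 < half and gray(2 * j + 2) ^ (1 << beta(j + 1, d)) != centre:
            return False                              # right neighbor disagrees
    return True

print(all(lemma_3_7_holds(d) for d in range(3, 11)))
```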
Lemma 3.8 For any $2^{d-1} \le j < 2^d$, if $Dim(i, j) = \alpha_j$, then $Src(i, j - 1) = Src(i, j) = Src(i, j + 1)$, if they exist.
The proof for this case follows along the same lines as that for the $0 \le j < 2^{d-1}$ case, except that bit $d - 1$ will be a 1 instead of a 0. The argument provided still holds, and therefore we skip the proof for this case.
3.4.2 Number of Aggregates
In this section we derive the number of aggregates in the source and destination arrays.
Destination Array: Since all aggregates in the array Dst span entire columns, the number of aggregates is $N$ (see Figure 3.3 for an example).
Source Array: An aggregate is trivial iff it has a size of 1 (as there are no empty entries). Here we will establish that array Src has aggregates of size 1 (trivial), 2 and 3. Each non-trivial aggregate will be associated with an $\alpha_j$ for some $j$.
Lemma 3.9 For each $0 \le j < 2^{d-1}$, there is a non-trivial aggregate containing $\alpha_j$.
Proof: If $j = 0$, then $Dim(0, 0) = \alpha_0$ and $Dim(0, 1) = \beta_1$. By Lemma 3.7, $Src(0, 0) = Src(0, 1)$, forming a 2-element aggregate. If $j = 2^{d-1} - 1$, then for $i = (2^{d-1} - 1) \bmod d$, $Dim(i, j) = \alpha_j$ and $Dim(i, j - 1) = \gamma_{j-1}$. Again by Lemma 3.7, $Src(i, j) = Src(i, j - 1)$, forming another 2-element aggregate. For any $0 < j < 2^{d-1} - 1$ and $i \equiv j \pmod d$, we have $Dim(i, j - 1) = \gamma_{j-1}$, $Dim(i, j) = \alpha_j$ and $Dim(i, j + 1) = \beta_{j+1}$. Once again Lemma 3.7 ensures that $Src(i, j - 1) = Src(i, j) = Src(i, j + 1)$, forming a 3-element aggregate.
Theorem 3.10 For any $d \ge 2$ the array Src has $2^d(d - 2) + 4$ aggregates.
Proof: From the proof of Lemma 3.9, one half of array Src has two 2-element aggregates and $2^{d-1} - 2$ 3-element aggregates. The remaining $d2^{d-1} - (2 \cdot 2 + 3(2^{d-1} - 2)) = (d - 3)2^{d-1} + 2$ elements are trivial aggregates. Thus, the number of aggregates (in both halves of Src) is $2(2 + 2^{d-1} - 2 + (d - 3)2^{d-1} + 2) = 2((d - 2)2^{d-1} + 2) = (d - 2)2^d + 4$.
Remark: In Chapter 5 we show that this is the optimal number of aggregates.
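To make the construction of Sections 3.1 to 3.3 concrete, the following Python sketch builds the destination list, the array Dim of Equation (3.5) and the array Src, and counts the aggregates of Src as maximal runs of equal entries along rows. It is a minimal illustration under stated assumptions: $h$ is taken to be the binary-reflected Gray code, the "any unused dimension" entries are filled in increasing order (one arbitrary choice among those the text allows), and all function names are ours.

```python
def gray(j):
    return j ^ (j >> 1)

def rightmost_zero(j):
    pos = 0
    while j & 1:
        j >>= 1
        pos += 1
    return pos

def gamma(j, d):
    return d - 1 if j == (1 << (d - 1)) - 1 else 1 + rightmost_zero(j)

def beta(j, d):
    return d - 1 if j == 0 else gamma(j - 1, d)

def dense_mapping(d):
    """Return (dst_list, dim, src); dim and src are d x 2^d arrays (lists of rows)."""
    n, half = 1 << d, 1 << (d - 1)
    dst = [gray(2 * c if c < half else 2 * c + 1 - n) for c in range(n)]
    dim = [[0] * n for _ in range(d)]
    for c in range(n):
        j = c % half                                   # second half repeats the first
        fixed = {j % d: 0, (j - 1) % d: beta(j, d), (j + 1) % d: gamma(j, d)}
        unused = [b for b in range(d) if b not in fixed.values()]
        for i in range(d):
            dim[i][c] = fixed[i] if i in fixed else unused.pop(0)
    src = [[dst[c] ^ (1 << dim[i][c]) for c in range(n)] for i in range(d)]
    return dst, dim, src

def count_row_aggregates(arr):
    """Number of maximal runs of equal entries, summed over all rows."""
    return sum(1 + sum(a != b for a, b in zip(row, row[1:])) for row in arr)

for d in (3, 4):
    _, _, src = dense_mapping(d)
    print(d, count_row_aggregates(src), (d - 2) * 2 ** d + 4)
```

For $d = 3$ and $d = 4$ the counts are 12 and 36, matching $(d - 2)2^d + 4$. Only rows are scanned, because the entries within a column of Src are all different.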
Theorem 3.11 For any $d \ge 2$, a $d$-dimensional hypercube has the following mappings on a slab waveguide.
(a) With $2^d$ detectors and $(d - 2)2^d + 4$ tunable lasers.
(b) With $(d - 2)2^d + 4$ detectors and $2^d$ tunable lasers.
(c) With $2^d$ detectors and $(d - 2)2^d + 4$ lasers with SLMs.
(d) With $(d - 2)2^d + 4$ detectors and $2^d$ lasers with SLMs.
Chapter 4
Sparse Mapping
In the previous chapter we have shown that it is possible to map a $d$-dimensional hypercube onto a sawtooth slab waveguide by a dense mapping using $(d - 2)2^d + 4$ lasers and $N = 2^d$ detectors (or vice versa). In this chapter we will consider larger channel arrays, specifically $\frac{N}{2} \times N$ arrays. Since the number of edges in the hypercube is $dN$, not all entries of this channel array are used when $d \ge 3$. The unused channels are said to be "empty." We call such a mapping a sparse mapping. Note that a sparse mapping can also be a standard mapping; in fact, we will use a standard sparse mapping with one wavelength aggregate per column of the destination array. As in the case of dense mapping, we specify the mapping through arrays Dst and Dim. (Clearly these arrays are of size $2^{d-1} \times 2^d$, or $\frac{N}{2} \times N$.)
In the next two sections we specify the dimension and destination arrays. The number of aggregates for this mapping will be discussed in Section 4.4.
4.1 Construction of the Dimension Array, Dim
Here we define the dimension array, Dim, recursively. As before, we define just one $2^{d-1} \times 2^{d-1}$ half of the array. The other half has the same entries as the first half. Before we proceed, we define some additional terms.
Definition 4.1 For any $d \ge 0$, a reverse diagonal array $R$ is a $2^d \times 2^d$ array with rows numbered $0, 1, \ldots, 2^d - 1$ and columns numbered $0, 1, \ldots, 2^d - 1$. Entry $R(i, 2^d - 1 - i)$ has some non-empty value, for each $0 \le i < 2^d$. All other entries are "empty" (or without value).
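FIGURE 4.1: Examples of reverse diagonal arrays: (a) the $1 \times 1$ array $[1]$; (b) a $2 \times 2$ array with value 2 on the reverse diagonal; (c) a $4 \times 4$ array with value 3 on the reverse diagonal
FIGURE 4.2: Unit reverse diagonal array, $U_2$ (a $4 \times 4$ array with value 1 on the reverse diagonal)
Figure 4.1 shows examples of reverse diagonal arrays.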
Definition 4.2 For $d \ge 0$, a reverse diagonal array of size $2^d \times 2^d$ in which all non-empty entries have value 1 is called a unit reverse diagonal array and is denoted by $U_d$ (see Figure 4.2).
For any array $A = \begin{bmatrix} A_2 & A_1 \\ A_3 & A_4 \end{bmatrix}$ decomposed into 4 quadrants $A_1, A_2, A_3, A_4$, we will refer to the first, second, third and fourth quadrants as indicated by the subscripts (following the convention for angles from $0^\circ$ to $360^\circ$).
Coming back to the definition of the array Dim, we first note that the array is for a $d$-dimensional hypercube. We use $Dim_d$ to indicate this. Also, since our definition is for a half of the array, we use $Dim_{d,1}$ and $Dim_{d,2}$ to denote the two halves. As noted earlier, $Dim_{d,1} = Dim_{d,2}$. The entries of $Dim_d$ can be defined recursively as follows.
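$$Dim_{1,1} = [0]$$
$$Dim_{2,1} = \begin{bmatrix} Dim_{1,1} & 1U_0 \\ 1U_0 & Dim_{1,1} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
$$Dim_{3,1} = \begin{bmatrix} Dim_{2,1} & 2U_1 \\ 2U_1 & Dim_{2,1} \end{bmatrix} = \begin{bmatrix} 0 & 1 & & 2 \\ 1 & 0 & 2 & \\ & 2 & 0 & 1 \\ 2 & & 1 & 0 \end{bmatrix}$$
FIGURE 4.3: Recursive construction of $Dim_{3,1}$
Definition 4.3 $Dim_{1,1} = Dim_{1,2} = [0]$.
For all $d > 1$,
$$Dim_{d,1} = Dim_{d,2} = \begin{bmatrix} Dim_{d-1,1} & (d-1)U_{d-2} \\ (d-1)U_{d-2} & Dim_{d-1,1} \end{bmatrix}.$$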
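The recursion of Definition 4.3 translates directly into code. In the Python sketch below (function names are ours), an empty entry is represented by `None`, which is our own convention rather than the thesis's.

```python
def reverse_diagonal(m, value):
    """m x m array with `value` on the reverse diagonal and None (empty) elsewhere."""
    return [[value if c == m - 1 - r else None for c in range(m)] for r in range(m)]

def sparse_dim_half(d):
    """Dim_{d,1} of Definition 4.3 as a 2^(d-1) x 2^(d-1) list of rows."""
    if d == 1:
        return [[0]]
    prev = sparse_dim_half(d - 1)           # Dim_{d-1,1}
    m = len(prev)                           # m = 2^(d-2)
    diag = reverse_diagonal(m, d - 1)       # (d-1) U_{d-2}
    top = [prev[r] + diag[r] for r in range(m)]
    bottom = [diag[r] + prev[r] for r in range(m)]
    return top + bottom

for row in sparse_dim_half(3):
    print(row)
```

For $d = 3$ this yields the $4 \times 4$ array shown for $Dim_{3,1}$ above, and every column of $Dim_{d,1}$ contains each of the dimensions $0, 1, \ldots, d - 1$ exactly once.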
For example, Figure 4.3 shows the recursive construction of Dim
3;1
. Recall that we dened
three quantities 
j
; 
j
and 
j
for each column j of array Dim for dense mapping (Chapter 3).
In the case of sparse mapping also the recursive denition provided above can be structured
around the values of 
j
; 
j
and 
j
. Their positions, however, are dened in a larger array of
size 2
d 1
 2
d 1
. The following lemma places 
d
j
; 
d
j
and 
d
j
in Dim
d
.
Lemma 4.4 For any d  3 and any 0  j < 2
d 1
(i) Dim(j; j) = 
d
j
= 
d
j+2
d 1
(ii) Dim((j   1) mod (2
d 1
); j) = 
d
j
= 
d
j+2
d 1
(iii) Dim((j + 1) mod (2
d 1
); j) = 
d
j
= 
d
j+2
d 1
Proof: Since Dim
d;1
= Dim
d;2
, clearly 
d
j
= 
d
j+2
d 1
; 
d
j
= 
d
j+2
d 1
and 
d
j
= 
d
j+2
d 1
. So we
restrict our discussion to 0  j < 2
d 1
. We proceed by induction on d  3.
For the base case (d = 3) and from Figure 4.3 we nd that 
3
j
= 0 (for 0  j < 4),

3
0
= 2; 
3
0
= 
3
1
= 1; 
3
1
= 
3
2
= 2; 
3
2
= 
3
3
= 1; 
3
3
= 2. By comparing these values with
Dim
3;1
shown in Figure 4.3, it is easy to verify that the lemma holds for d = 3.
Assume the lemma to hold for any d  3, and consider
Dim
d+1;1
=
2
6
6
6
4
Dim
d;1
dU
d 1
dU
d 1
Dim
d;1
3
7
7
7
5
:
Proof of part (i): If 0  j < 2
d 1
, then Dim
d+1;1
(j; j) is a main diagonal term of the
rst Dim
d;1
in the recursive expression for Dim
d+1;1
. By the induction hypothesis
and Denition 3.3, this element is 
d
j
= 0 = 
d+1
j
. Similarly if 2
d 1
 j < 2
d
, then
Dim
d+1;1
(j; j) is the diagonal element of its second Dim
d;1
. Again, Dim
d+1;1
(j; j) =

d
j 2
d
 1
= 0 = 
d+1
j
. Therefore part (i) of the lemma holds.
Proof of part (iii): If 0  j < 2
d 1
  1, then Dim
d+1;1
((j + 1) mod (2
d
); j) is element
((j+1); j) of the rst Dim
d;1
in the recursive expression forDim
d+1;1
. By the induction
hypothesis, an element of Dim
d;1
is 
d
j
= 1 + R
0
(j) (see Denition 3.3), where R
0
(j)
is the position of the rightmost 0 in the binary representation of j. Observe that
since 0  j < 2
d 1
  1, R
0
(j) is the same, regardless of whether j is represented by
d bits or d + 1 bits. That is, the rightmost 0 of j is not in bit position d   1. Thus

d
j
= 1 + R
0
(j) = 
d+1
j
. Element Dim
d+1;1
(2
d 1
; 2
d 1
  1) is not in the above Dim
d;1
,
rather it is in the (0; 2
d 1
  1)
th
element of U
d
. From Denition 3.3, 
d
j
= d  1, when
j = 2
d 1
  1. Since Dim
d+1;1
(2
d 1
; 2
d 1
  1) is element U
d
(0; 2
d 1
  1) = d, we have
Dim
d+1;1
(2
d 1
; 2
d 1
  1) = d = 
d+1
2
d 1
 1
. Thus part (iii) holds when 0  j < 2
d 1
.
FIGURE 4.4: 4-dimensional sparse Dim array
If 2
d 1
 j < 2
d
  1, then Dim
d+1;1
((j + 1) mod (2
d
); j) = Dim
d+1;1
((j + 1); j) is the
element (j + 1   2
d 1
; j   2
d 1
) of the second Dim
d;1
in the recursive expression for
Dim
d+1;1
. By the induction hypothesis this element is 
d
j
= 1+R
0
(j) = 
d+1
j
(as argued
above). The proof for this part now follows along the same lines as the 0  j < 2
d 1
case. For j = 2
d
  1, Dim
d+1
((j + 1) mod 2
d
; j) = Dim
d+1
(0; 2
d
  1) = d. Thus part
(iii) of the lemma is established.
Proof of part (ii): We have shown that for all 0  j < 2
d
, 
d+1
j
= Dim
j+1;1
((j + 1) mod
(2
d 1
); j). Then for j > 0, 
d+1
j
= 
d+1
j 1
= Dim
d+1;1
(j mod 2
d 1
; j   1). Also for 
d+1
0
,
d = D
d
(2
d 1
  1; 0) = Dim
d+1;1
(2
d
  1; 0) = Dim
d+1;1
((0   1) mod (2
d
); 0). Thus in
general, 
d+1
j
= ((j   1) mod (2
d
); j).
At this point we have specified the array Dim. An element of $Dim_d$ is either a dimension from $\{0, 1, \cdots, d-1\}$ or it is empty. If $Dim_d(i,j)$ is empty, then the corresponding elements $Src_d(i,j)$ and $Dst_d(i,j)$ of the source and destination arrays are also empty. An example of a 4-dimensional array Dim is shown in Figure 4.4. The corresponding array Dst is in Figure 4.5. Both halves of the arrays have been illustrated.
FIGURE 4.5: Sparse destination array for d = 4
4.2 Construction of the Destination Array, Dst
Since this section does not use a recursive definition, we will return to the usual notation of Src, Dst and Dim, rather than $Src_d$, $Dst_d$ and $Dim_d$. For the sparse case, there are "empty" entries in each column of arrays Dst and Src that (as noted above) correspond to the empty entries of array Dim. For each column $j$ of array Dst, all non-empty entries have the same value, namely $\hat{h}(j)$, the $j$th element of the shuffled standard Hamiltonian path (Section 3.1).
The array Dst for the sparse mapping is similar to that of the dense mapping. Recall that it is sufficient to specify array Dst as the $N$-element list of nodes $\hat{h} = \langle \hat{h}(j) : 0 \le j < 2^d \rangle$. However, the size of this array for a sparse mapping is $2^{d-1} \times N$ (instead of the $d \times N$ size array in the case of dense mapping). An example of a sparse destination array is shown in Figure 4.5. As is evident from the figure, not all cells in the array are filled. Notice from Figure 4.4 and Figure 4.5 that empty cells coincide.
4.3 Construction of the Source Array, Src
With the destination and dimension arrays already defined, the construction of the source array is similar to that of the dense mapping. Again, empty entries of arrays Dim and Dst also correspond to empty entries of array Src. More specifically, for any $0 \le i < 2^{d-1}$ and $0 \le j < 2^d$,
\[ Src(i,j) = \begin{cases} \text{empty}, & \text{if } Dst(i,j) \text{ is empty} \\ Dst(i,j)|_{Dim(i,j)}, & \text{otherwise.} \end{cases} \]
An example of the array Src for a 4-dimensional hypercube is shown in Figure 4.6. Notice from the figure that each row of this array has two aggregates, one in each half. The larger number of rows for a sparse mapping allows room for just one processor to occupy an entire row of a half of the array Src. We formally prove this later in Section 4.4.
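The construction of one half of Src can be illustrated with a short Python sketch. It rests on two assumptions that are not restated in this section: the standard Hamiltonian path $h$ is taken to be the binary reflected Gray code, $h(n) = n \oplus \lfloor n/2 \rfloor$ (which is consistent with the bit equations used in the proof of Lemma 4.6), and the first half of the shuffled path is taken as $\hat{h}(j) = h(2j)$. All function names are hypothetical.

```python
def gray(n):
    """Assumed standard Hamiltonian path h: the binary reflected Gray code."""
    return n ^ (n >> 1)


def dim_half(d):
    """Dim_{d,1} built recursively (Definition 4.3); None marks an empty entry."""
    if d == 1:
        return [[0]]
    prev = dim_half(d - 1)
    n = len(prev)
    rev = [[d - 1 if r + c == n - 1 else None for c in range(n)] for r in range(n)]
    return [p + r for p, r in zip(prev, rev)] + [r + p for p, r in zip(prev, rev)]


def src_first_half(d):
    """Src(i, j) = Dst(i, j) with bit Dim(i, j) complemented, for the first half."""
    dim = dim_half(d)
    n = len(dim)                                # 2^(d-1) rows and columns
    dst_col = [gray(2 * j) for j in range(n)]   # assumed h^(j) = h(2j)
    return [[None if dim[i][j] is None else dst_col[j] ^ (1 << dim[i][j])
             for j in range(n)] for i in range(n)]


def distinct_values_per_row(src):
    """Set of non-empty values in each row; one value per row means one aggregate."""
    return [{e for e in row if e is not None} for row in src]


print(src_first_half(3))
print(distinct_values_per_row(src_first_half(4)))
```

Under these assumptions every row of the half holds a single processor index, which is the property proved in Section 4.4.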
FIGURE 4.6: A 4-dimensional Sparse Src
Consider the following example with a 7-bit string $A = 0110101$. With $S_1 = \{0, 1, 5\}$, $A|_{S_1} = 0\underline{0}101\underline{1}\underline{0}$ (the complemented bits are underlined). If $S_2 = \{2, 4, 5\}$, then $[A|_{S_1}]|_{S_2} = 0100010$. Also observe that $A|_{S_2} = 0000001$ and $[A|_{S_2}]|_{S_1} = 0100010 = [A|_{S_1}]|_{S_2}$. In fact, it is easy to see that if $S_1 \oplus S_2 = \{x : (x \in S_1 \text{ and } x \notin S_2) \text{ or } (x \in S_2 \text{ and } x \notin S_1)\}$, then
\[ [A|_{S_1}]|_{S_2} = [A|_{S_2}]|_{S_1} = A|_{S_1 \oplus S_2}. \]
The following lemma states this observation without proof.
Lemma 4.5 For any array $A$ and any sets $S_1, S_2$ of bit indices, $[A|_{S_1}]|_{S_2} = [A|_{S_2}]|_{S_1} = A|_{S_1 \oplus S_2}$, where $S_1 \oplus S_2$ is the set of indices that are either in $S_1$ or in $S_2$, but not in both.
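The 7-bit example and the observation of Lemma 4.5 can be checked with a small Python sketch. The helper name `complement_bits` is an assumption made for illustration; complementing a set of bit positions is implemented as an exclusive-or with a mask, and Python's `^` on sets gives the symmetric difference.

```python
def complement_bits(a, positions):
    """Return a with every bit position listed in `positions` complemented."""
    mask = 0
    for b in set(positions):
        mask |= 1 << b
    return a ^ mask


A = 0b0110101
S1 = {0, 1, 5}
S2 = {2, 4, 5}

one_then_two = complement_bits(complement_bits(A, S1), S2)
two_then_one = complement_bits(complement_bits(A, S2), S1)
both_at_once = complement_bits(A, S1 ^ S2)   # S1 ^ S2 is the symmetric difference

print(format(complement_bits(A, S1), "07b"))
print(format(one_then_two, "07b"), format(two_then_one, "07b"), format(both_at_once, "07b"))
```

The order of application does not matter because exclusive-or is commutative, and a bit position present in both sets is complemented twice and so returns to its original value.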
Lemma 4.6 For any $d \ge 2$, array $Src_d$ can be expressed recursively as follows.
\[ Src_{d,1} = \begin{bmatrix} Src_{d-1,1} & S_{11} \\ S_{13} & Src_{d-1,1}|_{\{d-1,d-2\}} \end{bmatrix} \quad \text{and} \quad Src_{d,2} = \begin{bmatrix} Src_{d-1,2} & S_{21} \\ S_{23} & Src_{d-1,2}|_{\{d-1,d-2\}} \end{bmatrix}, \]
where $S_{11}, S_{13}, S_{21}$ and $S_{23}$ are $2^{d-2} \times 2^{d-2}$ reverse diagonal arrays.
Proof: From the fact that
\[ Dim_{d,1} = Dim_{d,2} = \begin{bmatrix} Dim_{d-1,1} & (d-1)U_{d-2} \\ (d-1)U_{d-2} & Dim_{d-1,1} \end{bmatrix} \tag{4.1} \]
(see Definition 4.3) and because $Src_d(i,j) = Dst_d(i,j)|_{Dim_d(i,j)}$, for any $0 \le i < 2^{d-1}$ and $0 \le j < 2^d$ (see Section 4.3), the arrays $S_{11}, S_{13}, S_{21}$ and $S_{23}$ are established to be reverse diagonal.
Let
\[ Src_{d,1} = \begin{bmatrix} A & S_{11} \\ S_{13} & B \end{bmatrix}. \tag{4.2} \]
From Equation 4.1 and Lemma 3.1 we have, for all $0 \le i, j < 2^{d-2}$,
\[ A(i,j) = h_{d-1}(2j)|_{Dim_{d-1}(i,j)}. \tag{4.3} \]
To prove that $A(i,j) = Src_{d-1,1}(i,j)$ we need to show that
\[ A(i,j) = h_{d-1}(2j)|_{Dim_{d-1}(i,j)} = \hat{h}_{d-1}(j)|_{Dim_{d-1}(i,j)} = Src_{d-1,1}(i,j). \tag{4.4} \]
By Definition 3.1, for $0 \le j < 2^{d-2}$, $\hat{h}_{d-1}(j) = h_{d-1}(2j)$. This establishes Equation 4.4 and hence the second quadrant of $Src_{d,1}$ is indeed $Src_{d-1,1}$.
From Equation 4.2 we have, for all $0 \le i, j < 2^{d-2}$,
\[ B(i,j) = [2^{d-1} + h_{d-1}(2^{d-1} - 1 - 2j)]|_{Dim_{d-1,1}(i,j)}. \tag{4.5} \]
We need to show that
\[ B(i,j) = Src_{d-1,1}(i,j)|_{\{d-1,d-2\}} = [\hat{h}(j)|_{Dim_{d-1,1}(i,j)}]|_{\{d-1,d-2\}} = [\hat{h}(j)|_{\{d-1,d-2\}}]|_{Dim_{d-1,1}(i,j)}. \tag{4.6} \]
From Equations 4.5 and 4.6 we only need to show
\[ 2^{d-1} + h_{d-1}(2^{d-1} - 1 - 2j) = \hat{h}_{d-1}(j)|_{\{d-1,d-2\}}. \tag{4.7} \]
Note that $0 \le j < 2^{d-2}$, so $j$ is a $(d-2)$-bit number. Let $j = (\underbrace{j_{d-3} j_{d-4} \cdots j_1 j_0}_{d-2 \text{ bits}})_2$, and let $j'_k$ denote the complement of bit $j_k$. Then
\[ 2j = (\underbrace{j_{d-3} j_{d-4} \cdots j_1 j_0 0}_{d-1 \text{ bits}})_2. \tag{4.8} \]
Since $2^{d-1} - 1 = (\underbrace{11 \cdots 11}_{d-1 \text{ bits}})_2$ we have
\[ (2^{d-1} - 1) - 2j = (\underbrace{j'_{d-3} j'_{d-4} \cdots j'_1 j'_0 1}_{d-1 \text{ bits}})_2. \tag{4.9} \]
Let $h_{d-1}(2^{d-1} - 1 - 2j) = x = (x_{d-2} x_{d-3} \cdots x_1 x_0)_2$. Then by Lemma 2.4 and Equation 4.9, we have
\[ \begin{aligned} x_0 &= 1 \oplus j'_0 = j_0 \\ x_k &= j'_k \oplus j'_{k-1} = j_k \oplus j_{k-1}, \quad \text{for } 0 < k < d-2 \\ x_{d-2} &= j'_{d-3} \oplus 0 = j'_{d-3} \end{aligned} \tag{4.10} \]
Thus,
\[ 2^{d-1} + h_{d-1}(2^{d-1} - 1 - 2j) = 2^{d-1} + x = (\underbrace{1 x_{d-2} \cdots x_1 x_0}_{d \text{ bits}})_2, \tag{4.11} \]
where the bits of $x$ are given by Equation 4.10.
Returning to the right hand side of Equation 4.7, let $\hat{h}_{d-1}(j) = h_{d-1}(2j) = y = (\underbrace{y_{d-1} y_{d-2} \cdots y_1 y_0}_{d \text{ bits}})_2$. Then from Equation 4.8 and Lemma 2.4 we have the following.
\[ \begin{aligned} y_0 &= 0 \oplus j_0 = j_0 \\ y_k &= j_k \oplus j_{k-1}, \quad \text{for } 0 < k < d-2 \\ y_{d-2} &= j_{d-3} \oplus 0 = j_{d-3} \\ y_{d-1} &= 0 \oplus 0 = 0 \end{aligned} \tag{4.12} \]
From Equations 4.10, 4.11 and 4.12, we see that $2^{d-1} + h_{d-1}(2^{d-1} - 1 - 2j)$ and $\hat{h}_{d-1}(j)$ differ only in bits $d-1$ and $d-2$. Thus Equation 4.7, and hence Equation 4.6, holds. This establishes the recursive structure for $Src_{d,1}$.
The proof for $Src_{d,2}$ is similar and is omitted for brevity.
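The recursive structure of $Src_{d,1}$ can also be checked numerically for small $d$. The sketch below is a sanity check rather than a proof, and it assumes (as the bit equations above suggest) that $h$ is the binary reflected Gray code and that $\hat{h}(j) = h(2j)$ on the first half. The function names are hypothetical.

```python
def gray(n):
    """Assumed standard Hamiltonian path h: the binary reflected Gray code."""
    return n ^ (n >> 1)


def dim_half(d):
    """Dim_{d,1} per Definition 4.3; None marks an empty entry."""
    if d == 1:
        return [[0]]
    prev = dim_half(d - 1)
    n = len(prev)
    rev = [[d - 1 if r + c == n - 1 else None for c in range(n)] for r in range(n)]
    return [p + r for p, r in zip(prev, rev)] + [r + p for p, r in zip(prev, rev)]


def src_first_half(d):
    """First half of Src: entry (i, j) is h(2j) with bit Dim(i, j) complemented."""
    dim = dim_half(d)
    n = len(dim)
    return [[None if dim[i][j] is None else gray(2 * j) ^ (1 << dim[i][j])
             for j in range(n)] for i in range(n)]


def quadrants_match(d):
    """Check the second and fourth quadrants of Src_{d,1} against Lemma 4.6."""
    big, small = src_first_half(d), src_first_half(d - 1)
    n = len(small)
    mask = (1 << (d - 1)) | (1 << (d - 2))      # complement bits d-1 and d-2
    for i in range(n):
        for j in range(n):
            e = small[i][j]
            flipped = None if e is None else e ^ mask
            if big[i][j] != e or big[n + i][n + j] != flipped:
                return False
    return True


print([quadrants_match(d) for d in range(2, 7)])
```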
4.4 Number of Aggregates
In the case of a sparse mapping, both the destination array and the source array have $N$ aggregates each. The fact that the destination array has $N$ aggregates follows from the fact that the mapping is a standard mapping. For the source array, each half of the $2^{d-1} \times N$ array has $2^{d-1}$ aggregates. The array Src therefore has $N$ aggregates. We prove this now.
Lemma 4.7 For any $d \ge 1$ and any $0 \le i, j, k < 2^{d-1}$, if $Src_d(i,j)$ and $Src_d(i,k)$ are non-empty, then $Src_d(i,j) = Src_d(i,k)$.
Proof: We proceed by induction on $d$. The lemma holds for the case $d = 1$ as there is only one entry in $Src_{1,1}$ and $Src_{1,2}$.
Assume the lemma to hold for any $d \ge 1$ and consider $Src_{d+1}$. By Lemma 4.6
\[ Src_{d+1,1} = \begin{bmatrix} Src_{d,1} & S_{11} \\ S_{13} & Src_{d,1}|_{\{d,d-1\}} \end{bmatrix}. \]
Clearly the induction hypothesis applies to the Src
d;1
in the second quadrant of Src
d+1;1
.
The induction hypothesis also applies to the Src
d;1
j
d;d 1
of the fourth quadrant as all elements
of Src
d;1
are altered in an identical manner. The strategy we use is as follows. Pick any
row i. If this row traverses Src
d;1
(or Src
d;1
j
d;d 1
), then consider a particular element  of
Src
d
(or Src
d;1
j
d;d 1
). All other elements in row i of Src
d;1
(or Src
d;1
j
d;d 1
) are identical to
, by the induction hypothesis. Now we note that S
11
and S
13
are reverse diagonal arrays
(Lemma 4.6) and contain only one non-empty element in row i. All we need to establish is
that this non-empty element equals .
Depending on which quadrants row i traverses, we consider two cases.
Case 0  i < 2
d 1
: Here row i traverses the Src
d;1
in the second quadrant of Src
d+1;1
.
Consider
 = Src
d;1
(i; i) =
^
h
d
(i)j
D
d
(i;i)
= h
d
(2i)j

i
= h
d
(2i)j
0
: (4.13)
The only non-empty element in row $i$ of $S_{11}$ is in column $2^{d-1} - 1 - i$ of $S_{11}$ or in column $2^{d-1} + 2^{d-1} - 1 - i = 2^d - 1 - i$ of $Src_{d+1,1}$. This element has the value
\[ \hat{h}_{d+1}(2^d - 1 - i)|_{Dim_{d+1}(i,\, 2^d - 1 - i)} = h_{d+1}(2(2^d - 1 - i))|_d. \tag{4.14} \]
The last part of this equation follows from the fact that $Dim_{d+1}(i, 2^d - 1 - i) = dU_{d-1}(i, 2^d - 1 - i) = d$. With $0 \le i < 2^{d-1}$, let
\[ i = (\underbrace{i_{d-2} i_{d-3} \cdots i_1 i_0}_{d-1 \text{ bits}})_2. \tag{4.15} \]
Then $2^d - 1 - i = (\underbrace{11 \cdots 11}_{d \text{ bits}})_2 - (0\underbrace{i_{d-2} i_{d-3} \cdots i_1 i_0}_{d-1 \text{ bits}})_2 = \underbrace{1 i'_{d-2} i'_{d-3} \cdots i'_1 i'_0}_{d \text{ bits}}$.
So, $2(2^d - 1 - i) = \underbrace{1 i'_{d-2} i'_{d-3} \cdots i'_1 i'_0 0}_{d+1 \text{ bits}}$.
Let $h_{d+1}(2(2^d - 1 - i)) = x = x_d x_{d-1} \cdots x_1 x_0$, where
\[ \begin{aligned} x_0 &= i'_0 \oplus 0 = i'_0 \\ x_d &= 1 \oplus 0 = 1 \\ x_{d-1} &= 1 \oplus i'_{d-2} = i_{d-2} \\ x_\ell &= i'_\ell \oplus i'_{\ell-1} = i_\ell \oplus i_{\ell-1}, \quad \text{for } 0 < \ell < d-1. \end{aligned} \tag{4.16} \]
Then the only non-empty element in row $i$ of $S_{11}$ has a value of
\[ h_{d+1}(2(2^d - 1 - i))|_d = 0 x_{d-1} x_{d-2} \cdots x_0, \tag{4.17} \]
where $x_\ell$ is given by Equation 4.16.
As noted earlier (Equation 4.13), all elements in row i of the Src
d;1
, in the the second
quadrant of Src
d+1;1
have value  = h
d
(2i)j
0
.
Let  = 
d

d 1
   
1

0
. Then h
d
(2i) = j
i
= 
d

d 1
   
1

0
0
. From Equation 4.15
2i = i
d 2
i
d 3
   i
1
i
0
0, and we have

0
0
= 0 i
0
= i
0
or 
0
= i
0
0

`
= i
`
 i
` 1
for 0 < ` < d  1

d 1
= i
d 2
 0 = i
d 2

d
= 0 0 = 0:
(4.18)
From Equations 4.17 and 4.18 we find that the only non-empty element of S
11
has value
, as required.
Case $2^{d-1} \le i < 2^d$: The only difference between this and the previous case is that bit $d-1$ of $i$ is a 1 (rather than 0 in the previous case). The argument remains the same, however. We, therefore, skip the details.
Overall the proof for $Src_{d+1,2}$ follows along the same lines as that for $Src_{d+1,1}$.
Therefore we have the following main result of this section.
Theorem 4.8 For any $d \ge 1$, a $d$-dimensional hypercube can be mapped on a slab waveguide using $2^d$ lasers and $2^d$ detectors.
Remark: We prove this mapping optimal in Chapter 5.
4.5 Relation to the Extended Hypercube
The sparse construction so far has several empty elements in the arrays Src, Dst and Dim. That is, not all channels of the channel array are used. If these empty elements of the source and destination were filled with values already present in the aggregates (so that the aggregates are not broken), then we get a supergraph of the hypercube. Figure 4.7 shows this for $H_3$. In this section we formally identify this supergraph of the hypercube $H_d$ and call it the extended hypercube $Q_d$. The extended hypercube represents the largest graph that the sparse mapping will support.
Definition 4.9 For any $d \ge 1$, let $Q_d = (V, E)$ be the extended hypercube, where $V = \{0, 1, \cdots, 2^d - 1\}$. An edge $(p, q) \in E$ iff $p$ and $q$ differ in an odd number of bits. $\Box$
Clearly, a regular hypercube has edges corresponding to exactly 1 differing bit (rather than an odd number of differing bits). An example of a 5-dimensional extended hypercube is illustrated in Figure 4.8. In this figure only extra edges out of node 0 have been shown.
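The edge rule of Definition 4.9 translates directly into a test on the exclusive-or of two node labels. The following Python sketch is illustrative only; the names `is_extended_edge` and `extended_neighbors` are assumptions, and nodes are taken to be the integers $0, \ldots, 2^d - 1$.

```python
def is_extended_edge(p, q):
    """True iff p and q differ in an odd number of bit positions."""
    return bin(p ^ q).count("1") % 2 == 1


def extended_neighbors(p, d):
    """All nodes of Q_d adjacent to p."""
    return [q for q in range(2 ** d) if is_extended_edge(p, q)]


def hypercube_neighbors(p, d):
    """Neighbors of p in the regular hypercube H_d (exactly one differing bit)."""
    return [p ^ (1 << b) for b in range(d)]


print(extended_neighbors(0, 3))
print(len(extended_neighbors(0, 5)))
```

Every regular hypercube edge passes the test, since a single differing bit is an odd number of differing bits; this is why $Q_d$ is a supergraph of $H_d$.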
FIGURE 4.7: Mapping for an extended hypercube $Q_3$ (panels: Dst Array, Src Array)
Lemma 4.10 For any $m \ge 0$, the elements $h_d(j)$ and $h_d((j+m) \bmod 2^d)$ of the standard Hamiltonian path differ in an even number of bits if $m$ is even, and in an odd number of bits if $m$ is odd.
Proof outline: Since $h_d$ represents a Hamiltonian path, for $0 \le j < 2^d$, the pair $\langle h_d(j), h_d((j+1) \bmod 2^d) \rangle$ represents an edge of the hypercube $H_d$. So $h_d(j)$ and $h_d((j+1) \bmod 2^d)$ differ in exactly 1 bit. Therefore, $h_d(j)$ and $h_d((j+2) \bmod 2^d)$ differ in 2 bits. Let $a_0, a_1, a_2, \cdots, a_k$ represent the string of nodes that form a Hamiltonian cycle for any hypercube. Observe that $a_0$ and $a_1$ have to differ in 1 bit, $a_0$ and $a_2$ differ in 0 or 2 bits, $a_0$ and $a_3$ differ in either 1 or 3 bits and so on.
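The parity claim of Lemma 4.10 can be checked exhaustively for small $d$. The sketch below assumes that the standard Hamiltonian path is the binary reflected Gray code (an assumption of this example, consistent with the bit equations of Section 4.3); the function names are hypothetical.

```python
def gray(n):
    """Assumed standard Hamiltonian path h: the binary reflected Gray code."""
    return n ^ (n >> 1)


def differing_bits(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def parity_claim_holds(d):
    """Check that h(j) and h((j + m) mod 2^d) differ in a number of bits with the parity of m."""
    size = 2 ** d
    return all(
        differing_bits(gray(j), gray((j + m) % size)) % 2 == m % 2
        for j in range(size)
        for m in range(size)
    )


print([parity_claim_holds(d) for d in range(1, 8)])
```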
Lemma 4.11 For $d \ge 1$ let $\hat{h}_d = \hat{h}^a_d \circ \hat{h}^b_d$. For $x \in \{a, b\}$ the binary representations of any pair of distinct elements in $\hat{h}^x_d$ differ in an even number of bits.
Proof: The elements of $\hat{h}^a_d$ are from the set $S_a = \{h_d(2i) : 0 \le i < 2^{d-1}\}$ and the elements of $\hat{h}^b_d$ are from $S_b = \{h_d(2i+1) : 0 \le i < 2^{d-1}\}$. Thus all elements of $S_a$ (or $S_b$) have indices in $h_d$ that differ by an even number. That is, if $h_d(j), h_d(k) \in S_a$ (or $S_b$) then $(j - k) \bmod 2 = 0$. By Lemma 4.10, $h_d(j)$ and $h_d(k)$ differ in an even number of bits.
FIGURE 4.8: A 5-dimensional extended hypercube where the regular edges are shown dotted and the extended hypercube edges are shown dashed for one node
In the following we denote the source, destination and dimension array of the extended
hypercube by Src
d
;Dst
d
and Dim
d
to indicate that they have no empty elements.
Lemma 4.12 For any d  1; any 0  i < 2
d 1
and any 0  j < 2
d
the binary representation
of Src
d
(i; j) and Dst
d
(i; j) dier in an odd number of bits.
Proof: We consider one half at a time. Let 0  i; j < 2
d 1
. We know that Dim
d
(i; j) =

i
= 0, so Src
d
(i; i) is non-empty and Src
d
(i; i) = h(2i)j
0
=
^
h(i)j
0
.
Clearly Src
d
(i; i) = Src
d
(i; j) (for all 0  j < 2
d 1
) diers from Dst
d
(i; i) =
^
h
d
(i) in an
odd number of bits. By Lemma 4.10, Src
d
(i; j) diers from every
^
h
d
(j) = Dst
d
(i; j) (where
0  j < 2
d 1
) in an odd number of bits.
The following are well known results.
Lemma 4.13 For any $d, k \ge 1$, $\binom{d+1}{k} = \binom{d}{k} + \binom{d}{k-1}$.
Lemma 4.14 [Binomial Theorem] For any integer $n \ge 0$ and reals $a, b$,
\[ (a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}. \]
With $a = b = 1$ in Lemma 4.14, one obtains the following.
Corollary 4.15 For any $n \ge 0$, $\sum_{i=0}^{n} \binom{n}{i} = 2^n$.
Lemma 4.16 Each node of the extended hypercube $Q_d$ (for any $d \ge 1$) has $\sum_{i=0}^{\lfloor \frac{d-1}{2} \rfloor} \binom{d}{2i+1} = 2^{d-1}$ neighbors.
Proof: We proceed by induction on $d \ge 1$. For $d = 1$ we have $\sum_{i=0}^{0} \binom{d}{2i+1} = \binom{1}{1} = 1 = 2^{1-1}$.
Assume the lemma to hold for any $d \ge 1$ and consider two cases for $\sum_{i=0}^{\lfloor d/2 \rfloor} \binom{d+1}{2i+1}$.
Case $d$ is odd: From Lemma 4.13 we have
\[ \sum_{i=0}^{\lfloor d/2 \rfloor} \binom{d+1}{2i+1} = \sum_{i=0}^{\frac{d-1}{2}} \left[ \binom{d}{2i+1} + \binom{d}{2i} \right]. \tag{4.19} \]
Notice that $2i + 1 \le 2 \cdot \frac{d-1}{2} + 1 = d$. So $\binom{d}{2i}$ is a valid binomial coefficient. Also, since the set $\{2i, 2i+1 : 0 \le i \le \frac{d-1}{2}\} = \{0, 1, \cdots, d\}$, we have from Equation 4.19 and from Lemma 4.14
\[ \sum_{i=0}^{\lfloor d/2 \rfloor} \binom{d+1}{2i+1} = \sum_{i=0}^{d} \binom{d}{i} = 2^d. \]
Case $d$ is even: Again from Lemma 4.13 we have
\[ \begin{aligned} \sum_{i=0}^{\lfloor d/2 \rfloor} \binom{d+1}{2i+1} &= \sum_{i=0}^{d/2} \binom{d+1}{2i+1} \\ &= \binom{d+1}{d+1} + \sum_{i=0}^{\frac{d}{2}-1} \binom{d+1}{2i+1} \\ &= \binom{d}{d} + \sum_{i=0}^{\frac{d}{2}-1} \left[ \binom{d}{2i+1} + \binom{d}{2i} \right] \\ &= \sum_{i=0}^{d} \binom{d}{i} = 2^d. \end{aligned} \]
The last but one step follows from the fact that $\binom{d+1}{d+1} = \binom{d}{d} = 1$ and $\{2i, 2i+1 : 0 \le i < \frac{d}{2}\} = \{0, 1, \cdots, d-1\}$, and the last step is from Corollary 4.15.
In any case, $\sum_{i=0}^{\lfloor d/2 \rfloor} \binom{d+1}{2i+1} = 2^d$ as required.
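The identity of Lemma 4.16 is easy to confirm numerically with Python's standard `math.comb`. The function name `odd_binomial_sum` is a hypothetical helper introduced for this check only.

```python
from math import comb


def odd_binomial_sum(d):
    """Sum of C(d, 2i+1) for i = 0 .. floor((d-1)/2): the ways to pick an odd number of the d bits."""
    return sum(comb(d, 2 * i + 1) for i in range((d - 1) // 2 + 1))


for d in range(1, 9):
    print(d, odd_binomial_sum(d), 2 ** (d - 1))
```

Each term counts the nodes that differ from a given node in exactly $2i+1$ bit positions, so the sum is the degree of a node of $Q_d$.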
Theorem 4.17 For any $d \ge 1$, $Src_d$ and $Dst_d$ as defined above map the extended hypercube $Q_d$ on a slab waveguide using $2^d$ lasers and $2^d$ detectors.
Proof: Lemma 4.12 shows that every edge mapped between $Src_d(i,j)$ and $Dst_d(i,j)$ is an edge of $Q_d$. Also, for any $0 \le i, i' < 2^{d-1}$ and $0 \le j, j' < 2^d$ such that $i \ne i'$ or $j \ne j'$, the doublets $\langle Src_d(i,j), Dst_d(i,j) \rangle$ and $\langle Src_d(i',j'), Dst_d(i',j') \rangle$ are different. If $Src_d(i,j) = Src_d(i',j')$, then $i = i'$. In addition, if $Dst_d(i,j) = Dst_d(i',j')$, then $j = j'$. So the two doublets can be identical iff $i = i'$ and $j = j'$. Therefore each entry of the channel array represents a different edge of $Q_d$. Finally, by Lemma 4.16, $Q_d$ has $2^d \times 2^{d-1}$ edges, which is also the number of elements in the channel array.
Therefore, every edge of $Q_d$ is mapped. Clearly the mapping has $2^d$ source aggregates and $2^d$ destination aggregates.
We prove in the next chapter that the sparse mapping has an optimal number of aggregates.
Chapter 5
Lower Bounds
We have so far derived upper bounds on the number of aggregates (lasers and detectors)
in the dense and sparse mapping cases. How good are these mappings? In this chapter we
derive matching lower bounds for these mappings and show that they are optimal (in terms
of the number of aggregates).
5.1 A General Lower Bound
In this section we derive a general (trivial) lower bound that applies for any weak topology
with an underlying strongly connected graph.
Lemma 5.1 Every strongly connected weak topology requires at least 1 laser and 1 detector
per node.
Proof: Because the graph is strongly connected, each node has an indegree and an outdegree $\ge 1$. This implies that it needs at least one laser and one detector.
Recall that the sparse mapping uses $N$ lasers and $N$ detectors (one each, per node) to map an extended hypercube on to an optical slab waveguide. Since an (extended) hypercube is strongly connected, Lemma 5.1 establishes our mapping to be optimal.
Theorem 5.2 The sparse mappings of the hypercube and extended hypercube (Theorems 4.8 and 4.17) have an optimal number of source and destination aggregates.
5.2 Dense Mapping Lower Bound
Since the dense mapping is a standard mapping, the array Dst has N aggregates, and hence
N detectors. This is required (and optimal) for any standard mapping. Here we derive a
lower bound for the number of source aggregates in any standard dense mapping.
A simple directed graph is one in which for any pair of nodes $u, v$, there is at most one directed edge $\langle u, v \rangle$. That is, a simple graph does not allow multiple edges from $u$ to $v$.
Lemma 5.3 Consider any weak topology based on a simple directed graph that is mapped to a slab using a standard mapping in which the array Dst has one column aggregate per column. If no edge of the graph is mapped to more than one channel, the array Src cannot have any non-trivial column aggregates.
Proof: Consider any column $j$ of Dst. Let this column have index $u$ in all entries. If the corresponding column in Src has a non-trivial aggregate in rows $i$ and $i'$ (that is, $Src(i,j) = Src(i',j) = v$), then edge $\langle u, v \rangle$ is mapped to channels $(i,j)$ and $(i',j)$, contradicting the assumption that no edge is mapped to more than one channel. (Figure 5.1 illustrates this.)
FIGURE 5.1: An illustration of the proof of Lemma 5.3
Since the dense mapping of Chapter 3 satisfies all the conditions of Lemma 5.3 we have the following result.
Corollary 5.4 The standard dense mapping of a hypercube has no column aggregates in
Src.
Lemma 5.3 shows that no column of Src can have a non-trivial aggregate. We now limit
the number of non-trivial aggregates in Src.
Lemma 5.5 In a standard mapping of a strongly connected weak topology, for any two columns $j, j'$ of Dst, the indices corresponding to column aggregates are distinct.
Proof: Let columns $j$ and $j'$ have aggregates with indices $u$ and $v$ respectively. Since Dst has $N$ columns, where $N$ is the number of nodes, and each node has an incoming edge (for a strongly connected topology), each column of Dst must be associated with a distinct node (aggregate index).
Before we proceed, recall that the notation $a|_b$ is used to denote a binary number $a$ with bit $b$ complemented.
Lemma 5.6 In a standard dense mapping, two adjacent columns of array Src cannot have more than two non-trivial aggregates.
Proof: Suppose columns $j$ and $j+1$ have three non-trivial row aggregates in rows $i_1, i_2$ and $i_3$ (see Figure 5.2). Note that these aggregates may extend in either direction beyond columns $j$ and $j+1$. Also note that since a column in Src cannot have a column aggregate, the values $a, b$ and $c$ in Figure 5.2 are distinct. Let $u$ and $v$ be the indices in columns $j$ and $j+1$ respectively of Dst (as shown in Figure 5.2(b)). Thus $\langle a,u \rangle, \langle b,u \rangle, \langle c,u \rangle, \langle a,v \rangle, \langle b,v \rangle$ and $\langle c,v \rangle$ are hypercube edges. Since each hypercube edge is between processor pairs whose binary representations differ in exactly 1 position, let $u = a|_k = b|_\ell = c|_m$ and $v = a|_{k'} = b|_{\ell'} = c|_{m'}$. Since $u \ne v$, we have $k \ne k'$, $\ell \ne \ell'$ and $m \ne m'$. Consider $a$ and $b$ first. Since $a \ne b$, we have $k \ne \ell$ and $k' \ne \ell'$, and
\[ u = a|_k = v|_{k'}|_k = v|_{\{k,k'\}} = b|_{\ell'}|_{\{k,k'\}} = b|_{\{\ell'\} \oplus \{k,k'\}} = u|_\ell|_{\{\ell'\} \oplus \{k,k'\}} = u|_{\{\ell,\ell'\} \oplus \{k,k'\}}. \]
In summary, $u = u|_{\{\ell,\ell'\} \oplus \{k,k'\}}$. That is, $\{\ell,\ell'\} \oplus \{k,k'\} = \emptyset$. Since $\ell \ne k$ and $\ell' \ne k'$, $\ell = k'$ and $\ell' = k$. Without loss of generality let $\ell = k' = 0$ and $k = \ell' = 1$, so that $a, b, u$ and $v$ have the form shown in Figure 5.3. Now we bring in $c$. Since $u|_m = v|_{m'} = c$, the only way is to have $\{m, m'\} = \{k, k'\}$ (see Figure 5.3), which would make $c = a$ or $c = b$, providing the necessary contradiction.
FIGURE 5.2: An illustration of the proof of Lemma 5.6 (panels: (a) Src, (b) Dst)
FIGURE 5.3: Illustrating the structure for numbers $u, a, v, b$
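The core of this argument is that two distinct hypercube nodes $u$ and $v$ cannot share three common neighbors $a, b, c$. A brute-force Python check over small hypercubes illustrates the fact; the function names are hypothetical, and the check is a sanity test rather than a substitute for the proof.

```python
from itertools import combinations


def hypercube_neighbors(u, d):
    """Set of nodes that differ from u in exactly one of the d bit positions."""
    return {u ^ (1 << b) for b in range(d)}


def max_common_neighbors(d):
    """Largest number of neighbors shared by any two distinct nodes of H_d."""
    return max(
        len(hypercube_neighbors(u, d) & hypercube_neighbors(v, d))
        for u, v in combinations(range(2 ** d), 2)
    )


print([max_common_neighbors(d) for d in range(1, 7)])
```

Two nodes share neighbors only when they differ in exactly two bit positions, and in that case the shared neighbors are the two nodes obtained by complementing one of those positions.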
Consider any standard dense mapping of a $d$-dimensional hypercube resulting in an array $Src_d$. Let $Src[0, j]$ denote the part of the array including the first $j+1$ columns $0, 1, \cdots, j$, and in general let $Src[j, j']$ denote the portion of Src from column $j$ to column $j'$.
Definition 5.7 Let $A$ be an aggregate of $Src_d$. For any $0 \le j < 2^d$, aggregate $A$ is incomplete in $Src_d[0, j]$ iff some portion of $A$ lies outside $Src_d[0, j]$. Otherwise it is called a complete aggregate of $Src_d[0, j]$. $\Box$
Figure 5.4 shows examples of a complete and an incomplete aggregate.
FIGURE 5.4: Examples of a complete aggregate $A_1$ and an incomplete aggregate $A_2$ of $Src_d[0, j]$.
Lemma 5.8 For any $d \ge 1$ and $0 \le j < 2^d$, $Src_d[0, j]$ has at most two incomplete aggregates.
Proof: Every incomplete aggregate of $Src_d[0, j]$ must traverse column $j$. By Definition 5.7, if $j = 2^d - 1$, then there are no incomplete aggregates. Let $j < 2^d - 1$ and consider columns $j$ and $j+1$. All incomplete aggregates of $Src_d[0, j]$ traverse columns $j$ and $j+1$. By Lemma 5.6 there can be at most two such aggregates.
We now develop an important result which lays the foundation for the main result of this section.
Lemma 5.9 For any $0 \le j < 2^d$, $Src_d[0, j]$ has at least $(j+1)(d-2) + 2$ aggregates.
Proof: We proceed by induction on $j \ge 0$. Note first that all aggregates are row aggregates (Lemma 5.3). Then for the first column ($j = 0$) each element is part of a different aggregate, so there are $d = (0+1)(d-2) + 2$ aggregates. Assume the lemma to hold for any $j \ge 0$ and consider $Src_d[0, j+1]$ (see Figure 5.5). Let $Src_d[0, j]$ have $X$ complete aggregates and $Y$ incomplete aggregates. Of these $Y$ incomplete aggregates, let $y$ become complete in column $j+1$. Let column $j+1$ start $y'$ new incomplete aggregates. Let $Z$ be the number of trivial aggregates in column $j+1$.
FIGURE 5.5: An illustration of the proof of Lemma 5.9
The number of incomplete aggregates in $Src_d[0, j+1]$ is $Y - y + y'$, which is at most 2 by Lemma 5.8. The number of aggregates in $Src_d[0, j]$ is
\[ X + Y \ge (j+1)(d-2) + 2 \tag{5.1} \]
by the induction hypothesis. The number of trivial aggregates on column $j+1$ is
\[ Z = d - (Y + y') \tag{5.2} \]
(see Figure 5.5). The total number of aggregates in $Src_d[0, j+1]$ is
\[ \begin{aligned} (\underbrace{X + Z + y}_{\text{complete}}) + (\underbrace{Y - y + y'}_{\text{incomplete}}) &= X + d && \text{from Equation 5.2} \\ &\ge (j+1)(d-2) + 2 - Y + d && \text{from Equation 5.1} \\ &\ge (j+1)(d-2) + 2 - 2 + d && \text{as } Y \le 2 \\ &= (j+2)(d-2) + 2. \end{aligned} \]
This completes the proof.
If Lemma 5.9 is used over the entire array $Src_d$, we have a lower bound of $(d-2)2^d + 2$ aggregates. Our method (Theorem 3.10) uses $(d-2)2^d + 4$ aggregates. In the remainder of this section we further tighten the lower bound of Lemma 5.9 to establish that the method of Theorem 3.10 is optimal.
Lemma 5.10 For any standard, dense mapping of a $d$-dimensional hypercube, $Src_d$ has at least a pair of adjacent columns $j$ and $j+1$ such that no single aggregate traverses both $j$ and $j+1$.
Proof: Suppose the lemma was not true. Let $Src_d$ have $k$ row aggregates (see Figure 5.6) $A_1, A_2, \cdots, A_k$ such that each $A_i$, $1 \le i \le k$, starts at column $j_i$ and ends at column $j'_i$. These aggregates satisfy $j_i \le j'_{i-1}$ (for $1 < i \le k$); that is, $A_i$ starts at a column number no higher than the one in which $A_{i-1}$ ends. This assures that $A_i$ and $A_{i-1}$ overlap in at least one column. Let $A_i$ contain index $a_i$. Let column $j_i$ of $Dst_d$ have index $b_i$ (see Figure 5.6). Then we have the following path in the hypercube.
\[ b_1 \to a_1 \to b_2 \to a_2 \to \cdots \to b_k \to a_k. \]
FIGURE 5.6: An illustration for the proof of Lemma 5.10
Let $A = \{a_i : 1 \le i \le k\}$ and $B = \{b_i : 1 \le i \le k\}$. Clearly $A$ and $B$ are non-empty. We claim that $A \cap B = \emptyset$.
Let $a_i = b_{i'} = a \in A \cap B$. Without loss of generality, let $i < i'$; then we have the following cycle
\[ a = a_i \to b_{i+1} \to a_{i+1} \to \cdots \to b_{i'-1} \to a_{i'-1} \to b_{i'} = a. \]
Clearly this is a cycle of odd length. That is not possible on a hypercube (an even number of bit complements is needed to get back to the same node). Since $A$ and $B$ are disjoint and non-empty, not all elements of the hypercube can be included in $B$. Since each column of $Dst_d$ has a distinct index (Lemma 5.5), $B$ cannot cover the entire array $Dst_d$. That is, $A$ cannot cover the entire array $Src_d$.
We now derive the main result of this section.
Theorem 5.11 Every standard dense mapping of a $d$-dimensional hypercube on a slab requires $2^d$ destination aggregates and at least $(d-2)2^d + 4$ source aggregates.
Proof: The bound on the destination aggregates follows from the fact that it is a standard mapping.
By Lemma 5.10, there exist columns $j$ and $j+1$ of $Src_d$ such that no aggregates overlap over these columns. So Lemma 5.9 applies separately to both $Src_d[0, j]$ and $Src_d[(j+1), 2^d - 1]$. This gives a total of $[(j+1)(d-2) + 2] + [(2^d - (j+1))(d-2) + 2] = (d-2)2^d + 4$ aggregates.
Our method in Chapter 3 (Theorem 3.10) is therefore optimal in the number of aggregates.
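The arithmetic behind this bound can be sketched in Python: the bound of Lemma 5.9 is applied once to the first $j+1$ columns and once to the remaining $2^d - (j+1)$ columns, and the two results are added. The function names `prefix_bound` and `split_bound` are hypothetical.

```python
def prefix_bound(width, d):
    """Lemma 5.9 lower bound on aggregates in a block of `width` consecutive columns."""
    return width * (d - 2) + 2


def split_bound(j, d):
    """Lower bound when no aggregate traverses both column j and column j + 1."""
    left = prefix_bound(j + 1, d)               # columns 0 .. j
    right = prefix_bound(2 ** d - (j + 1), d)   # columns j + 1 .. 2^d - 1
    return left + right


d = 4
print({split_bound(j, d) for j in range(2 ** d - 1)})   # one value, independent of j
print((d - 2) * 2 ** d + 4)
```

The total does not depend on where the split falls, because the two widths always add up to $2^d$ and each block contributes its own additive 2.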
Chapter 6
Conclusion
It is widely acknowledged that optical interconnects are the best candidates to meet the com-
munication needs of future computing systems. This thesis deals with optical interconnects
among spatially proximate components. While a lot of research has addressed the techno-
logical improvements for such interconnects, very little research has gone into approaching
the design of optical interconnects in a manner that exploits knowledge of the computation,
as is done in this thesis.
We have developed a few methods to map a d-dimensional weak hypercube onto an
optical slab waveguide. A weak hypercube allows each node to communicate with at most
one neighbor at a time. Our approach uses this under-utilization of the topology's edges to
pack as many channels as possible into a single laser/detector, while reducing the number of
optical components used and consequently their cost.
We have also derived non-trivial lower bounds for any standard mapping of a d -dimensional
weak hypercube and shown that the mappings we have proposed match these lower bounds
and are, hence, optimal.
We have proposed results for two cases: (1) when the size of the channel array is $d \times 2^d$ (dense mapping) and (2) when the size of the channel array is $2^{d-1} \times 2^d$ (sparse mapping). For each of these cases we have defined the construction of the channel array in terms of three 2-dimensional arrays, namely, the source, destination and dimension arrays. We have also derived matching lower bounds for the number of aggregates for each of the mappings. The dense mapping uses $(d-2)2^d + 4$ lasers and $2^d$ detectors, while the sparse mapping uses $2^d$ lasers and $2^d$ detectors.
Though the sparse mapping has a clean recursive structure denition we have also shown
that it can be viewed as an extension of the dense mapping approach; i.e., the Hamiltonian
path for Dst array and 
j
; 
j
and 
j
 along the diagonal is the same for both. It is possible to generalize these two cases (sparse and dense) in a mapping on any $R \times 2^d$ channel array for $d \le R < 2^{d-1}$.
Other directions include the mapping of more complex higher dimensional topologies
and ways to pack more elements per aggregate. The cost of the mappings that we consider
depends only on the number of aggregates per mapping. Analyzing the placement, shape
and size of these aggregates may be possible.
Our results can also help in designing new optical interconnect models.
Bibliography
[1] \Light and Optics," ACEPT, Department of Physics and Astronomy, Arizona State
University, 1999. http : ==acept:asu:edu=P iN=rdg=color:color:shtml
[2] G. P. Agarwal, \Fiber-Optic Communication Systems," John Wiley, June 2002.
[3] S. Berdague and P. Facq, \Mode Division Multiplexing in Optical Fibers," Applied
Optics, vol. 21, June 1, 1982, pp. 1950-1955.
[4] M. Chateauneuf, A. G. Kirk, D. V. Plant, T. Yamamoto, and J. D. Ahearn, \512-
Channel Vertical-Cavity Surface-Emitting Laser Based Free-Space Optical Link," Ap-
plied Optics, vol.41, (2002), pp. 5552-5561.
[5] G. Chen, \A Tutorial on Interconnection Networks," Nanjing University, 1999.
[6] I. Chlamtac and A. Ganz, and G. Karmi, \Lightnets: Topologies for High Speed Optical
Networks," IEEE/OSA Journal of Lightwave Technology, vol. 11, May/June 1993.
[7] R. T. Chen, D. Robinson, Z. Sun, T. Jannson, and D. V. Plant, \60 GHz Board-to-Board
Optical Interconnection Using Polymer Optical Buses in Conjunction with Microprism
Couplers," Applied Physics Letters, vol. 60(5), February 1992, pp. 536-538.
[8] S. Y. Cho, M. A. Brooke, and N. M. Jokerst, "Optical Interconnections on Electrical
Boards Using Embedded Active Optoelectronic Components," IEEE Journal of Selected
Topics in Quantum Electronics, vol. 9, no. 2, March/April 2003, pp. 465-476.
[9] Introduction to DWDM, Cisco Systems, 2000.
http://www.cisco.com/univercd/cc/td/doc/product/mels/dwdm/dwdm_fns.htm
[10] J. Duato, S. Yalamanchili, and L. Ni, "Interconnection Networks: An Engineering
Approach," Morgan Kaufmann, 2002.
[11] M. Feldman, A. El-Amawy, and R. Vaidyanathan, "High Speed, High Capacity Bused
Interconnects Using Optical Slab Waveguides," Proc. Workshop on Optics and Computer
Science (Springer-Verlag Lecture Notes in Computer Science, vol. 1586), 1999,
pp. 924-937.
[12] M. Feldman, A. El-Amawy, and R. Vaidyanathan, "Optical Slab Waveguide for Massive,
High-Speed Interconnects," U.S. Patent 6332050, December 2001.
[13] M. R. Feldman, S. C. Esener, C. C. Guest, and S. H. Lee, "Comparison Between Optical
and Electrical Interconnects Based on Power and Speed Considerations," Applied Optics,
vol. 27, no. 9, 1988, pp. 1742-1751.
[14] G. R. Fowles, "Introduction to Modern Optics," Dover Publications, 2nd edition, 1989.
[15] X. Han, G. Kim, J. Lipovski, and R. T. Chen, "An Optical Centralized Shared-Bus
Architecture Demonstrator for Microprocessor-to-Memory Interconnects," IEEE Journal
of Selected Topics in Quantum Electronics, vol. 9, no. 2, March/April 2003, pp. 512-517.
[16] L. J. Irakliotis and P. A. Mitkas, "Optics: A Maturing Technology for Better
Computing," IEEE Computer Magazine, vol. 31, no. 2, February 1998, pp. 36-37.
[17] G. Kim, X. Han, and R. T. Chen, "A Method for Rebroadcasting Signals in an Optical
Backplane Bus System," Journal of Lightwave Technology, vol. 22, no. 3, March 2004,
pp. 840-844.
[18] G. Kim, X. Han, and R. T. Chen, "Crosstalk and Interconnection Distance
Considerations for Board-to-Board Optical Interconnects using 2-D VCSEL and Microlens
array," IEEE Photonics Technology Letters, vol. 12, no. 6, June 2000, pp. 743-745.
[19] T. Leighton, "Introduction to Parallel Algorithms and Architectures: Arrays, Trees,
Hypercubes," Morgan Kaufmann Publishers, San Mateo, California, 1992.
[20] R. Lytel, H. Davidson, N. Nettleson, and T. Sze, "Optical Interconnections within
Modern High-Performance Computing Systems," Technical report, Sun Microsystems,
May 2000.
[21] "Ultrahigh Speed Multi-Gigabit Wireless Laser Communication System with Fully
Integrated High-Speed Microwave Radio Backup," MDA Technologies, 2002.
http://www.mdatechnology.net/techresearch.asp?articleid=536
[22] E. Mohammed, A. Alduino, T. Thomas, H. Braunisch, et al., "Optical Interconnect
System Integration for Ultra Short Reach Applications," Intel Technology Journal, vol. 8,
issue 2, May 2004, pp. 115-128.
[23] A. V. Mule, S. Schultz, T. K. Gaylord, and J. D. Meindl, "Input Coupling and Guided
Wave Distribution Schemes for Board-Level Intra Chip Optical Clock Distribution Network
Using Volume Grating Coupler Technology," Proc. IEEE International Interconnect
Technology Conference, 2001, pp. 128-130.
[24] B. E. Nelson, G. A. Keeler, D. Agarwal, N. C. Helman, and D. A. B. Miller,
"Wavelength Division Multiplexed Optical Interconnect Using Short Pulses," IEEE Journal
of Selected Topics in Quantum Electronics, vol. 9, no. 2, March/April 2003, pp. 486-491.
[25] E. G. Paek, "High Speed Characterization of Spatial Light Modulators and its
Applications to Optical Information Processing," Lasers and Electro-Optics Society, vol. 1,
1999, pp. 323-324.
[26] C. G. Plaxton, "Load Balancing, Selection and Sorting on the Hypercube," Proc.
Symposium on Parallel Algorithms and Architectures, 1989, pp. 64-73.
[27] M. Raksapatcharawong and T. M. Pinkston, "Modelling Free-Space Optical k-ary n-cube
wormhole networks," Journal of Parallel and Distributed Computing, vol. 55(1),
1998, pp. 60-93.
[28] R. Ramaswami and K. Sivarajan, "Optical Networks: A Practical Perspective, 2nd
edition," Morgan Kaufmann, October 2001.
[29] C. S. Ram Murthy and M. Guruswamy, "WDM Optical Networks: Concepts, Designs
and Algorithms," Prentice Hall, 2002.
[30] T. Saito, T. Ota, T. Toratani, and Y. Ono, "16-ch Arrayed Waveguide Grating Module
with 100-GHz spacing," Furukawa Review, vol. 19, 2000.
[31] K. Sethuraman, "Mapping Weak Multidimensional Torus Communications on Optical
Slab Waveguides," M.S. Thesis, Dept. of Electrical and Computer Engineering, Louisiana
State University, 2005.
[32] R. Vaidyanathan and A. Padmanabhan, "Bus-Based Networks for Fan-in and Uniform
Hypercube Algorithms," Parallel Computing, vol. 21, 1995, pp. 1807-1821.
[33] R. Vaidyanathan and K. Sethuraman, "On Mapping Multidimensional Weak Tori on
Optical Slab Waveguides," Proc. International Conference on Parallel Processing, 2005.
[34] Y. Yang and J. Wang, "Designing WDM Optical Interconnects with Full Connectivity
by Using Limited Wavelength Conversion," Proc. International Parallel and Distributed
Processing Symposium, 2004.
[35] A. Yenjay, R. Gao, K. Takayama, and A. F. Garito, "Ultra-Low-Loss Polymer
Waveguides," Journal of Lightwave Technology, vol. 22, no. 1, January 2004, pp. 154-158.
[36] I. A. Young, "Introducing Intel's Chip-to-Chip Optical I/O Technology,"
Technology@Intel Magazine, April 2004. www.intel.com/update/departments/initech/ito4o41.pdf
Vita
Divya Rengan completed her schooling in April 2000, at Holy Angels Anglo Indian Higher
Secondary School, Chennai, India.
She pursued her bachelor's degree at Sri SaiRam Engineering College, Chennai (affiliated
to the University of Madras), majoring in information technology. She graduated with
distinction in May 2004.
She then joined the Department of Electrical and Computer Engineering at Louisiana
State University, Baton Rouge, to do her master's, in the Fall of 2004. She will be graduating
in December 2006 with the degree of Master of Science in Electrical Engineering.
