Minimum Delay Scheduling for Performance Guaranteed Switches with Optical Fabrics by Wu, Bin et al.
JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 27, NO. 16, AUGUST 15, 2009 3453
Minimum Delay Scheduling for Performance
Guaranteed Switches With Optical Fabrics
Bin Wu, Member, IEEE, Kwan L. Yeung, Senior Member, IEEE, Pin-Han Ho, and Xiaohong Jiang, Member, IEEE
Abstract—We consider traffic scheduling in performance guaranteed switches with optical fabrics to ensure 100% throughput and bounded packet delay. Each switch reconfiguration consumes a constant period of time called reconfiguration overhead, during which no packet can be transmitted across the switch. To minimize the packet delay bound for an arbitrary traffic matrix, the number of switch configurations in the schedule should be no larger than the switch size $N$. This is called minimum delay scheduling, where the ideal minimum packet delay bound is determined solely by the total overhead of the $N$ switch reconfigurations. A speedup in the switch determines the actual packet delay bound, which decreases toward the ideal bound as the speedup increases. Our objective is to minimize the required speedup $S_{schedule}$ under a given actual packet delay bound. We propose a novel minimum delay scheduling algorithm, quasi largest-entry-first (QLEF), to solve this problem. Compared with the existing minimum delay scheduling algorithms MIN and $\alpha^i$-SCALE, QLEF dramatically cuts down the required $S_{schedule}$ bound. For example, QLEF only requires $S_{schedule} = 17.89$ for $N = 450$, whereas MIN and $\alpha^i$-SCALE require $S_{schedule} = 37.13$ and 27.82, respectively. This gives a significant performance gain of 52% over MIN and 36% over $\alpha^i$-SCALE.
Index Terms—Optical switch, performance guaranteed
switching, reconfiguration overhead, scheduling, speedup.
I. INTRODUCTION
RECENT progress on optical switching technologies [1]–[4] has enabled the implementation of high-speed
scalable switches with optical switch fabrics, as shown in Fig. 1.
These switches can efficiently provide huge switching capacity
as demanded by the backbone routers in the Internet. Since the
input–output modules are connected to the central switch fabric
by optical fibers, they can be distributed over several standard
telecommunication racks. This reduces the power consumption
for each rack, and makes the resulting switch architecture more
scalable.
On the other hand, optical switch fabric usually needs a non-
negligible amount of time to change its switch configuration.
This reconfiguration overhead is due to three factors [5]. First,
Manuscript received December 31, 2007; revised June 04, 2008. First pub-
lished April 14, 2009; current version published July 24, 2009. This paper was
presented in part at the IEEE GlobeCom’05, St. Louis, MO, Dec. 2005, and the
IEEE GlobeCom’06, San Francisco, CA, Dec. 2006.
B. Wu and K. L. Yeung are with the Department of Electrical and Electronic
Engineering, The University of Hong Kong, Pokfulam, Hong Kong (e-mail:
binwu@eee.hku.hk; e-mail: kyeung@eee.hku.hk).
P.-H. Ho is with the Department of Electrical and Computer Engineering,
University of Waterloo, Waterloo, ON, Canada N2L 3G1 (e-mail: pinhan@bbcr.
uwaterloo.ca).
X. Jiang is with the Department of Computer Science, Graduate School of
Information Science, Tohoku University, Aramaki Sendai 980-8579, Japan
(e-mail: jiang@ecei.tohoku.ac.jp).
Digital Object Identifier 10.1109/JLT.2008.2005552
Fig. 1. Scalable switch with an optical switch fabric.
the optical switch fabric needs time to change its interconnec-
tion pattern, and this time varies from 10 ns to several millisec-
onds depending on the switching technology adopted [1]–[4].
Second, time (10–20 ns or more [5]) is required to resynchronize
the optical transceivers and the switch fabric. Finally, because
optical signals may arrive at their corresponding input ports at
different times, time is also needed to align the clock in order to
avoid data loss.
During the reconfiguration period, no packet can be trans-
mitted across the switch fabric (i.e., tune-transmit separability
constraint [6], [7]). To achieve performance guaranteed
switching [8]–[11] (i.e., 100% throughput with bounded packet
delay), the switch fabric has to transmit packets at an internal
speed higher than the external line-rate, resulting in a speedup.
The amount of speedup is defined as the ratio of the internal
packet transmission rate to the external line-rate.
Assume each switch reconfiguration takes an overhead of $\delta$ slots and each slot can accommodate one packet. Conventional slot-by-slot scheduling methods may severely cripple the performance of optical switches due to frequent reconfigurations. Hence, the reconfiguration frequency needs to be reduced by holding each configuration for multiple time slots. Time-slot assignment (TSA) [8]–[11] is a common approach to achieve this, where a switch works in a pipelined four-stage cycle: traffic accumulation, scheduling, switching, and transmission, as shown in Fig. 2. Stage 1 is for traffic accumulation. An $N \times N$ traffic matrix $C(T) = \{c_{ij}\}$ is obtained at the input buffers every $T$ time slots. Each entry $c_{ij}$ denotes the number of packets that arrived at input $i$ and are destined to output $j$. Assume the traffic has been regulated to be admissible before entering the switch, i.e., the entries in each row or column of $C(T)$ [defined as a line of $C(T)$] sum to at most $T$. In Stage 2, a scheduling algorithm computes a schedule consisting of at most $N_S$ configurations in $H$ time slots. Each configuration is denoted by a permutation matrix $P_n = \{p^{(n)}_{ij}\}$. If $p^{(n)}_{ij} = 1$, input $i$ is connected to output $j$, and we say that $P_n$ covers entry $c_{ij}$. A weight $\phi_n$ is assigned to each $P_n$, indicating the number of slots that $P_n$ should be kept for packet switching in Stage 3. To ensure 100% throughput, the set of $N_S$ configurations must cover $C(T)$, i.e., $\sum_{n=1}^{N_S} \phi_n p^{(n)}_{ij} \ge c_{ij}$ for any $i, j \in \{1, \ldots, N\}$.
Fig. 2. Timing diagram for packet switching.
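The covering condition above can be checked mechanically. The following is a minimal sketch, assuming the traffic matrix and the configurations are given as nested lists; the helper names `is_permutation` and `schedule_covers` and the 2 × 2 example are hypothetical and do not come from the paper.

```python
from typing import List

Matrix = List[List[int]]


def is_permutation(p: Matrix) -> bool:
    """True if p is a 0/1 matrix with exactly one 1 in every row and column."""
    n = len(p)
    if any(v not in (0, 1) for row in p for v in row):
        return False
    rows_ok = all(sum(row) == 1 for row in p)
    cols_ok = all(sum(p[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


def schedule_covers(traffic: Matrix, configs: List[Matrix], weights: List[int]) -> bool:
    """True if sum_n weights[n] * configs[n][i][j] >= traffic[i][j] for all i, j."""
    n = len(traffic)
    return all(
        sum(w * p[i][j] for p, w in zip(configs, weights)) >= traffic[i][j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical 2 x 2 example: every line sums to 4.
traffic = [[3, 1], [1, 3]]
identity = [[1, 0], [0, 1]]
swap = [[0, 1], [1, 0]]
print(schedule_covers(traffic, [identity, swap], [3, 1]))  # True
print(schedule_covers(traffic, [identity, swap], [2, 1]))  # False
```

In the second call the weight 2 is too small for the diagonal entries of value 3, so the schedule would lose packets.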
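Packet switching takes place in Stage 3, where the switch fabric is reconfigured according to the $N_S$ configurations obtained in Stage 2. Under a speedup $S$, the duration of a time slot is shortened by $S$ times. A shortened slot (with duration $1/S$ of a regular slot) is called a compressed slot. The total holding time of the $N_S$ switch configurations is $\sum_{n=1}^{N_S} \phi_n$ compressed slots, or $\frac{1}{S}\sum_{n=1}^{N_S} \phi_n$ regular slots. Since speedup cannot reduce the reconfiguration overhead, the total overhead for the $N_S$ reconfigurations is $\delta N_S$ regular slots. Combining reconfiguration overhead and configuration holding time, Stage 3 requires $\delta N_S + \frac{1}{S}\sum_{n=1}^{N_S} \phi_n$ regular slots. To ensure 100% throughput, the speedup must satisfy
$$\delta N_S + \frac{1}{S}\sum_{n=1}^{N_S} \phi_n \le T. \tag{1}$$
Rearranging (1), we have the minimum required speedup as
$$S = \frac{1}{T - \delta N_S}\sum_{n=1}^{N_S} \phi_n = S_{reconfigure} \times S_{schedule} \tag{2}$$
where
$$S_{reconfigure} = \frac{T}{T - \delta N_S} \tag{3}$$
$$S_{schedule} = \frac{1}{T}\sum_{n=1}^{N_S} \phi_n. \tag{4}$$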
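As a numerical illustration of (1)–(4), the sketch below computes the two speedup factors and their product. The function name `required_speedup` and the example parameter values are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def required_speedup(T: float, delta: float, weights: Sequence[float]) -> Tuple[float, float, float]:
    """Return (S, S_reconfigure, S_schedule) following (2)-(4).

    T       : traffic accumulation time in slots
    delta   : reconfiguration overhead in slots
    weights : holding times phi_n of the N_S configurations
    """
    n_s = len(weights)
    if T <= delta * n_s:
        # No finite speedup can satisfy (1) in this case.
        raise ValueError("T must exceed the total reconfiguration overhead")
    s_reconfigure = T / (T - delta * n_s)
    s_schedule = sum(weights) / T
    return s_reconfigure * s_schedule, s_reconfigure, s_schedule


# Hypothetical values: T = 100 slots, delta = 5 slots, four configurations.
S, s_rec, s_sch = required_speedup(100, 5, [60, 30, 20, 10])
print(s_rec, s_sch, S)
```

With these values the overhead consumes 20 of the 100 slots, so $S_{reconfigure} = 1.25$, while the weights sum to 120 slots, so $S_{schedule} = 1.2$; the product agrees with $\sum_n \phi_n / (T - \delta N_S) = 120/80$.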
We can see that the overall speedup $S$ consists of two factors $S_{reconfigure}$ and $S_{schedule}$. In particular, $S_{reconfigure}$ compensates for hardware inefficiency caused by the $N_S$ reconfigurations, and $S_{schedule}$ compensates for algorithmic inefficiency caused by the bandwidth loss. During the holding time of a configuration (which is determined by the scheduling algorithm), some input–output connections become idle (earlier than others) if their scheduled backlog packets are all sent. As a result, the schedule will contain empty slots and this causes bandwidth loss, or algorithmic inefficiency.
Stage 4 takes another $T$ slots to dispatch packets from the output buffers to the output lines in the first-in-first-out (FIFO) order. Without loss of generality, we define a flow as a series of packets coming from the same input port and destined to the same output port of the switch. Since packets in each flow follow FIFO order in passing through the switch, there is no packet out-of-order problem within the same flow. (But packets in different flows may interleave at the output buffers.)
Consider a packet that arrives at the input buffer in the first slot of Stage 1. It suffers the worst-case delay of $T$ slots in Stage 1 for traffic accumulation, and another delay of $H$ slots in Stage 2 for algorithm execution. In the worst case, this packet will experience a maximum delay of $2T$ slots in Stages 3 and 4 (assume it is sent onto the output line in the last slot of Stage 4). Taking all four stages into account, the delay experienced by any packet at the switch is bounded by $3T + H$ slots, as shown in Fig. 2. Assume that the algorithm execution time $H$ is a constant. The packet delay bound $3T + H$ is then dominated by the traffic accumulation time $T$. According to (1), $T$ depends on the number of configurations in the schedule (i.e., $N_S$), the speedup $S$, and the efficiency of the scheduling algorithm (which determines $\sum_{n=1}^{N_S} \phi_n$).
To minimize the packet delay bound $3T + H$, we first consider an ideal case where an infinite speedup can be deployed in the switch. Then, $\frac{1}{S}\sum_{n=1}^{N_S} \phi_n$ in (1) can be ignored, $T$ can be as small as $\delta N_S$, and the packet delay bound is $3\delta N_S + H$. Therefore, a smaller $N_S$ will lead to a lower packet delay bound. For performance guaranteed switching with an arbitrary traffic matrix $C(T)$, $N_S$ must be no less than the switch size $N$. This is because each configuration can cover at most $N$ entries, and the $N^2$ entries in $C(T)$ must be covered by at least $N$ configurations [8], [10]. Consequently, the ideal minimum packet delay bound is $3\delta N + H$, which can only be obtained with the minimum number $N_S = N$ of configurations in the schedule. Accordingly, if a scheduling algorithm always ensures at most $N_S = N$ configurations in the schedule, we call it a minimum delay scheduling algorithm. In contrast, if an algorithm requires $N_S > N$ configurations (in the worst case), the packet delay bound must be larger than the ideal bound $3\delta N + H$, even if the speedup is infinite.
Though an infinite speedup is infeasible in practice, the actual packet delay bound in minimum delay scheduling can be made as close to the ideal bound $3\delta N + H$ as possible if the speedup is large enough. On the other hand, a large speedup means a high implementation cost of the switch. Therefore, a key issue is to minimize the required speedup for a given actual packet delay bound (or equivalently a given traffic accumulation time $T$). Under the requirement of $N_S = N$, this translates to minimizing $\sum_{n=1}^{N} \phi_n$ in (1) or $S_{schedule}$ in (4), because $T$, $\delta$ and $N_S = N$ are given parameters and thus $S_{reconfigure}$ in (2)–(3) becomes a constant.
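The tradeoff between speedup and delay can be made concrete by solving (2)–(4) for $T$ with $N_S = N$, which gives $T = \delta N / (1 - S_{schedule}/S)$. The sketch below assumes that $S_{schedule}$ is a fixed number that does not depend on $T$ (for example, a worst-case bound of the scheduling algorithm); the function names and all numeric values are hypothetical.

```python
def accumulation_time(n: int, delta: float, s_schedule: float, speedup: float) -> float:
    """Solve S = T / (T - delta * N) * S_schedule for T, with N_S = N."""
    if speedup <= s_schedule:
        raise ValueError("speedup must exceed S_schedule")
    return delta * n / (1.0 - s_schedule / speedup)


def delay_bound(t: float, h: float) -> float:
    """Packet delay bound 3T + H."""
    return 3 * t + h


N, DELTA, H, S_SCHEDULE = 4, 5, 10, 1.2   # hypothetical parameters
ideal = delay_bound(N * DELTA, H)          # ideal bound 3 * delta * N + H
for speedup in (2.4, 12.0, 120.0):
    t = accumulation_time(N, DELTA, S_SCHEDULE, speedup)
    print(speedup, round(t, 3), round(delay_bound(t, H), 3), ideal)
```

As the speedup grows, the printed delay bound decreases toward the ideal value without ever reaching it.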
In essence, the above minimum delay scheduling problem is equivalent to a matrix decomposition problem with the constraint $N_S = N$, where an $N \times N$ traffic matrix $C(T)$ is decomposed into $N_S = N$ configurations and the sum of the weights $\sum_{n=1}^{N} \phi_n$ is minimized. In fact, traffic matrix decomposition is a classic problem [8]–[17]. Algorithms based on Hall's theorem [14] or Birkhoff-von Neumann decomposition [12], [13], [15]–[17] generate a large number of configurations (e.g., $N^2 - 2N + 2$ in Birkhoff-von Neumann decomposition), and thus are only favorable in scheduling problems without reconfiguration overhead. For scheduling problems with reconfiguration overhead, greedy algorithms such as LIST [7], [8], [18] and decompositions based on graph theory [7]–[9] have been developed to produce a smaller number of configurations in the schedule. The impact of reconfiguration overhead on the scheduling performance is also studied in [19]. A common objective of those works is to minimize the schedule length, which includes time for both reconfigurations and packet transmission. Minimum delay scheduling in this paper differs from those works by ensuring $N_S = N$, with the objective of minimizing speedup under a given packet delay bound. We also notice that algorithm GOPAL [20] ensures $N_S = N$, but it is designed for average performance instead of worst-case performance guarantee.
To the best of our knowledge, there are only two existing minimum delay scheduling algorithms, MIN [8] and $\alpha^i$-SCALE [10], that can achieve performance guaranteed switching. Though $\alpha^i$-SCALE generally outperforms MIN, its $S_{schedule}$ bound is still too high. In this paper, a novel minimum delay scheduling algorithm, quasi largest-entry-first (QLEF), is proposed. Compared with MIN and $\alpha^i$-SCALE, QLEF requires the lowest $S_{schedule}$ bound. For example, QLEF cuts down the $S_{schedule}$ bound by 52% over MIN and 36% over $\alpha^i$-SCALE for a switch with size $N = 450$. The rest of the paper is organized as follows. In Section II, QLEF is proposed. In Section III, we derive the speedup bound for QLEF. The paper is concluded in Section IV.
II. QLEF ALGORITHM
A. Largest-Entry-First (LEF) Procedure
In minimum delay scheduling, we need to find $N$ configurations $P_1, \ldots, P_N$ and the corresponding weights $\phi_1, \ldots, \phi_N$ to cover an arbitrary admissible traffic matrix $C(T)$. From (2)–(4), minimizing speedup $S$ (or $S_{schedule}$) is equivalent to minimizing the sum of the weights $\sum_{n=1}^{N} \phi_n$. Intuitively, each configuration can be constructed by always covering $N$ largest not-yet-covered entries, each from a distinct row and column of $C(T)$. In this way, those entries can share the same large weight. This potentially saves the weights of subsequently constructed configurations. So our basic idea is to always cover the "largest" entry first.
Specifically, the configurations $P_1, \ldots, P_N$ are constructed one by one as follows. Large entries in $C(T)$ are first considered. If an entry $c_{ij}$ is covered by a particular configuration $P_n$, we set $p^{(n)}_{ij}$ to 1 in $P_n$ and $c_{ij}$ to 0 in $C(T)$. For simplicity, we still use $C(T)$ to denote the updated traffic matrix, though some of its entries may have been set to 0. Each $P_n$ can cover only one entry in each row and column of $C(T)$. To avoid covering multiple entries in the same line of $C(T)$ by the same configuration, we define a shadowing operation. Particularly, if a "largest" entry $c_{ij}$ is to be covered by $P_n$, we use two dashed lines to shadow row $i$ and column $j$ of $C(T)$. The next "largest" entry can only be selected in the remaining not-yet-shadowed part. As an example, the largest entry $c_{11}$ in Fig. 3(a) is first selected and is covered by $P_1$. The first row and the first column of $C(T)$ are shadowed. $p^{(1)}_{11}$ is set to 1 and $c_{11}$ is set to 0. After that, the largest entry in the remaining not-yet-shadowed part of $C(T)$ is selected. We repeat the above steps until $N$ large entries are selected. At this point, all the lines of $C(T)$ are shadowed and the construction of $P_1$ completes. Then, we un-shadow $C(T)$ by removing all the dashed lines, and repeat the above process to construct the next configuration $P_2$. We call this procedure LEF.
Fig. 3. "Largest-Entry-First" and "shadow". (a) LEF procedure and shadowing operation. (b) Non-LEF procedure.
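The LEF procedure described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' implementation: ties between equal entries are broken in row-major order (an assumption), the procedure simply repeats until the matrix is all zero, and the 3 × 3 matrix is a hypothetical example, not the one in Fig. 3.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def lef_schedule(traffic: Matrix) -> Tuple[List[Matrix], List[int]]:
    """Largest-entry-first decomposition with shadowing.

    Returns (configs, weights). The number of configurations may exceed N,
    because LEF does not prevent a configuration from covering an entry
    that was already covered (set to 0) earlier.
    """
    n = len(traffic)
    c = [row[:] for row in traffic]           # working copy of C(T)
    configs, weights = [], []
    while any(v > 0 for row in c for v in row):
        free_rows = list(range(n))            # not-yet-shadowed rows
        free_cols = list(range(n))            # not-yet-shadowed columns
        perm = [[0] * n for _ in range(n)]
        weight = None
        for _ in range(n):
            # Largest entry in the not-yet-shadowed part; max() keeps the
            # first maximum, i.e. row-major tie-breaking.
            i, j = max(
                ((i, j) for i in free_rows for j in free_cols),
                key=lambda ij: c[ij[0]][ij[1]],
            )
            if weight is None:
                weight = c[i][j]              # first (largest) entry is the weight
            perm[i][j] = 1
            c[i][j] = 0
            free_rows.remove(i)               # shadow row i
            free_cols.remove(j)               # shadow column j
        configs.append(perm)
        weights.append(weight)
    return configs, weights


# Hypothetical admissible matrix: every line sums to 13.
demo = [[10, 2, 1],
        [1, 10, 2],
        [2, 1, 10]]
configs, weights = lef_schedule(demo)
print(weights, sum(weights))  # [10, 2, 1] 13
```

For this matrix the three large diagonal entries share the first weight, so the remaining configurations only need small weights.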
Fig. 3 shows two possible schedules for the same $3 \times 3$ $C(T)$. The schedule in Fig. 3(a) is obtained using the above LEF procedure. The largest entries are covered by $P_1$ with a large weight of 10. The remaining entries are covered by $P_2$ and $P_3$ with small weights 2 and 1, respectively. The sum of the three weights is 13. The schedule in Fig. 3(b) is generated by some non-LEF procedure. The entries covered by its $P_1$ again require a large weight of 10. As a result, the not-yet-covered large entries may become the weights for $P_2$ and $P_3$. Compared with the LEF procedure, this non-LEF procedure gives a much larger weight sum of 27.
Unfortunately, unlike MIN [8] and $\alpha^i$-SCALE [10], the above LEF procedure cannot ensure that $N$ configurations are always sufficient to cover all the entries in $C(T)$. This is because LEF cannot prevent configuration overlaps, i.e., several configurations may cover the same entry. An example is shown in Fig. 4, where the schedule returned by the LEF procedure consists of four configurations instead of the minimum three. The entries that are covered more than once (i.e., configuration overlaps) are circled in the figure.
B. QLEF Algorithm
The QLEF algorithm is summarized in Fig. 5. It is designed to rectify the configuration overlap problem of LEF, which qualifies QLEF as a minimum delay scheduling algorithm. Specifically, we use an $N \times N$ reference matrix $R = \{r_{ij}\}$ to record the status of each entry $c_{ij}$. If $c_{ij}$ is not-yet-covered, then $r_{ij} = 1$; otherwise $r_{ij} = 0$. $R$ is initialized to an all-1 matrix. Assume that the $N$ configurations are sequentially constructed from $P_1$ to $P_N$. When a configuration is obtained, the corresponding entries in both $C(T)$ and $R$ are set to 0. The updated $C(T)$ and $R$ are then used to construct the next configuration.
Without loss of generality, we first focus on the construction of the first half of the configurations.
Fig. 4. Configuration overlap in the LEF procedure.
Fig. 5. QLEF algorithm.
Assume that both and are updated and un-shadowed.
The construction process of is similar to the LEF pro-
cedure, but we only select largest entries from
the not-yet-shadowed part of (instead of selecting en-
tries as in LEF). We call them selected-entries. For each se-
lected-entry, the corresponding entry values in both and
are set to 0. At the same time, the row and column it re-
sides in are shadowed in both and . The remaining
not-yet-shadowed part of is a matrix
defined as (see Fig. 6). We then construct a bipartite graph
[21]–[23] from (see the example in Fig. 7). Note that
a “0” in corresponds to a covered entry which will not be-
come an edge in . Then, we find a perfect matching [8] in
by performing maximum-size matching (MSM) [22] (to be
proved later). As defined in Fig. 7, a perfect matching is a subset
of edges where each vertex is incident on exactly one edge in
that subset. So, the perfect matching obtained in contains
edges. It corresponds to not-yet-covered en-
tries, called MSM-entries. We also set these entries to 0 in both
and . Combining the MSM-entries with the
selected-entries, we get entries, each in a distinct
row and column. is obtained by setting those entries
to 1 and all other entries to 0.
In the above procedure, the selected-entries
can always be properly chosen from the not-yet-covered entries
in , as explained below. In constructing , as a result
of constructing the previous configurations , each
line of has not-yet-covered entries, and covered
Fig. 6. Reference matrix R in QLEF algorithm.
Fig. 7. Bipartite graph, perfect matching and maximum-size matching (MSM).
entries (denoted by 0 in both and ). Here we only need
to select not-yet-covered entries (each from a
distinct row and column). By referring to , the success of this
selection is ensured because .
Next we prove that a perfect matching containing
edges always exists in . This is ensured by the following
Theorem (taken from theorem 7 of [8]).
Theorem: For a bipartite graph with
, there always exists a perfect matching in if its
minimum degree is greater than .
Since configurations are constructed ahead of , there
are at most (i.e., covered entries) in each line of . Because
is a matrix, the minimum degree of the
corresponding bipartite graph is at least
. Therefore, a perfect matching containing
edges exists in according to the Theorem.
When is obtained, we un-shadow and , and re-
peat the above procedure (i.e., Steps 2a–2c in Fig. 5) to construct
the next configuration. In Fig. 5, is an auxiliary variable that
helps to pick up the first-met largest entry as the weight for
. This ensures that the corresponding entries in can
be properly covered by with a sufficiently large weight.
So far, we have discussed all the key operations of QLEF
algorithm. QLEF also has two additional features. First, since
requires , we use the
above approach (i.e., Step 2 in Fig. 5) to construct only the first
configurations. This ensures that the se-
lected-entries in can always be properly chosen. Second,
after the first configurations are obtained, we find the
largest in the updated and use it as the constant weight
for each subsequent configuration (so as to cover the remaining
small entries). Because the bipartite graph of is always reg-
ular [8], [10], [21] after a configuration is constructed, each sub-
sequent configuration in can be obtained by
performing maximum-size matching in the updated .
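The two phases of QLEF described above can be sketched in code. Several details here are assumptions rather than facts from the paper: configuration $n$ of the first phase takes $N - 2n + 2$ selected-entries (chosen so that the leftover submatrix has minimum degree of at least half its size, which guarantees a perfect matching), the first phase runs for $\lfloor N/2 \rfloor$ configurations, ties are broken in row-major order, and the matching uses a simple augmenting-path search instead of the algorithm of [22]. All function names and the 4 × 4 matrix are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Matrix = List[List[int]]


def perfect_matching(adj: Matrix, rows: Iterable[int], cols: Iterable[int]) -> Dict[int, int]:
    """Augmenting-path bipartite matching; returns {row: col} or raises."""
    cols = list(cols)
    owner: Dict[int, int] = {}                 # column -> matched row

    def augment(r: int, seen: set) -> bool:
        for col in cols:
            if adj[r][col] and col not in seen:
                seen.add(col)
                if col not in owner or augment(owner[col], seen):
                    owner[col] = r
                    return True
        return False

    for r in rows:
        if not augment(r, set()):
            raise ValueError("no perfect matching")
    return {r: col for col, r in owner.items()}


def qlef_schedule(traffic: Matrix) -> Tuple[List[Matrix], List[int]]:
    n = len(traffic)
    c = [row[:] for row in traffic]            # working copy of C(T)
    ref = [[1] * n for _ in range(n)]          # reference matrix R, 1 = not yet covered
    configs, weights = [], []

    def cover(pairs: List[Tuple[int, int]]) -> None:
        perm = [[0] * n for _ in range(n)]
        for i, j in pairs:
            perm[i][j] = 1
            c[i][j] = 0
            ref[i][j] = 0
        configs.append(perm)

    half = n // 2                              # assumed length of the first phase
    for idx in range(1, half + 1):
        rows, cols = list(range(n)), list(range(n))
        pairs = []
        for _ in range(n - 2 * idx + 2):       # assumed number of selected-entries
            i, j = max(
                ((i, j) for i in rows for j in cols if ref[i][j]),
                key=lambda ij: c[ij[0]][ij[1]],
            )
            pairs.append((i, j))
            rows.remove(i)                     # shadow row i
            cols.remove(j)                     # shadow column j
        weights.append(c[pairs[0][0]][pairs[0][1]])   # first selected-entry is the weight
        pairs += perfect_matching(ref, rows, cols).items()  # MSM-entries
        cover(pairs)

    rest = max(v for row in c for v in row)    # constant weight for the second phase
    for _ in range(half, n):
        weights.append(rest)
        cover(list(perfect_matching(ref, range(n), range(n)).items()))
    return configs, weights


demo = [[5, 1, 0, 2],
        [0, 4, 3, 1],
        [2, 2, 3, 1],
        [1, 1, 2, 4]]                          # every line sums to 8
configs, weights = qlef_schedule(demo)
print(len(configs), weights)
```

Because each configuration only uses entries that are still marked 1 in the reference matrix, the $N$ permutation matrices never overlap and together cover every position exactly once.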
In summary, QLEF guarantees that there is no configuration overlap in scheduling $C(T)$. So, $C(T)$ can always be covered by $N$ configurations. The time complexity of QLEF is dominated by running the maximum-size matching algorithm [22] [which has a complexity of $O(N^{2.5})$] for $N$ times, resulting in the same time complexity of $O(N^{3.5})$ as MIN [8] and $\alpha^i$-SCALE [10]. But edge-coloring [8], [23] and partitioning in MIN and $\alpha^i$-SCALE do not appear in QLEF.
Fig. 8. Example of QLEF. The circled entries are selected-entries.
C. An Example
To cover the $7 \times 7$ traffic matrix $C(T)$ in Fig. 8, QLEF gen-
erates 7 configurations with a weight sum of 58. As an example,
the construction process of in Fig. 8 is detailed in Fig. 7. For a
given traffic matrix, we can also find an optimal schedule using
the Integer Linear Program (ILP) formulated in Appendix A.
(But ILP is generally impractical due to its extremely long ex-
ecution time.) Appendix A gives an ILP-based schedule for the
traffic matrix in Fig. 8. The weight sum 58 in QLEF is only
slightly above the ILP-based result.
III. SPEEDUP BOUND
In QLEF, the configurations are sequentially constructed
from to . We first focus on the first half configurations
. Fig. 9 shows the conceptual
QLEF scheduling procedure. In Fig. 9(a), we use a “scheduling
trace” to represent the trend of values covered in the ordered
configurations. Although QLEF gives priority in covering the
largest entry in the not-yet-shadowed part of , the sched-
uling trace is generally a wavelike curve rather than a mono-
tonically decreasing curve. This is due to the shadowing oper-
ation. In Fig. 9(b), assume that entry is first chosen as a se-
lected-entry. Then, entries and are shadowed, where
and . Next, entry is chosen as a selected-entry even
though or . So and can only be covered by other
configurations constructed later. On the other hand, we can see
that the selected-entries covered by the same configuration are
always chosen in a descending order of their entry values, e.g.,
in Fig. 9(b).
We now focus on the construction of configuration .
Note that (the weight of ) is also an entry in
, and it is the first selected-entry in . Hereafter,
we treat as an entry in rather than a weight. Let
denote the set of ordered configurations
constructed earlier than . If is shadowed in the con-
struction process (Step 2b of Fig. 5) of ,
then is called an s-configuration. Otherwise, is called a
g-configuration. Among the configurations ,
assume there are g-configurations and s-configura-
tions, as shown in Fig. 9(a).
Fig. 9. Conceptual QLEF scheduling procedure.
A. General Idea for Speedup Bound Formulation
In the original , we define an entry larger than or
equal to as a large-entry (or LE). Let be the min-
imum number of LEs covered by the first configurations
. These LEs reside in lines (rows or
columns) of , and the line with the maximum number of
LEs must contain at least the average number of LEs.
As a result, the smallest LE in this line must be smaller than
or equal to the th largest entry of this line. Yet, this
smallest LE is not smaller than . Because the maximum
line sum of is , according to Lemma 1 in Appendix B,
we have
(5)
On the other hand, because is shadowed by s-con-
figurations, from Lemma 2 in Appendix B, we have
(6)
Combining (5) and (6), we can bound as follows:
(7)
The inequality in (7) indicates that no matter what is the value of
, the bound always holds because we have taken
the worst case into consideration (i.e., the “max” function).
For the remaining configurations constructed
in Step 3 of Fig. 5, QLEF uses a small constant as their common
weight. Since the weights in QLEF follow a monotonically de-
creasing order as shown by the dashed line in Fig. 9(a), this con-
stant is not larger than any weight of the first config-
urations. In fact, it can be bounded by the weight of .
That is
(8)
Consequently, we can replace the weights in (4) by the bounds
in (7) & (8) to get an upper-bound for . (Note that
in (4) for minimum delay scheduling.)
B. A Simple Speedup Bound
To get an bound, we still need to determine in
(7) which denotes the minimum number of LEs covered by
. In this part, we first count using a simple
approach, and a more in-depth analysis will be carried out in
Part C to render a refined speedup bound. The simple speedup
bound provides a reference to which we can check how much
additional gain is achieved by the refined speedup bound. It
also helps to better understand the refined speedup bound in
Part C.
According to QLEF, all the selected-entries in any g-configu-
ration must be LEs, and each s-configuration must cover at least
one LE in the same line as . However, due to the shadowing
operation, it is generally difficult to count how many additional
LEs in other lines are covered by an s-configuration. Here we
simply ignore the LEs contributed by the s-configurations,
and only count those LEs (or selected-entries) contributed by the
g-configurations. Then, can be minimized only if the
g-configurations are the last (out of the first ) configurations,
or . This is because decreases
as increases, and thus the number of selected-entries in each
subsequent configuration becomes less and less. As a result, the
minimum value of is
.
Note that all the counted LEs are contributed by the g-config-
urations, and thus none of them resides in the same line as .
In other words, these LEs reside in lines instead of
lines of . Replacing in (7) by and substituting
into (7), we get
(9)
Fig. 10. $S_{schedule}$ bound of the minimum delay scheduling algorithms.
This is equivalent to (10), shown at the bottom of the page. Consequently, $S_{schedule}$ can be bounded by (11), shown at the bottom of the page. The bound in (11) is plotted in Fig. 10 by a dashed line.
C. A Refined Speedup Bound
A lower bound can be obtained if the LEs cov-
ered by the s-configurations are judiciously counted. As-
sume is a set of consecutive s-configurations and is
the first g-configuration after . According to Lemma 3 in
Appendix B, the number of LEs covered by each s-configuration
is at least half of the number of the selected-entries
in .
In Fig. 9(a), the g-configurations and the s-configu-
rations may line up in any order. From Lemma 4 in Appendix B,
in order to minimize , the s-configurations should be
consecutively located at either the very beginning or the very
end of the configuration sequence .
Case 1: The s-configurations are consecutively located
at the very end of the configuration sequence . In
this case, all the selected-entries covered by the g-configu-
rations are LEs. Since no g-configuration follows the
s-configurations, the number of LEs contributed by the s-con-
figurations is trivial and is ignored when counting . So,
.
These LEs reside in lines of . Replacing in (7)
by and substituting into (7), we have
(12) and (13), shown at the bottom of the page.
Case 2: The s-configurations are consecutively located
at the beginning of the configuration sequence .
In this case, the first g-configuration (i.e., ) has
selected-entries. Therefore, each s-configuration
will cover at least LEs according to
Lemma 3 in Appendix B. Taking the LEs (or selected-entries)
covered by the g-configurations into account, we get
by simple calculation. From (7)
we have (14), shown at the bottom of the page. Again, this is
equivalent to
(15)
(10)
(11)
(12)
or
(13)
(14)
Combining (13), (15), and (8), we get the bound for in
(16), shown at the bottom of the page. Then, we can replace the
weights in (4) by the bound in (16) to find the refined
bound for QLEF.
D. Performance Comparison and Discussion
Fig. 10 shows the $S_{schedule}$ bounds for the three minimum delay scheduling algorithms MIN [8], $\alpha^i$-SCALE [10] and QLEF. The original bound for the MIN algorithm given in [8] is refined in [10] to produce the saw-toothed curve in Fig. 10. A tighter bound is then provided by the $\alpha^i$-SCALE algorithm [10]. This is followed by the two QLEF bounds derived in this paper using (11) and (16), respectively. We can see that if the LEs contributed by the s-configurations are judiciously counted, a further cut of 10% in the $S_{schedule}$ bound can be achieved (i.e., solid versus dashed QLEF bound in Fig. 10).
As an example, we consider a switch with size $N = 450$. MIN requires $S_{schedule} = 37.13$ and $\alpha^i$-SCALE requires $S_{schedule} = 27.82$. However, QLEF only requires $S_{schedule} = 17.89$. This gives a cut of 52% over MIN and 36% over $\alpha^i$-SCALE.
QLEF is designed for delay sensitive applications, where the
packet delay bound is close to the ideal minimum bound of
. Note that two performance guaranteed scheduling
algorithms ADAPT and SRF are proposed in [9]. Both of them
are self-adaptive to the given switch parameters ( and )
by generating a proper number of configurations to mini-
mize the required speedup. However, ADAPT and SRF require
configurations in the schedule and thus are not min-
imum delay scheduling algorithms. Though they can generate
a schedule with slightly larger than (e.g.,
, etc.), the required speedup is significantly larger than
that required by QLEF. On the other hand, if the application is
not delay sensitive and is allowed, then minimum
delay scheduling is not necessary and ADAPT/SRF may be fa-
vorable. Therefore, QLEF complements ADAPT and SRF for
delay sensitive applications.
IV. CONCLUSION
Recent progress in optical switching technologies has enabled the implementation of high-speed scalable switches with optical fabrics. Due to the reconfiguration overhead, packet delay can be minimized by using the minimum number $N_S = N$ of switch configurations (where $N$ is the switch size) to schedule the traffic (i.e., minimum delay scheduling). In general, minimum delay scheduling can be formulated as a traffic matrix decomposition problem with the constraint of $N_S = N$. The traffic matrix is decomposed into the weighted sum of $N$ nonoverlapping permutation matrices. In this paper, a novel minimum delay scheduling algorithm QLEF (Quasi Largest-Entry-First) was proposed. It minimizes the required speedup $S_{schedule}$ for achieving performance guaranteed switching (i.e., 100% throughput with bounded packet delay), while ensuring the minimum number $N_S = N$ of configurations in the schedule. Compared with the two existing minimum delay scheduling algorithms MIN and $\alpha^i$-SCALE, QLEF dramatically cuts down the required speedup bound with the same time complexity of $O(N^{3.5})$.
Although QLEF is presented based on optical switch scheduling, it may find wide applications in similar scheduling problems, such as those in Satellite Switched Time-Division Multiple Access (SS/TDMA) [15], [20], [24], Time-Wavelength Interleaved Networks (TWIN) [25] and sensor surveillance networks [26].
APPENDIX A
ILP FORMULATION
To cover a given matrix , we can use
the ILP below to construct an optimal schedule consisting of
configurations with corresponding weights
. In the ILP, and are general integer
variables [27], is a binary variable, and is the maximum
line sum of the matrix
(17)
s.t.,
(18)
(19)
(20)
(21)
(22)
(16)
Fig. 11. ILP-based schedule for C(T) in Fig. 8.
The sum of the weights is minimized in (17). Constraints
(18) and (19) define each configuration as a permutation
matrix. Constraints (20) and (21) define the auxiliary variable
. For an arbitrary entry takes 0 if ,
and if . Then, constraint (22) ensures that all the
entries in are properly covered by the configurations.
Note that the ILP formulated in (17)–(22) allows configuration
overlaps in the schedule.
We implement the above ILP in ILOG CPLEX 10.0 [27] on
a standard Pentium IV 2.2 GHz computer with 500 M memory.
Fig. 11 shows an ILP-based solution for in Fig. 8 with
. It is obtained in 34.79 h (125236.29 seconds) with a
gap-to-optimality of 45.78% (the optimization stops due to out
of memory), and the weight sum is 54. We also slightly modify
the above ILP to forbid configuration overlaps. Then, an optimal
solution can be obtained in 43.69 h (157284.51 seconds) with a
weight sum of 53.
APPENDIX B
LEMMAS
Lemma 1: If a set of nonnegative entries
sum to at most and is the th largest
entry in , then .
Proof: Assume that the entries in are arranged in a
monotonically decreasing order as . Be-
cause can reach its maximum possible value
only when and
. We then have
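Lemma 1 can be checked numerically. The sketch below assumes the lemma is the standard bound that the k-th largest of a set of nonnegative entries summing to at most T is itself at most T/k; the symbols `total` (for T) and `k`, and all function names, are assumptions introduced here for illustration. The bound follows because the k largest entries are each at least as large as the k-th one, so k times the k-th largest entry cannot exceed the sum.

```python
from itertools import product
from typing import Sequence


def kth_largest(entries: Sequence[int], k: int) -> int:
    """Return the k-th largest entry (k = 1 is the maximum)."""
    return sorted(entries, reverse=True)[k - 1]


def lemma1_holds(entries: Sequence[int], total: int, k: int) -> bool:
    """Check k-th largest <= total / k, written without division."""
    return k * kth_largest(entries, k) <= total


def exhaustive_check(size: int, total: int) -> bool:
    """Test every nonnegative integer vector of the given size with sum <= total."""
    for entries in product(range(total + 1), repeat=size):
        if sum(entries) > total:
            continue
        if not all(lemma1_holds(entries, total, k) for k in range(1, size + 1)):
            return False
    return True


print(exhaustive_check(4, 6))              # True
# The bound is tight when the k largest entries are equal and the rest are 0.
print(kth_largest([2, 2, 2, 0], 3) * 3)    # 6, equal to the sum bound
```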
Lemma 2: In QLEF, if is shadowed by s-configura-
tions, then
Proof: In QLEF, is shadowed in the construction of
each s-configuration. Because can only be shadowed by
an LE in the same line with it, the s-configurations collectively
cover at least LEs in the same line with .
Assume that resides in row and column . Some of
those LEs may reside in row whereas others reside in column
(because a line may refer to either a row or a column). Without
loss of generality, we assume that out of the LEs reside in
row and the other LEs reside in column . As a result,
is (at most) the th largest entry in row and the
th largest entry in column . Because the entries in
either row or column sum to at most , from Lemma 1 we
have
Fig. 12. s-configurations may also cover a considerable number of LEs.
That is
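The arithmetic behind this step can be sketched as follows. The symbols are assumptions made for illustration: $x$ is the shadowed entry, $k$ the number of s-configurations that shadow it, $k_1$ the number of the $k$ LEs lying in the same row, and $T$ the bound on every line sum. Applying Lemma 1 once to the row and once to the column, and then adding the two inequalities, removes the dependence on $k_1$.

```latex
\begin{align}
  x &\le \frac{T}{k_1 + 1}
     &&\text{($x$ is at most the $(k_1+1)$th largest entry in its row)} \\
  x &\le \frac{T}{k - k_1 + 1}
     &&\text{($x$ is at most the $(k-k_1+1)$th largest entry in its column)} \\
  (k_1 + 1)\,x + (k - k_1 + 1)\,x &\le 2T
     &&\text{(sum of the two bounds, each multiplied out)} \\
  x &\le \frac{2T}{k + 2}
\end{align}
```

Under these assumptions the final bound holds for every split of the $k$ LEs between the row and the column.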
Lemma 3: Assume that is a set of consecutive
s-configurations, and is the first g-configuration after .
Then, any s-configuration must cover at least
LEs, where is half of the number of the selected-entries
covered by .
Proof: Since is a g-configuration, any selected-entry
covered by is an LE. Because is constructed earlier than
and is not covered by , either 1) all the selected-entries
covered by are not smaller than , or 2) is shadowed in
construction if some smaller selected-entries are covered by
. (Otherwise should first cover instead of those smaller
selected-entries.)
In case 1), all the selected-entries covered by are LEs,
because each of them is not smaller than (which is an LE).
Since is constructed earlier than , it contains more se-
lected-entries than . Therefore, the number of LEs covered
by is larger than . In case 2), any selected-entry covered
by must have been shadowed in construction. Since a
selected-entry in can shadow at most two smaller/equal se-
lected-entries covered by (in row and column directions, re-
spectively), must cover at least LEs, where is half of the
number of the selected-entries covered by . Obviously, this
is true for the first g-configuration after .
Lemma 4: To minimize (the total number of LEs), all the
s-configurations should be consecutively located at either the
very beginning or the very end of the configuration sequence
.
Proof: In Fig. 12, let -axis denote the number of se-
lected-entries covered in each configuration, and -axis denote
the configuration sequence. Without loss of generality, assume
there are three sets of consecutive s-configurations, denoted by
and . (Others are g-configurations.) Particularly,
and contain and s-configurations respectively,
and is located at the very end of the configuration sequence
. The first s-configuration in covers se-
lected-entries, and the first g-configuration after covers
selected-entries. Similarly, the first s-configuration
in covers selected-entries, and the first g-configuration
after covers selected-entries. From Lemma 3,
each s-configuration in and covers at least
and LEs respectively, as shown by the blank
rectangles in and (see Fig. 12). However, we do not
count any LEs covered by , because is right before
and there is no g-configuration after it. Although each
s-configuration in covers (at least) one LE in the same line
with , it is trivial and is ignored when counting .
We first consider and . Since all selected-entries cov-
ered by g-configurations are LEs, minimizing is equivalent
to maximizing the number of selected-entries in the shadowed
areas in and . That is
Let be the number of g-configurations between and .
We have . So, the above formula is equiv-
alent to
For any given and , the above value is maximized if
takes the maximum possible value and . This entails that
and should be consecutively located at the very begin-
ning of the configuration sequence . It is easy to
generalize this conclusion to the case where more sets of con-
secutive s-configurations are involved.
We still need to consider in Fig. 12. In fact, the
configurations may also line up as shown in
Fig. 13, where the g-configurations are in the middle and the
s-configurations are at both ends. (Assume that
s-configurations are consecutively located at the beginning of
to minimize as discussed earlier.) In Fig. 13,
Fig. 13. Then    s-configurations should be located at either end of
        .
minimizing is equivalent to maximizing the number of
selected-entries in the two blank echelons. That is
Because the above formula is a quadratic function of , it can be
maximized only if or . From Fig. 13, obviously all
the s-configurations should be consecutively located at
either the very beginning or the very end of
the configuration sequence . The specific location
is determined by the values of and .
REFERENCES
[1] J. E. Fouquet, “A compact, scalable cross-connect switch using total
internal reflection due to thermally-generated bubbles,” in Proc. IEEE
LEOS Annual Meeting, Dec. 1998, pp. 169–170.
[2] L. Y. Lin, “Micromachined free-space matrix switches with sub-
milli-second switching time for large-scale optical crossconnect,”
Proc. OFC’98 Tech. Digest, pp. 147–148, Feb. 1998.
[3] O. B. Spahn, C. Sullivan, J. Burkhart, C. Tigges, and E. Garcia, “GaAs-
based microelectromechanical waveguide switch,” in Proc. 2000 IEEE/
LEOS Intl. Conf. Opt. MEMS, Aug. 2000, pp. 41–42.
[4] A. J. Agranat, “Electroholographic wavelength selective crosscon-
nect,” in Proc. 1999 Dig. LEOS Summer Topical Meetings, July 1999,
pp. 61–62.
[5] K. Kar, D. Stiliadis, T. V. Lakshman, and L. Tassiulas, “Scheduling
algorithms for optical packet fabrics,” IEEE J. Sel. Areas Commun.,
vol. 21, no. 7, pp. 1143–1155, Sep. 2003.
[6] G. R. Pieris and G. H. Sasaki, “Scheduling transmissions in WDM
broadcast-and-select networks,” IEEE/ACM Trans. Netw., vol. 2, no.
2, pp. 105–110, Apr. 1994.
[7] H. Choi, H.-A. Choi, and M. Azizoglu, “Efficient scheduling of trans-
missions in optical broadcast networks,” IEEE/ACM Trans. Netw., vol.
4, no. 6, pp. 913–920, Dec. 1996.
[8] B. Towles and W. J. Dally, “Guaranteed scheduling for switches with
configuration overhead,” IEEE/ACM Trans. Netw., vol. 11, no. 5, pp.
835–847, Oct. 2003.
[9] B. Wu, K. L. Yeung, M. Hamdi, and X. Li, “Minimizing internal
speedup for performance guaranteed switches with optical fabrics,”
IEEE/ACM Trans. Netw., vol. 17, no. 2, pp. 632–645, Apr. 2009.
[10] B. Wu and K. L. Yeung, “Scheduling optical packet switches with
minimum number of configurations,” Proc. IEEE ICC’05, vol. 3, pp.
1830–1835, May 2005.
[11] X. Li and M. Hamdi, “On scheduling optical packet switches with re-
configuration delay,” IEEE J. Sel. Areas Commun., vol. 21, no. 7, pp.
1156–1164, Sep. 2003.
[12] G. Birkhoff, “Tres observaciones sobre el algebra lineal,” Univ. Nac.
Tucumán Rev. Ser. A., vol. 5, pp. 147–151, 1946.
[13] J. von Neumann, “A certain zero-sum two-person game equivalent to
the optimal assignment problem,” in Contributions to the Theory of
Games. Princeton, NJ: Princeton Univ. Press, 1953, vol. 2, pp. 5–12.
[14] M. Hall, Combinatorial Theory. Waltham, MA: Blaisdell Publishing
Co., 1967.
[15] T. Inukai, “An efficient SS/TDMA time slot assignment algorithm,”
IEEE Trans. Commun., vol. COM-27, no. 10, pp. 1449–1455, 1979.
[16] C. S. Chang, W. J. Chen, and H. Y. Huang, “Birkhoff-von Neumann
input buffered crossbar switches,” Proc. IEEE INFOCOM’00, vol. 3,
pp. 1614–1623, Mar. 2000.
[17] J. Li and N. Ansari, “Enhanced Birkhoff-von Neumann decomposition
algorithm for input queued switches,” IEE Proc.-Commun., vol. 148,
no. 6, pp. 339–342, Dec. 2001.
[18] E. G. Coffman and P. J. Denning, Operating Systems Theory. Englewood
Cliffs, NJ: Prentice-Hall, 1973.
[19] M. Azizoglu, R. A. Barry, and A. Mokhtar, “Impact of tuning delay
on the performance of bandwidth-limited optical broadcast networks
with uniform traffic,” IEEE J. Sel. Areas Commun., vol. 14, no. 5, pp.
935–944, Jun. 1996.
[20] S. Gopal and C. K. Wong, “Minimizing the number of switchings
in a SS/TDMA system,” IEEE Trans. Commun., vol. COM-33, pp.
497–501, June 1985.
[21] R. Diestel, Graph Theory, 2nd ed. New York: Springer-Verlag, 2000.
[22] J. E. Hopcroft and R. M. Karp, “An $n^{5/2}$ algorithm for maximum
matching in bipartite graphs,” SIAM J. Comput., vol.
2, pp. 225–231, 1973.
[23] R. Cole and J. Hopcroft, “On edge coloring bipartite graphs,” SIAM J.
Comput., vol. 11, pp. 540–546, Aug. 1982.
[24] Y. Ito, Y. Urano, T. Muratani, and M. Yamaguchi, “Analysis of a
switch matrix for an SS/TDMA system,” Proc. IEEE, vol. 65, no. 3,
pp. 411–419, Mar. 1977.
[25] K. Ross, N. Bambos, K. Kumaran, I. Saniee, and I. Widjaja, “Sched-
uling bursts in time-domain wavelength interleaved networks,” IEEE
J. Sel. Areas Commun., vol. 21, no. 9, pp. 1441–1451, Nov. 2003.
[26] H. Liu, P. Wan, C.-W. Yi, X. Jia, S. Makki, and N. Pissinou, “Max-
imal lifetime scheduling in sensor surveillance networks,” Proc. IEEE
INFOCOM’05, vol. 4, pp. 2482–2491, Mar. 2005.
[27] CPLEX Solver, [Online]. Available: www.ilog.com
Bin Wu (S’04–M’07) received the B.Eng. degree
in electrical engineering from Zhe Jiang University,
Hangzhou, China, in 1993, and M.Eng. degree in
communication and information engineering from
University of Electronic Science and Technology
of China, Chengdu, China, in 1996. He is currently
working toward the Ph.D. degree in electrical and
electronic engineering at University of Hong Kong,
Pokfulam, Hong Kong. His research is focused on
optical networking.
In 1996, he joined Huawei Tech. Co. Ltd., where
he was the Department Manager of TI-Huawei DSP co-lab during 1997–2001.
Kwan L. Yeung (S’93–M’95–SM’99) received
the B.Eng. and the Ph.D. degrees in information
engineering from The Chinese University of Hong
Kong, Shatin, New Territories, Hong Kong, in 1992
and 1995, respectively.
He joined the Department of Electrical and
Electronic Engineering, University of Hong Kong,
Hong Kong, in July 2000, where he is currently
an Associate Professor. His current research in-
terests include next-generation Internet, packet
switch/router design, all-optical networks, and
wireless data networks.
Pin-Han Ho received his B.Sc., M.Sc., and Ph.D. de-
grees from the Electrical and Computer Engineering
Department, National Taiwan University, Taipei City,
Taiwan, in 1993, 1995, and 2002, respectively.
He joined the E&CE department in the University
of Waterloo, Waterloo, ON, Canada, where he is
currently an Associate Professor. He has authored or
coauthored more than 150 refereed technical papers
and several book chapters. He is the coauthor of a
book on optical networking and survivability. His
current research interests include a wide range of
topics in broad-band networks, including survivable network design, wireless
networks such as IEEE 802.16 networks and network security.
He is the recipient of the Distinguished Research Excellent Award and Early
Researcher Award, in 2005, the Best Paper Award in SPECTS’02, ICC’05, and
ICC’07, and the Outstanding Paper Award in HPSR’02.
Xiaohong Jiang (M’03) received the B.S., M.S., and
Ph.D. degrees from Xidian University, Xi’an, China,
in 1989, 1992, and 1999 respectively.
Currently, he is an Associate Professor in the
Department of Computer Science, Graduate School
of Information Science, Tohoku University, Aramaki
Sendai, Japan. Before joining Tohoku University, he
was an Assistant Professor in the Graduate School
of Information Science, Japan Advanced Institute
of Science and Technology (JAIST), Ishikawa, from
October 2001 to January 2005. He was a Japan
Society for the Promotion of Science (JSPS) Postdoctoral Research Fellow
at JAIST from October 1999 to October 2001. He was a Research Associate
in the Department of Electronics and Electrical Engineering, University of
Edinburgh, Edinburgh, U.K., from March 1999 to October 1999. He has
authored or coauthored more than 50 refereed technical papers in these areas.
His current research interests include optical switching networks, (WDM)
networks, interconnection networks, IC yield modeling, timing analysis of
digital circuits, clock distribution, and fault-tolerant technologies for very large
scale integration/wafer scale integration (VLSI/WSI).
