Scheduling optical packet switches with minimum number of configurations by Wu, B & Yeung, KL
Title Scheduling optical packet switches with minimum number of configurations
Author(s) Wu, B; Yeung, KL
Citation Ieee International Conference On Communications, 2005, v. 3, p.1830-1835
Issued Date 2005
URL http://hdl.handle.net/10722/54059
Rights Creative Commons: Attribution 3.0 Hong Kong License
Scheduling Optical Packet Switches with Minimum 
Number of Configurations 
Bin Wu and Kwan L. Yeung 
Dept. of Electrical and Electronic Engineering 
The University of Hong Kong 
Pokfulam, Hong Kong 
E-mail: {binwu, kyeung}@eee.hku.hk 
 
Abstract—In order to achieve the minimum traffic delay in a 
performance guaranteed optical packet switch (OPS) with 
reconfiguration overhead, the switch fabric has to use the 
minimum number of configurations (i.e. N configurations where N 
is the switch size) for traffic scheduling. This requires a very high 
speedup in the switch fabric to compensate for the loss in 
scheduling efficiency. The high speedup requirement makes the 
idea of using N configurations (to schedule the traffic) impractical 
under current technology. In this paper, we propose a new 
scheduling algorithm called αi-SCALE to lower the speedup 
required. Compared with the existing MIN algorithm [5], αi-SCALE 
succeeds in pushing the speedup bound (i.e. worst-case 
speedup requirement) to a much lower level. For example, when 
N=200, the speedup bound required to compensate for the loss in 
scheduling efficiency is 30.75 for MIN, whereas 23.45 is sufficient 
for our αi-SCALE. 
Keywords-Optical packet switch (OPS); speedup; performance 
guaranteed scheduling; reconfiguration overhead. 
I.  INTRODUCTION 
The rapid progress in IP and WDM research has resulted in 
a coalescence of these two technologies, leading to strong and 
wide interest in optical packet switches (OPS). OPS can offer 
many advantages at relatively low cost, such as scalability, high 
bandwidth utilization, high line-rate and low power 
consumption. Despite the recent achievements in optical 
switching technologies [1-3], a major implementation hurdle of 
OPS is its relatively large reconfiguration overhead, which is 
the amount of idle time required to change the OPS 
configuration state because of time-consuming operations 
such as mechanical settling and synchronization.  
At the core of an optical packet switch is the packet 
scheduling algorithm. Following the approach of batch-based 
time slot assignment (TSA), many efficient algorithms, namely, 
EXACT [5,7], DOUBLE [5], MIN [5] and ADAPTIVE [6], are 
designed for packet scheduling. With the batch-based TSA, 
incoming packets are periodically (say, every T time slots, i.e. 
batch size = T) accumulated at the input ports of an OPS to form 
a traffic matrix C(T). Then a scheduling algorithm (such as any 
one above) is used to determine a set of switch configurations 
for forwarding the collected packets to the output ports. If all the 
packets in C(T) can be forwarded to their corresponding output 
ports within a bounded (i.e. worst-case) delay, the resulting OPS 
is called a performance guaranteed OPS, and the corresponding 
algorithm is called a performance guaranteed scheduling 
algorithm. Notably, the existing algorithms EXACT, DOUBLE, 
MIN and ADAPTIVE all fall into this category. They differ in 
the number of configurations they require to schedule the 
traffic matrix. 
To realize a performance guaranteed OPS, the OPS switch 
fabric, which is responsible for the actual delivery of packets 
from input ports to output ports, must operate at a higher speed 
than each individual input/output line. This speedup is used to 
compensate for the idle time due to switch reconfiguration, and 
the possible loss of scheduling efficiency due to a particular 
scheduler implementation [4-6] (also refer to Section II). 
Therefore, the worst-case speedup requirement of a performance 
guaranteed OPS (i.e. speedup bound) depends on the scheduling 
algorithm adopted.  
Because each reconfiguration is associated with an overhead, 
scheduling OPS traffic with the minimum number of configurations 
can minimize traffic delay. For a performance guaranteed OPS, N 
(where N is the switch size) is the minimum number of 
configurations required. This is because an N×N traffic matrix 
C(T) has $N^2$ entries, and each configuration can cover at most N 
of them [5], so at least N configurations are needed. On the 
other hand, as pointed out in [5,6], using fewer 
configurations makes the scheduling less efficient (i.e. the 
packet transmission in each configuration cannot fully utilize the 
available switch bandwidth), and thus requires a higher switch 
fabric speedup than algorithms using more configurations. 
Among all the proposed performance guaranteed scheduling 
algorithms [5-7], MIN [5] has the unique advantage of providing 
the minimum bounded traffic delay because it requires only N 
configurations for any traffic matrix C(T). But, as discussed 
above, the speedup required by MIN is extremely high, and 
seems to be prohibitive under current technology. 
Scheduling OPS traffic with the minimum number of N 
configurations can therefore be practical only if a new 
scheduling algorithm with a lower speedup bound is available, or 
the current difficulties of high speedup implementation are 
overcome. In this paper, we focus on designing a more 
efficient algorithm than MIN. The new scheduling algorithm we 
propose is called αi-SCALE. We show that for small and 
medium size optical packet switches (N<100), αi-SCALE 
complements the performance of MIN, requiring a lower 
speedup in roughly half of the switch size range. When the switch 
size is large, e.g. N=200, the speedup bound required to 
compensate for the inefficient scheduling is 30.75 for MIN, 
whereas 23.45 is sufficient for our αi-SCALE. Besides, αi-SCALE 
provides a new matrix decomposition method using N 
permutation matrices, which can be useful for future research in 
this area. 
This work was supported by Competitive Earmarked Research Grant 
HKU 7048/02E. 
0-7803-8938-7/05/$20.00 (C) 2005 IEEE
II. ARCHITECTURE 
The same OPS switch architecture as in [4-6] is assumed in 
our work. In this architecture, a batch-based TSA approach is 
applied to determine a set of N configurations to deliver the 
collected packets. Fig. 1 shows the scheduling procedure in four 
stages. In Stage 1, incoming packets are accumulated in the 
input buffers over T time slots to construct the traffic matrix 
C(T). Each entry $c_{ij}$ of C(T) denotes the number of packets 
received at input i and destined to output j. Assume all the line 
sums (either row or column sums) of C(T) are not larger than T. 
The scheduling algorithm takes H time slots in Stage 2 to 
generate N configurations $P_1, \ldots, P_N$ that cover C(T) (see footnote 1). 
Configuration $P_k = \{p^{(k)}_{ij}\}$ is an N×N matrix with at most a single 
“1” in each line (row or column): $p^{(k)}_{ij}=1$ indicates that a packet 
can be sent from input i to output j, and $p^{(k)}_{ij}=0$ otherwise. $P_k$ is 
called a perfect matching if it has exactly N “1” elements. In 
Stage 3, the switch fabric is reconfigured according to these N 
configurations. An internal speedup S is applied to ensure that 
this stage occupies only T regular slots. After the speedup is 
applied, the switch fabric holds each $P_k$ for $\phi_k$ compressed slots 
for packet transmission. Finally, in Stage 4, packets are sent onto 
the output lines from the output buffers (in T slots).  
From the tagged packet in Fig. 1, we can see that the 
bounded delay of any packet is 2T+H slots. Assume each switch 
reconfiguration takes δ regular slots and $T > T_{\min} = \delta N$. Since δN 
slots must be used to reconfigure the switch N times, only 
T-δN slots are left for transmitting C(T) in Stage 3. So, a speedup 
factor $S_{\text{reconfigure}} = T/(T-\delta N)$ is necessary to 
compensate solely for the idle time caused by reconfiguration. 
At the same time, the scheduling algorithm may produce many 
empty slots (i.e. underutilize the bandwidth provided by the 
configuration). Thus another speedup factor, 
$S_{\text{schedule}} = \frac{1}{T}\sum_{k=1}^{N}\phi_k$, is required to compensate solely for the inefficient 
scheduling. The overall internal speedup S is then given by 
$S = S_{\text{reconfigure}} \times S_{\text{schedule}} = T\,S_{\text{schedule}}/(T-\delta N)$ [5-6]. 
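The two speedup factors above can be evaluated directly. The following is a minimal sketch, not part of the original paper: the function names and the numerical values in the usage note are illustrative assumptions.

```python
def reconfigure_speedup(T, delta, N):
    """S_reconfigure = T / (T - delta*N); only defined for T > delta*N."""
    if T <= delta * N:
        raise ValueError("batch size T must exceed T_min = delta*N")
    return T / (T - delta * N)


def schedule_speedup(T, phi):
    """S_schedule = (1/T) * sum of the holding times phi_k (compressed slots)."""
    return sum(phi) / T


def overall_speedup(T, delta, N, phi):
    """S = S_reconfigure * S_schedule = T * S_schedule / (T - delta*N)."""
    return reconfigure_speedup(T, delta, N) * schedule_speedup(T, phi)
```

For a hypothetical 4×4 switch with T=100, δ=2 and holding times (50, 50, 25, 25), `overall_speedup(100, 2, 4, [50, 50, 25, 25])` multiplies 100/92 by 1.5.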
                                                           
1 C(T) is covered by configurations $P_1, \ldots, P_K$, each weighted by a 
non-negative integer $\phi_1, \ldots, \phi_K$, if and only if 
$\sum_{k=1}^{K}\phi_k\,p^{(k)}_{ij} \ge c_{ij}$ for all $i,j \in \{1, \ldots, N\}$. Note that $P_k = \{p^{(k)}_{ij}\}$. 
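The covering condition in footnote 1 can be checked mechanically. This is a small illustrative sketch; the function name `covers` and the 2×2 example in the usage note are assumptions, not taken from the paper.

```python
def covers(C, configs, weights):
    """Return True if sum_k weights[k] * configs[k][i][j] >= C[i][j] for all i, j.

    C is an N x N list of lists of non-negative integers; each configuration
    is an N x N 0/1 matrix and each weight is a non-negative integer.
    """
    n = len(C)
    for i in range(n):
        for j in range(n):
            capacity = sum(w * P[i][j] for P, w in zip(configs, weights))
            if capacity < C[i][j]:
                return False
    return True
```

For example, with C = [[3, 1], [2, 4]], the identity and anti-diagonal permutation matrices cover C with weights (4, 2), but not with weights (3, 2), because the entry 4 would be left partly unserved.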
III. ALGORITHM 
A. General Idea 
αi-SCALE takes a similar framework as MIN [5] but differs 
in its underlying design principle. Fig. 2 summarizes αi-SCALE, 
which is detailed in Parts B and C. The idea is to schedule large 
entries in C(T) first. The execution of αi-SCALE consists of an 
inner-loop iteration (Steps 4-7) embedded inside an outer-loop 
iteration (Steps 2-8). In the i-th outer-loop iteration, αi-SCALE 
uses a threshold $T/\alpha^i$ to identify large entries in C(T), where α is 
a real number. Note that when entering the i-th outer-loop 
iteration, the algorithm has scheduled all the entries larger than 
$T/\alpha^{i-1}$ in previous steps and thus they have been converted to 
zeros in C(T). In the i-th outer-loop iteration, αi-SCALE therefore 
selects the large entries that satisfy $T/\alpha^{i-1} \ge c_{ij} > T/\alpha^i$ for scheduling.  
After that, αi-SCALE performs an edge-coloring [8] based 
on the selected large entries. Then, each color is scheduled by 
using two configurations in the inner-loop iteration, where each 
configuration covers half of the edges of the particular color. In 
addition, αi-SCALE determines at most N/4 configurations in its 
outer-loop iterations (Steps 2-8), and leaves the task of 
determining the remaining configurations (for small entry 
scheduling) to Step 9. The above two mechanisms are taken to 
ensure that the N configurations can be properly constructed as 
non-overlapping perfect matchings (no two of them cover 
the same entry of C(T)) (see footnote 2).  
The key issue of αi-SCALE is to determine the most suitable 
(α,m) pair (discussed further below) for any given switch size 
N, so that the resulting speedup factor $S_{\text{schedule}}$ is minimized. 
We define a scale function 
$$f(N,i) = \alpha^i, \quad \text{where } \alpha \text{ best suits } N \text{ and minimizes } S_{\text{schedule}}.$$
Based on this scale function, we first calculate the best values of 
(α,m) for any given switch size N, and then substitute them into 
αi-SCALE shown in Fig. 2 to schedule C(T). (α,m) can be found 
offline, as discussed in Part C. They are constants for any 
specific N. The online execution of αi-SCALE is dominated by 
N maximum-size matchings, resulting in a total time complexity 
of $O(N^{3.5})$. 
B. Design Principle 
We first define 
$$\gamma(i) = \lceil \alpha^i \rceil - 1 \quad \text{and} \quad \omega(i) = \lfloor T/\alpha^{i-1} \rfloor. \qquad (1)$$
Since all the line sums of C(T) are not larger than T, and 
$$\gamma(i) + 1 = \lceil \alpha^i \rceil - 1 + 1 \ge \alpha^i,$$
then in each line of C(T), there can be at most γ(i) unscheduled 
entries greater than the threshold $T/\alpha^i$ in the i-th outer-loop 
iteration. These unscheduled large entries are indicated by “1”s 
in a matrix L for each loop (other entries are zeros), and the 
corresponding bipartite unigraph [4] $G_L$ is edge-colored. 
                                                           
2 This is guaranteed because, for an r-regular bipartite graph $G=(X \cup Y, E)$ 
with $|X|=|Y|=n$ and $r>3n/4$, any partial matching M of G with $|M| \le n/2$ is a subset 
of a perfect matching of G. Please also refer to Theorem 8 in [5]. 

Fig. 1.  Optical packet switch scheduling stages. [Timing diagram of Stages 1-4 along the time axis, with marks at T, T+H, 2T+H and 3T+H; packet delay = 2T+H; legend: switch reconfiguration δ, traffic sending period.] 
According to the classical König theorem [9], GL can be edge-
colored in γ(i) colors. Then each inner-loop iteration schedules 
one color by using two non-overlapping perfect matchings. Each 
perfect matching is weighted by ω(i) and it schedules half of the 
edges of the color. Since all the entries cij in C(T) are integers, 
those entries greater than ω(i) must have been scheduled in 
previous iterations. As a result, the weight ω(i) is sufficient for 
the i-th iteration. Consequently, the i-th outer-loop iteration 
determines 2γ(i) configurations and introduces 2γ(i)ω(i) weight. 
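The per-iteration quantities γ(i) and ω(i) are easy to tabulate. The sketch below assumes the reading γ(i) = ⌈α^i⌉ - 1 and ω(i) = ⌊T/α^(i-1)⌋ of definition (1); the function names and the value T=1000 are illustrative assumptions.

```python
import math


def gamma(alpha, i):
    """Maximum number of large entries per line in iteration i: ceil(alpha^i) - 1."""
    return math.ceil(alpha ** i) - 1


def omega(T, alpha, i):
    """Weight given to each configuration of iteration i: floor(T / alpha^(i-1))."""
    return math.floor(T / alpha ** (i - 1))


def configs_per_iteration(alpha, m):
    """Each color is scheduled with two perfect matchings: 2*gamma(i) per iteration."""
    return [2 * gamma(alpha, i) for i in range(1, m + 1)]
```

With (α,m)=(2.5,3), `configs_per_iteration(2.5, 3)` gives 4, 12 and 30 configurations, the same counts that appear in the N=200 example of Section IV.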
After each configuration is determined, it is removed from an 
indicator matrix B (which is initialized to an all-1-matrix). Large 
entries that have already been scheduled in previous steps are 
also removed from C(T) (by referring to the indicator matrix B). 
Note that this procedure terminates before the total number of 
configurations that have been previously determined (Nm) 
exceeds N/4 (i.e. Nm<N/4) in order to guarantee that non-
overlapping perfect matchings can always be found in B. 
Assume that m is the number of outer-loop iterations required to 
generate these Nm configurations. After m iterations, the 
remaining small entries in C(T) are scheduled by using another 
N-Nm configurations with a fixed weight of ω(m+1). Each of 
these N-Nm configurations can be extracted (by performing 
maximum-size matching) and then deducted from B. This 
operation is guaranteed to be valid because B is always a 
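regular matrix. Let $S^E_{\text{schedule}}$ represent the $S_{\text{schedule}}$ value produced 
exactly by the algorithm. We have 
$$S^E_{\text{schedule}} = \frac{1}{T}\left[\sum_{i=1}^{m} 2\gamma(i)\,\omega(i) + (N-N_m)\,\omega(m+1)\right]$$
$$\qquad = \frac{1}{T}\left[2\sum_{i=1}^{m}\left(\lceil\alpha^i\rceil - 1\right) \times \left\lfloor\frac{T}{\alpha^{i-1}}\right\rfloor + (N-N_m)\left\lfloor\frac{T}{\alpha^m}\right\rfloor\right]. \qquad (2)$$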
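Expression (2) can be evaluated numerically for a given (α,m). The following is a hedged sketch that assumes the ceiling and floor placement shown in (1) and (2); the function name and the batch size T=10000 used in the usage note are illustrative assumptions.

```python
import math


def exact_schedule_speedup(N, T, alpha, m):
    """Evaluate S^E_schedule from (2); returns (speedup, N_m)."""
    total_weight = 0
    n_m = 0  # configurations produced by the m outer-loop iterations
    for i in range(1, m + 1):
        g = math.ceil(alpha ** i) - 1          # gamma(i)
        w = math.floor(T / alpha ** (i - 1))   # omega(i)
        total_weight += 2 * g * w
        n_m += 2 * g
    # The remaining N - N_m configurations all carry the weight omega(m+1).
    total_weight += (N - n_m) * math.floor(T / alpha ** m)
    return total_weight / T, n_m
```

For N=200 and (α,m)=(2.5,3), `exact_schedule_speedup(200, 10000, 2.5, 3)` yields N_m=46 and a speedup of about 23.46, in line with the figure quoted in the abstract.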
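Since minimizing $S^E_{\text{schedule}}$ in (2) appears to be a mixed-
integer nonlinear optimization problem that is hard to solve 
directly, we apply an approximation method here. 
Specifically, we use an approximation $S^A_{\text{schedule}} \approx S^E_{\text{schedule}}$, where 
$$S^A_{\text{schedule}} = 2\sum_{i=1}^{m}\left(\alpha^i - 1\right)\frac{1}{\alpha^{i-1}} + \left(\frac{3}{4}N + 1\right)\frac{1}{\alpha^m} = 2m\alpha - \frac{2\alpha(\alpha^m - 1)}{(\alpha-1)\,\alpha^m} + \left(\frac{3}{4}N + 1\right)\frac{1}{\alpha^m}. \qquad (3)$$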
We can prove that the following inequality is true (see footnote 3): 
$$\left|S^E_{\text{schedule}} - S^A_{\text{schedule}}\right| < \frac{3\alpha - 1}{\alpha - 1}. \qquad (4)$$
As we will discuss later, the typical value of α is α≈2.5. As a 
result we have $|S^E_{\text{schedule}} - S^A_{\text{schedule}}| < 4.4$. Inequality (4) guarantees 
that our approximation $S^A_{\text{schedule}}$ is close enough to the exact 
$S^E_{\text{schedule}}$. Consequently, if we minimize $S^A_{\text{schedule}}$, $S^E_{\text{schedule}}$ is also 
roughly minimized. 
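The gap bound (4) can be spot-checked numerically. The sketch below assumes the closed form (3) and the exact expression (2) with the ceiling and floor placement of (1); the function names and T=10000 are illustrative assumptions, and α=2.5 is the rounded value quoted in the text rather than the exact solution of the boundary condition.

```python
import math


def approx_schedule_speedup(N, alpha, m):
    """S^A_schedule from the closed form in (3)."""
    am = alpha ** m
    return (2 * m * alpha
            - 2 * alpha * (am - 1) / ((alpha - 1) * am)
            + (0.75 * N + 1) / am)


def exact_schedule_speedup(N, T, alpha, m):
    """S^E_schedule from (2)."""
    total, n_m = 0, 0
    for i in range(1, m + 1):
        g = math.ceil(alpha ** i) - 1
        total += 2 * g * math.floor(T / alpha ** (i - 1))
        n_m += 2 * g
    total += (N - n_m) * math.floor(T / alpha ** m)
    return total / T


def gap_bound(alpha):
    """Right-hand side of (4): (3*alpha - 1) / (alpha - 1)."""
    return (3 * alpha - 1) / (alpha - 1)
```

For N=200, T=10000 and (α,m)=(2.5,3), the approximation evaluates to about 21.54 against an exact value of about 23.46, so the gap of roughly 1.9 is well inside the bound of about 4.33.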
According to the basic idea of the algorithm, the following 
condition has to be satisfied in order to guarantee the existence 
of non-overlapping perfect matchings in the indicator matrix B: 
$$N_m = 2\sum_{i=1}^{m}\gamma(i) = 2\sum_{i=1}^{m}\left(\lceil\alpha^i\rceil - 1\right) < \frac{N}{4}. \qquad (5)$$
                                                           
3 See Appendix A. Assume that $T>N$ and $N/T + 2m/\alpha^m \le 1$, and use the boundary 
condition (6) to determine (α,m) in αi-SCALE. 
In particular, αi-SCALE uses the following equation as its 
boundary condition (constraint): 
$$2\sum_{i=1}^{m}\alpha^i = \frac{N}{4} - 1. \qquad (6)$$
According to Lemma 1 in Appendix A, when using (6) as the 
boundary condition, we have 
$$\frac{N}{4} - 1 - 2m \le N_m < \frac{N}{4}. \qquad (7)$$
Thus not only is (5) satisfied, but $N_m$ is also guaranteed to be 
close enough to N/4 in αi-SCALE. This ensures that large entries 
in C(T) are sufficiently analyzed and scheduled. 
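To see the boundary condition (6) and the resulting range (7) in action, one can solve (6) for α at a fixed m and count the configurations. This is a minimal sketch: bisection is an assumed solver (the paper does not prescribe one), and the function names are hypothetical.

```python
import math


def solve_alpha(N, m, tol=1e-12):
    """Find alpha > 1 with 2 * sum_{i=1..m} alpha^i = N/4 - 1 by bisection."""
    target = N / 4 - 1

    def lhs(a):
        return 2 * sum(a ** i for i in range(1, m + 1))

    lo, hi = 1.0, max(2.0, target)
    if lhs(lo) > target:
        raise ValueError("no solution with alpha > 1 for this (N, m)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lhs(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def num_large_configs(alpha, m):
    """N_m = 2 * sum_{i=1..m} (ceil(alpha^i) - 1), as in (5)."""
    return 2 * sum(math.ceil(alpha ** i) - 1 for i in range(1, m + 1))
```

For N=200 and m=3 the solver returns α close to 2.505, and `num_large_configs` then gives N_m=46, which lies in the range 43 ≤ N_m < 50 required by (7).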
Up to this point, the flow of αi-SCALE can be concisely 
summarized as follows. Under the constraint of the boundary 
condition (6), we find an (α,m) pair to minimize our objective 
function $S^A_{\text{schedule}}$ in (3). α can be a fraction but m has to be an 
integer (because m is the number of iterations). At the same 
time, the (α,m) pair is expected to be dynamically optimized for 
different switch sizes N (so as to approximately minimize 
$S^E_{\text{schedule}}$ in (2)). The values of (α,m) can then be substituted into 
the online part of αi-SCALE in Fig. 2 to schedule C(T). 

αi-SCALE algorithm online part 
 
Step 1. Initialization: Create an N×N all-1 matrix $B=\{b_{st}\}$. Get the (α,m) pair 
from the αi-SCALE offline calculation. Use $C=\{c_{st}\}$ to denote the traffic 
matrix. Set i=1, scale=α and count=1. 
Step 2. Select large entries: In an N×N all-0 matrix $L=\{l_{st}\}$, set $l_{st}=1$ if 
$c_{st}>T/scale$ and $b_{st}=1$. 
Step 3. Edge-coloring: Construct a bipartite unigraph $G_L$ from L. Edge-
color $G_L$ into γ(count) colors, where γ( ) is defined in (1) and count is the 
current outer-loop iteration. Set the color identifier k=1. 
Step 4. Partition edges: For color k, equally divide its edges into two sets 
$E_a$ and $E_b$. If the total number of edges of color k is an odd number, $E_a$ 
can have one more edge than $E_b$. 
Step 5. Schedule $E_a$: For each edge in $E_a$, shadow its corresponding lines 
(row and column) in B. Find a maximum-size matching $M_B$ in the 
remaining un-shadowed sub-matrix of B. Then un-shadow all the lines of 
B. The matching $M_B$ combines with all the edges in $E_a$ to form a perfect 
matching $P_i$. Set the weight of $P_i$ to ω(count) as defined in (1). Set $B-P_i \to B$ and 
i+1→i. Set $c_{st}=0$ in C if $b_{st}=0$. 
Step 6. Schedule $E_b$: Repeat Step 5 for $E_b$. 
Step 7. Loop over colors: Set k+1→k. Loop to Step 4 until all the γ(count) 
colors are scheduled. 
Step 8. Outer-loop iteration: Set scale×α→scale and count+1→count. 
Loop to Step 2 until count>m.  
Step 9. Schedule small entries in C: Repeat this step to sequentially 
extract the remaining $N-N_m$ perfect matchings (by performing maximum-
size matching) from the indicator matrix B, where $N_m$ is the total number 
of configurations determined in Steps 1-8. After each perfect matching is 
extracted, deduct it from B and set the constant ω(m+1) as its weight.  
 
αi-SCALE algorithm offline part: (α,m) searching procedure 
 
1) Search for α’s approximate value α*: α* is the value 
that minimizes $S^A_{\text{schedule}}$ in (8). 
2) Calculate m’s approximate value m*: Apply α* to (9) to calculate m*. 
3) Determine m: Set m=ROUND(m*), where ROUND( ) represents the 
function of taking the nearest integer. 
4) Determine α: Substitute m in (10) by the value found in 3) and solve 
the equation for α. 
Fig. 2.  αi-SCALE algorithm. 
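Step 9 of the online part repeatedly extracts perfect matchings from the regular 0/1 indicator matrix B. The sketch below illustrates only that step. It swaps in Kuhn's simple augmenting-path algorithm rather than the faster maximum-size matching routine implied by the complexity figure in the paper, and the function names are assumptions.

```python
def perfect_matching(B):
    """Return a perfect matching of the 0/1 matrix B as (row, col) pairs, or None."""
    n = len(B)
    match_col = [-1] * n  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        # Try to match row i, re-routing earlier rows along an augmenting path.
        for j in range(n):
            if B[i][j] and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    return [(match_col[j], j) for j in range(n)]


def extract_matchings(B, count):
    """Extract `count` non-overlapping perfect matchings; returns (matchings, residual B)."""
    B = [row[:] for row in B]  # work on a copy of the indicator matrix
    matchings = []
    for _ in range(count):
        pm = perfect_matching(B)
        if pm is None:
            raise ValueError("B no longer contains a perfect matching")
        for i, j in pm:
            B[i][j] = 0  # deduct the matching from B
        matchings.append(pm)
    return matchings, B
```

Applied to a 4×4 all-1 matrix with `count=4`, the extraction uses every entry exactly once and leaves an all-0 residual, because each deduction keeps B regular.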
C. Determining (α,m) Pair 
We now consider how to determine the suitable (α,m) pair 
for a specific N. There are many possible methods. Aiming at 
providing a complete solution for αi-SCALE, we suggest the 
(α,m) searching procedure as listed in Fig. 2. It adopts the 
following formulas (8)-(10). The correctness proof is given in 
Appendix B. 
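$$S^A_{\text{schedule}} = 2\alpha\,\frac{\lg[(N+4)\alpha - (N-4)] - \lg 8\alpha}{\lg\alpha} - \frac{2\alpha(N-4)(\alpha-1)}{(N+4)\alpha^2 - 2N\alpha + N - 4} + \frac{6N\alpha + 8\alpha}{(N+4)\alpha - (N-4)} \qquad (8)$$
$$m = \frac{\lg[(N+4)\alpha - (N-4)] - \lg 8\alpha}{\lg\alpha} \qquad (9)$$
$$\frac{2\alpha(\alpha^m - 1)}{\alpha - 1} = \frac{N}{4} - 1 \qquad (10)$$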
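The offline (α,m) searching procedure built on (8)-(10) can be sketched as follows. The grid range [1.5, 6.0], the step of 0.001, the bisection solver and all function names are assumptions made for illustration; the paper only states that α* is found "by searching calculation".

```python
import math


def s_approx(N, alpha):
    """S^A_schedule as a function of alpha alone, formula (8)."""
    d = (N + 4) * alpha - (N - 4)
    m_star = (math.log(d) - math.log(8 * alpha)) / math.log(alpha)
    return (2 * alpha * m_star
            - 2 * alpha * (N - 4) * (alpha - 1)
            / ((N + 4) * alpha ** 2 - 2 * N * alpha + N - 4)
            + (6 * N * alpha + 8 * alpha) / d)


def m_from_alpha(N, alpha):
    """Real-valued m* from formula (9)."""
    d = (N + 4) * alpha - (N - 4)
    return (math.log(d) - math.log(8 * alpha)) / math.log(alpha)


def solve_alpha(N, m, tol=1e-12):
    """Solve (10) for alpha > 1, written as 2 * sum_{i=1..m} alpha^i = N/4 - 1."""
    target = N / 4 - 1
    lo, hi = 1.0, max(2.0, target)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 2 * sum(mid ** i for i in range(1, m + 1)) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def search_alpha_m(N, lo=1.5, hi=6.0, step=0.001):
    """Steps 1)-4) of the offline part of Fig. 2 on an assumed search grid."""
    grid = [lo + k * step for k in range(int(round((hi - lo) / step)) + 1)]
    alpha_star = min(grid, key=lambda a: s_approx(N, a))   # step 1
    m = max(1, round(m_from_alpha(N, alpha_star)))         # steps 2 and 3
    return solve_alpha(N, m), m                            # step 4
```

For N=200 this sketch returns m=3 and α close to 2.505, consistent with the pair (α,m)=(2.5,3) used in Section IV.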
Fig. 3 plots $S^A_{\text{schedule}}$ in (8). We can see that $S^A_{\text{schedule}}$ is 
usually minimized around α≈2.5 instead of α=2 (note that MIN 
[5] uses $2^i$ as the threshold in the i-th iteration to select large 
entries). It is also very important to note that the performance of 
α=2 in Fig. 3 does not stand for the performance of MIN. The 
reason is that all the curves in Fig. 3 satisfy (7), which indicates 
that the large entries in C(T) are scheduled finely enough with as 
many configurations as possible ($N/4-1-2m \le N_m < N/4$). For MIN, 
the first m outer-loop iterations usually generate fewer 
configurations and leave more configurations to be weighted by 
the constant weight, which may increase $S_{\text{schedule}}$. Intuitively, MIN 
does not guarantee its $N_m$ value to be as close to N/4 as αi-SCALE 
does. This is because MIN uses a fixed threshold $2^i$ 
which cannot self-adjust according to the switch size N. So its 
schedule for large entries is usually not fine enough and the 
$S_{\text{schedule}}$ performance may be worse than that (shown as α=2) in 
Fig. 3. In the worst case, we can show that the $N_m$ of MIN may fall 
short of N/4 by as many as $2(2^{m+1}-1)$ configurations. 
IV. PERFORMANCE ANALYSIS 
Fig. 4 shows the $S_{\text{schedule}}$ bounds due to the inefficient 
scheduling for MIN and αi-SCALE. The bound for MIN in the 
figure is derived and plotted by strictly following the steps of 
MIN in [5] (see footnote 4). It is clear that the original bound 
$S_{\text{schedule}} = 4(4+\log_2 N)$ in [5] is too conservative (and thus 
inaccurate) to represent MIN’s performance. For example, when 
N=460, $S_{\text{schedule}}=26.94$ is sufficient for MIN. However, the 
bound from [5] is $S_{\text{schedule}} = 4(4+\log_2 N) = 51.38$. 
From Fig. 4a, it is also clear that αi-SCALE (shown in the 
broad-brush curve) outperforms MIN in general. For example, 
when N=200, $S_{\text{schedule}}$ needed by MIN is 2+6×(1/2)+14×(1/4)+ 
(1/8)×(200-2-6-14)=30.75. However, for the same switch size, 
because (α,m)=(2.5,3), $S_{\text{schedule}}$ needed by αi-SCALE is just 
$4+12\times(1/2.5)+30\times(1/2.5^2)+(1/2.5^3)\times(200-4-12-30)=23.45$. 
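The worked numbers above can be reproduced with a few lines of code. The sketch follows the description of MIN given in the paper's footnote to this section ($2(2^i-1)$ configurations per iteration, each weighted by $T/2^{i-1}$, iterating while $N_m<N/4$) and ignores floor and ceiling effects on the weights; the function names are hypothetical.

```python
import math


def min_schedule_speedup(N):
    """S_schedule of MIN as recalculated in this paper; returns (S, N_m, m)."""
    s, n_m, i = 0.0, 0, 1
    # Keep iterating while the next batch of configurations keeps N_m below N/4.
    while n_m + 2 * (2 ** i - 1) < N / 4:
        k = 2 * (2 ** i - 1)          # configurations of iteration i
        s += k / 2 ** (i - 1)         # each weighted by T / 2^(i-1), divided by T
        n_m += k
        i += 1
    m = i - 1
    s += (N - n_m) / 2 ** m           # remaining configurations weighted by T / 2^m
    return s, n_m, m


def scale_schedule_speedup(N, alpha, m):
    """S_schedule of alpha^i-SCALE for a given (alpha, m); returns (S, N_m)."""
    s, n_m = 0.0, 0
    for i in range(1, m + 1):
        k = 2 * (math.ceil(alpha ** i) - 1)
        s += k / alpha ** (i - 1)
        n_m += k
    return s + (N - n_m) / alpha ** m, n_m
```

For N=200, `min_schedule_speedup` gives 30.75 with N_m=22, while `scale_schedule_speedup(200, 2.5, 3)` gives about 23.46 with N_m=46. For N=460 the MIN value is about 26.94, compared with 4(4+log2 460) ≈ 51.38 from the original bound.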
The above performance difference is due to the dynamic 
scale function $f(N,i)=\alpha^i$ adopted by αi-SCALE. αi-SCALE 
guarantees that $N_m$ (the total number of configurations used to 
schedule large entries) is close enough to its maximum possible 
value of N/4 (i.e. $N/4-1-2m \le N_m < N/4$). In comparison, MIN does 
not have such a guarantee. In the above example, $N_m$ for αi-SCALE 
is 4+12+30=46 ($49-2\times3 \le N_m < 50$), but MIN uses only 
2+6+14=22 configurations, which is much smaller than N/4=50. 
This increases the scheduling inefficiency of MIN. 

4 The i-th outer-loop iteration generates $2(2^i-1)$ configurations, each 
weighted by $T/2^{i-1}$. The boundary condition is $N_m<N/4$. The remaining $N-N_m$ 
configurations are weighted by a constant $T/2^m$, where m is the number of outer-
loop iterations needed to determine the first $N_m$ configurations. For example, 
when N=460 (N/4=115), m=5 iterations are needed and 2, 6, 14, 30, 62 
configurations are determined in the first 5 iterations respectively. Thus 
$S_{\text{schedule}} = (1/T)\times\left[\sum_{i=1}^{m} 2(2^i-1)\,T/2^{i-1} + (460-2-6-14-30-62)\,T/2^5\right] = 26.94$. The effects of 
roof and floor functions are ignored for simplicity. 

Fig. 3.  Relationship between $S^A_{\text{schedule}}$ and α when N changes 
from 200 to 5000 in a step of 200. [Plot of $S_{\text{schedule}}$ against α.] 

Fig. 4.  Performance comparison for MIN and αi-SCALE. [Two panels, (a) and (b), plotting $S_{\text{schedule}}$ against N; curve labels: αi-SCALE, MIN, MIN’s gaps, $S_{\text{schedule}}=4(4+\log_2 N)$.] 
Fig. 4b shows the performance of MIN and αi-SCALE for 
small N. In this range, although αi-SCALE is less advantageous, 
it does complement MIN’s performance. The approximation 
involved in αi-SCALE explains why it performs worse than MIN 
in about half of the switch size range studied. 
V. CONCLUSION 
In this paper, we showed that the speedup bound 
$S_{\text{schedule}} = 4(4+\log_2 N)$ given in [5] does not accurately represent the 
performance of the MIN algorithm. We recalculated the actual 
performance of MIN and obtained a much lower $S_{\text{schedule}}$. A new 
αi-SCALE algorithm was proposed for performance guaranteed 
OPS scheduling, which also uses the minimum number (N) of 
configurations to minimize traffic delay. By employing a 
dynamic scale function, the new algorithm is optimized for 
different switch sizes. Our results showed that αi-SCALE pushes 
the speedup bound to an even lower level in general, while for 
small and medium size OPS it effectively complements the 
performance of MIN. 
Future work may take three possible directions. The first one 
is based on the same framework presented in this paper. The 
goal is to find another better scale function or some discrete 
series to analyze the traffic matrix. The second one is to devise 
some new algorithms which totally abandon the framework of 
MIN and αi-SCALE. The third direction is to enhance the OPS 
architecture discussed in Section II. For example, we can 
consider a parallel switching architecture, in which some 
switching layers work in transmission phase while others are 
reconfigured. This extra space diversity can help to overcome 
the difficulties involved in realizing speedup in time domain. 
APPENDIX A 
CORRECTNESS PROOF OF INEQUALITY (4) 
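Lemma 1: 
If formula (6) is used as the boundary condition for αi-SCALE, 
$N_m$ must satisfy the following inequality: 
$$\frac{N}{4} - 1 - 2m \le N_m < \frac{N}{4}.$$
Proof: 
Because 
$$\sum_{i=1}^{m} 2\left(\alpha^i - 1\right) \le N_m = 2\sum_{i=1}^{m}\left(\lceil\alpha^i\rceil - 1\right) \le 2\sum_{i=1}^{m}\left[(\alpha^i + 1) - 1\right] = 2\sum_{i=1}^{m}\alpha^i, \qquad (11)$$
according to (11) and (6) we have 
$$\frac{N}{4} - 1 - 2m \le N_m \le \frac{N}{4} - 1. \qquad (12)$$
That is 
$$\frac{N}{4} - 1 - 2m \le N_m < \frac{N}{4}.$$
□ 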
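We define 
$$\Omega = 2\sum_{i=1}^{m}\left(\lceil\alpha^i\rceil - 1\right) \times \frac{1}{\alpha^{i-1}} + (N - N_m)\frac{1}{\alpha^m}. \qquad (13)$$
Because 
$$N_m = 2\sum_{i=1}^{m}\left(\lceil\alpha^i\rceil - 1\right),$$
from (13) and (2) we have 
$$\Omega - \frac{N}{T} = \Omega - \frac{1}{T}\left[2\sum_{i=1}^{m}\left(\lceil\alpha^i\rceil - 1\right) + N - N_m\right] \le S^E_{\text{schedule}} \le \Omega. \qquad (14)$$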
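According to (13) and (3), there exists 
$$\Omega - S^A_{\text{schedule}} = 2\sum_{i=1}^{m}\left(\lceil\alpha^i\rceil - 1 - (\alpha^i - 1)\right)\frac{1}{\alpha^{i-1}} + \left(\frac{N}{4} - 1 - N_m\right)\frac{1}{\alpha^m}. \qquad (15)$$
From (12) we know that both terms on the right hand side of 
(15) are non-negative. According to (12) and (15) we have 
$$\Omega - S^A_{\text{schedule}} \le 2\sum_{i=1}^{m}\frac{1}{\alpha^{i-1}} + \frac{2m}{\alpha^m} = \frac{2\alpha(\alpha^m - 1)}{\alpha^m(\alpha - 1)} + \frac{2m}{\alpha^m} < \frac{2\alpha}{\alpha - 1} + \frac{2m}{\alpha^m}. \qquad (16)$$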
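From (14) and (16) we get 
$$\left|S^E_{\text{schedule}} - S^A_{\text{schedule}}\right| = \left|(S^E_{\text{schedule}} - \Omega) + (\Omega - S^A_{\text{schedule}})\right|$$
$$\qquad \le \left|S^E_{\text{schedule}} - \Omega\right| + \left|\Omega - S^A_{\text{schedule}}\right| < \frac{N}{T} + \frac{2\alpha}{\alpha - 1} + \frac{2m}{\alpha^m}.$$
Let T>N. Usually $2m/\alpha^m$ is a very small value. Further assume 
that $N/T + 2m/\alpha^m \le 1$. We can see that the inequality (4), as 
rewritten below, is true for any α and N. 
$$\left|S^E_{\text{schedule}} - S^A_{\text{schedule}}\right| < 1 + \frac{2\alpha}{\alpha - 1} = \frac{3\alpha - 1}{\alpha - 1}.$$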
APPENDIX B 
CORRECTNESS PROOF OF (α,m) SEARCHING PROCEDURE 
SAschedule, α, m and N are linked together by a complex 
function (3). It is difficult to entirely substitute α by m in 
SAschedule expression. However, substituting m by α is quite easy. 
Thus, we first substitute m by α in SAschedule expression (3) to 
reduce the number of variables and to get (8), in which α is the 
sole variable for any specific N. Taking minimizing SAschedule as 
our goal, we can find a solution α* for (8) by searching 
calculation. According to our previous analysis, this value of α* 
also approximately minimizes SEschedule in (2). But, after we get 
α*, the corresponding m* calculated from (9) is usually not an 
integer. However, the number of iterations m has to be an 
integer. Thus we let m be the nearest integer of m* and calculate 
α again, but this time we use the boundary condition (10) 
(equivalent to (6)) to calculate α in order to guarantee that the 
algorithm generates as many configurations as possible in the 
first m outer-loop iterations (This is ensured by (7)). At this 
point, we get (α,m). This (α,m) pair allows the algorithm to 
generate N/4-1-2m≤Nm<N/4 configurations in the first m outer-
loop iterations and approximately minimizes SAschedule and 
SEschedule (because α≈α*). 
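According to the boundary condition (6), we have 
$$2\sum_{i=1}^{m}\alpha^i = \frac{2\alpha(\alpha^m - 1)}{\alpha - 1} = \frac{N}{4} - 1.$$
The above equation is identical to (10). From this 
condition, it is easy to see that 
$$\alpha^m = \frac{(N+4)\alpha - (N-4)}{8\alpha}. \qquad (17)$$
This directly leads to (9), i.e. 
$$m = \log_\alpha\frac{(N+4)\alpha - (N-4)}{8\alpha} = \frac{\lg[(N+4)\alpha - (N-4)] - \lg 8\alpha}{\lg\alpha}.$$
Substituting m and $\alpha^m$ in (3) by (9) and (17), we get 
$$S^A_{\text{schedule}} = 2m\alpha - \frac{2\alpha(\alpha^m - 1)}{(\alpha - 1)\,\alpha^m} + \left(\frac{3}{4}N + 1\right)\frac{1}{\alpha^m}$$
$$\qquad = 2\alpha\,\frac{\lg[(N+4)\alpha - (N-4)] - \lg 8\alpha}{\lg\alpha} - \frac{2\alpha(N-4)(\alpha-1)}{(N+4)\alpha^2 - 2N\alpha + N - 4} + \frac{6N\alpha + 8\alpha}{(N+4)\alpha - (N-4)}.$$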
Thus (8) is correct. Consequently, the group of formulas used in 
αi-SCALE (i.e. (8)-(10)) is correct. In fact, because (10) is 
equivalent to the boundary condition (6), according to Lemma 1 
in Appendix A, Nm of αi-SCALE is always smaller than N/4 but 
greater than or equal to N/4–1–2m. αi-SCALE takes this as an 
advantage to analyze large entries in C(T) with sufficient 
granularity, and to generate a fine schedule. 
It is obvious that (8) depends only on α (consider N as a 
system constant). We can find an α* to minimize SAschedule and 
estimate the corresponding m* using (9). However, m* is 
usually a fraction. So, we have to take m=ROUND(m*) and 
calculate the new α value again accordingly. In order to satisfy 
the boundary condition (6), we use its counterpart (10) to 
calculate α. This ensures that N/4–1–2m≤Nm<N/4. 
A further analysis indicates that the following restriction in 
Lemma 2 exists for αi-SCALE algorithm: 
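Lemma 2: 
For any switch size N and α≥2, any (α, m) pair in αi-SCALE 
has to satisfy the following inequality: 
$$\left(\frac{N-4}{8}\right)^{1/m} > \alpha > \left(\frac{N-4}{16}\right)^{1/m}.$$
Proof: 
From the boundary condition (6), we know that 
$$2\sum_{i=1}^{m}\alpha^i = \frac{N}{4} - 1 = \frac{N-4}{4} \quad \text{and} \quad \sum_{i=1}^{m}\alpha^i = \frac{N-4}{8}.$$
Thus 
$$\alpha^m < \frac{N-4}{8}. \qquad (18)$$
At the same time, for any α≥2, there exists $\sum_{i=1}^{m-1}\alpha^i < \alpha^m$, thus 
$$2\alpha^m > \alpha^m + \sum_{i=1}^{m-1}\alpha^i = \sum_{i=1}^{m}\alpha^i = \frac{1}{2}\left(\frac{N}{4} - 1\right).$$
That is 
$$\alpha^m > \frac{N-4}{16}. \qquad (19)$$
Combining (18) and (19), we have 
$$\left(\frac{N-4}{8}\right)^{1/m} > \alpha > \left(\frac{N-4}{16}\right)^{1/m}. \qquad (20)$$
□ 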
The above formula (20) holds both before and after 
ROUND(m*). Let 
$$\alpha_{\max} = \left(\frac{N-4}{8}\right)^{1/m} \quad \text{and} \quad \alpha_{\min} = \left(\frac{N-4}{16}\right)^{1/m}.$$
We have 
$$\frac{\alpha_{\max}}{\alpha_{\min}} = 2^{1/m} \approx 1. \qquad (21)$$
The ROUND( ) function can change m’s value by at most 
0.5, and thus 1/m changes only a little. As a result, $\alpha_{\max}$ and $\alpha_{\min}$ 
will be quite stable. In this case, formula (21) indicates that our 
final α is close enough to α*, that is α≈α*. Consequently, the (α, m) 
pair can still approximately minimize $S^A_{\text{schedule}}$ and $S^E_{\text{schedule}}$. 
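The bounds of Lemma 2 are simple to evaluate for a concrete case. This is a small sketch with a hypothetical function name; the value α≈2.505 used in the usage note is the solution of (6) for N=200 and m=3, computed separately and quoted here as an assumption.

```python
def alpha_bounds(N, m):
    """Return (alpha_min, alpha_max) from (20) for switch size N and m iterations."""
    alpha_max = ((N - 4) / 8) ** (1 / m)
    alpha_min = ((N - 4) / 16) ** (1 / m)
    return alpha_min, alpha_max
```

For N=200 and m=3, the bounds are roughly 2.31 and 2.90, which bracket α≈2.505. Their ratio is $2^{1/3} \approx 1.26$, so for such a small m the approximation in (21) is fairly loose and tightens as m grows.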
REFERENCES 
[1] A. Neukermans and R. Ramaswami, “MEMS technology for optical 
networking applications”, IEEE Commun. Mag., vol. 39, pp. 62-69, Jan. 
2001. 
[2] J. E. Fouquet et al., “A compact, scalable cross-connect switch using total 
internal reflection due to thermally-generated bubbles”, IEEE LEOS 
Annual Meeting, pp. 169-170, Dec. 1998. 
[3] O. B. Spahn, C. Sullivan, J. Burkhart, C. Tigges, and E. Garcia, “GaAs-
based microelectromechanical waveguide switch”, Proc. 2000 
IEEE/LEOS Intl. Conf. on Optical MEMS, pp. 41-42, Aug. 2000. 
[4] X. Li and M. Hamdi, “On scheduling optical packet switches with 
reconfiguration delay”, IEEE J. Select. Areas Commun., vol. 21, no. 7, 
pp. 1156-1164, Sept. 2003. 
[5] B. Towles and W. J. Dally, “Guaranteed scheduling for switches with 
configuration overhead”, IEEE/ACM Trans. Networking, vol. 11, no. 5, 
pp. 835-847, Oct. 2003. 
[6] B. Wu and K. L. Yeung, “Minimizing internal speedup for 
performance guaranteed optical packet switches”, IEEE GLOBECOM '04, 
vol. 3, pp. 1742-1746, 29 Nov.-3 Dec. 2004. 
[7] T. Inukai, “An efficient SS/TDMA time slot assignment algorithm”, IEEE 
Trans. Commun., vol. COM-27, no. 10, pp. 1449-1455, 1979. 
[8] R. Cole and J. Hopcroft, “On edge coloring bipartite graphs”, SIAM 
Journal on Computing, vol. 11, pp. 540-546, Aug. 1982. 
[9] R. Diestel, Graph Theory, 2nd ed. New York: Springer-Verlag, 2000.
 
