Performance analysis of iterative matching scheduling algorithms in ATM input-buffered switches. by Cheng, Sze Wan. & Chinese University of Hong Kong Graduate School. Division of Information Engineering.
PERFORMANCE ANALYSIS OF 
ITERATIVE MATCHING SCHEDULING 
ALGORITHMS 
IN ATM INPUT-BUFFERED SWITCHES 
BY 
CHENG SZE WAN 
A THESIS 
SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS 
FOR THE DEGREE OF MASTER OF PHILOSOPHY 
DIVISION OF INFORMATION ENGINEERING 
THE CHINESE UNIVERSITY OF HONG KONG 








Acknowledgement 
I would like to express my deepest gratitude to my supervisor Professor Tony
T. Lee for his continuous support, guidance and invaluable suggestions during
my study at CUHK.
Specifically, I would also like to thank Dr. Soung-Yue Liew for his in-depth
inspiration and discussion on my research work. He worked very closely with me
in the past two years and helped me through all the difficulties I encountered in
my research.
A number of people assisted me in various ways during my master study. Special
thanks to Mr. Man-Chi Chan, who accompanied me in attending the IEEE
ATM Workshop 1999 in Kochi City, Japan. His comprehensive preparation for
our trip relieved many of my worries throughout our stay in Japan. In addition,
I must thank those members of the Broadband Communication Laboratory with whom I
have got along. They are all very kind and would definitely
give me a hand when I am in need.
Last but not least, I am grateful to my parents, grandmother, aunt, Mr. Thomas
H. Ip, Mr. Dennis O. Yeung, and Mr. Eric C. Ma. They all support and love
me whole-heartedly in every aspect of my life.
Abstract 
Traffic scheduling algorithms play an important role in transport systems
employing Asynchronous Transfer Mode (ATM) switching. In broadband integrated
services packet networks, a wide range of applications with diversified
Quality of Service (QoS) requirements are to be supported by individual switches,
and this relies significantly on the scheduling algorithms adopted by these switches.
The function of these scheduling algorithms is, on one hand, to resolve the
contention problem when multiple cells compete for the same output and, on the
other hand, to allocate the network bandwidth fairly according to the QoS required by
the connections. The design of these algorithms also depends on the specific
switch architecture, and extensive research has been done in this area.
The focus of this thesis is on analysing the performance of various traffic
scheduling algorithms adopted by ATM input-buffered switches. An algorithm
called Enhanced Parallel Iterative Matching (EPIM) is proposed, and it is
found that this algorithm outperforms its previous version, the Parallel Iterative
Matching algorithm, in terms of switch throughput and average delay experienced
by the cells. To cater for the need of supporting bandwidth guarantees to
individual connections, we modify the EPIM algorithm by incorporating it with
the Static Scheduling algorithm, which is capable of reserving enough bandwidth
for connections during call setup time. Under uniform and independent traffic,
the incorporation of the two algorithms allows provision of bandwidth guarantees
without affecting the delay and throughput performance of the switch.
Recently, a large-scale ATM switch architecture called the Cross-Path switch
has been proposed and proven to be able to handle multirate and multicast traffic
efficiently. The merit of this switch architecture is the introduction of a
quasi-static routing scheme, namely Path Switching, to the traditional three-stage
Clos network to accomplish ATM switching. An input module is considered
virtually connected to an output via a virtual path with a certain number of
tokens. A token is equivalent to a physical connection within a central module.
This number determines how many cells are to be transmitted over this virtual
path within a frame in each time slot. The input ports of an input module are
thought of as competing for tokens that "bring" the cells to the desired output
modules. As a result, we propose a logical model for the input modules and the
virtual paths, which enables us to apply EPIM, with minor adjustment, to scheduling
the tokens among competing input ports of an input module. This also
allows each input module to do the scheduling task in a distributed and parallel
manner. The performance of an input module using the adjusted version of
EPIM is studied to complete our proposal.
Abstract (Chinese version)
In Asynchronous Transfer Mode (ATM), traffic scheduling is a very important
consideration. In broadband integrated services packet networks, every switch
has to support different applications and guarantee
[...]
performance. Addressing the existing Parallel Iterative Matching (PIM)
algorithm, this thesis proposes an improved version, Enhanced Parallel Iterative
















1 Introduction
1.1 Background
1.2 Traffic Scheduling in Input-buffered Switches
1.3 Organization of Thesis
2 Principle of Enhanced PIM Algorithm
2.1 Introduction
2.1.1 Switch Model
2.2 Enhanced Parallel Iterative Matching Algorithm (EPIM)
2.2.1 Motivation
2.2.2 Algorithm
2.3 Performance Evaluation
2.3.1 Simulation
2.3.2 Delay Analysis
3 Providing Bandwidth Guarantee in Input-Buffered Switches
3.1 Introduction
3.2 Bandwidth Reservation in Static Scheduling Algorithm
3.3 Incorporation of Dynamic and Static Scheduling Algorithms
3.4 Simulation
3.4.1 Switch Model
3.4.2 Simulation Results
3.5 Comparison with Existing Schemes
3.5.1 Statistical Matching
3.5.2 Weighted Probabilistic Iterative Matching
4 EPIM and Cross-Path Switch
4.1 Introduction
4.2 Concept of Cross-Path Switching
4.2.1 Principle
4.2.2 Supporting Performance Guarantee in Cross-Path Switch
4.3 Implication of EPIM on Cross-Path Switch
4.3.1 Problem Re-definition
4.3.2 Scheduling in Input Modules with EPIM
4.4 Simulation
5 Conclusion
Bibliography
List of Figures 
1.1 (a) Example of HOL Blocking in Input-buffered Switch with single FIFO queue. (b) Window-based Solution for eliminating HOL Blocking
1.2 Relationship between Switch Scheduling and Bipartite Graph Matching
1.3 A single iteration of PIM - First Iteration
2.1 A 2 x 2 Input-buffered switch with VOQ
2.2 (a) PIM: first iteration, time slot 1. (b) Request phase of second iteration, time slot 1
2.3 EPIM: second iteration, time slot 1
2.4 EPIM: first iteration, time slot 2
2.5 Delay performance of PIM, EPIM and Output Queueing
3.1 Capacity Matrix and its corresponding Bipartite Graph
3.2 An example of Static Scheduling Algorithm (N=2, F=6)
3.3 Capacity Matrix and its corresponding Bipartite Graph
3.4 Example of Incorporating EPIM with Static Scheduling Algorithm
3.5 (a) Switch Model and Buffer Management for EPIM with static scheduling, (b) Traffic characteristics used in simulations
3.6 Overall Delay Performance of EPIM and PIM with Static Scheduling - Evenly Distributed Token Assignment
3.7 Overall Delay Performance of EPIM and PIM with Static Scheduling - Aligned Token Assignment
3.8 Delay Performance of Statically Scheduled Cells - Evenly Distributed Token Assignment
3.9 Delay Performance of Statically Scheduled Cells - Aligned Token Assignment
3.10 Delay Performance of Dynamically Scheduled Cells - Evenly Distributed Token Assignment
3.11 Delay Performance of Dynamically Scheduled Cells - Aligned Token Assignment
3.12 Example of Weighted Parallel Iterative Matching (WPIM) Algorithm
3.13 Delay Performance of EPIM with Static Scheduling and WPIM
3.14 Delay Performance of WPIM and PIM
4.1 An N x N Cross-Path Switch
4.2 Logical Model of an Input Module in Cross-Path Switch
4.3 Logical Model of Input Module (a) 0 (b) 1
4.4 Middle-stage Route Schedule and corresponding Bipartite Graph in time slot (a) 0 (b) 1
4.5 Scheduling cells at Input Module 1 - The Logical Model
4.6 Finding the Frame Schedule for Input Module 1
4.7 Delay Performance of EPIM and PIM: 17 tokens
4.8 Delay Performance of EPIM and PIM: 18 tokens
4.9 Delay Performance of PIM: 16, 17, and 18 tokens
4.10 Delay Performance of EPIM: 16, 17, and 18 tokens
4.11 A Cross-Path Switch with n=3, k=2, and m=3









Chapter 1 Introduction
1.1 Background
Over the years, the telecommunication industry has witnessed tremendous
growth in both the level of technology and the demand for fast, efficient
communication services. Great technological progress has occurred in electronics and
fiber optics, which has driven the development of ultra-high-performance networking
equipment to support high speed transmission of data over huge networks. On the
other hand, the requirements of network users keep changing as new
types of application evolve. Designing a next generation networking solution
that takes all these factors into consideration is a challenge to the industry.
In the past, nearly every individual telecommunication service employed its
own network to transport the service. For example, computer data is usually
delivered over packet switched local area networks (LAN) and wide area
networks (WAN), while POTS (plain old telephone service) is transported by the
public switched telephone network (PSTN). Another example is the transport
of television signals. These signals can be broadcast over media like air (via
radio waves), the coaxial tree network of community antenna TV (CATV), and
satellite. Yet, nowadays, new applications keep on emerging, which makes the
distinction between these transport systems less obvious than before. People
are developing real time video (video conferencing), television (WebTV), and voice
signals over the internet. Meanwhile, internet service providers (ISP) are using
the public telephony network extensively in providing internet access service to
subscribers. As a result, building an integrated transport system that can cater
for a wide category of applications with diversified, or even unknown,
requirements at high speed is the common goal for researchers. This is the motivation
for the development of the Broadband Integrated Services Digital Network (BISDN).
Asynchronous Transfer Mode (ATM) was selected as the ultimate solution
for data switching in BISDN by the CCITT in the late eighties. It is a
connection-oriented packet-based protocol that allocates as much bandwidth needed by
the control channels need to be assigned. In ATM, variable-sized packets are
segmented into short, fixed-length 53-byte packets called cells. As a
connection-oriented transfer mode, ATM requires virtual circuits between sources and
destinations to be set up before routing of cells. A number of ATM switch
architectures have been proposed in the literature, and many of them have been
implemented as commercial products. The type of switch fabric of interest to us is
the internally non-blocking switch architecture, which is commonly implemented
by a crossbar switching network.
For those internally non-blocking switching networks, since the arrival of cells at
the input ports is not coordinated, output contention results when
more than one input port has cells for the same output. Since only one of those
inputs can have its cell transmitted, those that lose in the contention resolution
will either drop their cells (loss system) or backlog them in a buffer (wait
system). In a wait system, we can place the buffer at the switch input or output side,
which are known as input buffering and output buffering respectively.
1.2 Traffic Scheduling in Input-buffered Switches 
Besides resolving output contention, an ATM switch must be capable
of allocating bandwidth among calls effectively in order to meet the
quality-of-service (QoS) guarantees needed by individual calls while maintaining high
utilization of switch bandwidth. This can be achieved by the proper use of
scheduling algorithms, the choice of which depends on the switch architecture. In the
past, a lot of attention in designing scheduling algorithms was drawn to the
output-buffered switch, as it can achieve 100% throughput. However, the cost
of achieving this is high: the switching speed has to be increased to N times the
transmission rate of the lines (N is the number of input ports), and the algorithms are
usually computationally expensive in high speed and large scale implementations.
The input-buffered switch with single FIFO queues, on the contrary, is attractive for its
simplicity but constrained by its low throughput of 58.6% [13]. This is the result of
head-of-line (HOL) blocking at an input queue, which forbids cells other than the
HOL one from being transmitted when more than one HOL cell from different inputs
contends for the same output (Figure 1.1(a)).
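The effect of HOL blocking can be illustrated with a small Monte Carlo sketch. It is not the analysis of [13]; it simply assumes a saturated switch (every input always has a HOL cell), uniformly random destinations, and random contention resolution. The function name and parameters are hypothetical. For N = 2 the measured throughput should be close to 0.75, and for larger N it should fall towards the 58.6% limit, which equals 2 - sqrt(2).

```python
import random


def saturated_fifo_throughput(n_ports, n_slots, seed=0):
    """Estimate per-port throughput of a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # hol[i] is the destination of the head-of-line cell at input i.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    sent = 0
    for _ in range(n_slots):
        # Group the inputs by the output their HOL cell is destined to.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each contended output serves exactly one input, chosen at random.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            # The winner's queue advances; losers keep their blocked HOL cell.
            hol[winner] = rng.randrange(n_ports)
            sent += 1
    return sent / (n_ports * n_slots)


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, round(saturated_fifo_throughput(n, 5000), 3))
```

Cells queued behind a losing HOL cell are never examined in this model, which is exactly the restriction that the schemes discussed next try to remove.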
Several different scheduling techniques have been proposed to cope with
HOL blocking in input-buffered switches. The essence of those schemes is to
allow the switch to choose cells other than the HOL ones to be transmitted.
Figure 1.1: (a) Example of HOL Blocking in Input-buffered Switch with single
FIFO queue. (b) Window-based Solution for eliminating HOL Blocking.
(Panel (a): only one of the competing packets will win; packets behind are blocked
by the first packets. Panel (b): window size = 3.)
An earlier scheme for improving the performance of input-buffered switches
is the window-based selection of cells in FIFO input queues. In this scheme,
a windowed buffer allows any of the first w packets to be transmitted, where
w is the window size of the queue (Figure 1.1(b)). Simulations and analysis
have found that even a small window size can bring a remarkable improvement over
traditional FIFO queues. The larger the window size, the better the throughput
performance will be, but at the same time the scheduling speed will be restricted
by the access speed of the memory in the buffer.
One of the most representative approaches is the Parallel Iterative Matching
(PIM) algorithm proposed by Anderson, Owicki, and Saxe [1]. In that paper,
choosing which cells to send is modeled as matching input and output nodes
in a bipartite graph (Figure 1.2). Specifically, in order to maximize
the throughput of the switch, a maximal matching^1 should be found. This idea
was first investigated in Anderson's paper, whose authors proposed the use
of the PIM algorithm to achieve the task.
^1 A maximal matching is one to which pairings cannot be trivially added; each node is
either matched or has no edge to an unmatched node.
Figure 1.2: Relationship between Switch Scheduling and Bipartite Graph Matching.
(Panels: connection requests / buffer occupancy; connection setup / schedule for
this time slot.)
The PIM algorithm uses parallelism, randomness, and iteration to compute
the schedule of the switch. There are 3 phases in the algorithm, namely the
Request, Grant, and Accept phases. The scheduler iterates over all the phases until
a fixed number of iterations have been invoked or a maximal matching for the
bipartite graph has been found. Here are the three phases of PIM:
Request Each unmatched input sends a request to every output
for which it has a buffered cell. This notifies an output of all its
potential partners.
Grant If an unmatched output receives any requests, it chooses
one randomly to grant. The output notifies each input whether its
request was granted.
Accept If an input receives any grants, it chooses one to accept
and notifies that output.
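The three phases above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of [1]: the data layout (`requests` maps each input to the set of outputs for which it has buffered cells) and all function names are assumptions made here, and the Accept phase picks randomly among grants, which the description above leaves open.

```python
import random


def pim_iteration(requests, match_in, match_out, rng):
    """Run one Request/Grant/Accept round; return True if any grant was issued."""
    # Request: every unmatched input asks each output it has a cell for.
    received = {}
    for i, outputs in requests.items():
        if i in match_in:
            continue
        for j in outputs:
            received.setdefault(j, []).append(i)
    # Grant: every unmatched output picks one requesting input at random.
    granted = {}
    for j, inputs in received.items():
        if j in match_out:
            continue
        granted.setdefault(rng.choice(inputs), []).append(j)
    # Accept: every input holding grants picks one and the pair is matched.
    for i, outputs in granted.items():
        j = rng.choice(outputs)
        match_in[i] = j
        match_out[j] = i
    return bool(granted)


def pim(requests, max_iterations, seed=0):
    """Iterate until no grant is issued (maximal matching) or the limit is hit."""
    rng = random.Random(seed)
    match_in, match_out = {}, {}
    for _ in range(max_iterations):
        if not pim_iteration(requests, match_in, match_out, rng):
            break
    return match_in


if __name__ == "__main__":
    # Hypothetical 4 x 4 request pattern, not the one drawn in Figure 1.3.
    print(pim({0: {0, 1, 2}, 1: {0}, 2: {0, 2}, 3: {3}}, max_iterations=4))
```

An iteration that issues no grant means that no unmatched input has a cell for an unmatched output, so the matching is already maximal and the loop can stop early.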
Figure 1.3 illustrates how one iteration of the algorithm works. After this
iteration, two connections have been set up and the result will be brought forward
to the next iteration. In that iteration, only input 3 will send a request, to output 3.
Since it is the only request received by output 3, the request will be granted
directly. As a result, the connection between input 3 and output 3 will be set up
and a maximal matching of this bipartite graph will be determined.
Figure 1.3: A single iteration of PIM - First Iteration. (Panels: Request, Grant, Accept.)
A lot of attention was drawn afterwards, and various modifications have been
devised. The modifications are mostly done by changing the way of making
decisions in the Grant and Accept phases. For instance, an alternative algorithm
called Iterative Round Robin Matching with slip (SLIP-IRRM) is proposed in
[2]. Simply speaking, instead of using randomness in making selections,
SLIP-IRRM selects the cell to be transmitted through the use of a round robin scheduler
at each input and output port. Other derivatives include LRU and LQF [3, 6].
In essence, these modifications relinquish the use of randomness in the selection
process. Instead, they use a varied priority mechanism which is based on the
state information that is maintained from one time slot to the next. There are
also algorithms derived from PIM which are capable of providing bandwidth
guarantees to individual connections.
1.3 Organization of Thesis 
This thesis aims at investigating the performance issues of various scheduling
algorithms employed in the class of ATM input-buffered switches. In particular,
an enhanced version of the original PIM algorithm (EPIM) is proposed in
Chapter 2. We study the improvement in delay performance of EPIM
over PIM through simulations and analysis. Due to the demand
for QoS support in ATM switches, we illustrate the capability of EPIM in
providing bandwidth guarantees to connections by incorporating it with a static
scheduling algorithm. The details of how this is done are discussed in Chapter
3. Moreover, a comparison with existing schemes which are based on iterative
matching algorithms to provide per-connection QoS is made in the same chapter.
The implication of EPIM on the Cross-Path switch in scheduling cells at input
modules is the focus of Chapter 4. Lastly, Chapter 5 concludes the findings








Chapter 2 Principle of Enhanced PIM Algorithm
2.1 Introduction
The ultimate goal of scheduling best-effort service cells in high speed packet
switches is to resolve output contention and, at the same time, maximize the
number of cells to be transmitted in each time slot. The scheduler need not
provide performance guarantees of any kind, but a good scheduler should be able
to distribute the switch bandwidth in a fair and high speed manner. This is also
our main concern in designing our proposed algorithm.
This section will present the problem definition framework and switch model
to be used in the rest of this thesis. The core of this section will be on our
algorithm - the Enhanced Parallel Iterative Matching (EPIM) algorithm - which
performs scheduling of cells in generic input-buffered switches. Simulations and
performance analysis are also included at the end of this section.
2.1.1 Switch Model 
Consider an N x N input-buffered switch. Figure 2.1 shows the case when N = 2.
The switch consists of N input and N output ports, an internally non-blocking
switching fabric, a buffer at each input port and a scheduler. As mentioned earlier,
traditional packet switches with buffers only at input ports employ a single
First-In-First-Out (FIFO) queue at each input port to store backlogged cells. In that
case, head-of-line (HOL) blocking at input ports will limit switch throughput to
about 58.6%. To overcome this limitation, we improve the buffer organization
in such a way that HOL blocking can be eliminated.
Figure 2.1: A 2 x 2 Input-buffered switch with VOQ. (Labels: Virtual Output
Queueing (VOQ); internally non-blocking switching fabric; input 1; input 2; scheduler.)
Instead of keeping a single FIFO queue, every input port maintains N logical
queues, each of which is dedicated to one of the N output ports. In doing
so, cells destined for different output ports can be considered simultaneously
during scheduling, thereby entirely eliminating HOL blocking at input ports.
This technique is known as Virtual Output Queueing (VOQ) because the switch
behaves as if it were an output queueing switch.
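A minimal sketch of this buffer organization is given below. The class and method names are hypothetical and chosen only for illustration; the point is that an input port exposes the set of outputs it has cells for, rather than a single head-of-line destination.

```python
from collections import deque


class VOQInput:
    """One input port holding a separate logical FIFO queue per output port."""

    def __init__(self, n_outputs):
        self.queues = [deque() for _ in range(n_outputs)]

    def enqueue(self, cell, output):
        # An arriving cell joins the logical queue of its destination output.
        self.queues[output].append(cell)

    def requested_outputs(self):
        # Every non-empty queue is visible to the scheduler at the same time.
        return {j for j, q in enumerate(self.queues) if q}

    def dequeue(self, output):
        # Called once the scheduler has matched this input to `output`.
        return self.queues[output].popleft()


if __name__ == "__main__":
    port = VOQInput(3)
    port.enqueue("a", 2)
    port.enqueue("b", 0)  # arrives after "a" but is not blocked by it
    print(port.requested_outputs())
```

With a single FIFO queue, cell "b" would have to wait behind cell "a" even if output 0 were idle; with per-output queues the scheduler can serve either one.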
The scheduler is responsible for resolving input and output conflicts while
satisfying the performance guarantees to cells as promised during call setup time.
It is the actual unit of the switch that implements the scheduling algorithm
described in subsequent sections. The decision of the scheduler is then passed
to the switching fabric, and connections among inputs and outputs will be set
up accordingly. How the scheduler comes up with this decision is the main
concern of this thesis.
Every output port is considered to have some tokens available in each time
slot. Saying that output port j gets k tokens in time slot t means that at time t, output
j can transmit at most k cells to the corresponding outgoing link. Besides, an
input can send at most 1 cell to the output. In our discussion, we assume that
each output port gets 1 token per time slot. In fact, we can easily extend this
to more than 1 token per time slot. Some switches are structured as a cascade
of several stages of modules, one example of which is the Clos network. For these types
of switches, tokens can be viewed as the internal bandwidth between the input
module and each of the output modules in that particular time slot.
2.2 Enhanced Parallel Iterative Matching Algorithm (EPIM)
2.2.1 Motivation 
As mentioned in Chapter 1, there are three phases in the original PIM algorithm,
namely the Request, Grant and Accept phases. The algorithm makes use of random-




bipartite graph, which corresponds to the transmission schedule in a particular
time slot. Figure 2.2(a) demonstrates the first iteration of the algorithm in
time slot 1. In each phase, only the transmission of cells in the current time
slot is considered. That is, if input i has a cell for output j, once output j has
been assigned to another input in the current time slot, the request from i to j will
never be processed. On the other hand, a currently matched input will not send
any more requests to outputs for which it has unscheduled cells in subsequent
iterations.
Figure 2.2: (a) PIM: first iteration, time slot 1 (Request, Grant, Accept). (b) Request
phase of second iteration, time slot 1.
Figure 2.2(b) shows the Request phase of the second iteration. After the
first iteration, connections between input 1 and output 1, and between input 2 and
output 3, have been set up. Input 3 remains unmatched for the time being;
therefore, it will send requests to outputs 1, 2, and 3. Since outputs 1 and 3 have
already been matched, they will neglect those requests, and scheduling of
the corresponding cells is postponed to the next time slot. Only output 2 will consider the
request sent by input 3 in this iteration.
The above restriction is unnecessary because we can make use of
this information to schedule cells for subsequent time slots in later iterations. As
a result, at the end of the current time slot, we would come up with an input-output
schedule for this particular time slot and a partially finished schedule for the time
slots thereafter. This would reduce the size of the matching problem faced in
subsequent time slots, and thus fewer iterations are needed to provide the same
performance as the original PIM algorithm. We will describe how this can be
done in later sections.
2.2.2 Algorithm 
In this subsection, the Enhanced Parallel Iterative Matching (EPIM) algorithm will be
discussed in detail, and its concept will be visualized via a simple example. EPIM
by itself cannot provide performance guarantees to connections, making it
suitable for scheduling transmission of cells belonging to the best-effort service
category.
The essence of EPIM is the advance booking of tokens and the bringing forward
of the transmission schedule from previous time slots. How these can be achieved
will be demonstrated in the following algorithm description.
The concepts of parallelism, randomness, and iteration in the original PIM are still
in use in our algorithm. In particular, parallelism ensures that the algorithm is
a distributed one and each input/output port can accomplish its task without
knowing what the others are doing. Each iteration in our algorithm consists of three
phases: the Request, Grant, and Accept phases. The following three phases are
repeated for a fixed number of times or until a maximal matching is found







Request Every input which is unmatched in the current time slot will
send a request to every output for which it has an unscheduled cell.
Grant If an output receives any requests, it checks for the first
available token and the corresponding time slot number. The output
then grants to one of the requests chosen randomly and it notifies
the corresponding input port which time slot this grant belongs to.
Accept If an input receives any grants, it will randomly accept
one grant from each of the time slots to which the grants belong.
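One way to picture these three phases is the Python sketch below. It is a simplified illustration and not the thesis's implementation: the data structures (`pending`, `in_sched`, `out_sched`) and function names are assumptions made here, each output is assumed to hold one token per time slot, and an input is assumed to ignore a grant for a slot in which it is already booked. The example data reproduce the state after the first iteration of time slot 1 in Section 2.2.1 (input 1 matched to output 1, input 2 matched to output 3, input 3 holding cells for outputs 1, 2 and 3).

```python
import random
from collections import defaultdict


def first_free_slot(booked_slots, start):
    """Return the earliest slot >= start that is not in booked_slots."""
    slot = start
    while slot in booked_slots:
        slot += 1
    return slot


def epim_iteration(t, pending, in_sched, out_sched, rng):
    """Run one Request/Grant/Accept round for current time slot t.

    pending[i][j]   -- unscheduled cells queued at input i for output j
    in_sched[i][s]  -- output already booked by input i in slot s
    out_sched[j][s] -- input already booked by output j in slot s
    Returns the (input, output, slot) bookings accepted in this round.
    """
    # Request: inputs unmatched in the current slot ask every output
    # for which they still hold an unscheduled cell.
    requests = defaultdict(list)
    for i, per_output in pending.items():
        if t in in_sched[i]:
            continue
        for j, count in per_output.items():
            if count > 0:
                requests[j].append(i)

    # Grant: each requested output offers its first available token,
    # which may lie in a later slot, to one randomly chosen requester.
    grants = defaultdict(lambda: defaultdict(list))
    for j, inputs in requests.items():
        slot = first_free_slot(out_sched[j], t)
        grants[rng.choice(inputs)][slot].append(j)

    # Accept: an input takes one random grant per slot. Skipping slots it
    # has already booked is an assumption of this sketch.
    accepted = []
    for i, by_slot in grants.items():
        for slot, outputs in by_slot.items():
            if slot in in_sched[i]:
                continue
            j = rng.choice(outputs)
            in_sched[i][slot] = j
            out_sched[j][slot] = i
            pending[i][j] -= 1
            accepted.append((i, j, slot))
    return accepted


# Second iteration of time slot 1. Queue contents of inputs 1 and 2 are
# left empty here because matched inputs send no requests anyway.
pending = {1: {}, 2: {}, 3: {1: 1, 2: 1, 3: 1}}
in_sched = {1: {1: 1}, 2: {1: 3}, 3: {}}
out_sched = {1: {1: 1}, 2: {}, 3: {1: 2}}
accepted = epim_iteration(1, pending, in_sched, out_sched, random.Random(0))
print(sorted(accepted, key=lambda booking: booking[2]))
```

In this run input 3 books output 2 for time slot 1 and one of outputs 1 and 3 for time slot 2, so the current slot is fully scheduled and the next slot is partially scheduled in advance.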
In the original PIM algorithm, each input is matched to at most one
output port in the current time slot. However, owing to the fact that advance
booking of tokens at output ports is allowed in EPIM, an input port may be
associated with more than one output port, in different time slots, at any time.
To achieve this, every input has to keep track of its connections to outputs in
various time slots. With this information on hand, an input knows whether
it is eligible to send requests at the beginning of each time slot. The Request
phase of our algorithm ensures that scheduling cells in the current time slot has
higher priority than scheduling for later time slots. This means that once an input is matched
to an output in the current time slot, it should give other inputs a better
chance of being matched by not sending any more requests in this time
slot.
The Grant phase of EPIM is quite different from that of the original PIM in the sense
that tokens belonging to time slots other than the current one will be considered.
A token is equivalent to a transmission opportunity from an input to an output
in a time slot. Since advance booking of output ports is possible, the first
available token may be located somewhere after the current time slot. It is this
property that resolves the problem of a matched output neglecting input requests
and hence makes it possible to schedule cells for subsequent time slots in
later iterations.
EPIM allows multiple grants to be accepted by an input simultaneously in
each iteration. In the Accept phase, an input can accept more than one grant as
long as the tokens come from different outputs at different time slots.
Figure 2.3 shows an example of how EPIM works. We continue to demonstrate
the algorithm using the example in Section 2.2.1. Here, we assume that
the scheduler invokes 2 iterations in every time slot. It should be noted
that the first iteration of EPIM is the same as that of the original PIM, which has been
shown in Figure 2.2. Thus, we start from the second iteration of time slot
1.
Figure 2.3: EPIM: second iteration, time slot 1. (Panels: Request, Grant, Accept.)
Based on the results of Figure 2.2, output ports 1 and 3 have been matched
with input ports 1 and 2 respectively. In the second iteration of time slot 1, input
3, which is still unmatched, will send requests to all output ports for which it has
unscheduled cells. Upon receiving any request, an output will check for its first
available token, which will be granted to an input in the Grant phase. For output
ports 1 and 3, their first available tokens reside in time slot 2. Since input
3's requests are the only requests they receive, outputs 1 and 3 will grant it tokens
belonging to time slot 2 in the Grant phase. Output 2 is still unmatched in the
current time slot, and so it grants the token of time slot 1 to input 3.
In the Accept phase, input 3 will randomly accept one grant from each of the
time slots to which the grants belong. The grants received by input 3 are from
2 time slots: 1 grant (from output 2) of time slot 1, and 2 grants (from outputs
1 and 3) of time slot 2. The grant belonging to time slot 1 must be accepted
right away. Of the other 2 grants of time slot 2, only one will be chosen
randomly to be accepted. In the example, output 1's grant is picked. This is the
result obtained at the end of time slot 1: a fully determined switching schedule
for time slot 1 and a partially finished schedule for time slot 2.
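The Accept rule described above can be sketched in a few lines of Python. This
is only an illustrative sketch: the function name `epim_accept` and the
representation of a grant as an (output port, time slot) pair are assumptions
made here, not part of the original algorithm description.

```python
import random
from collections import defaultdict

def epim_accept(grants, rng=random):
    """Accept one grant per time slot, chosen uniformly at random.

    grants: list of (output_port, time_slot) pairs received by one input.
    Returns a dict mapping time_slot -> accepted output_port.
    """
    by_slot = defaultdict(list)
    for output_port, time_slot in grants:
        by_slot[time_slot].append(output_port)
    # Grants for different slots never conflict, so one is kept per slot.
    return {slot: rng.choice(outputs) for slot, outputs in by_slot.items()}

# Grants seen by input 3 in the example: output 2 offers slot 1,
# outputs 1 and 3 both offer slot 2.
accepted = epim_accept([(2, 1), (1, 2), (3, 2)], random.Random(0))
```

The single grant for time slot 1 is always accepted, while the two competing
grants for time slot 2 are resolved by a random choice, as in the example.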
The partially finished schedule for time slot 2 is brought forward to the next
time slot. The first iteration of EPIM in time slot 2 is shown in Figure 2.4. In
the Request phase, input 3 can no longer send requests to its desired outputs
because, in time slot 1, it has already been scheduled to connect with output 1
in time slot 2. The other inputs send requests to their desired outputs as
usual. In the Grant phase, outputs again check for their first available tokens:
output 1's is at time slot 3, and output 2's and 3's are at the current time
slot. These outputs follow the same rules as described previously. As a result,
input 1 gets 2 grants, belonging to slots 2 and 3 respectively. Input 2 receives
only 1 grant, from output 2.
The Accept phase for this part of the example is done similarly. Again, an
input will accept grants from more than one output in each iteration, provided
that these grants belong to different time slots. In that case, no input
conflict will happen, as the input is matched to multiple outputs in different,
non-overlapping time slots.
[Figure panels: Request, Grant, Accept]
Figure 2.4: EPIM: first iteration, time slot 2.
2.3 Performance Evaluation 
2.3.1 Simulation 
A 16x16 input-buffered switch with a Bernoulli cell arrival process at each
input port is used in our simulations. In asynchronous transfer mode, the time
axis is divided into time slots, each equal to the time needed to transmit one
53-byte cell. Cell arrivals in different time slots are independent. The
probability that a cell arrives at an input port in a time slot is referred to
as the offered load of the switch. The destination of a new incoming cell is
chosen uniformly at random among all output ports.
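A minimal sketch of this traffic model is given below. The function name
`generate_arrivals` and the use of `None` to mark an idle input are assumptions
made for illustration; the simulator used in this work is not reproduced here.

```python
import random

def generate_arrivals(num_ports, load, rng):
    """Generate the arrivals of one time slot under Bernoulli traffic.

    Returns a list whose i-th entry is the destination output of the cell
    arriving at input i, or None if no cell arrives at input i in this slot.
    """
    return [rng.randrange(num_ports) if rng.random() < load else None
            for _ in range(num_ports)]

rng = random.Random(1)
slots = [generate_arrivals(16, 0.6, rng) for _ in range(10000)]
# Fraction of (input, slot) pairs that carry a cell; should be close to 0.6.
measured_load = sum(d is not None for s in slots for d in s) / (16 * 10000)
```

Each input draws independently in every slot, so arrivals in different time
slots are independent and the measured load converges to the offered load.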
We have simulated both the original PIM algorithm and our enhanced version
of it. Mean cell delay is the performance parameter of interest. The number
of iterations performed in a time slot varies from one to four.
It should be noted that EPIM with one iteration is effectively identical to
the original PIM with a single iteration. The reason is that the algorithm
computes the schedule for later time slots only from the second iteration
onwards. If only one iteration is invoked in each time slot, no advance booking
of tokens can ever be made. As a result, a single iteration of EPIM can only
schedule cells in the current time slot. Therefore, we concentrate on the
performance difference with 2 to 4 iterations.
Figure 2.5 shows our simulation results. We varied the offered load from 0 to 1
and studied the mean delay (in cell time slots) experienced by a cell.
[Plot: mean cell delay (cell time slots) versus offered load, 0 to 1. Curves:
PIM with 1-4 iterations; EPIM with 2-4 iterations.]
Figure 2.5: Delay performance of PIM, EPIM and Output Queueing.
It is found that in the original PIM, the delay performance improves
significantly with more iterations. Four iterations are needed for its
performance to approach that of an output queueing switch. With respect to
EPIM, the following results are obtained:
1. Mean cell delay experienced by connections is insensitive to the number of
iterations invoked by the switch.
2. Delay performance of EPIM with 2 to 4 iterations is comparable to that of
the original PIM algorithm with 4 iterations.
The above two findings imply that with enhanced PIM, 2 iterations are enough
to achieve a delay performance as good as that of an output queueing switch,
whereas the original PIM algorithm needs 4 iterations to do so. This is a
remarkable improvement over the original PIM: a scheduler using EPIM needs to
invoke only two iterations to attain performance similar to that of a
4-iteration PIM scheduler. Fewer iterations mean faster cell scheduling, which
is essential to meet today's increasing demand for high-speed switching and
transmission.
2.3.2 Delay Analysis 
In this section, the delay analysis of PIM and EPIM is discussed. It is found
that the maximum throughput calculated in our analysis matches the findings of
our simulations. The delay and throughput performance of EPIM starts to exceed
that of PIM when more than one iteration is invoked, which is consistent with
the graph shown in Figure 2.5.
The performance of EPIM and PIM in an N x N input-buffered switch under
saturation is of interest here. Let $\rho$ be the offered load of the switch; in
the saturated state, $\rho = 1$. A uniform and independent incoming traffic
pattern is assumed, which means that each packet, on arriving at an input port,
is equally likely to be destined for any one of the N output ports. We also
assume that the size of the switch (i.e. N) is large in our analysis.
Analytical Results of PIM 
To begin with, consider the case when only one iteration is invoked in each
time slot. When the switch is saturated, each output port must receive a request
from every unmatched input port in the Request phase. Since there is only one
iteration in a time slot, an output port will receive N requests each time, and
hence it will randomly choose one out of these N requests to grant. In the
Accept phase, a connection will be set up if an input port receives any grant.
The probability that an input port receives a grant is therefore also the
probability of setting up a connection.
For an input port not to receive any grant in the Grant phase, the probability
is given by:

    \Pr\{\text{input receives no grant}\} = \left(1 - \frac{1}{N}\right)^{N}    (2.1)

The probability that an input receives at least one grant during the first
iteration is:

    \hat{\rho} = 1 - \Pr\{\text{input receives no grant}\} = 1 - \left(1 - \frac{1}{N}\right)^{N}    (2.2)

where $\hat{\rho}$ is the maximum throughput of an input-buffered switch that
invokes one iteration of PIM in each time slot. When N is large, the maximum
throughput approaches 0.632:

    \hat{\rho} \to 1 - e^{-1} = 0.632 \quad \text{when } N \to \infty    (2.3)
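As a quick numerical check of Eq. (2.2) and (2.3), the following Python sketch
evaluates the closed form and compares it with a small Monte Carlo experiment of
a saturated Grant phase, in which each of the N outputs grants one of the N
inputs uniformly at random. The function names are illustrative assumptions.

```python
import math
import random

def pim_single_iteration_throughput(n):
    """Eq. (2.2): probability that an input receives at least one grant."""
    return 1.0 - (1.0 - 1.0 / n) ** n

def simulate_saturated_grant_phase(n, trials, rng):
    """Estimate the fraction of inputs that receive at least one grant."""
    matched = 0
    for _ in range(trials):
        # Every output picks one of the n requesting inputs at random.
        granted_inputs = {rng.randrange(n) for _ in range(n)}
        matched += len(granted_inputs)
    return matched / (n * trials)

closed_form_16 = pim_single_iteration_throughput(16)
simulated_16 = simulate_saturated_grant_phase(16, 20000, random.Random(7))
limit = 1.0 - math.exp(-1.0)  # Eq. (2.3), the large-N limit
```

For a finite switch such as N = 16 the closed form lies slightly above the
limiting value, and it decreases towards $1 - e^{-1}$ as N grows.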
When two iterations are to be invoked each time, $\hat{N} = (1 - \hat{\rho})N$
input-output pairs still remain unmatched at the beginning of the second
iteration. As a result, only $\hat{N}$ out of the N input and output ports will
participate in the PIM algorithm during the second iteration. Substituting N
with $\hat{N}$, we have the number of input-output pairs that could be matched
in the second iteration as follows:

    \left[1 - \left(1 - \frac{1}{\hat{N}}\right)^{\hat{N}}\right] \hat{N} = \hat{\rho}\,\hat{N} = \hat{\rho}(1 - \hat{\rho})N    (2.4)

Let $\hat{\rho}_X$ denote the maximum throughput attained when X iterations of
PIM are invoked in each time slot. In the case when X = 2, we have

    \hat{\rho}_2 = \hat{\rho} + \hat{\rho}(1 - \hat{\rho})    (2.5)

with $\hat{\rho} = 0.632$.
The same is done for other values of X, the number of iterations in a time
slot. The general expression for the maximum throughput of PIM with different
values of X is given by

    \hat{\rho}_X = \hat{\rho} \sum_{x=1}^{X} (1 - \hat{\rho})^{x-1}    (2.6)

Table 2.1 summarises the numerical results of Eq. (2.6) for X equal to 1 to 5.
These results match those obtained in our simulation and can be validated by
referring to Figure 2.5. With a single iteration of PIM, it can be seen that the
switch begins to saturate when the offered load approaches 0.6. The situation
is similar for the cases with 2 iterations and more.
Table 2.1: Number of Iterations of PIM versus Maximum Throughput
Number of Iterations  |   1   |   2   |   3   |   4   |   5
Maximum Throughput    | 0.632 | 0.865 | 0.950 | 0.981 | 0.993
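The entries of the table can be reproduced by evaluating Eq. (2.6) directly.
The short Python sketch below does so, taking $\hat{\rho} = 1 - e^{-1}$ as the
large-N single-iteration throughput; the function name is an assumption made
for illustration.

```python
import math

def pim_max_throughput(iterations, rho_hat=1.0 - math.exp(-1.0)):
    """Eq. (2.6): maximum throughput of PIM with the given number of iterations."""
    return rho_hat * sum((1.0 - rho_hat) ** (x - 1)
                         for x in range(1, iterations + 1))

# Maximum throughput for X = 1, ..., 5 iterations.
throughputs = [pim_max_throughput(x) for x in range(1, 6)]
```

Since Eq. (2.6) is a geometric series, it can also be written as
$1 - (1 - \hat{\rho})^X$, which makes clear that each additional iteration
removes a fixed fraction of the remaining unmatched ports. The computed values
agree with the tabulated ones to within 0.001.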
Analytical Results of EPIM 
The advance booking of tokens is the key distinction between EPIM and the
original PIM. Theoretically, there is no limit on the number of time slots that
an output port can book in advance. However, the analysis is hard when no such
limit is imposed. So, for simplicity, let us assume that output ports can book
at most 1 token in advance. In other words, the location of the first available
token at an output port is restricted to the current time slot and the one next
to it. For the rest, we adopt the same traffic loading and switch size
assumptions as in the previous analysis.
Consider the case when two iterations of EPIM are invoked in each time slot.
Suppose that, at the beginning of each time slot, a fraction $\rho^*$ of the N
input/output ports are matched due to the schedule brought forward from the
previous time slot. The maximum throughput in this case is:

    \hat{\rho}_2 = \rho^* + (1 - \rho^*)[\hat{\rho} + \hat{\rho}(1 - \hat{\rho})]
                 = \rho^* + 0.865(1 - \rho^*)
                 = 0.865 + 0.135\rho^*    (2.7)

where $\hat{\rho} = 0.632$.
To find $\hat{\rho}_2$, we have to determine the value of $\rho^*$, because we
cannot yet tell how many of the input ports were scheduled in the previous time
slot. In the first iteration of this time slot, since a fraction $\rho^*$ of the
input ports were scheduled previously, only the remaining unmatched input ports,
$(1 - \rho^*)N$ in total, are eligible to send requests during the Request phase
of the same iteration. In that case, $\rho^* N$ output ports, which are all
currently scheduled, will each randomly select one of the $(1 - \rho^*)N$
requests to grant. These grants belong to the next time slot. Thus, in the
Accept phase of this iteration, the probability that any of the $(1 - \rho^*)N$
input ports receives this type of grant (i.e. a grant belonging to the next
time slot) is given by:

    P_1 = \Pr\{\text{receive grants that belong to next time slot in 1st iteration}\}
        = 1 - \left(1 - \frac{1}{(1 - \rho^*)N}\right)^{\rho^* N}
        = 1 - \left[\left(1 - \frac{1}{(1 - \rho^*)N}\right)^{(1 - \rho^*)N}\right]^{\frac{\rho^*}{1 - \rho^*}}
        \approx 1 - e^{-\frac{\rho^*}{1 - \rho^*}}    (2.8)
After the first iteration, only $(1 - \rho^*)(1 - \hat{\rho})N$ input ports
remain unmatched and will send requests to their destination output ports in the
next iteration. Since an output port can book one token in advance, only
$[\rho^* + \rho^*(1 - \hat{\rho}) - P_1]N$ of the N output ports will make a
grant to the requesting input ports for the next time slot. Hence, the
probability that an input port receives grants which are issued by these output
ports and belong to the next time slot is given by:

    P_2 = \Pr\{\text{receive grants that belong to next time slot in 2nd iteration}\}
        = 1 - \left(1 - \frac{1}{(1 - \rho^*)(1 - \hat{\rho})N}\right)^{[\rho^* + \rho^*(1 - \hat{\rho}) - P_1]N}
        = 1 - \left[\left(1 - \frac{1}{(1 - \rho^*)(1 - \hat{\rho})N}\right)^{(1 - \rho^*)(1 - \hat{\rho})N}\right]^{\frac{\rho^* + \rho^*(1 - \hat{\rho}) - P_1}{(1 - \rho^*)(1 - \hat{\rho})}}
        \approx 1 - e^{-\frac{\rho^* + \rho^*(1 - \hat{\rho}) - P_1}{(1 - \rho^*)(1 - \hat{\rho})}} \quad \text{when } N \to \infty    (2.9)
The sum of $P_1$ and $P_2$ is the fraction of output ports that are scheduled
in the next time slot after two iterations of EPIM in the current time slot.
Therefore, we have

    \rho^* = P_1 + P_2    (2.10)

Solving Eq. (2.8), (2.9) and (2.10) yields $\rho^* = 0.49$, $P_1 = 0.315$, and
$P_2 = 0.175$. Substituting $\rho^* = 0.49$ into Eq. (2.7), we have

    \hat{\rho}_2 = 0.865 + 0.135 \times 0.49 = 0.931    (2.11)
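The arithmetic of Eq. (2.7) and (2.11) can be checked with the short Python
sketch below. It only evaluates Eq. (2.7) for a given $\rho^*$; the value 0.49
is taken from the analysis above rather than recomputed, and the function name
is an assumption made for illustration.

```python
import math

def epim_two_iteration_throughput(rho_star, rho_hat=1.0 - math.exp(-1.0)):
    """Eq. (2.7): maximum throughput of EPIM with two iterations.

    rho_star: fraction of ports already matched by the schedule carried
              over from the previous time slot.
    rho_hat:  single-iteration PIM throughput (large-N limit by default).
    """
    pim_two_iterations = rho_hat + rho_hat * (1.0 - rho_hat)  # Eq. (2.5)
    return rho_star + (1.0 - rho_star) * pim_two_iterations

value = epim_two_iteration_throughput(0.49)
```

Setting `rho_star` to zero recovers the two-iteration PIM throughput of
Eq. (2.5), which shows that the gain of EPIM in this model comes entirely from
the ports that were booked in advance.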
This is the maximum throughput attainable by an input-buffered switch
employing EPIM with 2 iterations. Note that in this analysis, we assume that
only the token belonging to the next time slot can be booked in advance.
By a similar rationale, we have calculated the analytical maximum throughput
of EPIM for different numbers of iterations invoked in each time slot. These
results are summarised in the following table:
Table 2.2: Number of Iterations of EPIM versus Maximum Throughput
Number of Iterations  |   1   |   2   |   3   |   4
Maximum Throughput    | 0.632 | 0.931 | 0.979 | 0.992
In EPIM, there should be no difference from the original PIM in terms of
performance when only one iteration is invoked in each time slot. However,
when two or more iterations are invoked, the advance booking of tokens in EPIM
allows us to achieve better delay and throughput. From the above analysis, it
can be seen that the maximum throughput of EPIM with two to four iterations
varies from 0.931 to 0.992. This is in agreement with the result shown in
Figure 2.5.
Chapter 3 
Providing Bandwidth Guarantee 
in Input-Buffered Switches 
3.1 Introduction 
This section demonstrates how EPIM can provide bandwidth guarantees to calls
by being incorporated with a static scheduling algorithm. It should be noted
that certain applications, especially multimedia ones, require a fixed minimum
bandwidth to be allocated along the virtual circuits from sources to
destinations. On initiating a connection, an application has to declare its
offered load and bandwidth requirement explicitly to the network switches. These
parameters are used by the call admission control protocol, which judges whether
the switch can admit the request. The feasibility of this method rests on the
assumption that the connection pattern at call level does not change frequently,
so it pays for the switch to pre-determine a schedule during call setup to
handle the reservations. The call admission criterion and the mechanism for
determining a schedule with bandwidth guarantee will be described shortly.
3.2 Bandwidth Reservation in Static Scheduling Algorithm
In our approach, we assume the capacity of links to be normalized to 1, which
is also the upper limit of the bandwidth requirement of an input-output
connection. Let $b_{ij}$ be the bandwidth requirement between input i and output
j. Let $B = [b_{ij}]$ be the bandwidth requirement matrix, which contains the
bandwidth reservations between all possible input-output combinations.

    B = \begin{pmatrix}
        b_{0,0}   & b_{0,1}   & \cdots & b_{0,N-1} \\
        b_{1,0}   & b_{1,1}   & \cdots & b_{1,N-1} \\
        \vdots    & \vdots    & \ddots & \vdots \\
        b_{N-1,0} & b_{N-1,1} & \cdots & b_{N-1,N-1}
        \end{pmatrix}

Since no over-subscription of a transmission line is allowed,

    \sum_{i=0}^{N-1} b_{ij} \le 1, \quad \text{for all } j    (3.1)

    \sum_{j=0}^{N-1} b_{ij} \le 1, \quad \text{for all } i    (3.2)

where (3.1) and (3.2) reflect that the total normalised bandwidth available
in each link is limited by 1.
The next step is to realize the bandwidth requirement in our schedule. The time
axis is divided into frames consisting of a fixed number of time slots, where a
time slot is equivalent to the transmission time of one ATM cell. We then
convert the above bandwidth requirement into the number of cells transmitted in
each frame. This is done by simply multiplying $b_{ij}$ by the frame size F.
Reserving $c_{ij} = \lceil b_{ij} \times F \rceil$ slots in every frame ensures
a bandwidth of at least $b_{ij}$ on average in the long run. Notice that
$c_{ij}$ must be a non-negative integer for all i and j. The following matrix
summarizes the number of cells to be sent from input i to output j within a
frame:

    C = \lceil F \times B \rceil
      = \begin{pmatrix}
        c_{0,0}   & c_{0,1}   & \cdots & c_{0,N-1} \\
        c_{1,0}   & c_{1,1}   & \cdots & c_{1,N-1} \\
        \vdots    & \vdots    & \ddots & \vdots \\
        c_{N-1,0} & c_{N-1,1} & \cdots & c_{N-1,N-1}
        \end{pmatrix}

We call the above matrix the capacity matrix $C = [c_{ij}]$. It should be noted
that the sum of all entries in a column must not exceed F, and the same holds
for the sum of all entries in a row. The reason is that the upper bound on the
number of cells sent (received) by an input (output) within a frame is F. That
is:

    \sum_{i=0}^{N-1} c_{ij} \le F, \quad \text{for all } j    (3.3)

    \sum_{j=0}^{N-1} c_{ij} \le F, \quad \text{for all } i    (3.4)

where (3.3) and (3.4) mean that at most F cells can be processed by an input
or output port in each frame.
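The conversion from the bandwidth requirement matrix to the capacity matrix,
together with the check of (3.3) and (3.4), can be sketched in Python as
follows. The function names and the 2 x 2 matrix used here are illustrative
assumptions; exact fractions are used so that the ceiling is not disturbed by
floating-point rounding.

```python
from fractions import Fraction
from math import ceil

def capacity_matrix(B, F):
    """Return C with c_ij = ceil(b_ij * F), the slots reserved per frame."""
    return [[ceil(b * F) for b in row] for row in B]

def is_admissible(C, F):
    """Check (3.3) and (3.4): every row sum and column sum is at most F."""
    n = len(C)
    rows_ok = all(sum(C[i]) <= F for i in range(n))
    cols_ok = all(sum(C[i][j] for i in range(n)) <= F for j in range(n))
    return rows_ok and cols_ok

# Hypothetical reservations for a 2 x 2 switch with frame size 32.
B = [[Fraction(3, 10), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(1, 16)]]
C = capacity_matrix(B, 32)
admissible = is_admissible(C, 32)
```

Because of the rounding up, a request set that satisfies (3.1) and (3.2) may
still fail (3.3) or (3.4) for a small frame size, so the check is applied to C
rather than to B.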
The capacity matrix is then used to determine a conflict-free schedule within
each frame. By conflict-free we mean that, in the same time slot, no more than
one input is connected to a single output. The connection pattern of the
crossbar switch is set up and varies from slot to slot. The whole schedule is
repeated every F time slots. The schedule is recalculated whenever any entry in
the capacity matrix is altered. The problem is well formulated and solutions
have been described in the literature [1, 14, 19, 24]. Here, we adopt the
approach of bipartite graph edge coloring to handle the problem.
Consider an N x N internally non-blocking switch which implements VOQ at its
input ports. The relationship between this switch and the capacity matrix can
be visualized in the following figure:
[Figure: capacity matrix, where entry c_ij is the number of tokens allocated
for input i to output j, beside the corresponding bipartite graph]
Figure 3.1: Capacity Matrix and its corresponding Bipartite Graph.
In the above diagram, the nodes represent the input and output ports, while
an edge between input i and output j is considered as a connection between this
input-output pair. Let G = G(I, O) be the above bipartite graph, where I and
O are the sets of input and output nodes respectively. By Eq. (3.3) and (3.4),
the degree of G is at most F.
Figure 3.1 shows the aggregate connection requests between the input and
output sides within a frame's time (F time slots). As stated before, no two
inputs (outputs) should be connected to the same output (input) in a time slot.
Determining a conflict-free schedule of the switch can be modeled as the problem
of finding an edge coloring for this bipartite graph with F colors, where each
color represents a time slot within a frame. For the schedule to be
conflict-free, no two adjacent edges in this edge coloring can be of the same
color. In graph theory, it has been proven that a regular bipartite graph of
degree F is F-colorable, meaning that F distinct colors are enough to find an
edge coloring with this property. The resulting edge coloring is in one-to-one
correspondence with a conflict-free frame schedule of the switch. This can be
illustrated more clearly by the example in Figure 3.2.
[Figure: (a) bipartite graph with edges colored by time slot; (b) resulting
frame schedule for F = 6, showing the token reserved for input i in each slot]
Figure 3.2: An example of Static Scheduling Algorithm (N=2, F=6).
In our example, suppose that the switch has N = 2 and F = 6, and that the
following are the bandwidth requirement matrix and the corresponding capacity
matrix faced by the switch:

    B = \begin{pmatrix} 1/3 & 2/3 \\ 2/3 & 1/3 \end{pmatrix}

    C = 6 \times \begin{pmatrix} 1/3 & 2/3 \\ 2/3 & 1/3 \end{pmatrix}
      = \begin{pmatrix} 2 & 4 \\ 4 & 2 \end{pmatrix}

The above capacity matrix is converted to the bipartite graph of degree six
shown in Figure 3.2(a). We then apply the edge-coloring technique to this graph
using six colors, each of which represents a specific time slot within a frame.
This colored bipartite graph is used to construct the frame schedule of the
switch. The resulting schedule fulfils the bandwidth requirement of the
connections on average (Figure 3.2(b)).
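One possible way to carry out the edge coloring is the classical alternating
path method for bipartite multigraphs, sketched below in Python. This is an
illustration under stated assumptions, not necessarily the procedure used in
the cited literature: the function name `edge_colour_schedule` and the table
layout (one row per input, one column per time slot, -1 for an idle slot) are
choices made here.

```python
def edge_colour_schedule(C, F):
    """Colour the edges of the capacity matrix C with at most F colours.

    Returns in_slot, where in_slot[i][t] is the output connected to input i
    in time slot t of the frame, or -1 if input i is idle in that slot.
    Assumes every row sum and column sum of C is at most F.
    """
    n = len(C)
    in_slot = [[-1] * F for _ in range(n)]   # per input: output used in slot t
    out_slot = [[-1] * F for _ in range(n)]  # per output: input used in slot t
    for i in range(n):
        for j in range(n):
            for _ in range(C[i][j]):
                a = in_slot[i].index(-1)    # a slot still free at input i
                b = out_slot[j].index(-1)   # a slot still free at output j
                if out_slot[j][a] != -1:
                    # Slot a is busy at output j: swap slots a and b along
                    # the alternating path that starts at output j.
                    path = []
                    node, at_output, colour = j, True, a
                    while True:
                        table = out_slot if at_output else in_slot
                        nxt = table[node][colour]
                        if nxt == -1:
                            break
                        edge = (nxt, node) if at_output else (node, nxt)
                        path.append((edge[0], edge[1], colour))
                        node, at_output = nxt, not at_output
                        colour = b if colour == a else a
                    for u, v, c in path:           # erase the old colours
                        in_slot[u][c] = -1
                        out_slot[v][c] = -1
                    for u, v, c in path:           # write the swapped colours
                        swapped = b if c == a else a
                        in_slot[u][swapped] = v
                        out_slot[v][swapped] = u
                in_slot[i][a] = j
                out_slot[j][a] = i
    return in_slot

# Capacity matrix of the example above (N = 2, F = 6).
schedule = edge_colour_schedule([[2, 4], [4, 2]], 6)
```

Because the graph is bipartite, the alternating path never reaches input i, so
slot a is free at both ends of the new edge after the swap. The particular
schedule produced depends on the order in which edges are inserted and need not
coincide with the one drawn in Figure 3.2(b).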
Another example is given in Figure 3.3. Here, we consider a 3 x 3 switch with
frame size 10 and the following bandwidth requirement and capacity matrices:

    B = \begin{pmatrix} 0.2 & 0 & 0.1 \\ 0.1 & 0.2 & 0 \\ 0 & 0.2 & 0 \end{pmatrix}

    C = 10 \times \begin{pmatrix} 0.2 & 0 & 0.1 \\ 0.1 & 0.2 & 0 \\ 0 & 0.2 & 0 \end{pmatrix}
      = \begin{pmatrix} 2 & 0 & 1 \\ 1 & 2 & 0 \\ 0 & 2 & 0 \end{pmatrix}

Unlike the previous example, the input ports do not reserve all the bandwidth
of the output links, because

    \sum_{i=0}^{2} b_{ij} < 1, \quad \text{for all } j    (3.5)

    \sum_{j=0}^{2} b_{ij} < 1, \quad \text{for all } i    (3.6)
Figure 3.3 shows the connection bipartite graph and the corresponding static
frame schedule of this example. Note that only 3 colors, instead of 10, are
enough to complete the edge-coloring task, as the degree of this bipartite graph
is 3. This is equivalent to a maximum reservation of 3 tokens per frame at each
output port.
[Figure: bipartite graph with edges colored by time slot, and the resulting
frame schedule for F = 10, showing the token reserved for input i in each slot]
Figure 3.3: Capacity Matrix and its corresponding Bipartite Graph.
Both input and output sides will be notified of this schedule. Connections
requiring bandwidth will transmit their cells in the corresponding reserved time
slots accordingly. It can be seen that the larger the frame size, the finer the
bandwidth granularity will be. For example, let F equal 10 and $b_{ij}$ be | .
Since the number of slots allocated must be an integer, 2 time slots must be
reserved for this connection, despite the fact that $b_{ij} \times F$ equals
1.6. In the long run, the bandwidth reserved exceeds that required by a portion
(0.4 on average). However, if F equals 16, $b_{ij} \times F$ gives 2 exactly,
thereby avoiding the excessive reservation due to round-off. The downside of a
large F is that the degree of the corresponding bipartite graph is higher and
the size of the schedule to be kept track of by the scheduler is greater. In
addition, larger frames introduce longer delay to cells. As a result, there is
a trade-off between bandwidth granularity on the one hand and, on the other, the
resources needed to obtain and maintain the resulting schedule as well as the
delay experienced by cells.
3.3 Incorporation of Dynamic and Static Scheduling Algorithms
Supporting connections of both best-effort and bandwidth-guaranteed service
categories is essential for switches in real networking scenarios. Researchers
have suggested a number of methods to deal with this issue in input-buffered
switches based on PIM [1, 8]. We are now ready to show how to incorporate the
two scheduling algorithms to achieve our scheduling objective.
As stated before, the transmission opportunity of an output in each time slot
can be viewed as a token, and each token is labeled by the time slot to which it
belongs. We call the value of this time slot label the expiry time of the token.
For cells belonging to the best-effort service category, EPIM resolves the
output contention problem via its three-phase algorithm and allocates as many
tokens to inputs as possible in each iteration. It also allows advance booking
of tokens so as to reduce the number of iterations needed to attain a maximal
matching. The static scheduling algorithm reserves a certain number of time
slots, which can also be visualized as tokens, within a frame for those
connections requiring bandwidth guarantees. These reserved tokens are dedicated
solely to the corresponding connection; otherwise, that connection cannot
receive the bandwidth guarantee it deserves.
To combine these two schemes, we have to make sure that the advance booking
of tokens employed by EPIM does not affect the schedule pre-computed by the
static scheduling algorithm. This can be done by forbidding EPIM to book in
advance those tokens which have already been reserved. If the reserved token's
expiry time becomes the current time slot and the associated connection has a
cell for it, the input will send a Connect Signal. That means a connection must
be set up between this input-output pair regardless of any request received by
the output. However, if no cell from the reserved connection is destined for the
output when the token reaches its expiry time, we release the token and allow
inputs to compete for it using EPIM as usual. Let us illustrate the modification
through an example (Figure 3.4).
[Figure panels: Request, Grant, Accept. Legend: Connect Signal; ordinary signal
used in the dynamic scheduling algorithm; best-effort cells; token booked via
the dynamic scheduling algorithm]
Figure 3.4: Example of Incorporating EPIM with Static Scheduling Algorithm.
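The two rules above, namely that reserved tokens cannot be booked in advance
and that an unused reserved token is released at its expiry time, can be
sketched in Python as follows. All names here (`resolve_current_token`,
`first_bookable_slot`, the per-output reservation list indexed by slot position
within the frame, and the `horizon` parameter limiting how far ahead EPIM may
book) are hypothetical and introduced only for illustration.

```python
NO_RESERVATION = -1

def resolve_current_token(reserved_input, has_guaranteed_cell):
    """Decide how an output's token is used once its expiry time is reached."""
    if reserved_input != NO_RESERVATION and has_guaranteed_cell:
        return "connect"  # Connect Signal: the reserved pair is set up directly
    return "contend"      # unreserved or released token: inputs compete via EPIM

def first_bookable_slot(now, frame_reservation, booked_slots, horizon):
    """Earliest future slot whose token EPIM may book in advance, or None.

    frame_reservation[t] is the input holding the reservation in slot
    position t of the frame, or NO_RESERVATION. Assumes absolute slot
    numbers map to frame positions by taking them modulo the frame size.
    """
    frame_size = len(frame_reservation)
    for slot in range(now + 1, now + 1 + horizon):
        reserved = frame_reservation[slot % frame_size] != NO_RESERVATION
        if not reserved and slot not in booked_slots:
            return slot
    return None

# Output with frame size 4: positions 0 and 2 reserved, slot 1 already booked.
next_slot = first_bookable_slot(0, [0, -1, 15, -1], {1}, horizon=3)
```

In this sketch a reserved token only becomes available to best-effort traffic
at its own expiry time, which keeps the pre-computed frame schedule intact.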
We expect that the number of iterations needed to achieve a maximal matching
should be comparable to or even smaller than that of EPIM alone. This is because
part of the schedule (i.e. that with bandwidth reservation) is pre-determined by
the scheduler during call setup, and the remaining bipartite graph should be of
smaller size. As a result, the problem of finding a maximal matching is further
reduced in size, and hence fewer iterations are needed to achieve the task.
3.4 Simulation 
To investigate the performance of incorporating EPIM with the static scheduling
algorithm, a series of simulations with different token allocation patterns have
been run. We compare the performance of EPIM with that of PIM after the
introduction of static scheduling in both cases. The token distribution
patterns of interest are: (i) the evenly distributed token allocation pattern,
and (ii) the aligned token allocation pattern.
3.4.1 Switch Model 
The switch model used in our simulations is an extended version of the one
described earlier. Again, the performance of a 16x16 input-buffered switch
is studied. The frame size of our switch is 32. Virtual Output Queueing (VOQ)
is still used in the input buffer at each input port. However, since the switch
now accommodates two types of cells, each virtual output queue at an input
port is split into two separate queues, one for best-effort cells and the other
for bandwidth-guaranteed cells. The new buffer management scheme is depicted in
Figure 3.5(a).
[Figure: (a) input buffer organisation, with separate virtual output queues for
bandwidth-guarantee cells and best-effort service cells in front of an
internally non-blocking switching fabric; (b) offered load at an input split
between bandwidth-guarantee and best-effort traffic, each spread over outputs
1 to 16]
Figure 3.5: (a) Switch Model and Buffer Management for EPIM with static
scheduling, (b) Traffic characteristics used in simulations.
As before, uniform incoming traffic is assumed in the simulations. Let $\rho$
be the offered load at each input link. A cell arrives at an input link with
probability $\rho$ and belongs to either the best-effort or the
bandwidth-guaranteed service category by random selection. Each output is
equally likely to be the destination of the cell. This relationship is also
shown in Figure 3.5(b).
The static scheduling algorithm introduced in this chapter uses a framing
strategy and a token allocation algorithm to achieve the bandwidth reservation.
It should be noted that no matter how the tokens are distributed over the frame,
the bandwidth reserved for an input-output connection is guaranteed provided
that the number of tokens allocated to the connection is correctly calculated
as described earlier. For example, if the frame size equals 32 and input i
requires a normalized bandwidth reservation of 0.3 to output j, then a total of
$\lceil 0.3 \times 32 \rceil = 10$ tokens of output j have to be allocated to
input i in each frame. As far as bandwidth reservation is concerned, it does not
matter how the 10 tokens are distributed over the frame. However, when
considering the delay and delay jitter experienced by a connection, the token
distribution pattern, and hence the algorithm used to determine it, does matter.
For simplicity, we assume the bandwidth reservation matrix $B = [b_{ij}]$ where
$b_{ij}$ equals $\frac{1}{32}$ for all i and j. Therefore, the number of tokens
allocated to each input-output pair per frame is 1 ($= \frac{1}{32} \times 32$).
Yet, how these tokens are distributed over a frame remains to be determined.
3.4.2 Simulation Results 
In this section, we compare the results obtained from EPIM and PIM after
incorporating the static scheduling algorithm described previously. In
particular, two types of token allocation patterns, the evenly distributed and
the aligned ones, are used in our simulations under uniform and homogeneous
traffic.
For the sake of clarity, the token allocation at each output port within a frame
is represented by the token distribution matrix $D = [d_{ij}]$, where $d_{ij}$
equals the number of the input port for which output i holds a reservation in
the j-th time slot of the frame, or equals -1 if there is no bandwidth
reservation associated with output i in the j-th time slot. The dimension of D
is N x F, where N is the number of output ports of the switch and F is the frame
size of our algorithm. In our simulations, we used the following two matrices as
the token distribution for those connections requiring bandwidth guarantee.
A. Evenly Distributed Token Assignment 
By evenly distributed token assignment we mean that the tokens are distributed
over a frame in a manner that is as uniform as possible. Thus, for the bandwidth
reservation described in the previous section, we have the token distribution
matrix $D_{even}$ of the following form:
$$
D_{even} =
\begin{pmatrix}
0 & -1 & 15 & -1 & \cdots & 1 & -1 \\
1 & -1 & 0 & -1 & \cdots & 2 & -1 \\
\vdots & & & & & & \vdots \\
15 & -1 & 14 & -1 & \cdots & 0 & -1
\end{pmatrix}
$$
B. Aligned Token Assignment 
Aligned token assignment is the exact opposite of the evenly distributed token
assignment described above. All the reserved tokens are aligned together instead
of being distributed uniformly over the frame. As a result, we have the token
distribution matrix, $D_{aligned}$, of the following form:
$$
D_{aligned} =
\begin{pmatrix}
0 & 15 & 14 & 13 & \cdots & -1 & -1 \\
1 & 0 & 15 & 14 & \cdots & -1 & -1 \\
\vdots & & & & & & \vdots \\
15 & 14 & 13 & 12 & \cdots & -1 & -1
\end{pmatrix}
$$
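The two token patterns can also be generated programmatically. The following Python sketch is illustrative only: it assumes N = 16 and F = 32 (the values implied by the matrices), and it assumes that the elided entries follow the cyclic-shift rule suggested by the visible rows and columns. The function names are hypothetical and do not come from the thesis.

```python
def even_tokens(n=16, frame=32):
    """Evenly distributed pattern: reserved slots are spread over the frame."""
    stride = frame // n                      # distance between reserved slots
    d = [[-1] * frame for _ in range(n)]     # -1 marks an unreserved slot
    for i in range(n):                       # i: output port (row)
        for t in range(n):
            d[i][stride * t] = (i - t) % n   # assumed cyclic-shift rule
    return d


def aligned_tokens(n=16, frame=32):
    """Aligned pattern: all reserved slots sit at the start of the frame."""
    d = [[-1] * frame for _ in range(n)]
    for i in range(n):
        for t in range(n):
            d[i][t] = (i - t) % n            # assumed cyclic-shift rule
    return d


def is_conflict_free(d):
    """True if no input is reserved by two outputs in the same time slot."""
    for slot in zip(*d):                     # iterate over columns (time slots)
        reserved = [x for x in slot if x != -1]
        if len(reserved) != len(set(reserved)):
            return False
    return True
```

Under these assumptions every output reserves each input exactly once per frame, and no input is reserved twice in the same time slot, which is the non-conflicting property a static schedule needs.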
Figure 3.6 and Figure 3.7 show the overall mean cell delay of EPIM and PIM with
static scheduling versus offered load. In both cases, the performance of EPIM
with two iterations is comparable with that of PIM with four iterations. This
implies that we can reduce the number of iterations used, and hence improve the
scheduling speed, by employing EPIM with static scheduling to cater for both
best-effort and bandwidth guarantee connections. Moreover, after incorporating
the static scheduling algorithm, both EPIM and PIM perform at least as well as
before.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: EPIM, 2-4
iterations; PIM, 1-4 iterations.]
Figure 3.6: Overall Delay Performance of EPIM and PIM with Static Scheduling
- Evenly Distributed Token Assignment.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: EPIM, 2-4
iterations; PIM, 1-4 iterations.]
Figure 3.7: Overall Delay Performance of EPIM and PIM with Static Scheduling
- Aligned Token Assignment.
In our switch model, we store the best-effort and bandwidth guarantee cells
with identical output destination in different input queues. This allows us to
study the delay performance of these two types of cells under different token
distribution patterns separately. The results are shown in Figures 3.8, 3.9,
3.10, and 3.11. Under our input traffic and bandwidth requirement assumptions,
there is not much difference in the mean delay experienced by these two types of
cells between the evenly distributed and aligned token assignments. In addition,
the results are more or less the same as the overall delay performance of the
switch. This means that the introduction of bandwidth guarantee cells and static
scheduling to the switch does not affect the performance of the scheduling
algorithm, assuming that the traffic pattern and bandwidth requirement of the
connections follow those described in the previous section.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: EPIM, 2-4
iterations; PIM, 1-4 iterations.]
Figure 3.8: Delay Performance of Statically Scheduled Cells - Evenly Distributed
Token Assignment.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: EPIM, 2-4
iterations; PIM, 1-4 iterations.]
Figure 3.9: Delay Performance of Statically Scheduled Cells - Aligned Token
Assignment.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: EPIM, 2-4
iterations; PIM, 1-4 iterations.]
Figure 3.10: Delay Performance of Dynamically Scheduled Cells - Evenly
Distributed Token Assignment.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: EPIM, 2-4
iterations; PIM, 1-4 iterations.]
Figure 3.11: Delay Performance of Dynamically Scheduled Cells - Aligned Token
Assignment.
Owing to the uniformity of the incoming traffic pattern and bandwidth
requirement, the average normalized bandwidth allocated to an input-output pair
is $\frac{1}{16}$, where $\frac{1}{32}$ comes from the static bandwidth
reservation and the other $\frac{1}{32}$ ($= \frac{16}{32} \times \frac{1}{16}$)
is allocated by the dynamic EPIM scheduling algorithm during non-reserved time
slots. On the other hand, the amount of output bandwidth allocated to each input
by the original PIM is also $\frac{1}{16}$. This accounts for the similarity in
delay performance in all the different cases.
3.5 Comparison with Existing Schemes 
3.5.1 Statistical Matching 
A generalization of parallel iterative matching called Statistical Matching was
proposed in [1] for supporting performance guaranteed connections. It was
claimed that statistical matching makes better and more systematic use of
randomness in choosing which request to grant and which grant to accept.
Moreover, with statistical matching, the scheduler can dynamically provide
real-time bandwidth guarantee to ABR traffic flows while ensuring fairness in
network resource allocation. The pairing of inputs to outputs is chosen
independently for each time slot, but on average, each flow is scheduled
according to its specified throughput rate.
Statistical matching is very much like PIM in the sense that it makes use of
randomness and iterations to compute the matching among inputs and outputs. The
key difference between the two algorithms is the absence of a Request phase in
statistical matching.
Unlike PIM, a connection request here is initiated by the output ports in the
Grant phase. In statistical matching, allocatable bandwidth is divided into X
discrete units per link, and $X_{ij}$ denotes the number of units allocated to
traffic from input i to output j. The main point is that the scheduler arranges
the random weighting factors at the inputs and outputs so that each input
receives up to X virtual grants, each made independently with probability
$\frac{1}{X}$. $X_{ij}$ of those X potential virtual grants to input i are
associated with output j.
The outline of statistical matching is as follows:
1. Each output randomly chooses one input to grant; output j chooses input i
with probability $\frac{X_{ij}}{X}$, proportional to the bandwidth reservation.
2. Each input chooses at most one grant to accept (it may accept none) in a
two-step process:
• Each input i reinterprets each grant it receives as a random number $m_{ij}$
of virtual grants, chosen between 0 and $X_{ij}$ according to a binomial
distribution with parameter $\frac{1}{X}$:
$$
\Pr\{m_{ij} = m,\ 0 < m \le X_{ij}\} = \binom{X_{ij}}{m} \times \left(\frac{1}{X}\right)^{m} \times \left(\frac{X-1}{X}\right)^{X_{ij}-m} \times \frac{X}{X_{ij}}
$$
$$
\Pr\{m_{ij} = 0\} = 1 - \Pr\{1 \le m_{ij} \le X_{ij}\}
$$
When j does not grant to i, $m_{ij}$ is set to zero.
• If an input receives any virtual grants, the input chooses one randomly to
accept. In other words, the input chooses among granting outputs with
probability proportional to the number of virtual grants from each output:
$$
\Pr\{i \text{ accepts } j\} = \frac{m_{ij}}{\sum_{k} m_{ik}}
$$
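The two steps outlined above can be sketched in Python as follows. This is a minimal illustration rather than the implementation from [1]: the function names, the `alloc` matrix layout (`alloc[i][j]` holding $X_{ij}$) and the use of Python's `random` module are assumptions made for the example, and the virtual-grant probabilities follow the binomial expression as reconstructed above.

```python
import math
import random


def virtual_grant_pmf(x_ij, X):
    """Distribution of m_ij, given that output j has granted to input i."""
    pmf = [0.0] * (x_ij + 1)
    for m in range(1, x_ij + 1):
        binom = math.comb(x_ij, m) * (1 / X) ** m * ((X - 1) / X) ** (x_ij - m)
        pmf[m] = binom * X / x_ij            # normalised by X / X_ij
    pmf[0] = max(0.0, 1.0 - sum(pmf[1:]))    # clamp guards against rounding
    return pmf


def statistical_match(alloc, X, rng):
    """One round of matching; returns a dict mapping input -> accepted output."""
    n = len(alloc)
    virtual = {}                             # input -> [(output, m_ij), ...]
    for j in range(n):
        # Step 1: output j grants to input i with probability X_ij / X
        # (and to nobody with the remaining probability).
        r, acc = rng.random() * X, 0
        for i in range(n):
            acc += alloc[i][j]
            if r < acc:
                pmf = virtual_grant_pmf(alloc[i][j], X)
                m = rng.choices(range(len(pmf)), weights=pmf)[0]
                if m > 0:
                    virtual.setdefault(i, []).append((j, m))
                break
    match = {}
    for i, grants in virtual.items():
        # Step 2: accept one output, weighted by its number of virtual grants.
        outputs = [j for j, _ in grants]
        weights = [m for _, m in grants]
        match[i] = rng.choices(outputs, weights=weights)[0]
    return match
```

For example, `statistical_match([[3, 2], [2, 3]], 10, random.Random(1))` returns a (possibly empty) set of input-output pairs in which no output appears twice. Note that the sketch ignores queue occupancy, so an output may grant to an input that has nothing to send, which is the wasted-grant effect discussed in the text.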
It was shown in [1] that statistical matching can reserve up to 72% of a link's
throughput, in which the allocation can be of any pattern provided that the sum
of throughput at any input or output is less than 72%. Any network bandwidth
that is not used by statistical matching will be filled with other traffic by
parallel iterative matching.
Despite the fact that statistical matching is capable of dynamically allocating
bandwidth to individual connections, there is a possibility that an output will
grant to an input which has no cells destined for this output. This iteration is
then wasted, because there may be other inputs that have cells waiting to be
sent to that output port. It is this property that limits the percentage of link
throughput reservation to 72%.
For our proposed scheme, the limit in delivering link throughput to traffic
flows described above is completely eliminated. The static scheduling algorithm
allows an output port to reserve 100% of its link throughput for the inputs.
How an output's tokens are allocated among the input ports over a frame is
determined using an edge-coloring algorithm [14] or the method described in [24].
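To illustrate how edge coloring yields a non-conflicting static schedule, the following Python sketch colors the bipartite multigraph whose edges are the reserved tokens, using the textbook alternating-path argument for bipartite graphs. It is not necessarily the algorithm of [14] or [24]; the function name `color_frame` and the `tokens[i][j]` layout (tokens per frame that output j reserves for input i) are assumptions made for this example.

```python
def color_frame(tokens, frame):
    """Assign each reserved token to a time slot so that no input and no
    output is used twice in a slot. Returns at_out, where at_out[j][t] is the
    input served by output j in slot t, or -1 if the slot is unreserved."""
    n = len(tokens)
    assert all(sum(row) <= frame for row in tokens), "input over-subscribed"
    assert all(sum(col) <= frame for col in zip(*tokens)), "output over-subscribed"
    at_in = [[-1] * frame for _ in range(n)]    # at_in[i][t]  = output or -1
    at_out = [[-1] * frame for _ in range(n)]   # at_out[j][t] = input or -1
    for i in range(n):
        for j in range(n):
            for _ in range(tokens[i][j]):
                a = at_in[i].index(-1)          # a slot free at input i
                b = at_out[j].index(-1)         # a slot free at output j
                if at_out[j][a] != -1:
                    # Slot a is busy at output j: follow the path that
                    # alternates between slots a and b starting at j,
                    # then exchange a and b along it to free slot a at j.
                    path, node, on_out, slot = [], j, True, a
                    while True:
                        peer = at_out[node][slot] if on_out else at_in[node][slot]
                        if peer == -1:
                            break
                        path.append((peer, node, slot) if on_out
                                    else (node, peer, slot))
                        node, on_out = peer, not on_out
                        slot = b if slot == a else a
                    for u, v, t in path:        # remove the old assignments
                        at_in[u][t] = at_out[v][t] = -1
                    for u, v, t in path:        # re-add with a and b swapped
                        s = b if t == a else a
                        at_in[u][s], at_out[v][s] = v, u
                at_in[i][a], at_out[j][a] = j, i
    return at_out
```

As long as no input or output is allocated more than `frame` tokens, the sketch always finds a schedule, which is why 100% of an output's throughput can be reserved. It gives no control over how evenly the tokens are spread, so delay and jitter still depend on the allocation method chosen.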
Apart from the upper bound on link throughput utilization, another shortcoming
of statistical matching is its extensive use of estimation of the binomial
distribution. Such estimation is a challenge to processors, especially when it
has to be done accurately while still keeping up with the extremely high speed
of the external links. For our static scheduling approach, since the static
schedule is pre-computed and stored in a table, choosing which cell to transmit
in a reserved time slot is nothing but a table look-up. Of course, the cost of
this simplification is the effort paid during call setup time to find this
non-conflicting static schedule.
3.5.2 Weighted Probabilistic Iterative Matching 
Another derivative of PIM that provides bandwidth guarantee in input-buffered
switches is Weighted Probabilistic Iterative Matching (WPIM) [8]. WPIM follows
most iterative matching algorithms by having three phases: Request, Grant, and
Accept. An additional Mask phase is introduced in the Grant phase so as to
achieve bandwidth guarantee. It allocates the output bandwidth among the inputs
based on reservations made during the connection setup phase, and can guarantee
that traffic from each input receives its promised share of the bandwidth of the
output link. The algorithm can also isolate misbehaving flows from the rest so
as to protect the bandwidth guarantees of the well-behaving traffic flows.
To characterize the output bandwidth, the time axis in WPIM is divided into
frames with a fixed number of slots per frame. A slot here is equal to the time
taken to transmit a 53-byte cell. The bandwidth reserved for a particular
input-output connection is expressed as a certain minimum number of slots per
frame allocated to it. In the context of WPIM, this number of slots per frame is
called the credits of the connection. If $c_{ij}$ is the number of credits
allocated to the connection from input i to output j, and f is the length of the
frame, then the objective of the WPIM algorithm is to ensure that at least
$c_{ij}$ packets on average can be transmitted from input port i to output port
j during each frame.
Below is the outline of the basic operation of WPIM:
Request  Every input port of the switch that has not yet been matched with one
of the output ports sends a request to all the output ports corresponding to
destinations of packets in its queues.
Mask  On receiving the requests from input ports, each output port creates a
mask consisting of one bit per request as follows: for those inputs that have
transmitted at least as many packets as their credit to the output port in the
current frame, the mask bit is set to 1. For others, the mask bit is set to 0.
Among the requests received by the output port, only those originating at
unmasked input ports (mask bit of 0) are used in the matching process and the
rest are ignored.
Grant  From the requests that remain after the masking stage, the output port
selects one randomly with uniform probability and sends a grant signal to its
originating input port.
Accept  Every unmatched input port that receives one or more grants selects one
with equal probability, and notifies the corresponding output port. The input
and output ports are now matched and can be removed from subsequent iterations.
The algorithm is repeated for a fixed number of times or until no more pending
unmasked requests exist. The additional masking stage can be merged into the
Grant phase in an actual implementation. This masking procedure protects those
connections obeying their bandwidth reservations from the misbehaving ones.
Only residual bandwidth is made available to misbehaving connections. Figure
3.12 is a numerical example demonstrating the operation of WPIM.
However, as a result of the Mask phase, it is possible that some output ports
remain unmatched even when there are packets available to send, because the
corresponding requests were blocked during the Mask phase of the algorithm. To
maximize the number of packets scheduled, if an output port remains unmatched
but all incoming requests are masked, the scheduler clears all its mask bits for
that iteration. This modification shares the residual bandwidth equally among
those pending connection requests without affecting the original allocation of
bandwidth.
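The four phases, together with the mask-clearing modification, can be sketched in Python as below. This is an illustrative reading of the outline above, not the reference implementation from [8]: the function name `wpim_match` and the matrix arguments (`voq[i][j]` for queued cells, `credits[i][j]` for $c_{ij}$, `sent[i][j]` for cells already sent in the current frame) are assumptions made for the example.

```python
import random


def wpim_match(voq, credits, sent, iterations=4, rng=random):
    """Compute the matching for one time slot; returns {input: output}."""
    n = len(voq)
    in_match = {}                    # input  -> matched output
    out_match = {}                   # output -> matched input
    for _ in range(iterations):
        grants = {}                  # input -> outputs that granted to it
        for j in range(n):
            if j in out_match:
                continue
            # Request: unmatched inputs holding a cell for output j.
            reqs = [i for i in range(n)
                    if i not in in_match and voq[i][j] > 0]
            if not reqs:
                continue
            # Mask: drop inputs that have used up their credit in this frame.
            unmasked = [i for i in reqs if sent[i][j] < credits[i][j]]
            # If every request is masked, clear the mask for this iteration.
            candidates = unmasked or reqs
            # Grant: choose one remaining request uniformly at random.
            grants.setdefault(rng.choice(candidates), []).append(j)
        if not grants:
            break                    # no pending requests are left
        # Accept: each input picks one of its grants with equal probability.
        for i, outs in grants.items():
            j = rng.choice(outs)
            in_match[i] = j
            out_match[j] = i
    return in_match
```

The caller is expected to decrement `voq` and increment `sent` for every matched pair after the slot, and to reset `sent` at each frame boundary. For instance, with two inputs both requesting output 0 and only input 0 having exhausted its credit, the sketch returns `{1: 0}`.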
The algorithms of WPIM, PIM and EPIM with performance guarantee are simulated
to study their delay performance. We assume uniform and independent incoming
traffic loading and the same bandwidth requirement as in previous simulations.
Figure 3.13 and Figure 3.14 compare the mean delay experienced by a cell in each
of the algorithms. It is found that the difference in the performance of WPIM
and PIM is negligible: they both approach the performance of an output queueing
switch with 4 iterations. This is due to the fact that traffic from the inputs
follows their bandwidth reservations exactly and any non-reserved bandwidth is
distributed equally among all input and output connections. None of the
connections is masked out in WPIM and therefore its performance is
indistinguishable from that under PIM in this case. The improvement in
performance of EPIM over these two algorithms is the result of the advance
booking of tokens by input and output ports. Again, the scheduler needs to
invoke 2 iterations in each time slot to achieve the same performance when EPIM
with the static scheduling algorithm is implemented in the switch.
[Diagram omitted. Panels: Request, Mask, Grant, Accept.]
Figure 3.12: Example of Weighted Probabilistic Iterative Matching (WPIM)
Algorithm.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: WPIM, 2-4
iterations; EPIM with Static Scheduling, 2-4 iterations.]
Figure 3.13: Delay Performance of EPIM with Static Scheduling and WPIM.
[Plot omitted. Horizontal axis: Offered Load (0 to 1). Curves: WPIM, 1-4
iterations; PIM, 1-4 iterations.]
Figure 3.14: Delay Performance of WPIM and PIM.
Chapter 4 
EPIM and Cross-Path Switch 
4.1 Introduction 
The merit of ATM switching is its capability to support large-scale
telecommunication networks at high speed. The current networking environment is
evolving towards the Broadband Integrated Services Digital Network (BISDN),
which can work with teleservices with different, sometimes yet unknown,
requirements. Meanwhile, the fast development of semiconductor and optical
technology makes it possible for network components to operate at higher speed
and to support higher quality applications. As a result, the constraints from
hardware components can be considered virtually eliminated, and building
large-scale, high-performance switches becomes the ultimate challenge of the
telecommunication industry.
In this chapter, we briefly study a recently proposed large-scale ATM switch
architecture called Cross-Path Switching [14, 19] and investigate how the EPIM
algorithm will benefit the Cross-Path switch in performing its scheduling task
at the input side. Simulations on the performance of the EPIM algorithm in this
switch architecture are done to show the possibility of merging the two schemes
together.




[Diagram omitted. Three stages: k input modules (n × m), m central modules
(k × k), k output modules (m × n).]
Figure 4.1: An N × N Cross-Path Switch.
The Cross-Path switch proposed in [14] is essentially a three-stage Clos network
which adopts a quasi-static technique called Path Switching as the routing
algorithm within the switch. Figure 4.1 shows an N × N Cross-Path switch. The
first stage consists of k ($= \frac{N}{n}$) input modules, each of dimension
n × m. The dimension of each central module in the middle stage is k × k and the
dimension of each output module is m × n. There is a unique physical link
connecting each pair of modules in adjacent stages.
The routing of cells in path switching is based on the concept of virtual path
within the Clos network. A virtual path in the cross-path switch is equivalent
to a logical link comprising all virtual circuits connecting a pair of input and
output modules, and there are altogether $k^2$ virtual paths in a cross-path
switch. A virtual path between input i and output j containing x tokens means
that this input-output pair is connected via x central modules. Every token is
labeled with an output module identifier and the central module used to connect
the input and output modules together. As a result, an interconnection pattern
of the Clos network of a cross-path switch is determined in each time slot. In
path switching, the number of tokens allocated to a virtual path in each time
slot is predetermined, according to the requests of the incoming traffic and the
corresponding bandwidth requirement, for a fixed number of time slots called a
frame. The frame schedule is repeated until new connection requests are
accepted. This technique is called time-space interleaving.
4.2.2 Supporting Performance Guarantee in Cross-Path 
Switch 
The principle of how the cross-path switch can provide both bandwidth and delay
guarantees on virtual paths is studied next. An algorithm which ensures a delay
bound and preserves traffic characteristics when cells pass through the
cross-path switch is proposed in [16]. By incorporating a well-known service
discipline called the stop-and-go queueing strategy, this algorithm is able to
find the token assignment for all virtual paths of the switch under a suitable
admission policy.
Again, let us assume that the bandwidth of each link is normalized to 1. The
admission policy for a new connection request from an input to an output port is
simple: the request will be accepted as long as the addition of this new
connection will not over-subscribe the external bandwidth of a module (i.e. n).
Let $\lambda_{ij}$ denote the aggregate bandwidth requirement of the virtual
path from input i to output j. The following matrix is called the bandwidth
requirement matrix and is used extensively in admission control of the
cross-path switch:
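$$
\begin{pmatrix}
\lambda_{0,0} & \lambda_{0,1} & \cdots & \lambda_{0,k-1} \\
\lambda_{1,0} & \ddots & & \vdots \\
\vdots & & \lambda_{i,j} & \vdots \\
\vdots & & \ddots & \lambda_{k-2,k-1} \\
\lambda_{k-1,0} & \cdots & \lambda_{k-1,k-2} & \lambda_{k-1,k-1}
\end{pmatrix}
$$
where
$$
\sum_{i=0}^{k-1} \lambda_{ij} \le n, \quad \text{for all } j \qquad (4.1)
$$
$$
\sum_{j=0}^{k-1} \lambda_{ij} \le n, \quad \text{for all } i \qquad (4.2)
$$
Eq. (4.1) reflects the fact that there are n ports in each output module, while
Eq. (4.2) reflects the fact that there are n ports in each input module.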
Let $c_{ij}$ be the number of tokens allocated per frame to the virtual path of
input i and output j. In order to make the cross-path switch statistically
stable, we have to ensure that the time-averaged bandwidth is at least as much
as the aggregate bandwidth requirement of a virtual path. That is, the following
inequality must be satisfied:
$$
C_{ij} = \frac{c_{ij}}{F} \ge \lambda_{ij}, \quad \text{for all } i, j \qquad (4.3)
$$
where $C_{ij}$ is called the capacity of the virtual path connecting input i and
output j. Notice that $c_{ij}$ must be a non-negative integer and $\lambda_{ij}$
is a non-negative real number. For a given $\lambda_{ij}$, we calculate the
number of tokens assigned to this virtual path in the following way:
$$
c_{ij} = \lceil \lambda_{ij} \times F \rceil \qquad (4.4)
$$
where $\lceil x \rceil$ is the ceiling of x.
Using Eq. (4.1), (4.2) and (4.4), we can conclude that
$$
\sum_{j=0}^{k-1} c_{ij} \le n \times F + k - 1, \quad \text{for all } i \qquad (4.5)
$$
$$
\sum_{i=0}^{k-1} c_{ij} \le n \times F + k - 1, \quad \text{for all } j \qquad (4.6)
$$
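A brief sketch of why these bounds hold, under the assumptions that the row and column sums of the bandwidth requirement matrix are bounded by n, that n × F is an integer, and that the relations in (4.5) and (4.6) are read as non-strict. Since $\lceil x \rceil < x + 1$ for any real x, summing Eq. (4.4) over the k entries of row i gives

$$
\sum_{j=0}^{k-1} c_{ij} = \sum_{j=0}^{k-1} \lceil \lambda_{ij} F \rceil < F \sum_{j=0}^{k-1} \lambda_{ij} + k \le n F + k .
$$

The left-hand side is an integer strictly smaller than the integer nF + k, so it is at most nF + k - 1. The same argument applied to column j gives the second bound.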
The number of central modules, m, has a one-to-one correspondence with the
number of tokens available in each time slot, which determines the internal
bandwidth of the switch. Thus, the maximum number of tokens that can be assigned
to the virtual paths in a time slot is upper-bounded by m. When we consider the
whole frame, the token matrix is constrained by the following inequalities:
$$
\sum_{j=0}^{k-1} c_{ij} \le m \times F, \quad \text{for all } i \qquad (4.7)
$$
$$
\sum_{i=0}^{k-1} c_{ij} \le m \times F, \quad \text{for all } j \qquad (4.8)
$$
Based on the above token matrix, we can only tell the maximum number of cells
(tokens) that can be transmitted from an input module to all the output modules
in a frame. How these tokens are distributed over a frame is the next issue to
deal with.
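The relations (4.3), (4.4), (4.7) and (4.8) can be checked mechanically. The Python sketch below is illustrative only: the function names are hypothetical, exact fractions are used to avoid floating-point rounding in the ceiling, and the sample values (m = 3, F = 2, every requirement equal to 0.5) are chosen for the example.

```python
from fractions import Fraction
from math import ceil


def min_token_matrix(lam, frame):
    """Smallest token counts satisfying Eq. (4.4): c_ij = ceil(lambda_ij * F)."""
    return [[ceil(x * frame) for x in row] for row in lam]


def is_admissible(tokens, lam, m, frame):
    """Check the capacity condition (4.3) and the frame bounds (4.7), (4.8)."""
    k = len(tokens)
    capacity_ok = all(Fraction(tokens[i][j], frame) >= lam[i][j]
                      for i in range(k) for j in range(k))
    rows_ok = all(sum(row) <= m * frame for row in tokens)        # (4.7)
    cols_ok = all(sum(col) <= m * frame for col in zip(*tokens))  # (4.8)
    return capacity_ok and rows_ok and cols_ok


half = Fraction(1, 2)
lam = [[half, half], [half, half]]
minimum = min_token_matrix(lam, frame=2)       # one token per virtual path
```

Eq. (4.4) only gives the minimum number of tokens per virtual path; a larger token matrix, such as one that hands out all m × F tokens of every module, also passes `is_admissible` as long as the row and column sums stay within m × F.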
This problem is very similar to the static scheduling algorithm studied in
chapter 3: each output port has to distribute a fixed number of tokens over a
frame in a non-conflicting manner. As in chapter 3, this token allocation can
again be done by an edge coloring algorithm on the corresponding connection
bipartite graph. However, an edge coloring algorithm cannot provide any control
over the delay and delay jitter experienced by the cells. A study on how token
allocation affects the burstiness of the outgoing traffic of the cross-path
switch is done in [23]. It was found that, on passing through the cross-path
switch, the traffic always becomes more bursty than it was when it entered the
switch at the input side. To optimize the delay performance, one should design a
token allocation algorithm which is capable of producing a token distribution
pattern that is as uniform as possible. In response to this criterion, [16] has
delivered an algorithm which employs a multi-level stop-and-go queueing strategy
for finding a suitable token allocation for the switch. This algorithm provides
both the routing and scheduling functions while guaranteeing a tight delay bound
and desirable traffic characteristics at the same time. A detailed analysis was
given in [16].
4.3 Implication of EPIM on Cross-Path switch 
4.3.1 Problem Re-definition 
Owing to the internal structure of the cross-path switch, which is a three-stage
Clos network, and the quasi-static nature of path switching, an input module can
be connected to each of the output modules via a different number of central
modules in each time slot within a frame. This connection pattern is periodic
and is computed again only when a new connection request is accepted. Every
module of the cross-path switch will be notified of this pattern and will set up
the necessary connections accordingly.
A cell arriving at an input port only knows its destination output port number.
The switch then translates it to the associated output module and port numbers.
In order to reach the destined output module, a translation table that indicates
which central module(s) a cell should traverse is stored in the input module
(Figure 4.2). Note that the only information needed for routing a cell within an
input module is the cell's destination output module address. Since cells from
various input ports of an input module may be targeting the same output module,
the input module has to choose which cells get transmitted. This is in fact an
output contention problem in which multiple input ports compete for the same
output module in each time slot. Such contention will be resolved by an adequate
scheduling algorithm for input-buffered switches.
[Diagram omitted. Labels: Header Translation Table, Look-up, Time slot, OM
(output module), CM (central module), Input Module.]
Figure 4.2: Logical Model of an Input Module in Cross-Path Switch.
Resolving the above type of contention can be simplified by re-defining the
routing problem in the following manner. In the actual Clos network, an input
module is physically connected to the central modules, and so is the output
module. But from a cell's point of view, which path it takes to reach its
destination output module is not its main concern so long as the switch can
perform the routing for it successfully. Therefore, instead of directly
connecting to the central modules, the output ports of an input module can be
thought of as logically connecting to the output modules. The dimension of this
logical input module would become n × k, instead of n × m. As stated in section
4.2.1, a central module can be visualized as a token representing the internal
bandwidth of the switch. The number of tokens allocated to a pair of
input/output modules is equal to the number of central modules used to connect
this pair of modules in a particular time slot. In other words, if in time slot
t a total of x connections are set up between input i and output j via the
central stage, the number of tokens appearing at output j of the logical input
module i will be x. This logical model is illustrated through an example given
in Figures 4.4 and 4.3. In this example, a cross-path switch with n=2, m=3, k=2
and a frame size of 2 is used. We demonstrate the token and route assignment for
homogeneous incoming traffic with bandwidth requirement matrix
$$
\begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix}
$$
and token matrix
$$
\begin{pmatrix} 3 & 3 \\ 3 & 3 \end{pmatrix}
$$
The above matrix can be converted to a frame schedule of the input modules by
means of a token allocation algorithm, for example the one described in the
previous section. This frame schedule is the token distribution pattern at the
output side of the logical input module, and is in one-to-one correspondence
with a connection pattern in the middle stage of the cross-path switch.
[Diagram omitted. Legend: a boxed number j gives the number of tokens (central
modules) in that time slot. Frame size = 2.]
Figure 4.3: Logical Module of Input Module (a) 0 (b) 1.
The advantage of re-defining the input modules is two-fold: firstly, it allows
easy incorporation of the scheduling task with EPIM (to be described in the next
section); secondly, the overall scheduling of cells in the whole cross-path
switch can be decomposed into the individual scheduling tasks of each of the
input modules, which can be done separately in a distributed manner. This will
certainly decrease the size of the scheduling problem and hence increase the
speed of the switch.
4.3.2 Scheduling in Input Modules with EPIM 
As for the implementation of the input module, [14] has proposed the use of a
traditional input-buffered switch with look-ahead selection. In this type of
input-buffered switch, the throughput performance is limited due to look-ahead
blocking. With the introduction of virtual output queueing (VOQ) at the input
ports, the wastage of bandwidth in the outgoing links can be avoided under a
suitable scheduling algorithm. By combining VOQ and the scheme proposed in the
previous section, we can readily apply EPIM separately in each input module to
schedule cells competing for tokens at its output side.
[Diagram omitted. Panels (a) and (b), each showing the input modules, the output
modules and the corresponding bipartite graph.]
Figure 4.4: Middle-stage Route Schedule and corresponding Bipartite Graph in
time slot (a) 0 (b) 1.
Figure 4.5 is a continuation of the example given in Figures 4.4 and 4.3. It
shows the logical module of input module 1. Note that it is effectively a
modified input-buffered switch. Unlike the switch scenario described in chapter
3, an output port in this model can have more than 1 token in each time slot,
meaning that the output side can receive more than 1 cell in that time slot.
(Nevertheless, an input port is eligible to transmit only one cell in each time
slot.) This is indicated by the number inside the box at the output side of the
switch in Figure 4.5.
Figure 4.5: Scheduling cells at Input Module 1 - The Logical Model.
[Legend: each queued item is an ATM cell; a boxed number j at the output side
means that the number of tokens (central modules) in that time slot is j. Tokens
per time slot shown: OM0: 1, 2, ...; OM1: 2, 1, ...]

In Figure 4.5, an input port may have cells destined for different output
modules, but the output modules may not be able to accommodate all these
connection requests due to the limited number of tokens. Here, we concentrate on
modifying EPIM for this multiple-token scenario and on comparing its performance
improvement over PIM.
The key distinction between the multiple-token scheduling problem and the
single-token one in an iterative algorithm is that an output port can issue more
than one grant in each iteration as long as its tokens have not been used up. On
the input side, since an input port is allowed to transmit only one cell at a
time, no modification is needed in the Request and Accept phases. In response to
this distinction, we make the following adjustment to the Grant phase of PIM and
EPIM:
A. PIM 
Grant: If an unmatched output receives any requests, it chooses up to x of them
to grant, where x is the number of tokens available in the current time slot.
The output notifies the selected inputs that their requests have been granted.
B. EPIM 
Grant: If an output receives any requests, it checks for the first available
token and the corresponding time slot number. The output then grants up to x of
the requests, chosen randomly, where x is the number of tokens available in the
selected time slot. It notifies each corresponding input port of the time slot
to which the grant belongs.
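The EPIM Grant rule can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis implementation: the function name `epim_grant`, the list `remaining_tokens` (unused tokens of one output port, indexed by time slot within the frame) and the choice to stop the search at the end of the frame are all assumptions made for the example.

```python
import random

def epim_grant(remaining_tokens, requesters, current_slot, rng=random):
    """Grant step of one output port in the multiple-token EPIM variant.

    remaining_tokens: list indexed by time slot; entry t is the number of
        unused tokens this output still has in slot t (hypothetical layout).
    requesters: list of input-port ids that sent a request to this output.
    Returns (slot, granted_inputs), or (None, []) if nothing can be granted.
    """
    if not requesters:
        return None, []
    # Look for the first time slot, from the current one on, with a free token.
    # Assumption: the search stops at the end of the frame (no wrap-around).
    for slot in range(current_slot, len(remaining_tokens)):
        x = remaining_tokens[slot]
        if x > 0:
            # Grant at most x requests, chosen uniformly at random.
            granted = rng.sample(requesters, min(x, len(requesters)))
            return slot, granted
    return None, []

# Example: no token in slot 0, two tokens in slot 1, three requesting inputs.
slot, granted = epim_grant([0, 2, 1], [0, 1, 2], current_slot=0)
print(slot, len(granted))
```

In this example the output skips slot 0, selects slot 1 and grants two of the three requests; which two inputs are chosen is random.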
To support the above adjustment, an output port also has to keep track of the
number of tokens that remain unused after the schedule of the switch is fixed.
All these modifications are illustrated in Figure 4.6, which shows a single
iteration of the adjusted algorithm. In this example, n = k = 2 and the total
number of tokens at the two output ports is 3 (m = 3). Figure 4.6(b) is the
corresponding cross-path switch to which this logical input module belongs. For
simplicity, we assume that the first available token at each output port is
located in the current time slot. During the Grant phase, the number of grants
that an output port can issue is given by the minimum of its number of available
tokens and the number of requests received for the selected time slot. As a
result, outputs 0 and 1 will issue 1 and 2 grants respectively. An input can
accept only one grant at any time, and the accepted output will update its
number of available tokens at that time slot accordingly. All these operations
finish in a single iteration, and they are repeated until a fixed number of
iterations has been performed.
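As a rough illustration of the iteration described above, the following Python sketch runs one Request/Grant/Accept round of the multiple-token PIM variant in a single time slot. The function name, the dictionary layout and the request pattern (both inputs holding cells for both outputs) are assumptions made for the example; only the token counts (1 for output 0, 2 for output 1) come from the text.

```python
import random

def pim_iteration(requests, tokens, matched_inputs, rng=random):
    """One Request/Grant/Accept iteration with multiple tokens per output.

    requests: dict input -> set of outputs for which it has queued cells.
    tokens: dict output -> tokens still unused in the current time slot
        (updated in place).
    matched_inputs: dict input -> output, matches made in earlier iterations
        (updated in place and returned).
    """
    # Request: every unmatched input requests all outputs it has cells for.
    received = {}
    for inp, outs in requests.items():
        if inp in matched_inputs:
            continue
        for out in outs:
            received.setdefault(out, []).append(inp)

    # Grant: an output issues min(available tokens, requests received) grants,
    # choosing the granted inputs at random.
    grants = {}
    for out, inputs in received.items():
        n_grants = min(tokens.get(out, 0), len(inputs))
        for inp in rng.sample(inputs, n_grants):
            grants.setdefault(inp, []).append(out)

    # Accept: an input accepts one grant; the accepted output uses up a token.
    for inp, outs in grants.items():
        out = rng.choice(outs)
        matched_inputs[inp] = out
        tokens[out] -= 1
    return matched_inputs

# n = k = 2, m = 3: output 0 holds 1 token and output 1 holds 2 tokens.
tokens = {0: 1, 1: 2}
matched = pim_iteration({0: {0, 1}, 1: {0, 1}}, tokens, {})
print(matched, tokens)
```

With this request pattern output 1 can grant both inputs, so both inputs are matched after one iteration and one of the three tokens remains unused. Tokens are decremented only on acceptance, so a grant that is not accepted leaves its token available for the next iteration.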
Figure 4.6: Finding the Frame Schedule for Input Module 1.
[Panel (a): Request, Grant and Accept phases with the available tokens at each
output; output 1 has issued 2 grants because it has 2 available tokens in this
time slot. Panels (b) and (c): cross-path switch diagrams.]
The example shows the cell scheduling for only one of the two logical input
modules. The same procedure is carried out in the other modules simultaneously,
where the token distribution pattern is determined by the allocation algorithm.
The result obtained above is tailor-made for the logical model of an input
module. In order to fit it into the original structure of a cross-path switch,
one has to convert this result back to the middle-stage connection pattern.
Recall from the routing property of path switching that the periodic connection
setting of the central modules is predetermined during call setup. Each input
module has a translation table similar to the one shown in Figure 4.2, telling
the input module which central module(s) to traverse in order to reach an output
module. The algorithm proposed in this subsection helps an input module to
decide which cells are to be sent and where. The routing information of this
logical module is mapped onto the translation table to determine the matching
between the input and output ports of the input module. Figure 4.6(c) shows the
corresponding connections to be made inside the cross-path switch. Dotted lines
represent the connection pattern already set up in the central modules in the
current time slot, while solid lines are the paths taken by the selected cells
in input module 0 to be transmitted to output modules 0 and 1.
4.4 Simulation 
We have simulated the performance of a 16x16 input-buffered switch with 17 and
18 tokens per frame, where a frame consists of 16 (= F) time slots. This is
equivalent to scheduling an input module in a cross-path switch with n = k = 16
and m = 17 and 18 respectively.
The incoming traffic is assumed to be uniform and independent. The mean delay
experienced by a cell is investigated for each of the cases and is compared with
the result of the single-token scenario. The token distribution patterns can be
represented by means of an $n \times F$ matrix $T_m = [t_{ij}]$, where $t_{ij}$
is the number of tokens for connecting this input module to output i at time
slot j within the frame and m is the total number of tokens (central modules)
available. The entries of $T_m$ are constrained by these inequalities:
$$\sum_{i=0}^{n-1} t_{ij} \le m, \quad \text{for all } j \tag{4.9}$$

$$\sum_{j=0}^{F-1} t_{ij} \le m \times F, \quad \text{for all } i \tag{4.10}$$
Eq. (4.9) means that at most m cells (= number of central modules) can be
transmitted in each time slot. Eq. (4.10) implies that, in each frame, at most
$m \times F$ cells can be delivered to a particular output module.
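The following shows $T_{17}$ and $T_{18}$:
A. 17 tokens per frame
$$T_{17} = \begin{pmatrix}
2 & 1 & 1 & 1 & \cdots & 1 & 1 \\
1 & 2 & 1 & 1 & \cdots & 1 & 1 \\
1 & 1 & 2 & 1 & \cdots & 1 & 1 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
1 & 1 & 1 & 1 & \cdots & 1 & 2
\end{pmatrix}$$
B. 18 tokens per frame
$$T_{18} = \begin{pmatrix}
2 & 1 & 1 & \cdots & 2 & 1 & \cdots & 1 \\
1 & 2 & 1 & \cdots & 1 & 2 & \cdots & 1 \\
1 & 1 & 2 & \cdots & 1 & 1 & \cdots & 1 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & 1 & 1 & \cdots & 1 & 1 & \cdots & 2
\end{pmatrix}$$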
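The two token patterns can be built and checked against Eqs. (4.9) and (4.10) with a short Python sketch. The helper names are hypothetical. For $T_{18}$ the column of the second extra token in each row is not recoverable from the matrix as printed, so the cyclic offset of 8 used below is purely an assumption for illustration.

```python
def token_matrix(n, F, extra_offsets=(0,)):
    """Build an n x F token matrix with one token in every entry, plus one
    extra token in row i at column (i + d) mod F for each offset d."""
    T = [[1] * F for _ in range(n)]
    for i in range(n):
        for d in extra_offsets:
            T[i][(i + d) % F] += 1
    return T

def satisfies_constraints(T, m):
    """Check Eq. (4.9) (column sums <= m) and Eq. (4.10) (row sums <= m * F)."""
    n, F = len(T), len(T[0])
    cols_ok = all(sum(T[i][j] for i in range(n)) <= m for j in range(F))
    rows_ok = all(sum(row) <= m * F for row in T)
    return cols_ok and rows_ok

T17 = token_matrix(16, 16)                         # 2 on the diagonal, 1 elsewhere
T18 = token_matrix(16, 16, extra_offsets=(0, 8))   # offset 8 is an assumption

print(satisfies_constraints(T17, 17), satisfies_constraints(T18, 18))
```

Every column of `T17` sums to 17 and every column of `T18` sums to 18, so Eq. (4.9) holds with equality in both cases, while Eq. (4.10) is satisfied with a wide margin.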
We have applied the above token distribution patterns to the modified versions
of PIM and EPIM. Figure 4.7 and Figure 4.8 show the mean cell delay performance
of the switch using these algorithms with 17 and 18 tokens at the output side
respectively. It can be seen that their performance is basically similar to that
with 16 tokens: the mean cell delay of modified EPIM with 2 iterations is
comparable with that of the modified version of PIM with 3 to 4 iterations. This
means that, in order to achieve the best performance, using EPIM can save up to
2 iterations.
Figure 4.7: Delay Performance of EPIM and PIM: 17 tokens.
[Plot: mean cell delay versus offered load (0 to 1); curves for EPIM with 2-4
iterations and PIM with 1-4 iterations.]
Figure 4.8: Delay Performance of EPIM and PIM: 18 tokens.
[Plot: mean cell delay versus offered load (0 to 1); curves for EPIM with 2-4
iterations and PIM with 1-4 iterations.]

In our simulations, we found that, under the same scheduling algorithm, the more
tokens there are in a switch, the better the mean delay performance experienced
by the cells. The results are shown in Figure 4.9 (modified PIM) and Figure 4.10
(modified EPIM). Generally speaking, for a particular number of iterations, the
mean cell delay in the switch with 18 tokens is smaller than that with 17
tokens, which is in turn smaller than that with 16 tokens.

Figure 4.9: Delay Performance of PIM: 16, 17, and 18 tokens.
[Plot: mean cell delay versus offered load (0.1 to 1); curves for 16, 17 and 18
tokens with 1, 2 and 3-4 iterations.]

To account for this, we can trace back the physical meaning of a token in the
cross-path switch. A token is equivalent to a central module connecting an
input-output module pair in a time slot. The more tokens a pair of input-output
modules has in a time slot, the larger the number of alternative paths available
for the input module to transmit cells to the destination output module. This
means that some cells that would have to be backlogged in the input module due
to an insufficient number of tokens may get transmitted if more alternative
paths (tokens or central modules) are available for the scheduler to use. An
example is shown in Figures 4.11 and 4.12. Two Clos networks, both with n = 3
and k = 2, are used. The difference between them is the number of modules in the
central stage: one has 3 central modules and the other has 4. The connection
patterns within the central modules are shown in dotted lines. Solid lines
represent the connections that have already been set up. In Figure 4.11, owing
to the current buffer occupancy in input module 0 and the connection pattern of
the central modules, the cell at input port 2 is forced to stay in the buffer.
There is no extra central module connecting input module 0 and output module 0
in the current time slot, so the cell suffers extra delay. The situation is
resolved when we introduce an extra central module into the Clos network (Figure
4.12). In this switch, the extra module in the central stage provides an
alternative path for input module 0 to choose from. As a result, all three
head-of-line cells at the virtual output queues can be transmitted in the same
time slot.
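The effect of the extra central module in the example of Figures 4.11 and 4.12 can be mimicked with a small Python sketch. The backlog and the token split between the two output modules are hypothetical values chosen to match the description (the figures themselves are not reproduced here), and `cells_sent` is an illustrative helper, not part of the thesis.

```python
def cells_sent(hol_destinations, tokens):
    """Count the head-of-line cells an input module can send in one time slot.

    hol_destinations: destination output module of the cell selected at each
        input port (at most one cell per port).
    tokens: dict output module -> number of central modules that currently
        connect this input module to that output module.
    """
    free = dict(tokens)
    sent = 0
    for dest in hol_destinations:
        if free.get(dest, 0) > 0:   # a central module (token) is still free
            free[dest] -= 1
            sent += 1
    return sent

# Hypothetical backlog: ports 0 and 2 hold cells for OM0, port 1 for OM1.
backlog = ["OM0", "OM1", "OM0"]
three = cells_sent(backlog, {"OM0": 1, "OM1": 2})   # m = 3 central modules
four = cells_sent(backlog, {"OM0": 2, "OM1": 2})    # m = 4 central modules
print(three, four)
```

Under these assumed values only two cells can leave with three central modules, whereas the fourth central module supplies the missing token to OM0 and all three cells are sent.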
Figure 4.10: Delay Performance of EPIM: 16, 17, and 18 tokens.
[Plot: mean cell delay versus offered load (0.1 to 1); curves for 16, 17 and 18
tokens.]
Figure 4.11: A Cross-Path Switch with n=3, k=2, and m=3.
[Diagram annotation: "Backlogged due to lack of token to OM0".]
[Figure 4.12: diagram annotations: "Transmitted through the extra central
module"; "An extra central module".]




Chapter 5 Conclusion

In this thesis, the Enhanced Iterative Matching algorithm is proposed for
performing traffic scheduling in ATM input-buffered switches and is studied in
detail. It is in fact an extended version of the Parallel Iterative Matching
algorithm, with the introduction of advance booking of tokens by output ports
during the Grant phase. Because of this, an input may receive and accept grants
belonging to different time slots in the Accept phase of each iteration. At the
end of each time slot, a partially finished transmission schedule of the switch
for the subsequent time slot, in addition to the current one, will have been
determined. This property of EPIM reduces the size of the matching problem faced
by the switch in each time slot, hence improving its scheduling speed
significantly. This is validated by computer simulations and analysis of PIM and
EPIM. Both results reveal that the performance of EPIM with 2 iterations is
comparable with that of PIM with 4 iterations.

Provision of bandwidth guarantees to connections in ATM switches is another
issue investigated in this thesis. EPIM by itself is sufficient for scheduling
cells belonging to the best-effort service category, but the algorithm gives no
guarantee on the bandwidth allocated to the cells arriving at the switch. To
achieve this guarantee, we incorporate the static scheduling algorithm with EPIM
and successfully provide connections with bandwidth guarantees by finding a
frame schedule for them during call setup time. We compare the performance of
EPIM and PIM after the incorporation of static scheduling. It is found that,
under uniform and independent incoming traffic, EPIM again outperforms PIM by
using two fewer iterations to achieve similar delay performance. Other
variations of PIM that aim at providing performance guarantees, namely the
Weighted Parallel Iterative Matching (WPIM) and Statistical Matching algorithms,
are also discussed. Again, our proposal can result in a feasible schedule with 2
iterations only.
Lastly, the concepts of Path Switching and the Cross-Path switch are introduced.
We have proposed a logical model for input modules in the cross-path switch,
which enables us to conceptualise the relationship between all three stages of
the switch and the frame schedule associated with them. This logical module is
essentially an input-buffered switch in which multiple tokens are available at
the output side. As a result, we can apply PIM and EPIM to this model by
allowing multiple grants to be issued in each iteration. Since the external
bandwidth of the switch is expanded, the performance of either algorithm is much
better than in the single-token case in terms of delay and throughput. Above
all, the advance booking of tokens by output ports makes EPIM superior to PIM
with respect to scheduling speed and switch performance.
Bibliography 
[1] T. E. Anderson, S. S. Owicki, J. B. Saxe, and C. P. Thacker. High-speed
Switch Scheduling for Local-area Networks. ACM Transactions on Computer
Systems, 11(4):319-352, Nov 1993.
[2] N. McKeown and J. Walrand. Scheduling cells in an input-queued switch.
Electronics Letters, 29(25):2174-2175, Dec 1994.
[3] A. Mekkittikul and N. McKeown. A Practical Scheduling Algorithm to Achieve
100% Throughput in Input-Queued Switches. Proc. INFOCOM '98, Vol.2:792-799,
Mar-Apr 1998.
[4] A. Hung, G. Kesidis, and N. McKeown. ATM Input-Buffered Switches with the
Guaranteed-Rate Property. Proc. ISCC '98, 331-335, Jun 1998.
[5] B. Prabhakar and N. McKeown. On the Speed-up required for combined Input and
Output Queued Switching. Proc. IEEE Intl Symp. on Information Theory '98, 165,
Aug 1998.
[6] N. McKeown, V. Anantharam, and J. Walrand. Achieving 100% Throughput in an
Input-Queued Switch. Proc. IEEE INFOCOM '96, 1:296-302, 1996.
[7] G. Kesidis. ATM Networks Performance. Kluwer Academic, 1996.
[8] D. Stiliadis and A. Varma. Providing Bandwidth Guarantees in an
Input-Buffered Crossbar Switch. Proc. IEEE INFOCOM '95, 960-968, Apr 1995.
[9] D. Stiliadis. Traffic Scheduling in Packet-Switched Networks: Analysis,
Design, and Implementation. PhD Dissertation, Computer Engineering, University
of California, Santa Cruz, 1996.
[10] M. Karol, M. Hluchyj, and S. Morgan. Input versus Output Queueing on a
Space Division Packet Switch. IEEE Trans. on Communications, 35:1347-1356, Dec
1987.
[11] M. Hluchyj and M. Karol. Queueing in High-Performance Packet Switching.
IEEE JSAC, 6(9):1587-1597, Dec 1988.
[12] M. Karol, K. Eng, and H. Obara. Improving the Performance of Input-queued
ATM Packet Switches. Proc. IEEE INFOCOM '92, 1:110-115, May 1992.
[13] A. Pattavina and G. Bruzzi. Analysis of Input and Output Queueing for
Nonblocking ATM Switches. IEEE/ACM Trans. on Networking, 1(3):314-328, Jun
1993.
[14] T. T. Lee and C. H. Lam. Path Switching - A Quasi-Static Routing Scheme for
Large-Scale ATM Packet Switches. IEEE JSAC, 15(5):914-924, Jun 1997.
[15] H. Zhang. Service Disciplines For Guaranteed Performance Service in
Packet-Switching Networks. Proc. IEEE, 83(30):1371-1399, Oct 1995.
[16] S. Y. Liew and T. T. Lee. Bandwidth Assignment with QoS guarantee in a
Class of Scalable ATM switches. Proc. IEEE ICC '99, 3:1802-1806, Jun 1999.
[17] M. D. Prycker. Asynchronous Transfer Mode. Ellis Horwood, 1995.
[18] Y. N. J. Hui. Resource Allocation for Broadband Networks. IEEE JSAC,
6(9):358-368, Dec 1988.
[19] C. H. Lam. Virtual Path Traffic Management of Cross-Path Switch. PhD
Thesis, Department of Information Engineering, The Chinese University of Hong
Kong, 1997.
[20] S. C. Liew and T. T. Lee. Principles of Broadband Switching and Networks.
Lecture Notes, Draft 3, 1996.
[21] R. J. Wilson. Introduction to Graph Theory. Academic Press, 1972.
[22] S. Y. Liew, S. W. Cheng, and T. T. Lee. An Enhanced Iterative Scheduling
Algorithm for ATM Input-buffered Switch. Proc. IEEE ATM Workshop '99, May 1999.
[23] M. C. Chan, Philip P. To, and T. T. Lee. Per-connection Performance
Guarantees for Cross-Path ATM Packet Switch. Proc. IEEE ATM Workshop '99, May
1999.
[24] Joseph Y. Hui. Switching and Traffic Theory for Integrated Broadband
Networks. Kluwer Academic, 1990.
[25] The ATM Forum. Traffic Management Specification. Version 4, Apr 1996.
[26] G. Nong, Joges K. Muppala, and M. Hamdi. Analysis of Non-blocking ATM
Switches with Multiple Input Queues. Proc. IEEE GLOBECOM '97, 1:531-535, Nov
1997.
[27] G. Nong, Joges K. Muppala, and M. Hamdi. A Performance Model for ATM
Switches with Multiple Input Queues. Proc. of Sixth Intl Conference on Computer
Communications and Networks, 222-227, Sep 1997.
[28] Schwartz Mischa. Broadband Integrated Networks. Prentice Hall, 1996.
[29] Andrew S. Tanenbaum. Computer Networks. Prentice Hall, 1996.
[30] N. McKeown, M. Izzard, A. Mekkittikul, and W. Ellersick. Tiny Tera: a
packet switch core. IEEE Micro, 26-33, Jan 1997.
[31] Robert S. Li. Theory of Periodic Contention and its application to Packet
Switching. Proc. IEEE INFOCOM '88, 320-325, 1988.
[32] H. Obara. An Efficient Contention Resolution Algorithm for Input Queueing
ATM Switches. Int. J. Digital & Analog Cabled Systems, 2(4):261-267, 1989.
[33] S. Ross. A First Course in Probability. MacMillan, 1988.
[34] Peter G. Harrison and Naresh M. Patel. Performance Modelling of
Communication Networks and Computer Architectures. Addison Wesley, 1993.
[35] Joseph Y. Hui and E. Arthurs. A Broadband Packet Switch for Integrated
Transport. IEEE JSAC, 5(8):1264-1273, Oct 1987.
[36] R. E. Tarjan. Data Structures and Network Algorithms. Society for
Industrial and Applied Mathematics, Pennsylvania, Nov 1983.
[37] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to
Algorithms. MIT Press, 1990.















