New Jersey Institute of Technology

Digital Commons @ NJIT
Dissertations

Electronic Theses and Dissertations

Spring 5-31-2000

Design and analysis of a scalable terabit multicast packet switch :
architecture and scheduling algorithms
Feihong Chen
New Jersey Institute of Technology


Recommended Citation
Chen, Feihong, "Design and analysis of a scalable terabit multicast packet switch : architecture and
scheduling algorithms" (2000). Dissertations. 395.
https://digitalcommons.njit.edu/dissertations/395



The Van Houten library has removed some of the
personal information and all signatures from the
approval page and biographical sketches of theses
and dissertations in order to protect the identity of
NJIT graduates and faculty.

ABSTRACT
DESIGN AND ANALYSIS OF A SCALABLE TERABIT MULTICAST
PACKET SWITCH: ARCHITECTURES AND SCHEDULING
ALGORITHMS
by
Feihong Chen
Internet growth and success not only open a primary route of information exchange
for millions of people around the world, but also create unprecedented demand for
core network capacity. Existing switches/routers, due to the bottleneck from either
switch architecture or arbitration complexity, can reach a capacity on the order of
gigabits per second, but few of them are scalable to large capacity of terabits per
second.
In this dissertation, we propose three novel switch architectures with cooperating
scheduling algorithms to design a terabit backbone switch/router which is able to
deliver large capacity, multicasting, and high performance along with Quality of
Service (QoS). Our switch designs benefit from the unique features of a modular switch
architecture and a distributed resource allocation scheme.
Switch I is a unique and modular design characterized by input and output
link sharing. Link sharing resolves output contention and eliminates the speedup
requirement for the central switch fabric. Hence, the switch architecture is scalable to
any large size. We propose a distributed round robin (RR) scheduling algorithm
which provides fairness and has very low arbitration complexity. Switch I can achieve
good performance under uniform traffic. However, Switch I does not perform well
for non-uniform traffic.
Switch II, as a modified switch design, employs link sharing as well as a token
ring to overcome the drawback of Switch I. We propose a
round robin prioritized link reservation (RR+POLR) algorithm which results in
improved performance, especially under non-uniform traffic. However, the RR+POLR
algorithm is not flexible enough to adapt to the input traffic. In Switch II, the link
reservation rate has a great impact on switch performance.
Finally, Switch III is proposed as an enhanced switch design using link sharing
and dual round robin rings. Packet forwarding is based on link reservation. We
propose a queue occupancy based dynamic link reservation (QOBDLR) algorithm
which can adapt to the input traffic to provide a fast and fair link resource allocation.
The QOBDLR algorithm is a distributed resource allocation scheme in the sense that
dynamic link reservation is carried out according to locally available information.
Arbitration complexity is very low. Compared to the output queued (OQ) switch,
which is known to offer the best performance under any traffic pattern, Switch III
not only achieves performance as good as the OQ switch, but also overcomes the speedup
problem which seriously limits the scalability of the OQ switch. Hence,
Switch III would be a good choice for high performance, scalable, large-capacity core
switches.

DESIGN AND ANALYSIS OF A SCALABLE TERABIT MULTICAST
PACKET SWITCH: ARCHITECTURES AND SCHEDULING
ALGORITHMS

by
Feihong Chen

A Dissertation
Submitted to the Faculty of
New Jersey Institute of Technology
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
Department of Electrical and Computer Engineering
May 2000

Copyright © 2000 by Feihong Chen
ALL RIGHTS RESERVED

APPROVAL PAGE
DESIGN AND ANALYSIS OF A SCALABLE TERABIT MULTICAST
PACKET SWITCH : ARCHITECTURES AND SCHEDULING
ALGORITHMS
Feihong Chen

Dr. Ali N. Akansu, Dissertation Co-Advisor
Professor of Electrical and Computer Engineering, NJIT

Date

Dr. Necdet Uzun, Dissertation Co-Advisor
Assistant Professor of Electrical and Computer Engineering, NJIT

Date

Dr. Nirwan Ansari, Committee Member
Professor of Electrical and Computer Engineering, NJIT

Date

Dr. Symeon Papavassiliou, Committee Member
Assistant Professor of Electrical and Computer Engineering, NJIT

Date

Dr. H. Jonathan Chao, Committee Member
Professor of Electrical Engineering, Polytechnic University, Brooklyn, NY

Date

BIOGRAPHICAL SKETCH
Author: Feihong Chen
Degree: Doctor of Philosophy in Electrical Engineering
Date: May 2000

Undergraduate and Graduate Education:
• Doctor of Philosophy in Electrical Engineering
New Jersey Institute of Technology, Newark, NJ, 2000
• Master of Science in Electrical Engineering
Beijing University of Posts and Telecommunications, Beijing, P.R. China, 1996
• Bachelor of Science in Electrical Engineering
Xi'an University of Electronic Science and Technology (XIDIAN University),
Xi'an, Shaanxi, P.R. China, 1993

Major: Electrical Engineering
Presentations and Publications:
F. Chen, B. Yener, A.N. Akansu and S. Tekinay,
"A Novel Performance Analysis for the Copy Network in a Multicast ATM
Switch,"
Proc. of IEEE ICCCN'98, Lafayette, LA, October 12-25, 1998, pp. 99-106.
F. Chen, N. Uzun and A. N. Akansu,
"A High Performance Output-Oriented Cell Scheduling Algorithm for Multicast
ATM Switches,"
Proc. of CISS'99, Baltimore, MD, March 17-19, 1999, pp. 803-808.
F. Chen, N. Uzun and A. N. Akansu,
"A Large Scale Multicast ATM Switch With Input and Output Link Sharing,"
Proc. of IEEE GLOBECOM'99, Rio de Janeiro, Brazil, Dec. 1999, pp. 1251-1255.
F. Chen, N. Uzun and A. N. Akansu,
"A Scalable Multicast ATM Switch using Link Sharing and Prioritized Link
Reservation,"
Proc. of IEEE ICCCN'99, Boston, MA, October 1999, pp. 218-222.
F. Chen, N. Uzun and A. N. Akansu,
"Design of a Large Scale Multicast Packet Switch with a Distributed Resource
Allocation Algorithm,"
Proc. of CISS'2000, Princeton, NJ, March 15-17, 2000, pp. FP 6.5 - FP6.10.
F. Chen, N. Uzun and A. N. Akansu,
"A Distributed Dynamic Scheduling Algorithm for a Terabit Multicast Packet
Switch,"
to be presented at IFIP NETWORKING'2000, Paris, France, May 2000.
F. Chen, N. Uzun and A. N. Akansu,
"Design and Analysis of a Scalable Terabit Multicast Packet Switch with Dual
Round Robin Dynamic Link Reservation,"
to be presented at IEEE ATM Workshop, Germany, June 2000.
F. Chen, N. Uzun and A. N. Akansu,
"A Scalable Terabit Multicast Packet Switch with Link Sharing and Dual
Round Robin Dynamic Link Reservation,"
submitted to the IEEE/ACM Transactions on Networking, January 2000.
F. Chen, N. Uzun and A. N. Akansu,
"A Scalable Terabit Core Switch for Broadband Internet,"
submitted to IEEE MILCOM'2000, Los Angeles, CA, October 2000.

This work is dedicated to my husband and our parents.


ACKNOWLEDGMENT
First and foremost, I would like to express my sincere gratitude to my advisors and
mentors, Dr. Ali N. Akansu and Dr. Necdet Uzun. I thank Dr. Akansu for his
constant support, guidance, and encouragement in every step of my way. I would like
to thank Dr. Uzun, an excellent advisor and a good friend, for his inspiration and
valuable discussions; for the time he has put into helping me make this dissertation
the comprehensive work it is today; and for his belief in my abilities and for encouraging
me to aim high.
I would like to thank my dissertation committee, Dr. Nirwan Ansari, Dr.
Symeon Papavassiliou, and Dr. H. Jonathan Chao, for their interest in this
work and for many stimulating discussions and insightful comments. I would
also like to extend my appreciation to Dr. Bulent Yener and Dr. Sirin Tekinay for
their help at the beginning of this research work.
Likewise, I would like to thank Ms. Anne McMahon, who is always there for
us and helps a lot in many ways. I would also like to extend my appreciation to the
staff of the ECE department for their constant assistance in just about everything.
Many thanks are also due to my colleagues at NJCMR and CCSPR, especially
my good friends Mahalingam Ramkumar, Xiaodong Cai, Minyi Zhao, Pingping Zong,
Bin He and Anil Bircan, for all the help they have offered towards completion of this
dissertation, and for all the fun we shared together. I would also like to thank
Xueming Lin, Ziqiang Xu, and Jianguo Chen for their continuous help and advice
from the first day I joined NJIT.
Finally, words can never express the gratitude I feel toward my husband and
our parents, whose support provides the foundation upon which this work and other
achievements were built.


TABLE OF CONTENTS

Chapter

1 INTRODUCTION
  1.1 Motivation
  1.2 Review
  1.3 Design Issues
    1.3.1 Multicasting
    1.3.2 Scalability
    1.3.3 Low Complexity
    1.3.4 High Performance
  1.4 Outline

2 A NOVEL PERFORMANCE ANALYSIS FOR THE COPY NETWORK IN A MULTICAST ATM SWITCH
  2.1 Introduction
  2.2 Notation and Assumptions
  2.3 Performance Analysis of NBNS Copy Network
  2.4 Performance Analysis for both SIBNS and SIBS Copy Networks
    2.4.1 Notation and Assumption
    2.4.2 The Proposed Markov Model
    2.4.3 Performance Analysis
    2.4.4 Validation of the Markov Model for SIBNS and SIBS
  2.5 Conclusion

3 SWITCH I: A LARGE SCALE MULTICAST ATM SWITCH USING INPUT AND OUTPUT LINK SHARING
  3.1 Introduction
  3.2 Switch Architecture
    3.2.1 Input Shared Block
    3.2.2 Output Shared Block
    3.2.3 Central Switch Fabric
  3.3 Cell Scheduling
    3.3.1 IVOQ Round Robin
    3.3.2 GVOQ Round Robin
  3.4 Performance Evaluation
    3.4.1 Traffic Model
    3.4.2 Switch Performance
  3.5 Conclusion

4 SWITCH II: A MODIFIED SWITCH DESIGN USING LINK SHARING AND PRIORITIZED LINK RESERVATION
  4.1 Introduction
  4.2 Switch Architecture
  4.3 Cell Scheduling
    4.3.1 Cell Delivery
    4.3.2 Link Reservation : RR+POLR Algorithm
    4.3.3 Remarks
  4.4 Switch Performance
  4.5 Conclusion
    4.5.1 Advantages of Switch II
    4.5.2 Disadvantages of Switch II

5 SWITCH III: A SCALABLE TERABIT MULTICAST PACKET SWITCH WITH DUAL ROUND ROBIN DYNAMIC LINK RESERVATION
  5.1 Switch Architecture
  5.2 Cell Scheduling
    5.2.1 Cell Delivery
    5.2.2 Link Reservation
  5.3 REQ-QOBDLR Algorithm
    5.3.1 Operations upon receiving REQj Token
    5.3.2 Operations upon receiving RELn Token
    5.3.3 Remarks on REQ-QOBDLR Algorithm
    5.3.4 Conclusion on REQ-QOBDLR Algorithm
  5.4 REQREL-QOBDLR Algorithm
    5.4.1 Operations upon receiving RELn Token
    5.4.2 Operations upon receiving REQj Token
    5.4.3 Remarks on REQREL-QOBDLR Algorithm
  5.5 Analysis of QOBDLR Algorithms
    5.5.1 Algorithm Complexity
    5.5.2 The choice of HT and LT
    5.5.3 Scalability of QOBDLR Algorithms
  5.6 Performance Evaluation
    5.6.1 Traffic Model
    5.6.2 Switch Performance
  5.7 Conclusion

6 CONCLUSION AND FUTURE WORK

APPENDIX A REQ-QOBDLR ALGORITHM

APPENDIX B REQREL-QOBDLR ALGORITHM

REFERENCES

LIST OF TABLES

Table

1.1 Performance requirements and objectives for BSS [42]. * : includes nonqueueing related delays but excludes propagation, and does not include delays due to processing above ATM layer. N/S : not specified.

3.1 Switch I : switch performance under uniform unicast traffic with different input load p. The observed performance statistics are : (1) throughput; (2) average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB (S_OSB and (Min, Max))

3.2 Switch I : switch performance under uniform multicast traffic with different input load p. The observed performance statistics are : (1) throughput; (2) average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB (S_OSB and (Min, Max))

4.1 RR+POLR Algorithm

4.2 Switch II : performance comparison under non-uniform unicast traffic with different input load p. The observed performance statistics are : (1) throughput; (2) average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB (S_OSB and (Min, Max))

4.3 Switch II : performance comparison under non-uniform multicast traffic with different input load p. The observed performance statistics are : (1) throughput; (2) average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB (S_OSB and (Min, Max))

5.1 The possible choices of HT and LT for a 256x256 switch consisting of 8 ISBs and 8 OSBs, i.e. N = 256, K = 8, m = M = 32. We select HT = 4 and LT = 2

5.2 Switch III : performance comparison under uniform multicast traffic with different input load p. The observed performance statistics are : (1) throughput; (2) average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB (S_OSB and (Min, Max))

5.3 Switch III : performance comparison under unicast "1 ISB → 1 OSB HotSpot" traffic. The observed performance statistics are : (1) throughput; (2) average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB (S_OSB and (Min, Max))

LIST OF FIGURES

Figure

1.1 Review on multicast ATM switches
1.2 A general architecture of an output queued (OQ) switch
1.3 A general architecture of input queued (IQ) switch : (a) an IQ switch using FIFO queues ; (b) an IQ switch using VOQs
1.4 A general architecture of input-output queued (IOQ) switch
1.5 Multicast switches
2.1 Three scenarios of copy network in a multicast ATM switch
2.2 Overflow probability in NBNS
2.3 Cell loss and throughput in NBNS
2.4 The general Markov Model for both SIBNS and SIBS
2.5 State transition probability matrix of Markov chain : Pt
2.6 Cell loss in three scenarios : NBNS, SIBNS and SIBS
2.7 Throughput in three scenarios : NBNS, SIBNS and SIBS
2.8 Cell delay in shared-input-buffer copy networks : SIBNS and SIBS
3.1 Switch I : an NxN switch consists of K ISBs, K OSBs and ATMCSF; K = N and m = M in this dissertation. Input link sharing is achieved at every ISB-ATMCSF interface, and output link sharing is achieved at every ATMCSF-OSB interface
3.2 Input Shared Block (ISB) : (a) the j-th ISB using IVOQs (b) the j-th ISB using GVOQs
3.3 Output Shared Block (OSB) with output link sharing (here, m = M)
3.4 Switch I applies round robin (RR) cell scheduling which is based on a one-to-one group mapping from K ISBs to K OSBs
3.5 IVOQ Round Robin in a 4x4 switch (N = 4, m = M = 2, K = 2)
3.6 GVOQ Round Robin in a 4x4 switch (N = 4, m = M = 2, K = 2)
3.7 Traffic Model : Multicast Bursty Traffic
3.8 Switch I : throughput under uniform unicast traffic
3.9 Switch I : average end-to-end cell delay (D_E-to-E) under uniform unicast traffic
3.10 Switch I : average cell delay in ISB (D_ISB) under uniform unicast traffic
3.11 Switch I : average size of OSB (S_OSB) under uniform unicast traffic
3.12 Switch I : throughput under uniform multicast traffic
3.13 Switch I : average end-to-end cell delay (D_E-to-E) under uniform multicast traffic
3.14 Switch I : average cell delay in ISB (D_ISB) under uniform multicast traffic
3.15 Switch I : average size of OSB (S_OSB) under uniform multicast traffic
3.16 Performance of Switch I under '1 ISB → 1 OSB hotspot non-uniform traffic'

CHAPTER 1
INTRODUCTION
1.1 Motivation
The Internet today is growing tremendously and enjoys worldwide success.
The scalable and distributed nature of the Internet attracts more and more users
and service providers. Meanwhile, many emerging applications demand increased
bandwidth and generate a huge volume of traffic, which creates an unprecedented
demand for core network capacity. Also, the exponential growth of traffic may cause
several problems in the network such as congestion, unpredictable delay, insufficient
reliability and low availability. Facing those challenges, the core switch/router is
therefore required to deliver higher performance in terms of large capacity,
high speed, multicasting, and Quality of Service (QoS).
In short, scalable multi-terabit multicast switches/routers are in demand.
However, existing switches/routers, due to the bottleneck from either switch architecture or arbitration complexity, can reach a capacity on the order of gigabits per
second, but few of them are scalable to terabit capacity.
In this dissertation, we propose several switch architectures and scheduling
algorithms to approach the desired scalable terabit multicast packet¹ switch.

¹ In this dissertation, a packet has a fixed length of 53 bytes.

1.2 Review
In the history of switch design [1] [2], various multicast ATM switches have been
proposed in the literature. As shown in Fig 1.1, the switch fabric on which a switch architecture is built can be classified into three types: Banyan network, Crossbar network,
and Clos network. Starlite switch [3], Turner's broadcast switch [4] and Lee's
multicast switch [5] were the typical multicast switches based on Banyan network.
Later on, a practical version of Lee's switch was proposed in [6], and another
advanced switch with a fault-tolerant multistage interconnection network (MIN)
was presented in [7]. Those switches have the advantage of reduced hardware
complexity. However, internal path conflicts and head of line (HOL) blocking are the
challenges for those switches to achieve high performance and scalability. One of the
switches built on Crossbar network is the Knockout Multicast switch [8], which utilizes a
concentrator in every output port to resolve output contention. Following the Knockout
Multicast switch, Crossbar switch [9], Shared Concentration and Output Queueing
Multicast (SCOQ) [10], Multicast Output Buffered ATM Switch (MOBAS) [11],
Abacus [12], and a growable multicast switch [13] were proposed. Crossbar switches
can achieve high performance because of output queueing and output contention
resolution. The tradeoff is the cost of hardware complexity and the speedup required.
Growable packet switch [14] and ring sandwich network [15] were the multicast
ATM switches based on Clos network, and [20] presents a performance study for a
buffered Clos switch. In fact, Clos network is a MIN with only 3 stages.
Since Clos network can provide multiple paths from an input port to an output
port, internal path conflicts are relaxed. Clos network has better performance than
Banyan network but it has higher hardware complexity.
Existing packet switches, including the above multicast switches, are able to achieve
Gigabit/sec capacity, but few of them provide further scalability to Terabit/sec.
Besides the restraint from the switch fabric, the queueing strategy and the cooperating scheduling
scheme have a great impact on switch scalability as well. From the switch buffering point
of view, switches can be classified into output-queued (OQ) switches², input-queued
(IQ) switches, and input-output-queued (IOQ) switches.

² OQ switches include centralized shared memory switches.

Figure 1.1 Review on multicast ATM switches

Fig 1.2 depicts a general model of an OQ switch. OQ switches, such as [8,
11, 12, 13, 16, 17, 18, 19, 21], are proved to maximize throughput and optimize latency.
Hence, OQ switches are able to provide Quality of Service (QoS) guarantees [22, 23,
24, 28]. However, the switch fabric and output buffers have to operate N times as fast
as the line rate (N is the switch size in terms of the number of switch inputs/outputs),
because cells arriving at switch inputs have to be delivered to and stored in output
queues in the same cell slot. It may be practical to implement an output queued switch
or router with an aggregate bandwidth of several tens of Gb/s. However, it is not feasible to
design a large OQ switch with a fast line rate, because the memory access speed achieved
in commercial products is not fast enough to support N times speedup.
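As a rough illustration of the speedup argument above, the following sketch computes the cell slot time for a 53-byte cell and the time left per memory access in one output queue. The function names, the sample line rates, and the accounting of N writes plus one read per slot are assumptions made for illustration; they are not figures taken from this dissertation.

```python
CELL_BITS = 53 * 8  # fixed-length 53-byte cell, as used in this dissertation


def cell_slot_ns(line_rate_gbps: float) -> float:
    """Duration of one cell slot in nanoseconds at the given line rate."""
    # bits divided by Gbit/s gives nanoseconds
    return CELL_BITS / line_rate_gbps


def oq_access_time_ns(n_ports: int, line_rate_gbps: float) -> float:
    """Time available per memory access in one output queue.

    Assumption: in the worst case all N inputs send a cell to the same
    output in one slot, so the queue performs N writes and 1 read per slot.
    """
    return cell_slot_ns(line_rate_gbps) / (n_ports + 1)


if __name__ == "__main__":
    # Hypothetical (N, line rate in Gb/s) combinations, for illustration only.
    for n, rate in [(16, 0.622), (32, 2.5), (256, 10.0)]:
        print(f"N={n:4d}  R={rate:6.3f} Gb/s  "
              f"slot={cell_slot_ns(rate):8.2f} ns  "
              f"per-access budget={oq_access_time_ns(n, rate):7.3f} ns")
```

Under these assumptions, the per-access budget shrinks in proportion to both N and the line rate, which is the scaling problem described above.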
Figure 1.2 A general architecture of an output queued (OQ) switch

On the other hand, IQ switches (see Fig 1.3) become more attractive because
the switch fabric and input memory only need to run as fast as the line rate. An IQ
switch with FIFO queues is known to suffer head of line (HOL) blocking, which
limits the throughput to (2 − √2) = 58.6%. To overcome HOL blocking, virtual
output queues (VOQs) are applied in every switch input together with scheduling
algorithms like Longest Queue First (LQF) [30], Oldest Cell First (OCF) [31], and Longest
Port First (LPF) [32] to achieve 100% throughput. To support multicast
traffic, TATRA and WBA were proposed for IQ switches [33] [34]. A combined
input output queued (CIOQ) switch has been proposed [35], and it was demonstrated that
the CIOQ switch can precisely emulate the OQ switch when the speedup S ≥ 2 − 1/N.
In addition, [36] [37] [38] propose some priority queueing algorithms for integrated
traffic.
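The HOL blocking limit quoted above can be observed with a small Monte Carlo experiment. The sketch below is a minimal model, not the simulator used in this dissertation: it assumes saturated inputs, uniformly random destinations, and a random choice among the head-of-line cells contending for each output. The function name and parameters are hypothetical.

```python
import random


def hol_saturation_throughput(n_ports: int, n_slots: int, seed: int = 1) -> float:
    """Estimate saturation throughput of an N x N FIFO input-queued switch.

    Assumptions: every input always has a backlog (saturation), each cell
    picks its output uniformly at random, and each output serves one of
    its contending head-of-line (HOL) cells, chosen at random, per slot.
    """
    rng = random.Random(seed)
    # Destination of the HOL cell at each input.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    served = 0
    for _ in range(n_slots):
        # Group inputs by the output their HOL cell is waiting for.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each contended output serves exactly one input; the losers stay blocked.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next cell moves to the head
            served += 1
    return served / (n_ports * n_slots)


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, round(hol_saturation_throughput(n, 5000), 3))
```

For small N the estimate lies above the limit, and it drifts toward 2 − √2 ≈ 0.586 as N grows, consistent with the asymptotic figure cited in the text.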

Figure 1.3 A general architecture of input queued (IQ) switch : (a) an IQ switch
using FIFO queues ; (b) an IQ switch using VOQs


Though IQ switches are capable of supporting high speed line rates without any
speedup in hardware, a scheduling arbitration complexity of at least O(N^2.5) is a big
obstacle if IQ switches grow to a large size. The reason is that most scheduling
algorithms proposed for IQ switches employ a centralized scheduler, which needs
to collect traffic information from N switch inputs in every cell slot and consumes
multiple iterations to determine the final input-output matching. The situation may
become more complex under multicast traffic. As scheduling complexity increases
with switch size N, an IQ switch using a centralized scheduler has difficulties in
growing to a large switch size and terabit/sec capacity.
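To give a feel for how quickly an O(N^2.5) arbitration cost grows, the sketch below compares the step count with the time available in one cell slot. It is only an order-of-magnitude illustration: the constant factor of 1 in the step count, the function names, and the sample line rate are assumptions, not results from this dissertation.

```python
CELL_BITS = 53 * 8  # 53-byte fixed-length cell


def arbitration_steps(n_ports: int) -> float:
    """Step count of a centralized matching, taking O(N^2.5) with constant 1 (assumption)."""
    return float(n_ports) ** 2.5


def time_per_step_ps(n_ports: int, line_rate_gbps: float) -> float:
    """Picoseconds available per step if the matching must finish within one cell slot."""
    slot_ns = CELL_BITS / line_rate_gbps  # bits divided by Gbit/s gives nanoseconds
    return slot_ns * 1000.0 / arbitration_steps(n_ports)


if __name__ == "__main__":
    # Hypothetical 10 Gb/s line rate, for illustration only.
    for n in (16, 64, 256):
        print(f"N={n:4d}  steps={arbitration_steps(n):12.0f}  "
              f"budget per step={time_per_step_ps(n, 10.0):10.5f} ps")
```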
IOQ switches are combinations of IQ switches and OQ switches (see Fig 1.4).
According to the comparison study in [39], OQ switches offer the best throughput/delay
performance for arbitrary traffic distributions. However, since the current memory
access time is limited to a few nsec by state-of-the-art integrated circuit technology,
the output-buffered switch architecture is not scalable to large-capacity systems. On
the other hand, IQ switches suffer poor throughput/delay performance because of
HOL blocking, but the input-queued architecture is easy to extend. The IOQ switch
is a solution that trades off the high performance of the OQ switch against the low
hardware complexity of the IQ switch.
Figure 1.4 A general architecture of input-output queued (IOQ) switch

One of the few existing IOQ switch designs is the CIOQ switch [35]. However, the reason
for the CIOQ switch in [35] to adopt both input queueing and output queueing is to
provide QoS in IQ switches. As speedup is required in IQ switches for QoS purposes,
output queueing is needed to avoid cell loss. The CIOQ switch, in fact, can be classified
as an IQ switch. Its centralized scheduler has an arbitration complexity of
O(N^2.5), so the CIOQ switch [35] is not scalable. In addition, the modular Batcher-binary-banyan switch [40] proposed by T. T. Lee and the Sunshine switch [41] proposed
by Bellcore are also IOQ switches. However, because of the irregular interconnection pattern
in hardware, those switches are limited to at most 20 Gb/s.

1.3 Design Issues

Several issues should be considered when we design a large-capacity switch. In this
section, we mainly address the following aspects, which are targeted in our design of a
scalable terabit multicast packet switch.
1.3.1 Multicasting

In today's B-ISDN and Internet, many services, such as teleconferencing, entertainment video, and distributed data processing, are characterized by point(multipoint)-to-multipoint communication. Switches need to support not only point-to-point
connections, but also multipoint connections. A multicast switch is a solution for
sending information from one sender to a group of receivers.
Multicast functions in ATM switches can be implemented either with a separate
nonblocking copy network followed by a point-to-point routing network, or with an
integrated switching fabric performing both replication and routing functions. Fig 1.5
illustrates the two alternatives.

Figure 1.5 Multicast switches
The architecture of a nonblocking copy network followed by a traditional
point-to-point ATM switching network is adopted by many commercially available
switches [3, 4, 5, 6], because the traditional switch does not need to change completely;
only a copy network is added in front of it. The copy network replicates an input multicast
cell³ into the required number of cell copies. Then each cell copy is routed to an output line
through the point-to-point routing network. However, the copy network faces the problem
of overflow, which may cause performance degradation. In addition, separating the copy
network from the routing network creates implementation redundancy, which
increases hardware complexity.
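The two-stage structure described above can be sketched as a replicate-then-route pipeline. This is a simplified model under stated assumptions: the `Cell` type, the `capacity` parameter, and the rule that a cell is dropped whole when its copies would exceed the remaining capacity (a non-splitting policy) are illustrative choices, not the design of any switch cited here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cell:
    payload: bytes
    destinations: Tuple[int, ...]  # output ports requested by this cell


def copy_network(cells: List[Cell], capacity: int) -> Tuple[List[Tuple[int, bytes]], int]:
    """Replicate each cell once per destination.

    Assumption: the copy network can emit at most `capacity` copies per
    slot, and a cell whose copies do not all fit is dropped (overflow).
    Returns the list of (destination, payload) copies and the drop count.
    """
    copies: List[Tuple[int, bytes]] = []
    dropped = 0
    for cell in cells:
        if len(copies) + len(cell.destinations) > capacity:
            dropped += 1  # overflow: this cell is lost in this slot
            continue
        copies.extend((dest, cell.payload) for dest in cell.destinations)
    return copies, dropped


def route(copies: List[Tuple[int, bytes]], n_outputs: int) -> List[List[bytes]]:
    """Point-to-point routing stage: deliver each copy to its single output."""
    outputs: List[List[bytes]] = [[] for _ in range(n_outputs)]
    for dest, payload in copies:
        outputs[dest].append(payload)
    return outputs


if __name__ == "__main__":
    slot = [Cell(b"a", (0, 1)), Cell(b"b", (1, 2, 3)), Cell(b"c", (0,))]
    copies, dropped = copy_network(slot, capacity=4)
    print(route(copies, n_outputs=4), "dropped:", dropped)
```

The overflow branch corresponds to the performance degradation mentioned in the text: replication demand can exceed what the copy network can deliver in one slot.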
Another architecture of multicast switch is shown in Fig 1.5(b). Cell duplication and cell routing are integrated together in the implementation. For example,
[7, 11, 12, 16, 17, 18, 19, 20, 43, 44] are switches using either output buffers or
shared memory to handle the cell copying and to schedule cells at the same time. The
multicast IQ switches also belong to this type of switch architecture. [28] is a typical
shared-memory architecture combining cell duplication and cell delivery. Most
recent switch designs adopt this integrated switch architecture to reduce hardware
complexity and to achieve efficient buffer management as well.

³ In this paper, the multicast cell is defined as a cell with one destination or multiple
destinations.
1.3.2 Scalability

Scalability can be evaluated from two aspects: capacity and expandability. Internet
applications continue to grow and create an ever-increasing demand for bandwidth.
Switches have to be scalable to avoid being frequently re-architected in order to
support massive increases of traffic. Thus, core switches face an emerging challenge
to provide more than 100 Gb/s, or even Terabit/s, capacity. Existing switches using
current state-of-the-art technology can obtain a capacity of up to several tens of Gb/s, but
cannot easily reach Terabits/sec due to constraints such as memory access
rate or arbitration complexity. For example, shared-memory switches are optimal in
performance and also cost effective. However, the switch size and capacity of a shared-memory
switch are governed by the following relation:

where R is the input line rate, and N is the number of switch inputs (outputs).
Bounded by the RAM read/write rate, the shared-memory switch is
not able to meet the high capacity expectation.
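The relation itself did not survive in this copy of the text. A commonly used form of the shared-memory bound, given here as an assumption and not as the author's equation, requires the memory to complete N writes and N reads in every cell slot. The symbols L (cell length in bits) and t_mem (time per cell-wide memory access) are introduced only for this illustration:

```latex
% Assumed form of the shared-memory constraint (not recovered from the source):
% N writes + N reads must fit in one cell slot of duration L / R.
t_{\mathrm{mem}} \;\le\; \frac{L}{2\,N\,R}
\qquad\Longleftrightarrow\qquad
N R \;\le\; \frac{L}{2\,t_{\mathrm{mem}}}
```

Under this form, the aggregate capacity N·R is capped by the memory access time, which matches the remark that the RAM read/write rate bounds the switch.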
In addition to capacity, another necessary requirement for scalability is expandability. It considers whether switch architecture supports increased speeds or
additional switch ports, and how flexible the switch can be to pursue an expanding
configuration. The best solution would be a modular switch architecture.

1.3.3 Low Complexity

Both hardware complexity and scheduling arbitration complexity must be minimized.
Hardware complexity is often measured in terms of logic gate counts, chip pinout,
memory speed, and implementation costs. From prototype design to real implementation,
the above concerns should be carefully evaluated. For example, a multicast switch
using a copy network usually has implementation redundancy and incurs high hardware
costs. In addition, some switches such as the OQ switch are limited by the memory
access speed because of the up to N times speedup required. A switch fabric with
shuffle connections from N switch inputs to N switch outputs gains reliability but
pays a high connection cost. In short, an efficient switch design should minimize
the hardware complexity without sacrificing reliability and performance.
Apart from hardware complexity, arbitration complexity should be low enough to keep
up with the hardware design. The IQ switch, for example, is better than the OQ switch in
terms of hardware complexity. However, the IQ switch uses a centralized scheduler
to resolve HOL blocking, so it incurs a high arbitration complexity
of at least O(N^2.5). This arbitration complexity hinders the IQ switch from growing into a
large scale switch.
In summary, we may need to trade off between hardware complexity and
arbitration complexity in order to pursue a good solution based on specific
design requirements.
1.3.4 High Performance

Switches should provide satisfactory performance. Bellcore has recommended
performance requirements and objectives for Broadband Switching Systems (BSS)
[42]. Table 1.1 defines three classes of Quality of Service (QoS) and explains the
associated performance objectives.


Table 1.1 Performance requirements and objectives for BSS [42]. * : includes nonqueueing related delays but excludes propagation, and does not include delays due
to processing above ATM layer. N/S : not specified.

QoS class 1 is dedicated to cell loss sensitive applications. It corresponds to
AAL class A service, which is defined by ITU-T Study Group XIII and the ATM Forum.
QoS class 3 applies to low latency, connection-oriented data transfer applications
and is intended for AAL class C service. In addition, QoS class 4 applies to low
latency, connectionless data transfer applications and is intended for AAL class D service.
The performance parameters include cell loss ratio, cell transfer delay, and cell
delay variation. The performance objectives associated with a QoS class are determined
by the status of the cell loss priority (CLP) bit in the ATM cell header. End users can
initialize the CLP bit, but switches along the connection path can change it according
to network conditions.
For all three QoS classes, the probability of a cell transfer delay greater than
150 µs is guaranteed to be less than 1 percent, i.e. :
Pr[ cell transfer delay > 150 µs ] < 0.01
The probability of a cell delay variation (CDV) greater than 250 µs is required to
be less than $10^{-10}$ for QoS class 1, and less than $10^{-7}$ for QoS class 3/4.

In addition to the above performance objectives, switches need to be flexible
enough to cooperate with other technologies, such as connection admission control,
buffer management, and traffic engineering, in order to provide Quality of Service
(QoS) guarantees.

1.4 Outline
Our goal is to design a scalable terabit multicast packet switch that offers
multicasting capability, large capacity, low complexity, modular configuration, and
high performance. In this dissertation, we propose three switch architectures with
associated scheduling algorithms, namely Switch I, Switch II, and Switch III, to
achieve the desired switch. Our designs benefit from two features: a modular
switch architecture and distributed scheduling arbitration.
In chapter 2, we first present a theoretical study of the performance of the copy
network under three scenarios : (1) Non-Buffer-NonSplitting copy network (NBNS);
(2) Shared-Input-Buffer-NonSplitting copy network (SIBNS); (3) Shared-Input-Buffer-
Splitting copy network (SIBS). For NBNS, we derive the exact overflow and
cell loss probabilities instead of the Chernoff Bound [5]. Furthermore, we propose a
general Markov Model, a novel theoretical approach, for the performance analysis
of the Shared-Input-Buffer copy networks. This analysis method can be applied to
both SIBNS and SIBS. Theoretical and simulation results are compared for every
scenario.
In chapter 3, we propose a novel switch design, namely Switch I, using input and
output link sharing. Switch inputs and outputs are grouped into small modules called
Input Shared Blocks (ISBs) and Output Shared Blocks (OSBs). Link sharing resolves
output contention and eliminates the speedup requirement for the central switch
fabric. Two Round Robin (RR) scheduling algorithms are proposed. Both schemes
provide a group mapping from an ISB to an OSB, so scheduling complexity is
dramatically reduced. The switch can easily be extended to high capacity and large
scale. Performance evaluation demonstrates that the switch can achieve good
performance under uniform multicast traffic (see footnote 4). However, isolated
Input Shared Blocks (ISBs) prevent the switch from achieving high performance
under non-uniform traffic.
To overcome the weakness of Switch I, in chapter 4, we present Switch II,
a modified switch design using link sharing and prioritized link reservation. ISBs
are connected by a token ring. We propose a Round Robin Prioritized Output Link
Reservation (RR+POLR) algorithm to allocate link resources and alleviate starvation
of OSBs. Switch II obtains improved performance under non-uniform traffic. However,
the RR+POLR algorithm is not flexible enough to adapt to dynamic traffic in a timely
manner. Switch performance is largely determined by how fast a link reservation
rate the switch can achieve.
Switch III, an enhanced switch design using link sharing and dual round
robin dynamic link reservation, is finally proposed in chapter 5. Unlike the previous
two switches, ISBs are connected by dual rings on which K link request tokens
(REQs) and K link release tokens (RELs) circulate in a round robin manner. Cell
delivery is based on link reservation in every ISB. We propose two Queue Occupancy
Based Dynamic Link Reservation (QOBDLR) algorithms to achieve a fast and fair
link resource allocation among ISBs. QOBDLR is a distributed link reservation
scheme in which every ISB, according to its local information, can dynamically
increase or decrease its link reservation by "borrowing" links from, or "lending"
links to, other ISBs. Arbitration complexity is $O(1)$. Switch III is competitive with
OQ switches in the sense that it not only achieves performance comparable to OQ
switches under any traffic pattern but also eliminates the N times speedup required
in OQ switches.
Finally, conclusions are drawn and future work is discussed in chapter 6.

Footnote 4: In this work, multicast traffic includes unicast traffic, i.e., a multicast
cell may have one or multiple destinations.

CHAPTER 2
A NOVEL PERFORMANCE ANALYSIS FOR THE COPY
NETWORK IN A MULTICAST ATM SWITCH
2.1 Introduction
To accommodate the growing demand for a wide class of services, such as voice,
data, teleconferencing and entertainment video, a broadband packet network needs
to support not only point-to-point connections, but also multipoint connections.
Multicast switching is a solution for delivering information from a given source to a
group of destinations.
A conventional architecture of multicast ATM switches consists of a nonblocking
copy network followed by a traditional point-to-point ATM switching network
[4][5][26][47][49]. It provides point-to-multipoint connections by performing two
operations : packet replication and packet switching. The copy network replicates
an incoming cell into the required number of copies.
By applying a self-routing non-blocking fabric, the copy network does not have
any internal conflict. However, the copy network faces the problem of overflow if the
total number of copies required exceeds the number of output lines of the network.
Various schemes [47][50][52] have been proposed to maximize the throughput of the
copy network. They introduce additional buffers (input/output/central buffer) and/or
scheduling algorithms (one-shot, splitting, etc.) in order to maximize the number of
cell copies injected into the point-to-point switching network.
In this chapter, we present a theoretical study of the performance of the copy
network in three typical scenarios (shown in Fig 2.1). In the Non-Buffer Non-Splitting
(NBNS) copy network (Fig 2.1(a)), all the copies required by a multicast cell are
replicated in the same time slot. The copy network has no internal buffer to save
blocked cells, so NBNS causes high cell loss. To prevent the blocked cells from being
lost, we introduce a shared input buffer in the copy network. Two scheduling
algorithms are considered for the Shared-Input-Buffer copy network : with the
Non-Splitting algorithm (SIBNS) (Fig 2.1(b)), all the copies required by a multicast
cell are replicated in the same time slot; with the Splitting algorithm (SIBS)
(Fig 2.1(c)), a multicast cell can be partially copied in a time slot, and the remaining
copies are delayed to the next time slot.

Figure 2.1 Three scenarios of copy network in a multicast ATM switch
For NBNS, we derived the exact overflow and cell loss probabilities instead of
the Chernoff Bound [5]. Furthermore, we propose a novel theoretical approach based
on a general Markov model, for the performance analysis of the Shared-Input-Buffer
copy networks. This analysis method can be applied for both SIBNS and SIBS.
Both theoretical analysis and simulation results are presented for every scenario.
The comparison shows that shared-input-buffer (SIBNS and SIBS) can obtain an
improved performance with lower cell loss and higher throughput. However, the
tradeoff is long cell delay. With the splitting algorithm, SIBS can provide better
performance than NBNS and SIBNS.
This chapter is organized as follows. In Section 2.2, we provide several
notations and assumptions that we use throughout this chapter. Section 2.3 presents
performance analysis for NBNS copy network. In Section 2.4, we propose a general
Markov Model for the performance analysis of both SIBNS and SIBS copy networks.
The analysis model is examined by the numerical and simulation results. Conclusions
are finally drawn in Section 2.5.

2.2 Notation and Assumptions

We assume that : (1) input lines are independent and identically distributed; (2)
cell arrivals follow a Poisson process. If an input line has cells arriving, it is
an active line; otherwise, it is an idle line.

N : size of the copy network (for an 8-input/8-output copy network, N = 8);
$C_{max}$ : the maximum number of copies allowed for every multicast cell,
$0 < C_{max} \le N$;
C : random variable representing the number of copies required, assumed to be
uniformly distributed;
$C_k$ : probability that the number of copies is k, i.e. the pdf of random variable C;

2.3 Performance Analysis of NBNS Copy Network
Assume that, in every cell slot, the copy network serves incoming multicast cells from
the 1st input line to the $N$th input line (i.e., in top-down order). If the total
number of desired cell copies exceeds the size of the copy network, some multicast
cells arriving at the later input lines will be discarded.
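
The top-down service rule can be sketched in a few lines of Python. This is only an illustration under an assumption that the text does not state explicitly: acceptance is decided by a running sum of copy requests, so once the sum exceeds N every later active line is discarded as well. The function name `nbns_serve` is hypothetical.

```python
from typing import List, Tuple


def nbns_serve(requests: List[int], n: int) -> Tuple[List[int], List[int]]:
    """Serve one cell slot of a non-buffered, non-splitting copy network.

    requests[i] is the number of copies requested on input line i
    (0 means the line is idle). Returns (accepted_lines, discarded_lines).
    Assumption of this sketch: a running sum decides acceptance, so once
    the sum exceeds n every later active line is discarded too.
    """
    accepted: List[int] = []
    discarded: List[int] = []
    running = 0
    for line, copies in enumerate(requests):
        if copies == 0:
            continue  # idle line, nothing to serve
        running += copies
        if running <= n:
            accepted.append(line)
        else:
            discarded.append(line)  # no buffer in NBNS: the cell is lost
    return accepted, discarded


# Lines 0 and 2 fit into N = 8 outputs; lines 3 and 4 overflow.
print(nbns_serve([3, 0, 4, 2, 1], 8))
```

Under this rule the later input lines are the ones that lose cells, which is the source of the unfairness between input lines.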


each input line. There is an unfairness : later input lines have a higher overflow
probability. Our analysis provides an exact overflow probability, while the Chernoff
Bound [5] is much looser. Fig 2.3 illustrates cell loss and throughput. A large copy
load ($C_{max}$) and a heavy input load ($P_{in}$) incur more cell loss and less
throughput. NBNS does not introduce any cell delay in the copy network.

Figure 2.2 Overflow probability in NBNS

2.4 Performance Analysis for both SIBNS and SIBS Copy Networks

To improve the performance of the copy network, one solution is to apply additional
buffers [47][48][50][51]. In this chapter, we focus on the shared input buffer with two
scheduling methods (NonSplitting and Splitting algorithms).

Figure 2.3 Cell loss and throughput in NBNS

SIBNS : In the Shared-Input-Buffer Non-Splitting scenario, the cell copies belonging
to the same multicast cell must be delivered in the same cell slot. Otherwise, the
multicast cell is blocked in the shared buffer with its whole copy requirement.
Buffered cells have higher priority to be served than newly arriving cells.
SIBS : In the Shared-Input-Buffer Splitting scenario, the copy network can make
partial copies for a multicast cell. The split cell is saved into the shared buffer with
its remaining copy requests.
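
The difference between the two scheduling methods can be illustrated with a small Python sketch of one cell slot. It is a simplification under stated assumptions: cells are served in FIFO order, service stops at the first cell that does not fit, and neither the finite buffer size nor new arrivals are modelled. The name `serve_slot` is hypothetical.

```python
from collections import deque
from typing import Deque


def serve_slot(queue: Deque[int], n: int, splitting: bool) -> int:
    """Serve the shared input buffer for one cell slot; return copies made.

    queue holds the outstanding copy count of each waiting multicast cell,
    oldest first. Service stops at the first cell that does not fit
    (FIFO order is an assumption of this sketch).
    """
    free = n
    while queue and free > 0:
        need = queue[0]
        if need <= free:
            queue.popleft()          # the whole cell is replicated
            free -= need
        elif splitting:
            queue[0] = need - free   # SIBS: keep only the remaining requests
            free = 0
        else:
            break                    # SIBNS: wait with the whole requirement
    return n - free


sibns = deque([5, 4, 2])
print(serve_slot(sibns, 8, splitting=False), list(sibns))

sibs = deque([5, 4, 2])
print(serve_slot(sibs, 8, splitting=True), list(sibs))
```

In the example, the non-splitting rule leaves three of the eight output lines unused in the slot, while the splitting rule fills them with partial copies of the second cell.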
2.4.1 Notation and Assumption

$BUF_{max}$ : the maximum size of the shared input buffer.
$BUF_m$ : the length of the shared input buffer at the end of the $m$th time slot.
$IN_m$ : the number of newly arriving cells from the N inputs in the $m$th time slot.
In every time slot, at most 1 cell comes into the copy network from each input line.
$OUT_m$ : the number of multicast cells successfully delivered out of the copy network
in the $m$th time slot. In a time slot, at most N multicast cells can go through the
copy network (when each cell needs just 1 copy). The probability distribution

2.4.2 The Proposed Markov Model

In Fig 2.4, we propose a general Markov Model for SIBNS and SIBS. Each state
indicates current queue length in the shared buffer, i.e., how many multicast cells
are waiting in the shared memory.


Figure 2.4 The general Markov Model for both SIBNS and SIBS
The model we propose is unique in the sense that each multicast cell occupies
only one unit in the buffer, no matter how many copies it requires. It can be applied
to many different scheduling algorithms, buffer and copy network sizes.
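
Once a transition matrix over the buffer states 0..BUFmax is available, the long-run state probabilities follow from the stationary distribution of the chain. The sketch below computes it by plain power iteration. The matrix `P_TOY` is a made-up three-state example used only for illustration; it is not the transition matrix of the model, and the function name `stationary` is hypothetical.

```python
from typing import List


def stationary(p: List[List[float]], iters: int = 10_000,
               tol: float = 1e-12) -> List[float]:
    """Stationary distribution pi of a row-stochastic matrix p (pi = pi p)."""
    size = len(p)
    pi = [1.0 / size] * size
    for _ in range(iters):
        nxt = [sum(pi[i] * p[i][j] for i in range(size)) for j in range(size)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical chain with states 0, 1, 2 buffered cells (BUFmax = 2).
P_TOY = [
    [0.7, 0.3, 0.0],
    [0.4, 0.4, 0.2],
    [0.0, 0.5, 0.5],
]
pi = stationary(P_TOY)
print([round(x, 4) for x in pi])
```

The last entry, `pi[-1]`, is the long-run fraction of slots in which the buffer is full in this toy chain.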

Figure 2.5 State transition probability matrix of the Markov chain : $P_t$

2.4.3 Performance Analysis
2.4.3.1 Cell Loss : Due to the finite buffer size, cell loss will happen when the
shared input buffer is overloaded. In our proposed Markov model, we assume that
there are i multicast cells waiting in the buffer at the end of the $(m-1)$th time slot.
Cell loss happens when the Markov chain jumps to the state $BUF_{max}$ at the $m$th
time slot.

Cell loss is illustrated in Fig 2.6. The SIBNS and SIBS copy networks cause less
cell loss than the NBNS copy network. In fact, SIBNS and SIBS significantly reduce
the cell loss in the region where $C_{max}$ and $P_{in}$ jointly give an average load to
the copy network. Compared with SIBNS, SIBS has lower cell loss.

2.4.3.2 Throughput : Throughput is evaluated as the number of multicast cells
successfully passing the copy network every time slot.

Figure 2.6 Cell loss in three scenarios : NBNS, SIBNS and SIBS

Figure 2.7 Throughput in three scenarios : NBNS, SIBNS and SIBS

As shown in Fig 2.7, SIBS achieves higher throughput than SIBNS and NBNS.
Higher throughput results from lower cell loss. Throughput increases with a larger
buffer size.

2.4.3.3 Cell Delay : Let E(d) be the average delay of a multicast cell
waiting in the copy network, and E(n) the average buffer length occupied by the
blocked multicast cells. According to Little's formula, we have

$$E(n) = \lambda'_{eff} \, E(d), \qquad \text{i.e.} \qquad E(d) = \frac{E(n)}{\lambda'_{eff}},$$

where $\lambda'_{eff}$ is the average input load which is accepted by the copy network
every time slot. Therefore, $\lambda'_{eff}$ is actually the same as the throughput.
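
As a small numerical illustration of Little's formula, the following sketch derives E(n) from a given distribution of the buffer length and divides it by the accepted load. The distribution and the throughput value in the example are made up for illustration, and the function names are hypothetical.

```python
from typing import Sequence


def mean_queue_length(pi: Sequence[float]) -> float:
    """E(n) = sum_k k * pi[k], with pi[k] the probability of k buffered cells."""
    return sum(k * prob for k, prob in enumerate(pi))


def mean_delay(pi: Sequence[float], throughput: float) -> float:
    """E(d) = E(n) / lambda_eff (Little's formula), measured in time slots."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return mean_queue_length(pi) / throughput


# Hypothetical numbers: E(n) = 0.7 cells, accepted load 1.4 cells per slot.
print(mean_delay([0.5, 0.3, 0.2], 1.4))
```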

Fig 2.8 shows the cell delay performance. When the copy load $C_{max}$ or the
input load $P_{in}$ becomes heavy, more cells are blocked in the buffer, which increases
the cell delay. The SIBS copy network has less cell delay than the SIBNS copy network.
A larger buffer results in longer cell delay. Under our assumption on $C_k$, the cell
delay increases linearly with buffer size. However, the proposed theoretical approach
can be applied to other distributions of $C_k$.

2.4.4 Validation of the Markov Model for SIBNS and SIBS
Figure 2.8 Cell delay in shared-input-buffer copy networks : SIBNS and SIBS

We propose a general Markov Model for the Shared-Input-Buffer copy network with
and without the splitting algorithm (SIBNS and SIBS). In fact, the difference between
SIBNS and SIBS exists only in the place where we compute $P(OUT = m)$, which is
the probability that $m$ multicast cells successfully pass through the copy network.
$P(OUT = m)$ is eventually derived in terms of the overflow probability as :

In Eq. 2.21, each term of the form $P(CP_1 + CP_2 + \cdots + CP_i > N)$ can
be obtained by the convolution of the $CP_k$, like :

With the NonSplitting algorithm, in the SIBNS copy network, $CP_1$ is always the
original number of copies required by the 1st multicast cell. Therefore, we have

However, in the SIBS copy network, the 1st multicast cell may have been split.
Therefore, the copies required by the 1st cell might be only part of the original copy
requirement.

where CP
buffer.
In Eq. 2.24, $P(CP_1' = m)\,P(CP_1 = l \mid CP_1' = m)$ is the probability that the
1st multicast cell in the buffer currently needs $l$ copies instead of the $m$ copies
which is the original requirement. The remaining $l$ copies might be any value which is
positive but not larger than $m$. Therefore,

From the above discussion, the Markov Model we propose here is generic for
both SIBNS and SIBS scenarios. Our Markov Model and the corresponding analysis are
a novel approach for the performance analysis of the copy network in a multicast
ATM switch.
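
The overflow terms of the form P(CP_1 + ... + CP_i > N) used in this section can be evaluated numerically by convolving the distributions of the individual copy requests. The sketch below assumes independent copy requests, each given as a list `pmf` with `pmf[c] = P(CP = c)`. The helper names are hypothetical, and the code only illustrates the convolution step, not the complete expression for P(OUT = m).

```python
from typing import List


def convolve(a: List[float], b: List[float]) -> List[float]:
    """Distribution of the sum of two independent non-negative integers."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def overflow_probability(pmfs: List[List[float]], n: int) -> float:
    """P(CP_1 + ... + CP_i > n), where pmfs[k][c] = P(CP_{k+1} = c)."""
    total = [1.0]  # distribution of the empty sum
    for pmf in pmfs:
        total = convolve(total, pmf)
    return sum(total[n + 1:])


# Two cells, each requesting 1 or 2 copies with equal probability.
uniform_1_2 = [0.0, 0.5, 0.5]
print(overflow_probability([uniform_1_2, uniform_1_2], 3))
```

For SIBS, the first list would describe the remaining copies of a possibly split cell rather than its original request.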

2.5 Conclusion

In this chapter, we analyze the performance of the copy network in a multicast ATM
switch under three scenarios : NBNS, SIBNS and SIBS. Theoretical analysis is carried
out for the three cases and compared with simulation results. We propose a general
Markov Model for the Shared-Input-Buffer copy network. Our analysis model is shown
to be a novel approach for evaluating the performance of copy networks.
The multicast switch evaluated in this chapter consists of a copy network followed
by a traditional point-to-point ATM switching network. This architecture carries a lot
of redundancy because the copy network and the routing network are implemented
separately. The switch designs proposed in later chapters integrate the functions of
cell replication and cell routing to reduce the hardware complexity.

CHAPTER 3
SWITCH I: A LARGE SCALE MULTICAST ATM SWITCH USING
INPUT AND OUTPUT LINK SHARING
3.1 Introduction
In this chapter, we propose Switch I, a novel switch architecture using input and
output link sharing. Switch inputs and switch outputs are grouped into small
modules called Input Shared Blocks (ISBs) and Output Shared Blocks (OSBs).
Input link sharing resolves output contention and avoids link starvation. Output
link sharing eliminates the speedup requirement for the central switch fabric when
more than one cell goes to a switch output. Two round robin scheduling algorithms
are presented : Individual Virtual Output Queue Round Robin (IVOQ Round Robin)
and Grouped Virtual Output Queue Round Robin (GVOQ Round Robin).
Both schemes support group mapping from an ISB to an OSB so that scheduling
complexity is significantly reduced. Switch performance is evaluated through
simulations, which show that Switch I can achieve performance comparable to that
of the OQ switch under uniform traffic. Switch I is scalable due to its modular
configuration.
This chapter is organized as follows. In Section 3.2, we describe the proposed
switch architecture in detail. In Section 3.3, we introduce two cell scheduling
algorithms : IVOQ Round Robin and GVOQ Round Robin. Switch performance is
evaluated in Section 3.4. Conclusions are drawn in Section 3.5.

3.2 Switch Architecture
Fig 3.1 depicts the architecture of Switch I, which consists of three major modules
: Input Shared Block (ISB), Output Shared Block (OSB), and ATM Central Switch
Fabric (ATMCSF). The N switch inputs and the N switch outputs are respectively
grouped into K ISBs and K OSBs, where K = N/m. At every ISB-ATMCSF interface,
there are M input links shared by m related switch inputs. At every ATMCSF-OSB
interface, there are M output links shared by m grouped switch outputs. In this
dissertation, we only consider the case of m = M; the case of M > m, which
implies a virtual speedup in the ATMCSF, is the subject of our ongoing work. Applying
input link sharing and output link sharing together makes the proposed switch a
unique design.
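
The grouping arithmetic can be written out as a short sketch. It assumes that N is a multiple of m and that consecutive ports are grouped together, i.e. port p belongs to block p // m. The text only says that ports are grouped, so this numbering and the function names are illustrative assumptions.

```python
def block_count(n: int, m: int) -> int:
    """K = N / m, the number of ISBs (and OSBs) in an N x N switch."""
    if m <= 0 or n % m != 0:
        raise ValueError("N must be a positive multiple of m")
    return n // m


def block_of_port(port: int, m: int) -> int:
    """Index of the ISB (or OSB) that serves a given input (or output) port."""
    return port // m


print(block_count(256, 32), block_count(256, 8))  # 8 and 32 blocks
print(block_of_port(40, 32))                      # port 40 sits in block 1
```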

Figure 3.1 Switch I : an NxN switch consists of K ISBs, K OSBs and ATMCSF;
K = N/m and m = M in this dissertation. Input link sharing is achieved at every
ISB-ATMCSF interface, and output link sharing is achieved at every ATMCSF-OSB
interface.

3.2.1 Input Shared Block

An ISB can be a shared memory receiving multicast cells from m (= M) related
switch inputs. A multicast cell is saved once in an ISB instead of keeping j identical
cell copies (where j is the fanout of the multicast cell, 0 < j < N). We investigate two
schemes for shared memory management in an ISB (shown in Fig 3.2) : Individual
Virtual Output Queue (IVOQ), and Grouped Virtual Output Queue (GVOQ) [54].

Figure 3.2 Input Shared Block (ISB) : (a) the $j$th ISB using IVOQs ; (b) the $j$th
ISB using GVOQs.

The IVOQ scheme is shown in Fig 3.2(a). Every ISB keeps N virtual output queues.
Each virtual queue is a linked list of the multicast cells going to the same switch
output. The physical address at which a multicast cell is saved is stored into every
related linked list. Cell delivery from a virtual output queue follows the FIFO
principle. When all copies of a multicast cell have been delivered, the memory address
of this cell is released and becomes available for a new cell.
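
A minimal Python sketch of this memory management is given below: the cell is stored once, its address is appended to every related virtual output queue, and the address is released after the last copy has left. The class name `IvoqIsb`, the use of a growing counter as the address, and the dictionary standing in for the shared memory are simplifications introduced here for illustration.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Optional


class IvoqIsb:
    """Shared-memory ISB with one virtual output queue per switch output."""

    def __init__(self, n_outputs: int) -> None:
        self.voq: List[Deque[int]] = [deque() for _ in range(n_outputs)]
        self.payload: Dict[int, bytes] = {}  # address -> stored cell
        self.pending: Dict[int, int] = {}    # address -> copies still owed
        self._next_addr = 0                  # simplification: no free list

    def enqueue(self, cell: bytes, destinations: Iterable[int]) -> int:
        dests = sorted(set(destinations))
        if not dests:
            raise ValueError("a cell needs at least one destination")
        addr = self._next_addr
        self._next_addr += 1
        self.payload[addr] = cell            # the cell itself is stored once
        self.pending[addr] = len(dests)
        for out in dests:
            self.voq[out].append(addr)       # only the address is queued
        return addr

    def dequeue(self, output: int) -> Optional[bytes]:
        if not self.voq[output]:
            return None
        addr = self.voq[output].popleft()    # FIFO within a virtual queue
        cell = self.payload[addr]
        self.pending[addr] -= 1
        if self.pending[addr] == 0:          # all copies delivered
            del self.payload[addr]
            del self.pending[addr]
        return cell


isb = IvoqIsb(n_outputs=4)
addr = isb.enqueue(b"cell", [1, 3])
isb.dequeue(1)
print(addr in isb.payload)   # still stored, one copy is outstanding
isb.dequeue(3)
print(addr in isb.payload)   # released after the last copy
```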


The GVOQ scheme is shown in Fig 3.2(b). Every ISB only maintains K (= N/M)
grouped virtual output queues. A grouped virtual output queue is a linked list of
the multicast cells targeting an OSB. If a multicast cell has more than one destination
in an OSB, only a single connection carrying all desired destinations is attached to
the related grouped virtual output queue. Hence, a cell delivered from an ISB to the
ATMCSF may carry multiple destinations, and it is stored into every related
output queue when it is received by an OSB. Compared with the switch using
IVOQs, the switch with GVOQs can forward more cell copies from ISBs to OSBs,
so that the switch can achieve better performance. The GVOQ scheme follows the
FIFO principle to receive and deliver cells.
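
The grouping of destinations per OSB can be sketched as follows. The sketch assumes that output port p belongs to OSB p // m (a numbering chosen here for illustration) and represents each queue entry as a pair of a cell identifier and the destinations inside one OSB. The names `group_by_osb` and `gvoq_enqueue` are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, FrozenSet, Iterable, List, Set, Tuple

Entry = Tuple[int, FrozenSet[int]]  # (cell id, destinations inside one OSB)


def group_by_osb(destinations: Iterable[int], m: int) -> Dict[int, FrozenSet[int]]:
    """Split a fanout set by OSB, assuming output p belongs to OSB p // m."""
    groups: Dict[int, Set[int]] = {}
    for port in destinations:
        groups.setdefault(port // m, set()).add(port)
    return {osb: frozenset(ports) for osb, ports in groups.items()}


def gvoq_enqueue(gvoq: List[Deque[Entry]], cell_id: int,
                 destinations: Iterable[int], m: int) -> None:
    """Attach one entry per targeted OSB, carrying all destinations in it."""
    for osb, ports in group_by_osb(destinations, m).items():
        gvoq[osb].append((cell_id, ports))


# N = 8, m = M = 4, K = 2: cell 7 goes to outputs 0 and 1 (OSB 0) and 5 (OSB 1).
queues: List[Deque[Entry]] = [deque(), deque()]
gvoq_enqueue(queues, 7, [0, 1, 5], m=4)
print([list(q) for q in queues])
```

In the example, the two destinations in OSB 0 share a single queue entry, whereas an IVOQ organisation would hold one entry per destination.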
Since an ISB-ATMCSF interface has a capacity of M links, an ISB can deliver
at most M cells to the central switch fabric in every cell slot. An ISB can send
a cell through any of the M shared links. Input link sharing is able to avoid link
starvation if some virtual output queue is empty, because other virtual output queues
can utilize the idle link to deliver their cells. Input link sharing results in an improved
performance.
3.2.2 Output Shared Block

An OSB is a shared memory containing M output queues, as shown in Fig 3.3. In
every cell slot, each output queue delivers one cell to the related switch output.
An ATMCSF-OSB interface only supports M links; hence, each OSB can accept at
most M cells from the central switch fabric in every cell slot. The ATMCSF can use
any of the M shared links to pass a cell to an OSB. Without output link sharing, if
more than one cell goes to the same switch output, either cells are blocked or the
switch fabric has to speed up. Output link sharing avoids both problems.


Figure 3.3 Output Shared Block (OSB) with output link sharing (here, m = M).

3.2.3 Central Switch Fabric

The central switch fabric (ATMCSF) should keep the same cell sequence for those
cell copies which are delivered from an ISB to an OSB. Apart from this, no other
restrictions are placed on ATMCSF. It can be any type of switch fabric (for example,
Abacus switch [12]), and no speedup is necessary.

3.3 Cell Scheduling

Cell scheduling aims to deliver cells from K ISBs to K OSBs in a fast manner. As
shown in Fig 3.4, Switch I utilizes the well-known Round Robin (RR) scheme as the
cell scheduling algorithm.
In every cell slot, there is a one-to-one mapping from K ISBs to K OSBs;
thus, each ISB is responsible for sending up to M cells to its matched OSB. Round
Robin mapping ensures that an ISB has an opportunity to send cells to every OSB
in every K cell slots, so fairness among ISBs is guaranteed. Since either IVOQs or
GVOQs are employed in every ISB, we propose two scheduling algorithms : IVOQ
Round Robin and GVOQ Round Robin.
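
One simple way to realise such a mapping is a cyclic shift of the ISB index by the slot number. The formula below is an assumption made for illustration, since the text only requires a one-to-one mapping that visits every OSB within K slots; the function name `matched_osb` is hypothetical.

```python
def matched_osb(isb: int, slot: int, k: int) -> int:
    """OSB matched to a given ISB in a given cell slot (cyclic shift)."""
    return (isb + slot) % k


K = 4
for slot in range(K):
    print(slot, [matched_osb(isb, slot, K) for isb in range(K)])
```

Each printed row is a permutation of the OSB indices, and each column runs through all K OSBs, so every ISB meets every OSB once per K slots.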


Figure 3.4 Switch I applies round robin (RR) cell scheduling which is based on a
one-to-one group mapping from K ISBs to K OSBs.

3.3.1 IVOQ Round Robin

An example of IVOQ RR algorithm is illustrated in Fig 3.5. The basic rules of IVOQ
RR algorithm are as follows.
Every ISB divides its N individual virtual output queues into K subgroups.
Each subgroup has M virtual output queues which are engaged to a certain OSB.
According to the one-to-one mapping in current cell slot, an ISB delivers cells from
a subgroup of M virtual output queues to its matched OSB. The HOL cells from the
selected M virtual queues are sent to central switch fabric.
Figure 3.5 IVOQ Round Robin in a 4x4 switch (N = 4, m = M = 2, K = 2)

If a polled virtual output queue is empty (refer to * in Fig 3.5), other virtual
queues in the same subgroup can deliver more than one cell. This is possible because
input link sharing avoids link starvation. The scheduling complexity depends on how
many iterations are needed to select up to M cells from the M IVOQs of the
subgroup. Thus, scheduling complexity is in the range of [$O(1)$, $O(M)$]. In addition,
it may happen that more than one cell goes to the same switch output (refer to #
in Fig 3.5). Hence, output link sharing is needed to avoid internal cell loss and to
eliminate speedup in the central switch fabric. The switch fabric is required to keep
the same cell sequence for those cells delivered from an ISB to an OSB.
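
The selection step of IVOQ RR within one subgroup can be sketched as repeated passes over the M virtual output queues. Each pass takes at most one head-of-line cell per queue, and further passes reuse the links left idle by empty queues. The polling order inside a pass and the function name are assumptions of this sketch.

```python
from collections import deque
from typing import Deque, List, Tuple


def ivoq_rr_select(subgroup: List[Deque[int]], m_links: int) -> List[Tuple[int, int]]:
    """Pick up to m_links cells from the virtual output queues of one subgroup.

    Returns (queue_index, cell) pairs in sending order. Every pass takes at
    most one head-of-line cell per queue; extra passes fill links that
    empty queues left idle (input link sharing).
    """
    chosen: List[Tuple[int, int]] = []
    while len(chosen) < m_links:
        progressed = False
        for idx, queue in enumerate(subgroup):
            if queue and len(chosen) < m_links:
                chosen.append((idx, queue.popleft()))
                progressed = True
        if not progressed:
            break  # every queue of the subgroup is empty
    return chosen


# M = 2: the second queue is empty, so the first one sends two cells.
print(ivoq_rr_select([deque([10, 11]), deque()], m_links=2))
```

One pass suffices when no queue is empty, and up to M passes are needed when a single queue supplies all cells, which matches the stated complexity range.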
3.3.2 GVOQ Round Robin

An example of GVOQ RR algorithm is shown in Fig 3.6. GVOQ RR algorithm has
several good features.
Figure 3.6 GVOQ Round Robin in a 4x4 switch (N = 4, m = M = 2, K = 2)

First, an ISB only maintains K grouped virtual output queues instead of
keeping N individual virtual output queues. Each grouped virtual output queue
is for an OSB (i.e. for m = M grouped switch outputs). According to the one-to-one
mapping from ISBs to OSBs in the current cell slot, every ISB only needs to poll the
grouped virtual output queue for its mapped OSB, and then delivers the first
M cells, if any, to the ATMCSF (refer to * in Fig 3.6). Scheduling complexity is $O(1)$,
so GVOQ RR is simpler than IVOQ RR.
Moreover, using GVOQs in ISBs is able to offer better performance especially
under multicast traffic, because a cell delivered from a grouped virtual output queue
can carry multiple destinations to the matched OSB (refer to * in Fig 3.6). Compared
with IVOQ RR, GVOQ RR algorithm results in a faster cell forwarding from ISBs
to OSBs.
However, the GVOQ RR algorithm has a flaw : it may block a cell going
to an idle switch output while sending more than one cell to another switch output
(refer to # in Fig 3.6).
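
The following sketch shows the GVOQ RR selection and reproduces the flaw on a small example. Queue entries are simplified to pairs of a cell label and a single output port, and the function name is hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple

Cell = Tuple[str, int]  # (cell label, destination output), simplified


def gvoq_rr_select(gvoq: Deque[Cell], m_links: int) -> List[Cell]:
    """Take the first m_links entries of the grouped queue, in FIFO order."""
    return [gvoq.popleft() for _ in range(min(m_links, len(gvoq)))]


# M = 2, the OSB owns outputs 2 and 3. Cells "a" and "b" both target output 2.
queue: Deque[Cell] = deque([("a", 2), ("b", 2), ("c", 3)])
print(gvoq_rr_select(queue, 2))  # "a" and "b" are sent to output 2
print(list(queue))               # "c" waits although output 3 is idle
```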

3.4 Performance Evaluation
3.4.1 Traffic Model

We evaluate the switch performance under both uniform and non-uniform traffic.
Multicast bursty traffic is applied. As shown in Fig 3.7, we use an ON (active)/OFF
(idle) model to describe the bursty traffic. The back-to-back cells in an ON duration
belong to the same VC, i.e. they have the same multiple destinations. Cell
destinations are uniformly distributed among the N switch outputs.
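
An ON/OFF source of this kind can be sketched as a Python generator. The distributions of the ON and OFF durations are not given in this section, so the sketch assumes that each period ends with a fixed probability per slot (geometric durations) and that every burst has the same fanout. The parameter names `p_end_on`, `p_end_off` and `fanout` are hypothetical.

```python
import random
from typing import FrozenSet, Iterator, Optional


def on_off_source(n_outputs: int, fanout: int, p_end_on: float,
                  p_end_off: float, rng: random.Random
                  ) -> Iterator[Optional[FrozenSet[int]]]:
    """Yield, per cell slot, the destination set of the arriving cell or None.

    Assumed model: an OFF period ends with probability p_end_off per slot and
    an ON period ends with probability p_end_on per slot. All cells of one
    burst share one destination set drawn uniformly from the outputs.
    """
    dests: Optional[FrozenSet[int]] = None  # None while the source is OFF
    while True:
        if dests is None and rng.random() < p_end_off:
            # OFF -> ON: a new burst (VC) starts with fresh destinations
            dests = frozenset(rng.sample(range(n_outputs), fanout))
        yield dests
        if dests is not None and rng.random() < p_end_on:
            dests = None                     # ON -> OFF: the burst ends


source = on_off_source(n_outputs=8, fanout=2, p_end_on=0.2,
                       p_end_off=0.5, rng=random.Random(1))
for _ in range(10):
    print(next(source))
```

Consecutive non-empty outputs carry the same destination set until the burst ends, which mirrors the statement that back-to-back cells of an ON duration belong to the same VC.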

Figure 3.7 Traffic Model : Multicast Bursty Traffic

3.4.2 Switch Performance

Using OPNET, we simulate a 256x256 switch (N = 256) with either 32x32
ISBs/OSBs (M = 32, K = 8) or 8x8 ISBs/OSBs (M = 8, K = 32). As a
comparison, we also simulate a 256x256 output queued (OQ) switch under the same
traffic conditions. The OQ switch is assumed to have infinite output buffers. Cells
arriving at switch inputs are sent to the related output queues in the same cell
slot.
There are two reasons for selecting an OQ switch as the comparison reference
: (1) OQ switches are proven to maximize throughput and optimize latency under any
traffic pattern; (2) in the literature so far, few existing switches are designed as
distributed large-scale switches and have been evaluated under arbitrary traffic
conditions. Therefore, we believe that it is fair and effective to compare our designs
with an OQ switch under the same traffic patterns.
The performance of Switch I under uniform traffic is illustrated in Table 3.1
and Table 3.2. For uniform traffic, the input load to an ISB uniformly targets the N
switch outputs. We apply both unicast traffic and multicast traffic. In unicast
traffic, every arriving cell carries only a single destination, whereas in multicast
traffic, an arriving cell may have multiple destinations. Cell destinations are
uniformly distributed among the N switch outputs.
Through simulation, we would like to : (1) investigate the impact of ISB/OSB
size on switch performance; (2) evaluate IVOQ RR and GVOQ RR algorithms; (3)
compare Switch I with the OQ switch.
Table 3.1 shows the switch performance under uniform unicast traffic.
Fig 3.8 ~ Fig 3.11 depict the throughput, end-to-end cell delay (i.e.
$D_{\text{E-to-E}}$), cell delay in ISB (i.e. $D_{ISB}$), and occupancy of OSB (i.e.
$S_{OSB}$), respectively.

Table 3.1 Switch I : switch performance under uniform unicast traffic with different
input load p. The observed performance statistics are : (1) throughput; (2) average
end-to-end cell delay and delay jitter ($D_{\text{E-to-E}}$, (Min, Max)); (3) average
cell delay in ISB and delay jitter ($D_{ISB}$, (Min, Max)); (4) average occupancy of
OSB ($S_{OSB}$, (Min, Max)).

It is observed that a larger ISB/OSB size results in better performance. For
example, if input load p is 99%, Switch I with 32x32 ISBs obtains approximately
98.5% throughput, while the switch with 8x8 ISBs suffers about 3% less throughput.
The reason is that a 32x32 ISB receives cells from 32 switch inputs, so the cells
held in an ISB are more varied in terms of their destinations.
Hence, a larger ISB is more likely to provide a saturated input load to the central
switch fabric (i.e. keep every input line of the ATMCSF busy).
The comparison also shows that the IVOQ RR algorithm outperforms the GVOQ RR
algorithm under this traffic. The reason is that IVOQ RR can send M
cells to different switch outputs of its mapped OSB if none of the M related virtual
queues is empty. But, since GVOQ RR simply schedules the first M cells from
a grouped virtual output queue to the ATMCSF, more than one cell may go to the
same switch output. Hence, IVOQ RR achieves higher throughput than GVOQ
RR. Moreover, under unicast traffic, GVOQ RR loses its unique merit of delivering
multicast cells to the ATMCSF. It is worth noting, however, that the IVOQ RR and
GVOQ RR algorithms yield very similar performance when input load p is reduced.
When traffic load is light, an ISB can usually send arriving cells to the mapped OSB
very quickly, so few cells are blocked in ISBs.
Compared to the OQ switch, Switch I exhibits promising performance under
uniform unicast traffic. Switch I suffers less than 4.0% throughput degradation
under heavy traffic load, and achieves a very similar throughput when traffic load
decreases. The OQ switch outperforms Switch I because the OQ switch is
work-conserving (see footnote 1) at every time slot, whereas Switch I is not. Switch I
has longer cell delay (i.e. $D_{\text{E-to-E}}$) than the OQ switch, especially under
heavy input load. The longer cell delay is due to lower throughput. We also measured
the queueing latency in ISBs (i.e. $D_{ISB}$) and the occupancy of OSBs (i.e.
$S_{OSB}$) for Switch I. When input load increases, both $D_{ISB}$ and $S_{OSB}$ grow.
Since the central switch fabric is assumed to deliver cells from input shared links to
output shared links in the same cell slot, end-to-end cell delay results from two
parts : cell delay in ISBs and cell delay in OSBs; i.e., we have

$$D_{\text{E-to-E}} \approx D_{ISB} + \frac{S_{OSB}}{M},$$

where $S_{OSB}/M$ is the average length of an output queue in OSBs, assuming that
m = M. Thus, this ratio approximates $D_{OSB}$, i.e. the average cell delay in OSBs.

Footnote 1: A switch is work-conserving if, in every cell slot, a switch output is not
idle as long as there are cells going to that output port.

Figure 3.8 Switch I : throughput under uniform unicast traffic.

Figure 3.9 Switch I : average end-to-end cell delay ($D_{\text{E-to-E}}$) under
uniform unicast traffic.

Figure 3.10 Switch I : average cell delay in ISB ($D_{ISB}$) under uniform unicast
traffic.

Switch performance under uniform multicast traffic is illustrated in
Table 3.2. As discussed previously, Switch I with 32x32 ISBs/OSBs achieves
better performance than that using 8x8 ISBs/OSBs. However, GVOQ RR outperforms
IVOQ RR for multicast traffic. This is due to the fact that GVOQ RR can forward
split multicast cells to the ATMCSF, whereas IVOQ RR only delivers unicast cells to
the ATMCSF. Hence, GVOQ RR provides faster cell forwarding. Compared with the
IVOQ RR algorithm, GVOQ RR can be regarded as a cost-effective algorithm in the
sense that it obtains similar or better performance than IVOQ RR
while greatly reducing the complexity of memory management and cell
scheduling.


Figure 3.11 Switch I : average size of OSB (SOSB ) under uniform unicast traffic.

The switch performance observed in Fig 3.12 ~ Fig 3.15 shows that Switch I is
able to achieve performance comparable to the OQ switch under uniform multicast
traffic. In addition, a larger ISB/OSB size results in better performance. To
mimic the OQ switch, small latency in ISBs is expected. If $D_{ISB} = 0$, the
proposed switch passes cells to OSBs as fast as the OQ switch, yet it does not
need any speedup in the central switch fabric. This is our essential goal.
Up to now, Switch I is demonstrated to support uniform traffic. However,
Switch I has a weakness to support non-uniform traffic in which cells accumulated
in an ISB may prefer to go to some switch outputs but do not go to other outputs.
A typical example of the non-uniform traffic is so called "1 ISB 1 OSB hotspot

48

Table 3.2 Switch I : switch performance under uniform multicast traffic with
different input load p. The observed performance statistics are : (1) throughput; (2)
average end-to-end cell delay and delay jitter (DE-to-E, (Min, Max)); (3) average
cell delay in ISB and delay jitter (DISB, (Min, Max)); (4) average occupancy of OSB
(SOSB and (Min, Max)).

49

Figure 3.12 Switch I : throughput under uniform multicast traffic.

traffic", i.e. cells arriving at the i-th ISB only target the switch outputs
belonging to the i-th OSB. Fig 3.16 shows the throughput of Switch I under
"1 ISB → 1 OSB hotspot traffic". Switch I suffers significant performance
degradation compared with the OQ switch. When the input load is 99%, Switch I
yields only 50% throughput under multicast traffic, and approximately 12.9%
throughput under unicast traffic. The reason is that Round Robin cell
scheduling only allows an ISB to deliver cells to its mapped OSB in a cell slot.
If an ISB has no cells for its mapped OSB, other ISBs have no authority to
deliver cells to the starved OSB. This problem motivates the new solutions
presented in the later chapters.
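
The starvation effect can be reproduced with a few lines of simulation. The
sketch below is not the simulator behind Fig 3.16. It assumes a fluid arrival
model (load * M cells per ISB per cell slot), unicast hotspot traffic, and a
round-robin mapping in which ISB i is matched to OSB (i + t) mod K in cell
slot t; the function name and parameters are illustrative.

```python
def switch1_hotspot_throughput(K=8, M=32, load=0.99, slots=8000):
    """Fluid sketch of Switch I under unicast '1 ISB -> 1 OSB' hotspot traffic."""
    backlog = [0.0] * K              # cells queued in ISB i, all destined to OSB i
    delivered = 0.0
    for t in range(slots):
        for i in range(K):
            backlog[i] += load * M   # arrivals on the M inputs of ISB i
            master = (i + t) % K     # assumed round-robin ISB -> OSB mapping
            if master == i:          # only the mapped OSB may be served in this slot
                sent = min(backlog[i], M)   # at most M shared links per OSB
                backlog[i] -= sent
                delivered += sent
    # Normalise by the total output capacity of N = K * M cells per slot.
    return delivered / (slots * K * M)


throughput = switch1_hotspot_throughput()
```

With K = 8 the result is close to 1/K = 0.125: each ISB can reach its hot OSB
only once in every K cell slots, which is in line with the roughly 13% unicast
throughput reported above.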

Figure 3.13 Switch I : average end-to-end cell delay (D_E-to-E) under uniform
multicast traffic.

Figure 3.14 Switch I : average cell delay in ISB (D_ISB) under uniform multicast
traffic.

3.5 Conclusion
In this chapter, we proposed Switch I, a novel switch architecture using input and
output link sharing. The merits of Switch I are its modular switch architecture and
its distributed cell scheduling. Compared to the OQ switch, Switch I eliminates
the speedup requirement for the central switch fabric. Moreover, the RR scheduling
algorithms resolve output contention in a distributed manner and guarantee fairness
among switch inputs. The scheduling complexity of the IVOQ RR algorithm is at most
O(M), while it is O(1) for the GVOQ RR algorithm. Compared with the centralized
schedulers with a complexity of at least O(N^2.5) proposed for IQ switches in
[30, 31, 32, 35],

Figure 3.15 Switch I : average size of OSB (S_OSB) under uniform multicast traffic.

Figure 3.16 Performance of Switch I under "1 ISB → 1 OSB hotspot non-uniform
traffic".

cell scheduling in our design is much simpler. Hence, Switch I has the features
of a scalable design.
However, Switch I has a drawback in supporting non-uniform traffic, in which
the cells injected into an ISB do not uniformly target the N switch outputs.
Since the RR scheduling algorithm only allows an ISB to deliver cells to its
matched OSB in every cell slot, if an ISB has no cells for the polled OSB, other
ISBs have no authority to deliver cells to the idle OSB. Starvation of OSBs
causes performance degradation. To resolve this problem, we present a modified
switch design in Chapter 4.

CHAPTER 4
SWITCH II: A MODIFIED SWITCH DESIGN USING LINK
SHARING AND PRIORITIZED LINK RESERVATION
4.1 Introduction
In the previous chapter, we proposed Switch I as a basic switch design using input
and output link sharing. Switch I was shown to support uniform traffic, but it is
unable to provide high performance under non-uniform traffic¹. To overcome this
drawback, in this chapter we present Switch II, a modified switch design using
link sharing and prioritized link reservation.
In Switch II, ISBs are connected through a token ring. Cell delivery in a cell slot
is based on link reservation in every ISB. We propose a round robin prioritized
output link reservation (RR+POLR) algorithm to resolve contention on input shared
links and output shared links. Basically, Switch II still applies the RR scheme to
obtain a one-to-one mapping from K ISBs to K OSBs in every cell slot. An ISB has
the highest priority to reserve as many links as possible to its mapped OSB. If an
ISB cannot fully occupy the M links to its mapped OSB, the ISB issues a token to
inform other ISBs that idle links remain at the specific ATMCSF-OSB interface.
Other ISBs can then reserve and use the available links to transfer their cells to
the OSB. Starvation of OSBs is thus alleviated, and Switch II achieves improved
performance, especially under non-uniform traffic.

4.2 Switch Architecture
Fig 4.1 shows the architecture of Switch II, which consists of K ISBs, ATMCSF, K
OSBs, and a token ring. The functions of ISB, OSB and ATMCSF are the same as those
presented for Switch I. We concluded in Chapter 3 that GVOQ is a cost-efficient
scheme compared to IVOQ. Hence, both Switch II and Switch III apply GVOQs
in each ISB.
ISBs are connected by a token ring on which K tokens circulate in a round robin
manner. Each token is associated with a specific OSB; for example, Token_j is
associated with the j-th OSB (0 ≤ j < K). As shown in Fig 4.1, a token has two
fields : (1) "OSB_ID" is the identification of an OSB; (2) "Num_Lk_Idle" records
the number of available links at the identified ATMCSF-OSB interface,
Num_Lk_Idle ≥ 0.

Figure 4.1 Switch II : an NxN switch consists of K ISBs, K OSBs, ATMCSF, and
a token ring; K = N/M. Cell delivery in a cell slot is based on link reservation. We
propose a round robin prioritized output link reservation (RR+POLR) algorithm.

¹ Non-uniform traffic is usually characterized by the "hot spot" phenomenon (i.e.
cells accumulated in an ISB may prefer some switch outputs but seldom go to other
outputs).

4.3 Cell Scheduling
4.3.1 Cell Delivery

Cell delivery is based on link reservation. Every ISB must reserve links in
advance in order to obtain the desired links at the targeted ATMCSF-OSB interfaces.
Every ISB has a link reservation vector and a queue occupancy vector. We use
LK_RSV_i and Q_i to represent the two vectors in the i-th ISB (0 ≤ i, j < K) :
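LK_RSV_i = [r_{0i}, r_{1i}, ..., r_{(K-1)i}], where r_{ji} is the number of links
reserved by the i-th ISB to the j-th OSB;
Q_i = [q_{0i}, q_{1i}, ..., q_{(K-1)i}], where q_{ji} is the queue occupancy of the
j-th GVOQ in the i-th ISB.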

The link reservation vector is renewed in every cell slot. Each ISB resets its link
reservation vector at the beginning of a cell slot, then starts reserving output
shared links according to a Round Robin Prioritized Output Link Reservation
(RR+POLR) algorithm. When a cell slot ends, every ISB delivers cells to the central
switch fabric according to its current link reservation. For example, if LK_RSV_i
is [2, 0, ..., 4] in the current cell slot, the i-th ISB will send two cells to
OSB 0 and four cells to OSB (K-1), but no cells are scheduled to other OSBs.
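
As a minimal sketch of this delivery step, the code below drains each GVOQ
according to a link reservation vector. It assumes K = 3 for brevity and models
each GVOQ as a FIFO of cell labels; `deliver_cells` and the cell names are
illustrative, not part of the original design.

```python
from collections import deque


def deliver_cells(gvoqs, lk_rsv):
    """Send up to lk_rsv[j] cells from GVOQ j toward OSB j; return cells per OSB."""
    sent = []
    for queue, reserved in zip(gvoqs, lk_rsv):
        n = min(reserved, len(queue))       # cannot send more cells than are queued
        sent.append([queue.popleft() for _ in range(n)])
    return sent


# One ISB with K = 3 GVOQs and the reservation vector [2, 0, 4].
gvoqs = [deque(["a0", "a1", "a2"]),
         deque(["b0"]),
         deque(["c0", "c1", "c2", "c3", "c4"])]
sent = deliver_cells(gvoqs, [2, 0, 4])
```

Here two cells leave for OSB 0, none for OSB 1, and four for OSB 2; one cell
remains in each GVOQ for a later cell slot.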
4.3.2 Link Reservation : RR+POLR Algorithm


Definition 2 : Link Reservation Slot, i. e. Rsv_Slot.

Rsv_Slot is defined as a small time interval during which an ISB receives a token and
makes link reservation to the identified OSB. Rsv_Slot is independent of Cell_Slot;
usually, Rsv_Slot << Cell_Slot. When a cell slot is due, every ISB delivers cells to
ATMCSF according to its current link reservation vector.

Table 4.1 illustrates the RR+POLR algorithm, which is performed in every cell slot.
Each ISB resets its link reservation vector at the beginning of a cell slot. Switch II
adopts the Round Robin (RR) scheme proposed in Switch I (Fig 3.4) to obtain a
one-to-one mapping from K ISBs to K OSBs in every cell slot. The OSB mapped to an
ISB is called the ISB's Master-OSB. In the 1st Rsv_Slot of a cell slot, an ISB has
the highest priority to reserve as many links as possible to its Master-OSB. If an
ISB does not fully occupy the M links to its Master-OSB, the ISB issues a token
carrying the available links for this OSB. Tokens pass through ISBs one by one
in a round robin manner. When an ISB receives a token carrying available links, the
ISB can reserve as many links as possible to the identified OSB.
Fig 4.2 depicts the operations of the RR+POLR algorithm in a cell slot. Here, we
give an example in which the RR+POLR algorithm is performed in the 1st Cell_Slot.
In the 1st Rsv_Slot of every cell slot, according to the one-to-one mapping,
an ISB has the highest priority to reserve as many links as possible to its
Master-OSB. Link reservation in an ISB is determined by the queue occupancy of the
related GVOQ :

Table 4.1 RR+POLR Algorithm

Figure 4.2 Round Robin Prioritized Output Link Reservation (RR+POLR)
algorithm performed in the 1st Cell_Slot.

After reserving links to its Master-OSB, every ISB issues a token for its
own Master-OSB, and fills in the "Num_Lk_Idle" field to record how many links to
its Master-OSB are still available. Then, every ISB passes its newly created token
to the down-link neighboring ISB. At this point, K tokens have been generated and
start circulating on the token ring. A token passes one ISB in every Rsv_Slot.

In the n th (n > 1) Rsv_Slot of the same cell slot, every ISB will receive
a token from its up-link neighbor. If the received token carries available links, an
ISB checks the queue occupancy of the related GVOQ and reserves as many links as
possible to the identified OSB. The total links reserved in an ISB should not exceed

the token will be reduced by the number of links occupied by the ISB. At the end of
the n th Rsv_Slot, every ISB will hand over its received token to next ISB.

When a cell slot is due, every ISB delivers cells to the central switch fabric
based on its current link reservation vector. Meanwhile, all tokens existing in the
current cell slot are destroyed by the ISBs. Note that the link reservation rate
(i.e. Rsv_Slot) is independent of the cell scheduling cycle (i.e. Cell_Slot);
usually Rsv_Slot << Cell_Slot. If K*Rsv_Slot ≤ Cell_Slot, a token can complete a
full ring in a cell slot. Otherwise, a token only passes through some of the ISBs
in a cell slot.
When a new cell slot starts, each ISB resets its link reservation vector.
According to the one-to-one mapping in the new cell slot, an ISB resumes link
reservation with the highest priority for its new Master-OSB. Round Robin mapping
ensures that an ISB treats every OSB as its Master-OSB once in every K cell slots.
Fairness in the RR+POLR algorithm is thus guaranteed.
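
The reservation procedure of one cell slot can be sketched as follows. This is
an illustration under stated assumptions, not a transcription of Table 4.1: the
Master-OSB of ISB i in cell slot t is taken to be (i + t) mod K, tokens move
from ISB i to ISB i + 1, and an ISB reserves min(queued cells, idle links on
the token, its own unreserved input links). The function name and the dict
layout of a token are hypothetical.

```python
def rr_polr_cell_slot(q, M, t, R):
    """One cell slot of an RR+POLR-style reservation.

    q[i][j] is the number of cells waiting in ISB i for OSB j.
    Returns lk_rsv, where lk_rsv[i][j] is the links reserved by ISB i toward OSB j.
    """
    K = len(q)
    lk_rsv = [[0] * K for _ in range(K)]    # vectors are reset every cell slot

    def reserve(i, j, idle):
        # Assumed rule: bounded by queued cells, idle links, and the ISB's M links.
        n = min(q[i][j], idle, M - sum(lk_rsv[i]))
        lk_rsv[i][j] += n
        return idle - n

    # 1st Rsv_Slot: every ISB serves its Master-OSB, then issues a token for it.
    tokens = []
    for i in range(K):
        master = (i + t) % K                # assumed round-robin mapping
        tokens.append({"OSB_ID": master, "Num_Lk_Idle": reserve(i, master, M)})

    # Later Rsv_Slots: each token moves to the down-link neighbour (i -> i + 1).
    for _ in range(min(R, K) - 1):
        tokens = [tokens[(i - 1) % K] for i in range(K)]
        for i, tok in enumerate(tokens):
            tok["Num_Lk_Idle"] = reserve(i, tok["OSB_ID"], tok["Num_Lk_Idle"])
    return lk_rsv


q = [[1, 2, 0], [0, 0, 3], [0, 1, 2]]       # K = 3 ISBs, hypothetical occupancies
lk_rsv = rr_polr_cell_slot(q, M=2, t=0, R=3)
```

For this input ISB 0 first reserves one link to its Master-OSB 0 and later
picks up one idle link to OSB 1 from a passing token, while ISB 2 fills both
links to OSB 2, so ISB 1 finds no idle link left for its cells to OSB 2.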
4.3.3 Remarks
4.3.3.1 Switch II vs. Switch I : In Switch II, each ISB needs to reset its link
reservation vector in every cell slot. The link reservation rate is determined by
Rsv_Slot, which is independent of the cell delivery rate represented by Cell_Slot.
We use an integer R to represent the ratio of Cell_Slot to Rsv_Slot, i.e. R =
Cell_Slot / Rsv_Slot. If R = 1, Switch II will be the same as Switch I, because an
ISB just makes link reservation for its mapped OSB but has no opportunity to
reserve links to other OSBs. If R > 1, Switch II is superior to Switch I because a
token can go through R ISBs, so that an ISB can reserve links to several OSBs in a
cell slot. If R ≥ K, a token can complete a full ring during a cell slot; hence, an
ISB is able to make link reservation for every OSB. Obviously, Switch II achieves
better performance than Switch I if R > 1.

4.3.3.2 Complexity of RR+POLR Algorithm : The RR+POLR algorithm is a
distributed link reservation algorithm in the sense that an ISB reserves links to
an OSB according to the queue occupancy of the related GVOQ. The arbitration
complexity is O(1), though an ISB may repeat the same arbitration R times for
different OSBs in a Cell_Slot.

4.3.3.3 Fairness of RR+POLR Algorithm : Because of the one-to-one RR mapping,
an ISB fairly selects each OSB as its Master-OSB once in every K cell slots. In
other words, every ISB has the same opportunity to make link reservation for the
K OSBs. Fairness is guaranteed in the RR+POLR algorithm.

4.4 Switch Performance

In this section, we investigate Switch II through a performance comparison between
Switch I, Switch II and an OQ switch. We simulate a 256x256 (N = 256) switch
consisting of 32x32 ISBs/OSBs (M = 32, K = 8) for Switch I and Switch II. For
comparison, we also simulate a 256x256 OQ switch under the same traffic conditions.
The output queued switch is assumed to have infinite output buffers. A cell
arriving at any switch input is forwarded to the related output queues in the same
cell slot. The OQ switch is work-conserving, so it yields the best performance.
VBR sources are applied to generate the input traffic, as shown in Fig 3.7. We
use the ON (active) / OFF (idle) model to describe the burst-idle process of an
input traffic stream. The back-to-back cells in an ON duration belong to the same
VC, i.e. they have the same multiple destinations. No cells arrive in an idle
period. The traffic parameters are : MBS, the maximum burst size; LCR, the line
cell rate; and ACR, the average cell rate. The effective input load is defined as
p = (ACR * F)/LCR, p < 1, where F is the average fanout.
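
As a small worked example of the load definition, the sketch below evaluates
p = (ACR * F)/LCR and derives the mean OFF length an ON/OFF source would need
to average a given ACR. It assumes that the source emits back-to-back cells at
the line rate while ON, so that ACR/LCR equals the fraction of time spent ON;
the numeric values and function names are hypothetical.

```python
def effective_load(acr, lcr, fanout):
    """Effective input load p = (ACR * F) / LCR."""
    return acr * fanout / lcr


def mean_off_length(mean_on, acr, lcr):
    """Mean OFF length (in cell times) so that an ON/OFF source averages ACR.

    Assumes cells are sent at the line rate during ON periods, i.e.
    ACR / LCR = mean_on / (mean_on + mean_off).
    """
    activity = acr / lcr
    return mean_on * (1.0 - activity) / activity


# Hypothetical numbers: normalised line rate, average fanout of 3.
p = effective_load(acr=0.33, lcr=1.0, fanout=3)
off = mean_off_length(mean_on=20, acr=0.33, lcr=1.0)
```

With an average fanout of 3, an average cell rate of about one third of the
line rate is already enough to load the switch outputs to roughly 99%.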

If R > 1, Switch II will outperform Switch I because, in a cell slot, an ISB can
reserve links to multiple OSBs, so that link starvation of OSBs is relieved. Since
we have shown in Chapter 3 that Switch I achieves good performance under uniform
traffic, Switch II will also achieve good performance under uniform traffic.
Hence, we do not evaluate Switch II under uniform traffic; instead, we mainly
examine Switch II under non-uniform traffic to show how it differs from and
resembles Switch I.
Three circumstances are likely to produce non-uniform traffic. (1) If the
maximum burst size MBS is very large, then cells in an ON period keep
targeting the same multiple destinations for a relatively long time. Cell
destinations of an ISB are then not uniformly distributed among the N switch
outputs over a time period. (2) If bursts are correlated with each other, cells in
successive ON bursts have the same destination outputs. Even though MBS is small,
cells accumulated over several bursts make the traffic non-uniformly distributed
among the N output ports. (3) In an extreme case, cells arriving at an ISB only go
to a specific OSB. This is the so-called "1 ISB → 1 OSB HotSpot Traffic". In our
simulation, we use this traffic for a comparison study because this traffic model
clearly exposes the strengths and weaknesses of different switches.
Table 4.2 illustrates switch performance under unicast "1 ISB → 1 OSB
HotSpot Traffic". The average hot spot burst length is 5 * MBS = 100 successive
cells. It is observed that Switch I suffers a dramatic performance degradation, while

Table 4.2 Switch II : performance comparison under non-uniform unicast traffic with
different input load p. The observed performance statistics are : (1) throughput; (2)
average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average
cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB
(S_OSB, (Min, Max)).

Throughput performance is compared in Fig 4.3. Switch I with GVOQ RR
obtains a very low throughput of about 13% when the input load is 99%. The reason
is that each ISB only holds cells going to one specific OSB, so that an ISB only
has cells to deliver to its mapped OSB in 1 out of every 8 successive cell slots.
Link resources are wasted by the one-to-one RR mapping scheduling algorithm.
However, Switch II using RR+POLR provides a mechanism for ISBs to complement each
other in reserving links and forwarding cells to OSBs that are not fully loaded.
Starvation of OSBs is relieved according to the ratio R. For example, if R = 4,
which implies that an ISB is allowed to reserve links to 4 OSBs in a cell slot,
switch throughput reaches 52%; if R = 8, i.e. an ISB has the opportunity to reserve
links to every OSB, Switch II achieves throughput similar to that of the OQ switch.
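
The dependence of hotspot throughput on R can be estimated with a
back-of-the-envelope calculation. The sketch below is not the simulation behind
Fig 4.3: it ignores queueing transients, assumes that the token for OSB i is
issued in cell slot t by ISB (i - t) mod K and advances one ISB per Rsv_Slot,
and counts the cell slots in which that token reaches ISB i (the only ISB
holding cells for OSB i) within R Rsv_Slots. The function name is hypothetical.

```python
def hotspot_throughput_bound(K, R, load):
    """Estimated throughput under unicast '1 ISB -> 1 OSB' hotspot traffic."""
    served = 0
    for t in range(K):                      # one full round-robin period
        for i in range(K):
            issuer = (i - t) % K            # ISB whose Master-OSB is OSB i in slot t
            hops = (i - issuer) % K         # Rsv_Slots before ISB i holds Token_i
            if hops < min(R, K):
                served += 1                 # ISB i can fill the M links to OSB i
    return min(load, served / (K * K))


bounds = {R: hotspot_throughput_bound(8, R, 0.99) for R in (1, 4, 8)}
```

For K = 8 and 99% load the estimate is 12.5% for R = 1 (Switch I), 50% for
R = 4 and 99% for R = 8, close to the 13%, 52% and OQ-like throughput quoted
above.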

Figure 4.3 Switch II : throughput under "1 ISB → 1 OSB hotspot unicast traffic".

Fig 4.4 evaluates the average end-to-end cell delay (D_E-to-E), which is defined
as the latency for a cell to pass through the switch. D_E-to-E is measured in
cell slots. When R = 8, Switch II yields D_E-to-E very similar to that of the OQ
switch. Compared to the lower bound of D_E-to-E obtained in the OQ switch,
Switch II incurs no more than 8 cell slots of additional delay. However, Switch I
and Switch II with R = 4 sustain much longer cell delay, which is due to their
lower throughput. Since more and more cells are backlogged in ISBs, cell
delay keeps increasing in both switches.

Figure 4.4 Switch II : average end-to-end cell delay (D_E-to-E) under "1 ISB → 1
OSB hotspot unicast traffic".

The end-to-end cell delay consists of two parts : queueing delay in ISBs
(D_ISB), and queueing delay in OSBs (D_OSB). Fig 4.5 shows the average queueing
delay in ISBs (i.e. D_ISB). When R = 8, Switch II causes at most 1 cell slot of
delay in D_ISB. This indicates the potential of Switch II to forward cells as fast
as the OQ switch. In this circumstance, since most arriving cells are transmitted
to OSBs immediately, OSBs in Switch II can employ any existing scheduling strategy
to provide QoS guarantees as the OQ switch does. However, it is worth noting that
Switch II is sensitive to the ratio R. As shown in Fig 4.5, if R = 4, Switch II
suffers much longer delay in ISBs, especially under heavy input load. Switch I
performs even worse because D_E-to-E is mainly incurred by the latency in ISBs
(i.e. D_E-to-E ≈ D_ISB). In this case, Switch I behaves more like an IQ switch.

Figure 4.5 Switch II : average cell delay in ISB (D_ISB) under "1 ISB → 1 OSB
hotspot unicast traffic".

Figure 4.6 Switch II : average size of OSB (S_OSB) under "1 ISB → 1 OSB hotspot
unicast traffic".

In addition, we measure the average size of OSBs (i.e. S_OSB) in Fig 4.6. S_OSB
reflects the occupancy of OSBs in terms of the number of accommodated cells.
Switch II with a faster link reservation rate (i.e. a larger ratio R) is able to
forward cells to OSBs quickly, so it holds more cells in OSBs.
Moreover, Table 4.3 investigates switch performance under multicast "1 ISB → 1
OSB HotSpot Traffic". Fig 4.7 to Fig 4.10 depict the individual performance
measures. Generally, our observations are similar to those discussed for unicast
traffic. In addition, under multicast traffic, switches benefit from GVOQs,
so that multicast cells can be forwarded from ISBs to OSBs. Therefore, under the
same input load p, switches achieve better performance than under unicast traffic.
For example, Switch II with R = 4 obtains performance comparable to the OQ switch
under multicast traffic, which is not the case under unicast traffic.

Table 4.3 Switch II : performance comparison under non-uniform multicast traffic
with different input load p. The observed performance statistics are : (1) throughput;
(2) average end-to-end cell delay and delay jitter (D_E-to-E, (Min, Max)); (3) average
cell delay in ISB and delay jitter (D_ISB, (Min, Max)); (4) average occupancy of OSB
(S_OSB, (Min, Max)).

Figure 4.7 Switch II : throughput under "1 ISB → 1 OSB hotspot multicast traffic".

Figure 4.8 Switch II : average end-to-end cell delay (D_E-to-E) under "1 ISB → 1
OSB hotspot multicast traffic".

4.5 Conclusion
4.5.1 Advantages of Switch II
Switch II inherits the modular switch architecture of Switch I. It benefits from
input and output link sharing; hence, no speedup is necessary in the central switch
fabric.
To resolve input and output contention, we propose a Round Robin Prioritized
Output Link Reservation (RR+POLR) algorithm. Cell delivery is determined by link
reservation in every ISB. The RR+POLR algorithm is a distributed resource
allocation algorithm, in the sense that an ISB makes link reservation for an OSB in
a Rsv_Slot according to the queue occupancy of the related GVOQ. The arbitration
complexity is O(1).
Switch II uses RR+POLR to alleviate starvation of OSBs, so that it achieves
improved performance, especially under non-uniform traffic. If R ≥ K, i.e. a
token can complete a full ring in a cell slot, RR+POLR may achieve performance
comparable to the OQ switch.

Figure 4.9 Switch II : average cell delay in ISB (D_ISB) under "1 ISB → 1 OSB
hotspot multicast traffic".

Figure 4.10 Switch II : average size of OSB (S_OSB) under "1 ISB → 1 OSB hotspot
multicast traffic".

4.5.2 Disadvantages of Switch II

The RR+POLR algorithm falls short of an efficient link resource allocation
among ISBs, mainly for the following two reasons.
First, it may not be efficient for an ISB to reset its link reservation vector in
every cell slot, because the traffic pattern injected into an ISB usually does not
change dramatically from one cell slot to the next. In addition, the performance of
Switch II depends mainly on the link reservation rate, i.e. the ratio between
Rsv_Slot and Cell_Slot. If Rsv_Slot = Cell_Slot, Switch II is exactly the same as
Switch I because an ISB only has the opportunity to reserve links to its Master-OSB
in a cell slot.


Second, prioritized link reservation may prevent an ISB from reserving links for
a starved OSB if the ISB has already reserved M links to other OSBs. Fig 4.11
shows an example of this scenario. In the 1st Rsv_Slot, according to the one-to-one
mapping, ISB 0 and ISB 1 respectively reserve links to their Master-OSBs. After
that, no idle link is left for OSB 0, but 2 links to OSB 1 are still available since
ISB 1 does not have cells destined to OSB 1. However, when receiving Token_1 in the
2nd Rsv_Slot, ISB 0 is not able to reserve any more links (refer to * in Fig 4.11).
Eventually, OSB 1 is not served with any cells even though there are cells in ISBs
that want to go to OSB 1 (refer to # in Fig 4.11). This causes throughput
degradation. The problem is due to prioritized link reservation without knowledge
of the traffic load of other OSBs.
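
The scenario can be traced step by step in code. The walk-through below uses
the 4x4 switch of the example (M = 2, K = 2); the queue occupancies are assumed
values chosen to match the description, since the figure itself is not
reproduced here, and the reservation rule (bounded by queued cells, idle links
and the ISB's own M input links) is likewise an assumption.

```python
def reserve(q, rsv, isb, osb, idle, M):
    """Reserve links for (isb, osb); return the idle links left on the token."""
    n = min(q[isb][osb], idle, M - sum(rsv[isb]))
    rsv[isb][osb] += n
    return idle - n


M = 2
q = [[2, 1],     # ISB 0: two cells for OSB 0, one cell for OSB 1 (assumed)
     [1, 0]]     # ISB 1: one cell for OSB 0, none for OSB 1 (assumed)
rsv = [[0, 0], [0, 0]]

# 1st Rsv_Slot: each ISB serves its Master-OSB (ISB 0 -> OSB 0, ISB 1 -> OSB 1).
idle_osb0 = reserve(q, rsv, 0, 0, M, M)     # ISB 0 takes both links to OSB 0
idle_osb1 = reserve(q, rsv, 1, 1, M, M)     # ISB 1 has nothing for OSB 1

# 2nd Rsv_Slot: the two tokens swap ISBs.
idle_osb1 = reserve(q, rsv, 0, 1, idle_osb1, M)   # ISB 0 has no input link left
idle_osb0 = reserve(q, rsv, 1, 0, idle_osb0, M)   # no idle link left to OSB 0

osb1_starved = q[0][1] > 0 and rsv[0][1] + rsv[1][1] == 0
```

At the end both links to OSB 1 are still idle although ISB 0 holds a cell for
OSB 1, which is the starvation that the example illustrates.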

Figure 4.11 RR+POLR causes starvation of OSBs (refer to #) in an example 4x4
switch, 2x2 ISBs/OSBs (N=4,M=2,K=2).

To resolve this problem, in the next chapter we present an enhanced switch
architecture using dual round robin dynamic link reservation to achieve a dynamic,
fast and fair link resource allocation among ISBs.

CHAPTER 5
SWITCH III: A SCALABLE TERABIT MULTICAST PACKET
SWITCH WITH DUAL ROUND ROBIN DYNAMIC LINK
RESERVATION
In this chapter, we propose Switch III as an enhanced switch design using link
sharing and dual round robin dynamic link reservation. Unlike the previous two
switches, ISBs are connected by dual rings on which K link request tokens (REQs)
and K link release tokens (RELs) circulate in a round robin manner. Cell delivery
is based on link reservation in every ISB. However, without resetting its link
reservation vector in every cell slot, each ISB can dynamically increase/decrease
its link reservation for a specific OSB by "borrowing" or "lending" links from/to
other ISBs. We propose two Queue Occupancy Based Dynamic Link Reservation (QOBDLR)
algorithms to achieve a fast and fair link resource allocation among ISBs. QOBDLR
is a distributed link reservation scheme in the sense that every ISB uses its
locally available information to arbitrate a modification of its own link
reservation. The arbitration complexity is O(1). Performance evaluation shows that
Switch III achieves performance comparable to OQ switches under any traffic
pattern. Moreover, Switch III avoids the speedup problem of OQ switches. Hence,
Switch III would be a good choice for high performance, scalable, large-capacity
core switches.

5.1 Switch Architecture
Fig 5.1 shows the architecture of Switch III, which consists of K ISBs, ATMCSF,
K OSBs, and dual round robin rings. The functions of ISB, OSB and ATMCSF are the
same as those presented for Switch I and Switch II.
ISBs are connected by dual rings : a downward ring conveys link request tokens
(REQs), and an upward ring carries link release tokens (RELs). At any time, there
are K REQ tokens and K REL tokens circulating on the dual rings respectively and
passing ISBs one by one in a round robin manner. Each OSB (e.g. the i-th OSB) is
associated with a REQ token (e.g. REQ_i) and a REL token (e.g. REL_i).
As shown in Fig 5.1, the REQ token and the REL token have the same format
containing two fields : (1) "OSB_ID" is the identification of an OSB; (2) "REQ_NUM"
indicates how many link requests have been issued for the identified OSB, or
"REL_NUM" records the number of released links that are available to be reserved at
the related ATMCSF-OSB interface.

Figure 5.1 Switch III : an NxN switch consists of K ISBs, K OSBs, ATMCSF, and
dual round robin rings; K = N/M. Cell delivery is based on link reservation.
Dual round robin rings provide a mechanism for ISBs to dynamically "borrow"
and/or "lend" links from each other.

In a cell slot, each ISB delivers cells to the central switch fabric according to
its link reservation. For example, as shown in Fig 5.2, if LK_RSV_i is
[4, 0, ..., 2] in the current cell slot, the i-th ISB will send four cells to OSB 0
and two cells to OSB (K-1), but no cells are scheduled to other OSBs.
However, unlike Switch II, ISBs in Switch III do not need to reset their link
reservation vectors in every cell slot. According to its queue occupancy vector, an
ISB can dynamically modify its link reservation for a specific OSB when the ISB
receives the related REQ token or REL token. We propose two link reservation
algorithms, both based on a queue occupancy based dynamic link reservation
(QOBDLR) scheme.


Figure 5.2 Cell delivery is based on link reservation.
5.2.2 Link Reservation

Link reservation in ISBs needs to resolve two contentions : (1) K GVOQs in an
ISB contend for M links at the ISB-ATMCSF interface. (2) K ISBs contend for M
links at every ATMCSF-OSB interface. To achieve a fast and fair link resource
allocation among ISBs, we propose two algorithms :

• REQ-QOBDLR algorithm : Request-Motivated Queue Occupancy Based
Dynamic Link Reservation algorithm.

• REQREL-QOBDLR algorithm: Request/Release-Motivated Queue Occupancy
Based Dynamic Link Reservation algorithm.
To present the two link reservation algorithms, we provide the following definitions.
Definition 1 : Link Reservation Rule.
Link reservation among K ISBs must satisfy two criteria :


not exceed M which is the maximum number of links at the ATMCSF-OSB interface.
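
Definition 1 can be checked mechanically. The sketch below assumes that the two
criteria are (1) the links reserved by one ISB over all OSBs do not exceed the
M links at its ISB-ATMCSF interface, and (2) the links reserved by all ISBs for
one OSB do not exceed the M links at that ATMCSF-OSB interface. Criterion (1)
is inferred from the two contentions listed above; the matrix layout r[j][i]
and the function name are illustrative.

```python
def satisfies_link_reservation_rule(r, M):
    """Check a reservation matrix, where r[j][i] = links ISB i reserves to OSB j."""
    K = len(r)
    # (1) Assumed criterion: each ISB uses at most its M input shared links.
    per_isb_ok = all(sum(r[j][i] for j in range(K)) <= M for i in range(K))
    # (2) Each OSB receives at most M links at its ATMCSF-OSB interface.
    per_osb_ok = all(sum(r[j]) <= M for j in range(K))
    return per_isb_ok and per_osb_ok


balanced = satisfies_link_reservation_rule([[1, 1], [1, 1]], M=2)
osb_overbooked = satisfies_link_reservation_rule([[2, 1], [0, 0]], M=2)
isb_overbooked = satisfies_link_reservation_rule([[2, 0], [1, 0]], M=2)
```

The first matrix is valid, the second reserves three links toward OSB 0, and
the third lets ISB 0 reserve three links in total, so only the first passes.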
Definition 2 : Link Reservation Slot, i.e. Rsv_Slot.
As shown in Fig 5.3, Rsv_Slot is defined as a small time interval during which
an ISB receives a pair of REQ and REL tokens. In a Rsv_Slot, an ISB has the
authority to modify its link reservation for the two OSBs identified by the
received REQ and REL tokens.
Rsv_Slot is independent of Cell_Slot; usually, Rsv_Slot << Cell_Slot. We use an
integer R to represent the ratio of Cell_Slot to Rsv_Slot, i.e. R = Cell_Slot /
Rsv_Slot. If R = 1, link reservation is performed at the slowest rate because a
token only goes through one ISB in a cell slot; if R ≥ K, a token can circulate a
complete ring in a cell slot. Dynamic link reservation is performed in a Rsv_Slot.
When a cell slot is due, every ISB delivers cells to ATMCSF according to its
current link reservation vector.

Figure 5.3 Rsv_Slot (link reservation slot) vs. Cell_Slot (cell delivery slot) in an
example switch consisting of 3 ISBs and 3 OSBs.

5.3 REQ-QOBDLR Algorithm

In this section, we present the Request-Motivated Queue Occupancy Based Dynamic
Link Reservation (REQ-QOBDLR) algorithm. In the common model shown in Fig 5.4,
the i-th ISB (0 ≤ i < K) is receiving the REQ_j token and the REL_n token in the
current Rsv_Slot; usually REQ_j and REL_n identify two different OSBs (i.e. j ≠ n).
The i-th ISB will only modify its link reservation, i.e. r_ji and r_ni, for the
j-th OSB and the n-th OSB in the current Rsv_Slot.

Figure 5.4 REQ-QOBDLR algorithm which is performed in every Rsv_Slot.

The REQ-QOBDLR algorithm is performed in every Rsv_Slot. For the received
REQ_j token, the i-th ISB refers to its queue occupancy to decide an intended
modification of r_ji. For the received REL_n token, the i-th ISB simply takes an
extra link if it had requested one. As increasing/decreasing the link reservation
for a specific OSB can only be triggered and accomplished as a result of an
explicit link request, this algorithm is called the REQ-QOBDLR algorithm.
The details of the REQ-QOBDLR algorithm are given in Appendix A. In this
section, we present the basic rules and main operations of the REQ-QOBDLR algorithm.

5.3.1 Operations upon receiving REQ_j Token

REL_NUM_n is reduced by 1 if the i-th ISB reserves an available link from the
REL_n token. Moreover, before the i-th ISB forwards the REL_n token to the next
ISB, if the i-th ISB has a pending released link resulting from its operation on
the REQ_n token, the released link is inserted into the REL_n token.

In addition, when the system starts, every ISB issues a REQ token and a REL
token. There is no specific rule on how to establish the token sequence, as long as
each OSB is represented by a pair of REQ and REL tokens. After that, K REQs and
K RELs keep circulating on the dual rings in a round robin manner.
Cell delivery and link reservation are independent operations. When a Cell_Slot
is due, every ISB sends cells to ATMCSF based on its current link reservation
vector. However, an ISB is able to change its link reservation in every Rsv_Slot;
usually Rsv_Slot << Cell_Slot.
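
The token handling described in this section can be sketched as two small
handlers. This is only an illustration: the exact rules are those of Appendix A,
which is not reproduced here. The sketch assumes that an ISB requests a link
when the GVOQ occupancy exceeds a high threshold HT, releases one when the
occupancy is below a low threshold LT and a request is outstanding, and that a
released link satisfies one outstanding request. The class, function and field
names other than OSB_ID, REQ_NUM and REL_NUM are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Isb:
    M: int                                   # links at the ISB-ATMCSF interface
    q: list                                  # q[j]: occupancy of the GVOQ for OSB j
    r: list                                  # r[j]: links reserved toward OSB j
    requested: set = field(default_factory=set)          # OSBs with a pending request
    pending_release: dict = field(default_factory=dict)  # released, not yet on a REL


def on_req(isb, tok, HT, LT):
    """Assumed rule for a received REQ token: request or release one link."""
    j = tok["OSB_ID"]
    if isb.q[j] > HT and j not in isb.requested and sum(isb.r) < isb.M:
        tok["REQ_NUM"] += 1                  # ask for one extra link to OSB j
        isb.requested.add(j)
    elif isb.q[j] < LT and isb.r[j] > 0 and tok["REQ_NUM"] > 0:
        isb.r[j] -= 1                        # give up one link for a requester
        tok["REQ_NUM"] -= 1
        isb.pending_release[j] = isb.pending_release.get(j, 0) + 1


def on_rel(isb, tok):
    """Take a released link if one was requested, then add pending releases."""
    n = tok["OSB_ID"]
    if n in isb.requested and tok["REL_NUM"] > 0 and sum(isb.r) < isb.M:
        isb.r[n] += 1
        tok["REL_NUM"] -= 1                  # REL_NUM is reduced by 1 per link taken
        isb.requested.discard(n)
    tok["REL_NUM"] += isb.pending_release.pop(n, 0)


# Two ISBs (K = 2, M = 2): ISB a is backlogged for OSB 0, ISB b is idle for OSB 0.
a = Isb(M=2, q=[5, 0], r=[1, 0])
b = Isb(M=2, q=[0, 4], r=[1, 1])
req0 = {"OSB_ID": 0, "REQ_NUM": 0}
rel0 = {"OSB_ID": 0, "REL_NUM": 0}
on_req(a, req0, HT=3, LT=1)                  # a issues a link request for OSB 0
on_req(b, req0, HT=3, LT=1)                  # b releases its link to OSB 0
on_rel(b, rel0)                              # the released link enters REL_0
on_rel(a, rel0)                              # a takes the released link
```

After the four steps ISB a holds both links to OSB 0 and ISB b holds none, so
the link has moved to the ISB that needs it without any central arbiter.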

5.3.4 Conclusion on REQ-QOBDLR Algorithm

Dynamic link reservation in REQ-QOBDLR algorithm is triggered by issuing link
requests. When receiving REQ j token, the i th ISB can ask for an extra link to the

link if it had issued a link request and has been waiting for an available link. The
advantage of REQ-QOBDLR algorithm is to ensure that requesting a link and/or
releasing a link happens when necessary.
However, the REQ-QOBDLR algorithm may have a potential problem, especially
as the switch grows and K becomes large. The reason is that the i-th ISB does not
measure

even though it did not issue a link request for the n-th OSB before. Or, the i-th
ISB currently does not need the extra link even though it has sent out a link
request before. To match the real traffic, it would be more effective if the i-th
ISB evaluates

to propose another competitive algorithm called REQREL-QOBDLR algorithm in
next section.

5.4 REQREL-QOBDLR Algorithm

In this section, we present the Request/Release-Motivated Queue Occupancy Based
Dynamic Link Reservation (REQREL-QOBDLR) algorithm. As shown in Fig 5.5,
the i-th ISB is receiving the REQ_j and REL_n tokens in the current Rsv_Slot. For
the received REQ_j token, the i-th ISB performs the same operation as in REQ-QOBDLR

Figure 5.5 REQREL-QOBDLR algorithm which is performed in every Rsv_Slot.
Assume that the i-th ISB is receiving the REQ_j and REL_n tokens in the current
Rsv_Slot.

The details of the REQREL-QOBDLR algorithm are given in Appendix B. In
this section, we present the basic idea of the REQREL-QOBDLR algorithm.
5.4.1 Operations upon receiving REL_n Token

ISBs. Bearing this in mind, when the i-th ISB releases a link for the n-th OSB due
to q_ni < LT, there are actually (REQ_NUM_n - 1) link requests (see footnote 2) or
(REQ_NUM_n - 2) link requests (see footnote 3) expecting available links for the
n-th OSB. Hence, the i-th ISB should decrease REQ_NUM_n by either 1 or 2. However,
the i-th ISB does not hold the REQ_n token in the current Rsv_Slot. The i-th ISB
has to record this pending reduction of link requests and wait until it receives
the REQ_n token to modify REQ_NUM_n.
5.4.2 Operations upon receiving REQ j Token

The operations for the received REQ_j token are very similar to those in the
REQ-QOBDLR algorithm. However, due to its operations on the REL_j token received
several Rsv_Slots earlier, the i-th ISB may first need to update REQ_NUM_j with the
pending increment/decrement of link requests for the j-th OSB, so that REQ_NUM_j
reflects the real number of link requests issued for the j-th OSB. After that, the
i-th ISB follows the same operations as presented for the REQ-QOBDLR algorithm.
5.4.3 Remarks on REQREL-QOBDLR Algorithm

In a Ring_Cycle (i.e. K * Rsv_Slot), an ISB has two opportunities to evaluate
its traffic load and to modify its link reservation for a specific OSB. Compared
with the REQ-QOBDLR algorithm, the REQREL-QOBDLR algorithm is able to adjust the
link resource allocation more quickly to adapt to the input traffic.
However, the efficiency of the REQREL-QOBDLR algorithm is subject to the values of
HT and LT. For example, if LT is given a large value but the traffic load is not
heavy, then every ISB will reduce its link reservation even though each ISB may
have enough

by any ISB so that link resources are wasted. It will cause performance degradation.
How to select HT and LT is addressed in the next section.

² If the i-th ISB has not sent a link request for the n-th OSB, then
(REQ_NUM_n - 1) link requests are demanding available links after the i-th ISB
releases a link.
³ If the i-th ISB has issued a link request for the n-th OSB, then
(REQ_NUM_n - 2) link requests are demanding available links after the i-th ISB
releases a link.

5.5 Analysis of QOBDLR Algorithms
5.5.1 Algorithm Complexity

Both REQ-QOBDLR algorithm and REQREL-QOBDLR algorithm are distributed
link reservation schemes. In every Rsv_Slot, an ISB modifies its link reservation for
only two OSBs which are identified by the received pair of REQ token and REL
token. Arbitration on "borrowing" and/or "lending" a link to a specific OSB is
based on the queue occupancy of two related GVOQs. Since an ISB modifies its link
reservation according to its local available information, arbitration does not need to
undergo multiple iterations and complexity is only O(1).
5.5.2 The choice of HT and LT

In QOBDLR algorithms, the high threshold HT and the low threshold LT are
predefined system parameters that remain constant after their initialization. In every
Rsv_Slot, each ISB compares the queue occupancy of a GVOQ with HT and LT
to decide whether to increase or decrease its link reservation for the related OSB.
Selecting appropriate values of HT and LT is very important for QOBDLR algorithms
to achieve a fair and fast link resource allocation among ISBs.
Notice that ISBs face two contentions when making link reservations for
the targeted OSBs : (1) K GVOQs in the same ISB contend for M links at the
ISB-ATMCSF interface; (2) K ISBs contend for M links at every ATMCSF-OSB
interface.

Figure 5.6 An ideal traffic scenario : the aggregated input load to the i th ISB
(0 ≤ i < K) uniformly targets K OSBs. Every ISB has the same traffic pattern.

As shown in Fig 5.6, if the aggregated input traffic to the i th ISB (0 ≤ i < K)
uniformly targets K OSBs, then the queue occupancy of every GVOQ in the i th ISB
will be the same. Hence, we have

In the above ideal case, ISBs do not need to borrow/lend links from each other
because the traffic pattern in every ISB is exactly the same. The M links at every
ATMCSF-OSB interface will be evenly allocated to every ISB:


However, in real life, the input load to an ISB changes dynamically and differs
from one ISB to another. In order to be fair to every ISB, the criteria for choosing HT and
LT should be :

If the queue occupancy of a GVOQ in an ISB is larger than HT, the ISB can
ask for an extra link to the related OSB because its traffic load to the OSB is heavier
than the normal load qavg. If the queue occupancy of a GVOQ in an ISB is less than
LT, the ISB will release a link if other ISBs have link requests. Hence, QOBDLR
algorithms provide a fair resource allocation among ISBs.
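The threshold rule in the preceding paragraph can be sketched as a small decision function. This is an illustrative sketch only: the name `link_action`, its return labels, and the `pending_requests` argument (standing in for the number of link requests other ISBs have outstanding) are assumptions made for this example, not names from the dissertation. The defaults HT = 4 and LT = 2 are the values chosen in this section.

```python
def link_action(q: int, ht: int = 4, lt: int = 2, pending_requests: int = 0) -> str:
    """Classify what an ISB does for one GVOQ, given its queue occupancy q.

    Assumed reading of the text: request when q > HT, release when q < LT
    and some other ISB is asking for a link, otherwise leave things alone.
    """
    if q > ht:
        return "request"   # load to this OSB is heavier than normal
    if q < lt and pending_requests > 0:
        return "release"   # light load, and another ISB wants a link
    return "hold"          # between thresholds, or nobody is asking


if __name__ == "__main__":
    for q in (0, 1, 3, 4, 5, 9):
        print(q, link_action(q, pending_requests=1))
```

Keeping a gap between LT and HT gives the rule a hold band, so an ISB whose occupancy hovers near the normal load neither requests nor releases.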
The values of HT and LT may have multiple choices. Table 5.1 shows an
example which we apply to determine the values of HT and LT for Switch III when it is
a 256x256 switch constructed from 8 ISBs and 8 OSBs, i.e. N = 256, K = 8, m = M
= 32. Table 5.1 lists the possible choices of HT and LT for different input loads
p. HT and LT should be suitable for most of the possible traffic loads. Since
the input traffic loaded to a switch input is usually more than 50% (i.e. p ≥ 0.5),
we choose LT = 2. When the traffic load is less than 50% (i.e. p < 0.5), the
M links at an ATMCSF-OSB interface are most likely sufficient to support
cell delivery of all ISBs. For HT, the set of values that satisfies
all possible traffic loads is HT ∈ [4, ∞). If HT is very large, a link request will
be activated much more slowly because the queue occupancy of a GVOQ cannot easily
exceed HT, and an ISB may not be able to increase its link reservation in time
to adapt to increasing traffic. With this concern, we select HT = 4, above which
an ISB starts requesting an additional link. Hence, link reservation can adapt to the
traffic quickly.
Table 5.1 The possible choices of HT and LT for a 256x256 switch consisting of 8
ISBs and 8 OSBs, i.e. N = 256, K = 8, m = M = 32. We select HT = 4 and LT
= 2.

It is worth mentioning that a theoretical study of the optimal choice of HT
and LT, rather than the upper/lower bounds of the two thresholds, may be needed
for different input traffic patterns. This is left as future work.
5.5.3 Scalability of QOBDLR Algorithms
To complement the modular switch architecture and achieve good performance,
QOBDLR algorithms should be scalable as well. The scalability of QOBDLR
algorithms can be investigated from two aspects : simplicity and efficiency.
In section 5.5.1, we discussed that QOBDLR algorithms sustain a very low
arbitration complexity of O(1). Hence, QOBDLR algorithms can accommodate
switch growth without increasing arbitration complexity.


On the other hand, QOBDLR algorithms should be efficient enough to provide a fast
and fair link resource allocation. One of the main factors for judging the efficiency of
link reservation is Dgrant, the average latency for a link request to be granted
by a released link. For example, if a link request issued by an ISB has to travel the
whole ring to find an available link at the farthest ISB, then the ISB will suffer a long
delay before obtaining its desired link, which degrades the efficiency of QOBDLR algorithms.
5.5.3.1 Dgrant in REQ-QOBDLR Algorithm : In REQ-QOBDLR algorithm,
an ISB may issue a link request for a specific OSB upon receiving the related REQ
token. Fig 5.7 depicts an example in which ISB 0 generates a new link request
for the n th OSB (0 ≤ n < K) when it receives the REQ n token in Rsv_Slot 0.

Figure 5.7 Dgrant in REQ-QOBDLR Algorithm

We assume that in a Ring_Cycle (i.e. K * Rsv_Slot), there is at least one ISB
on the ring that can grant a link for the link request of ISB 0. Otherwise, ISB 0
cannot obtain its desired link because the other ISBs do not have
available links to release. If it is the j th ISB that eventually releases a link to
satisfy the link request of ISB 0, the latency for the link request to be granted by
an available link is j Rsv_Slot(s). Statistically, we have the average latency Dgrant in
terms of the number of Rsv_Slot(s) as follows :

We first derive Dgrant under uniform traffic, in which the input load injected into an
ISB uniformly targets K OSBs. In this scenario, every ISB has the same probability
p (0 < p ≤ 1) to grant a link request for a certain OSB. Since a token passes ISBs
one by one and visits only one ISB in a Rsv_Slot, we have :

Eq. 5.10 explains that, under uniform traffic, Dgrant is not affected by K but
is determined by p. It implies that a large switch size (i.e. large K) will not incur an
increasing delay for a link request to meet an available link.
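The independence from K can be checked numerically with a simple geometric model. The sketch below is an assumption-laden illustration, not a reproduction of Eq. 5.10: it supposes that each visited ISB grants independently with the same probability p, that the j th ISB visited grants after j Rsv_Slots, and that a grant is known to occur within one Ring_Cycle of K slots. The function name `expected_grant_latency` is hypothetical.

```python
def expected_grant_latency(p: float, K: int) -> float:
    """Mean number of Rsv_Slots until the first grant on a K-ISB ring.

    Assumes independent grants with probability p per visited ISB and
    conditions on at least one grant happening within K slots.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if K < 1:
        raise ValueError("K must be at least 1")
    # Probability that the first grant happens exactly at the j-th visited ISB.
    weights = [p * (1.0 - p) ** (j - 1) for j in range(1, K + 1)]
    total = sum(weights)  # probability of a grant within the ring cycle
    return sum(j * w for j, w in enumerate(weights, start=1)) / total


if __name__ == "__main__":
    for K in (8, 16, 64):
        print(K, round(expected_grant_latency(0.5, K), 6))
```

Under these assumptions the value tends to 1/p as K grows, and the correction for finite K is already small for K = 8, which matches the qualitative statement that Dgrant is governed by p and not by the switch size.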
In Eq. 5.10, p is the probability that an ISB is able to grant a link for a link
request. If p = 1, then Dgrant is 1, i.e. a link request will be satisfied by the next
neighboring ISB. But if p < 1, then Dgrant > 1, i.e. it will take several Rsv_Slots
for a link request to encounter an available link. Since an ISB needs to check its
related queue occupancy to arbitrate whether to release a link for a link request, p
can be expressed as follows (for 0 ≤ i, n < K) :

Eq. 5.11 involves the queueing problem modeled in Fig 5.8. The input
traffic from every switch input is an ON-OFF traffic stream multiplexing several
VC connections. Moreover, the input load injected into every GVOQ is in fact an
aggregation of multiple ON-OFF streams carrying multicast cells. The outgoing
traffic rate is denoted by rni, which can be interpreted as the dynamic service rate.
The queueing model illustrated in Fig 5.8 is similar to a traffic model with
batch arrivals and batch departures. However, in our model, theoretical analysis is too
complicated to carry out because the departure process is correlated with the arrival process
and the queue length.

Figure 5.8 How to get Pr (qni > c), where c is a constant value.

The independent batch arrival/batch departure traffic model is, in general,
difficult to analyze due to the multiplexing of a typically large number of
connections and the burstiness of individual cell streams at possibly different time
scales [61]. Sohraby et al. presented several solutions based on M/G/1-Type
Markov Chains in [61] [62] [63]. However, their approaches are very computation-consuming. Moreover, those solutions are not applicable to our model, where arrival
and departure are correlated. Hence, a comprehensive
theoretical analysis is still our ongoing work.
However, we may make some approximations and intuitively interpret the meaning
behind the equation. From Eq. 5.11, we derive an upper bound of Dgrant.

and Dgrant will increase. This makes sense because the queue occupancy of a
GVOQ may not easily go below LT if LT is small. Thus, an ISB will not be able to
grant a link for a link request, which incurs a longer delay for a link request to be
satisfied.
Under non-uniform burst traffic, the situation may become more complicated. If
we use pi (0 ≤ i < K) to indicate the probability that the i th ISB is able to release
a link for a link request to the n th OSB, then we may have :

Moreover, with non-uniform traffic, we have :

hence, Dgrant will be expressed as :

Due to the difficulty of analyzing the model shown in Fig 5.8, we have not
been able to obtain a closed-form expression of Eq. 5.15. But we would like to
present the following discussion to intuitively evaluate Dgrant. First, a REQ token
passes ISBs one by one in a round robin manner, so that the nearest ISB has the
highest responsibility to release a link for a link request. Intuitively, an ISB that
has issued a link request for a specific OSB should obtain the desired link from its
down-stream neighboring ISBs in several Rsv_Cycle(s). Second, if the switch size is very
large (i.e. K is very large), the traffic pattern in each ISB may change during the period
in which a link request is seeking an available link. Therefore, an ISB that previously
did not want to grant a link may be able to release a link when the link request
token stops at its block. In our performance simulations, it rarely happens that a
link request needs to traverse a complete ring to obtain an available link.
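The intuition that nearby ISBs dominate the grant latency can be illustrated with a per-ISB version of the same geometric reasoning. This is a sketch under stated assumptions, not Eq. 5.15 itself: grants at different ISBs are treated as independent, `p_list[j-1]` is the assumed probability that the j th ISB visited by the REQ token releases a link, and the result is conditioned on a grant occurring within one trip around the ring. The function name is hypothetical.

```python
def expected_grant_latency_nonuniform(p_list):
    """Mean Rsv_Slots to the first grant when the j-th visited ISB grants
    with probability p_list[j-1] (independence between ISBs is assumed)."""
    survive = 1.0      # probability that no earlier ISB has granted
    weighted = 0.0     # running sum of j * Pr(first grant at j)
    granted = 0.0      # running sum of Pr(first grant at j)
    for j, p in enumerate(p_list, start=1):
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must be in [0, 1]")
        first_here = survive * p
        weighted += j * first_here
        granted += first_here
        survive *= 1.0 - p
    if granted == 0.0:
        raise ValueError("no ISB on the ring can grant a link")
    return weighted / granted


if __name__ == "__main__":
    # A lightly loaded near neighbour keeps the latency close to one slot.
    print(expected_grant_latency_nonuniform([0.9, 0.1, 0.1, 0.1]))
    # If only the farthest ISB can grant, the request crosses the whole ring.
    print(expected_grant_latency_nonuniform([0.0, 0.0, 0.0, 1.0]))
```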
5.5.3.2 Dgrant in REQREL-QOBDLR Algorithm : In REQREL-QOBDLR
algorithm, the link request operation and the link release operation are more independent
than those in REQ-QOBDLR algorithm. Even if no link request has been issued
by any ISB, an ISB may release a link if its queue occupancy is less than LT. On
the other hand, an ISB that has issued a link request to a certain OSB may be
able to catch an available link immediately without polling neighboring ISBs one by
one, because there may already be released links circulating on the ring. Hence,
in REQREL-QOBDLR algorithm, Dgrant is smaller than that in REQ-QOBDLR
algorithm. However, the delay variation in REQREL-QOBDLR may be larger than
that in REQ-QOBDLR algorithm, in which a link release is only stimulated by an
explicit link request.

5.6 Performance Evaluation

In this section, we evaluate the performance of Switch III and compare it with Switch
I, Switch II and the OQ switch under the same traffic scenarios.
5.6.1 Traffic Model

The switch performance is investigated under both uniform and non-uniform traffic.
As shown in Fig 5.9, cells coming from different VCs are multiplexed in bursts which
are interleaved to form the arrival traffic at every switch input. We employ
the ON(active)/OFF(idle) model to describe the burst-idle process. The back-to-back cells in an ON duration belong to the same VC so that they have the same destinations. No cells arrive in an idle period.

Figure 5.9 Traffic Model : Multicast Burst Traffic

Under uniform traffic, the aggregated input load to every ISB is the same
and uniformly targets all switch outputs. We set a small value of MBS which is the
maximum burst size (i.e. the number of cells in an ON duration). Cells' destinations
are uniformly distributed among N switch outputs.
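A minimal generator for this kind of uniform ON/OFF input can be sketched as follows. Several details are assumptions made only for illustration and are not taken from the dissertation: burst lengths are drawn uniformly from 1 to MBS, an idle slot starts a new burst with a fixed probability `p_burst`, and the fanout of each burst is drawn uniformly from 1 to `c_max`. The function name `on_off_stream` is hypothetical.

```python
import random


def on_off_stream(num_slots, mbs, n_outputs, c_max, p_burst, seed=0):
    """Return one entry per cell slot: None for an idle slot, or a frozenset
    of destination outputs for a cell. Cells of one burst share destinations."""
    if c_max > n_outputs:
        raise ValueError("c_max cannot exceed the number of outputs")
    rng = random.Random(seed)          # seeded for repeatable experiments
    slots = []
    while len(slots) < num_slots:
        if rng.random() < p_burst:     # start an ON period (assumed rule)
            burst_len = rng.randint(1, mbs)
            fanout = rng.randint(1, c_max)
            # Destinations are uniform over the N switch outputs.
            dests = frozenset(rng.sample(range(n_outputs), fanout))
            slots.extend([dests] * burst_len)
        else:
            slots.append(None)         # OFF (idle) slot, no cell arrives
    return slots[:num_slots]


if __name__ == "__main__":
    stream = on_off_stream(num_slots=20, mbs=4, n_outputs=256, c_max=3, p_burst=0.3)
    for slot, cell in enumerate(stream):
        print(slot, sorted(cell) if cell is not None else "idle")
```

Raising `mbs`, or reusing the previous burst's destinations, would move this sketch toward the non-uniform scenarios described next.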
In contrast, non-uniform traffic is characterized by the "hot spot" phenomenon :
cells accommodated in an ISB prefer to go to some switch outputs, but rarely go
to other output ports. Three scenarios are likely to build non-uniform burst traffic
in an ISB : (1) If MBS (i.e. maximum burst size) is very large, cells arriving in an
ON period will keep targeting the same destinations for a long time. Input traffic is
not uniformly destined to the N switch outputs in this time duration. (2) If bursts are
correlated with each other, cells arriving in successive ON periods keep going to
the same destinations. Even though MBS may be small, cells accumulated over several
ON periods will generate non-uniform traffic in an ISB. (3) In an extreme case, an
ISB only has cells to go to a specific OSB, but has no cells for other OSBs. Each
ISB is dedicated to a different OSB. This is the so-called "1 ISB → 1 OSB hot spot" traffic.
In this dissertation, a VBR source is used to generate the ON-OFF traffic.
The traffic parameters are as follows : MBS is the maximum burst size; PCR is the peak
cell rate (i.e. the number of cells per second), which satisfies PCR ≤ LCR; LCR is the cell rate of the
link; ACR is the average cell rate. We define Fout as the fanout of a cell. Fout has
a uniform distribution from 1 to Cmax, so the average fanout load is F = (Cmax + 1)/2,
where Cmax is the maximum number of copies allowed for a multicast cell. The effective input
load is defined as p = (ACR x F) / LCR, 0 ≤ p ≤ 1.
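As a quick illustration of these definitions, the effective load can be computed directly from ACR, Cmax and LCR. The function names and the numeric values below are hypothetical examples, not parameters reported in the dissertation.

```python
def average_fanout(c_max: int) -> float:
    """Average fanout F = (Cmax + 1) / 2, as defined in the text."""
    return (c_max + 1) / 2


def effective_load(acr: float, c_max: int, lcr: float) -> float:
    """Effective input load p = (ACR x F) / LCR."""
    if lcr <= 0:
        raise ValueError("LCR must be positive")
    return acr * average_fanout(c_max) / lcr


if __name__ == "__main__":
    # Example with made-up numbers: ACR is 30% of the link rate and Cmax = 5,
    # so F = 3 and the effective load is about 0.9.
    print(effective_load(acr=0.3, c_max=5, lcr=1.0))
```

This shows why a multicast source with a modest average cell rate can still present a heavy effective load: each arriving cell counts F times.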
5.6.2 Switch Performance

The switch performance is evaluated through simulations using the OPNET/MIL3
simulation platform [65]. For our switch designs, we use a 256x256 (N = 256)
switch consisting of 8 ISBs and 8 OSBs (K = 8). Each ISB/OSB is of size 32x32 (m
= M = 32). For Switch II using RR+POLR, we assume that the link reservation

Hence, a token only goes through one ISB in a cell slot. Switch III with any faster
link reservation rate (i.e. R > 1) will obtain a better performance than what we
simulated here. For the OQ switch, we assume that the OQ switch can support N
times speedup. Therefore, cells arriving at switch inputs can be transmitted to the
related output queues in a cell slot. Output buffers are infinite so that no cells can
be lost. The OQ switch has been proved to achieve the best performance under any
traffic pattern.
We investigate the switch performance under both uniform and non-uniform
traffic. The following performance statistics are estimated :
• Throughput : switch throughput, statistically measured on the N switch
outputs.
• DE_to_E : the average end-to-end cell delay in terms of the number of cell slots.
DE_to_E is the latency for a cell going through the switch. Delay jitter is
measured by (Min, Max) of DE_to_E.
• DISB : the average cell delay in ISBs in terms of the number of cell slots. We
assume that ATMCSF forwards cells from input shared links to output shared
links in a cell slot; hence, ATMCSF does not cause any cell delay. DE_to_E
results from two parts : the cell delay in ISBs (i.e. DISB), and the cell delay
in OSBs. The vector of (Min, Max) of DISB is the minimum delay and the
maximum delay incurred in ISBs.
• SOSB : the average occupancy of an OSB, measured by the number of cells accommodated
in the OSB. Since each OSB consists of M output queues, SOSB
indicates the total number of cells waiting in an OSB. The vector of (Min,
Max) of SOSB estimates the lower bound and the upper bound of the occupancy
of an OSB.
Table 5.2 compares the switch performance under uniform traffic with
different input loads p. Multicast uniform traffic is applied. For Switch III, we select
HT = 4 and LT = 2, as discussed in the previous section.


Table 5.2 Switch III : performance comparison under uniform multicast traffic with
different input load p. The observed performance statistics are : (1) throughput; (2)
average end-to-end cell delay and delay jitter (DE_to_E, (Min, Max)); (3) average
cell delay in ISB and delay jitter (DISB, (Min, Max)); (4) average occupancy of OSB
(SOSB and (Min, Max)).

In chapter 3 and chapter 4, we have shown that Switch I and Switch II are
able to provide good performance under uniform traffic. The comparison in Table 5.2
indicates that Switch III, as well as Switch I and Switch II, can achieve a performance
comparable to that of the OQ switch under uniform traffic. On throughput performance,
the OQ switch always leads to the maximized throughput p, and our switches have
less than 0.5% throughput degradation. In general, the end-to-end cell delay
DE_to_E increases with input load p. Compared with the lower bound of DE_to_E
achieved in the OQ switch, DE_to_E in our switch designs incurs 2 to 15 more cell
slots. Longer cell delay is due to lower throughput.
It is also observed from Table 5.2 that Switch II and Switch III obtain slightly
better performance than Switch I because the former utilize link reservation
to avoid starvation of OSBs. However, under uniform traffic, link reservation is not
necessary, so Switch I can still obtain a performance similar to that of the other
two switches. When the input load is heavy, such as p = 0.99 or 0.90, Switch II results
in slightly better performance than Switch III, but it happens with the condition

R = 1, Switch II will yield the same performance as Switch I, so that Switch III can
outperform Switch II.
The two QOBDLR algorithms proposed for Switch III are very competitive with
each other. REQREL-QOBDLR algorithm outperforms REQ-QOBDLR algorithm
when input load p is heavy. The reason is that the dynamic link
reservation achieved by REQREL-QOBDLR is faster than that in REQ-QOBDLR
algorithm. But when input load p is less than 0.7, REQ-QOBDLR algorithm
outperforms REQREL-QOBDLR algorithm because REQREL-QOBDLR tends
to release more links which may not be utilized by any ISB. In general, both
REQ-QOBDLR and REQREL-QOBDLR algorithms are capable of providing good
performance for uniform traffic.
Moreover, Table 5.2 shows that DE_to_E in our designs is mainly due to the
latency in OSBs (i.e. DE_to_E - DISB) rather than the delay in ISBs (i.e. DISB). In
addition, SOSB indicates that cells are forwarded to OSBs in a fast manner because
most of the cells are backlogged in OSBs. This is a good feature of our switch
designs because OSBs may be able to incorporate per VC queueing with appropriate
cell schedulers to provide QoS guarantees as the OQ switch does. It is the subject
of our ongoing work.
Table 5.3 compares the switch performance under non-uniform traffic.
Fig 5.10 ~ Fig 5.13 illustrate the performance of throughput, DE_to_E, DISB and
SOSB individually. We apply unicast "1 ISB → 1 OSB HotSpot Traffic" : the input
load injected into an ISB only targets a specific OSB, but no cells go to other OSBs.
Performance comparison shows that Switch I fails to offer a good performance
for non-uniform traffic. The reason is that, in Switch I, an ISB is only allowed
to deliver cells to its matched OSB according to the one-to-one mapping in a cell
slot. If an ISB does not have cells to go to its assigned OSB, other ISBs do not
have authority to send cells to the starved OSB. Under "1 ISB → 1 OSB HotSpot
Traffic", an ISB only has cells to be delivered in 1 out of every K cell slots. Switch I
suffers a significant performance degradation and only approaches 13% throughput
even though input load is 99%. Since more and more cells are blocked in ISBs,
the cell delay in ISBs (i.e. D ISB ) continues to increase. It, therefore, causes an
ever-increasing end-to-end cell delay (i.e. DE_to_E ).
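As a rough consistency check on the 13% figure, and assuming that an ISB is matched to its hot-spot OSB in exactly one of every K cell slots and that it can then fill all M links, the saturated throughput of Switch I with K = 8 can be estimated as:

```latex
\text{Throughput}_{\text{Switch I}}
  \;\approx\; \frac{\text{cells delivered per ISB per } K \text{ slots}}{\text{cells that could be delivered per ISB per } K \text{ slots}}
  \;=\; \frac{M}{K \cdot M}
  \;=\; \frac{1}{K}
  \;=\; \frac{1}{8}
  \;=\; 12.5\% \;\approx\; 13\%
```

This is only an upper-bound style estimate under the stated assumptions; it is in line with the simulated value reported above.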


Table 5.3 Switch III : performance comparison under unicast "1 ISB → 1 OSB
HotSpot" traffic. The observed performance statistics are : (1) throughput; (2)
average end-to-end cell delay and delay jitter (DE_to_E, (Min, Max)); (3) average
cell delay in ISB and delay jitter (DISB, (Min, Max)); (4) average occupancy of OSB
(SOSB and (Min, Max)).


Figure 5.10 Switch III : throughput performance under unicast "1 ISB → 1 OSB
HotSpot" traffic.

Figure 5.11 Switch III : average end-to-end cell delay (DE_to_E) under unicast "1
ISB → 1 OSB HotSpot" traffic.


Figure 5.12 Switch III : average delay in ISBs (DISB) under unicast "1 ISB → 1
OSB HotSpot" traffic.

Figure 5.13 Switch III : average size of OSBs (SOSB) under unicast "1 ISB → 1
OSB HotSpot" traffic.


Switch II will endure the same performance decline as Switch I if link reser-

to an improved performance with any faster link reservation rate such as R = 4 (refer
to Table 5.3). However, Switch II has a weakness : an ISB has to reset its link
reservation vector in every cell slot. Hence, the performance of Switch II is mainly
determined by the link reservation rate, which would be a bottleneck for Switch II in
achieving high performance. For example, Switch II with R = 4, though achieving
better performance than Switch I, cannot approach a performance comparable
to that of the OQ switch.
Being an enhanced switch design, Switch III outperforms the other two switches
and achieves a comparable performance to the OQ switch under non-uniform traffic.
Switch III benefits from the dynamic link reservation schemes so that an ISB does not
need to reset its link reservation vector in every cell slot. Even though the dynamic

using QOBDLR algorithms can adapt to the input traffic quickly and perform a fast
and fair link resource allocation among ISBs. Fig 5.10 shows that Switch III leads to
a throughput very similar to that of the OQ switch. Fig 5.11 indicates that Switch III causes
no more than 30 cell slots of additional DE_to_E delay compared to the OQ switch. We
also observed that most of the cells are forwarded to and buffered in OSBs; hence,
DE_to_E results mainly from the cell delay in OSBs. As we mentioned before, this
is a good feature of the proposed switch because OSBs, which look like the output
queues in the OQ switch, are able to incorporate per VC queueing with appropriate
cell schedulers to provide QoS guarantees.
Under multicast traffic, our switch designs can yield better performance than
under unicast traffic, because ISBs can take advantage of GVOQs to deliver
multicast cells to ATMCSF. Thus, faster cell forwarding can be gained when
switches handle multicast input traffic. Table 4.2 and Table 4.3 have evaluated
Switch I and Switch II under multicast "1 ISB → 1 OSB HotSpot traffic". Here,
we will not further examine Switch III under multicast traffic because Switch III
has already been shown to achieve good performance under unicast traffic,
as shown in Table 5.3. The performance of Switch III under multicast traffic
is therefore expected to be even better.

Figure 5.14 DE_to_E in Switch III using REQ-QOBDLR algorithm with different

Figure 5.15 Max. of DE_to_E in Switch III using REQ-QOBDLR algorithm with

R = 1 is capable of obtaining a high throughput which is comparable to that of the
OQ switch. The choice of R does not affect the throughput performance significantly.
But Fig 5.14 and Fig 5.15 show that, for Switch III using REQ-QOBDLR algorithm,
DE_to_E and delay variance (i.e. Max of DE_to_E) will be reduced if R increases.
The same conclusion can be drawn for REQREL-QOBDLR algorithm : the faster
the link reservation rate, the better the performance.
In summary, Switch III is capable of high performance
under both uniform traffic and non-uniform traffic. Compared to the OQ switch,
Switch III can be claimed as a competitive design in the sense that Switch III not
only achieves a performance comparable to the OQ switch but also eliminates
the N times speedup which is necessary in the OQ switch.

5.7 Conclusion
In this chapter, we present a novel switch design for scalable terabit multicast packet
switches. The proposed switch has a modular architecture consisting of ISBs,
OSBs and a central switch fabric. Dual round robin rings provide a mechanism for
ISBs to dynamically "borrow" and/or "lend" links from/to each other. The switch
benefits from input and output link sharing so that no speedup is needed in the
central switch fabric.
To resolve input and output contentions, cell delivery is based on link reservation in every ISB. We propose two Queue Occupancy Based Dynamic Link Reservation algorithms, both of which are able to provide a fast and fair link resource
allocation among ISBs. QOBDLR is a distributed link reservation scheme in which
an ISB can dynamically increase/decrease its link reservation for a specific OSB
according to its local available information. Arbitration complexity is O(1).
Performance evaluation demonstrates that Switch III can achieve a performance
comparable to the OQ switch under any traffic pattern. Moreover, our switch design
can scale easily without requiring speedup, while the OQ switch supporting similar
performance needs N times speedup (N is the switch size), which is impractical in
large scale switches.

CHAPTER 6
CONCLUSION AND FUTURE WORK
The aim of this dissertation is the design of a scalable, large-capacity, high
performance core switch for broadband networks. The issues addressed for the switch
design include multicasting, architecture scalability, and arbitration complexity. In
this dissertation, we proposed three novel scalable terabit multicast packet switches :
Switch I, Switch II and Switch III.
From an architectural point of view, all the proposed switches are characterized by a modular configuration using ISBs, OSBs and ATMCSF. Furthermore,
all switches employ a novel co-operative input and output link sharing so that no
speedup is necessary in the central switch fabric. Thus, the bottleneck in memory
access rate and architecture expansion is avoided. Multicast function is achieved by
cell splitting along with cell delivery. The novel scheme of grouped virtual output
queue (GVOQ) provides a fast cell forwarding and simple cell scheduling, especially
for multicast traffic. Because of the modular architecture, the proposed switches can
easily scale to a large size and high capacity.
Instead of using a centralized scheduler to resolve input and output contentions,
we proposed various distributed resource allocation algorithms for each switch design.
In Switch I, two round robin scheduling algorithms, IVOQ RR and GVOQ RR,
are presented. The arbitration complexity of IVOQ RR is in the range [O(1),
O(M)], while GVOQ RR sustains a low complexity of O(1). Switch II applies a
prioritized link reservation algorithm, RR+POLR, to eliminate the starvation of OSBs.
This results in a substantial improvement in switch performance, especially for non-uniform traffic. For the enhanced Switch III, we proposed two dual round robin
dynamic link reservation algorithms, REQ-QOBDLR and REQREL-QOBDLR.
A fast and fair link resource allocation among ISBs is achieved by "borrowing"
and/or "lending" links from each other through REQ tokens and REL tokens. Both
algorithms are distributed link reservation schemes in which every ISB, according to
its local available information, can dynamically modify its own link reservation. As
arbitration complexity is O(1), scheduling complexity is no longer an obstacle
to the switch growing to a large scale.
Comparison studies on switch performance show that Switch I performs well
for uniform traffic but it is not suitable for non-uniform traffic. Though Switch II
yields an improved performance under non-uniform traffic, RR+POLR algorithm is
not flexible enough to quickly adapt to the input traffic. Switch III benefits from
dynamic link reservation which provides a fast and fair resource allocation. Hence,
Switch III achieves a high performance as good as the OQ switch, while at the same
time eliminating the N times speedup of central switch fabric required in the OQ
switch. Thus, Switch III is a good choice for a scalable terabit multicast packet
switch.
The following issues need to be further addressed for practical implementation
of Switch III. First, the optimal choice of HT and LT may need to be investigated
rather than the bounds of HT and LT. A more comprehensive theoretical work on
REQ-QOBDLR and REQREL-QOBDLR algorithms is needed, and is ongoing work.
In addition, a detailed study on QoS features of Switch III might be necessary, even
though it appears that OSBs can incorporate per VC queueing with appropriate cell
schedulers to provide QoS guarantee.

APPENDIX A
REQ-QOBDLR ALGORITHM

A.1 Operations upon receiving REQj token
When receiving REQ j token, the i th ISB will evaluate its queue occupancy qji against
two thresholds : a high threshold (HT) and a low threshold (LT). Then, the i th ISB
decides whether to request an extra link and/or release a link to the j th OSB.
Step 1 : The i th ISB decides whether to request an additional link for the j th
OSB ?


will be inserted into REQ_NUM j in Step 3. In case 2, the i th ISB had sent a link
request but has not obtained the desired link yet. The i th ISB will keep waiting for
an available link but will not issue a new link request again. In case 4, the i th ISB
will cancel its current link request if REQ_NUMj > 0.

Step 2 : The i th ISB decides whether to release a link if REQ token carries
link requests ?

Step 3 : The i th ISB updates REQ token, then passes REQ token to the next
down-stream ISB.
If a new-born link request for the j th OSB was generated by the i th ISB in Step
1 (i.e. case 1 in Fig A.1), the i th ISB will insert this new request in REQ token.
Hence, REQ_NUM j will be increased by 1. Finally, the i th ISB forwards REQ j token
to its down-stream adjacent ISB.
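The three steps can be put together in a short sketch. Because the arbitration figures are not reproduced here, the exact conditions are assumptions reconstructed from the surrounding prose: a request is issued when the occupancy exceeds HT (case 1), an outstanding request is kept while the occupancy stays above HT (case 2), an outstanding request is cancelled once the occupancy is no longer above HT and REQ_NUM j > 0 (case 4), and a link is marked for release when the occupancy is below LT and the token still carries requests. All class, field and function names are hypothetical.

```python
from dataclasses import dataclass

HT = 4  # high threshold (value selected in Section 5.5.2)
LT = 2  # low threshold (value selected in Section 5.5.2)


@dataclass
class GvoqState:
    """State one ISB keeps for one OSB j (illustrative field names)."""
    queue_len: int = 0             # queue occupancy qji
    reserved_links: int = 0        # links currently reserved toward OSB j
    request_pending: bool = False  # a link request was sent, not yet granted
    release_pending: bool = False  # a release waits for the REL token


def on_req_token(state: GvoqState, req_num: int) -> int:
    """Process REQ j at one ISB and return the REQ_NUM j that is forwarded."""
    new_request = False

    # Step 1: decide whether to request an additional link.
    if state.queue_len > HT:
        if not state.request_pending:
            state.request_pending = True
            new_request = True           # case 1: new-born request
        # case 2: a request is already outstanding, keep waiting
    elif state.request_pending and req_num > 0:
        state.request_pending = False    # case 4: cancel the request
        req_num -= 1

    # Step 2: decide whether to release a link for a carried request.
    # Assumption: one release answers one request, so REQ_NUM j drops by 1.
    if (req_num > 0 and state.queue_len < LT
            and state.reserved_links > 0 and not state.release_pending):
        state.release_pending = True
        req_num -= 1

    # Step 3: insert the new request, then the token moves downstream.
    if new_request:
        req_num += 1
    return req_num


if __name__ == "__main__":
    busy = GvoqState(queue_len=6, reserved_links=4)
    idle = GvoqState(queue_len=1, reserved_links=4)
    token = on_req_token(busy, 0)     # busy ISB adds a request
    token = on_req_token(idle, token) # idle ISB answers it
    print(token, busy.request_pending, idle.release_pending)
```

Each call touches only the ISB's own state and the single counter on the token, which is the property behind the O(1) arbitration complexity claimed in Section 5.5.1.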

A.2 Operations upon receiving REL n token

When receiving REL n token, the i th ISB will decide whether to take a link from
REL n token if REL n token carries available links.
Step 1 : The i th ISB decides whether to take an available link from REL n
token.

Step 2 : The i th ISB updates REL n token, then passes REL n token to its
up-link neighboring ISB.
If the i th ISB holds a pending link release for the n th OSB (i.e. lni = 1), the
i th ISB can now actually release the link for the n th OSB through REL n token. The
APPENDIX B
REQREL-QOBDLR ALGORITHM

REQREL-QOBDLR algorithm is illustrated in detail in this appendix. The i th ISB
(0 ≤ i < K) needs another vector, LK_REQ_Modify i, besides the four vectors
Qi, LK_RSV i, LK_REL i and LK_REQ i.
• LK_REQ_Modify i : Link Request Modification Vector in the i th ISB.

Assume that the i th ISB is receiving REQ j and REL n tokens in the current
Rsv_Slot. The operation for the REQ token is largely the same as that
in REQ-QOBDLR algorithm, but the operation for the REL n token differs from that in
REQ-QOBDLR algorithm.
We first describe the operation upon receiving REL n token. Then we focus
on the impact of this difference in the operation for REL n token on the
operation for REQ token.

B.1 Operations upon receiving REL n token

Step 1 : The i th ISB decides whether to take a released link from REL n token,
if REL_NUM n > 0?
Arbitration : If REL_NUM n > 0, there are available links to the n th OSB.
The i th ISB will take a released link from REL n token only if its queue occupancy

Operation : If the i th ISB decides to take an available link from REL n token,

i th ISB is snatching a link which is supposed to satisfy another ISB's link request. To
compensate for the 'stolen' link, the i th ISB will issue a link request for the n th OSB to
trigger a new released link, not for itself but for another ISB that is still waiting for
its desired link. Since the i th ISB cannot use REQ j token to carry a link request for
the n th OSB in the current Rsv_Slot, the ISB records this pending link request in

REQ n token in some Rsv_Slot(s) later, ISB i will first add the pending link request
into REQ_NUM n .
Step 2 : The i th ISB decides whether to release an occupied link to the n th
OSB?

reduce 1. The released link will be inserted into RELn token right away so that
REL_NUMn will increase 1.
Since the i th ISB releases a link based on its own traffic load but it does not
know whether other ISBs are demanding this available link for the n th OSB. The


Rsv_Slot) may cause a potential pending increase or decrease in the total link requests
carried in the REQ j token. Hence, when the i th ISB now receives the REQ j token,
REQ_NUM j is adjusted to reflect more realistically the number of link
requests for the j th OSB. REQ_NUM j may be negative, which implies that the available
links to the j th OSB currently outnumber the link requests for the j th OSB.
A negative REQ_NUM j will not trigger any further link release for the j th OSB. Hence,
in the long term, the released links for the j th OSB stay in balance
with the link requests for the j th OSB (i.e. REL_NUM j ≤ REQ_NUM j). This is a
characteristic of the REQREL-QOBDLR algorithm. (END)
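The adjustment of REQ_NUM j and the rule that a negative count triggers no further release can be illustrated with a small Python sketch. The function names and the signed `pending_delta` argument are hypothetical, and treating a count of zero as non-triggering is an assumption; the text only states the negative case.

```python
def adjust_req_num(req_num: int, pending_delta: int) -> int:
    """Apply an ISB's pending increase (+) or decrease (-) to REQ_NUM j."""
    return req_num + pending_delta


def triggers_release(req_num: int) -> bool:
    """Only a positive request count can trigger a further link release.

    Assumption: zero is treated like a negative count (no release).
    """
    return req_num > 0


# One outstanding request, then two links become available without being asked for:
req_num = adjust_req_num(1, -2)
print(req_num, triggers_release(req_num))
```

With a negative REQ_NUM j, later link requests first cancel the surplus of available links before any new release is triggered, which is how released links and requests stay in balance over time.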

REFERENCES

1. M. H. Guo, R. S. Chang, Multicast ATM Switches : Survey and Performance
Evaluation, Computer Communication Review, Vol 28, No. 2, April 1998,
pp. 98-131.
2. J. Turner, N. Yamanaka, Architecture Choices in Large Scale ATM Switches,
WUCS 97-21, May 1997.
3. A. Huang and S. Knauer, STARLITE: A Wideband Digital Switch, Proc. IEEE
GLOBECOM'84, pp. 121-125, Dec. 1984.
4. J. S. Turner, Design of a Broadcast Packet Switching Network, IEEE Trans. on
Commun., Vol. 36, June 1988, pp. 734-743.
5. T. T. Lee, Nonblocking Copy Networks for Multicast Packet Switching, IEEE J.
on Select. Areas in Commun. Vol. 6, December 1988, pp. 1445-1467.
6. J.S. Turner, A Practical Version of Lee's Multicast Switch Architecture, IEEE
Trans. on Commun., Vol 41, No.8, August 1993, pp. 1166-1169.
7. J. Kim, J. Park, H. Yoon, J.W. Cho, Fault-Tolerant Multicasting in MIN's
for ATM Switches, IEEE Commun. Letters, Vol. 2, No. 12, Dec. 1998,
pp.331-333.
8. K.Y. Eng, M.G. Hluchyj, Y.S. Yeh, Multicast and Broadcast Services in a
Knockout Packet Switch, Proc. of INFOCOM'88, 1988, pp.29-34.
9. C.K. Kim, T.T. Lee, Call Scheduling Algorithms in a Multicast Switch, IEEE
Trans. on Commun., Vol. 40, No. 3, March 1992, pp. 625-635.
10. D. X. Chen, J. W. Mark, SCOQ : a Fast Packet Switch with Shared Concentration and Output Queueing, IEEE/ACM Trans. on Networking, Vol. 1,
1993, pp.142-151.
11. H. J. Chao, B. S. Choe, Design and Analysis of A Large-Scale Multicast Output
Buffered ATM Switch, IEEE/ACM Trans. on Networking, Vol. 3, No. 2,
April 1995, pp. 126-138.
12. H. J. Chao, B. S. Choe, J. S. Park, N. Uzun, Design and Implementation of
Abacus Switch : A Scalable Multicast ATM Switch, IEEE J. on Select.
Areas in Commun., Vol. 15, No. 5, June 1997, pp. 830-843.
13. K. Wang, M.H. Cheng, Design and Performance Analysis of a Growable
Multicast ATM Switch, Proc. of INFOCOM'97, pp.934-940.

119

120

14. D.J. Marchok, C.E. Rohrs, R.M. Schafer, Multicasting in a Growable Packet
(ATM) Switch, INFOCOM'91, 1991, pp. 850-858.
15. Y.Yang, G.M. Masson, Broadcast Ring Sandwich Networks, IEEE Trans. on
Computers, Vol 44, Oct. 1995, pp. 1169-1180.
16. K.Y. Eng, M.J. Karol, G.J. Cyr, M.A. Pashan, Design and Prototype of a Terabit
ATM Switch Using a Concentrator-Based Growable Switch Architecture,

Proc. of International Switching Symposium (ISS), 1995, pp. 404-408.
17. M.R. Hashemi, A. Leon-Garcia, The Single-Queue Switch : A Building Block
for Switches with Programmable Scheduling, IEEE J. on Select. Areas on
Commun., Vol. 15, No. 5, June 1997, pp. 785-793.
18. A.K. Choudhury, E.L. Hahne, A New Buffer Management Scheme for Hierarchical Shared Memory Switches, IEEE/ACM Transactions on Networking,
Vol. 5, No. 5, October 1997, pp. 728-738.
19. R.C.Chang, C.Y. Hsieh, Design of Multicast ATM Switch, IEE Electronics
Letters, Vol. 34, No. 22, October 29, 1998, pp. 2089-2091.
20. Y. Xiong, L. Mason, Multicast ATM Switches Using Buffered MIN Structure :
A Performance Study, IEEE INFOCOM'97, pp. 926-933.
21. F. M. Chiussi, Y. Xia, V. P. Kumar. Performance of Shared-Memory Switches
Under Multicast Bursty Traffic, IEEE J. on Select. Areas in Commun.,
Vol. 15, No. 3, April 1997, pp. 473-487.
22. A. Racz, G. Fodor, Z. Turanyi, Weighted Fair Early Packet Discard at an ATM
Switch Output Port, IEEE INFOCOM'99, S 8E.
23. J.S. Wu, C.C. Ke, ATM Shared Memory Switch with Multicasting Balancing,
IEICE Trans. on Commun., Vol. E78-B, No. 9, September 1995, pp. 12621268.
24. H.J.Chao, N. Uzun, An ATM Queue Manager Handling Multiple Delay and Loss
Priorities, IEEE/ACM Trans. on Networking, Vol. 3, No. 6, December
1995, pp. 652-659.
25. A.K.Choudhury, E.L.Hahne, A New Buffer Management Scheme for Hierarchical Shared Memory Switches, IEEE/ACM Trans. on Networking, Vol.
5, No. 5, October, 1997, pp.728-738.
26. S.H.Byun, D.K.Sung, A General Expansion Architecture for Large-Scale
Multicast ATM Switches, IEEE GLOBECOM'97, S7-7.

121

27. S. Kumar, D.P. Agrawal, On Multicast Support for Shared Memory Based ATM
Switch Architecture, IEEE Network, Jan/Feb 1996, pp. 34 39.
-

-

-

28. M. R. Hashemi, A. Leon Garcia, A Multicast Single Queue Switch with a Novel
Copy Mechanism, Proc. of IEEE INFOCOM'98.
-

-

29. M. R. Hashemi, A. Leon Garcia, A Multicast Single Queue Switch with a Novel
Copy Mechanism, IEEE INFOCOM'98, SO7A-2.
-

-

30. N. Mckeown, V. Anantharam, J. Walrand, Achieving 100% Throughput in an
Input Queued Switch, Proc. of IEEE INFOCOM'96, March 1996.
-

31. A. Mekkittikul, N. McKeown, A Starvation-free Algorithm For Achieving 100%
Throughput in an Input-Queued Switch, Proc. of ICCCN'96.
32. A. Mekkittikul, N. Mckeown, A Practical Scheduling Algorithm to Achieve 100%
Throughput in Input-Queued Switches, Proc. of IEEE INFOCOM'98,
April 1998.
33. B. Prabhakar, N. McKeown, and R. Ahuja, Multicast Scheduling for InputQueued Switches, IEEE J. on Select. Areas in Commun., Vol. 15, No. 5,
pp. 855-866, 1997.
34. B. Prabhakar, N. Mckeown, Designing A Multicast Switch Scheduler, Proc. of
the 33th Annual Allerton Conference on Communication, Control and
Computing, October 1995.
35. S-T. Chuang, A. Goel, N. Mckeown, B. Prabhakar, Matching Output Queueing
with Combined Input and Output Queueing, Proc. of IEEE INFOCOM'99,
March, 1999.
36. M. Andrews, S.Khanna, K. Kumaran, Integrated Scheduling of Unicast and
Multicast Traffic in an Input-Queued Switch, IEEE INFOCOM'99, S 8E.
-

37. J. S C. Chen, R. Guerin, Performance Study of an Input Queueing Packet Switch
with Two Priority Classes, IEEE Trans. on Commun., Vol. 39, No. 1,
pp.117-126.
-

38. L. Jacob, A. Kumar, Delay Performance of Some Scheduling Strategies in an
Input Queuing ATM Switch with Multiclass Bursty Traffic, IEEE/ACM
Trans. on Networking, Vol. 4, No. 2, April 1998, pp. 258-271.
39. M.J. Karol, M.G. Hluchyj, S.P.Morgan, Input versus Output Queueing on a
Space Division Packet Switch, IEEE Trans. on Commun., Vol. Com-35,
No. 12, December 1987, pp. 1347-135.
-

122
40. T. T. Lee, A Modular Architecture for Very Large Packet Switches, Proc. of
GLOBECOM'89, Dec. 1989, pp. 1801-1809.
41. J. N. Giacopelli, J. J. Hickey, W. S. Marcus, W. D. Sincoskie, M. Littlewood,
Sunshine A High Performance Self Routing Broadband Packet Switch
Architecture, IEEE J. Select. Areas in Commun., Vol. 9, No. 8, Oct. 1991,
-

-

pp. 1289-1298.

42. Bellcore, Broadband Switching System (BSS) Generic Requirements, BSS
Performance, GR 110 CORE, Issue 1, Sept. 1994.
-

-

43. F. Sestini, Recursive Copy Generation for Multicast ATM Switching,
IEEE/ACM Trans. on Networking, Vol. 5, No. 3, June 1997, pp.329 335.
-

44. S. C. Liew, A General Packet Replication Scheme for Multicasting in Interconnection Networks, IEEE INFOCOM'95, 3d.4.1-3d.4.8.
45. J. F. Hayes, R. Breault, M. K. Mehmet-Ali, Performance Analysis of a Multicast
Switch, IEEE Trans. on Commun., Vol 39, No. 4, pp. 581-587.
46. J.S C, Chen, T.E.Stern, Throughput Analysis, Optimal Buffer Allocation, and
Tra fic Imbalance Study of a Generic Nonblocking Packet Switch, IEEE
J. on Select. Areas in Commu., Vol. 9, No. 3, April 1991, pp. 439-449.
-

f

47. W. D. Zhong, Y. Onozato, J. Kaniyil. A Copy Network with Shared Buffers for
Large Scale Multicast ATM Switching, IEEE/ACM Trans. on Networking,
Vol. 1, No. 2, April 1993, pp. 157 165.
-

-

48. R. P. Bianchini Jr., H. S. Kim. Design of a Nonblocking Shared Memory Copy
Network for ATM, Proc. of INFOCOM'92, pp. 6D.3.1-6D.3.10.
-

49. P. S. Min, M, V. Hegde, and H. S. Saidi, A. Chandra. Nonblocking Copy
Networks in Multi Channel Switching, IEEE/ACM Trans. on Networking,
Vol. 3, No. 6, December 1995, pp. 857 871.
-

-

50. X. Liu, H. T. Mouftah. Queuing Performance of Copy Networks With Dynamic
Cell Splitting for Multicast ATM Switching IEEE Trans. on Commun.,
Vol. 45, No. 4, April 1997, pp. 464-472.
51. F. M. Chiussi, Y. Xia, V. P. Kumar, Performance of Shared Memory Switches
Under Multicast Bursty Traffic, IEEE/ACM Trans. on Networking, Vol.
1, No. 2, April 1993, pp. 157-165.
-

52. W.T. Chen, Y.L. Chang, W.Y. Hwang, A High Performance Cell
Scheduling Algorithm in Broadband Multicast Switching Systems, IEEE
GLOBECOM'97, S5B-3.

123

53. C.K. Kim, Performance Analysis of a Duplex Multicast Switch, IEEE Trans. on
Commun., Vol. 40, No. 10, October 1992, pp. 1615 1624.
54. N. Uzun, A. Blok, Ten Terabit Multicast Packet Switch with SRRM Scheduling
Algorithm, BSS'99, Kingston, Canada, June 1999.
-

55. R. 0. LaMaire, D. N. Serpanos, Two-Dimensional Round-Robin Schedulers
for Packet Switches with Multiple Input Queues, IEEE/ACM Trans. on
Networking, Vol. 2, No. 5, October 1994, pp. 471-482.
56. J.F. Hayes, R. Breault, M. K. Mehmet-Ali, Performance Analysis of a Multicast
Switch, IEEE Trans. on Commun., Vol. 39, No. 4, April 1991, pp. 581-587.
57. C.S. Wu, G.K. Ma, B S. P. Lin, A Cell Scheduling Algorithm for VBR Traffic
in an ATM Multiplexer, IEEE, 1995, pp. 632 637.
-

-

58. S.Q. Li, J.W. Mark, Traffic Characterization for Integrated Services Networks,
IEEE Trans. on Commun., Vol 38, No. 8, August 1990, pp. 1231 1243.
-

59. A.K. Choudhury, E.L. Hahne, Dynamic Queue Length Thresholds for SharedMemory Packet Switches, Vol. 6, No. 2, April 1998, pp.130-140.
60. M. Murata, Y. Oie, T. Suda, H. Miyahara, Analysis of a Discrete Time Single-

Server Queue with Bursty Inputs for Traffic Control in ATM Networks,

IEEE J. on Select. Areas in Commun., Vol. 8, No. 3, April 1990, pp.
447-458.
61. N. Akar, N.C. Oguz, K. Sohraby, Matrix-Geometric Solutions of M/G/1-Type
Markov Chains : A Unifying Generalized State-Space Approach, IEEE J.
on Select. Areas in Commun., Vol. 16, No. 5, June 1998, pp. 626 639.
-

62. R. Jafari, K. Sohraby, Performance Analysis of a Priority based ATM Multiplexer with Correlated Arrivals, IEEE INFOCOM'99, pp. 1036 1043.
-

63. R. Jafari, K. Sohraby, General Discrete-Time Queueing Systmes with Correlated
Batch Arrivals and Departures, IEEE INFOCOM'00, March 2000.
64. N. Matsufuru, R. Aibara, Efficient Fair Queueing for ATM Networks using
Uniform Round Robin, IEEE INFOCOM'99, pp. 389-397.
65. OPNET by MIL3 Inc., Washington, DC 20008.

