1 Asynchronous vs Synchronous Input-Queued Switches by Andrea Bianco et al.
Asynchronous vs Synchronous
Input-Queued Switches
Andrea Bianco, Davide Cuda, Paolo Giaccone
Dipartimento di Elettronica, Politecnico di Torino (Italy)
Abstract—Input-queued (IQ) switches are one of the reference archi-
tectures for the design of high-speed packet switches. Classical results
in this ﬁeld refer to the scenario in which the whole switch transfers the
packets in a synchronous fashion, in phase with a sequence of ﬁxed-
size timeslots, tailored to transport a minimum-size packet. However, for
switches with large number of ports and high bandwidth, maintaining
an accurate global synchronization and transferring all the packets in
a synchronous fashion is becoming more and more challenging. Fur-
thermore, variable size packets (as in the trafﬁc present in the Internet)
require rather complex segmentation and reassembly processes and
some switching capacity is wasted due to partial ﬁlling of timeslots.
Thus, in this work we consider a switch able to natively transfer
packets in an asynchronous fashion thanks to a simple and distributed
packet scheduler. We investigate the performance of asynchronous IQ
switches and show that, despite their simplicity, their performance is
comparable to or even better than that of synchronous switches. These
results highlight the great potential of the asynchronous approach for
the design of high-performance switches.
1 INTRODUCTION
A vast technical literature exists on input-queued (IQ)
switches, which are considered to be a winning choice
to achieve high-end performance due to their limited
technological requirements. Basically, IQ switches trade
a lower internal data transfer capacity (i.e., very limited
speedup of the switching fabric) for a larger complexity
in switch control and scheduling algorithms. Classical
results in this ﬁeld mostly refer to a synchronous slotted
operation of the entire switch: incoming variable-size
Ethernet or IP packets must be segmented at switch
inputs in ﬁxed-size data units, which are transferred
to outputs, where they are re-assembled as variable-
size packets. Beyond the complexity/efﬁciency costs of
this segmentation/reassembly process, the real imple-
mentation of a fully synchronous large packet switch
is not a trivial task. Indeed, the difﬁculty in keeping
under control the alignment of the clock reference signals
in different parts of the (often multi-rack) switch, and
the different propagation delays in boards, backplanes
and interconnection ribbons (often in presence of high-
parallelism buses), forced several manufacturers to have
independent clocking domains in different subsystems
of the switch, leading to an asynchronous operation.
Consider that on a 1 Gbps line, each bit lasts 1 ns,
corresponding to 20 cm in space on the line. Hence, the
time alignment is lost for two bits traveling over paths
differing by more than 20 cm in length.
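The 20 cm figure can be checked with a short calculation. The sketch below assumes a signal propagation speed of 2 × 10^8 m/s (roughly two thirds of the speed of light), which is the value implied by the numbers in the text; the function name `bit_length_m` is hypothetical.

```python
def bit_length_m(rate_bps, propagation_speed_mps=2e8):
    """Spatial length of one bit on the line, in metres.

    propagation_speed_mps is an assumed value, not one stated in the paper.
    """
    bit_time_s = 1.0 / rate_bps          # duration of one bit
    return propagation_speed_mps * bit_time_s


# At 1 Gbps a bit lasts 1 ns and occupies about 0.2 m of line.
length_1g = bit_length_m(1e9)
# At 10 Gbps the same bit occupies ten times less space.
length_10g = bit_length_m(10e9)
```

At higher line rates the tolerable path mismatch shrinks proportionally, which is why global alignment becomes harder as the bandwidth grows.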
In this paper we collect known results (referenced in
the text when needed) and derive new results, showing
that the asynchronous IQ switch operation does not
introduce signiﬁcant performance detriment. We show
that the asynchronous operation simpliﬁes scheduling
and can even lead to higher throughput in some sce-
narios. Even when the asynchronous operation leads to
performance losses in comparison with synchronous cell
switching, these losses are limited, and may be smaller
than the losses due to the above-mentioned segmenta-
tion/reassembly overheads. In real scenarios, the over-
heads for synchronous operations can be responsible for
performance losses in the order of 10%, a non-negligible
value. The main novel contribution of our work is
to highlight the effect of the packet size distribution
on the performance of asynchronous and synchronous
switches, with different queueing architectures. On the
contrary, past works mainly concentrated on either ﬁxed
or exponentially distributed packet sizes.
Our contributions are organized as follows. In Sec. 2
we describe the different switching architectures consid-
ered in our work and deﬁne the synchronous and asyn-
chronous behavior of the switch. We describe the trafﬁc
model and show Property 1 that bounds the maximum
variation coefﬁcient of the packet size. This bound is
practically relevant, since this variation coefficient affects
the throughput. In Sec. 2.4 we discuss in detail the
overhead of the segmentation process in synchronous
switches and we evaluate the corresponding overhead
in real scenarios, through the analysis of real trafﬁc
traces. These results show that segmentation introduces
a considerable performance degradation in synchronous
switches, independently of the speciﬁc queueing struc-
ture. Sec. 3 focuses on input queued switches with a
single queue for each input. The maximum throughput
is theoretically evaluated, showing its dependence on
the coefficient of variation of the packet size. The theoretical
results are validated through simulation. Sec. 4
focuses on IQ switches with Virtual Output Queueing.
It investigates the performance mainly by simulation,
showing the effect of the packet size distribution. In
Sec. 5 we investigate the throughput performance of
a bufferless switch under different trafﬁc models. This
specific architecture highlights the effect of the
output contention for synchronous and asynchronous
arrival patterns. Note that this architecture is practically
relevant mainly for transparent packet optical switching [1],
in which packets are not stored to avoid power-hungry
electro-optical conversions.
A preliminary version of this work appeared in [2].
2 SYSTEM MODEL
We assume that packets are switched across an N × N
bufferless non-blocking switching fabric, e.g. a crossbar.
Furthermore, no speedup is available, i.e. the transfer
rate at the inputs and at the outputs of the switching
fabric is the external switch rate. Packets arrive at the
input ports of the switch where they are stored and
processed. Since no speedup is available, input queues
are needed to cope with output contentions, i.e., when
several packets from different inputs are directed to the
same output. Queues at the outputs are not needed, except
for reassembly purposes. A scheduler solves output
contention between head-of-line (HoL) packets by choos-
ing a set of packets that can be transferred satisfying
two constraints: at most one packet can be transferred
from each input and to each output at the same time. A
feasible conﬁguration of the switching fabric is referred
to as a “matching” in the bipartite graph whose left-side
nodes correspond to the inputs and the right-side nodes
correspond to the outputs. We assume that packets have
variable size.
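The two constraints on a feasible configuration can be stated compactly in code. The following is a minimal sketch, assuming a configuration is represented as a list of (input, output) pairs; the function name `is_matching` and this representation are illustrative choices, not taken from the paper.

```python
def is_matching(edges):
    """Return True if no input and no output appears in more than one edge.

    edges: iterable of (input_index, output_index) pairs, one per packet
    being transferred across the switching fabric.
    """
    edges = list(edges)
    inputs = [i for i, _ in edges]
    outputs = [j for _, j in edges]
    # At most one packet from each input and at most one packet to each output.
    return len(set(inputs)) == len(inputs) and len(set(outputs)) == len(outputs)


feasible = is_matching([(0, 1), (1, 0)])      # two disjoint transfers
conflicting = is_matching([(0, 1), (2, 1)])   # output 1 is used twice
```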
2.1 Switch architecture
We consider three switching architectures, shown in
Fig. 1.
The ﬁrst one is a bufferless switch, without queues at
the inputs: when more than one packet, destined to the
same output, arrives at different inputs, one packet is
served and the others are lost. Even if this is meaningful
only for all-optical switches, it is worth investigating
to highlight the throughput reduction due to the simple
contention resolution scheme and the lack of queueing.
Note that the scheduler can be easily distributed among
the outputs, because the scheduling decision can be
taken independently by each output.
The second architecture is an input-queued (IQ) switch
with a single FIFO queue per input and a total of N
queues. The scheduling decision is relatively simple and
can be distributed among the outputs. Its main drawback
is that it suffers from the HoL blocking problem [3] that
limits the maximum achievable throughput.
Fig. 1. Switching architectures: (a) bufferless switch, (b)
single-FIFO IQ switch, (c) VOQ IQ switch
Finally, we consider IQ switches with VOQ (Virtual
Output Queueing), i.e. with one FIFO queue for each
input-output pair. This architecture avoids the throughput
degradation due to HoL blocking, even if at the
cost of managing N^2 queues. To obtain high throughput,
the scheduling decision requires coordination between inputs
and outputs, thus increasing scheduler complexity.
2.2 Synchronous (SYN) switching
In SYN switches, all data transfers across the switch-
ing fabric occur at the same time and last exactly one
“timeslot”. The timeslot duration is equal to the packet
transmission time for networks in which all the packets
have the same size. In the case of variable-size packets, as
in the Internet trafﬁc, the packets are chopped into ﬁxed
sized packets (named cells), whose transmission time is
the timeslot. Cells are individually switched across the
switching fabric and then reassembled at the outputs
to obtain the original packet, ready to be sent to the
output interface. The timeslot duration (or, equivalently,
the cell size) requires careful design to minimize the
throughput loss due to cell granularity. In Sec. 2.4 we
will address this design problem by considering two real
packet trafﬁc traces [4], [5] captured on the network of
FastWeb, one of the largest Italian ISPs. We will show
that around 10% of the bandwidth is lost even if the
cell size is optimized, due to partial filling of slots and
extra overheads.
Schedulers for SYN switches can work in either “cell
mode” or “packet mode”. A cell-mode (CM) scheduler
takes independent decisions at each time slot, without
considering the packet each cell belongs to. Thus, at the
output of the switching fabric packet interleaving may
occur and some reassembly queues are needed at the
outputs. Furthermore, partial losses of the packet content
may occur. On the contrary, packet-mode (PM) sched-
ulers [6] take into account that the cells are originated
by packets. Thus, PM schedulers force the transfer of
all the cells belonging to the same packet in consecutive
timeslots. As a consequence, no packet interleaving is
allowed at the outputs.
2.3 Asynchronous (ASY) switching
In ASY switches, the initial time at which a packet is
transferred across the switching fabric occurs indepen-
dently of other packets. When the packet has been com-
pletely transmitted to an output or a new packet arrives
to an empty queue, a new matching can be computed
between the inputs and outputs that are currently free.
Packet interleaving is not allowed, as in PM for SYN
switches. However, packet transfer through the switch-
ing fabric occurs asynchronously. The ASY model well
captures the behavior of asynchronous and variable-
length optical packet switching systems [1], [7] since in
the optical domain the packet alignment may be difficult
to achieve [8]. Moreover, the ASY model approximates
electronic packet switching when the minimum unit at
which the transfer and the control occurs is much smaller
(e.g., word or byte) than the packet/cell.
For simplicity, we consider an abstract “pure” ASY
model, in which the minimum transfer unit tends to
zero. Thus, we assume that all packets have a size
which is a continuous random variable. The scheduling
decision becomes simple to implement, because at
most one packet can ﬁnish its transmission across the
switching fabric at a given time or can arrive to an empty
port.
As a drawback, due to the asynchronous nature of
packet transmissions, when a packet has been fully
transmitted, the input (output) can be matched to a
different output (input) only if there exists at least one other
non-busy output (input). This fact limits the degrees of
freedom in changing the matching, especially at high
load. Hence, some queues can suffer from temporary
starvation, which may increase the average delay expe-
rienced by packets.
2.4 Segmentation overhead in synchronous switch-
ing
In SYN switches, when packets are chopped into ﬁxed-
size cells, some bandwidth is wasted mainly due to
two effects: (i) unﬁlled cells, (ii) additional control in-
formation needed on each cell. Indeed, the last cell of
a packet may be only partially ﬁlled due to rounding.
In the worst case, a packet slightly larger than a cell
generates two cells. As a consequence, almost 50% of the
bandwidth can be wasted. Furthermore, each cell should
carry some control information to correctly reassemble
the packet at the output reassembly module (ORM). Examples of such information
are: the sequence number, the last-cell ﬂag, the packet
identiﬁer, or the payload size. Each cell should carry also
some control information to route correctly the cell inside
the switching fabric, such as the router port or interface.
On the contrary, in ASY switches, when transferring a
packet across the switching fabric, the packet simply
carries control information to route the packet, as it
happens for each cell in a SYN switch.
We evaluate the throughput inefﬁciency in SYN
switches due to segmentation, with respect to ASY
switches, in some realistic cases. In a SYN switch, let
b_reas be the amount of control information added for
each cell to correctly reassemble the packet in the ORM,
and let b_route be the amount of control information for
each cell devoted to the routing process. A packet of
size p generates ⌈p/c⌉ cells, where c is the cell payload size
and ⌈x⌉ is the smallest integer ≥ x. The total amount of data D_SYN
transferred across the switching fabric is
$$D_{SYN}(p,c) = \left\lceil \frac{p}{c} \right\rceil \left(c + b_{route} + b_{reas}\right) \qquad (1)$$
Fig. 2. Segmentation speedup S(p,c) for SYN switches
to compensate throughput degradation due to packet
segmentation. Axes: speedup factor (1 to 2.5) versus IP-PDU size
(40 to 1500 bytes); curves for c = 30, 40, 50 and 100 bytes.
In an ASY switch, for a fair comparison, we assume
that broute is still the total amount of control informa-
tion necessary to route the packet across the switching
fabric. Hence, when a packet of size p is transferred, the
total amount of information DASY transferred across the
switching fabric is simply
$$D_{ASY}(p) = p + b_{route} \qquad (2)$$
Obviously, D_SYN(p,c) > D_ASY(p), which implies a capacity
loss in transferring user data in the SYN switch with
respect to the ASY switch. We deﬁne the segmentation
speedup S as the acceleration factor a switching fabric
should provide to compensate the loss of throughput
due to segmentation in a SYN switch. S = 1 means
that ASY and SYN switches behave the same, whereas
larger values of S mean that SYN switches experience
a throughput reduction by a factor S. Formally,
$$S(p,c) = \frac{D_{SYN}(p,c)}{D_{ASY}(p)}$$
Thanks to (1) and (2), we have:
$$S(p,c) = \left\lceil \frac{p}{c} \right\rceil \frac{c + b_{route} + b_{reas}}{p + b_{route}} \qquad (3)$$
Fig. 2 shows S(p,c) for the following settings: breas =
broute = 1 byte, c ∈ {30,40,50,100} bytes, p ∈ [40,1500]
bytes. In general, finding the optimal value of c is not
immediate. In particular, optimizing c by considering only
the average packet size is far from being a good
approximation of the optimal value, because the packet
size distribution in the Internet is widely spread and thus the
probability of observing packets with the average packet
size is low.
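Equations (1) to (3) are straightforward to evaluate numerically. The sketch below is one possible transcription, using the 1-byte values of b_route and b_reas adopted for Fig. 2 as defaults; the function names are hypothetical and all sizes are assumed to be positive integers in bytes.

```python
def d_syn(p, c, b_route=1, b_reas=1):
    """Bytes crossing the fabric in a SYN switch for a packet of p bytes, Eq. (1)."""
    cells = -(-p // c)                 # integer ceiling of p / c
    return cells * (c + b_route + b_reas)


def d_asy(p, b_route=1):
    """Bytes crossing the fabric in an ASY switch for a packet of p bytes, Eq. (2)."""
    return p + b_route


def segmentation_speedup(p, c, b_route=1, b_reas=1):
    """Segmentation speedup S(p, c), Eq. (3)."""
    return d_syn(p, c, b_route, b_reas) / d_asy(p, b_route)


# A packet exactly filling one 40-byte cell pays only the reassembly overhead.
s_exact = segmentation_speedup(40, 40)
# A packet one byte larger than the cell needs a second, almost empty cell.
s_worst = segmentation_speedup(41, 40)
```

The second case illustrates the worst-case behavior discussed above: the packet that barely overflows one cell requires a speedup of 2.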
One way to choose c optimally is to minimize the
average speedup for a given distribution F(p) of the
packet sizes, by solving:
$$c^{\star} = \arg\min_{c} \sum_{p} D_{SYN}(p,c)\, F(p) \qquad (4)$$
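The minimization in (4) can be carried out by exhaustive search over candidate cell sizes. The sketch below assumes the packet size distribution is given as a dictionary mapping sizes (in bytes) to probabilities; the function names and the toy distribution are illustrative only and are not the FastWeb traces.

```python
def d_syn(p, c, b_route=1, b_reas=1):
    """Bytes transferred in a SYN switch for a packet of p bytes, Eq. (1)."""
    return -(-p // c) * (c + b_route + b_reas)


def average_d_syn(c, size_dist, b_route=1, b_reas=1):
    """Objective of Eq. (4): expected bytes per packet for cell size c."""
    return sum(d_syn(p, c, b_route, b_reas) * prob for p, prob in size_dist.items())


def optimal_cell_size(size_dist, candidates, b_route=1, b_reas=1):
    """Cell size minimizing Eq. (4) among the candidates (first one on ties)."""
    return min(candidates, key=lambda c: average_d_syn(c, size_dist, b_route, b_reas))


# Toy distribution (an assumption for illustration): all packets are 100 bytes.
toy_dist = {100: 1.0}
c_star = optimal_cell_size(toy_dist, range(1, 201))
```

With a single packet size the search returns a cell that holds the packet exactly; with a spread distribution the optimum is a compromise and has to be found numerically, as done in the text for the two traces.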
TABLE 1
Effect of segmentation for different cell size c

Traffic trace | Optimal cell size c⋆ [bytes] | Adopted cell size c [bytes] | Ave. segmentation speedup S
T1            | 113                          | 113                         | 1.035
T1            | 113                          | 50                          | 1.069
T2            | 50                           | 50                          | 1.066
T2            | 50                           | 113                         | 1.150

We numerically evaluated c⋆ for real traffic scenarios. We
considered two traffic traces captured on a high-speed
core router on the FastWeb network [4], [5]. The first traffic
trace (T1) refers to the traffic from the POP to the users
and comprises around 31 × 10^6 IP packets, whereas the
second one (T2) refers to the traffic from the users to the
POP and comprises around 7 × 10^6 IP packets. The two
traces are quite different since the ﬁrst consists mainly of
multicast IP-TV trafﬁc distributed to the users, whereas
the second of data and VoIP trafﬁc generated by the
users.
We evaluated the optimal cell sizes for both traces
and reported them in the second column of Table 1.
Unfortunately, the optimal value is very different for
the two traces and cannot be chosen a-priori during
the hardware design of the switch. Nevertheless, we
computed for both traces the segmentation speedup in
the case of the optimal value for the corresponding trace
and the optimal value for the other trace.
From these results it is clear that some throughput
degradation (3−7%) is experienced when the cell size is
optimal, but this becomes larger (up to 15%) when the
cell size is not optimally matched to the actual trafﬁc.
Hence, this performance degradation due to segmenta-
tion may not be negligible.
2.5 Methodology and trafﬁc models
We will later discuss the performance of the three switch-
ing architectures presented in Sec. 2.1. The theoretical
models will be validated by event-driven simulations
implemented through OMNeT++ [9]. Note that for SYN
switches we do not consider the throughput degradation
due to segmentation, i.e. we assume that no bandwidth
is wasted due to partial cell ﬁlling and to additional
overhead.
In our simulations, packets are generated at inputs
according to two states: during ON-state the input gener-
ates a single packet, whereas during OFF-state the input
is idle. Both ON and OFF periods are i.i.d.
Idle OFF-periods are geometrically distributed for the
SYN switch, and exponentially distributed for the ASY
switch, and their average is set so as to obtain the required
average input load ρ.
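As a rough illustration of how the OFF-period average can be tuned, the sketch below assumes that the input load is the fraction of time spent in the ON state, so that ρ = E[ON] / (E[ON] + E[OFF]). The function names are hypothetical, and the sampler covers only the exponential (ASY) case.

```python
import random


def mean_off_period(mean_on, rho):
    """Average OFF duration giving load rho, assuming rho = E[ON] / (E[ON] + E[OFF])."""
    if not 0 < rho <= 1:
        raise ValueError("rho must be in (0, 1]")
    return mean_on * (1 - rho) / rho


def sample_off_asy(mean_on, rho, rng=random):
    """Draw one exponentially distributed OFF period for the ASY switch."""
    mean_off = mean_off_period(mean_on, rho)
    if mean_off == 0:
        return 0.0                      # saturation: packets are back to back
    return rng.expovariate(1 / mean_off)


# With unit-mean packets and load 0.5, idle periods average one packet time.
half_load_off = mean_off_period(1.0, 0.5)
```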
Regarding ON-periods, let L be a random variable
corresponding to the packet length (i.e., ON-period),
measured in bits/packet. Let mL be the average packet
length E[L] = mL, and α be the variation coefﬁcient of L.
The packet length distribution for the ASY (SYN) switch
is exponential (geometric) for α = 1, hypo-exponential
i.e. gamma (hypo-geometric) for α < 1 and hyper-
exponential (hyper-geometric) for α > 1.
It is important to note that any real distribution of
packet size is always discrete, with a minimum byte
granularity. Using standard probability theory, it can be
shown [10]:
Property 1: Assume that the packet size is arbitrarily
distributed between a minimum lmin and a maximum
lmax; then the corresponding coefficient of variation α
is always bounded by
$$\alpha \le \frac{l_{max} - l_{min}}{2\sqrt{l_{min}\, l_{max}}} \qquad (5)$$
Note that Property 1 holds independently of the specific
distribution and makes it possible to bound any realistic α by
considering the minimum and the maximum transmission
unit (MTU) allowed in the network. In the case
of networks based on standard Ethernet, lmin = 64 and
lmax = 1518 bytes, thus α < 2.32. In the case of jumbo
frames adopted in Gigabit Ethernet, lmax = 9018 bytes
and α < 5.9.
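The bound in (5) depends only on the two extreme packet sizes and can be evaluated directly. The following sketch transcribes the formula; the function name `cv_bound` is hypothetical.

```python
import math


def cv_bound(l_min, l_max):
    """Upper bound of Eq. (5) on the coefficient of variation of the packet size."""
    if l_min <= 0 or l_max < l_min:
        raise ValueError("require 0 < l_min <= l_max")
    return (l_max - l_min) / (2 * math.sqrt(l_min * l_max))


# Standard Ethernet and jumbo-frame sizes quoted in the text, in bytes.
ethernet_bound = cv_bound(64, 1518)
jumbo_bound = cv_bound(64, 9018)
# Fixed-size packets have no variability at all.
fixed_bound = cv_bound(1500, 1500)
```

The bound grows with the ratio between the largest and the smallest packet, which is why jumbo frames allow a much larger worst-case α than standard Ethernet.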
Note that the bound provided in (5) corresponds to a
worst-case distribution that is very unlikely to happen
in reality. Indeed, realistic values of α are much smaller.
The analysis of the two FastWeb trafﬁc traces presented
in Sec. 2.4 gives α = 0.48 and α = 1.35. The value
0.48 was for trafﬁc comprising mainly IP-TV multicast
packets (with many packets of the same size), whereas
1.35 was for trafﬁc comprising mainly data and VoIP
packets. Both values are much smaller than 2.32 or 5.9.
Let λij be the packet arrival rate from input i to output
j measured in packets/s. The trafﬁc matrix is deﬁned as
Λ = [λij]. Let c be the link capacity, measured in bit/s.
The traffic is said to be admissible if neither an input nor
an output is overloaded:
$$\sum_{i=1}^{N} \lambda_{ij}\, m_L \le c \quad \forall j, \qquad \sum_{j=1}^{N} \lambda_{ij}\, m_L \le c \quad \forall i$$
We always consider admissible traffic in the following. The
trafﬁc is said to be uniform if λij = ρ/mL for any i,j.
The switch is said to be in saturation whenever ρ = 1.
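The admissibility condition amounts to checking every row and column sum of the traffic matrix against the link capacity. The sketch below assumes the matrix is a square list of lists of arrival rates; the function name and argument names are illustrative.

```python
def is_admissible(lam, mean_len, capacity):
    """Check that no input (row) and no output (column) is overloaded.

    lam:       N x N list of lists, lam[i][j] in packets/s
    mean_len:  average packet length m_L in bits/packet
    capacity:  link capacity c in bit/s
    """
    n = len(lam)
    inputs_ok = all(sum(lam[i][j] for j in range(n)) * mean_len <= capacity
                    for i in range(n))
    outputs_ok = all(sum(lam[i][j] for i in range(n)) * mean_len <= capacity
                     for j in range(n))
    return inputs_ok and outputs_ok


# A 2 x 2 example with unit packet length and unit capacity.
balanced = is_admissible([[0.5, 0.5], [0.5, 0.5]], 1, 1)
overloaded = is_admissible([[0.75, 0.5], [0.25, 0.25]], 1, 1)
```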
3 INPUT-QUEUED SWITCHES WITH SINGLE
FIFO QUEUE
We consider a switch with a single queue per input and
controlled by a scheduler performing random choices at
the outputs. In a SYN switch, each output chooses one
cell at random among the cells at the head of the queues
(referred to as head-of-line (HoL) cells) directed to it. It
is well known [3] that the maximum throughput, under
uniform traffic and Bernoulli i.i.d. arrivals, is limited
to 2 − √2 ≈ 58% due to the HoL blocking problem.
For correlated arrivals (bursts) of fixed-size cells, [11]
showed that the throughput varies between 0.5 and 0.58,
depending on the degree of burstiness in the traffic.
In an ASY switch, when an output finishes serving
a packet, the output scheduler chooses one packet at
random among the HoL packets directed to it; if no
packet is available, the output scheduler waits for the
first HoL packet directed to it. We prove that:
Theorem 1: Under uniform ON-OFF traffic, a single-FIFO
ASY switch achieves a maximum throughput T_ASY
equal to 0.5 for α = 1; for α ≠ 1
$$T_{ASY} = \frac{\sqrt{2\alpha^2 + 2} - 2}{\alpha^2 - 1} \qquad (6)$$
Proof: The maximum throughput can be estimated
in saturation conditions (ρ = 1) by considering the sys-
tem of virtual queues corresponding to the HoL packets,
waiting or being in service. Such virtual system is built
on N queues, one for each output, and N jobs, one
for each possible HoL packet. By construction, the size
of virtual queue j corresponds to the number of HoL
packets directed to output j. Whenever an input ends
the transmission across the switching fabric of a packet
directed to output j, virtual queue j finishes serving
a job. Since the switch is in saturation, a new packet,
behind the HoL packet just served, reaches the HoL, and
a new job arrives at the virtual queue corresponding to
its destination output. Note that the queueing network
of the virtual queues is closed, with N jobs, because
each service completion corresponds to a new arrival. In summary,
the arrival and departure events in the virtual system
correspond to ends of transmissions of the IQ switch.
There exists a bijective relation between any of the HoL
packets and the jobs; the service duration of a job in the
virtual system corresponds to the transmission time of
the corresponding packet.
Since the trafﬁc is uniform and the scheduler operates
randomly, we can consider a generic output. Let X be
the corresponding virtual queue size (i.e., the number
of HoL packets directed to such output). By deﬁnition
X ∈ [0,N] and E[X] = 1 because the total number of
HoL packets is N. The dynamics of X can be described
by the occupancy of a continuous time M/G/1 queue
in which the service time is equal to the packet length
L, which is a random variable. Since trafﬁc is uniformly
distributed among outputs, the arrivals at the queue are
given by the superposition of N independent and identi-
cally distributed renewal processes, each with rate λ/N.
Now, thanks to the superposition limit theorem [12],
for N → ∞, the arrival process becomes Poisson at
rate λ. Note that, very similarly, [3] showed that in a
SYN switch X follows the dynamics of a discrete time
M/D/1 queue where the number of jobs arriving during
a generic timeslot follows a Poisson distribution, given
that N → ∞.
Now we can exploit the known result for the M/G/1
queue:
$$E[X] = \rho + \frac{\lambda^2 E[L^2]}{2(1 - \rho)} = \rho + \frac{\rho^2 (1 + \alpha^2)}{2(1 - \rho)}$$
Fig. 3. Maximum throughput for single-FIFO ASY and
SYN switches under uniform traffic for a 100×100 switch.
As a reminder, in standard Ethernet networks, α < 2.32.
Axes: throughput (0.1 to 0.6) versus α (0.1 to 10); curves for
SYN-RND-CM, SYN-RND-PM, ASY-RND and ASY-Theoretical.
Since E[X] = 1, we obtain:
$$(\alpha^2 - 1)\rho^2 + 4\rho - 2 = 0 \qquad (7)$$
By solving (7), for α = 1, the maximum throughput is
ρ = 0.5. For α ≠ 1, we get (6).
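The closed form of Theorem 1 can be checked numerically against the quadratic (7) it was derived from. The sketch below is a direct transcription of (6) with the α = 1 case handled separately; the function names are hypothetical.

```python
import math


def t_asy(alpha):
    """Maximum throughput of a single-FIFO ASY switch, Eq. (6)."""
    if math.isclose(alpha, 1.0):
        return 0.5                       # Eq. (7) becomes linear: 4*rho - 2 = 0
    a2 = alpha * alpha
    return (math.sqrt(2 * a2 + 2) - 2) / (a2 - 1)


def residual_eq7(alpha, rho):
    """Left-hand side of Eq. (7); it vanishes at the maximum throughput."""
    return (alpha * alpha - 1) * rho * rho + 4 * rho - 2


# Fixed-size packets (alpha = 0) recover the classical 2 - sqrt(2) limit.
fixed_size_limit = t_asy(0.0)
# Exponentially distributed packet sizes (alpha = 1).
exponential_limit = t_asy(1.0)
```

Evaluating `t_asy` over a range of α values reproduces the decreasing theoretical curve of Fig. 3, since the only input of the formula is the coefficient of variation.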
3.1 Known results
The throughput of single-FIFO ASY switches was also
studied in [13], [14], in the case of Poisson or long-range-dependent
arrival processes, for exponential packet
lengths and under a generic trafﬁc matrix. All these re-
sults assume exponentially distributed packet sizes and
can be seen as a special case of Theorem 1 when α = 1,
which holds for a generic packet length distribution. No-
tably, [15] has extended the results in [3] to a combined
input-output queued switch with backpressure, running
asynchronously and fed by ﬁxed-size cells. For the par-
ticular case in which the output queues are not available,
the considered architecture degenerates into an IQ with
single FIFO queues. As a general result, [15] showed that
an ASY switch achieves the same throughput as a SYN
switch, with negligible differences in terms of delays.
This result is coherent with our findings, for the specific
case α = 0. The methodology adopted in [15] holds for a
generic packet size distribution (even if it has been applied
only to the fixed-size case). However, differently from
Theorem 1, [15] does not provide any insight about the
effect of α on the throughput.
3.2 Simulation results
Fig. 3 compares the maximum throughput for ASY and
SYN switches. In the case of ASY switches, we report
the results obtained by considering a random output
scheduler (ASY-RND) and the theoretical curve obtained
by (6), which appears to be very accurate. In the case
of SYN switches, we considered two random schedulers
(SYN-RND-CM, SYN-RND-PM) operating in CM and
PM respectively.
In the case α → 0, i.e. fixed packet sizes, the maximum
throughput for an ASY switch is 2 − √2 ≈ 58% as in
a SYN architecture, coherently with [15]. This is not
surprising, since even if the arrivals in an ASY switch
are time-continuous, the queueing effect tends to syn-
chronize the services among all the outputs and, after a
transient period, the system behaves like a SYN switch
in saturation. When α → ∞, the maximum throughput
goes to zero. This theoretical result shows that the
throughput degradation due to ASY mode can be very
large, as expected. However, this happens only when
α is very large: according to (6), the throughput is about
0.39 at α = 2 and about 0.36 at α = 2.32. Recalling from
Sec. 2.5 that in standard Ethernet networks α < 2.32, the
ASY throughput therefore always stays above 0.35.
The performance of the SYN switch in CM is almost
constant with α. On the other hand, ASY-RND and
SYN-RND-PM behave similarly, presenting the same
throughput degradation as α increases.
Finally, these results show that, depending on the
trafﬁc conditions, an ASY switch can perform better
or worse than a SYN switch; in a realistic case, the
throughput degradation due to the ASY behavior is
limited.
4 INPUT-QUEUED SWITCH WITH VOQ
We now consider an input-queued (IQ) switch with
Virtual Output Queueing (VOQ), i.e. one FIFO queue
for each input-output pair (see (c) in Fig. 1).
In a SYN switch, the scheduler transfers a non-
conﬂicting set of HoL cells by computing a matching
between the inputs and the outputs. Each VOQ is asso-
ciated with a weight equal to the number of enqueued
cells. The maximum weight matching (MWM) algorithm
chooses, among all possible matchings, the one with the
maximum weight.
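For very small switches, the MWM can be found by enumerating all N! complete matchings. The sketch below does exactly that and is meant only to make the definition concrete; it is not the algorithm used in practical schedulers, and the function name and the queue-length matrix representation are assumptions.

```python
from itertools import permutations


def max_weight_matching(queue_len):
    """Brute-force MWM for an N x N matrix of VOQ lengths (in cells).

    Returns (edges, weight), where edges lists the (input, output) pairs
    of the best matching whose VOQ is non-empty.
    """
    n = len(queue_len)
    best_perm, best_weight = None, -1
    for perm in permutations(range(n)):          # perm[i] = output matched to input i
        weight = sum(queue_len[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_perm, best_weight = perm, weight
    edges = [(i, best_perm[i]) for i in range(n) if queue_len[i][best_perm[i]] > 0]
    return edges, best_weight


# 2 x 2 example: serving the two diagonal VOQs carries weight 3 + 5 = 8.
edges, weight = max_weight_matching([[3, 1], [2, 5]])
```

The factorial cost of this enumeration is the reason why practical schedulers rely on polynomial-time matching algorithms or on iterative heuristics such as iSLIP.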
4.1 Known results
It is well known [16] that MWM is able to achieve 100%
throughput under any admissible Bernoulli i.i.d. trafﬁc.
This result has been notably extended to any admissible
trafﬁc process in which the cumulative number of cells
arrived follows the strong law of large numbers. This
means that MWM is optimal also when the trafﬁc is
correlated, as in the case of cell arrivals due to the
segmentation process.
Many extensions/variations of the MWM have been
proposed to achieve the maximum throughput in a
SYN switch operating in CM [17], [18]. In summary, [6]
showed that: i) the MWM operating in PM (PM-MWM)
achieves 100% throughput under Bernoulli i.i.d. packet
generation; ii) the delay performance of PM can be better
or worse than cell-based schedulers depending on the
variation coefﬁcient α of the packet size distribution
(this result is in contrast with the intuitive idea that
PM can only increase delays due to packet starvation);
iii) non-optimal PM schedulers behave very closely to
optimal schedulers (since fewer degrees of freedom in the
matching choice require fewer iterations). These results
were generalized in [19], where it was shown that, under
regenerative traffic, PM-MWM is throughput optimal.
Furthermore, [19] showed that, when the trafﬁc is
non-regenerative, PM-MWM may not be optimal from
the throughput point of view. Indeed, it is possible to
devise counterexamples in which the trafﬁc, even if
admissible, is such that one matching is kept forever,
preventing all the other queues from being served. These
counterexamples require a strong correlation among the
arrivals at different inputs and, even if not realistic,
show the limitations of PM schedulers. To deal with non-
regenerative trafﬁc, [19] proposes to freeze the matching
after a ﬁxed number sf of timeslots and wait until one of
the corresponding queues empties. After this event, the
scheduler starts to compute again the matching on the
whole set of queues. Of course this process introduces
some throughput loss; given the maximum packet size
lmax in timeslots, the period in which the matching
is kept frozen is lmax − 1 and the maximum bandwidth
loss is equal to (lmax − 1)/(sf + lmax − 1), which can be
made as small as any ε > 0 for sufficiently large sf. Indeed,
it is sufficient to set sf = ⌈(1 − ε)lmax/ε − 1/ε⌉ + 1 to
experience a bandwidth loss of at most ε, which can be compensated
by a switching speedup equal to (1 + ε) [19]. Finally,
[20] discusses in detail the asynchronous implementation
of the classical iSLIP [21] scheduling algorithm.
It highlights also some malicious trafﬁc patterns, non-
regenerative according to the deﬁnitions in [19], that may
cause starvation problems.
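The relation between the parameter sf and the bandwidth loss of the frozen-matching scheme of [19] can be verified with a few lines of code. The sketch below transcribes the two expressions given above; the function names are hypothetical, and floating-point rounding inside the ceiling may occasionally return a value of sf one unit larger than strictly necessary, which only makes the loss smaller.

```python
import math


def scheduling_interval(l_max, eps):
    """Value of s_f from the text that keeps the bandwidth loss within eps."""
    return math.ceil((1 - eps) * l_max / eps - 1 / eps) + 1


def bandwidth_loss(l_max, s_f):
    """Maximum bandwidth loss (l_max - 1) / (s_f + l_max - 1)."""
    return (l_max - 1) / (s_f + l_max - 1)


# Packets of at most 5 timeslots and a target loss of 25%.
s_f = scheduling_interval(5, 0.25)
loss = bandwidth_loss(5, s_f)
```

Smaller values of eps require proportionally larger sf, so the scheduler changes the matching less often relative to the time spent draining frozen packets.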
An input queued switch can be modeled as a special
case of a generic constrained queueing network with
asynchronous behavior. [22] has deﬁned the optimal
policy in terms of throughput in such generic networks
under Poisson arrivals and random packet size. For
the ASY switch, the optimal policy proposed in [22]
degenerates into computing the MWM at time t_n (when
the weight of the MWM is w_n) and keeping such
matching for a time equal to w_n^r, for some r < 1.
Then the policy waits until all the outputs end their
current packet transmissions, and then recomputes the
MWM, very similarly to what has been proposed in [19].
A wide generalization of the asynchronous queueing
scenario considered by [22] is given by stochastic processing
networks, for which an extension of the MWM (called the
maximum pressure policy) has been shown to be throughput
optimal [23].
4.2 Scheduling in an ASY switch
In an ASY switch, the scheduler has few degrees of
freedom in choosing the packets, similarly to PM sched-
ulers in SYN switches. Since packet arrivals are time-
continuous, all the scheduling choices are concentrated
at output ports. Each output operates asynchronously
and independently of all other outputs, allowing fully
distributed scheduling algorithms.
Fig. 4. At the end of the transmission of a packet from
i to j, the set Ω of all possible candidate edges is
shown dashed.
Whenever an output finishes transmitting a packet,
only two situations can
occur. Either there are other queued packets (at most
N) to choose from, or no packet is present. As a conse-
quence, the matching “slowly” evolves with the time. It
is possible to bound the maximum number of edges that
can change in the matching at any time, as follows:
Property 2: In an ASY switch with VOQ, at any time
the maximum number of newly added edges is two.
Proof: Assume that the transmission of a packet
from input i to output j ends at time t−. This event
happens asynchronously with respect to all the other
outputs, because of the continuous support of the packet
size distribution. We now partition the inputs and the
outputs based on their match condition: we define IU as
the set of unmatched inputs at time t and OU as the set
of unmatched outputs at time t; we have i ∈ IU and
j ∈ OU. Now the set of candidate edges to add to the
current matching is a subset of set Ωij deﬁned as
Ωij = {i → j} ∪ {k → j for any k ∈ IU}∪
{i → k for any k ∈ OU}
since the VOQs corresponding to the edges in Ωij may
be empty. Fig. 4 shows an example of such sets. Note
that the edges between unmatched inputs and outputs
(except for the edge i → j), i.e. the edges between any
input in IU \ {i} and any output in OU \ {j}, cannot
exist, otherwise they would have been already matched
just before t. Now the new matching computed by the
outputs on Ωij can include only 0, 1 or 2 edges.
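The bound in Property 2 can be checked by brute force on a small example. The following is a minimal sketch, not the authors' code: the port indices, the sets `I_U` and `O_U`, and all function names are hypothetical, and every VOQ in Ωij is assumed non-empty (the worst case for the bound).

```python
from itertools import combinations

def candidate_edges(i, j, unmatched_inputs, unmatched_outputs):
    """Build Omega_ij: edges that may join the matching once i -> j ends."""
    edges = {(i, j)}
    edges |= {(k, j) for k in unmatched_inputs}    # any free input towards output j
    edges |= {(i, k) for k in unmatched_outputs}   # input i towards any free output
    return edges

def is_matching(edges):
    """A matching uses each input and each output at most once."""
    inputs = [a for a, _ in edges]
    outputs = [b for _, b in edges]
    return len(set(inputs)) == len(inputs) and len(set(outputs)) == len(outputs)

def max_new_edges(omega):
    """Size of the largest matching contained in omega (exhaustive search)."""
    best = 0
    for size in range(1, len(omega) + 1):
        if any(is_matching(c) for c in combinations(sorted(omega), size)):
            best = size
    return best

# Hypothetical snapshot: the packet from input 0 to output 2 has just ended.
I_U = {0, 3, 5}   # unmatched inputs, including i = 0
O_U = {2, 4, 7}   # unmatched outputs, including j = 2
omega = candidate_edges(0, 2, I_U, O_U)
print(len(omega), max_new_edges(omega))
```

The exhaustive search is exponential in the size of Ωij and is only meant for toy instances; it simply confirms that no three edges of Ωij can coexist in a matching.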
However, even if the matching evolves “slowly”, we
will see in Sec. 4.4 that it is able to adapt very
quickly to the state of the queues and is optimal most
of the time.
4.3 Trafﬁc scenarios
The simulation study aims at comparing the perfor-
mance of scheduling algorithms for SYN switches and
ASY switches. In the case of SYN switches, we con-
sidered iSLIP [21] and MWM, running in cell-mode
(CM) and in packet-mode (PM). These algorithms are
denoted as SYN-iSLIP-CM, SYN-iSLIP-PM, SYN-MWM-
CM and SYN-MWM-PM. In the case of ASY switches,
we considered the following algorithms running at each
output: round-robin (ASY-RR), random (ASY-RND) and
longest queue ﬁrst (ASY-LQF). Note that ASY-LQF is
similar to SYN-MWM-PM.
The following three trafﬁc scenarios have been consid-
ered:
• Uniform (UNI) traffic: $\lambda_{ij} = \rho/N$ for all i, j; this is
the most classic traffic scenario in the literature.
• Bidiagonal (BID) traffic: $\lambda_{ii} = 2\rho/3$ and
$\lambda_{i,|i+1|_N} = \rho/3$ for any $0 \le i \le N-1$, where $|x|_N$
denotes x modulo N; this traffic is well known in the
literature for SYN switches, since it highlights
performance losses due to non-optimal scheduling algorithms.
• Logdiagonal (LOG) traffic: $\lambda_{ij} = 2^{|j-i|_N}/c$ for any
$0 \le i,j \le N-1$, where c is an appropriate normalization
constant; this traffic also highlights performance
losses due to non-optimal scheduling algorithms.
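The three rate matrices can be built and checked for admissibility (every input and every output loaded at ρ) with a short sketch. Two points here are assumptions rather than statements from the paper: the logdiagonal rate is read as an exponential in the cyclic distance, $2^{|j-i|_N}/c$, and the constant is taken as $c = (2^N - 1)/\rho$ so that each row sums to ρ. The function names are illustrative only.

```python
def uniform(n, rho):
    """UNI: every input spreads its load evenly over all outputs."""
    return [[rho / n for _ in range(n)] for _ in range(n)]

def bidiagonal(n, rho):
    """BID: two non-empty diagonals carrying 2/3 and 1/3 of the load."""
    lam = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lam[i][i] = 2 * rho / 3
        lam[i][(i + 1) % n] = rho / 3
    return lam

def logdiagonal(n, rho):
    """LOG: rates doubling with the cyclic distance (j - i) mod n.

    Assumption: c = (2**n - 1) / rho normalizes each row to rho.
    """
    c = (2 ** n - 1) / rho
    return [[2 ** ((j - i) % n) / c for j in range(n)] for i in range(n)]

def port_loads(lam):
    """Return (per-input load, per-output load) of a rate matrix."""
    rows = [sum(r) for r in lam]
    cols = [sum(col) for col in zip(*lam)]
    return rows, cols

N, RHO = 16, 0.99
for name, build in (("UNI", uniform), ("BID", bidiagonal), ("LOG", logdiagonal)):
    rows, cols = port_loads(build(N, RHO))
    print(name, round(max(rows), 6), round(max(cols), 6))
```

Because all three matrices are circulant, row sums and column sums coincide, so a load of ρ per input automatically means a load of ρ per output.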
Packet sizes were chosen according to the following
distributions:
• Fixed packet size: $L = l_1$, where $l_1$ is a constant.
For small packets, we chose $l_1 = 40$ bytes,
corresponding to the minimum IP packet size after
removing the MAC header. For large packets, we
chose $l_1 = 1500$ bytes, corresponding to the maximum
packet size seen on an Ethernet network.
• Trimodal packet size: $P\{L = l_i\} = p_i$ for i = 1, 2, 3,
where $\{l_i\}$ is the set of packet sizes and $\{p_i\}$ the
corresponding probabilities. We approximated the
distribution observed in the above-mentioned FastWeb
traces with the following parameters:
$\{l_i\} = \{40, 40 \times 12, 40 \times 32\}$ bytes and
$\{p_i\} = \{0.559, 0.200, 0.241\}$.
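As a quick sanity check on the trimodal parameters, the sketch below computes the mean packet size, the mean number of 40-byte cells per packet, and the ratio of standard deviation to mean. Treating that ratio as the variation coefficient is an assumption here (the usual textbook definition); the paper's own definition of α is given in an earlier section that is not reproduced in this excerpt.

```python
import math

CELL_BYTES = 40
sizes = [40, 40 * 12, 40 * 32]      # packet sizes in bytes
probs = [0.559, 0.200, 0.241]       # corresponding probabilities

mean = sum(p * l for p, l in zip(probs, sizes))
second_moment = sum(p * l * l for p, l in zip(probs, sizes))
std = math.sqrt(second_moment - mean ** 2)
cv = std / mean                     # assumed definition: std / mean

# Cells per packet after segmentation (integer ceiling division).
cells = [(l + CELL_BYTES - 1) // CELL_BYTES for l in sizes]
mean_cells = sum(p * c for p, c in zip(probs, cells))

print(f"mean size  = {mean:.2f} bytes")
print(f"mean cells = {mean_cells:.3f}")
print(f"std / mean = {cv:.3f}")
```

All three sizes are integer multiples of the 40-byte cell, so with these parameters segmentation introduces no padding in the last cell of a packet.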
We report the results for a 16 × 16 switch; for larger
switches we observed similar results. Port rate is set
equal to 10 Gbit/s. In the case of SYN switches, the
timeslot is equal to the minimum packet size, 40 bytes
(32 ns). The queue size is equal to 400,000 bytes. The in-
vestigated performance metrics are the average through-
put and the average packet delay, versus the offered load
in Gbps. Note that a load equal to 10 Gbps corresponds
to a fully loaded switch for which the average delay
is bounded by the ﬁnite queue size. Statistics were
obtained, after removing the transient period, with an
accuracy of 2% for a 95% conﬁdence interval.
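The numerical settings above are tied together by simple arithmetic, sketched here. The constant names are illustrative, and the drain time is only the time needed to empty one full queue at line rate, not a delay bound taken from the paper.

```python
PORT_RATE_BPS = 10e9      # 10 Gbit/s port rate
CELL_BYTES = 40           # timeslot carries one minimum-size packet
QUEUE_BYTES = 400_000     # queue size

timeslot_ns = CELL_BYTES * 8 / PORT_RATE_BPS * 1e9      # duration of one timeslot
queue_cells = QUEUE_BYTES // CELL_BYTES                 # queue capacity in cells
drain_us = QUEUE_BYTES * 8 / PORT_RATE_BPS * 1e6        # time to empty a full queue

print(f"timeslot   = {timeslot_ns:.1f} ns")
print(f"queue size = {queue_cells} cells")
print(f"drain time = {drain_us:.1f} us")
```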
4.4 Simulation results
Fig. 5 shows the average packet delay under uniform
trafﬁc and trimodal packet size distribution. All the
algorithms behave similarly, achieving the maximum
throughput. In SYN switches, CM shows slightly larger
delays due to the packet interleaving at each output,
as discussed in [6]. To highlight the effect of packet
interleaving, in Fig. 6 we show the delays obtained
with a traffic scenario with only large packets. In CM,
the queue-length metric adopted by MWM tends to
interleave packets more than the simple round robin of
iSLIP. Indeed, assuming equal-size packets, in the case
of round robin a packet can be interleaved with at most
2(N − 1) other packets, whereas for a longest-queue
policy this value is unbounded. For small packet sizes,
CM and PM schedulers behave similarly under uniform
traffic, because the packet interleaving is negligible
with respect to the packet delay.
Fig. 5. Average packet delay under uniform traffic for a
trimodal packet size distribution (average delay [µs] vs.
load [Gbps]; curves: SYN-iSLIP-PM, SYN-MWM-PM, SYN-iSLIP-CM,
SYN-MWM-CM, ASY-LQF, ASY-RR).
Fig. 6. Average packet delay under uniform traffic for
large packets (l1 = 1500 bytes) (average delay [µs] vs.
load [Gbps]; curves: SYN-iSLIP-PM, SYN-MWM-PM, SYN-iSLIP-CM,
SYN-MWM-CM, ASY-LQF).
Fig. 7 shows the performance achieved under bidiagonal
traffic and trimodal packet size distribution. This
traffic scenario is very difficult to schedule because of
the limited degrees of freedom in choosing the
matchings. It can be shown that, to achieve the maximum
throughput, the scheduler must cycle among only two
complete matchings M1 and M2, corresponding to the
two non-empty diagonals of the traffic matrix. Whenever
the scheduler chooses a matching different from M1
and M2, the matching size is smaller than N, and a
throughput loss is experienced. The greedy choices of all
algorithms, except for SYN-MWM-CM and SYN-MWM-PM,
lead to matchings that “mix” M1 with M2. For
this reason, this trafﬁc pattern is considered a challeng-
ing scenario to assess the performance of non-optimal
algorithms for SYN switches. According to Fig. 7, for
SYN switches, MWM achieves 100% throughput and
outperforms iSLIP, which achieves a throughput less
than 0.9 in CM and PM, as shown in Table 2. Note that
we omitted all the points for loads larger than 9 Gbps
due to the large packet losses. On the contrary, ASY-LQF
and ASY-RR are able to achieve 100% throughput, albeit
at the cost of large delays due to temporary starvation,
and thus outperform the heuristic scheduling algorithms
in SYN switches. Similar performance is observed when
packets have a fixed size. For example, Fig. 8 shows the
delays for large packets.
Fig. 7. Average packet delay under bidiagonal traffic for
the trimodal packet size distribution with N = 16 (average
delay [µs] vs. load [Gbps]; curves: SYN-iSLIP-PM, SYN-MWM-PM,
SYN-iSLIP-CM, SYN-MWM-CM, ASY-LQF, ASY-RR).
Fig. 8. Average packet delay under bidiagonal traffic for
large packets (l1 = 1500 bytes) (average delay [µs] vs.
load [Gbps]; curves: SYN-iSLIP-PM, SYN-MWM-PM, SYN-iSLIP-CM,
SYN-MWM-CM, ASY-LQF).
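The throughput loss caused by “mixing” the two diagonals can be reproduced on the bidiagonal graph with a minimal sketch. The greedy routine below is a generic maximal-matching heuristic used only for illustration, not iSLIP or any scheduler from the paper, and the edge order is a hypothetical one chosen to force a mix of M1 and M2.

```python
N = 16
M1 = {(i, i) for i in range(N)}              # main diagonal
M2 = {(i, (i + 1) % N) for i in range(N)}    # shifted diagonal

def greedy_maximal(edges_in_order):
    """Scan edges in the given order, keeping each edge whose ports are free."""
    used_in, used_out, matching = set(), set(), []
    for i, j in edges_in_order:
        if i not in used_in and j not in used_out:
            matching.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return matching

# Pure choice: only edges of M1 are considered, giving a complete matching.
pure = greedy_maximal(sorted(M1))

# Mixed choice: one edge of M1 followed by one edge of M2 blocks output 1,
# since both inputs that can reach it (0 and 1) are already taken.
mixed = greedy_maximal([(0, 0), (1, 2)] + sorted(M1) + sorted(M2))

print(len(pure), len(mixed))
print("from M1:", sum(e in M1 for e in mixed), "from M2:", sum(e in M2 for e in mixed))
```

In this example the mixed matching is maximal (no edge can be added) yet leaves one input and one output idle, which is the kind of loss the text attributes to greedy schedulers.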
The good performance of ASY-LQF and ASY-RR may
be unexpected. Fig. 9 shows the total size of the
matching, and the number of edges belonging to M1 and M2.
We know that to achieve 100% throughput the matching
size should be N, and this can happen only when the
number of edges belonging either to M1 or to M2 is
also N. This must always occur, except for a negligible
time during which the matching can be a hybrid
between M1 and M2. As shown in Fig. 9, most of the
time the switch is configured according to one of the two
optimal maximum matchings, and transitions between
matchings are fast, even if limited by Property 2. Indeed,
under bidiagonal traffic, the output has a very limited
degree of freedom in choosing the input to be matched.
In ASY switches the output scheduler serves a queue
exhaustively, and it changes its input only when another
output becomes free. This induces a “chain” reaction
among the ports.
Fig. 9. Total matching size and number of edges belonging
to M1 and M2 for an ASY switch under bidiagonal
traffic and trimodal packet size distribution (matching
size vs. simulation time [µs]; curves: Total, M1, M2).
TABLE 2
Maximum throughput achieved by SYN and ASY
switches with greedy schedulers for normalized input
load ρ = 0.99
System               SYN        SYN        ASY    ASY
Scheduler            iSLIP-CM   iSLIP-PM   LQF    RR
Uniform traffic      0.99       0.99       0.99   0.99
Bidiagonal traffic   0.88       0.89       0.99   0.99
Logdiagonal traffic  0.84       0.87       0.99   0.98
Finally, Fig. 10 shows the delays under logdiagonal
traffic. The qualitative behavior is the same as under
bidiagonal traffic: iSLIP still achieves only 0.84 and
0.87 throughput in CM and PM respectively, as shown in
Table 2, whereas ASY-LQF and ASY-RR achieve almost
the maximum throughput.
Fig. 10. Average packet delay under logdiagonal traffic
for the trimodal packet size distribution (average delay
[µs] vs. load [Gbps]; curves: SYN-iSLIP-PM, SYN-MWM-PM,
SYN-iSLIP-CM, SYN-MWM-CM, ASY-LQF, ASY-RR).
Fig. 11. Maximum throughput under bidiagonal traffic vs
packet length variation coefficient, with N = 16
(throughput vs. α; curves: SYN-RND-CM, SYN-RND-PM, ASY-RND,
SYN-iSLIP-CM, SYN-iSLIP-PM, ASY-RR).
Fig. 12. Maximum throughput vs. the number of ports N
for an ASY VOQ switch under bidiagonal traffic and α = 5
(curves: ASY-RND, ASY-RR, ASY-LQF).
Fig. 11 investigates the effect of α under bidiagonal
trafﬁc. We considered the RND and RR schedulers in
ASY switches and the corresponding schedulers in SYN
switches, in both CM and PM versions. ASY switches
always outperform SYN switches, for any α ≤ 5. It
is also worth noting that ASY, for fixed-size packets
(α = 0), achieves almost the maximum throughput.
The small throughput loss is, in any case, smaller than
the average 10% loss due to packet segmentation (see
Sect. 2.2). Looking at the detailed behavior of the
different algorithms, the throughput decreases for
larger α, consistently with Fig. 3, which was obtained
with a single FIFO per input.
Finally, in Fig. 12 we investigate the behavior of ASY
switches as a function of their number of ports N, in the
case of a very large packet-length variance (α = 5).
This value is more than twice the maximum
observable in standard Ethernet networks (see the
discussion of (5)) and was chosen to highlight extreme
starvation problems. The throughput reduction is still
limited (less than 5-8%) and is more relevant for larger
switches, since the temporary starvation due to long
packets can delay the transfer of packets from the
other N − 1 ports.
5 BUFFERLESS SWITCHES
We focus now on bufferless architectures. Although they
are not a practical solution, especially in the electronic
domain, due to their high loss probability, there are two
main reasons why they are considered in this paper.
First, since optical memories are costly, complex and
power hungry, bufferless switches may be considered
an interesting alternative in the case of packet- or
burst-switched optical networks, especially if exploiting
contention resolution in the wavelength domain [24].
Second, they allow us to analyze switch performance
under a traffic profile which is not distorted by the
queueing stage introduced before the switching fabric.
In bufferless switches fed by variable-length packets,
the scheduling policy is very simple: each output waits
for the ﬁrst cell of a packet and then serves the whole
packet exhaustively, according to the PM policy. Note
that packet interleaving cannot be allowed, because of
the lack of input queues.
5.1 Synchronous switch
The throughput obtained in a bufferless SYN switch un-
der uniform Bernoulli i.i.d. arrivals is around 58% [25].
We show here that a similar result holds also for a SYN
switch operating in PM, under uniform and variable-
length packets:
Property 3: Under uniform ON-OFF arrivals, a bufferless
SYN-PM switch achieves a throughput in saturation
$T_{SYN}$ such that $0.5 \le T_{SYN} \le 1 - e^{-1} \approx 0.63$ for $N \to \infty$.
Proof: Thanks to the memory-less geometric packet
length distribution, the average load ρ corresponds also
to the probability of observing a cell of a packet in
a generic timeslot. Now, following the same reasoning
of [25], we can deﬁne X as the number of cells arrived
in a generic timeslot and directed to a speciﬁc output.
Then, X follows a binomial distribution with parameters
(N,ρ/N) and the average throughput, in terms of cells,
is
$$P(X \ge 1) = 1 - P(X = 0) = 1 - \left(1 - \frac{\rho}{N}\right)^{N} \qquad (8)$$
which converges to $1 - e^{-1} \approx 63\%$ for ρ = 1 and N → ∞.
The PM scheduler transfers all the cells of a complete
packet in subsequent timeslots. After completing the
transmission of a packet, the output must wait for the
ﬁrst cell of a new packet and discard the incoming cells
of incomplete packets. Hence, (8) provides an upper
bound on the actual throughput, which can be evaluated
only in terms of cells belonging to complete packets. Fur-
thermore, for any ﬁxed N and large enough packet sizes,
the probability that packets start and end at the same
timeslot goes to zero. Hence, the system behavior tends
to the ASY one, for which we will prove (Property 4) that
the throughput is 0.5 under uniform ON-OFF arrivals.
Note that both bounds in the Property are tight: the
lower bound is achieved for average packet sizes very
large with respect to the cell size, whereas the upper
bound is achieved for average packet sizes equal to the
cell size (i.e. one packet lasts one cell).
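The upper bound given by (8) can be evaluated numerically for finite N. The sketch below simply evaluates the closed-form expression; the function name is illustrative.

```python
import math

def cell_throughput(n, rho=1.0):
    """Right-hand side of (8): probability that at least one of the n inputs
    has a cell for a given output, each with probability rho / n."""
    return 1.0 - (1.0 - rho / n) ** n

limit = 1.0 - math.exp(-1.0)
for n in (2, 4, 16, 64, 1024):
    print(f"N = {n:5d}  bound = {cell_throughput(n):.4f}")
print(f"N -> inf    bound = {limit:.4f}")
```

The value decreases monotonically with N towards $1 - e^{-1}$, so small switches have a looser (higher) cell-level bound than the asymptotic 63%.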
5.2 Asynchronous switch
In an ASY switch, when a packet arrives and its
destination output is busy transferring another packet,
the packet is lost; otherwise the destination output
becomes busy for a duration equal to the packet
transmission time.
Referring to the trafﬁc model deﬁned in Sec. 2.5, we
prove:
Property 4: Under uniform ON-OFF arrivals, a bufferless
ASY-PM switch achieves a throughput in saturation,
for $N \to \infty$, equal to $T_{ASY} = 1/2$.
Proof: The instants corresponding to any new packet
arrival define a renewal process at each input. We focus
on the renewal process of all the packet arrivals destined
to a given output y. Since the traffic is admissible, in
saturation the renewal process at each input has rate
equal to $(N m_L)^{-1}$. Now, thanks to the superposition
limit theorem [12], for N → ∞, the aggregate arrival
process seen by output y is Poisson at rate $m_L^{-1}$, i.e. with
interarrival times given by an exponential distribution
with average $m_L$, independently of the distribution of
L. Thanks to the memory-less property, when an input
finishes transmitting a successful packet to output y, the
output has to wait on average $m_L$ before a new packet
becomes available for y at any input. This implies that,
on average, $m_L$ is spent to transmit a successful packet
and $m_L$ is also spent to wait for a new packet. Thus,
the throughput in saturation is 0.5.
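The argument of the proof can be illustrated with a small Monte Carlo sketch of a single output in the N → ∞ limit: packets arrive as a Poisson process of rate $1/m_L$, a packet finding the output busy is lost, and an accepted packet occupies the output for its whole duration. This is an illustrative model written for this note, not the simulator used in the paper; the function name, the seed and the two length distributions are assumptions.

```python
import random

def simulate_output(mean_len, draw_length, n_arrivals=200_000, seed=1):
    """Fraction of time a bufferless output is busy under Poisson arrivals
    of rate 1 / mean_len, with blocked packets discarded."""
    rng = random.Random(seed)
    now = 0.0          # time of the current arrival
    busy_until = 0.0   # time at which the output becomes free
    carried = 0.0      # total time spent transmitting accepted packets
    for _ in range(n_arrivals):
        now += rng.expovariate(1.0 / mean_len)   # exponential interarrival time
        if now >= busy_until:                    # output idle: accept the packet
            length = draw_length(rng)
            busy_until = now + length
            carried += length
        # otherwise the packet is lost and nothing changes
    return carried / max(now, busy_until)

M_L = 1.0
exp_lengths = simulate_output(M_L, lambda rng: rng.expovariate(1.0 / M_L))
fixed_lengths = simulate_output(M_L, lambda rng: M_L)
print(f"exponential lengths: {exp_lengths:.3f}")
print(f"fixed lengths:       {fixed_lengths:.3f}")
```

Each busy period of average length $m_L$ is followed by an idle period of average length $m_L$, so both runs should settle near 0.5 regardless of the length distribution, in line with the insensitivity claimed after the proof.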
This approximate analysis holds for any packet size
distribution with finite average, and it is quite accurate
also for small values of N, as can be seen in Fig. 13,
which shows the performance obtained by simulating
the chain for different values of N. These results have
been observed to be independent of the (finite) variance
of the packet duration.
By comparing Property 3 with Property 4, it is clear
that a small throughput degradation is experienced by a
bufferless ASY switch with respect to a SYN switch.
6 CONCLUSIONS
We compared the performance of SYN and ASY switches
for variable-size packet arrivals, considering (i) buffer-
less switches, (ii) IQ switches with a single FIFO queue
per input and (iii) IQ switches with VOQs. We show
that ASY switches experience either a small performance
degradation or better performance with respect to SYN
switches, even without considering the segmentation
overhead in SYN switches. We also show that one of the
key traffic parameters affecting the performance is the
variation coefficient of the packet size, which is usually
small in realistic scenarios. Because of the low
complexity of ASY schedulers, we believe that exploiting
an asynchronous architecture leads to more efficient
switching architectures.
Fig. 13. Throughput obtained in a bufferless ASY switch
with ON-OFF packet arrivals (throughput vs. N).
REFERENCES
[1] S. Yoo, “Energy efﬁciency in the future internet: The role of optical
packet switching and optical-label switching,” Selected Topics in
Quantum Electronics, IEEE Journal of, vol. 17, no. 2, pp. 406 –418,
march-april 2011.
[2] A. Bianco, D. Cuda, P. Giaccone, and F. Neri, “Asynchronous vs
synchronous input-queued switches,” in IEEE GLOBECOM 2010,
Dec. 2010, pp. 1 –5.
[3] M. Karol, M. Hluchyj, and S. Morgan, “Input versus output
queueing on a space-division packet switch,” Communications,
IEEE Transactions on, vol. 35, no. 12, pp. 1347–1356, Dec 1987.
[4] R. Birke, M. Mellia, M. Petracca, and D. Rossi, “Understanding
voip from backbone measurements,” in INFOCOM, 2007, pp.
2027–2035.
[5] K. Imran, M. Mellia, and M. Meo, “Measurements of multicast
television over ip,” in LANMAN, 2007, pp. 2027–2035.
[6] M. Ajmone Marsan, A. Bianco, P. Giaccone, E. Leonardi, and
F. Neri, “Packet-mode scheduling in input-queued cell-based
switches,” Networking, IEEE/ACM Transactions on, vol. 10, no. 5,
pp. 666–678, 2002.
[7] S. J. B. Yoo, “Optical packet and burst switching technologies
for the future photonic internet,” Lightwave Technology, Journal of,
vol. 24, no. 12, pp. 4468 –4492, dec. 2006.
[8] P. Hansen, S. Danielsen, and K. Stubkjaer, “Optical packet switch-
ing without packet alignment,” in Optical Communication, 1998.
24th European Conference on, vol. 1, sep 1998, pp. 591 –592 vol.1.
[9] “Omnet++ community site,” http://www.omnetpp.org/.
[10] J. J. A. Moors and J. Muilwijk, “An inequality for the variance of a
discrete random variable,” Sankhyā: The Indian Journal of Statistics,
vol. 33, no. 3/4, pp. 385–388, Dec. 1971.
[11] S.-Q. Li, “Performance of a nonblocking space-division packet
switch with correlated input trafﬁc,” Communications, IEEE Trans-
actions on, vol. 4, no. 1, Jan 1992.
[12] K. Sriram and W. Whitt, “Characterizing superposition arrival
processes in packet multiplexers for voice and data,” Selected Areas
in Communications, IEEE Journal on, vol. 4, no. 6, Sep 1986.
[13] S. Fuhrmann, “Performance of a packet switch with crossbar
architecture,” Communications, IEEE Transactions on, vol. 41, no. 3,
pp. 486–491, Mar 1993.
[14] D. Manjunath and B. Sikdar, “Variable length packet switches:
Delay analysis of crossbar switches under poisson and self similar
trafﬁc,” in INFOCOM, 2000, pp. 1055–1064.
[15] I. Iliadis, “Synchronous versus asynchronous operation of a
packet switch with combined input and output queueing,”
Performance Evaluation, vol. 16, no. 1-3, pp. 241–250, 1992.
[16] N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand,
“Achieving 100% throughput in an input-queued switch,” Com-
munications, IEEE Transactions on, vol. 47, no. 8, pp. 1260–1267,
Aug 1999.
[17] L. Tassiulas, “Linear complexity algorithms for maximum
throughput in radio networks and input queued switches,” in INFOCOM,
1998, pp. 533–539.
[18] P. Giaccone, B. Prabhakar, and D. Shah, “Towards simple, high-
performance schedulers for high-aggregate bandwidth switches,”
in INFOCOM, 2002.
[19] Y. Ganjali, A. Keshavarzian, and D. Shah, “Cell switching ver-
sus packet switching in input-queued switches,” Networking,
IEEE/ACM Transactions on, vol. 13, no. 4, pp. 782–789, 2005.
[20] G. Passas and M. Katevenis, “Asynchronous operation of buffer-
less crossbars,” in HPSR, June 2007, pp. 1–6.
[21] N. McKeown, “The islip scheduling algorithm for input-queued
switches,” Networking, IEEE/ACM Transactions on, vol. 7, no. 2, pp.
188–201, 1999.
[22] L. Tassiulas and P. Bhattacharya, “Allocation of interdependent
resources for maximal throughput,” Stochastic Models, vol. 16,
no. 1, pp. 27–48, 2000.
[23] J. Dai and W. Lin, “Maximum pressure policies in stochastic
processing networks,” Operations Research, vol. 53, no. 2, pp. 197–
218, 2005.
[24] A. Stavdas, A. Bianco, A. Pattavina, C. Raffaelli, C. Matrakidis,
C. Piglione, C. Politi, M. Savi, and R. Zanzottera, “Performance
evaluation of large capacity broadcast-and-select optical crosscon-
nects,” Optical Switching and Networking, vol. 9, no. 1, pp. 13–24,
2012.
[25] J. H. Patel, “Performance of processor-memory interconnections
for multiprocessors,” Computers, IEEE Transactions on, vol. 30,
no. 10, 1981.