Abstract: Optical buffering is fundamental to contention resolution in optical networks. Existing work in this area mainly focuses on emulating dedicated input/output buffer queues by using switched fiber delay lines (SDL). Notably, a shared buffer queue, in which a common buffer pool is shared by all the input/output ports of a switch, has the potential to significantly reduce the overall buffer capacity requirement. As far as we know, however, no work is yet available on the exact emulation of a shared optical buffer queue with SDL.
I. INTRODUCTION
ALL-OPTICAL switching is attractive since it can eliminate the expensive optical-electronic-optical conversions and help make good use of the enormous bandwidth of optical networks. Time-sliced (synchronous) optical switching is a simple yet cost-effective technology for implementing all-optical packet switching [1]-[5], where optical buffers are required to resolve packet contention. Since optical RAM is not yet available, the optical fiber delay line (FDL) is usually adopted to implement the buffering function. Unlike traditional electronic memories with random access, a packet entering an FDL must propagate for a fixed amount of time and cannot be retrieved any earlier.
As such, designing FDL-based optical buffers with the same throughput and delay performance as electronic ones remains a challenging issue.
Basically, we have three possibilities for packet buffering in a switch, namely input buffer queuing (buffering packets at the input side), output buffer queuing (buffering packets at the output side) and shared buffer queuing (buffering packets internally) [6] , [7] . Early works on the construction of optical buffer with switches and fiber delay lines (SDL) mainly focus on the feasibility study of such constructions, see, for example, the shared-memory optical ATM switch by Karol [8] , CORD (contention resolution by delay lines) by Chlamtac et al. [3] , COD (cascaded optical delay line) by Cruz et al. [2] and SLOB (switch with large buffer) by Hunter et al. [4] . Recently, C. S. Chang et al. demonstrated that it is possible for us to exactly emulate various optical buffer queues with SDL [9] - [15] . These works have been successful in implementing the optical counterparts of input buffer queue and output buffer queue, and some typical implementations among them are as follows.
• 1-to-1 FIFO queue. Via a concatenation of 2 × 2 feedback switch elements, an interesting construction of a 1-to-1 FIFO queue was suggested in [10]. Such an optical FIFO queue can achieve a buffer size of $B = 2^n - 1$ by using $2n$ switch elements and a total fiber length of $3 \cdot 2^{n-1} - 2$.
• 2-to-1 FIFO multiplexer. A multiplexer works like a 'hopper', i.e., it always has a departing packet whenever it is nonempty. It has been proved in [14] that an $(M + 2) \times (M + 2)$ feedback switch is able to emulate a 2-to-1 FIFO multiplexer of size $O(2^M)$.
• 1-to-1 priority queue. In a priority queue, the packet with the highest priority is always sent to the output link when a departure request comes, while the packet with the lowest priority is dropped whenever buffer overflow happens. Recent research indicates that an $(M + 1) \times (M + 1)$ feedback switch can be used to emulate a 1-to-1 priority queue of size $O(M^3)$ [13].
Shared buffer queuing, which consists of a common memory shared by all the inputs and outputs, is another attractive structure for implementing electronic ATM switches and IP routers [7], [16]. In comparison with input/output buffer queuing built on SDL, the corresponding shared buffer queuing structure has the potential to significantly reduce both the buffer capacity requirement and the switch size (in an SDL-based buffer queue, each FDL requires one dedicated switch port). However, no work is yet available on how to use SDL to exactly emulate a shared buffer queue, which is the focus of this paper. Our main finding is that, by applying a slightly modified switching strategy to the feedback switch system proposed in [12], such a system can actually work as a 1-to-2 optical shared buffer queue. This result is further extended to the $N$-to-2 case with $N$ inputs. The work of this paper lays the foundation for the general design of shared buffer queues, and the 1-to-2 (and also $N$-to-2) modules examined here have the potential to serve as basic building blocks for the future construction of large-scale shared buffer structures.
The remainder of this paper is organized as follows: Section II presents the definition of a 1-to-2 shared buffer queue, and Section III provides a construction for it via SDL. In Section IV, a lower bound on the buffer size of our construction is derived. Finally, this construction is further extended to the more general $N$-to-2 case in Section V.
II. PRELIMINARIES
In this section, we first formally define a 1-to-2 shared buffer queue and then introduce a trivial construction of it.
A. 1-to-2 shared buffer queue
To simplify the design and operation of the optical buffer queue, we assume that time is sliced and synchronized, i.e., the boundaries of arriving packets are aligned with their corresponding time slots. Synchronization requires adjusting the packet arrival times, which can be accomplished by using a packet recognizer and a set of delay lines [17]-[19]. Without loss of generality, we further assume that the packet size is fixed, that a packet can be transmitted within one time slot, and that the length of a delay line is equal to an integer number of time slots.
Based on these assumptions, a 1-to-2 shared buffer queue can be formally defined as follows.
Definition 1 (1-to-2 shared buffer queue): A 1-to-2 shared buffer queue is a network element that has one input link, two control inputs, two output links for departing packets, and one output link for packets lost due to buffer overflow (as illustrated in Fig. 1). For time $t$ and output link $i$, $i \in \{0, 1\}$, we introduce the following notation:
$a_i(t)$: the set of arriving packets destined for output link $i$ at time $t$.
A 1-to-2 shared buffer queue satisfies the following four properties.
(P1) Flow conservation: an arriving packet from the input link is either stored in the buffer or transmitted through one output link.
(P2) Non-idling: if an output is enabled, there is always a departing packet from this output whenever at least one packet destined for it is in the buffer or at the input.
(P3) Maximum buffer usage: an arriving packet is lost only when the buffer is full and the output this packet is destined for is disabled.
(P4) FIFO: packets destined for the same output link depart in First-In-First-Out (FIFO) order.
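The four properties can be read as an executable specification. The following is a minimal sketch of an idealized electronic reference model, not the optical construction itself; the class name `IdealSharedBufferQueue`, its method names and the `capacity` parameter are assumptions introduced here for illustration only.

```python
from collections import deque


class IdealSharedBufferQueue:
    """Idealized 1-to-2 shared buffer queue (hypothetical reference model)."""

    def __init__(self, capacity):
        self.capacity = capacity          # shared buffer size
        self.queues = (deque(), deque())  # one FIFO per output link (P4)

    def occupancy(self):
        return len(self.queues[0]) + len(self.queues[1])

    def step(self, arrival, control):
        """Advance one time slot.

        arrival: None or (output_link, packet).
        control: (c0, c1), where 1 enables the corresponding output link.
        Returns (departures, lost): departures[i] is the packet leaving on
        output link i (or None); lost is the dropped arrival (or None).
        """
        if arrival is not None:
            link, packet = arrival
            self.queues[link].append(packet)  # tentatively accept (P1)
        departures = [None, None]
        for i in (0, 1):
            # (P2) an enabled output always sends its oldest packet
            if control[i] and self.queues[i]:
                departures[i] = self.queues[i].popleft()
        lost = None
        if self.occupancy() > self.capacity:
            # (P3) overflow: only the newest arrival can be the excess packet
            lost = self.queues[arrival[0]].pop()
        return departures, lost


q = IdealSharedBufferQueue(capacity=2)
q.step((0, "a1"), (0, 0))
q.step((1, "b1"), (0, 0))
print(q.step((0, "a2"), (0, 0)))  # buffer full, output 0 disabled
print(q.step(None, (1, 1)))       # both outputs enabled
```

In this model a packet is dropped only if, after all enabled departures, the shared pool would exceed its capacity, which implies that the buffer was full and the packet's own output was disabled.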
B. A simple construction of 1-to-2 shared buffer queue
As we know, a queue with buffer size $B = 1$ can be constructed by using a 2 × 2 feedback switch element with one fiber delay line of length 1 [10]. Therefore, a 1-to-2 optical shared buffer queue with buffer size $B$ can be exactly emulated by using $B$ parallel feedback switch elements sandwiched between a $1 \times B$ crossbar and a $B \times 2$ crossbar (see Fig. 2). Newly arrived packets can be sent to any idle feedback switch element through the $1 \times B$ crossbar, and buffered packets can be read out from the feedback switch elements and sent to the corresponding output links through the $B \times 2$ crossbar. If all the feedback switches are occupied and there are no departure requests, any newly arrived packet has to be dropped. Since the switch size is almost the same as the buffer size and only delay lines of length 1 are adopted here, this structure has a high hardware complexity and also involves a large number of packet circulations (see footnote 3).
III. AN EFFECTIVE CONSTRUCTION OF SHARED BUFFER QUEUE
In this section, we introduce a more efficient construction of shared buffer queue by exploring the feedback switch architecture and a new packet switching strategy for it.
A. Architecture and switching algorithm
The feedback switch structure has been widely explored for the efficient construction of optical priority queues [11], [12]. Here, we extend the symmetric feedback switch considered in [12] to the design of a 1-to-2 shared buffer queue. One such structure is illustrated in Fig. 3. It consists of an $(M + 2) \times (M + 2)$ switch fabric, one input port, one loss port, two output ports (Output 0 and Output 1), two control inputs ($c_0$ and $c_1$) and $M$ delay lines connecting outputs of the switch fabric back to its inputs.
To schedule packets properly inside the switch, all packets are classified into two flows (flow 0 and flow 1) based on their destined output ports. The packets of one flow are assigned priorities according to their arrival times, such that they depart in FIFO order. To ensure that the oldest packet of each flow is always reachable in each time slot, the packet scheduling should satisfy the following rule:
(R1) A packet with priority $k$ can never be switched to a delay line with length longer than $k$.
By applying rule (R1) to the switch system, in each time slot the set of packets available at the inputs of the switch contains the highest priority packets of both flows. These packets are further sorted by two dedicated sorters (one for each flow) according to the order of their priorities. After sorting, the highest priority packet of a flow is sent out directly if the corresponding output port is enabled, and the remaining packets of this flow are sent to consecutive fiber delay lines in sequence, following the descending order of their priorities. On the other hand, if the output port is disabled, all the sorted packets of this flow need to be rebuffered and are sent to fiber delay lines in the same way. To share the common buffer properly between the two flows, we assign their packets to the fiber delay lines in a back-to-back manner, as illustrated in Fig. 3. More formally, the switching algorithm adopted here can be summarized as follows:
(Footnote 3: Packet circulation through the switch fabric causes significant attenuation of the optical signal [5], [20].)
Switching algorithm (S1), executed in each time slot $t$:
Classify all the packets appearing at the $M + 1$ inputs of the switch into two sets according to their output ports.
Denote by $S_0$ the set of packets belonging to flow 0.
Denote by $S_1$ the set of packets belonging to flow 1.
Sort the packets in $S_0$ based on their priorities.
Sort the packets in $S_1$ based on their priorities.
if $c_0(t) == 1$ then
    Remove the highest priority packet from $S_0$ and send it to Output 0.
    Decrease the priority of all packets belonging to flow 0 in the system by 1.
end if
if $c_1(t) == 1$ then
    Remove the highest priority packet from $S_1$ and send it to Output 1.
    Decrease the priority of all packets belonging to flow 1 in the system by 1.
end if
Send the packets in $S_0$ to FDL 1, FDL 2, . . . following the descending order of their priorities.
Send the packets in $S_1$ to FDL $M$, FDL $M - 1$, . . . following the descending order of their priorities.
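The switching step can be sketched for a single time slot as follows. This is an illustrative sketch rather than the authors' implementation: the function name `switch_one_slot` and the packet representation `(flow, arrival_order, name)` are assumptions, and arrival order within a flow stands in for the priority (an earlier arrival has a higher priority), so the explicit priority update is not needed.

```python
def switch_one_slot(packets, control, M):
    """Apply one slot of the switching algorithm to the packets at the inputs.

    packets: list of (flow, arrival_order, name) tuples, flow in {0, 1};
             a smaller arrival_order means a higher priority within a flow.
    control: (c0, c1) control inputs for this slot.
    M:       number of fiber delay lines.
    Returns (departures, fdl): departures[f] is the packet sent to Output f
    (or None); fdl[i] is the packet sent to FDL i + 1 (or None).
    """
    # Classify by flow and sort each flow by priority.
    flows = [sorted((p for p in packets if p[0] == f), key=lambda p: p[1])
             for f in (0, 1)]
    departures = [None, None]
    for f in (0, 1):
        if control[f] and flows[f]:
            departures[f] = flows[f].pop(0)  # highest priority packet departs
    if len(flows[0]) + len(flows[1]) > M:
        raise ValueError("conflict: more packets than delay lines")
    fdl = [None] * M
    for k, p in enumerate(flows[0]):
        fdl[k] = p              # flow 0 fills FDL 1, FDL 2, ...
    for k, p in enumerate(flows[1]):
        fdl[M - 1 - k] = p      # flow 1 fills FDL M, FDL M - 1, ...
    return departures, fdl


pkts = [(0, 3, "a3"), (1, 2, "b2"), (0, 1, "a1"), (1, 5, "b5")]
departures, fdl = switch_one_slot(pkts, control=(1, 0), M=5)
print(departures)
print(fdl)
```

With Output 0 enabled, the oldest packet of flow 0 departs, the remaining flow 0 packet takes FDL 1, and the two flow 1 packets occupy FDL 5 and FDL 4, which is the back-to-back assignment described above.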
According to the above switching algorithm, if a packet of flow 0 is sent to the $(k + 1)$th delay line, the delay lines indexed from 1 to $k$ must have been occupied by packets of the same flow with higher priorities. Similarly, if a packet of flow 1 is assigned to the $(M - k)$th delay line, there must be $k$ packets of this flow with higher priorities in the delay lines indexed from $M$ to $M + 1 - k$. Therefore, based on (R1), the length $d_i$ of the $i$th delay line can be set as
$d_i = \min(i, M + 1 - i), \quad i = 1, \ldots, M$ (5)
It is notable that at most $M$ packets can be simultaneously inserted into the fiber delay lines, so a conflict may happen if $M + 1$ packets come to the inputs of the switch in the same time slot (i.e., one newly arriving packet at the input port and $M$ feedback packets from the fiber delay lines).
Definition 2: (Conflict) For the switch system in Fig. 3, we say it is in conflict if in one time slot there are $M + 1$ packets at the inputs of the switch but no departure is requested for any of them.
For an exact emulation of a 1-to-2 shared buffer queue based on the switch system in Fig. 3, we need to determine the condition under which a conflict can never happen.
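The delay-line setting stated in the conclusions, namely length min(i, M + 1 - i) for the ith delay line, can be tabulated directly. The short sketch below uses hypothetical helper names (`delay_lengths`, `total_fiber_length`) and also compares the total fiber length with the closed form ceil(M/2) * (floor(M/2) + 1), which follows from summing the two symmetric halves.

```python
def delay_lengths(M):
    """Lengths of FDL 1..M under the setting min(i, M + 1 - i)."""
    return [min(i, M + 1 - i) for i in range(1, M + 1)]


def total_fiber_length(M):
    """Total number of packet slots provided by the M delay lines."""
    return sum(delay_lengths(M))


def closed_form_total(M):
    """ceil(M / 2) * (floor(M / 2) + 1), written with integer arithmetic."""
    return ((M + 1) // 2) * (M // 2 + 1)


for M in (3, 4, 7, 11):
    print(M, delay_lengths(M), total_fiber_length(M), closed_form_total(M))
```

The total grows quadratically in M, which is the upper limit on the number of packets the delay lines can physically hold.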
B. Buffer size
To avoid conflict, the number of packets accommodated in the system must be constrained such that accepting any newly arriving packet will not lead the system into the state of conflict defined above. For this purpose, we introduce the following definition.
Definition 3: (Buffer size $B$) For the 1-to-2 shared buffer queue construction in Fig. 3 with scheduling scheme (S1) and the delay line setting in Equation (5), we define its buffer size $B$ as the maximum buffer capacity the construction can provide so that conflicts never happen under any scenario.
Let $X$ and $Y$ denote the sets of packets buffered for flows 0 and 1, so $|X|$ and $|Y|$ are the total numbers of packets buffered for the two flows, respectively. We further use $x$ and $y$ to denote the numbers of packets in $X$ and $Y$ that appear at the inputs of the switch and require buffering. Then we can construct a packet conflict set as
$C = \{(X, Y) \mid X, Y \text{ conflict with one another, i.e., } x + y > M\}$ (6)
The buffer size $B$ is then determined, in Equation (7), by minimizing $|X| + |Y|$ over the conflict set $C$.
Obviously, the construction in Fig. 3 can exactly emulate any 1-to-2 shared buffer queue with buffer size not larger than $B$.
Since a delay line with delay length $d$ can accommodate $d$ packets, one may wonder whether $B = \sum_{i=1}^{M} d_i$ for the construction in Fig. 3. We use a counterexample to show that when $M = 3$, the buffer size of the construction in Fig. 3 is 3 although the total length of its delay lines is 4.
Example 1: For the construction in Fig. 3 with $M$ delay lines ($M = 3$ here), the Slot Transition Table in Fig. 4 is adopted to illustrate the slot occupation state (i.e., the first slot states) of these delay lines after each switching operation [1]. Let $a_j$ (resp. $b_j$) be the $j$th arriving packet of flow 0 (resp. flow 1). Assume that the construction starts from an empty system and that there is no departure request from time slot $t$ to $t + 4$. As illustrated in Fig. 4, the packet $a_1$ arrived at time slot $t$ and was sent to FDL 1, while the packet $b_1$ arrived at time slot $t + 1$ and was sent to FDL 3. At time slot $t + 2$, the packet $a_2$ arrived and had to be sent to FDL 2 of length 2 according to the switching algorithm (S1). At time slot $t + 4$, the packets $a_1$, $a_2$ and $b_1$ emerged from the outputs of the delay lines and a new packet $a_3$ arrived simultaneously. Since the output ports were still disabled at $t + 4$, to avoid conflict, $a_1$, $a_2$ and $b_1$ were rebuffered in FDL 1, FDL 2 and FDL 3, but the packet $a_3$ had to be dropped. Thus, the buffer size of this construction is no more than 3 (please refer to Equation (7)). On the other hand, given that there are 3 distinct delay lines in this construction, it can always accommodate at least 3 packets without conflict. Therefore, the buffer size of this shared buffer queue construction is exactly 3.
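The scenario of Example 1 can be replayed with a small slot-level simulation. This is a sketch under stated assumptions, not the authors' code: the class `FeedbackSwitch`, the packet tuple `(flow, arrival_order, name)` and the policy of dropping the newly arrived packet on conflict are modeling choices made here, and the delay lengths use the setting min(i, M + 1 - i) given in the conclusions.

```python
from collections import deque


def delay_lengths(M):
    # Delay-line setting: FDL i has length min(i, M + 1 - i).
    return [min(i, M + 1 - i) for i in range(1, M + 1)]


class FeedbackSwitch:
    """Slot-level model of the feedback construction (hypothetical sketch)."""

    def __init__(self, M):
        self.M = M
        # Each delay line is a FIFO of slots; index 0 emerges in the next slot.
        self.lines = [deque([None] * d) for d in delay_lengths(M)]
        self.arrivals = 0

    def step(self, flow=None, name=None, control=(0, 0)):
        """Advance one slot; returns (departed names, dropped name or None)."""
        at_inputs = [p for p in (line.popleft() for line in self.lines)
                     if p is not None]
        new = None
        if flow is not None:
            self.arrivals += 1
            new = (flow, self.arrivals, name)
            at_inputs.append(new)
        flows = [sorted((p for p in at_inputs if p[0] == f), key=lambda p: p[1])
                 for f in (0, 1)]
        departed = [None, None]
        for f in (0, 1):
            if control[f] and flows[f]:
                departed[f] = flows[f].pop(0)[2]
        dropped = None
        if len(flows[0]) + len(flows[1]) > self.M:
            # Conflict: M + 1 packets need buffering, so the arrival is dropped.
            flows[new[0]].remove(new)
            dropped = new[2]
        slots = [None] * self.M
        for k, p in enumerate(flows[0]):
            slots[k] = p                   # FDL 1, FDL 2, ...
        for k, p in enumerate(flows[1]):
            slots[self.M - 1 - k] = p      # FDL M, FDL M - 1, ...
        for line, p in zip(self.lines, slots):
            line.append(p)
        return departed, dropped


sw = FeedbackSwitch(3)
trace = [sw.step(0, "a1"),   # slot t
         sw.step(1, "b1"),   # slot t + 1
         sw.step(0, "a2"),   # slot t + 2
         sw.step(),          # slot t + 3, no arrival
         sw.step(0, "a3")]   # slot t + 4
print([dropped for _, dropped in trace])
```

In this run only the packet arriving at slot t + 4 is dropped, even though FDL 2 still has one unused slot, which is the effect the example describes.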
In the following, we establish a lower bound $B^*$ on the buffer size of the 1-to-2 shared buffer queue construction in Fig. 3, such that this construction can be used to exactly emulate a 1-to-2 shared buffer queue with size $B^*$. The following lemma is used throughout the paper (we omit the proof due to its simplicity).
Lemma 1: For any positive integer , we have
IV. A LOWER BOUND OF BUFFER SIZE
From Equation (7) we can see that if we know how many packets have already been buffered in the network when conflict happens, then the buffer size can be determined by finding the minimum number of buffered packets among all possible conflict scenarios. It is notable, however, that although we can easily find all the conflict scenarios at the inputs of delay lines, it is very hard to determine all the possible sets of packets (and thus the number of packets) stored in the network for one given conflict scenario. Hereafter, we derive a lower bound of buffer size, which can serve as a sufficient condition for the construction of non-conflict shared buffer queues.
A. Buffered packets of one flow
For convenience, let "rebuffered packets" denote those packets that emerge from the delay lines and are rebuffered in the network after sorting. In the following, we first establish a lemma (Lemma 2) regarding one important property of the construction in Fig. 3. We then determine in Lemma 3, given $k$ rebuffered packets of one flow, the minimum number of packets from the same flow that must have been buffered in the delay lines.
Lemma 2: For the construction in Fig. 3, suppose that at a given time we see $k$ ($k \le M$) rebuffered packets that belong to the same flow and come from $k$ consecutive FDLs, starting with the shortest delay line. Then at least $b_c^*(k)$ packets (including these $k$ rebuffered packets) of this flow are currently buffered in the network, where $b_c^*(k)$ is given by:
Proof: This lemma is proved with the help of the slot transition table. For ease of comprehension, we first consider an example of the switch system with $M = 7$. We assume that at time $t$, four packets of one flow (denoted $p_1$, $p_2$, $p_3$ and $p_4$) have emerged from FDL 1 to FDL 4 and are now rebuffered in these delay lines. The delay line occupation state at this time is illustrated by the slot transition table in Fig. 5 (a.1). Since these four packets came from FDL 1 to FDL 4 respectively, they must have been inserted into their corresponding delay lines at time slots $t - 1$, $t - 2$, $t - 3$ and $t - 4$, as illustrated in Fig. 5 (a.2). Let us first consider time slot $t - 4$. As $p_4$ occupied FDL 4 at this time, from the switching algorithm (S1) we know that FDL 3, FDL 2 and FDL 1 must already have been occupied by three other packets with higher priorities. We denote these three packets by $p_{43}$, $p_{42}$ and $p_{41}$, as shown in Fig. 5 (a.3). Similarly, we can deduce that two packets (marked as $p_{32}$ and $p_{31}$) with priorities higher than that of $p_3$ must have occupied FDL 2 and FDL 1 at time slot $t - 3$, and one packet (marked as $p_{21}$) with priority higher than that of $p_2$ must have occupied FDL 1 at time slot $t - 2$. Notice that FDL $i$ can delay a packet for $d_i$ time slots only, so the packets $p_{43}$ and $p_{32}$ would emerge from FDL 3 and FDL 2 at time slot $t - 1$. Since a departure request can come at any time slot, it may happen that at time slot $t - 1$ one packet (assume $p_{43}$ here) departs from the output link, as shown in Fig. 5 (a.4). Thus, we can conclude that by time slot $t$ at least five packets of the same flow are buffered in the system, namely $p_1$, $p_2$, $p_3$, $p_4$ and the packet $p_{32}$ in FDL 2. Extending the idea of this example, we obtain the following general results.
(a) 0 < ≤ ⌈ 
To explain the condition in this scenario, we adopt another example shown in Fig. 5 (b) , where = 5 (> ⌈7/2⌉) packets that emerged from FDL 1 to FDL 5 are rebuffered at time slot . Similar to the above scenario, by checking the − ℎ ( ∈ [1, ]) time slot in Fig. 5 (b.3) , we can see that at least packets were inserted into delay lines at this slot. To determine at least how many packets have been 
) packets belonging to set 2, as shown in Fig. 5 (b.4) . Based on the packets in set 1 and set 2, we can determine the number of packets buffered in the system by time slot as:
In particular, when = ⌈ 2 ⌉ , the Equation (10) reduces to Equation 
For the case when $M$ is even, the proof is similar and we only give the final results as follows: Case 3.
is even and ≤ ⌊
In particular, when = 2 , the Equation (13) reduces to Equation (9), so *
Case 4. is even and 2 + ⌈
Combining the above results together, the Equation (8) follows.
The above analysis focuses on the case in which all the rebuffered packets come from consecutive delay lines (from delay line 1 to delay line $k$) only. In practice, however, the rebuffered packets in the feedback construction may come from non-consecutive delay lines as well. Take the slot transition table in Fig. 6 as an example ($M = 11$), where six packets of one flow are rebuffered in the delay lines from FDL 1 to FDL 6 at time slot $t$. From Fig. 6 we can see that if five of these packets came from the 1st to 5th consecutive delay lines but another packet came from the 7th delay line, there should be at least $b_c^*(5) + 2 = 9$ packets buffered by slot $t$, including the two packets inserted into FDL 6 and FDL 7 at time slot $t - 5$. On the other hand, if all six rebuffered packets came from the 1st to 6th consecutive delay lines, we know from Lemma 2 that at least $b_c^*(6) = 10$ packets have been buffered at slot $t$. This example indicates that the number of buffered packets in the non-consecutive case may be less than that of the consecutive case. This is due to the "packet overlap" phenomenon in the non-consecutive case, i.e., two packets may be fed back through two delay lines of the same length (FDL 5 and FDL 7 both have length 5 here). This observation leads to the following lemma.
Lemma 3: For the construction in Fig. 3, if there are $k$ ($k \le M$) rebuffered packets of the same flow at one time, then the number of buffered packets (including these $k$ rebuffered packets) of this flow is at least $b^*(k)$ by this time, where $b^*(k)$ is defined separately for odd and even $M$, and $\lceil x \rfloor$ denotes the integer closest to $x$.
Proof: See Appendix A.
B. A lower bound of buffer size
Since the number of buffered packets $b^*(k)$ derived in Lemma 3 serves as a lower bound on the number of packets actually buffered for one flow, in this section we apply $b^*(\cdot)$ (instead of $|X|$ and $|Y|$ in (7)) to obtain a lower bound on the buffer size. Notice that if there are $i$ packets of flow 0 occupying the delay lines indexed from 1 to $i$ ($0 \le i \le \lceil M/2 \rceil$), there must be $M - i$ packets of flow 1 occupying the other delay lines after sorting. Therefore, a lower bound $B^*$ on the buffer size can be obtained by solving the following integer linear programming problem:
$B^* = \min \, (b^*(i) + b^*(M - i))$ (16)
Subject to:
$0 \le i \le \lceil M/2 \rceil$ (17)
$i$ is an integer (18)
By finding the closed-form solution of the above problem, we obtain the following theorem (please refer to Appendix B for the proof).
Theorem 1: The construction in Fig. 3 is a 1 × 2 shared buffer queue with buffer size $B \ge B^*$, where $B^*$ is given by Equation (19), with separate expressions for odd and even $M$.
Since the construction can accommodate at most $\sum_{i=1}^{M} d_i$ packets, the setting in Equation (5) implies that a shared buffer queue with at most $O(M^2)$ buffer size can be constructed via the $(M + 2) \times (M + 2)$ switch. Then we have the following corollary from the conclusion in Theorem 1:
Corollary 1: The construction in Fig. 3 can exactly emulate a shared optical queue with $O(M^2)$ buffer size.
In comparison with the simple construction in Fig. 2, where the achievable buffer size is the same as the number of delay lines, our feedback construction in Fig. 3 can provide a much larger buffer size ($O(M^2)$) with $M$ delay lines. A further comparison among the implementations of different optical queues is shown in Table I.
C. Practical considerations
Now we investigate some practical issues in building a 1-to-2 shared buffer queue. The core of our construction is an $(M + 2) \times (M + 2)$ optical switch fabric, which in principle can be implemented by a non-blocking optical space switch. Two promising switching technologies for building high-speed optical switches are the arrayed waveguide grating (AWG) and the directional coupler (DC), because they can switch at nanosecond speeds and are thus suitable for supporting packet switching inside the shared buffer queue [21], [22]. To build a large switch fabric, multistage switch architectures such as Clos and Cantor are usually adopted to achieve good scalability (recent advances in high-speed optical switch design can be found in [23], [24] and the references therein).
In a 1-to-2 shared buffer queue, a packet may need to be recirculated many times in the system, since its departure time cannot be determined in advance. Therefore, the optical signal may be significantly attenuated after many circulations, and optical amplifiers are necessary at the output ports of the system to compensate for the signal loss. It is notable, however, that optical amplifiers introduce additional accumulated spontaneous emission noise and signal-level fluctuation [20]. Also, the crosstalk introduced by switch devices (such as AWG and DC) further degrades the signal and increases the bit error rate [15]. How these constraints limit the maximum buffering time and the practical design of shared buffer queues still deserves careful study.

TABLE II
BUFFER CAPACITY OF N-TO-2 SHARED BUFFER QUEUE ACHIEVED BY THEOREM 2

M      1   2   3   8   15   32   63   128    255    512     1023
N=1    1   2   3   8   18   89   383  1587   6450   26609   104447
N=2    1   2   3   8   16   84   372  1563   6400   25908   104244
N=5    1   2   3   8   15   71   339  1491   6253   25607   103635

Fig. 7. An N-to-2 construction of shared buffer queue.
V. THE CONSTRUCTION OF N-TO-2 SHARED BUFFER QUEUE
In this section, we extend the result in Section IV to the more general $N$-to-2 shared buffer queue with $N$ input links, 2 control inputs, 2 departure links and $N$ loss links (see Fig. 7). At time $t$, let $c(t)$ be the states of the control inputs, $a(t)$ be the set of arriving packets and $d(t)$ be the set of departing packets. Similar to the construction of the 1-to-2 shared buffer queue, the $N$-to-2 construction also needs to satisfy the properties of flow conservation (P1), non-idling (P2) and FIFO (P4). The maximum buffer usage property (P3) should be replaced by the following property (P3′) due to the multiple inputs here:
To guarantee the above properties, the $N$-to-2 construction adopts the same switching algorithm (S1) and delay setting (5) as its 1-to-2 counterpart. Here, we also derive a lower bound on the buffer size of the $N$-to-2 shared buffer queue that guarantees no conflict under any scenario.
Since up to $N$ packets (denoted here as $p_1, p_2, \ldots, p_N$) may arrive at the queue in one time slot, without loss of generality we only need to consider the following two conflict cases:
Case 1. All packets $p_1, \ldots, p_N$ are destined for output link 1. Assume that at time slot $t$, after scheduling, the packet $p_N$ is dropped to avoid conflict and the packets $p_1, \ldots, p_{N-1}$ are assigned to delay lines $(i + 1), \ldots, (i + N - 1)$ ($0 \le i \le M - N + 1$). Based on Lemma 3, we know that at least $b^*(i) + b^*(M - (i + N - 1)) + N - 1$ packets have been buffered in the system by this time.
Case 2. $j$ (resp. $N - j$) packets are destined for output link 1 (resp. 2), $1 \le j < N$. If none of these arriving packets is dropped at the inputs, after sorting two packets of different flows may compete for one common delay line. Suppose this competition happens at delay line $(i + 1)$ ($0 \le i \le M - N + 1$). In this case, only $i$ rebuffered packets of flow 0, $M - (i + N - 1)$ rebuffered packets of flow 1 and $N - 1$ newly arrived packets can be inserted into the delay lines. Again, we know that at least $b^*(i) + b^*(M - (i + N - 1)) + N - 1$ packets have been buffered in the system.
As the results in the above two cases are the same, a lower bound $B^*$ on the buffer size of the $N$-to-2 shared buffer queue can be obtained by solving an integer linear programming problem similar to (16). Thus, we have the following theorem (please refer to Appendix C for the proof).
Theorem 2:
and ≥ 2, then the construction in Fig. 7 is an × 2 shared buffer queue with buffer size ≥ * , here * =
and
To illustrate the condition developed in Theorem 2, Table II shows the buffer capacity $B^*$ for different combinations of $M$ and $N$. We can see that for a given $M$, the buffer capacity decreases as $N$ increases. This is because as $N$ increases, more packets may require buffering within one time slot. On the other hand, for a given $N$, the buffer capacity grows monotonically as $M$ increases, and the growth becomes significant when $M$ is large enough. We can also see from Table II that when $M$ is small (say, less than 20), the lower bound is almost the same as the value of $M$.
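The trends read off Table II can be checked mechanically. The sketch below simply transcribes the table values and tests the two monotonicity statements; the ratio of capacity to M squared is an additional empirical observation made here (it is not a result stated in the paper), and the names `M_VALUES` and `TABLE_II` are introduced for illustration.

```python
# Values transcribed from Table II; keys are the number of inputs N.
M_VALUES = [1, 2, 3, 8, 15, 32, 63, 128, 255, 512, 1023]
TABLE_II = {
    1: [1, 2, 3, 8, 18, 89, 383, 1587, 6450, 26609, 104447],
    2: [1, 2, 3, 8, 16, 84, 372, 1563, 6400, 25908, 104244],
    5: [1, 2, 3, 8, 15, 71, 339, 1491, 6253, 25607, 103635],
}


def grows_with_M(row):
    """True if the capacity strictly increases along a row of the table."""
    return all(a < b for a, b in zip(row, row[1:]))


def shrinks_with_N(table):
    """True if, for every M, the capacity does not increase as N increases."""
    rows = [table[n] for n in sorted(table)]
    return all(a >= b
               for upper, lower in zip(rows, rows[1:])
               for a, b in zip(upper, lower))


# Capacity divided by M^2, as a rough indication of quadratic growth.
ratios = {n: [b / (m * m) for m, b in zip(M_VALUES, row)]
          for n, row in TABLE_II.items()}

print(all(grows_with_M(row) for row in TABLE_II.values()))
print(shrinks_with_N(TABLE_II))
for n in sorted(ratios):
    print(n, [round(r, 3) for r in ratios[n][-3:]])
```

For the three largest values of M in the table, the ratio stays close to one tenth for every listed N, which is consistent with a buffer size that scales quadratically in M.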
VI. CONCLUSIONS
In this paper, we studied the exact emulation of a 1-to-2 FIFO shared buffer queue based on the optical feedback SDL construction. The construction consists of an $(M + 2) \times (M + 2)$ space switch and $M$ fiber delay lines connecting outputs of the switch fabric back to its inputs. We showed that by setting the length of the $i$th delay line as $\min(i, M + 1 - i)$, $i = 1, \ldots, M$, such a construction can exactly emulate a 1-to-2 shared buffer queue with $O(M^2)$ buffer size. We then extended this construction to the more general $N$-to-2 case.
Note that this paper only studied the design of the simple 1-to-2 (and also $N$-to-2) shared buffer queues. Two interesting directions for future work are how to extend the single-stage construction in this paper directly to the general case with an arbitrary number of output links, and how to use the 1-to-2 (and $N$-to-2) modules studied here as building blocks in multistage structures for constructing such general shared buffer queues.
APPENDIX A PROOF OF LEMMA 3
We consider here the non-consecutive case with packets overlap. First, we separate all the rebuffered packets into two groups: group 1 that contains the rebuffered packets from the first half of delay lines (from 1 to ⌈ 2 ⌉ ), and group 2 contains the rebuffered packets from the second half of delay lines (from ⌈ 2 ⌉ +1 to ). Analogous to the proof of Lemma 2 (Case 1), all the buffered packets in the system now can be divided into two sets: set 1 that contains the buffered packets deduced from the rebuffered packets in group 1, and set 2 that contains the buffered packets deduced from the rebuffered packets in group 2. Thus, a lower bound of buffer size can be achieved by finding the minimum numbers of buffered packets in set 1 and set 2. Without loss of generality, we assume that there are min( ⌈ In this case, there are − rebuffered packets from group 1 and rebuffered packets from group 2. Then we have that the number of buffered packets ( ) in the system satisfies the following inequality,
The minimum value of (22) is attained when = ⌈ −3
The minimum value of (23) 
Case 3. 
Therefore, the minimum number of buffered packets in this case is achieved when all the rebuffered packets come from consecutive delay lines 1 to , which is just
$M$ is even. For the sake of brevity, we omit the proof, which is very similar to the case when $M$ is odd. Summarizing the above results, we obtain Lemma 3.
APPENDIX B PROOF OF THEOREM 1
We solve the linear programming problem in (16) 
, we have * ( )
As the stationary point of the term enclosed within above floor function is = 2 , the minimum value of Case 1 is 
Now, we will find out the minimum value of Function (16) > 0, the above expression indicates that the minimum value of Case 1 is smaller than that of Case 2. Summarizing the above results, we know that when is odd the minimum value of Function (16) is given by (27).
(b) $M$ is even. The proof is similar to the above case when $M$ is odd, so we omit it here.
APPENDIX C PROOF OF THEOREM 2
Generally, the computation of the lower bound $B^*$ on the buffer size can be expressed as the following integer linear programming problem:
$B^* = \min \, (b^*(i) + b^*(M - N - i + 1) + N - 1)$ (31)
is an integer (33) When − 1 ≥ , the maximum acceptable buffer size is , so we only need to consider the condition − 1 < in the following proof, i.e., 
Summarizing the above expressions, we know that when is odd the minimum value of Function (31) is given by (38).
(b) $M$ is even. We omit the proof since it is quite similar to the proof when $M$ is odd. Summarizing the above results, Theorem 2 follows.
