164 research outputs found

    Delivering 100% throughput in a Buffered Crossbar with Round Robin scheduling

    Get PDF

    Size-Based Flow Scheduling in a CICQ Switch

    Get PDF
    In the context of flow-aware networking, size-based (SB) scheduling policies have been shown to improve response times of small flows, without degrading the performance of large flows. But these differentiating policies are designed for Output-queued (OQ) switch architecture, which is known to have scalability issues. On the other hand, the buffered-crossbar (BX) switch architecture is currently being pursued as a potential next-generation scalable switch architecture. This work looks into the problem of performing SB scheduling in BX switches. In particular, the design goals, with respect to each output port, are (i) to transmit high-priority packet(s) as long as there is at least one present, and (ii) to respect the FIFO order among high-priority packets. In this direction, we propose a CICQ switches using a single PIFO queue at each crosspoint to schedule packets according to the priority assigned. pCICQ-1 switch uses a simple design to guarantee that packet-priorities are respected once they are in the crosspoint queues. But it does not maintain the FIFO order of high-priority packets, besides letting a bounded number low-priority packets to depart through an output, when there are one or more high-priority packets for the same output. To solve this, we propose an enhancement in pCICQ-2 switch, that uses a sequence controller to respect packet-priorities as well as arrival order for high-priority packets

    Design and stability analysis of high performance packet switches

    Get PDF
    With the rapid development of optical interconnection technology, high-performance packet switches are required to resolve contentions in a fast manner to satisfy the demand for high throughput and high speed rates. Combined input-crosspoint buffered (CICB) switches are an alternative to input-buffered (IB) packet switches to provide high-performance switching and to relax arbitration timing for packet switches with high-speed ports. A maximum weight matching (MWM) scheme can provide 100% throughput under admissible traffic for lB switches. However, the high complexity of MWM prohibits its implementation in high-speed switches. In this dissertation, a feedback-based arbitration scheme for CICB switches is studied, where cell selection is based on the provided service to virtual output queues (VOQs). The feedback-based scheme is named round-robin with adaptable frame size (RR-AF) arbitration. The frame size in RR-AF is adaptably changed by the serviced and unserviced traffic. If a switch is stable, the switch provides 100% throughput. Here, it is proved that RR-AF can achieve 100% throughput under uniform admissible traffic. Switches with crosspoint buffers need to consider the transmission delays, or round-trip times to define the crosspoint buffer size. As the buffered crossbar switch can be physically located far from the input ports, actual round-trip times can be non-negligible. To support non-negligible round-trip times in a buffered crossbar switch, the crosspoint buffer size needs to be increased. To satisfy this demand, this dissertation investigates how to select the crosspoint buffer size under non-negligible round trip times and under uniform traffic. With the analysis of stability margin, the relationship between the crosspoint buffer size and round-trip time is derived. Considering that CICB switches deliver higher performance than lB switches and require no speedup, this dissertation investigates the maximum throughput performance that these switches can achieve. It is shown that CICB switches without speedup achieve 100% throughput under any admissible traffic through a fluid model. In addition, a new hybrid scheme, based on longest queue-first (as input arbitration) and longest column occupancy first (as output arbitration) is proposed, which achieves 100% throughput under uniform and non-uniform traffic patterns. In order to give a better insight of the feedback nature of arbitration scheme for CICB switches, a frame-based round-robin arbitration scheme with explicit feedback control (FRE) is introduced. FRE dynamically sets the frame size according to the input load and to the accumulation of cells in a VOQ. FRE is used as the input arbitration scheme and it is combined with RR, PRR, and FRE as output arbitration schemes. These combined schemes deliver high performance under uniform and nonuniform traffic models using a buffered crossbar with one-cell crosspoint buffers. The novelty of FRE lies in that each VOQ sets the frame size by an adjustable parameter, Δ(i,j) which indicates the degree of service needed by VOQ(i, j). This value is adjusted according to the input loading and the accumulation of cells experienced in previous service cycles. This dissertation also explores an analysis technique based on feedback control theory. This methodology is proposed to study the stability of arbitration and matching schemes for packet switches. A continuous system is used and a control model is used to emulate a queuing system. The technique is applied to a matching scheme. In addition, the study shows that the dwell time, which is defined as the time a queue receives service in a service opportunity, is a factor that affects the stability of a queuing system. This feedback control model is an alternative approach to evaluate the stability of arbitration and matching schemes

    Strong Performance Guarantees for Asynchronous Buffered Crossbar Schedulers

    Get PDF
    Crossbar-based switches are commonly used to implement routers with throughputs up to about 1 Tb/s. The advent of crossbar scheduling algorithms that provide strong performance guarantees now makes it possible to engineer systems that perform well, even under extreme traffic conditions. Until recently, such performance guarantees have only been developed for crossbars that switch cells rather than variable length packets. Cell-based crossbars incur a worst-case bandwidth penalty of up to a factor of two, since they must fragment variable length packets into fixed length cells. In addition, schedulers for cell-based crossbars may fail to deliver the expected performance guarantees when used in routers that forward packets. We show how to obtain performance guarantees for asynchronous crossbars that are directly comparable to those previously developed for synchronous, cell-based crossbars. In particular we define derivatives of the Group by Virtual Output Queue (GVOQ) scheduler of Chuang et al. and the Least Occupied Output First Scheduler of Krishna et al. and show that both can provide strong performance guarantees in systems with speedup 2. Specifically, we show that these schedulers are work-conserving and that they can emulate an output-queued switch using any queueing discipline in the class of restricted Push-In, First-Out queueing disciplines. We also show that there are schedulers for segment-based crossbars, (introduced recently by Katevenis and Passas) that can deliver strong performance guarantees with small buffer requirements and no bandwidth fragmentation

    Multicast scheduling in feedback-based two-stage switch

    Get PDF
    Proceedings of the IEEE Workshop on High Performance Switching and Routing, 2009, p. 28-33Scalability is of paramount importance in high-speed switch design. Two limiting factors are the complexity of switch fabric and the need for a sophisticated central scheduler. In this paper, we focus on designing a scalable multicast switch. Given the fact that the majority traffic on the Internet is unicast, a cost-effective solution is to adopt a unicast switch fabric for handling both unicast and multicast traffic. Unlike existing approaches, we choose to base our multicast switch design on the load-balanced two-stage switch architecture because it does not require a central scheduler, and its unicast switch fabric only needs to realize N switch configurations. Specifically, we adopt the feedback-based two-stage switch architecture [10], because it elegantly solves the notorious packet mis-sequencing problem, and yet renders an excellent throughput-delay performance. By slightly modifying the operation of the original feedback-based two-stage switch, a simple distributed multicast scheduling algorithm is proposed. Simulation results show that with packet duplication at both input ports and middle-stage ports, the proposed multicast scheduling algorithm significantly cuts down the average packet delay and delay variation among different copies of the same multicast packet. Keywords-Feedback-based two-stage switch, scalable multicast switch, load-balanced switch. © 2009 IEEE.published_or_final_versio

    Load balancing and scalable clos-network packet switches

    Get PDF
    In this dissertation three load-balancing Clos-network packet switches that attain 100% throughput and forward cells in sequence are introduced. The configuration schemes and the in-sequence forwarding mechanisms devised for these switches are also introduced. Also proposed is the use of matrix analysis as a tool for throughput analysis. In Chapter 2, a configuration scheme for a load-balancing Clos-network packet switch that has split central modules and buffers in between the split modules is introduced. This switch is called split-central-buffered Load-Balancing Clos-network (LBC) switch and it is cell based. The switch has four stages, namely input, central-input, central-output, and output stages. The proposed configuration scheme uses a pre-determined and periodic interconnection pattern in the input and split central modules to load-balance and route traffic. The LBC switch has low configuration complexity. The operation of the switch includes a mechanism applied at input and split-central modules to forward cells in sequence. The switch achieves 100% throughput under uniform and nonuniform admissible traffic with independent and identical distributions (i.i.d.). The high switching performance and low complexity of the switch are achieved while performing in-sequence forwarding and without resorting to memory speedup or central-stage expansion. This discussion includes both throughput analysis, where the operations that the configuration mechanism performs on the traffic traversing the switch are described, and a proof of in-sequence forwarding. Simulation analysis is presented as a practical demonstration of the switch performance on uniform and nonuniform i.i.d. traffic.In Chapter 3, a three-stage load balancing packet switch and its configuration scheme are introduced. The input- and central-stage switches are bufferless crossbars and the output-stage switches are buffered crossbars. This switch is called ThRee-stage Clos-network swItch and has queues at the middle stage and DEtermiNisTic scheduling (TRIDENT) and it is cell based. The proposed configuration scheme uses a pre-determined and periodic interconnection pattern in the input and central modules to load-balance and route traffic; therefore, it has low configuration complexity. The operation of the switch includes a mechanism applied at input and output modules to forward cells in sequence. In Chapter 4, a highly scalable load balancing three-stage Clos-network switch with Virtual Input-module output queues at ceNtral stagE (VINE) and crosspoint-buffers at output modules and its configuration scheme are introduced. VINE uses space switching in the first stage and buffered crossbars in the second and third stages. The proposed configuration scheme uses pre-determined and periodic interconnection patterns in the input modules for load balancing. The mechanism applied at the inputs, used to forward cells in sequence, is also introduced. VINE achieves 100% throughput under uniform and nonuniform admissible i.i.d. traffic. VINE achieves high switching performance, low configuration complexity, and in-sequence forwarding without resorting to memory speedup. In Chapter 5, matrix analysis is introduced as a tool for modeling, describing the internal operations, and analyzing the throughput of a packet switch
    • …
    corecore