858 research outputs found

    Minimizing the communication overhead of iterative scheduling algorithms for input-queued switches

    Communication overhead should be minimized when designing iterative scheduling algorithms for input-queued packet switches. In general, the overall communication overhead is a function of the number of iterations required per time slot (M) and the number of data bits exchanged per input-output pair per iteration (B). In this paper, we aim at maximizing switch throughput while minimizing communication overhead. We first propose a single-iteration scheduling algorithm called Highest Rank First (HRF). In HRF, the highest priority is given to the preferred input-output pair, which each port calculates locally in round-robin (RR) order. Only when the preferred VOQ(i,j) is empty does input i send a request with a rank number r to each output. The request from a longer VOQ carries a smaller r. Higher scheduling priority is given to the request with a smaller r. To further cut the communication overhead down to 1 bit per request, we design HRF with Request Compression (HRF/RC). The basic idea is to transmit a single-bit code in the request phase; r can then be decoded at the output ports from the current and historical codes received. The overall communication overhead for HRF/RC becomes only 2 bits, i.e., 1 bit in the request phase and 1 bit in the grant phase. We show that HRF/RC incurs a much lower hardware cost than multi-iteration algorithms and the single-iteration algorithm π-RGA [11]. Compared with other iterative algorithms with the same communication overhead (i.e., SRR [10] and 1-iteration iSLIP [6]), simulation results show that HRF/RC always produces the best delay-throughput performance. © 2011 IEEE. Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM 2011), Houston, TX, USA, 5-9 December 201
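
The request and grant steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the function name `hrf_schedule`, the preferred pairing `(i + t) mod N`, sending requests only for non-empty VOQs, breaking ties by lowest index, and letting an input that receives several grants keep the one with the smallest rank.

```python
def hrf_schedule(q, t):
    """Return {input: output} matches for one time slot of an N x N switch.

    q[i][j] is the length of VOQ(i, j); t is the time slot index.
    Illustrative sketch only; see the assumptions in the lead-in.
    """
    n = len(q)
    match = {}
    taken_outputs = set()

    # Step 1: the preferred pair wins outright if its VOQ is non-empty.
    # Assumed round-robin pattern: input i prefers output (i + t) mod n,
    # which is a permutation, so preferred pairs never conflict.
    for i in range(n):
        pref = (i + t) % n
        if q[i][pref] > 0:
            match[i] = pref
            taken_outputs.add(pref)

    # Step 2: inputs whose preferred VOQ is empty send ranked requests.
    # A longer VOQ gets a smaller rank r (r = 1 is the longest).
    requests = {j: [] for j in range(n)}  # output -> list of (rank, input)
    for i in range(n):
        if i in match:
            continue
        nonempty = [j for j in range(n) if q[i][j] > 0]
        nonempty.sort(key=lambda j: (-q[i][j], j))
        for r, j in enumerate(nonempty, start=1):
            requests[j].append((r, i))

    # Step 3: each free output grants the request with the smallest rank.
    grants = {}  # input -> list of (rank, output)
    for j in range(n):
        if j in taken_outputs or not requests[j]:
            continue
        r, i = min(requests[j])
        grants.setdefault(i, []).append((r, j))

    # Step 4 (assumed): an input keeps the grant with the smallest rank.
    for i, offers in grants.items():
        match[i] = min(offers)[1]
    return match


if __name__ == "__main__":
    voq = [[0, 3, 1],
           [2, 0, 0],
           [0, 0, 0]]
    print(hrf_schedule(voq, t=0))
```

In this example no preferred VOQ is occupied at `t=0`, so both non-idle inputs fall back to ranked requests and each is granted the output of its longest queue.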

    Experimental survey of FPGA-based monolithic switches and a novel queue balancer

    This paper studies small to medium-sized monolithic switches for FPGA implementation and presents a novel switch design that achieves high algorithmic performance and FPGA implementation efficiency. Crossbar switches based on virtual output queues (VOQs) and their variations have been popular for implementing switches on FPGAs, with applications in network switches, memory interconnects, network-on-chip (NoC) routers, etc. The implementation efficiency of crossbar-based switches is well documented on ASICs, though we show that their disadvantages can outweigh their advantages on FPGAs. One of the most important challenges in such input-queued switches is the requirement for iterative scheduling algorithms. This is more harmful on FPGAs than on ASICs, as the reduced operating frequency and narrower packets cannot “hide” the multiple iterations of scheduling that are required to achieve even modest scheduling performance. Our proposed design uses an output-queued switch internally to simplify scheduling, and a queue balancing technique to avoid queue fragmentation and reduce the need for memory-sharing VOQs. Its implementation approaches the scheduling performance of a state-of-the-art FPGA-based switch, while requiring considerably fewer resources.

    Towards Terabit Carrier Ethernet and Energy Efficient Optical Transport Networks

    Feedback-based scheduling for load-balanced two-stage switches

    A framework for designing feedback-based scheduling algorithms is proposed for elegantly solving the notorious packet missequencing problem of a load-balanced switch. Unlike existing approaches, we show that the efforts made in load balancing and in keeping packets in order can complement each other. Specifically, at each middle-stage port between the two switch fabrics of a load-balanced switch, only a single-packet buffer is required for each virtual output queue (VOQ). Although packets belonging to the same flow pass through different middle-stage VOQs, the delays they experience at different middle-stage ports will be identical. This is made possible by properly selecting and coordinating the two sequences of switch configurations to form a joint sequence with both the staggered symmetry property and the in-order packet delivery property. Based on the staggered symmetry property, an efficient feedback mechanism is designed to allow the right middle-stage port occupancy vector to be delivered to the right input port at the right time. As a result, the performance of load balancing as well as the switch throughput is significantly improved. We further extend this feedback mechanism to support the multicabinet implementation of a load-balanced switch, where the propagation delay between switch linecards and switch fabrics is non-negligible. Compared with existing load-balanced switch architectures and scheduling algorithms, our solutions impose a modest requirement on switch hardware, but consistently yield better delay-throughput performance. Last but not least, some extensions and refinements are made to address the scalability, implementation, and fairness issues of our solutions. © 2009 IEEE.

    High Performance Queueing and Scheduling in Support of Multicasting in Input-Queued Switches

    Due to its mild requirement on the bandwidth of the switching fabric and internal memory, the input-queued architecture is a practical solution for today's very high-speed switches. One of the notoriously difficult problems in the design of input-queued switches with very high link rates is the high-performance queueing and scheduling of multicast traffic. This dissertation focuses on proposing novel solutions for this problem. The design challenge stems from the nature of multicast traffic, i.e., a multicast packet typically has multiple destinations. On the one hand, this nature makes queueing and scheduling of multicast traffic much more difficult than that of unicast traffic. For example, virtual output queueing is widely used to completely avoid head-of-line blocking and achieve 100% throughput for unicast traffic. Nevertheless, exhaustive multicast virtual output queueing is impractical and results in out-of-order delivery. On the other hand, in spite of extensive studies in the context of either pure unicast traffic or pure multicast traffic, the results from a study in one context are not applicable to the other context because of the differences between unicast and multicast traffic. The design of integrated scheduling for both types of traffic remains an open issue. The main contribution of this dissertation is twofold: first, the performance of an interesting approach to efficiently mitigating head-of-line blocking for multicast traffic is theoretically analyzed; second, two novel algorithms are proposed to efficiently integrate unicast and multicast scheduling within one switching fabric.
The research work presented in this dissertation concludes that (1) a small number of queues is sufficient to maximize the saturation throughput and delay performance of a large multicast switch with multiple first-in-first-out queues per input port; (2) the theoretical analysis results are indeed valid for practical large-sized switches; (3) for a large M × N multicast switch, the final achievable saturation throughput decreases as the ratio M/N decreases; and (4) the two proposed integration algorithms exhibit promising performance in terms of saturation throughput, delay, and packet loss ratio under both uniform Bernoulli and uniform bursty traffic.

    On packet switch design

    On scheduling input queued cell switches

    Output-queued switching, though able to offer high throughput, guaranteed delay, and fairness, lacks scalability owing to the speedup problem. Input-queued switching, on the other hand, is scalable, and is thus becoming an attractive alternative. This dissertation presents three approaches toward resolving the major problem encountered in input-queued switching that has prohibited the provision of quality of service guarantees. First, we proposed a maximum size matching based algorithm, referred to as min-max fair input queueing (MFIQ), which minimizes the additional delay caused by back pressure, and at the same time provides fair service among competing sessions. Like any maximum size matching algorithm, MFIQ performs well for uniform traffic, in which the destinations of the incoming cells are uniformly distributed over all the outputs, but is not stable for non-uniform traffic. Subsequently, we proposed two maximum weight matching based algorithms, longest normalized queue first (LNQF) and earliest due date first matching (EDDFM), which are stable for both uniform and non-uniform traffic. LNQF provides fairer service than longest queue first (LQF) and better traffic shaping than oldest cell first (OCF), and EDDFM has a lower probability of delay overdue than LQF, LNQF, and OCF. Our third approach, referred to as store-sort-and-forward (SSF), is a frame-based scheduling algorithm. SSF is proved to be able to achieve strict-sense 100% throughput, and to provide bounded delay and delay jitter for input-queued switches if the traffic conforms to the (r, T) model.
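
As a small illustration of the maximum weight matching idea behind LQF mentioned in this abstract, the following Python sketch picks the input-output permutation with the largest total queue length. It is not the dissertation's LNQF or EDDFM algorithm. The function name `lqf_match` is hypothetical, and the brute-force search over all permutations is an assumption chosen for clarity, so it is only practical for very small switches.

```python
from itertools import permutations


def lqf_match(q):
    """Return {input: output} for a maximum weight matching.

    q[i][j] is the length of VOQ(i, j) and serves as the edge weight,
    as in longest queue first (LQF). Brute force, for illustration only.
    """
    n = len(q)
    # Try every one-to-one assignment of inputs to outputs and keep the
    # one whose matched queues have the largest total length.
    best = max(permutations(range(n)),
               key=lambda p: sum(q[i][p[i]] for i in range(n)))
    # Drop pairs whose VOQ is empty: there is nothing to send.
    return {i: j for i, j in enumerate(best) if q[i][j] > 0}


if __name__ == "__main__":
    voq = [[5, 2],
           [4, 0]]
    print(lqf_match(voq))
```

Here the longest single queue is VOQ(0,0) with 5 cells, yet the matching serves VOQ(0,1) and VOQ(1,0) because their combined length of 6 is larger. This shows that the decision is made on total weight rather than greedily per queue.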