
    Detecting Markov Chain Instability: A Monte Carlo Approach

    We devise a Monte Carlo-based method for detecting whether a non-negative Markov chain is stable for a given set of parameter values. More precisely, for a given subset of the parameter space, we develop an algorithm that is capable of deciding whether the set has a subset of positive Lebesgue measure for which the Markov chain is unstable. The approach is based on a variant of simulated annealing, and consequently only mild assumptions are needed to obtain performance guarantees. The theoretical underpinnings of our algorithm are based on a result stating that the stability of a set of parameters can be phrased in terms of the stability of a single Markov chain that searches the set for unstable parameters. Our framework leads to a procedure that is capable of performing statistically rigorous tests for instability, which has been extensively tested using several examples of standard and non-standard queueing networks.
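As a rough illustration of what "stable" versus "unstable" means for a non-negative Markov chain, the sketch below simulates a simple discrete-time queue and compares its average state after many steps. It is not the simulated-annealing algorithm of the abstract: the chain, the function name `mean_final_state`, and the parameters `lam` and `mu` are all assumptions chosen for illustration.

```python
import random


def mean_final_state(lam, mu, steps=2000, runs=50, seed=0):
    """Average queue length after `steps` steps, over `runs` independent runs.

    Hypothetical toy chain: X_{n+1} = max(X_n + A_n - D_n, 0), where
    A_n ~ Bernoulli(lam) is an arrival and D_n ~ Bernoulli(mu) a departure.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    total = 0
    for _ in range(runs):
        x = 0
        for _ in range(steps):
            arrival = 1 if rng.random() < lam else 0
            departure = 1 if rng.random() < mu else 0
            x = max(x + arrival - departure, 0)  # state stays non-negative
        total += x
    return total / runs


# Positive drift (lam > mu): the state grows roughly linearly with time.
unstable = mean_final_state(lam=0.8, mu=0.3)
# Negative drift (lam < mu): the state stays near zero.
stable = mean_final_state(lam=0.3, mu=0.8)
print(unstable, stable)
```

A plain simulation like this only checks one parameter value at a time; the method described in the abstract instead searches a whole parameter set for unstable points.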

    PERFORMANCE ASSESSMENT OF SCHEDULERS IN OPTICAL INTERCONNECTION NETWORKS

    With ever-increasing demand for high-performance computing systems, interconnection networks, which serve as the communication links in multicore architectures, have become a key element in guaranteeing system performance. Compared with bandwidth-limited, power-hungry electrical interconnection networks, integrated optical interconnection networks, also referred to as optical networks-on-chip (ONoC) architectures, are emerging as a promising alternative to enable future computing performance. In ONoC architectures, scheduling algorithms are necessary for avoiding packet collisions while achieving high throughput, low latency, and good fairness. Scheduling algorithms exist for non-blocking electrical NoC. These algorithms can be applied to ONoC, provided they account for additional constraints arising from optical component limitations. In this thesis, various scheduling algorithms are simulated in C++ for ONoC with bus and ring topologies, with the objective of comparing their latency and throughput. An optimal scheduler based on a two-step scheduling (TSS) technique is proposed. The optimal TSS models the ONoC scheduling problem in two steps. The first step is the matching step, in which the node pairs are represented as a bipartite graph and input ports are matched to output ports. The second step assigns a wavelength to each matched node pair while avoiding collisions and respecting wavelength continuity. The two-step approach is considered with both the iSLIP and MWM algorithms. The proposed optimal TSS is simulated and its performance is evaluated. The optimal scheduler with the maximum weighted matching (MWM) scheduling policy achieves better results than the iSLIP scheduling policy based on queue length under any packet arrival process. The optimal MWM scheduling policy achieved better performance for both bus and ring topologies.
The main result is that the unidirectional ring topology outperforms the bus topology for any number of wavelengths less than or equal to the number of ONoC ports, even though its average path length is longer. The reason is that in the bus topology half of the wavelengths are allocated to each direction, which fixes the maximum number of packets in each direction. Using two transceivers per node can compensate for this issue, giving the bus better performance than the ring.
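The two-step structure described above (matching first, wavelength assignment second) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's scheduler: the matching is a brute-force maximum weight matching over queue lengths, suitable only for a handful of ports, and the wavelength step assumes the most conservative rule, namely that no wavelength is reused within a time slot. The function names and the `queue_len` matrix are hypothetical.

```python
from itertools import permutations


def max_weight_matching(queue_len):
    """Step 1: match each input port to a distinct output port so that the
    sum of matched queue lengths is maximal (brute force over permutations)."""
    n = len(queue_len)
    best_perm, best_weight = None, -1
    for perm in permutations(range(n)):
        weight = sum(queue_len[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_perm, best_weight = perm, weight
    # Keep only pairs that actually have packets waiting.
    return [(i, best_perm[i]) for i in range(n) if queue_len[i][best_perm[i]] > 0]


def assign_wavelengths(pairs, queue_len, num_wavelengths):
    """Step 2: give each matched pair its own wavelength, longest queue first.

    Assumption: a wavelength is never shared within a slot, so collisions
    cannot occur and each packet keeps one wavelength end to end."""
    ranked = sorted(pairs, key=lambda p: queue_len[p[0]][p[1]], reverse=True)
    return {pair: w for w, pair in enumerate(ranked[:num_wavelengths])}


# queue_len[i][j] = packets waiting at input i for output j (made-up values).
queue_len = [
    [0, 5, 1],
    [4, 0, 2],
    [3, 6, 0],
]
pairs = max_weight_matching(queue_len)
schedule = assign_wavelengths(pairs, queue_len, num_wavelengths=2)
print(pairs)     # [(0, 2), (1, 0), (2, 1)]
print(schedule)  # {(2, 1): 0, (1, 0): 1}
```

With only two wavelengths, the pair with the shortest queue, (0, 2), is left unserved in this slot. A real ONoC scheduler could reuse a wavelength on non-overlapping path segments, which this sketch deliberately ignores.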

    Experimental survey of FPGA-based monolithic switches and a novel queue balancer

    This paper studies small to medium-sized monolithic switches for FPGA implementation and presents a novel switch design that achieves high algorithmic performance and FPGA implementation efficiency. Crossbar switches based on virtual output queues (VOQs) and their variations have been rather popular for implementing switches on FPGAs, with applications in network switches, memory interconnects, network-on-chip (NoC) routers, etc. The implementation efficiency of crossbar-based switches is well-documented on ASICs, though we show that their disadvantages can outweigh their advantages on FPGAs. One of the most important challenges in such input-queued switches is the requirement for iterative scheduling algorithms. This requirement is more harmful on FPGAs than on ASICs, as the reduced operating frequency and narrower packets cannot “hide” the multiple scheduling iterations that are required to achieve even modest scheduling performance. Our proposed design uses an output-queued switch internally to simplify scheduling, and a queue balancing technique to avoid queue fragmentation and reduce the need for memory-sharing VOQs. Its implementation approaches the scheduling performance of a state-of-the-art FPGA-based switch, while requiring considerably fewer resources.

    Configurable data center switch architectures

    In this thesis, we explore alternative architectures for implementing configurable Data Center Switches along with the advantages that can be provided by such switches. Our first contribution centers around determining switch architectures that can be implemented on a Field Programmable Gate Array (FPGA) to provide configurable switching protocols. In the process, we identify a gap in the availability of frameworks to realistically evaluate the performance of switch architectures in data centers and contribute a simulation framework that relies on realistic data center traffic patterns. Our framework is then used to evaluate the performance of currently existing as well as newly proposed FPGA-amenable switch designs. Through collaborative work with Meng and Papaphilippou, we establish that only small to medium-sized switches can be implemented on today's FPGAs. Our second contribution is a novel switch architecture that integrates a custom in-network hardware accelerator with a generic switch to accelerate Deep Neural Network training applications in data centers. Our proposed accelerator architecture is prototyped on an FPGA, and a scalability study is conducted to demonstrate the trade-offs of an FPGA implementation when compared to an ASIC implementation. In addition to the hardware prototype, we contribute a lightweight load-balancing and congestion control protocol that leverages the unique communication patterns of ML data-parallel jobs to enable fair sharing of network resources across different jobs. Our large-scale simulations demonstrate the ability of our novel switch architecture and lightweight congestion control protocol to both accelerate the training time of machine learning jobs by up to 1.34x and benefit other latency-sensitive applications by reducing their 99th-percentile completion time by up to 4.5x.
As for our final contribution, we identify the main requirements of in-network applications and propose a Network-on-Chip (NoC)-based architecture for supporting a heterogeneous set of applications. Observing the lack of tools to support such research, we provide a tool that can be used to evaluate NoC-based switch architectures.

    Towards Terabit Carrier Ethernet and Energy Efficient Optical Transport Networks


    On packet switch design
