
    OFAR-CM: Efficient Dragonfly networks with simple congestion management

    Dragonfly networks are appealing topologies for large-scale data center and HPC networks, providing high throughput with low diameter and moderate cost. However, they are prone to congestion under certain frequent traffic patterns that saturate specific network links. Adaptive non-minimal routing can avoid such congestion by employing longer paths to circumvent congested local or global links. However, if a distance-based deadlock avoidance mechanism is employed, more Virtual Channels (VCs) are required, which increases design complexity and cost. OFAR (On-the-Fly Adaptive Routing) is a previously proposed routing mechanism that decouples VCs from deadlock avoidance, making local and global misrouting affordable. However, the severity of congestion with OFAR is higher, as it relies on an escape subnetwork with low bisection bandwidth. Additionally, OFAR allows unlimited misrouting on the escape subnetwork, leading to unbounded paths in the network and long latencies. In this paper we propose and evaluate OFAR-CM, a variant of OFAR combined with a simple congestion management (CM) mechanism that relies only on local information, specifically the credit count of the output ports in the local router. With simple escape subnetworks such as a Hamiltonian ring or a tree, OFAR outperforms former proposals with distance-based deadlock avoidance. Additionally, although long paths are allowed in theory, in practice packets arrive at their destination in a small number of hops. Altogether, OFAR-CM constitutes the first practicable mechanism to date that supports both local and global misrouting in Dragonfly networks.
    The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement n. ERC-2012-Adg-321253-RoMoL, the Spanish Ministry of Science under contracts TIN2010-21291-C02-02 and TIN2012-34557, and the European HiPEAC Network of Excellence. M. García participated in this work while affiliated with the University of Cantabria. Peer Reviewed. Postprint (author's final draft).
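
The abstract says only that the congestion management relies on the credit counts of the local router's output ports; it does not give the decision rule. The following is a minimal sketch of what such a purely local decision might look like. The `OutputPort` type, the `choose_port` function, the `threshold` parameter and the selection rule are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class OutputPort:
    name: str
    credits: int  # free downstream buffer slots known locally (assumed meaning)


def choose_port(minimal: OutputPort,
                alternatives: Sequence[OutputPort],
                threshold: int = 2) -> OutputPort:
    """Pick an output port from local credit counts only (hypothetical policy)."""
    # Stay on the minimal path while it still has enough credits.
    if minimal.credits >= threshold:
        return minimal
    # Otherwise consider misrouting to the non-minimal port with the most credits,
    # but only if it is strictly less congested than the minimal port.
    best = max(alternatives, key=lambda p: p.credits, default=minimal)
    return best if best.credits > minimal.credits else minimal
```

The point of the sketch is that no remote state is consulted: the decision uses only counters the router already maintains for flow control.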

    Optimal Networks from Error Correcting Codes

    To address growth challenges facing large Data Centers and supercomputing clusters, a new construction is presented for scalable, high throughput, low latency networks. The resulting networks require 1.5-5 times fewer switches and 2-6 times fewer cables, and have 1.2-2 times lower latency and correspondingly lower congestion and packet losses than the best present or proposed networks providing the same number of ports at the same total bisection. These advantage ratios increase with network size. The key new ingredient is the exact equivalence discovered between the problem of maximizing network bisection for large classes of practically interesting Cayley graphs and the problem of maximizing codeword distance for linear error correcting codes. The resulting translation recipe converts existing optimal error correcting codes into optimal throughput networks. Comment: 14 pages, accepted at ANCS 2013 conference
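
The abstract does not spell out the construction, so the following is only an illustration of the kind of correspondence it describes, under stated assumptions: the network is taken to be a Cayley graph on the group Z_2^n (nodes are n-bit labels, links connect `x` to `x XOR g` for each generator `g`), and only cuts of the form {x : ⟨a, x⟩ = 0} are examined. Under those assumptions the number of links crossing such a cut equals 2^(n-1) times the Hamming weight of the codeword whose i-th bit is ⟨a, g_i⟩. Whether such cuts attain the true bisection is the paper's claim and is not checked here; the function names are hypothetical.

```python
def parity(v: int) -> int:
    """Parity of the number of set bits in v."""
    return bin(v).count("1") % 2


def cut_size(n: int, generators: list, a: int) -> int:
    """Count links of the Cayley graph on Z_2^n crossing the cut {x : <a, x> = 0}."""
    cut = 0
    for x in range(2 ** n):
        for g in generators:
            y = x ^ g
            # Count each undirected link once (x < y) if its ends fall on opposite sides.
            if x < y and parity(a & x) != parity(a & y):
                cut += 1
    return cut


def codeword_weight(generators: list, a: int) -> int:
    """Hamming weight of the codeword whose i-th bit is <a, g_i> mod 2."""
    return sum(parity(a & g) for g in generators)


def min_distance(n: int, generators: list) -> int:
    """Minimum weight over all nonzero message vectors a."""
    return min(codeword_weight(generators, a) for a in range(1, 2 ** n))
```

With `n = 3` and generators `[1, 2, 4]` the graph is the 3-dimensional hypercube, and adding the generator `7` corresponds to appending a parity bit to the code, which raises the minimum codeword weight.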

    An Energy and Performance Exploration of Network-on-Chip Architectures

    In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router and a speculative virtual channel router, and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router was shown to have a very similar efficiency to the wormhole router while providing better performance, supporting its use for general purpose designs. Finally, area metrics are also presented to allow a comparison of implementation costs.
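
The energy-latency product mentioned in the abstract can be illustrated with a short sketch. The function name, the units and every number below are placeholders chosen for illustration; they are not measurements from the paper.

```python
def energy_latency_product(energy_pj: float, latency_ns: float) -> float:
    """Lower is better: energy per packet multiplied by packet latency."""
    return energy_pj * latency_ns


# PLACEHOLDER values (energy in pJ, latency in ns), not results from the paper.
routers = {
    "router_a": (10.0, 20.0),
    "router_b": (12.5, 16.0),
}

products = {name: energy_latency_product(e, t) for name, (e, t) in routers.items()}
```

In this made-up case both routers have the same product even though `router_b` has the lower latency, which is the kind of situation the abstract describes: similar efficiency with better performance.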