418 research outputs found

    CLEX: Yet Another Supercomputer Architecture?

    We propose the CLEX supercomputer topology and routing scheme. We prove that CLEX can utilize a constant fraction of the total bandwidth for point-to-point communication, at delays proportional to the sum of the number of intermediate hops and the maximum physical distance between any two nodes. Moreover, by applying an asymmetric bandwidth assignment to the links, all-to-all communication can be realized $(1+o(1))$-optimally with regard to both bandwidth and delays. This is achieved at node degrees of $n^{\varepsilon}$, for an arbitrarily small constant $\varepsilon \in (0,1]$. In contrast, these results are impossible in any network featuring constant or polylogarithmic node degrees. Through simulation, we assess the benefits of an implementation of the proposed communication strategy. Our results indicate that, for a million processors, CLEX can increase bandwidth utilization and reduce average routing path length by factors of at least 10 and 5, respectively, in comparison to a torus network. Furthermore, the CLEX communication scheme features several other properties, such as deadlock-freedom, inherent fault-tolerance, and canonical partition into smaller subsystems.

    Symmetric Interconnection Networks from Cubic Crystal Lattices

    Torus networks of moderate degree have been widely used in the supercomputer industry. Tori are superb when used for executing applications that require near-neighbor communications. Nevertheless, they are not so good when dealing with global communications. Hence, typical 3D implementations have evolved to 5D networks, among other reasons, to reduce network distances. Most of these big systems are mixed-radix tori, which are not the best option for minimizing distances and efficiently using network resources. This paper is focused on improving the topological properties of these networks. By using integral matrices to deal with Cayley graphs over Abelian groups, we have been able to propose and analyze a family of high-dimensional grid-based interconnection networks. As they are built over $n$-dimensional grids that induce a regular tiling of the space, these topologies have been denoted lattice graphs. We will focus on cubic crystal lattices for modeling symmetric 3D networks. Other higher dimensional networks can be composed over these graphs, as illustrated in this research. Easy network partitioning can also take advantage of this network composition operation. Minimal routing algorithms are also provided for these new topologies. Finally, some practical issues such as implementability and preliminary performance evaluations have been addressed.
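
The remark that mixed-radix tori are not the best option for minimizing distances can be illustrated with a small sketch. It compares a symmetric and a mixed-radix 3D torus with the same node count, using the standard ring-distance formula. The sizes 8x8x8 and 16x8x4 and the function names are illustrative assumptions, not taken from the paper, and the sketch covers plain tori only, not the lattice graphs the paper proposes.

```python
def ring_avg_distance(k):
    """Average hop count between two nodes of a k-node ring (self-pairs included)."""
    return sum(min(d, k - d) for d in range(k)) / k

def torus_diameter(dims):
    """Diameter of a torus: each dimension contributes half its ring length."""
    return sum(k // 2 for k in dims)

def torus_avg_distance(dims):
    """Average distance of a torus: per-dimension ring averages add up."""
    return sum(ring_avg_distance(k) for k in dims)

# Two hypothetical 512-node tori: symmetric versus mixed-radix.
for dims in [(8, 8, 8), (16, 8, 4)]:
    print(dims, torus_diameter(dims), torus_avg_distance(dims))
```

With these sizes the symmetric torus has diameter 12 and average distance 6.0, while the mixed-radix one has diameter 14 and average distance 7.0: the longest dimension dominates.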

    Information Spreading on Almost Torus Networks

    Epidemic modeling has been extensively used in recent years in the field of telecommunications and computer networks. We consider the popular Susceptible-Infected-Susceptible spreading model as the metric for information spreading. In this work, we analyze information spreading on a particular class of networks denoted almost torus networks, and over the lattice, which can be considered as the limit when the torus length goes to infinity. Almost torus networks consist of the torus network topology where some nodes or edges have been removed. We find explicit expressions for the characteristic polynomial of these graphs and tight lower bounds for its computation. These expressions allow us to estimate their spectral radius and thus how the information spreads on these networks.
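
As a numerical counterpart to the spectral-radius estimate mentioned above, the following sketch builds a small 2D torus, removes one node to obtain an almost torus, and estimates the largest adjacency eigenvalue by shifted power iteration. This is not the paper's analytical method; the 5x5 size, the removed node, and all function names are assumptions chosen for illustration. Under the commonly used mean-field approximation of the SIS model, the epidemic threshold is the reciprocal of this spectral radius.

```python
def torus_adjacency(m, n, removed=()):
    """Adjacency lists of an m-by-n torus (m, n >= 3) with `removed` nodes deleted."""
    gone = set(removed)
    nodes = [(i, j) for i in range(m) for j in range(n) if (i, j) not in gone]
    index = {v: k for k, v in enumerate(nodes)}
    adj = [[] for _ in nodes]
    for (i, j), k in index.items():
        for nb in (((i + 1) % m, j), ((i - 1) % m, j),
                   (i, (j + 1) % n), (i, (j - 1) % n)):
            if nb in index:
                adj[k].append(index[nb])
    return adj

def spectral_radius(adj, shift=4.0, iters=500):
    """Largest adjacency eigenvalue via power iteration on A + shift*I.

    The shift (at least the maximum degree) makes the iterated matrix
    positive semidefinite, which avoids oscillation on bipartite graphs.
    """
    x = [1.0] * len(adj)
    for _ in range(iters):
        y = [shift * x[k] + sum(x[t] for t in adj[k]) for k in range(len(adj))]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    ax = [sum(x[t] for t in adj[k]) for k in range(len(adj))]
    # Rayleigh quotient of the converged vector with the unshifted matrix.
    return sum(a * b for a, b in zip(ax, x)) / sum(v * v for v in x)

full = spectral_radius(torus_adjacency(5, 5))
almost = spectral_radius(torus_adjacency(5, 5, removed=[(0, 0)]))
print(full, almost, 1 / full, 1 / almost)
```

The full torus is 4-regular, so its spectral radius is 4. Removing a node lowers the spectral radius, which in the mean-field picture raises the threshold above which information keeps spreading.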

    OutFlank Routing: Increasing Throughput in Toroidal Interconnection Networks

    We present a new, deadlock-free routing scheme for toroidal interconnection networks, called OutFlank Routing (OFR). OFR is an adaptive strategy which exploits non-minimal links, both in the source and in the destination nodes. When minimal links are congested, OFR deroutes packets to carefully chosen intermediate destinations, in order to obtain travel paths which are only an additive constant longer than the shortest ones. Since routing performance is very sensitive to changes in the traffic model or in the router parameters, an accurate discrete-event simulator of the toroidal network has been developed to empirically validate OFR, by comparing it against other relevant routing strategies, over a range of typical real-world traffic patterns. On the 16x16x16 (4096 nodes) simulated network, OFR exhibits improvements of the maximum sustained throughput between 14% and 114%, with respect to Adaptive Bubble Routing. Comment: 9 pages, 5 figures, to be presented at ICPADS 201

    Power analysis with variable traffic loads for next generation interconnection networks

    Power consumption is the most important factor in the design of next generation supercomputers. With conventional networks, the power requirement can scale up to more than 300 MW (nearly equal to the output of one nuclear power plant). Hierarchical interconnection networks are a possible solution to this issue. 3D-TTN is a hierarchical interconnection network whose lowest level is configured as a 3D torus network, with 2D torus networks at the higher levels. The main focus of this paper is the power analysis of 3D-TTN under variable traffic load, along with its fault tolerance, cost, packing density and message traffic density, compared against various other networks. In our earlier research, 3D-TTN achieved about 21% better diameter performance and 12% better average distance performance, and required about 32.48% less router power at the lowest level than the 5D torus network for 1% traffic load. This paper compares power using both router and link power rather than router power only. Our analysis shows that 3D-TTN will require about 39.96% less router and link power than the 5D torus network for 10% traffic. With 30% traffic load, 3D-TTN will require about 38.42% less power than the 5D torus network for the on-chip network. With respect to some topological parameters, 3D-TTN also achieves desirable performance compared with other networks.

    Task mapping in rectangular twisted tori

    Twisted torus topologies have been proposed as an alternative to toroidal rectangular networks, improving distance parameters and providing network symmetry. However, twisting is apparently less amenable to task mapping algorithms of real life applications. In this paper we make an analytical study of different mapping and concentration techniques on 2D twisted tori that try to compensate for the twisted peripheral links. We introduce a performance model based on the network average distance and the detection of the set of links which receive the highest load. The model also considers the amount of local and global communications in the network. Our model shows that the twisted torus can improve latency and maximum throughput over rectangular torus, especially when global communications dominate over local ones and when some concentration is employed. Simulation results corroborate our synthetic model. For real applications from the NPB benchmark suite, the use of the twisted topologies with an appropriate mapping provides overall average application speedups of 2.9%, which increase to 4.9% when concentrated topologies (c = 2) are considered. This work has been supported by the Spanish Ministry of Science under contracts TIN2010-21291-C02-02, TIN-2007-60625, AP2010-4900 and CONSOLIDER Project CSD2007-00050, and by the European HiPEAC Network of Excellence. M. Moreto is supported by a MEC/Fulbright Fellowship. Postprint (author’s final draft
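
The claim that twisting improves distance parameters can be checked on a small case with breadth-first search. The sketch below assumes one common construction of a rectangular twisted torus: a 2a-by-a grid in which the wraparound links of the short dimension shift the long-dimension coordinate by a. This construction, the 8x4 size, and the function names are assumptions for illustration and may differ in detail from the topologies studied in the paper.

```python
from collections import deque

def torus_neighbors(x, y, w, h, twist):
    """Neighbors of (x, y) in a w-by-h torus whose Y wraparound shifts X by `twist`."""
    nbrs = [((x + 1) % w, y), ((x - 1) % w, y)]
    # Stepping up across the top edge applies the twist.
    nbrs.append((x, y + 1) if y + 1 < h else ((x + twist) % w, 0))
    # Stepping down across the bottom edge undoes it.
    nbrs.append((x, y - 1) if y > 0 else ((x - twist) % w, h - 1))
    return nbrs

def distance_stats(w, h, twist=0):
    """Return (average distance over distinct ordered pairs, diameter) via BFS."""
    nodes = [(x, y) for x in range(w) for y in range(h)]
    total, diameter = 0, 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in torus_neighbors(*cur, w, h, twist):
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        diameter = max(diameter, max(dist.values()))
    return total / (len(nodes) * (len(nodes) - 1)), diameter

print("rectangular 8x4:", distance_stats(8, 4, twist=0))
print("twisted 8x4:    ", distance_stats(8, 4, twist=4))
```

For this 32-node case the twist reduces the diameter from 6 to 4 and lowers the average distance, at the cost of a less regular coordinate layout for task mapping.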