CLEX: Yet Another Supercomputer Architecture?
We propose the CLEX supercomputer topology and routing scheme. We prove that
CLEX can utilize a constant fraction of the total bandwidth for point-to-point
communication, at delays proportional to the sum of the number of intermediate
hops and the maximum physical distance between any two nodes. Moreover,
applying an asymmetric bandwidth assignment to the links, all-to-all
communication can be realized -optimally both with regard to
bandwidth and delays. This is achieved at node degrees of ,
for an arbitrarily small constant . In contrast, these
results are impossible in any network featuring constant or polylogarithmic
node degrees. Through simulation, we assess the benefits of an implementation
of the proposed communication strategy. Our results indicate that, for a
million processors, CLEX can increase bandwidth utilization and reduce average
routing path length by at least factors respectively in comparison to
a torus network. Furthermore, the CLEX communication scheme features several
other properties, such as deadlock-freedom, inherent fault-tolerance, and
canonical partition into smaller subsystems.
Symmetric Interconnection Networks from Cubic Crystal Lattices
Torus networks of moderate degree have been widely used in the supercomputer
industry. Tori are superb when used for executing applications that require
near-neighbor communications. Nevertheless, they are not so good when dealing
with global communications. Hence, typical 3D implementations have evolved to
5D networks, among other reasons, to reduce network distances. Most of these
big systems are mixed-radix tori which are not the best option for minimizing
distances and efficiently using network resources. This paper is focused on
improving the topological properties of these networks.
By using integral matrices to deal with Cayley graphs over Abelian groups, we
have been able to propose and analyze a family of high-dimensional grid-based
interconnection networks. As they are built over -dimensional grids that
induce a regular tiling of the space, these topologies have been denoted
lattice graphs. We will focus on cubic crystal lattices for modeling
symmetric 3D networks. Other higher dimensional networks can be composed over
these graphs, as illustrated in this research. Easy network partitioning can
also take advantage of this network composition operation. Minimal routing
algorithms are also provided for these new topologies. Finally, some practical
issues such as implementability and preliminary performance evaluations have
been addressed.
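The remark that mixed-radix tori are not the best option for minimizing distances can be checked with a short calculation. The sketch below is not taken from the paper: it only uses the standard facts that a ring of k nodes has diameter floor(k/2) and that the distances from one node to all k nodes (itself included) sum to floor(k^2/4), and that torus distances add across dimensions. The function names and the two 512-node example shapes are illustrative assumptions.

```python
from fractions import Fraction


def ring_diameter(k):
    """Largest hop count between two nodes of a k-node ring."""
    return k // 2


def ring_avg_distance(k):
    """Average hop count from one node to all k nodes (itself included).

    The distances from a fixed node sum to floor(k^2 / 4).
    """
    return Fraction((k * k) // 4, k)


def torus_metrics(radices):
    """Diameter and average distance of a torus with the given ring sizes.

    Torus distance is the sum of per-dimension ring distances, so both
    metrics add up across dimensions.
    """
    diameter = sum(ring_diameter(k) for k in radices)
    average = sum(ring_avg_distance(k) for k in radices)
    return diameter, average


# Two hypothetical 512-node 3D tori: one symmetric, one mixed-radix.
symmetric = torus_metrics((8, 8, 8))
mixed = torus_metrics((16, 8, 4))
print("8x8x8  :", symmetric)
print("16x8x4 :", mixed)
```

With the same node count, the mixed-radix shape has both a larger diameter and a larger average distance, which is the kind of gap the lattice-graph topologies aim to close.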
Information Spreading on Almost Torus Networks
Epidemic modeling has been extensively used in recent years in the field of
telecommunications and computer networks. We consider the popular
Susceptible-Infected-Susceptible spreading model as the metric for information
spreading. In this work, we analyze information spreading on a particular class
of networks denoted almost torus networks and over the lattice which can be
considered as the limit when the torus length goes to infinity. Almost torus
networks consist of the torus network topology where some nodes or edges have
been removed. We find explicit expressions for the characteristic polynomial of
these graphs and tight lower bounds for its computation. These expressions
allow us to estimate their spectral radius and thus how the information spreads
on these networks.
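To illustrate why the spectral radius matters here: in the usual mean-field approximation of the SIS model, an infection dies out when the effective spreading rate is below 1 divided by the spectral radius of the adjacency matrix. The sketch below is not the paper's method (which derives explicit characteristic polynomials); it estimates the spectral radius numerically by power iteration on a 2D k x k torus with some nodes removed. The function names, the 2D restriction, and the choice k = 4 are assumptions made for illustration.

```python
def torus_adjacency(k, removed=()):
    """Adjacency lists of a k x k torus (k >= 3) with some nodes removed."""
    removed = set(removed)
    nodes = [(x, y) for x in range(k) for y in range(k)
             if (x, y) not in removed]
    index = {node: i for i, node in enumerate(nodes)}
    neighbors = [[] for _ in nodes]
    for (x, y), i in index.items():
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            other = ((x + dx) % k, (y + dy) % k)
            if other in index:
                neighbors[i].append(index[other])
    return neighbors


def spectral_radius(neighbors, iterations=500):
    """Estimate the largest adjacency eigenvalue by power iteration.

    The iteration runs on A + I instead of A, because an even-sized
    torus is bipartite and plain power iteration on A may not converge.
    """
    n = len(neighbors)
    v = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        w = [v[i] + sum(v[j] for j in neighbors[i]) for i in range(n)]
        norm = max(w)
        v = [x / norm for x in w]
        estimate = norm - 1.0  # undo the shift by the identity
    return estimate


full = spectral_radius(torus_adjacency(4))
almost = spectral_radius(torus_adjacency(4, removed=[(0, 0)]))
print("full torus:   radius", full, "SIS threshold", 1.0 / full)
print("almost torus: radius", almost, "SIS threshold", 1.0 / almost)
```

Removing a node lowers the spectral radius and therefore raises the mean-field threshold, so information spreads slightly less easily on the almost torus than on the full one.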
OutFlank Routing: Increasing Throughput in Toroidal Interconnection Networks
We present a new, deadlock-free, routing scheme for toroidal interconnection
networks, called OutFlank Routing (OFR). OFR is an adaptive strategy which
exploits non-minimal links, both in the source and in the destination nodes.
When minimal links are congested, OFR deroutes packets to carefully chosen
intermediate destinations, in order to obtain travel paths which are only an
additive constant longer than the shortest ones. Since routing performance is
very sensitive to changes in the traffic model or in the router parameters, an
accurate discrete-event simulator of the toroidal network has been developed to
empirically validate OFR, by comparing it against other relevant routing
strategies, over a range of typical real-world traffic patterns. On the
16x16x16 (4096 nodes) simulated network, OFR exhibits improvements of the
maximum sustained throughput between 14% and 114%, with respect to Adaptive
Bubble Routing. Comment: 9 pages, 5 figures, to be presented at ICPADS 201
Power analysis with variable traffic loads for next generation interconnection networks
Power consumption is the most important factor in the design of next-generation supercomputers. With conventional networks, power requirements can scale beyond 300 MW, which is roughly equal to one nuclear power plant. Hierarchical interconnection networks are a possible solution to these issues. 3D-TTN is a hierarchical interconnection network whose lowest level is configured as a 3D torus network, with 2D torus networks at the higher levels. This paper focuses on power analysis under variable traffic load, along with the fault tolerance, cost, packing density, and message traffic density of 3D-TTN compared against various other networks. In our earlier research, 3D-TTN achieved about 21% better diameter performance and 12% better average distance performance, and required about 32.48% less router power at the lowest level than the 5D torus network at 1% traffic load. This paper compares router and link power together rather than router power only. Our analysis shows that 3D-TTN will require about 39.96% less router and link power than the 5D torus network at 10% traffic load. At 30% traffic load, 3D-TTN will require about 38.42% less power than the 5D torus network for the on-chip network. Considering some topological parameters, 3D-TTN could also achieve desirable performance compared with other networks.
Task mapping in rectangular twisted tori
Twisted torus topologies have been proposed as an alternative to toroidal rectangular networks, improving distance parameters and providing network symmetry. However, twisting is apparently less amenable to task mapping algorithms of real-life applications. In this paper we make an analytical study of different mapping and concentration techniques on 2D twisted tori that try to compensate for the twisted peripheral links. We introduce a performance model based on the network average distance and the detection of the set of links which receive the highest load. The model also considers the amount of local and global communications in the network. Our model shows that the twisted torus can improve latency and maximum throughput over the rectangular torus, especially when global communications dominate over local ones and when some concentration is employed. Simulation results corroborate our synthetic model. For real applications from the NPB benchmark suite, the use of the twisted topologies with an appropriate mapping provides overall average application speedups of 2.9%, which increase to 4.9% when concentrated topologies (c = 2) are considered. This work has been supported by the Spanish Ministry of Science under contracts TIN2010-21291-C02-02, TIN-2007-60625, AP2010-4900 and CONSOLIDER Project CSD2007-00050, and by the European HiPEAC Network of Excellence. M. Moreto is supported by a MEC/Fulbright Fellowship. Postprint (author’s final draft)
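The distance advantage of twisting mentioned in the abstract above can be reproduced on a small case. The sketch below is an illustration, not the paper's performance model: it assumes the common 2a x a rectangular twisted torus in which the vertical wraparound links are shifted by a columns, and it measures distances by breadth-first search from a single node (sufficient here because both graphs look the same from every node). The function names and the choice a = 4 are hypothetical.

```python
from collections import deque


def torus_neighbors(x, y, a, twist):
    """Neighbors of (x, y) in a 2a x a torus.

    twist = 0 gives the ordinary rectangular torus; twist = a gives the
    assumed twisted variant, where crossing the top or bottom edge also
    shifts the column by `twist`.
    """
    width, height = 2 * a, a
    yield (x + 1) % width, y
    yield (x - 1) % width, y
    if y + 1 < height:
        yield x, y + 1
    else:
        yield (x + twist) % width, 0
    if y > 0:
        yield x, y - 1
    else:
        yield (x - twist) % width, height - 1


def distance_profile(a, twist):
    """Return (diameter, average distance) seen from node (0, 0).

    The average includes the source itself at distance 0.
    """
    source = (0, 0)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x, y = queue.popleft()
        for nb in torus_neighbors(x, y, a, twist):
            if nb not in dist:
                dist[nb] = dist[(x, y)] + 1
                queue.append(nb)
    return max(dist.values()), sum(dist.values()) / len(dist)


rect_diameter, rect_average = distance_profile(4, twist=0)
twisted_diameter, twisted_average = distance_profile(4, twist=4)
print("rectangular 8x4:", rect_diameter, rect_average)
print("twisted 8x4:    ", twisted_diameter, twisted_average)
```

On this 32-node example the twist shortens both the diameter and the average distance. The mapping question studied in the paper is how much of that gain survives once application traffic is placed on the nodes.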