An In-Depth Analysis of the Slingshot Interconnect
The interconnect is one of the most critical components in large-scale
computing systems, and its impact on the performance of applications is going
to increase with the system size. In this paper, we will describe Slingshot, an
interconnection network for large-scale computing systems. Slingshot is based
on high-radix switches, which allow building exascale and hyperscale
datacenter networks with at most three switch-to-switch hops. Moreover,
Slingshot provides efficient adaptive routing and congestion control
algorithms, and highly tunable traffic classes. Slingshot uses an optimized
Ethernet protocol, which allows it to be interoperable with standard Ethernet
devices while providing high performance to HPC applications. We analyze the
extent to which Slingshot provides these features, evaluating it on
microbenchmarks and on several applications from the datacenter and AI worlds,
as well as on HPC applications. We find that applications running on Slingshot
are less affected by congestion compared to previous-generation networks.
Comment: To be published in Proceedings of The International Conference for
High Performance Computing, Networking, Storage, and Analysis (SC '20) (2020)
A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network
Novel low-diameter network topologies such as Slim Fly (SF) offer significant
cost and power advantages over the established Fat Tree, Clos, or Dragonfly. To
spearhead the adoption of low-diameter networks, we design, implement, deploy,
and evaluate the first real-world SF installation. We focus on deployment,
management, and operational aspects of our test cluster with 200 servers and
carefully analyze performance. We demonstrate techniques for simple cabling and
cabling validation as well as a novel high-performance routing architecture for
InfiniBand-based low-diameter topologies. Our real-world benchmarks show SF's
strong performance for many modern workloads such as deep neural network
training, graph analytics, or linear algebra kernels. SF outperforms
non-blocking Fat Trees in scalability while offering comparable or better
performance and lower cost for large network sizes. Our work can facilitate
deploying SF while the associated (open-source) routing architecture is fully
portable and can be applied to accelerate any low-diameter interconnect.