FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short
We introduce FatPaths: a simple, generic, and robust routing architecture
that enables state-of-the-art low-diameter topologies such as Slim Fly to
achieve unprecedented performance. FatPaths targets Ethernet stacks in both HPC
supercomputers and cloud data centers and clusters. FatPaths exposes and
exploits the rich ("fat") diversity of both minimal and non-minimal paths for
high-performance multi-pathing. Moreover, FatPaths uses a redesigned "purified"
transport layer that removes virtually all TCP performance issues (e.g.,
slow start), and incorporates flowlet switching, a technique used to prevent
packet reordering in TCP networks, to enable very simple and effective load
balancing. Our design enables recent low-diameter topologies to outperform
powerful Clos designs, achieving 15% higher net throughput at 2x lower latency
for comparable cost. FatPaths will significantly accelerate Ethernet clusters,
which form more than 50% of the Top500 list, and it may become a standard
routing scheme for modern topologies.
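
The flowlet switching mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `FlowletBalancer`, the `gap` threshold, and the round-robin path choice are all assumptions made for illustration. The idea shown is only the general one: a flow stays on its current path while its packets arrive close together, and it may move to another path once it has been idle longer than a threshold, so packets within a burst are not reordered.

```python
class FlowletBalancer:
    """Toy flowlet switch (hypothetical): re-picks a path only after an idle gap."""

    def __init__(self, num_paths, gap):
        self.num_paths = num_paths  # number of candidate paths (assumed known)
        self.gap = gap              # idle threshold in seconds (assumed value)
        self._state = {}            # flow_id -> (last_seen_time, path_index)
        self._cursor = 0            # round-robin cursor (an assumed policy)

    def route(self, flow_id, now):
        """Return the path index for a packet of flow_id arriving at time now."""
        entry = self._state.get(flow_id)
        if entry is None or now - entry[0] > self.gap:
            # New flow, or idle long enough: start a new flowlet on the next path.
            path = self._cursor
            self._cursor = (self._cursor + 1) % self.num_paths
        else:
            # Same flowlet: stay on the current path to avoid reordering.
            path = entry[1]
        self._state[flow_id] = (now, path)
        return path


lb = FlowletBalancer(num_paths=3, gap=0.5)
first = lb.route("a", 0.0)     # new flow, gets a path
same = lb.route("a", 0.1)      # within the gap, keeps that path
moved = lb.route("a", 1.0)     # idle for 0.9 s, a new flowlet begins
```

A real switch would pick the next path by load or at random rather than round-robin, and would expire stale entries from the flow table; both are omitted here to keep the sketch short.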