    Deterministic Routing with HoL-Blocking-Awareness for Direct Topologies

    Routing is a key design factor in obtaining the maximum performance from interconnection networks. Depending on the number of routing options that packets may use, routing algorithms are classified into two categories. If a packet can only use a single predetermined path, routing is deterministic, whereas if several paths are available, it is adaptive. It is well known that adaptive routing usually outperforms deterministic routing. However, adaptive routers are more complex and introduce out-of-order delivery of packets. In this paper, we take up the challenge of developing a deterministic routing algorithm for direct topologies that can achieve performance similar to that of adaptive routing, while providing the inherent advantages of deterministic routing, such as in-order delivery of packets and implementation simplicity. The proposed deterministic routing algorithm is aware of the HoL-blocking effect and is designed to reduce it, since HoL blocking is known to be a key contributor to degraded interconnection network performance.
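
    To make the deterministic/adaptive distinction concrete, the sketch below shows textbook dimension-order routing on a k-ary n-cube (torus), where each source/destination pair always yields the same single path. This is not the HoL-blocking-aware algorithm proposed in the paper; the function name `dor_path`, the coordinate representation, and the tie-breaking rule are assumptions made only for illustration.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def dor_path(src: Coord, dst: Coord, k: int) -> List[Coord]:
    """Return the single dimension-order path from src to dst in a k-ary n-cube.

    Hypothetical helper: dimensions are corrected one at a time, lowest first,
    taking the shorter way around each ring (ties go in the + direction).
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same number of dimensions")
    path = [tuple(src)]
    cur = list(src)
    for dim in range(len(src)):
        forward = (dst[dim] - cur[dim]) % k          # hops needed in the + direction
        step = 1 if forward <= k - forward else -1   # choose the shorter direction
        while cur[dim] != dst[dim]:
            cur[dim] = (cur[dim] + step) % k
            path.append(tuple(cur))
    return path


if __name__ == "__main__":
    # The same pair always produces the same path, so packets arrive in order.
    print(dor_path((0, 0), (2, 1), k=4))
```

    Because the path depends only on the source and destination, no per-packet routing state is needed, which is the implementation simplicity the abstract refers to.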

    Node-Type-Based Load-Balancing Routing for Parallel Generalized Fat-Trees

    High-Performance Computing (HPC) clusters are made up of a variety of node types (usually compute, I/O, service, and GPGPU nodes), and applications do not use nodes of different types in the same way. The resulting communication patterns reflect the organization of groups of nodes, and current routing algorithms that are optimal for all-to-all patterns will not always maximize performance for group-specific communications. Since application communication patterns are rarely available beforehand, we choose to rely on node types as a good guess for node usage. We provide a description of node type heterogeneity and analyse the performance degradation caused by an unfavorable distribution of nodes of the same type. We provide an extension to routing algorithms for Parallel Generalized Fat-Tree topologies (PGFTs) which balances load amongst groups of nodes of the same type. We show how it removes these performance issues by comparing results in a variety of situations against the corresponding classical algorithms.
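
    The effect of an unfavorable distribution of same-type nodes can be illustrated with a toy example. The sketch below is not the PGFT extension described in the paper: it only contrasts a destination-index-modulo uplink choice with a hypothetical choice based on each node's rank within its own type. All names (`uplink_by_index`, `uplink_by_type_rank`, `load_per_uplink`) and the 8-node layout are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List


def uplink_by_index(node_ids: List[int], num_uplinks: int) -> Dict[int, int]:
    """Type-oblivious choice: uplink = destination index modulo number of uplinks."""
    return {n: n % num_uplinks for n in node_ids}


def uplink_by_type_rank(node_types: Dict[int, str], num_uplinks: int) -> Dict[int, int]:
    """Type-aware choice: uplink = rank of the node within its own type, modulo uplinks."""
    seen: Counter = Counter()
    assignment: Dict[int, int] = {}
    for n in sorted(node_types):
        t = node_types[n]
        assignment[n] = seen[t] % num_uplinks
        seen[t] += 1
    return assignment


def load_per_uplink(assignment: Dict[int, int], nodes: Iterable[int]) -> Counter:
    """Count how many of the given destination nodes map to each uplink."""
    return Counter(assignment[n] for n in nodes)


# Assumed layout: I/O nodes happen to sit on every even index.
node_types = {i: ("io" if i % 2 == 0 else "compute") for i in range(8)}
io_nodes = [n for n, t in node_types.items() if t == "io"]

classical = uplink_by_index(list(node_types), num_uplinks=2)
type_aware = uplink_by_type_rank(node_types, num_uplinks=2)

if __name__ == "__main__":
    print("index-based, I/O traffic:", dict(load_per_uplink(classical, io_nodes)))
    print("type-aware,  I/O traffic:", dict(load_per_uplink(type_aware, io_nodes)))
```

    With this layout, the index-based choice sends traffic for all four I/O nodes through the same uplink, while the type-rank choice splits it evenly across both.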

    Efficient Congestion Control for HPC Networks with Adaptive Routing

    The interconnection network is the main element of high-performance computing (HPC) clusters and datacenters (DC), where thousands of nodes must communicate quickly and reliably. Network performance depends on several design choices, such as the topology, the routing algorithm, the switch architecture, etc. Highly efficient routing algorithms, either deterministic or adaptive, have been proposed in the literature to intelligently balance traffic flows depending on the network topology, but their performance drops in scenarios where congestion and its negative effects (for example, HoL blocking) appear. In particular, in scenarios where congestion is intense and persistent, HoL blocking can drastically degrade the performance of adaptive routing algorithms, since they may spread congested traffic flows across all the available routes. Moreover, as we have shown in previous studies, the spreading of congested flows can deteriorate the performance of the static queuing schemes used to reduce HoL blocking by separating flows into different queues of the switch buffer. Indeed, since these schemes rely on a static criterion, defined before traffic is injected into the network, they cannot prevent congested and non-congested flows from sharing queues when combined with adaptive routing. In this work, we propose using some existing static queuing schemes together with dynamic virtual channel (VC) assignment to isolate in a single VC the flows whose routes have been chosen adaptively, in order to prevent the impact of congestion from spreading across several routes. Basically, adapted flows are moved to a special adapted-flow channel (AFC), so that they do not interact with the flows assigned to other VCs by the static queuing scheme. In this way, the HoL blocking that adapted flows could cause to non-adapted flows is avoided, even if congested flows have spread across several routes. In addition, the static queuing scheme reduces, without any interference, the HoL blocking that may appear among non-adapted flows. To evaluate our proposal, we have run simulation experiments modeling large interconnection networks based on the fat-tree topology. From the results obtained, we can conclude that our technique efficiently and significantly reduces the impact of HoL blocking in interconnection networks that use adaptive routing and queuing schemes when congestion appears.
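
    The VC assignment rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the static queuing scheme, so a destination-modulo mapping is assumed here, and the `Packet` class, the `adapted` flag, and the convention of reserving the last VC as the AFC are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dst: int
    adapted: bool = False  # assumed flag: set once the packet has been routed adaptively


def select_vc(pkt: Packet, num_vcs: int) -> int:
    """Pick a virtual channel for a packet.

    Assumption: the last VC is reserved as the adapted-flow channel (AFC);
    the remaining VCs are shared by a static destination-modulo mapping.
    """
    if num_vcs < 2:
        raise ValueError("need at least one static VC plus the AFC")
    afc = num_vcs - 1
    if pkt.adapted:
        return afc                     # isolate adapted flows in the AFC
    return pkt.dst % (num_vcs - 1)     # static criterion for non-adapted flows


if __name__ == "__main__":
    print(select_vc(Packet(dst=5), num_vcs=4))                # static VC
    print(select_vc(Packet(dst=5, adapted=True), num_vcs=4))  # AFC
```

    Under this rule, a non-adapted packet can never land in the AFC, so adapted flows cannot cause HoL blocking to the flows handled by the static scheme.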

    Slim Fly: A Cost Effective Low-Diameter Network Topology

    We introduce a high-performance, cost-effective network topology called Slim Fly that approaches the theoretically optimal network diameter. Slim Fly is based on graphs that approximate the solution to the degree-diameter problem. We analyze Slim Fly and compare it to both traditional and state-of-the-art networks. Our analysis shows that Slim Fly has significant advantages over other topologies in latency, bandwidth, resiliency, cost, and power consumption. Finally, we propose deadlock-free routing schemes and physical layouts for large computing centers as well as a detailed cost and power model. Slim Fly enables constructing cost-effective and highly resilient datacenter and HPC networks that offer low latency and high bandwidth under different HPC workloads such as stencil or graph computations.
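
    The degree-diameter problem asks for the largest graph with a given maximum degree and diameter. The standard upper limit on that size is the Moore bound, which the short sketch below computes. The function name `moore_bound` is an assumption, and "degree" here means the number of switch-to-switch links per router, not the full router radix.

```python
def moore_bound(degree: int, diameter: int) -> int:
    """Upper bound on the number of vertices in a graph with the given
    maximum degree and diameter: 1 + d * sum_{i=0}^{D-1} (d - 1)^i."""
    if degree < 1 or diameter < 0:
        raise ValueError("degree must be >= 1 and diameter >= 0")
    return 1 + degree * sum((degree - 1) ** i for i in range(diameter))


if __name__ == "__main__":
    # For diameter 2 the bound simplifies to degree**2 + 1.
    for d in (3, 7, 16):
        print(d, moore_bound(d, 2))
```

    For example, `moore_bound(7, 2)` is 50, the size of the Hoffman-Singleton graph, one of the few graphs that meets the bound exactly. Topologies built on graphs that approximate the degree-diameter optimum aim to get close to this limit.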

    Distributed routing balancing for fat-tree topologies over InfiniBand networks

    Interconnection networks play an important role in the performance of high-performance systems. Currently, message routing management is a key factor in maintaining network performance. Our proposal is to work on an adaptive routing algorithm that distributes message routing to avoid the congestion problems that appear in interconnection networks due to the large volume of communications of scientific or commercial applications. The aim is to adapt the algorithm to a topology widely used in current systems, the fat-tree, and to implement it on InfiniBand technology. In our experiments, we compare the congestion control method of the InfiniBand architecture with our algorithm. The results obtained show that we improve latency by more than 50% and throughput by between 38% and 81%.

    A distributed algorithm to maintain and repair the trail networks of arboreal ants

    We study how the arboreal turtle ant (Cephalotes goniodontus) solves a fundamental computing problem: maintaining a trail network and finding alternative paths to route around broken links in the network. Turtle ants form a routing backbone of foraging trails linking several nests and temporary food sources. This species travels only in the trees, so its foraging trails are constrained to lie on a natural graph formed by overlapping branches and vines in the tangled canopy. Links between branches, however, can be ephemeral, easily destroyed by wind, rain, or animal movements. Here we report a biologically feasible distributed algorithm, parameterized using field data, that can plausibly describe how turtle ants maintain the routing backbone and find alternative paths to circumvent broken links in the backbone. We validate the ability of this probabilistic algorithm to circumvent simulated breaks in synthetic and real-world networks, and we derive an analytic explanation for why certain features are crucial to improve the algorithm's success. Our proposed algorithm uses fewer computational resources than common distributed graph search algorithms, and thus may be useful in other domains, such as for swarm computing or for coordinating molecular robots.

    The k-ary n-direct s-indirect family of topologies for large-scale interconnection networks

    The final publication is available at Springer via http://dx.doi.org/10.1007/s11227-016-1640-z. In large-scale supercomputers, the interconnection network plays a key role in system performance. Network topology largely determines the performance and cost of the interconnection network. Direct topologies are sometimes used due to their reduced hardware cost, but the number of network dimensions is limited by the physical 3D space, which leads to an increase in communication latency and a reduction in network throughput for large machines. Indirect topologies can provide better performance for large machines, but at a higher hardware cost. In this paper, we propose a new family of hybrid topologies, the k-ary n-direct s-indirect, that combines the best features of both direct and indirect topologies to efficiently connect an extremely high number of processing nodes. The proposed network is an n-dimensional topology where the k nodes of each dimension are connected through a small indirect topology of s stages. This combination results in a family of topologies that provides high performance, with latency and throughput figures of merit close to those of indirect topologies, but at a lower hardware cost. In particular, it doubles the throughput obtained per cost unit compared with indirect topologies in most cases. Moreover, its fault-tolerance degree is similar to the one achieved by direct topologies built with switches with the same number of ports.
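
    The grouping implied by "the k nodes of each dimension are connected through a small indirect topology" can be sketched by enumerating, for each dimension, the sets of k nodes that differ only in that coordinate. This is only an illustration of the node grouping: the internal structure of the s-stage indirect subnetwork is not modeled, and the function name `dimension_groups` and the coordinate-tuple representation are assumptions.

```python
from itertools import product
from typing import List, Tuple

Coord = Tuple[int, ...]


def dimension_groups(k: int, n: int) -> List[Tuple[int, List[Coord]]]:
    """Return (dimension, members) pairs, one per group of k nodes that share
    all coordinates except the one in `dimension`. In the hybrid topology each
    such group would be joined by its own small indirect network (not modeled)."""
    groups = []
    for dim in range(n):
        # Fix the other n-1 coordinates, then vary the coordinate in `dim`.
        for rest in product(range(k), repeat=n - 1):
            members = [rest[:dim] + (v,) + rest[dim:] for v in range(k)]
            groups.append((dim, members))
    return groups


if __name__ == "__main__":
    k, n = 3, 2
    groups = dimension_groups(k, n)
    print("nodes:", k ** n, "groups:", len(groups))
    for dim, members in groups:
        print(dim, members)
```

    With k nodes per dimension and n dimensions there are k^n nodes and n * k^(n-1) groups, and every node belongs to exactly n groups, one per dimension.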