8 research outputs found

Performance analysis and improvement of InfiniBand networks. Modelling and effective Quality-of-Service mechanisms for interconnection networks in cluster computing systems.

Author: Yan Shihang
Publication venue: Department of Computing, School of Computing, Informatics and Media
Publication date: 01/01/2012

The InfiniBand Architecture (IBA) has been proposed as a new industry standard offering high bandwidth and low latency, suitable for building high-performance interconnected cluster computing systems. The architecture replaces the traditional bus-based interconnect with a switch-based network for server Input-Output (I/O) and inter-processor communication. An efficient Quality-of-Service (QoS) mechanism is fundamental to ensuring the important QoS metrics, such as maximum throughput and minimum latency, leaving aside other aspects such as guarantees on delay, blocking probability, and mean queue length. Performance modelling and analysis has been, and continues to be, of great theoretical and practical importance in the design and development of communication networks. This thesis investigates efficient and cost-effective QoS mechanisms for performance analysis and improvement of InfiniBand networks in cluster-based computing systems. Firstly, a rate-based source-response link-by-link admission and congestion control function with an improved Explicit Congestion Notification (ECN) packet marking scheme is developed. This function uses rate control to reduce congestion of multiple-class traffic. Secondly, a credit-based flow control scheme is presented to improve the mean queue length, throughput and response time of the system. To evaluate the performance of this scheme, a new queueing network model is developed. Theoretical analysis and simulation experiments show that these two schemes are effective and suitable for InfiniBand networks. Finally, to obtain a thorough understanding of the performance attributes of the InfiniBand Architecture network, two efficient threshold function flow control mechanisms are proposed to enhance the QoS of InfiniBand networks: Entry Threshold, which sets a threshold for each entry in the arbitration table, and Arrival Job Threshold, which sets a threshold based on the number of jobs in each Virtual Lane. Furthermore, the principle of Maximum Entropy is adopted to analyse these two new mechanisms, with the Generalized Exponential (GE)-type distribution modelling the inter-arrival times and service times of the input traffic. Extensive simulation experiments are conducted to validate the accuracy of the analytical models.
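
For readers unfamiliar with the GE-type distribution named in this abstract, the sketch below may help. It assumes the form commonly used in maximum-entropy queueing analysis, F(t) = 1 - tau * exp(-tau * nu * t) with tau = 2 / (C^2 + 1), where 1/nu is the mean and C^2 is the squared coefficient of variation. The function names and parameters are illustrative and are not taken from the thesis.

```python
import random


def sample_ge(mean, scv, rng=random):
    """Draw one sample from a GE-type distribution (assumed form).

    With probability 1 - tau the sample is 0 (arrivals in a batch);
    otherwise it is exponential with rate tau / mean.
    Requires scv >= 1 so that tau lies in (0, 1].
    """
    if mean <= 0 or scv < 1:
        raise ValueError("need mean > 0 and scv >= 1")
    tau = 2.0 / (scv + 1.0)
    if rng.random() >= tau:
        return 0.0
    return rng.expovariate(tau / mean)


def empirical_moments(n, mean, scv, seed=1):
    """Return (sample mean, sample squared coefficient of variation)."""
    rng = random.Random(seed)
    xs = [sample_ge(mean, scv, rng) for _ in range(n)]
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    return m, var / (m * m)
```

Under these assumptions the sample mean and squared coefficient of variation should approach the requested `mean` and `scv` as `n` grows, which is the property that lets two moments fully parameterise the traffic model.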

Bradford Scholars

Hot-Spot Avoidance With Multi-Pathing Over Infiniband: An MPI Perspective

Author: Koop M
Mamidala A R
Moody A
Narravula S
Panda D K
Vishnu A
Publication venue: Lawrence Livermore National Laboratory
Publication date: 06/03/2007

Large-scale InfiniBand clusters are becoming increasingly popular, as reflected by the TOP 500 Supercomputer rankings. At the same time, the fat tree has become a popular interconnection topology for these clusters, since it makes multiple paths available between a pair of nodes. However, even with a fat tree, hot-spots may occur in the network depending on the route configuration between end nodes and the communication pattern(s) of the application. To make matters worse, the deterministic routing of InfiniBand prevents applications from transparently using multiple paths to avoid hot-spots in the network. Simulation-based studies of switches and adapters that implement congestion control have been proposed in the literature. However, these studies have focused on providing congestion control for the communication path, not on utilizing multiple paths in the network for hot-spot avoidance. In this paper, we design an MPI functionality that provides hot-spot avoidance for different communication patterns without a priori knowledge of the pattern. We leverage the LMC (LID Mask Count) mechanism of InfiniBand to create multiple paths in the network and present the design issues (scheduling policies, selecting the number of paths, scalability aspects) of our design. We implement our design and evaluate it with Pallas collective communication and MPI applications. On an InfiniBand cluster with 48 processes, collective operations such as MPI All-to-all Personalized and MPI Reduce Scatter show improvements of 27% and 19%, respectively. Our evaluation with MPI applications such as the NAS Parallel Benchmarks and PSTSWM on 64 processes shows significant improvement in execution time with this functionality.
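
As a rough illustration of the LMC idea in this abstract: a port configured with LMC = n answers to 2^n consecutive LIDs, and each LID can be routed along a different path. The sketch below enumerates those LIDs and rotates over them. The round-robin policy and all names here are hypothetical stand-ins, not the scheduling policies evaluated in the paper.

```python
def lmc_lids(base_lid, lmc):
    """Return the 2**lmc consecutive LIDs assigned to one port.

    Assumes base_lid has its low `lmc` bits clear, as the LMC scheme requires.
    """
    if not 0 <= lmc <= 7:
        raise ValueError("LMC is a 3-bit field (0..7)")
    if base_lid & ((1 << lmc) - 1):
        raise ValueError("base LID must be aligned to 2**lmc")
    return [base_lid + i for i in range(1 << lmc)]


class RoundRobinPaths:
    """Hypothetical scheduler: send successive messages to successive LIDs."""

    def __init__(self, base_lid, lmc):
        self.lids = lmc_lids(base_lid, lmc)
        self._next = 0

    def pick(self):
        """Return the destination LID for the next message."""
        lid = self.lids[self._next]
        self._next = (self._next + 1) % len(self.lids)
        return lid
```

For example, `RoundRobinPaths(16, 2)` cycles over LIDs 16 to 19, so four consecutive messages to the same destination port would each carry a different destination LID and could follow different routes if the subnet manager has programmed them that way.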

Crossref

UNT Digital Library

Balanceo distribuido del encaminamiento para topologías fat-tree sobre redes Infiniband [Distributed routing balancing for fat-tree topologies over InfiniBand networks]

Author: Franco Puntes Daniel
Mex Uc Belmar
Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius
Universitat Autònoma de Barcelona. Escola d'Enginyeria
Publication date: 01/01/2008

Interconnection networks play an important role in the performance of high-performance systems. Currently, message routing management is a key factor in maintaining network performance. Our proposal is to work on an adaptive routing algorithm that distributes message routing to avoid the congestion problems in interconnection networks that arise from the large communication volume of scientific or commercial applications. The aim is to adapt the algorithm to the fat-tree, a topology widely used in current systems, and to implement it on InfiniBand technology. In our experiments we compare the congestion control method of the InfiniBand architecture with our algorithm. The results show that latency improves by more than 50% and throughput by between 38% and 81%.

Diposit Digital de Documents de la UAB

Exploring InfiniBand Congestion Control

Author: Mahamud Ahmed Yusuf
Publication date: 01/01/2015

Congestion Control (CC) is used to achieve high performance and good utilization of network resources during high load in lossless interconnection networks. Without CC, congestion that starts at a single node can grow, spread and degrade the performance of the network. Congestion can affect both the contributors to the congestion and other traffic flows in the network. InfiniBand (IB) is one of the communication standards that support Congestion Control. The IB standard describes the CC functionality for detecting and resolving congestion. The behavior of the IB CC mechanism depends on the values of the CC parameters. These values determine characteristics such as how aggressive the congestion detection should be, the rate of feedback from the forwarding node detecting congestion to the contributors to the congestion, and how much and for how long the contributors should lower their injection rates. However, there are very few guidelines on how to set the CC parameter values for IB CC to be efficient. In this thesis, an experiment on a mesh network topology will be conducted using OMNeT++ as the simulation platform. A large amount of traffic will be generated and fed into the network until congestion occurs. Performance will be measured with InfiniBand congestion control disabled and enabled. The results from those simulations will be compared and analysed. The topology's host-to-switch link capacities are to be increased. There will be a search for proper IB CC parameters and, finally, we will learn more about how IB CC parameters influence performance by focusing on some of the parameters.
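
To make the "how much and for how long" part of this abstract concrete, here is a simplified model of how a traffic source might react to congestion feedback. It assumes the commonly described scheme in which each congestion notification raises an index into a table of injection delays and a periodic timer lowers it again. The class, the parameter names and the table values are illustrative assumptions, not settings from the thesis or the IB specification.

```python
class SourceThrottle:
    """Simplified source-side reaction to congestion notifications."""

    def __init__(self, delay_table, index_increase=1, index_min=0):
        if not delay_table:
            raise ValueError("delay_table must not be empty")
        if not 0 <= index_min < len(delay_table):
            raise ValueError("index_min out of range")
        self.delay_table = list(delay_table)  # inter-packet delay per index
        self.index_increase = index_increase  # step applied per notification
        self.index_min = index_min            # floor the timer decays towards
        self.index = index_min

    def on_notification(self):
        """Congestion feedback arrived: throttle harder, up to the table end."""
        limit = len(self.delay_table) - 1
        self.index = min(self.index + self.index_increase, limit)

    def on_timer(self):
        """Recovery timer fired: relax the throttle by one step."""
        self.index = max(self.index - 1, self.index_min)

    def injection_delay(self):
        """Current delay to insert between packets of the throttled flow."""
        return self.delay_table[self.index]
```

In this model a larger `index_increase` makes the source back off more aggressively per notification, while the timer period (driven by whoever calls `on_timer`) controls how long the reduced injection rate lasts. That is the kind of trade-off a parameter search would explore.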

NORA - Norwegian Open Research Archives

Control de congestión adaptativo en redes Infiniband [Adaptive congestion control in InfiniBand networks]

Author: Franco Puntes Daniel
Lugones Diego Fernando
Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius
Universitat Autònoma de Barcelona. Escola d'Enginyeria
Publication date: 01/01/2007

The use of shared resources in high-performance interconnection networks can cause message congestion that notably degrades performance, increasing transport latency and reducing network utilization. So far, the techniques that attempt to solve this problem rely on regulating message injection. This injection limiting moves message contention from the switches to the source nodes, increasing the global average latency, which can reach very high values. In this work, we propose a congestion control technique for InfiniBand networks based on an adaptive routing mechanism that distributes the communication volume across several alternative paths, removing load from the congested area and thereby eliminating the congestion. The experiments performed show the improvement obtained in latency and throughput with respect to InfiniBand's original congestion control mechanism, which is based on injection regulation. The proposed mechanism is fully compatible and requires no changes to the specification, because it uses management components defined in the InfiniBand standard.

Diposit Digital de Documents de la UAB

Topology Agnostic Methods for Routing, Reconfiguration and Virtualization of Interconnection Networks

Author: Solheim Åshild Grønstad
Publication date: 01/01/2012

Modern computing systems, such as supercomputers, data centers and multicore chips, generally require efficient communication between their different system units; tolerance towards component faults; flexibility to expand or merge; and a high utilization of their resources. Interconnection networks are used in a variety of such computing systems to enable communication between their diverse system units. The main objectives of this thesis are to investigate and propose new or improved solutions for topology agnostic routing and reconfiguration of interconnection networks. In addition, topology agnostic routing and reconfiguration algorithms are utilized in the development of new and flexible approaches to processor allocation. The thesis aims to present versatile solutions that can be used for the interconnection networks of a number of different computing systems. No particular routing algorithm was specified for an interconnection network technology which is now incorporated in Dolphin Express. The thesis states a set of criteria for a suitable routing algorithm, evaluates a number of existing routing algorithms, and recommends that one of the algorithms, which fulfils all of the criteria, be used. Further investigations demonstrate how this routing algorithm inherently supports fault-tolerance, and how it can be optimized for some network topologies. These considerations are also relevant for the InfiniBand interconnection network technology. Reconfiguration of interconnection networks (change of routing function) is a deadlock-prone process. Some existing reconfiguration strategies include deadlock avoidance mechanisms that significantly reduce the network service offered to running applications. The thesis expands the area of application for one of the most versatile and efficient reconfiguration algorithms available in the literature, and proposes an optimization of this algorithm that improves the network service offered to running applications. Moreover, a new reconfiguration algorithm is presented that supports a replacement of the routing function without causing performance penalties. Processor allocation strategies that guarantee traffic-containment commonly pose strict requirements on the shape of partitions, and thus achieve only a limited utilization of a system's computing resources. The thesis introduces two new approaches that are more flexible. Both approaches utilize the properties of a topology agnostic routing algorithm to enforce traffic-containment within arbitrarily shaped partitions. Consequently, high resource utilization as well as isolation of traffic between different partitions is achieved.

NORA - Norwegian Open Research Archives

Adaptive Multipath Routing for Congestion Control in InfiniBand Networks

Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref