
    Analytical performance modelling of adaptive wormhole routing in the star interconnection network

    The star graph was introduced as an attractive alternative to the well-known hypercube, and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. Although several analytical models have been proposed in the literature for different interconnection networks, none of them has dealt with star graphs. This paper proposes the first analytical model to predict message latency in wormhole-switched star interconnection networks with fully adaptive routing. The analysis focuses on a fully adaptive routing algorithm which has been shown to be the most effective for star graphs. The results obtained from simulation experiments confirm that the proposed model exhibits good accuracy under different operating conditions.
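    The star graph topology being modelled has a simple combinatorial definition: the nodes of S_n are the n! permutations of n symbols, and two nodes are adjacent when one is obtained from the other by swapping the first symbol with the symbol in some position i. The minimal Python sketch below enumerates this topology and computes hop distances by breadth-first search; it only illustrates the network itself, not the latency model or the adaptive routing algorithm analysed in the paper.

```python
from collections import deque

def neighbours(perm):
    """Neighbours in the star graph S_n: swap the first symbol
    with the symbol at position i, for i = 2..n."""
    p = list(perm)
    for i in range(1, len(p)):
        q = p[:]
        q[0], q[i] = q[i], q[0]
        yield tuple(q)

def hop_distances(source):
    """Breadth-first search giving hop distances from `source`
    to every node (permutation) of the star graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

if __name__ == "__main__":
    n = 4
    source = tuple(range(1, n + 1))          # node (1, 2, 3, 4)
    dist = hop_distances(source)
    print(len(dist), max(dist.values()))     # 24 nodes in S_4, diameter 4
```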

    Performance analysis of wormhole routing in multicomputer interconnection networks

    Perhaps the most critical component in determining the ultimate performance potential of a multicomputer is its interconnection network, the hardware fabric supporting communication among individual processors. The message latency and throughput of such a network are affected by many factors, of which topology, switching method, routing algorithm and traffic load are the most significant. In this context, the present study focuses on a performance analysis of k-ary n-cube networks employing wormhole switching, virtual channels and adaptive routing, a scenario of special interest to current research. This project aims to build upon earlier work in two main ways: constructing new analytical models for k-ary n-cubes, and comparing the performance merits of cubes of different dimensionality. To this end, some important topological properties of k-ary n-cubes are explored initially; in particular, expressions are derived to calculate the number of nodes at, or within, a given distance from a chosen centre. These results are important in their own right, but their primary significance here is to assist in the construction of new and more realistic analytical models of wormhole-routed k-ary n-cubes. An accurate analytical model for wormhole-routed k-ary n-cubes with adaptive routing and uniform traffic is then developed, incorporating the use of virtual channels and the effect of locality in the traffic pattern. New models are constructed for wormhole-routed k-ary n-cubes, with the ability to simulate behaviour under adaptive routing and non-uniform communication workloads, such as hotspot traffic, matrix-transpose and digit-reversal permutation patterns. The models are equally applicable to unidirectional and bidirectional k-ary n-cubes and are significantly more realistic than any in use up to now. With this level of accuracy, the effect of each important network parameter on overall network performance can be investigated in a more comprehensive manner than before. Finally, k-ary n-cubes of different dimensionality are compared using the new models. The comparison takes account of various traffic patterns and implementation costs, using both pin-out and bisection bandwidth as metrics. Networks with both normal and pipelined channels are considered. While previous similar studies have only taken account of network channel costs, our model incorporates router costs as well, thus generating more realistic results. In fact, the results of this work differ markedly from those yielded by earlier studies which assumed deterministic routing and uniform traffic, illustrating the importance of using accurate models to conduct such analyses.
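    One concrete ingredient mentioned above is the number of nodes at, or within, a given distance of a chosen centre in a k-ary n-cube. A minimal brute-force sketch of that count for a small bidirectional torus is given below (the per-dimension distance is the shorter way around each ring); it is only a verification aid under these assumptions, not the closed-form expressions derived in the thesis.

```python
from collections import Counter
from itertools import product

def ring_distance(a, b, k):
    """Shortest distance between positions a and b on a k-node ring."""
    d = abs(a - b)
    return min(d, k - d)

def distance_histogram(k, n):
    """Number of nodes at each distance from a fixed centre node
    in a bidirectional k-ary n-cube (torus), counted by brute force."""
    counts = Counter()
    for node in product(range(k), repeat=n):
        counts[sum(ring_distance(x, 0, k) for x in node)] += 1
    return counts

if __name__ == "__main__":
    hist = distance_histogram(k=4, n=3)        # 64-node 3-D torus
    print(dict(sorted(hist.items())))          # {0: 1, 1: 6, 2: 15, ...}
    within = sum(v for d, v in hist.items() if d <= 2)
    print(within)                              # nodes within distance 2
```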

    Contention and achieved performance in multicomputer wormhole routing networks


    Small-world interconnection networks for large parallel computer systems

    The use of small-world graphs as interconnection networks of multicomputers is proposed and analysed in this work. Small-world interconnection networks are constructed by adding (or modifying) edges to an underlying local graph. Graphs with a rich local structure but with a large diameter are shown to be the most suitable candidates for the underlying graph. Generation models based on random and deterministic wiring processes are proposed and analysed. For the random case, basic properties such as degree, diameter, average path length and bisection width are analysed, and the results show that a fast transition from a large diameter to a small diameter occurs as the number of new edges introduced is increased. Random traffic analysis on these networks is undertaken, and it is shown that although the average latency experiences a similar reduction, networks with a small number of shortcuts have a tendency to saturate as most of the traffic flows through a small number of links. An analysis of the congestion of the networks corroborates this result and provides a way of estimating the minimum number of shortcuts required to avoid saturation. To overcome these problems, deterministic wiring is proposed and analysed. A Linear Feedback Shift Register (LFSR) is used to introduce shortcuts in the LFSR graphs. A simple routing algorithm has been constructed for the LFSR graphs and extended with a greedy local optimisation technique. It has been shown that a small search depth gives good results and is less costly to implement than a full shortest-path algorithm. The Hilbert graph, on the other hand, provides some additional characteristics, such as support for incremental expansion, an efficient layout in two-dimensional space (using two layers), and a small fixed degree of four. Small-world hypergraphs have also been studied. In particular, incomplete hypermeshes have been introduced and analysed, and it has been shown that they outperform the complete traditional implementations under a constant-pinout argument. Since it has been shown that complete hypermeshes outperform the mesh, the torus, low-dimensional m-ary d-cubes (with and without bypass channels), and multi-stage interconnection networks (when realistic decision times are accounted for and with a constant pinout), it follows that incomplete hypermeshes outperform them as well.
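    A minimal sketch of the random-wiring generation model described above: start from a ring lattice (rich local structure, large diameter), add a given number of random shortcut edges, and measure the diameter and mean hop distance by breadth-first search. The graph size and shortcut counts are arbitrary illustration values, not those used in the thesis, and the LFSR and Hilbert constructions are not reproduced here.

```python
import random
from collections import deque

def ring_lattice(n, k=2):
    """Underlying local graph: a ring where each node is linked to its
    k nearest neighbours on each side (rich local structure, large diameter)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for i in range(1, k + 1):
            adj[v].add((v + i) % n)
            adj[(v + i) % n].add(v)
    return adj

def add_random_shortcuts(adj, count, seed=0):
    """Random wiring step: add `count` extra edges between uniformly
    chosen node pairs (the small-world shortcuts)."""
    rng = random.Random(seed)
    nodes = list(adj)
    added = 0
    while added < count:
        u, v = rng.sample(nodes, 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1

def diameter_and_mean_distance(adj):
    """Exact diameter and mean hop distance via BFS from every node."""
    diam, total, pairs = 0, 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        diam = max(diam, max(dist.values()))
        total += sum(dist.values())
        pairs += len(dist) - 1
    return diam, total / pairs

if __name__ == "__main__":
    for shortcuts in (0, 8, 32, 128):
        graph = ring_lattice(256)
        add_random_shortcuts(graph, shortcuts)
        print(shortcuts, diameter_and_mean_distance(graph))
```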

    System level modelling and design of hypergraph based wireless system area networks for multi-computer systems

    This thesis deals with issues pertaining to wireless multicomputer interconnection networks, namely topology and Medium Access Control (MAC). It argues that a new channel assignment technique based on regular low-dimensional hypergraph networks, the dual-radio wireless hypermesh, represents a promising high-performance alternative for future multicomputers to the shared-communication-medium networks and ordinary wireless mesh networks that have been widely used in current wireless systems. The focus of this work is on improving network throughput while maintaining a relatively low latency for the wireless network system. By means of a Carrier Sense Multiple Access (CSMA) based design of the MAC protocol, and building on the desirable features of the hypermesh network topology, a relatively high-performance network is introduced. Compared to the CSMA shared-communication-channel model, which is currently the de facto MAC protocol for most wireless networks, our design is shown to achieve a significant increase in network throughput with lower average network latency for a large number of communicating nodes. SystemC models of the proposed wireless hypermesh, validated through mathematical models, are then introduced. The analysis has been incorporated into a SystemC design methodology, which facilitates the integration of communication modelling into design modelling at the early stages of system development. Another important application of the SystemC modelling techniques is to perform meaningful comparative studies of different protocols or new implementations, to determine which communication scenario performs better, and to modify models in order to test system sensitivity and tune performance. The effects of different design parameters (e.g., packet sizes and number of nodes) have been examined throughout this work. The results show that the proposed structure outperforms the existing shared-medium network structure and can support a larger number of wirelessly connected computers than conventional networks.
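    The throughput argument above can be illustrated with a deliberately simplified model. The sketch below is a toy slotted-contention simulation, not the thesis's CSMA MAC protocol or its SystemC model: nodes are statically spread across one or more channels, and a slot on a channel succeeds only when exactly one node transmits there. It merely shows why partitioning contention across several parallel channels, as in a hypermesh-like topology, raises aggregate throughput compared with a single shared medium.

```python
import random

def contention_throughput(nodes, channels, p, slots, seed=0):
    """Toy slotted contention model: every backlogged node attempts a
    transmission with probability p on its statically assigned channel;
    a slot on a channel carries a packet only if exactly one node
    transmitted there (otherwise the slot is idle or a collision)."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(slots):
        attempts = [0] * channels
        for node in range(nodes):
            if rng.random() < p:
                attempts[node % channels] += 1
        successes += sum(1 for a in attempts if a == 1)
    return successes / slots          # successful packets per slot

if __name__ == "__main__":
    # One shared medium versus several parallel channels for 64 nodes
    for channels in (1, 4, 8):
        print(channels, round(contention_throughput(64, channels, 0.05, 20000), 3))
```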

    New fault-tolerant routing algorithms for k-ary n-cube networks

    The interconnection network is one of the most crucial components in a multicomputer as it greatly influences the overall system performance. Networks belonging to the family of k-ary n-cubes (e.g., tori and hypercubes) have been widely adopted in practical machines due to their desirable properties, including a low diameter, symmetry, regularity, and the ability to exploit the communication locality found in many real-world parallel applications. A routing algorithm specifies how a message selects a path from source to destination, and has a great impact on network performance. Routing in fault-free networks has been extensively studied in the past. As the network size scales up, the probability of processor and link failure also increases. It is therefore essential to design fault-tolerant routing algorithms that allow messages to reach their destinations even in the presence of faulty components (links and nodes). Although many fault-tolerant routing algorithms have been proposed for common multicomputer networks, e.g. hypercubes and meshes, little research has been devoted to developing fault-tolerant routing for well-known versions of k-ary n-cubes, such as 2- and 3-dimensional tori. Previous work on fault-tolerant routing has focused on designing algorithms with strict conditions imposed on the number of faulty components (nodes and links) or their locations in the network. Most existing fault-tolerant routing algorithms have assumed that a node knows either only the status of its neighbours (such a model is called local-information-based) or the status of all nodes (global-information-based). The main challenge is to devise a simple and efficient way of representing limited global fault information that allows optimal or near-optimal fault-tolerant routing. This thesis proposes two new limited-global-information-based fault-tolerant routing algorithms for k-ary n-cubes, namely the unsafety vectors and probability vectors algorithms. While the first algorithm uses a deterministic approach, which has been widely employed by other existing algorithms, the second algorithm is the first to use probability-based fault-tolerant routing. These two algorithms have two important advantages over those already existing in the relevant literature. Both algorithms ensure fault tolerance under relaxed assumptions regarding the number of faulty components and their locations in the network. Furthermore, the new algorithms are more general in that they can easily be adapted to different topologies, including those that belong to the family of k-ary n-cubes (e.g. tori and hypercubes) and those that do not (e.g., generalised hypercubes and meshes). Since very little work has considered fault-tolerant routing in k-ary n-cubes, this study compares the relative performance merits of the two proposed algorithms, the unsafety and probability vectors, on these networks. The results reveal that for a practical number of faulty nodes, both algorithms achieve good performance levels. However, the probability vectors algorithm has the advantage of being simpler to implement. Since previous research has focused mostly on the hypercube, this study adapts the new algorithms to the hypercube in order to conduct a comparative study against the recently proposed safety vectors algorithm. Results from extensive simulation experiments demonstrate that our algorithms exhibit superior performance to the safety vectors algorithm in terms of reachability (the chance of a message reaching its destination), deviation from optimality (the average difference between the minimum distance and the actual routing distance), and looping (the chance of a message continuously looping in the network without reaching its destination).
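    The evaluation above rests on metrics such as reachability, deviation from optimality, and looping. The sketch below shows how such metrics can be measured in a simple simulation of an n-dimensional hypercube with randomly placed faulty nodes; the routing function used is only a naive local-information greedy baseline for illustration, not the unsafety vectors, probability vectors, or safety vectors algorithms discussed in the thesis.

```python
import random

def hamming(a, b):
    """Minimum hop distance between two hypercube nodes."""
    return bin(a ^ b).count("1")

def greedy_route(src, dst, n, faulty, max_hops=64):
    """Naive local-information routing in an n-dimensional hypercube:
    prefer a fault-free neighbour that reduces the distance to `dst`,
    otherwise detour through any fault-free neighbour (may loop)."""
    cur, hops = src, 0
    while cur != dst and hops < max_hops:
        nbrs = [cur ^ (1 << i) for i in range(n)]
        safe = [v for v in nbrs if v not in faulty]
        forward = [v for v in safe if hamming(v, dst) < hamming(cur, dst)]
        if forward:
            cur = forward[0]
        elif safe:
            cur = random.choice(safe)     # detour: risk of looping
        else:
            break                         # message is blocked
        hops += 1
    return cur == dst, hops

if __name__ == "__main__":
    random.seed(1)
    n, trials, fault_count = 8, 5000, 20
    attempted = reached = deviation = 0
    for _ in range(trials):
        faulty = set(random.sample(range(1, 2 ** n), fault_count))
        dst = random.randrange(1, 2 ** n)
        if dst in faulty:
            continue
        attempted += 1
        ok, hops = greedy_route(0, dst, n, faulty)
        if ok:
            reached += 1
            deviation += hops - hamming(0, dst)
    print("reachability:", reached / attempted)
    print("average deviation from optimality:", deviation / max(reached, 1))
```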

    Analysis and extension of a fault-tolerant NoC for SRAM-based FPGAs in space applications

    Data processing units for scientific space missions need to process ever higher volumes of data and perform ever more complex calculations. However, the performance of available space-qualified general-purpose processors lies only in the lower three-digit megahertz range, which is already insufficient for some applications. As an alternative, suitable processing steps can be implemented in hardware on a space-qualified SRAM-based FPGA. However, suitable devices are susceptible to space radiation. At the Institute for Communication and Network Engineering, a fault-tolerant, network-based communication architecture was developed which enables the construction of processing chains from different processing modules within suitable SRAM-based FPGAs and also allows the exchange of individual processing modules at runtime. The communication architecture and its protocol are intended to isolate modules without, or with only partial, SEU mitigation that are affected by radiation-induced faults, in order to prevent errors from propagating into the rest of the System-on-Chip. In the context of an ESA study, this communication architecture was extended with further components and implemented on a representative hardware platform. Based on the experience acquired during that study, this work analyses the actual fault-tolerance characteristics as well as the weak points of this initial implementation. At appropriate locations, the communication architecture was extended with mechanisms for fault detection and fault differentiation, as well as with a hardware-based monitoring solution. Both these measures and the extension of the hardware platform with selective fault-injection capabilities, for emulating radiation-induced faults within critical areas of a processing module without SEU mitigation, are used to evaluate the effects of radiation-induced faults within the communication architecture. Based on the gathered results, further measures for the fast detection and isolation of faulty nodes are developed, selectively implemented and verified. In particular, the ability of the communication architecture to isolate network nodes without SEU mitigation could be significantly improved.