4 research outputs found

    New Theory for Deadlock-Free Multicast Routing in Wormhole-Switched Virtual-Channelless Networks-on-Chip

    Get PDF
    A new theory for deadlock-free multicast routing especially used for on-chip interconnection network (NoC) is presented in this paper. The NoC router hardware solution that enables the deadlock-free multicast routing without utilizing virtual channels is\ud introduced formally. The special characteristic of the NoC is that, wormhole packets can cut-through at flit-level and can be interleaved in the same channel with other flits of different packets by multiplexing it using a rotating flit-by-flit arbitration. The routing paths of each flit can be guaranteed correct because flits belonging to the same packet are labeled with the same local Id-tag on every communication channel. Hence, multicast deadlock problem can be solved at each router by further applying a hold-release tagging\ud mechanism to control and manage conflicting multicast requests

    Performance evaluation of distributed crossbar switch hypermesh

    Get PDF
    The interconnection network is one of the most crucial components in any multicomputer as it greatly influences the overall system performance. Several recent studies have suggested that hypergraph networks, such as the Distributed Crossbar Switch Hypermesh (DCSH), exhibit superior topological and performance characteristics over many traditional graph networks, e.g. k-ary n-cubes. Previous work on the DCSH has focused on issues related to implementation and performance comparisons with existing networks. These comparisons have so far been confined to deterministic routing and unicast (one-to-one) communication. Using analytical models validated through simulation experiments, this thesis extends that analysis to include adaptive routing and broadcast communication. The study concentrates on wormhole switching, which has been widely adopted in practical multicomputers, thanks to its low buffering requirement and the reduced dependence of latency on distance under low traffic. Adaptive routing has recently been proposed as a means of improving network performance, but while the comparative evaluation of adaptive and deterministic routing has been widely reported in the literature, the focus has been on graph networks. The first part of this thesis deals with adaptive routing, developing an analytical model to measure latency in the DCSH, and which is used throughout the rest of the work for performance comparisons. Also, an investigation of different routing algorithms in this network is presented. Conventional k-ary n-cubes have been the underlying topology of contemporary multicomputers, but it is only recently that adaptive routing has been incorporated into such systems. The thesis studies the relative performance merits of the DCSH and k-ary n-cubes under adaptive routing strategy. The analysis takes into consideration real-world factors, such as router complexity and bandwidth constraints imposed by implementation technology. However, in any network, the routing of unicast messages is not the only factor in traffic control. In many situations (for example, parallel iterative algorithms, memory update and invalidation procedures in shared memory systems, global notification of network errors), there is a significant requirement for broadcast traffic. The DCSH, by virtue of its use of hypergraph links, can implement broadcast operations particularly efficiently. The second part of the thesis examines how the DCSH and k-ary n-cube performance is affected by the presence of a broadcast traffic component. In general, these studies demonstrate that because of their relatively high diameter, k-ary n-cubes perform poorly when message lengths are short. This is consistent with earlier more simplistic analyses which led to the proposal for the express-cube, an enhancement of the basic k-ary n-cube structure, which provides additional express channels, allowing messages to bypass groups of nodes along their paths. The final part of the thesis investigates whether this "partial bypassing" can compete with the "total bypassing" capability provided inherently by the DCSH topology

    On Near Optimal Time and Dynamic Delay and Delay Variation Multicast Algorithms

    Get PDF
    Multicast is one of the most prevalent communication modes in computer networks. A plethora of systems and applications today rely on multicast communication to disseminate traffic including but not limited to teleconferencing, videoconferencing, stock exchanges, supercomputers, software update distribution, distributed database systems, and gaming. This dissertation elaborates and addresses key research challenges and problems related to the design and implementation of multicast algorithms. In particular, it investigates the problems of (1) Designing near optimal multicast time algorithms for mesh and torus connected systems and (2) Designing efficient algorithms for Delay and Delay Variation Bounded Multicast (DVBM). To achieve the first goal, improvements on four tree based multicast algorithms are made: Modified PAIR (MPAIR), Modified DIAG (MDIAG), Modified MIN (MMIN), and Modified DIST (MDIST). The proof that MDIAG generates optimal or optimal plus one multicast time in 2-Dimensional (2D) mesh networks is provided. The hybrid version of MDIAG (HMDIAG) is designed, that gives a 3-additive approximation algorithm on multicast time in 2D torus networks. To make HMDIAG applicable on systems using higher dimensional meshes and tori, it is extended and the proof that it gives a (2n-1)-additive approximation algorithm on multicast time in nD torus networks is given. To address the second goal, Directional Core Selection (DCS) algorithm for core selection and DVBM Tree generation is designed. To further reduce the delay variation of trees generated by DCS, a k-shortest-path based algorithm, Build Lower Variation Tree (BLVT) is designed. To tackle dynamic join/leave requests to the ongoing multicast session, the dynamic version of both algorithms is given that responds to requests by reorganizing the tree and avoiding session disruption. To solve cases where single-core based algorithms fail to construct a DVBM tree, a dynamic three-phase algorithm, Multi-core DVBM Trees (MCDVBMT) is designed, that semi-matches group members to core nodes

    Performance analysis of wormhole routing in multicomputer interconnection networks

    Get PDF
    Perhaps the most critical component in determining the ultimate performance potential of a multicomputer is its interconnection network, the hardware fabric supporting communication among individual processors. The message latency and throughput of such a network are affected by many factors of which topology, switching method, routing algorithm and traffic load are the most significant. In this context, the present study focuses on a performance analysis of k-ary n-cube networks employing wormhole switching, virtual channels and adaptive routing, a scenario of especial interest to current research. This project aims to build upon earlier work in two main ways: constructing new analytical models for k-ary n-cubes, and comparing the performance merits of cubes of different dimensionality. To this end, some important topological properties of k-ary n-cubes are explored initially; in particular, expressions are derived to calculate the number of nodes at/within a given distance from a chosen centre. These results are important in their own right but their primary significance here is to assist in the construction of new and more realistic analytical models of wormhole-routed k-ary n-cubes. An accurate analytical model for wormhole-routed k-ary n-cubes with adaptive routing and uniform traffic is then developed, incorporating the use of virtual channels and the effect of locality in the traffic pattern. New models are constructed for wormhole k-ary n-cubes, with the ability to simulate behaviour under adaptive routing and non-uniform communication workloads, such as hotspot traffic, matrix-transpose and digit-reversal permutation patterns. The models are equally applicable to unidirectional and bidirectional k-ary n-cubes and are significantly more realistic than any in use up to now. With this level of accuracy, the effect of each important network parameter on the overall network performance can be investigated in a more comprehensive manner than before. Finally, k-ary n-cubes of different dimensionality are compared using the new models. The comparison takes account of various traffic patterns and implementation costs, using both pin-out and bisection bandwidth as metrics. Networks with both normal and pipelined channels are considered. While previous similar studies have only taken account of network channel costs, our model incorporates router costs as well thus generating more realistic results. In fact the results of this work differ markedly from those yielded by earlier studies which assumed deterministic routing and uniform traffic, illustrating the importance of using accurate models to conduct such analyses
    corecore