253 research outputs found

    A general analytical model of adaptive wormhole routing in k-ary n-cubes

    Get PDF
    Several analytical models of fully adaptive routing have recently been proposed for k-ary n-cubes and hypercube networks under the uniform traffic pattern. Although,hypercube is a special case of k-ary n-cubes topology, the modeling approach for hypercube is more accurate than karyn-cubes due to its simpler structure. This paper proposes a general analytical model to predict message latency in wormhole-routed k-ary n-cubes with fully adaptive routing that uses a similar modeling approach to hypercube. The analysis focuses Duato's fully adaptive routing algorithm [12], which is widely accepted as the most general algorithm for achieving adaptivity in wormhole-routed networks while allowing for an efficient router implementation. The proposed model is general enough that it can be used for hypercube and other fully adaptive routing algorithms

    Analytical modelling of hot-spot traffic in deterministically-routed k-ary n-cubes

    Get PDF
    Many research studies have proposed analytical models to evaluate the performance of k-ary n-cubes with deterministic wormhole routing. Such models however have so far been confined to uniform traffic distributions. There has been hardly any model proposed that deal with non-uniform traffic distributions that could arise due to, for instance, the presence of hot-spots in the network. This paper proposes the first analytical model to predict message latency in k-ary n-cubes with deterministic routing in the presence of hot-spots. The validity of the model is demonstrated by comparing analytical results with those obtained through extensive simulation experiments

    K-ary n-cube based off-chip communications architecture for high-speed packet processors

    Get PDF
    A k-ary n-cube interconnect architecture is proposed, as an off-chip communications architecture for line cards, to increase the throughput of the currently used memory system. The k-ary n-cube architecture allows multiple packet processing elements on a line card to access multiple memory modules. The main advantage of the proposed architecture is that it can sustain current line rates and higher while distributing the load among multiple memories. Moreover, the proposed interconnect can scale to adopt more memories and/or processors and as a result increasing the line card processing power. Our results portray that k-ary n-cube sustained higher incoming traffic load while keeping latency lower than its shared-bus competitor. © 2005 IEEE

    K-ary n-cube based off-chip communications architecture for high-speed packet processors

    Get PDF
    We present a detailed study of Higgs boson production in association with a single top quark at the LHC, at next-to-leading order accuracy in QCD. We consider total and differential cross sections, at the parton level as well as by matching short distance events to parton showers, for both t-channel and s-channel production. We provide predictions relevant for the LHC at 13 TeV together with a thorough evaluation of the residual uncertainties coming from scale variation, parton distributions, strong coupling constant and heavy quark masses. In addition, for t-channel production, we compare results as obtained in the 4-flavour and 5-flavour schemes, pinning down the most relevant differences between them. Finally, we study the sensitivity to a non-standard-model relative phase between the Higgs couplings to the top quark and to the weak bosons

    Performance modeling of fault-tolerant circuit-switched communication networks

    Get PDF
    Circuit switching (CS) has been suggested as an efficient switching method for supporting simultaneous communications (such as data, voice, and images) across parallel systems due to its ability to preserve both communication performance and fault-tolerant demands in such systems. In this paper we present an efficient scheme to capture the mean message latency in 2D torus with CS in the presence of faulty components. We have also conducted extensive simulation experiments, the results of which are used to validate the analytical mode

    Analytical performance modelling of adaptive wormhole routing in the star interconnection network

    Get PDF
    The star graph was introduced as an attractive alternative to the well-known hypercube and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. Although several analytical models have been proposed in the literature for different interconnection networks, none of them have dealt with star graphs. This paper proposes the first analytical model to predict message latency in wormhole-switched star interconnection networks with fully adaptive routing. The analysis focuses on a fully adaptive routing algorithm which has shown to be the most effective for star graphs. The results obtained from simulation experiments confirm that the proposed model exhibits a good accuracy under different operating conditions

    General broadcasting algorithms in one-port wormhole routed hypercubes

    Full text link
    Wormhole routing has been accepted as an efficient switching mechanism in point-to-point interconnection networks. Here the network resource, i.e. node buffers and communication channels, are effectively utilized to deliver message across the network; We consider the problem of broadcasting a message in the hypercue equipped with the wormhole switching mechanism. The model is a generalization of an earlier work and considers a broadcast path-length of {dollar}m\ (1\leq m\leq n{dollar}) in the n-cube with a single-port communication capability. In this thesis, the scheme of e-cube and a Gray code path routing and intermediate reception capability have been adopted in order to solve the problem of broadcasting in one-port wormhole routed hypercubes. Two methods have been suggested; one is based on utilizing the Gray codes (Gray code path-based routing), while the other is based on the recursive partitioning of the cube (cube-based routing). The number of routing steps in both methods are compared to those in the previous results, as well as to the lower bounds derived based on the path-length m assumption. A cube-based and a path-based algorithm give {dollar}T(R)+(k\sb{c}+1)T(m){dollar} and {dollar}k\sb{G} +T(m){dollar} routing steps, respectively. By comparison with routing steps of both algorithms, the performance of the path-based algorithm shows better than that of the cube-based; The results of this work are significant and can be used for immediate implementation in contemporary machines most of which are equipped with wormhole routing and serial communication capability

    Performance analysis of wormhole switched interconnection networks with virtual channels and finite buffers

    Get PDF
    An efficient interconnection network that provides high bandwidth and low latency interprocessor communication is critical to harness fully the computational power of large scale multicomputer. K-ary n-cube networks have been widely adopted in contemporary multicomputers due to their desirable properties. As such, the present study focuses on a performance analysis of K-ary n-cubes employing wormhole switching, virtual channels, and adaptive routing. The objective of this dissertation is twofold: to examine the performance of these networks, and to compare the performance merits of various topologies under different working conditions, by means of analytical modelling. Most existing analytical models reported in the literature have used a method originally proposed by Dally to capture the effects of virtual channels on network performance. This method is based on a Markov chain and it has been shown that its prediction accuracy degrades as traffic increases. Moreover, these studies have also constrained the buffer capacity to a single flit per channel, a simplifying assumption that has often been invoked to ease the derivation of the analytical models. Motivated by these observations, the first part of this research proposes a new method for modelling virtual channels, based on an M/G/1 queue. Owing to the generality of this method. Daily's method is shown to be a special case when the message service time is exponentially distributed. The second part of this research uses theoretical results of queuing systems to relax the single-flit buffer assumption. New analytical models are then proposed to capture the effects of deploying arbitrary size buffers on the performance of deterministic and adaptive routing algorithms. Simulation experiments reveal that results from the proposed analytical models are in close agreement with those obtained through simulation. Building on these new analytical models, the third part of this research compares the relative performance merits of K-ary n-cubes under different operating conditions, in the presence of finite size buffers and multiple virtual channels. Namely, the analysis first revisits the relative performance merits of the well-known 2D torus, 3D torus and hypercube under different implementation constraints. The analysis has then been extended to investigate the performance impact of arranging the total buffer space, allocated to a physical channel, into multiple virtual channels. Finally, the performance of adaptive routing has been compared to that of deterministic routing. While previous similar studies have only taken account of channel and router costs, the present analysis incorporates different intra-router delays, as well, and thus generates more realistic results. In fact, the results of this research differ notably from those reported in previous studies, illustrating the sensitivity of such studies to the level of detail, degree of accuracy and the realism of the assumptions adopted

    OutFlank Routing: Increasing Throughput in Toroidal Interconnection Networks

    Full text link
    We present a new, deadlock-free, routing scheme for toroidal interconnection networks, called OutFlank Routing (OFR). OFR is an adaptive strategy which exploits non-minimal links, both in the source and in the destination nodes. When minimal links are congested, OFR deroutes packets to carefully chosen intermediate destinations, in order to obtain travel paths which are only an additive constant longer than the shortest ones. Since routing performance is very sensitive to changes in the traffic model or in the router parameters, an accurate discrete-event simulator of the toroidal network has been developed to empirically validate OFR, by comparing it against other relevant routing strategies, over a range of typical real-world traffic patterns. On the 16x16x16 (4096 nodes) simulated network OFR exhibits improvements of the maximum sustained throughput between 14% and 114%, with respect to Adaptive Bubble Routing.Comment: 9 pages, 5 figures, to be presented at ICPADS 201
    • …
    corecore