112 research outputs found

    New Fault Tolerant Multicast Routing Techniques to Enhance Distributed-Memory Systems Performance

    Distributed-memory systems are key to achieving high-performance computing and are among the most favored architectures for advanced research problems. Mesh-connected multicomputers are one of the most popular architectures and have been implemented in many distributed-memory systems. These systems must support communication operations efficiently to achieve good performance. The wormhole switching technique, in which a packet is divided into small flits, has been widely used in the design of distributed-memory systems. Multicast communication, in which one source node sends the same message to several destination nodes, is also widely used in these systems. Fault tolerance refers to the ability of the system to operate correctly in the presence of faults. Developing fault-tolerant multicast routing algorithms for 2D mesh networks is an important issue. This dissertation presents new fault-tolerant multicast routing algorithms that improve the performance of distributed-memory systems built on wormhole-routed 2D meshes. The algorithms are described for 2D mesh networks, but they can also be extended to other topologies. They combine a unicast-based multicast algorithm with tree-based multicast algorithms. They work effectively for the most commonly encountered fault models in mesh networks: f-rings, f-chains, and concave fault regions. The proposed routing algorithms are shown to be effective even in the presence of many fault regions and large fault regions. The algorithms are proved to be deadlock-free, and the problem of overlapping fault regions is solved. Four essential performance metrics for mesh networks are considered and calculated. The algorithms perform limited-global-information-based multicasting, a compromise between the local-information-based and global-information-based approaches. Data mining is used to validate the results and to enlarge the sample. The proposed multicast routing techniques are used to enhance the performance of distributed-memory systems, and simulation results are presented to demonstrate the efficiency of the proposed algorithms.

    A performance model of multicast communication in wormhole-routed networks on-chip

    Collective communication operations form a part of the overall traffic in most applications running on platforms that employ direct interconnection networks. This paper presents a novel analytical model for computing the communication latency of multicast, a widely used collective communication operation. The novelty of the model lies in its ability to predict multicast latency in wormhole-routed architectures that employ an asynchronous multi-port router scheme. The model is applied to the Quarc NoC, and its validity is verified by comparing the model's predictions against results obtained from a discrete-event simulator developed using OMNeT++.

    A communication model of broadcast in wormhole-routed networks on-chip

    This paper presents a novel analytical model for computing the communication latency of broadcast, the most fundamental collective communication operation. The novelty of the model lies in its ability to predict broadcast latency in wormhole-routed architectures that employ an asynchronous multi-port router scheme. The model is applied to the Quarc NoC, and its validity is verified by comparing the model's predictions against results obtained from a discrete-event simulator developed using OMNeT++.

    On the performance of routing algorithms in wormhole-switched multicomputer networks

    This paper presents a comparative performance study of adaptive and deterministic routing algorithms in wormhole-switched hypercubes and investigates how the performance of these routing schemes varies under a range of network operating conditions. In contrast to previously reported results, our results show that adaptive routing does not consistently outperform deterministic routing, even in high-dimensional networks. In fact, the superiority of adaptive routing appears to depend strongly on the broadcast traffic rate generated at each node, and it begins to deteriorate as the rate of generated broadcast messages grows.

    The Effect Of Hot Spots On The Performance Of Mesh-Based Networks

    The performance of direct networks is affected by several design parameters, including the number of virtual channels, number of ports, routing algorithm, switching technique, deadlock handling technique, packet size, and buffer size. Another factor that affects network performance is the traffic pattern. In this thesis, we study the effect of hotspot traffic on system performance. Specifically, we study the effect of the hotspot factor, the number of hotspots, and the hotspot location on the performance of mesh-based networks. Simulations are run on two network topologies, the mesh and the torus. We pay more attention to meshes because they are widely used in commercial machines. Comparisons between oblivious wormhole switching and chaotic packet switching are reported. Overall, packet switching proved more efficient in terms of throughput than wormhole switching. Under uniform random traffic, chaotic and oblivious routing are shown to be indistinguishable in performance. Networks with a small number of hotspots show better performance, and as the number of hotspots increases, network latency tends to increase. When the hotspot factor increases, packet switching performs better than wormhole switching. The location of hotspots also affects network performance, particularly with oblivious routers, whose latencies proved more vulnerable to changes in hotspot location. The smaller the network, the earlier saturation occurs, and the chaos router's adaptivity is shown to be useful in this case. Finally, for tori, performance is not greatly affected by the presence of hotspots, mostly because of the symmetric nature of tori.
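
    As a rough illustration of the hotspot traffic pattern discussed in the abstract above, the following Python sketch picks message destinations on a 2D mesh. It assumes that the hotspot factor is the probability that a message is addressed to a hotspot node, a common convention that the abstract itself does not state. The names `mesh_nodes` and `pick_destination` and all parameters are hypothetical and are not taken from the thesis.

```python
import random


def mesh_nodes(width, height):
    """Return all (x, y) node coordinates of a width x height mesh."""
    return [(x, y) for x in range(width) for y in range(height)]


def pick_destination(src, nodes, hotspots, hotspot_factor, rng):
    """Pick a destination for a message generated at node `src`.

    Assumption: with probability `hotspot_factor` the message goes to one
    of the hotspot nodes (chosen uniformly); otherwise it goes to any node
    other than `src`, chosen uniformly (which may also be a hotspot).
    """
    # A node never sends a hotspot message to itself.
    hot = [n for n in hotspots if n != src]
    if hot and rng.random() < hotspot_factor:
        return rng.choice(hot)
    # Background traffic: uniform over all nodes except the source.
    return rng.choice([n for n in nodes if n != src])
```

    With `hotspot_factor` set to 0 the generator degenerates to uniform random traffic, which is the baseline case the abstract compares against.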

    On the performance of broadcast algorithms in interconnection networks

    Broadcast communication is among the most primitive collective capabilities of any message-passing network. Broadcast algorithms for the mesh have been widely reported in the literature. However, most existing algorithms have been studied under limited conditions, such as light traffic load and fixed network sizes. In other words, most of these algorithms have not been studied at different Quality of Service (QoS) levels. In contrast, this study examines the broadcast operation, taking into account scalability, parallelism, and a wide range of traffic loads through the propagation of broadcast messages. To the best of our knowledge, this study is the first to consider broadcast latency at both the network and node levels across different traffic loads. Results from a comparative analysis confirm that the coded-path-based broadcast algorithms exhibit superior performance characteristics over some existing algorithms.

    High-Speed Message Routing Mechanisms for Massively Parallel Computers

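    Massively parallel processing (MPP) systems are now moving into many fields that were once the stronghold of traditional vector processors and SIMD machines. These systems take advantage of the rapid progress of readily available high-performance CPUs, connecting hundreds to thousands of them into a homogeneous multiprocessor system. However, their performance on real-world problems is not necessarily good, and at present it always falls far short of the nominal peak performance. Because all interprocessor communication in these systems is carried by the interconnection network, the decisive factors determining the achievable peak performance are the interconnection network and the communication mechanisms used in it. This dissertation proposes two methods for realizing efficient communication mechanisms for MPP interconnection networks. The first is the "express router" (特急ルータ), whose suitability for use in interconnection networks is examined. The express router uses multiple unidirectional register-insertion paths to realize a network based on hybrid time-space division. An accurate model was developed for predicting the performance of the express router's switch and buffer circuits for different radices and numbers of dimensions. The results confirm that the express router satisfies all the conditions for efficient communication. More importantly, the express router can perform routing that preserves low latency and high throughput even when the network contains faults or traffic is congested. Simulation confirms that the express router outperforms previously published fixed-path (deterministic) routers and matches or exceeds other adaptive routers. The second is path-length-limited multicast communication. Multicast is a communication scheme that contributes to speedup in many parallel processing problems, so the deadlock problem that arises for multicast under wormhole routing was studied. As a solution, path-length-limited multicast is proposed, its communication performance is evaluated by simulation, and it is compared with multicast implemented by unicast-based and multipath-based schemes. The results show that the proposed path-length-limited multicast is a particularly good solution for multicast to a cluster for barrier synchronization, multicast to nearest-neighbor nodes, and broadcast to all nodes.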

    Efficient Multicast Algorithms for Mesh and Torus Networks

    With the increasing popularity of multicomputers, efficient communication among their processors has become a popular area of research. A multicomputer is a computer system with multiple processors; such systems have high computational power and can perform multiple tasks concurrently. Mesh and torus are among the network topologies commonly used to build multicomputer systems. Their performance depends heavily on the underlying network communication operations, such as multicast. Multicast is a communication method in which a message is sent from a source node to a certain number of destinations. Two major parameters are used to evaluate multicast: time, the duration a multicast process takes to deliver the message to all destinations, and traffic, the number of links used in the process. Research indicates that, in general, finding an optimal multicast algorithm that is efficient in both time and traffic is NP-complete. This thesis proposes two new algorithms for multicast in mesh and torus networks. Extensive simulations show that in practice they perform better than existing ones.
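
    To make the time and traffic metrics from the abstract above concrete, here is a minimal Python sketch that evaluates a naive baseline: multicast implemented as separate unicasts with dimension-order (XY) routing on a 2D mesh. This is not one of the thesis's algorithms. It assumes time can be approximated by the longest path in hops and traffic by the number of link traversals; the function names are hypothetical.

```python
def xy_path(src, dst):
    """Return the directed links used by XY routing from src to dst.

    The message first travels along the x dimension, then along y.
    Each link is a pair of adjacent (x, y) node coordinates.
    """
    x, y = src
    dx, dy = dst
    links = []
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def multicast_metrics(src, dests):
    """Evaluate unicast-based multicast under simplifying assumptions.

    Returns (time_hops, traffic, distinct_links):
      time_hops      - hops on the longest source-to-destination path
      traffic        - total link traversals over all unicast messages
      distinct_links - number of different directed links touched
    """
    paths = [xy_path(src, d) for d in dests]
    time_hops = max((len(p) for p in paths), default=0)
    traffic = sum(len(p) for p in paths)
    distinct_links = len({link for p in paths for link in p})
    return time_hops, traffic, distinct_links
```

    For example, `multicast_metrics((0, 0), [(2, 0), (2, 1), (0, 3)])` returns `(3, 8, 6)`. The gap between 8 traversals and 6 distinct links is the redundancy that tree-based or path-based multicast schemes try to remove.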