21 research outputs found
Distributed Computing in the Asynchronous LOCAL model
The LOCAL model is among the main models for studying locality in the
framework of distributed network computing. This model is however subject to
pertinent criticisms, including the facts that all nodes wake up
simultaneously, perform in lock steps, and are failure-free. We show that
relaxing these hypotheses to some extent does not hurt local computing. In
particular, we show that, for any construction task associated to a locally
checkable labeling (LCL), if is solvable in rounds in the LOCAL model,
then remains solvable in rounds in the asynchronous LOCAL model.
This improves the result by Casta\~neda et al. [SSS 2016], which was restricted
to 3-coloring the rings. More generally, the main contribution of this paper is
to show that, perhaps surprisingly, asynchrony and failures in the computations
do not restrict the power of the LOCAL model, as long as the communications
remain synchronous and failure-free
New fault-tolerant routing algorithms for k-ary n-cube networks
The interconnection network is one of the most crucial components in a multicomputer as it greatly influences the overall system performance. Networks belonging to the family of k-ary n-cubes (e.g., tori and hypercubes) have been widely adopted in practical machines due to their desirable properties, including a low diameter, symmetry, regularity, and ability to exploit communication locality found in many real-world parallel applications. A routing algorithm specifies how a message selects a path to cross from source to destination, and has great impact on network performance. Routing in fault-free networks has been extensively studied in the past. As the network size scales up the probability of processor and link failure also increases. It is therefore essential to design fault-tolerant routing algorithms that allow messages to reach their destinations even in the presence of faulty components (links and nodes). Although many fault-tolerant routing algorithms have been proposed for common multicomputer networks, e.g. hypercubes and meshes, little research has been devoted to developing fault-tolerant routing for well-known versions of k-ary n-cubes, such as 2 and 3- dimensional tori. Previous work on fault-tolerant routing has focused on designing algorithms with strict conditions imposed on the number of faulty components (nodes and links) or their locations in the network. Most existing fault-tolerant routing algorithms have assumed that a node knows either only the status of its neighbours (such a model is called local-information-based) or the status of all nodes (global-information-based). The main challenge is to devise a simple and efficient way of representing limited global fault information that allows optimal or near-optimal fault-tolerant routing. This thesis proposes two new limited-global-information-based fault-tolerant routing algorithms for k-ary n-cubes, namely the unsafety vectors and probability vectors algorithms. While the first algorithm uses a deterministic approach, which has been widely employed by other existing algorithms, the second algorithm is the first that uses probability-based fault- tolerant routing. These two algorithms have two important advantages over those already existing in the relevant literature. Both algorithms ensure fault-tolerance under relaxed assumptions, regarding the number of faulty components and their locations in the network. Furthermore, the new algorithms are more general in that they can easily be adapted to different topologies, including those that belong to the family of k-ary n-cubes (e.g. tori and hypercubes) and those that do not (e.g., generalised hypercubes and meshes). Since very little work has considered fault-tolerant routing in k-ary n-cubes, this study compares the relative performance merits of the two proposed algorithms, the unsafety and probability vectors, on these networks. The results reveal that for practical number of faulty nodes, both algorithms achieve good performance levels. However, the probability vectors algorithm has the advantage of being simpler to implement. Since previous research has focused mostly on the hypercube, this study adapts the new algorithms to the hypercube in order to conduct a comparative study against the recently proposed safety vectors algorithm. Results from extensive simulation experiments demonstrate that our algorithms exhibit superior performance in terms of reachability (chances of a message reaching its destination), deviation from optimality (average difference between minimum distance and actual routing distance), and looping (chances of a message continuously looping in the network without reaching destination) to the safety vectors
Design and analysis of a 3-dimensional cluster multicomputer architecture using optical interconnection for petaFLOP computing
In this dissertation, the design and analyses of an extremely scalable distributed
multicomputer architecture, using optical interconnects, that has the potential to
deliver in the order of petaFLOP performance is presented in detail. The design
takes advantage of optical technologies, harnessing the features inherent in optics,
to produce a 3D stack that implements efficiently a large, fully connected system of
nodes forming a true 3D architecture. To adopt optics in large-scale multiprocessor
cluster systems, efficient routing and scheduling techniques are needed. To this
end, novel self-routing strategies for all-optical packet switched networks and on-line
scheduling methods that can result in collision free communication and achieve real
time operation in high-speed multiprocessor systems are proposed. The system is designed
to allow failed/faulty nodes to stay in place without appreciable performance
degradation. The approach is to develop a dynamic communication environment that
will be able to effectively adapt and evolve with a high density of missing units or
nodes. A joint CPU/bandwidth controller that maximizes the resource allocation in
this dynamic computing environment is introduced with an objective to optimize the
distributed cluster architecture, preventing performance/system degradation in the
presence of failed/faulty nodes. A thorough analysis, feasibility study and description of the characteristics of a 3-Dimensional multicomputer system capable of achieving
100 teraFLOP performance is discussed in detail. Included in this dissertation is
throughput analysis of the routing schemes, using methods from discrete-time queuing
systems and computer simulation results for the different proposed algorithms. A
prototype of the 3D architecture proposed is built and a test bed developed to obtain
experimental results to further prove the feasibility of the design, validate initial assumptions,
algorithms, simulations and the optimized distributed resource allocation
scheme. Finally, as a prelude to further research, an efficient data routing strategy
for highly scalable distributed mobile multiprocessor networks is introduced
Near-optimal broadcast in all-port wormhole-routed hypercubes using error-correcting codes
A new broadcasting method is presented for hypercubes with wormhole routing mechanism. The communication model assumed allows an n-dimensional hypercube to have at most n concurrent I/O communication along its ports. It assumes a distance insensitivity of (n + 1) with no intermediate reception capability for the nodes. The approach is based on determination of the set of nodes called stations in the hypercube. Once stations are identified, node disjoint paths are formed from the source to all stations. The broadcasting is accomplished first by sending the message to all stations, which will inform the rest of the nodes. To establish node-disjoint paths between the source node and all stations, we introduce a new routing strategy. We prove that multicasting can be done in one routing step as long as the number of destination nodes are at most n in an n-dimensional hypercube. The number of broadcasting steps using our routing is equal to or smaller than that obtained in an earlier work; this number is optimal for all hypercube dimensions n ≤ 12, except for n = 10
High-Speed Message Routing Mechanisms for Massively Parallel Computers
現在超並列処理システム(MPP)は、伝統的なベクトルプロセッサやSIMDマシンの
牙城であった多くの分野に進出している。これらのシステムは、入手が容易な高性能
CPUの急激な進歩をうまく利用し、これらを数百~数千個接続して均質なマルチプ
ロセッサのシステムとして構成したものである。しかし、これらのシステムの性能は、
現実の問題を解くときは必ずしも良くなく、常に公称の最高性能にははるかに及ばな
いのが現状である。これらのシステムではプロセッサ間の通信はすべて相互結合網に
よって行われるので、実現可能な最高性能を決める決定的な要素は相互結合網と、そ
れに使われる通信機構である。
本論文ではMPPの相互結合網に使われる、効率的な通信機構を実現する2つの方法
を提案する。第1は「特急ルータ」の提案であり、これを相互結合網に用いた場合の
適合性を検註する。特急ルータは多重の単方向レジスタ挿入パスを利用して、時間
空間混合分割型ネットワークを実現するためのものである。異なる基数や次元数につ
いて、特急ルータのスイッチ回路とバッファ回路の性能を予測するための正確なモデ
ルを開発した。この結果、特急ルータは効率的な通信を行うためのすべての条件を満
足していることが確かめられた。さらに重要な点は、特急ルータはネットワークに故
障のある場合や、通信が錯綜する場合にも、低遅延時間、高スループットを損なわな
い経路制御が行えることである。シミュレーションによって評価した特急ルータのの
性能は、これまでに発表された固定経路選択方式のルータより優れており、また他の
適応経路制御方式のルータに比べても、同程度あるいはそれを越えていることが確か
められた。
第2は経路長制限方式のマルチキャスト通信の提案である。マルチキャスト通信は
多くの並列処理問題において速度向上に寄与する通信方式である。そこでワームホー
ル通信方式において問題となるマルチキャスト通信におけるデッドロックの問題につ
いて研究した。そしてこの問題を解決する方法として経路長制限方式のマルチキャス
ト通信を提案し、この方式による通信性能をシミュレーションによって評価し、ユニ
キャスト方式やマルチパス方式によるマルチキャスト通信の性能と比較した。その結
果、提案する経路長制限方式のマルチキャスト通信は、パリヤ同期のためのクラスタ
へのマルチキャスト通信や、最近傍ノードへのマルチキャストや全ノードへの放送の
場合に、特に優れた解決法となることを明らかにした
Analysis of wormhole routings in cayley graphs of permutation groups.
Over a decade, a new class of switching technology, called wormhole routing, has been investigated in the multicomputer interconnection network field. Several classes of wormhole routing algorithms have been proposed. Most of the algorithms have been centered on the traditional binary hypercube, k-ary n-cube mesh, and torus networks. In the design of a wormhole routing algorithm, deadlock avoidance scheme is the main concern. Recently, new classes of networks called Cayley graphs of permutation groups are considered very promising alternatives. Although proposed Cayley networks have superior topological properties over the traditional network topologies, the design of the deadlock-free wormhole routing algorithm in these networks is not simple. In this dissertation, we investigate deadlock free wormhole routing algorithms in the several classes of Cayley networks, such as complete-transposition and star networks. We evaluate several classes of routing algorithms on these networks, and compare the performance of each algorithm to the simulation study. Also, the performances of these networks are compared to the traditional networks. Through extensive simulation we found that adaptive algorithm outperformed deterministic algorithm in general with more virtual channels. On the network performance comparison, the complete transposition network showed the best performance among the similar sized networks, and the binary hypercube performed better compared to the star graph