    Deterministic 1-k routing on meshes with applications to worm-hole routing

    In $1$-$k$ routing, each of the $n^2$ processing units of an $n \times n$ mesh-connected computer initially holds one packet, and the packets must be routed such that any processor is the destination of at most $k$ packets. This problem reflects the practical desire for routing better than the popular routing of permutations. $1$-$k$ routing also has implications for hot-potato worm-hole routing, which is of great importance for real-world systems. We present a near-optimal deterministic algorithm running in $\sqrt{k} \cdot n / 2 + O(n)$ steps. We give a second algorithm with slightly worse routing time but working queue size three. Applying this algorithm considerably reduces the routing time of hot-potato worm-hole routing. Non-trivial extensions are given to the general $l$-$k$ routing problem and for routing on higher-dimensional meshes. Finally, we show that $k$-$k$ routing can be performed in $O(k \cdot n)$ steps with working queue size four. Hereby the hot-potato worm-hole routing problem can be solved in $O(k^{3/2} \cdot n)$ steps.

    Randomized Routing and Sorting on the Reconfigurable Mesh

    In this paper we demonstrate the power of reconfiguration by presenting efficient randomized algorithms for both packet routing and sorting on a reconfigurable mesh connected computer (referred to simply as the mesh from here on). The run times of these algorithms are better than the best achievable time bounds on a conventional mesh. In particular, we show that the permutation routing problem can be solved on a linear array of size n in (3/4)n steps, whereas n-1 is the best possible run time without reconfiguration. We also show that permutation routing on an n x n reconfigurable mesh can be done in time n + o(n) using a randomized algorithm or in time 1.25n + o(n) deterministically. In contrast, 2n-2 is the diameter of a conventional mesh, and hence routing and sorting will need at least 2n-2 steps on a conventional mesh. In addition, we show that the problem of sorting can be solved in time n + o(n). All these time bounds hold with high probability. The bisection lower bound for both sorting and routing on the mesh is n/2, and hence our algorithms have nearly optimal time bounds.
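
    The n-1 figure quoted above for a conventional linear array can be illustrated with a small simulation. The following is a minimal sketch, not taken from the paper: it assumes a bidirectional linear array, one packet per node, and a simple greedy store-and-forward rule (each packet moves one hop toward its destination per step). The function name `route_linear_array` is hypothetical.

```python
def route_linear_array(dest):
    """Greedy routing of a permutation on a conventional linear array.

    dest[p] is the destination node of the packet that starts at node p.
    Returns the number of synchronous steps until every packet has arrived.
    Assumption: each link carries one packet per direction per step.
    """
    n = len(dest)
    if sorted(dest) != list(range(n)):
        raise ValueError("dest must be a permutation of 0..n-1")

    pos = list(range(n))  # pos[p] = current node of packet p
    steps = 0
    while any(pos[p] != dest[p] for p in range(n)):
        used_links = set()  # directed links (node, direction) used this step
        new_pos = pos[:]
        for p in range(n):
            if pos[p] == dest[p]:
                continue  # packet already delivered
            direction = 1 if dest[p] > pos[p] else -1
            link = (pos[p], direction)
            # With one packet per node initially, packets travelling in the
            # same direction move in lockstep and never share a link.
            assert link not in used_links
            used_links.add(link)
            new_pos[p] = pos[p] + direction
        pos = new_pos
        steps += 1
    return steps


if __name__ == "__main__":
    n = 8
    reversal = list(range(n - 1, -1, -1))  # node i sends to node n-1-i
    print(route_linear_array(reversal))    # worst case: n - 1 steps
```

    The reversal permutation forces the packet at one end to travel to the other end, so no algorithm on the conventional array can finish it in fewer than n-1 steps; the reconfigurable model in the abstract avoids this distance limit by using buses.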

    Mesh Connected Computers With Multiple Fixed Buses: Packet Routing, Sorting and Selection

    Mesh connected computers have become attractive models of computing because of their varied special features. In this paper we consider two variations of the mesh model: 1) a mesh with fixed buses, and 2) a mesh with reconfigurable buses. Both these models have been the subject matter of extensive previous research. We solve numerous important problems related to packet routing, sorting, and selection on these models. In particular, we provide lower bounds and very nearly matching upper bounds for the following problems on both these models: 1) Routing on a linear array; and 2) k-k routing, k-k sorting, and cut through routing on a 2D mesh for any k ≥ 12. We provide an improved algorithm for 1-1 routing and a matching sorting algorithm. In addition we present greedy algorithms for 1-1 routing, k-k routing, cut through routing, and k-k sorting that are better on average and supply matching lower bounds. We also show that sorting can be performed in logarithmic time on a mesh with fixed buses. As a consequence we present an optimal randomized selection algorithm. In addition we provide a selection algorithm for the mesh with reconfigurable buses whose time bound is significantly better than the existing ones. Our algorithms have considerably better time bounds than many existing best known algorithms

    Randomized Algorithms For Packet Routing on the Mesh

    Packet routing is an important problem of parallel computing since a fast algorithm for packet routing will imply 1) fast inter-processor communication, and 2) fast algorithms for emulating ideal models like PRAMs on fixed connection machines. There are three different models of packet routing, namely 1) Store and forward, 2) Multipacket, and 3) Cut through. In this paper we provide a survey of the best known randomized algorithms for store and forward routing, k-k routing, and cut through routing on mesh connected computers.

    k-k Routing, k-k Sorting, and Cut Through Routing on the Mesh

    In this paper we present randomized algorithms for k-k routing, k-k sorting, and cut through routing. The stated resource bounds hold with high probability. The algorithm for k-k routing runs in [k/2]n+o(kn) steps. We also show that k-k sorting can be accomplished within [k/2] n+n+o(kn) steps, and cut through routing can be done in [3/4]kn+[3/2]n+o(kn) steps. The best known time bounds (prior to this paper) for all these three problems were kn+o(kn). [kn/2] is a known lower bound for all the three problems (which is the bisection bound), and hence our algorithms are very nearly optimal. All the above mentioned algorithms have optimal queue length, namely k+o(k). These algorithms also extend to higher dimensional meshes
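
    The kn/2 bisection bound mentioned above follows from a short counting argument. The derivation below is a sketch of the standard argument, not quoted from the paper; it assumes n is even and that each link carries one packet per direction per step.

```latex
% Cut the n x n mesh into a left half and a right half.
% Exactly n links cross the cut (one per row).
% In the worst case every packet that starts in the left half
% is destined for the right half:
\[
  \text{packets crossing the cut} \;=\; k \cdot \frac{n^2}{2},
  \qquad
  \text{packets crossing per step} \;\le\; n .
\]
% Hence any k-k routing algorithm needs at least
\[
  \frac{k \cdot n^2 / 2}{n} \;=\; \frac{kn}{2}
\]
% steps on such an instance.
```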

    Towards practical permutation routing on meshes

    We consider the permutation routing problem on two-dimensional $n \times n$ meshes. To be practical, a routing algorithm is required to ensure very small queue sizes $Q$ and very low running time $T$, not only asymptotically but particularly also for the practically important $n$ up to $1000$. With a technique inspired by a scheme of Kaklamanis/Krizanc/Rao, we obtain a near-optimal result: $T = 2 \cdot n + O(1)$ with $Q = 2$. Although $Q$ is very attractive now, the lower-order terms in $T$ make this algorithm highly impractical. Therefore we present simple schemes which are asymptotically slower, but have $T$ around $3 \cdot n$ for all $n$ and $Q$ between 2 and 8.

    Adaptive AT² Optimal Algorithms on reconfigurable meshes

    No full text
    Recently a few self-simulation algorithms have been developed to execute algorithms on a reconfigurable mesh (RM) of a size smaller than recommended in those algorithms. Optimal slowdown in self-simulation has been achieved with the compromise that the resultant algorithms fail to remain AT² optimal. In this paper we introduce, for the first time, the idea of an adaptive algorithm that runs on RMs of variable sizes without compromising AT² optimality. We support this idea by developing adaptive algorithms for sorting items and for computing the contour of maximal elements of a set of planar points on an RM. We also conjecture that, to obtain an AT² algorithm to solve a problem of size n with I(n) information content on an RM of size p x q where pq = kI(n), it is sufficient to form buses of length O(k).

    Optimal Randomized Algorithms for Multipacket and Wormhole Routing on the Mesh

    In this paper, we present a randomized algorithm for the multipacket (i.e., k-k) routing problem on an n x n mesh. The algorithm completes with high probability in at most kn + O(k log n) parallel communication steps, with a constant queue size of O(k). The previous best known algorithm [4] takes [5/4] kn + O([kn/f(n)]) steps with a queue size of O(k f(n)) (for any 1 ≤ f(n) ≤ n). We will also present a randomized algorithm for the wormhole model permutation routing problem for the mesh that completes in at most kn + O(k log n) steps, with a constant queue size of O(k), where k is the number of flits that each packet is divided into. The previous best result [6] was also randomized and had a time bound of kn + O([kn/f(n)]) with a queue size of O(k f(n)) for any 1 ≤ f(n). The two algorithms that we will present are optimal with respect to queue size. The time bounds are within a factor of two of the only known lower bound.

    A Comparison of Meshes With Static Buses and Unidirectional Wrap-Arounds

    We investigate the relative computational powers of a mesh with static buses and a mesh with unidirectional wrap-arounds. A mesh with unidirectional wrap-arounds is a torus with the restriction that any wrap-around link of the architecture can only transmit data in one of the two directions at any clock tick. We show that the problem of packet routing can be solved as efficiently on a linear array with a unidirectional wrap-around link as on a linear array with a broadcast bus. We also present a routing algorithm for a two-dimensional torus with unidirectional wrap-around links whose run time is close to that of the best known algorithm for routing on a mesh with broadcast buses in each dimension. In addition, we show that on a mesh with broadcast buses, sorting can be done in time that is essentially the same as the time needed for packet routing.

    Parallel Out-of-Core Sorting: The Third Way

    Sorting very large datasets is a key subroutine in almost any application that is built on top of a large database. Two ways to sort out-of-core data dominate the literature: merging-based algorithms and partitioning-based algorithms. Within these two paradigms, all the programs that sort out-of-core data on a cluster rely on assumptions about the input distribution. We propose a third way of out-of-core sorting: oblivious algorithms. In all, we have developed six programs that sort out-of-core data on a cluster. The first three programs, based completely on Leighton's columnsort algorithm, have a restriction on the maximum problem size that they can sort. The other three programs relax this restriction; two are based on our original algorithmic extensions to columnsort. We present experimental results to show that our algorithms perform well. To the best of our knowledge, the programs presented in this thesis are the first to sort out-of-core data on a cluster without making any simplifying assumptions about the distribution of the data to be sorted.
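
    For readers unfamiliar with columnsort, the following is a minimal in-memory sketch of Leighton's eight-step algorithm. It is not one of the thesis's out-of-core programs; the function name `columnsort` and the flat column-major layout are assumptions made for illustration. The data is treated as an r x s matrix with r even, r divisible by s, and r ≥ 2(s-1)².

```python
def columnsort(values, r, s):
    """In-memory sketch of Leighton's columnsort on an r x s matrix.

    `values` holds r*s items in column-major order (column j occupies
    values[j*r:(j+1)*r]). The result is fully sorted in column-major order.
    The data movement between columns is fixed in advance and does not
    depend on the keys, which is what makes the algorithm oblivious.
    """
    n = r * s
    if len(values) != n:
        raise ValueError("len(values) must equal r * s")
    if r % 2 != 0 or r % s != 0 or r < 2 * (s - 1) ** 2:
        raise ValueError("need r even, r divisible by s, r >= 2*(s-1)**2")

    def sort_columns(m):
        # Sort each column independently (steps 1, 3, 5).
        return [x for j in range(s) for x in sorted(m[j * r:(j + 1) * r])]

    a = sort_columns(list(values))            # step 1

    # Step 2: read items in column-major order, write them in row-major
    # order; the t-th item lands in row t // s, column t % s.
    b = [None] * n
    for t, x in enumerate(a):
        b[(t % s) * r + t // s] = x

    b = sort_columns(b)                       # step 3

    # Step 4: inverse of the step-2 permutation.
    a = [b[(t % s) * r + t // s] for t in range(n)]

    a = sort_columns(a)                       # step 5

    # Steps 6-8: shift by r/2, sort columns, shift back. This is equivalent
    # to sorting each window that straddles two adjacent columns.
    h = r // 2
    for j in range(s - 1):
        lo, hi = h + j * r, h + (j + 1) * r
        a[lo:hi] = sorted(a[lo:hi])
    return a


if __name__ == "__main__":
    r, s = 18, 3
    data = [(i * 37) % (r * s) for i in range(r * s)]  # a fixed permutation
    print(columnsort(data, r, s) == sorted(data))
```

    The size condition r ≥ 2(s-1)² is the restriction on maximum problem size that the abstract refers to: for a fixed column height r, it limits the number of columns s and therefore the total number of items.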