
231 research outputs found

Deterministic 1-k routing on meshes with applications to worm-hole routing

Author: Kaufmann M.
Sibeyn J.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1993

In $1$-$k$ routing each of the $n^2$ processing units of an $n \times n$ mesh connected computer initially holds $1$ packet which must be routed such that any processor is the destination of at most $k$ packets. This problem reflects practical desire for routing better than the popular routing of permutations. $1$-$k$ routing also has implications for hot-potato worm-hole routing, which is of great importance for real world systems. We present a near-optimal deterministic algorithm running in $\sqrt{k} \cdot n / 2 + \mathcal{O}(n)$ steps. We give a second algorithm with slightly worse routing time but working queue size three. Applying this algorithm considerably reduces the routing time of hot-potato worm-hole routing. Non-trivial extensions are given to the general $l$-$k$ routing problem and for routing on higher dimensional meshes. Finally we show that $k$-$k$ routing can be performed in $\mathcal{O}(k \cdot n)$ steps with working queue size four. Hereby the hot-potato worm-hole routing problem can be solved in $\mathcal{O}(k^{3/2} \cdot n)$ steps.
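
To make the problem statement concrete, here is a minimal Python sketch that checks whether a destination assignment is a valid 1-k routing instance on an n x n mesh: every processor holds exactly one packet, and no processor is the destination of more than k packets. The function name `is_one_k_instance` and the dictionary representation are assumptions made for illustration; they are not taken from the paper, and the sketch does not implement the routing algorithm itself.

```python
from collections import Counter


def is_one_k_instance(dest, n, k):
    """Check a 1-k routing instance on an n x n mesh.

    dest maps each source processor (row, col) to the destination
    (row, col) of the single packet it initially holds.
    """
    cells = {(r, c) for r in range(n) for c in range(n)}
    # Every processor must hold exactly one packet.
    if set(dest) != cells:
        return False
    # Every destination must be a processor of the mesh.
    if any(d not in cells for d in dest.values()):
        return False
    # No processor may be the destination of more than k packets.
    load = Counter(dest.values())
    return all(count <= k for count in load.values())
```

A permutation is the special case k = 1; sending all packets of a 2 x 2 mesh to one processor is valid only for k of at least 4.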

Online Permutation Routing in Partitioned Optical Passive Star Networks

Author: Mei Alessandro
Rizzi Romeo
Publication date: 25/02/2005

This paper establishes the state of the art in both deterministic and randomized online permutation routing in the POPS network. Indeed, we show that any permutation can be routed online on a POPS network either with $O(\frac{d}{g}\log g)$ deterministic slots, or, with high probability, with $5c\lceil d/g\rceil+o(d/g)+O(\log\log g)$ randomized slots, where constant $c=\exp(1+e^{-1})\approx 3.927$. When $d=\Theta(g)$, that we claim to be the "interesting" case, the randomized algorithm is exponentially faster than any other algorithm in the literature, both deterministic and randomized ones. This is true in practice as well. Indeed, experiments show that it outperforms its rivals even starting from as small a network as a POPS(2,2), and the gap grows exponentially with the size of the network. We can also show that, under proper hypothesis, no deterministic algorithm can asymptotically match its performance.
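
The constant in the randomized bound can be checked numerically. The sketch below evaluates c = exp(1 + e^{-1}) and the leading term 5c * ceil(d/g) only; the o(d/g) and O(log log g) terms have unspecified constants and are deliberately left out. The names `C` and `randomized_leading_term` are hypothetical helpers, not part of the paper.

```python
import math

# c = exp(1 + e^{-1}), as defined in the abstract.
C = math.exp(1 + math.exp(-1))


def randomized_leading_term(d, g):
    """Leading term 5*c*ceil(d/g) of the randomized slot bound.

    Lower-order terms of the bound are omitted because their
    constants are not given in the abstract.
    """
    ceil_d_over_g = -(-d // g)  # integer ceiling of d/g
    return 5 * C * ceil_d_over_g
```

For d = g the ceiling is 1, so the leading term reduces to 5c.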

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Simulation Of Multi-core Systems And Interconnections And Evaluation Of Fat-Mesh Networks

Author: Zhang Yu
Publication date: 28/01/2009

Simulators are very important in computer architecture research as they enable the exploration of new architectures to obtain detailed performance evaluation without building costly physical hardware. Simulation is even more critical to study future many-core architectures as it provides the opportunity to assess currently non-existing computer systems. In this thesis, a multiprocessor simulator is presented based on a cycle accurate architecture simulator called SESC. The shared L2 cache system is extended into a distributed shared cache (DSC) with a directory-based cache coherency protocol. A mesh network module is extended and integrated into SESC to replace the bus for scalable inter-processor communication. While these efforts complete an extended multiprocessor simulation infrastructure, two interconnection enhancements are proposed and evaluated. A novel non-uniform fat-mesh network structure similar to the idea of fat-tree is proposed. This non-uniform mesh network takes advantage of the average traffic pattern, typically all-to-all in DSC, to dedicate additional links for connections with heavy traffic (e.g., near the center) and fewer links for lighter traffic (e.g., near the periphery). Two fat-mesh schemes are implemented based on different routing algorithms. Analytical fat-mesh models are constructed by presenting the expressions for the traffic requirements of personalized all-to-all traffic. Performance improvements over the uniform mesh are demonstrated in the results from the simulator. A hybrid network consisting of one packet switching plane and multiple circuit switching planes is constructed as the second enhancement. The circuit switching planes provide fast paths between neighbors with heavy communication traffic. A compiler technique that abstracts the symbolic expressions of benchmarks' communication patterns can be used to help facilitate the circuit establishment
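
The observation that all-to-all traffic loads central links more heavily than peripheral ones can be illustrated with a small brute-force count. The sketch below assumes dimension-order (XY) routing on a uniform n x n mesh, which may differ from the routing algorithms used in the thesis; the function name `xy_link_loads` is hypothetical.

```python
from collections import Counter
from itertools import product


def xy_link_loads(n):
    """Count packets per directed link for all-to-all traffic.

    Assumes XY routing: each packet first travels along its row to
    the destination column, then along that column to the destination.
    Links are keyed as ((row, col), (next_row, next_col)).
    """
    loads = Counter()
    nodes = list(product(range(n), repeat=2))
    for (sr, sc), (dr, dc) in product(nodes, repeat=2):
        r, c = sr, sc
        # Move along the row first (X dimension).
        while c != dc:
            nc = c + (1 if dc > c else -1)
            loads[((r, c), (r, nc))] += 1
            c = nc
        # Then move along the column (Y dimension).
        while r != dr:
            nr = r + (1 if dr > r else -1)
            loads[((r, c), (nr, c))] += 1
            r = nr
    return loads
```

Under these assumptions, for n = 4 the eastward link leaving the first column of a row carries 12 packets while the link across the middle of the row carries 16, which is the kind of imbalance a fat-mesh would address with extra central links.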

Towards practical permutation routing on meshes

Author: Kaufmann M.
Meyer U.
Sibeyn J.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1994

We consider the permutation routing problem on two-dimensional $n \times n$ meshes. To be practical, a routing algorithm is required to ensure very small queue sizes $Q$, and very low running time $T$, not only asymptotically but particularly also for the practically important $n$ up to $1000$. With a technique inspired by a scheme of Kaklamanis/Krizanc/Rao, we obtain a near-optimal result: $T = 2 \cdot n + \mathcal{O}(1)$ with $Q = 2$. Although $Q$ is very attractive now, the lower order terms in $T$ make this algorithm highly impractical. Therefore we present simple schemes which are asymptotically slower, but have $T$ around $3 \cdot n$ for all $n$ and $Q$ between 2 and 8.

Turbo NOC: a framework for the design of Network On Chip based turbo decoder architectures

Author: Guido Masera
Maurizio Martina
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 10/09/2009

This work proposes a general framework for the design and simulation of network on chip based turbo decoder architectures. Several parameters in the design space are investigated, namely the network topology, the parallelism degree, the rate at which messages are sent by processing nodes over the network and the routing strategy. The main results of this analysis are: i) the most suited topologies to achieve high throughput with a limited complexity overhead are generalized de-Bruijn and generalized Kautz topologies; ii) depending on the throughput requirements different parallelism degrees, message injection rates and routing algorithms can be used to minimize the network area overhead. Comment: submitted to IEEE Trans. on Circuits and Systems I (submission date 27 May 2009).

arXiv.org e-Print Archive

CiteSeerX

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Randomized Routing and Sorting on the Reconfigurable Mesh

Author: McKendall Theodore
Rajasekaran Sanguthevar
Publication venue: ScholarlyCommons
Publication date: 01/05/1992

In this paper we demonstrate the power of reconfiguration by presenting efficient randomized algorithms for both packet routing and sorting on a reconfigurable mesh connected computer (referred to simply as the mesh from here on). The run times of these algorithms are better than the best achievable time bounds on a conventional mesh. In particular, we show that the permutation routing problem can be solved on a linear array of size n in (3/4)n steps, whereas n-1 is the best possible run time without reconfiguration. We also show that permutation routing on an n x n reconfigurable mesh can be done in time n + o(n) using a randomized algorithm or in time 1.25n + o(n) deterministically. In contrast, 2n-2 is the diameter of a conventional mesh and hence routing and sorting will need at least 2n-2 steps on a conventional mesh. In addition we show that the problem of sorting can be solved in time n + o(n). All these time bounds hold with high probability. The bisection lower bound for both sorting and routing on the mesh is n/2, and hence our algorithms have nearly optimal time bounds.
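
The bounds quoted in this abstract are easier to compare side by side for a concrete n. The following sketch tabulates only the leading terms stated above and drops the o(n) terms, whose constants are not given; the function name `mesh_routing_bounds` and the dictionary keys are illustrative assumptions.

```python
def mesh_routing_bounds(n):
    """Leading terms of the n x n mesh bounds quoted in the abstract.

    The o(n) terms are dropped, so the values are indicative only.
    """
    return {
        "conventional_diameter": 2 * n - 2,      # lower bound without reconfiguration
        "reconfigurable_randomized": n,          # n + o(n) routing
        "reconfigurable_deterministic": 1.25 * n,  # 1.25n + o(n) routing
        "bisection_lower_bound": n / 2,          # lower bound on the reconfigurable mesh
    }
```

For n = 100 this gives 198 steps for the conventional diameter against leading terms of 100 and 125 steps on the reconfigurable mesh, with a bisection lower bound of 50.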

The Effect Of Hot Spots On The Performance Of Mesh-Based Networks

Author: Al-Issa Yazan M
Publication venue: LSU Digital Commons
Publication date: 01/08/1999

Direct network performance is affected by different design parameters which include number of virtual channels, number of ports, routing algorithm, switching technique, deadlock handling technique, packet size, and buffer size. Another factor that affects network performance is the traffic pattern. In this thesis, we study the effect of hotspot traffic on system performance. Specifically, we study the effect of hotspot factor, hotspot number, and hotspot location on the performance of mesh-based networks. Simulations are run on two network topologies, the mesh and the torus. We pay more attention to meshes because they are widely used in commercial machines. Comparisons between oblivious wormhole switching and chaotic packet switching are reported. Overall, packet switching proved to be more efficient in terms of throughput when compared to wormhole switching. In the case of uniform random traffic, it is shown that the differences between chaotic and oblivious routing are indistinguishable. Networks with a low number of hotspots show better performance. As the number of hotspots increases, network latency tends to increase. It is shown that when the hotspot factor increases, performance of packet switching is better than that of wormhole switching. It is also shown that the location of hotspots affects network performance, particularly with the oblivious routers, since their achieved latencies proved to be more vulnerable to changes in the hotspot location. It is also shown that the smaller the size of the network, the earlier network saturation occurs. Further, it is shown that the chaos router’s adaptivity is useful in this case. Finally, for tori, performance is not greatly affected by hotspot presence. This is mostly due to the symmetric nature of tori.

Louisiana State University