441 research outputs found

Towards practical permutation routing on meshes

Author: Kaufmann M.
Meyer U.
Sibeyn J.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1994

We consider the permutation routing problem on two-dimensional $n \times n$ meshes. To be practical, a routing algorithm is required to ensure very small queue sizes $Q$, and very low running time $T$, not only asymptotically but particularly also for the practically important $n$ up to $1000$. With a technique inspired by a scheme of Kaklamanis/Krizanc/Rao, we obtain a near-optimal result: $T = 2 \cdot n + \mathcal{O}(1)$ with $Q = 2$. Although $Q$ is very attractive now, the lower order terms in $T$ make this algorithm highly impractical. Therefore we present simple schemes which are asymptotically slower, but have $T$ around $3 \cdot n$ for all $n$ and $Q$ between 2 and 8.

MPG.PuRe

A Benes Based NoC Switching Architecture for Mixed Criticality Embedded Systems

Author: Eder Kerstin
Kerrison Steve
May David
Publication date: 01/01/2016

Multi-core, Mixed Criticality Embedded (MCE) real-time systems require high timing precision and predictability to guarantee there will be no interference between tasks. These guarantees are necessary in application areas such as avionics and automotive, where task interference or missed deadlines could be catastrophic, and safety requirements are strict. In modern multi-core systems, the interconnect becomes a potential point of uncertainty, introducing major challenges in proving behaviour is always within specified constraints, limiting the means of growing system performance to add more tasks, or provide more computational resources to existing tasks. We present MCENoC, a Network-on-Chip (NoC) switching architecture that provides innovations to overcome this with predictable, formally verifiable timing behaviour that is consistent across the whole NoC. We show how the fundamental properties of Benes networks benefit MCE applications and meet our architecture requirements. Using SystemVerilog Assertions (SVA), formal properties are defined that aid the refinement of the specification of the design as well as enabling the implementation to be exhaustively formally verified. We demonstrate the performance of the design in terms of size, throughput and predictability, and discuss the application level considerations needed to exploit this architecture
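
As a rough illustration of the Benes network properties the abstract refers to, the sketch below computes the size of a textbook Benes network built from 2x2 switches. It is not taken from the MCENoC design; the function name `benes_dimensions` is hypothetical, and the formulas are the standard ones for a network with a power-of-two number of ports (2·log2(N) − 1 stages of N/2 switches each).

```python
def benes_dimensions(num_ports: int) -> tuple[int, int, int]:
    """Return (stages, switches_per_stage, total_switches) for a textbook
    Benes network of 2x2 switches. Hypothetical helper, not MCENoC code."""
    # The standard construction is defined for N = 2^k ports with k >= 1.
    if num_ports < 2 or num_ports & (num_ports - 1):
        raise ValueError("num_ports must be a power of two, at least 2")
    k = num_ports.bit_length() - 1           # k = log2(N)
    stages = 2 * k - 1                       # every path crosses all stages
    switches_per_stage = num_ports // 2      # one 2x2 switch per port pair
    return stages, switches_per_stage, stages * switches_per_stage


if __name__ == "__main__":
    for n in (2, 8, 64):
        print(n, benes_dimensions(n))
```

In this construction every input-to-output path crosses the same number of stages, which is one plausible reason such a topology suits designs that need uniform, predictable latency.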

arXiv.org e-Print Archive

Crossref

ResearchOnline at James Cook University

Explore Bristol Research

Online Permutation Routing in Partitioned Optical Passive Star Networks

Author: Mei Alessandro
Rizzi Romeo
Publication date: 25/02/2005

This paper establishes the state of the art in both deterministic and randomized online permutation routing in the POPS network. Indeed, we show that any permutation can be routed online on a POPS network either with $O(\frac{d}{g}\log g)$ deterministic slots, or, with high probability, with $5c\lceil d/g\rceil+o(d/g)+O(\log\log g)$ randomized slots, where constant $c=\exp(1+e^{-1})\approx 3.927$. When $d=\Theta(g)$, that we claim to be the "interesting" case, the randomized algorithm is exponentially faster than any other algorithm in the literature, both deterministic and randomized ones. This is true in practice as well. Indeed, experiments show that it outperforms its rivals even starting from as small a network as a POPS(2,2), and the gap grows exponentially with the size of the network. We can also show that, under proper hypothesis, no deterministic algorithm can asymptotically match its performance.
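
To make the bounds in this abstract concrete, the following sketch evaluates the constant $c=\exp(1+e^{-1})$ and the leading term $5c\lceil d/g\rceil$ of the randomized bound. The function names are hypothetical, the $o(d/g)$ and $O(\log\log g)$ terms are deliberately left out, and the deterministic expression is only a growth rate: its hidden constant and the base-2 logarithm are assumptions, not values from the paper.

```python
import math

# Constant from the abstract: c = exp(1 + e^{-1}), roughly 3.927.
C = math.exp(1 + math.exp(-1))


def randomized_leading_term(d: int, g: int) -> float:
    """Leading term 5*c*ceil(d/g) of the randomized slot bound.
    Lower-order terms o(d/g) + O(log log g) are omitted."""
    ceil_ratio = -(-d // g)  # integer ceiling of d/g
    return 5 * C * ceil_ratio


def deterministic_growth(d: int, g: int) -> float:
    """Growth rate (d/g)*log2(g) of the deterministic bound.
    The hidden constant is unknown; base 2 is an assumption."""
    return (d / g) * math.log2(g)


if __name__ == "__main__":
    print(f"c = {C:.3f}")
    for g in (2, 16, 1024):
        d = g  # the d = Theta(g) case highlighted in the abstract
        print(g, randomized_leading_term(d, g), deterministic_growth(d, g))
```

With $d=g$ the randomized leading term stays fixed at $5c$ while the deterministic growth rate increases with $\log g$, which matches the abstract's comparison of $O(\log g)$ against $O(\log\log g)$ behaviour.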

arXiv.org e-Print Archive

Catalogo dei prodotti della ricerca

Archivio della ricerca- Università di Roma La Sapienza

Matrix transpose on meshes with buses

Author: Békési József
Galambos Gábor
Publication date: 01/01/2016

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Simulation Of Multi-core Systems And Interconnections And Evaluation Of Fat-Mesh Networks

Author: Zhang Yu
Publication date: 28/01/2009

Simulators are very important in computer architecture research as they enable the exploration of new architectures to obtain detailed performance evaluation without building costly physical hardware. Simulation is even more critical to study future many-core architectures as it provides the opportunity to assess currently non-existing computer systems. In this thesis, a multiprocessor simulator is presented based on a cycle accurate architecture simulator called SESC. The shared L2 cache system is extended into a distributed shared cache (DSC) with a directory-based cache coherency protocol. A mesh network module is extended and integrated into SESC to replace the bus for scalable inter-processor communication. While these efforts complete an extended multiprocessor simulation infrastructure, two interconnection enhancements are proposed and evaluated. A novel non-uniform fat-mesh network structure similar to the idea of fat-tree is proposed. This non-uniform mesh network takes advantage of the average traffic pattern, typically all-to-all in DSC, to dedicate additional links for connections with heavy traffic (e.g., near the center) and fewer links for lighter traffic (e.g., near the periphery). Two fat-mesh schemes are implemented based on different routing algorithms. Analytical fat-mesh models are constructed by presenting the expressions for the traffic requirements of personalized all-to-all traffic. Performance improvements over the uniform mesh are demonstrated in the results from the simulator. A hybrid network consisting of one packet switching plane and multiple circuit switching planes is constructed as the second enhancement. The circuit switching planes provide fast paths between neighbors with heavy communication traffic. A compiler technique that abstracts the symbolic expressions of benchmarks' communication patterns can be used to help facilitate the circuit establishment

D-Scholarship@Pitt

Shared memory with hidden latency on a family of mesh-like networks

Author: Harris Tim J.
Publication venue: The University of Edinburgh
Publication date: 01/01/1995

Edinburgh Research Archive

Towards better algorithms for parallel backtracking

Author: Sanders Peter
Publication date: 02/08/2007

Many algorithms in operations research and artificial intelligence are based on depth first search in implicitly defined trees. For parallelizing these algorithms, a load balancing scheme is needed which is able to evenly distribute parts of an irregularly shaped tree over the processors. It should work with minimal interprocessor communication and without prior knowledge of the tree's shape. Previously known load balancing algorithms either require sending a message for each tree node or they only work efficiently for large search trees. This paper introduces new randomized dynamic load balancing algorithms for tree structured computations, a generalization of backtrack search. These algorithms only need to communicate when necessary and have an asymptotically optimal scalability for many important cases. They work on hypercubes, butterflies, meshes and many other architectures.

KITopen

Communication algorithms for isotropic tasks in hypercubes and wraparound meshes

Author: Varvarigos Emmanouel A.
Bertsekas Dimitri P.
Publication venue: Massachusetts Institute of Technology, Laboratory for Information and Decision Systems
Publication date: 01/01/1990

Cover title. Includes bibliographical references (p. 29-30). Research supported by the NSF (NSF-ECS-8519058). Research supported by the ARO (DAAL03-86-K-0171). By Emmanouel A. Varvarigos and Dimitri P. Bertsekas.

DSpace@MIT