Routing Permutations in Partitioned Optical Passive Star Networks
It is shown that a POPS network with g groups and d processors per group can
efficiently route any permutation among the n=dg processors. The number of
slots used is optimal in the worst case, and is at most the double of the
optimum for all permutations p such that p(i) ≠ i for all i. Comment: 8 pages, 3 figures
Expanded delta networks for very large parallel computers
In this paper we analyze a generalization of the traditional delta network, introduced by Patel [21], and dubbed Expanded Delta Network (EDN). These networks generally provide multiple paths that can be exploited to reduce contention in the network, resulting in increased performance. The crossbar and traditional delta networks are limiting cases of this class of networks. However, the delta network does not provide the multiple paths that the more general expanded delta networks provide, and crossbars are too costly to use for large networks. The EDNs are analyzed with respect to their routing capabilities in the MIMD and SIMD models of computation. The concepts of capacity and clustering are also addressed. In massively parallel SIMD computers, the trend is to put a larger number of processors on a chip, but due to I/O constraints only a subset of the total number of processors may have access to the network. This is introduced as a Restricted Access Expanded Delta Network, of which the MasPar MP-1 router network is an example.
A Benes Based NoC Switching Architecture for Mixed Criticality Embedded Systems
Multi-core, Mixed Criticality Embedded (MCE) real-time systems require high
timing precision and predictability to guarantee there will be no interference
between tasks. These guarantees are necessary in application areas such as
avionics and automotive, where task interference or missed deadlines could be
catastrophic, and safety requirements are strict. In modern multi-core systems,
the interconnect becomes a potential point of uncertainty, introducing major
challenges in proving behaviour is always within specified constraints,
limiting the means of growing system performance to add more tasks, or provide
more computational resources to existing tasks.
We present MCENoC, a Network-on-Chip (NoC) switching architecture that
provides innovations to overcome this with predictable, formally verifiable
timing behaviour that is consistent across the whole NoC. We show how the
fundamental properties of Benes networks benefit MCE applications and meet our
architecture requirements. Using SystemVerilog Assertions (SVA), formal
properties are defined that aid the refinement of the specification of the
design as well as enabling the implementation to be exhaustively formally
verified. We demonstrate the performance of the design in terms of size,
throughput and predictability, and discuss the application level considerations
needed to exploit this architecture.
Analytical performance modelling of adaptive wormhole routing in the star interconnection network
The star graph was introduced as an attractive alternative to the well-known hypercube, and its properties have been well studied in the past. Most of these studies have focused on topological properties and algorithmic aspects of this network. Although several analytical models have been proposed in the literature for different interconnection networks, none of them has dealt with star graphs. This paper proposes the first analytical model to predict message latency in wormhole-switched star interconnection networks with fully adaptive routing. The analysis focuses on a fully adaptive routing algorithm which has been shown to be the most effective for star graphs. The results obtained from simulation experiments confirm that the proposed model exhibits good accuracy under different operating conditions.
A system for routing arbitrary directed graphs on SIMD architectures
Many problems can be described in terms of directed graphs that contain a large number of vertices, where simple computations occur using data from connecting vertices. A method is given for parallelizing such problems on an SIMD machine model that is bit-serial and uses only nearest neighbor connections for communication. Each vertex of the graph is assigned to a processor in the machine. Algorithms are given that implement movement of data along the arcs of the graph. This architecture and these algorithms define a system that is relatively simple to build and can do graph processing. All arcs can be traversed in parallel in time O(T), where T is empirically proportional to the diameter of the interconnection network times the average degree of the graph. Modifying or adding a new arc takes the same time as parallel traversal.
Symmetric Interconnection Networks from Cubic Crystal Lattices
Torus networks of moderate degree have been widely used in the supercomputer
industry. Tori are superb when used for executing applications that require
near-neighbor communications. Nevertheless, they are not so good when dealing
with global communications. Hence, typical 3D implementations have evolved to
5D networks, among other reasons, to reduce network distances. Most of these
big systems are mixed-radix tori which are not the best option for minimizing
distances and efficiently using network resources. This paper is focused on
improving the topological properties of these networks.
By using integral matrices to deal with Cayley graphs over Abelian groups, we
have been able to propose and analyze a family of high-dimensional grid-based
interconnection networks. As they are built over n-dimensional grids that
induce a regular tiling of the space, these topologies have been denoted
lattice graphs. We will focus on cubic crystal lattices for modeling
symmetric 3D networks. Other higher dimensional networks can be composed over
these graphs, as illustrated in this research. Easy network partitioning can
also take advantage of this network composition operation. Minimal routing
algorithms are also provided for these new topologies. Finally, some practical
issues such as implementability and preliminary performance evaluations have
been addressed.
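The abstract's remark that mixed-radix tori are not the best option for minimizing distances can be illustrated with a minimal sketch. It covers ordinary tori only, not the lattice graphs the paper proposes, and the function names (`torus_distance`, `torus_diameter`, `average_distance`) are hypothetical, chosen for this example. It assumes the standard torus metric, in which each dimension contributes the shorter way around its ring.

```python
from itertools import product

def torus_distance(a, b, radices):
    """Minimal hop count between nodes a and b of a torus.

    radices[i] is the ring size of dimension i; each dimension
    contributes the shorter of the two directions around its ring.
    """
    return sum(min(abs(x - y), k - abs(x - y))
               for x, y, k in zip(a, b, radices))

def torus_diameter(radices):
    """Largest minimal distance: half of each ring, rounded down."""
    return sum(k // 2 for k in radices)

def average_distance(radices):
    """Mean distance from the origin to every node (origin included).

    A torus is vertex-transitive, so measuring from one node suffices.
    """
    origin = (0,) * len(radices)
    nodes = list(product(*(range(k) for k in radices)))
    return sum(torus_distance(origin, v, radices) for v in nodes) / len(nodes)

# Two 64-node 3D tori: one with equal sides, one mixed-radix.
cubic = (4, 4, 4)
mixed = (8, 4, 2)
print(torus_diameter(cubic), average_distance(cubic))
print(torus_diameter(mixed), average_distance(mixed))
```

With the same number of nodes, the mixed-radix 8×4×2 torus has a larger diameter and average distance than the 4×4×4 torus, which is the kind of penalty the abstract refers to.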