Wildcard dimensions, coding theory and fault-tolerant meshes and hypercubes
Hypercubes, meshes and tori are well known interconnection networks for parallel computers. The sets of edges in those graphs can be partitioned into dimensions. It is well known that the hypercube can be extended by adding a wildcard dimension, resulting in a folded hypercube that has better fault-tolerance and communication capabilities. First we prove that the folded hypercube is optimal in the sense that only a single wildcard dimension can be added to the hypercube. We then investigate the idea of adding wildcard dimensions to d-dimensional meshes and tori. Using techniques from error-correcting codes we construct d-dimensional meshes and tori with wildcard dimensions. Finally, we show how these constructions can be used to tolerate edge and node faults in mesh and torus networks.
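As a rough illustration of why the extra dimension helps communication, the sketch below compares the diameter of a plain hypercube with that of a folded hypercube. It assumes the usual definition in which the added edge joins each node to its bitwise complement. The function names are hypothetical and are not taken from the paper.

```python
from collections import deque

def hypercube_neighbors(node, n):
    """Neighbors of `node` in the n-dimensional hypercube: flip one bit."""
    return [node ^ (1 << i) for i in range(n)]

def folded_hypercube_neighbors(node, n):
    """Hypercube neighbors plus one extra edge to the bitwise complement."""
    mask = (1 << n) - 1
    return hypercube_neighbors(node, n) + [node ^ mask]

def diameter(n, neighbors):
    """Largest BFS distance from node 0 (both graphs are vertex-transitive)."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors(u, n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())
```

For example, `diameter(4, hypercube_neighbors)` and `diameter(4, folded_hypercube_neighbors)` can be compared directly to see the effect of the single added dimension.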
Fault-Tolerant Cube Graphs and Coding Theory
Hypercubes, meshes, tori and Omega networks are well known interconnection
networks for parallel computers. The structure of those graphs can be described in a
more general framework called cube graphs. The idea is to assume that every node in
a graph with q^l nodes is represented by a unique string of l symbols over GF(q). The edges are specified by a set of offsets, which are vectors of length l over GF(q), where the two endpoints of an edge are an offset apart. We study techniques for tolerating edge faults in cube graphs that are based on adding redundant edges. The redundant
graph has the property that the structure of the original graph can be maintained
in the presence of edge faults. Our main contribution is a technique for adding the
redundant edges that utilizes constructions of error-correcting codes and generalizes
existing ad-hoc techniques.
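The cube graph definition above can be sketched in a few lines. This is a minimal illustration, assuming q is prime so that addition in GF(q) reduces to componentwise addition modulo q. The function name `cube_graph` and the choice of the all-ones vector as a redundant offset are assumptions made for the example, not the paper's construction.

```python
from itertools import product

def cube_graph(q, l, offsets):
    """Edges of a cube graph on q**l nodes; assumes q is prime so that
    GF(q) addition is componentwise addition modulo q."""
    edges = set()
    for node in product(range(q), repeat=l):
        for off in offsets:
            other = tuple((a + b) % q for a, b in zip(node, off))
            edges.add(frozenset((node, other)))
    return edges

# Unit-vector offsets over GF(2) give the 3-dimensional hypercube.
hypercube = cube_graph(2, 3, [(1, 0, 0), (0, 1, 0), (0, 0, 1)])
# One extra all-ones offset (an illustrative redundant offset) adds
# the edges that join complementary nodes.
redundant = cube_graph(2, 3, [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)])
# Unit-vector offsets over GF(3) give a 3 x 3 torus.
torus = cube_graph(3, 2, [(1, 0), (0, 1)])
```

Every edge of `hypercube` also appears in `redundant`, which is the sense in which the original structure is kept while spare edges are added.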
On the structure of the adjacency matrix of the line digraph of a regular digraph
We show that the adjacency matrix M of the line digraph of a d-regular
digraph D on n vertices can be written as M=AB, where the matrix A is the
Kronecker product of the all-ones matrix of dimension d with the identity
matrix of dimension n and the matrix B is the direct sum of the adjacency
matrices of the factors in a dicycle factorization of D.
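The factorization can be checked numerically on a small case. The sketch below assumes each factor of the dicycle factorization is given as a permutation of the vertices, and that the arc of factor i ending at vertex h is given index i*n + h. That ordering is an assumption chosen so the product works out, and may differ from the paper's. The example digraph and all function names are hypothetical.

```python
def matmul(X, Y):
    """Plain integer matrix product of two square lists of lists."""
    size = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]

def line_digraph_factorization(perms):
    """perms[i][v] is the head of the factor-i arc leaving vertex v."""
    d, n = len(perms), len(perms[0])
    # Index the arc of factor i that ENDS at vertex h as i*n + h.
    tail = {}
    for i, p in enumerate(perms):
        for v in range(n):
            tail[i * n + p[v]] = v
    size = d * n
    # Line digraph: arc a -> arc b when head(a) equals tail(b).
    M = [[int(a % n == tail[b]) for b in range(size)] for a in range(size)]
    # A = J_d (x) I_n: entry is 1 when the inner vertex indices agree.
    A = [[int(a % n == b % n) for b in range(size)] for a in range(size)]
    # B = direct sum of the permutation matrices of the factors.
    B = [[int(a // n == b // n and perms[a // n][a % n] == b % n)
          for b in range(size)] for a in range(size)]
    return M, A, B

# Hypothetical example: arcs v -> v+1 and v -> v+2 on 5 vertices (d = 2).
n = 5
perms = [[(v + 1) % n for v in range(n)], [(v + 2) % n for v in range(n)]]
M, A, B = line_digraph_factorization(perms)
```

Comparing `M` with `matmul(A, B)` confirms the identity for this example under the stated arc ordering.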
An analytical performance model for the Spidergon NoC
Networks on chip (NoC) emerged as a promising alternative to bus-based interconnect networks to handle the increasing communication requirements of large systems on chip. Employing an appropriate topology for a NoC is of high importance, mainly because the choice typically involves trade-offs between cross-cutting concerns such as performance and cost. The Spidergon topology is a novel architecture recently proposed for the NoC domain. The objective of the Spidergon NoC has been to address the need for a fixed and optimized topology to realize cost-effective multi-processor SoC (MPSoC) development [7]. In this paper we analyze the traffic behavior in the Spidergon scheme and present an analytical evaluation of the average message latency in the architecture. We validate the analysis by comparing the model against the results produced by a discrete-event simulator.
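A hop-count calculation gives a feel for the topology, although it is not the latency model of the paper, which would also have to account for traffic and contention. The sketch assumes the standard Spidergon wiring in which each of an even number of nodes has clockwise, counter-clockwise and across links. The function names are hypothetical.

```python
from collections import deque

def spidergon_neighbors(i, n):
    """Clockwise, counter-clockwise and across links of node i (n even)."""
    return [(i + 1) % n, (i - 1) % n, (i + n // 2) % n]

def hop_counts(n, src=0):
    """BFS hop distance from `src` to every node of an n-node Spidergon."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in spidergon_neighbors(u, n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_hops(n):
    """Mean hop count to the other n - 1 nodes (uniform destinations)."""
    dist = hop_counts(n)
    return sum(dist.values()) / (n - 1)
```

Calling `average_hops(16)` gives the zero-load mean distance for a 16-node network under uniformly chosen destinations, which is only a lower-level ingredient of any full latency analysis.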
Optical Switching and Routing Architectures for Fiber-optic Computer Communication Networks
Optical technology has become a significant part of communication networks. We propose an Optical Interface Message Processor (OPTIMP) that exploits the high bandwidth, parallelism, multi-dimensional capability, and high storage density offered by optics. The most time-consuming operations in communication networks, such as switching and routing, are performed in the optical domain in the proposed system. Our design does not suffer from optical/electrical conversion bottlenecks and can perform switching and routing in the range of Gigabits/s. The proposed design can have a significant impact on high-speed communication networks as well as high-speed interconnection networks for parallel computers. The source-destination (S-D) information from a message is first converted to the spatial domain. The routing table stores all S-D codes and the corresponding control codes for the switching module. Using a cylindrical system, the routing table is searched in parallel (in a single step), and the control signals corresponding to the matched S-D row of the table are used to control the switching module. The switching module, based on the SEED array technology, can be reconfigured in the GHz range and provides high bandwidth.
Distance-hereditary embeddings of circulant graphs
©2003 IEEE. In this paper we present a distance-hereditary decomposition of optimal chordal rings of 2k^2 nodes into a set of rings of 2k nodes, where k is the diameter. All the rings belonging to this set have the same length and their diameter corresponds to the diameter of the chordal ring in which they are embedded. The members of this embedded set of rings are non-disjoint and preserve the minimal routing of the original circulant graph. Besides its practical consequences, our research allows the presentation of these optimal circulant graphs as a particular evolution of the traditional ring topology. Carmen Martinez, Beivide Beivide, Jaime Gutierrez, [Maria] Cruz Iz
Quarc: an architecture for efficient on-chip communication
The exponential downscaling of the feature size has enforced a paradigm shift from computation-based design to communication-based design in system on chip development. Buses, the traditional communication architecture in systems on chip, are incapable of addressing the increasing bandwidth requirements of future large systems.
Networks on chip have emerged as an interconnection architecture offering unique solutions to the technological and design issues related to communication in future systems on chip. The transition from buses as a shared medium to networks on chip as a segmented medium has given rise to new challenges in the system on chip realm.
By leveraging the shared nature of the communication medium, buses have been highly efficient in delivering multicast communication. The segmented nature of networks, however, prevents networks on chip from delivering multicast messages as efficiently. Relying on extensive research on multicast communication in parallel computers, several network on chip architectures have offered mechanisms to perform the operation while conforming to the resource constraints of the network on chip paradigm. Multicast communication in the majority of these networks on chip is implemented by establishing a connection between the source and all multicast destinations before the message transmission
commences. Establishing the connections incurs an overhead and is therefore undesirable, in particular for latency-sensitive services such as cache coherence.
To address high performance multicast communication, this research presents Quarc, a novel network on chip architecture. The Quarc architecture targets an area-efficient, low power, high performance implementation. The thesis covers a detailed representation of
the building blocks of the architecture, including topology, router and network interface.
The cost and performance comparison of the Quarc architecture against other network on chip architectures reveals that the Quarc architecture is a highly efficient architecture.
Moreover, the thesis introduces novel performance models of complex traffic patterns, including multicast and quality of service-aware communication.
Energy Wall for Exascale Supercomputing
"Sustainable development" is one of the major issues in the 21st century. Thus the notions of green computing, green development and so on show up one after another. As the large-scale parallel computing systems develop rapidly, energy consumption of such systems is becoming very huge, especially system performance reaches Petascale (10^15 Flops) or even Exascale (10^18 Flops). The huge energy consumption increases the system temperature, which seriously undermines the stability and reliability, and limits the growth of system size. The effects of energy consumption on scalability become a growing concern. Against the background, this paper proposes the concept of "Energy Wall" to highlight the significance of achieving scalable performance in peta/exascale supercomputing by taking energy consumption into account. We quantify the effect of energy consumption on scalability by building the energy-efficiency speedup model, which integrates computing performance and system energy. We define the energy wall quantitatively, and provide the theorem on the existence of the energy wall, and categorize the large-scale parallel computers according to the energy consumption. In the context of several representative types of HPC applications, we analyze and extrapolate the existence of the energy wall considering three kinds of topologies, 3D-Torus, binary n-cube and Fat tree which provides insights on how to mitigate the energy wall effect in system design and through hardware/software optimization in peta/exascale supercomputing
On random wiring in practicable folded clos networks for modern datacenters
Large scale, high performance, fault tolerance, low cost and graceful expandability are sought-after features in current datacenter networks (DCN). Although there have been many proposals for DCNs, most modern installations are equipped with classical folded Clos networks. Recently, regular random topologies, such as Jellyfish, have been proposed for DCNs. However, their completely unstructured nature entails serious design problems. In this paper we propose Random Folded Clos (RFC) and Hydra networks, in which the interconnection between certain switch levels is made randomly. Both RFCs and Hydras preserve important properties of Clos networks that provide straightforward deadlock-free multi-path routing. The proposed networks leverage randomness to be gracefully expandable, thereby allowing for fine-grain upgrading. RFCs and Hydras are compared in the paper, in topological and cost terms, against fat-trees, orthogonal fat-trees and random regular networks. Also, experiments are carried out to simulate their performance under synthetic traffic patterns emulating common loads present in warehouse-scale computers. These theoretical and empirical studies reveal the interest of these topologies, concluding that Hydra constitutes a practicable alternative to current datacenter networks since it appropriately balances all the main design requirements. Moreover, Hydras perform better than fat-trees, their natural competitor, being able to connect the same or more computing nodes with significantly lower cost and latency while exhibiting comparable throughput. © 1990-2012 IEEE