
10 research outputs found

Some Theoretical Results of Hypercube for Parallel Architecture

Author: Nagata M.
Publication venue: WP-92-018
Publication date: 01/02/1992

This paper surveys some theoretical results on the hypercube for the design of VLSI architectures. Parallel computers, including the hypercube multiprocessor, will become a leading technology supporting efficient computation for large uncertain systems.

International Institute for Applied Systems Analysis (IIASA)

Rectilinear partitioning of irregular data parallel computations

Author: Nicol David M.

New mapping algorithms are described for domain-oriented data-parallel computations in which the workload is distributed irregularly throughout the domain but exhibits localized communication patterns. Researchers consider the problem of partitioning the domain for parallel processing in such a way that the workload on the most heavily loaded processor is minimized, subject to the constraint that the partition be perfectly rectilinear. Rectilinear partitions are useful on architectures that have a fast local mesh network. Discussed here are an improved algorithm for finding the optimal partitioning in one dimension, new algorithms for partitioning in two dimensions, and optimal partitioning in three dimensions. The application of these algorithms to real problems is discussed.
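The one-dimensional problem named in this abstract (split a sequence of workloads into contiguous blocks so that the heaviest block is as light as possible) can be illustrated with a textbook approach: binary search on the bottleneck value combined with a greedy feasibility probe. This is a minimal sketch assuming positive integer workloads; it is not the improved algorithm the paper describes, and the function name `min_bottleneck_partition` is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def min_bottleneck_partition(weights, parts):
    """Smallest possible maximum block sum when `weights` (positive integers)
    is cut into at most `parts` contiguous blocks."""
    prefix = [0] + list(accumulate(weights))

    def feasible(limit):
        # Greedy probe: make each block as long as possible without
        # letting its sum exceed `limit`.
        start = 0
        for _ in range(parts):
            end = bisect_right(prefix, prefix[start] + limit) - 1
            if end == len(weights):
                return True
            if end == start:
                return False  # a single item already exceeds the limit
            start = end
        return False

    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    print(min_bottleneck_partition([4, 1, 7, 2, 3, 6], 3))
```

For the sample workloads the best three-way cut is `[4, 1] [7, 2] [3, 6]`, whose heaviest block carries a load of 9.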

NASA Technical Reports Server

Optimal processor assignment for pipeline computations

Author: Choudhury Alok N.
Narahari Bhagirath
Nicol David M.
Simha Rahul
Publication date: 01/01/1991

The availability of large scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual response times for different processor sizes, find an assignment of processors to tasks. Two objectives are of interest: minimal response time given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p-processor system and a series-parallel precedence graph with n constituent tasks, an O(np^2) algorithm is provided that finds the optimal assignment for the response time optimization problem; the assignment optimizing the constrained throughput can be found in O(np^2 log p) time. Special cases of linear, independent, and tree graphs are also considered.
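For the special case of a linear chain of tasks, the response-time objective can be sketched as a dynamic program over tasks and processor counts. This is an illustrative sketch rather than the paper's algorithm for series-parallel graphs. It assumes that the response time of a chain is the sum of its stage times and that a throughput requirement translates into an upper bound `period` on every stage time; the function name and the timing table are hypothetical.

```python
import math


def min_response_chain(times, p, period=math.inf):
    """Minimum total response time for a linear chain of tasks.

    times[i][q - 1] is the measured time of task i on q processors (q = 1..p).
    Every stage must finish within `period` (assumed throughput requirement),
    and at most p processors are used in total.
    """
    # best[k]: minimum total time of the tasks processed so far
    # using at most k processors.
    best = [0.0] * (p + 1)
    for task_times in times:
        new = [math.inf] * (p + 1)
        for k in range(1, p + 1):
            for q in range(1, k + 1):
                t = task_times[q - 1]
                if t <= period and best[k - q] + t < new[k]:
                    new[k] = best[k - q] + t
        best = new
    return best[p]


if __name__ == "__main__":
    table = [[8, 5, 4, 4, 4], [6, 3, 2, 2, 2]]  # hypothetical measurements
    print(min_response_chain(table, 4))
    print(min_response_chain(table, 5, period=4))
```

The three nested loops give O(np^2) work for n tasks and p processors, matching the order of growth quoted in the abstract for the response time problem.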

NASA Technical Reports Server

Syracuse University Research Facility and Collaborative Environment

Embedding Meshes on the Star Graph

Author: Ranka Sanjay
Wang Jhy-Chun
Yeh Nangkang
Publication venue: SURFACE at Syracuse University
Publication date: 01/08/1989

We develop algorithms for mapping n-dimensional meshes on a star graph of degree n with expansion 1 and dilation 3. We show that an n degree star graph can efficiently simulate an n-dimensional mesh

Syracuse University Research Facility and Collaborative Environment

On graphs embeddable in a layer of a hypercube and their extremal numbers

Author: Axenovich Maria
Martin Ryan R.
Winter Christian
Publication date: 27/03/2023

A graph is cubical if it is a subgraph of a hypercube. For a cubical graph $H$ and a hypercube $Q_n$, $\mathrm{ex}(Q_n, H)$ is the largest number of edges in an $H$-free subgraph of $Q_n$. If $\mathrm{ex}(Q_n, H)$ is equal to a positive proportion of the number of edges in $Q_n$, $H$ is said to have positive Turán density in a hypercube; otherwise it has zero Turán density. Determining $\mathrm{ex}(Q_n, H)$ and even identifying whether $H$ has positive or zero Turán density remains a widely open question for general $H$. In this paper we focus on layered graphs, i.e., graphs that are contained in an edge-layer of some hypercube. Graphs $H$ that are not layered have positive Turán density because one can form an $H$-free subgraph of $Q_n$ consisting of edges of every other layer. For example, a $4$-cycle is not layered and has positive Turán density. However, in general it is not obvious what properties layered graphs have. We give a characterisation of layered graphs in terms of edge-colorings and show that any $n$-vertex layered graph has at most $\frac{1}{2} n \log n \, (1+o(1))$ edges. We show that most non-trivial subdivisions have zero Turán density, extending known results on zero Turán density of even cycles of length at least $12$ and of length $8$. However, we prove that there are cubical graphs of girth $8$ that are not layered and thus have positive Turán density. The cycle of length $10$ remains the only cycle for which it is not known whether its Turán density is positive or not. We prove that $\mathrm{ex}(Q_n, C_{10}) = \Omega(n 2^n / \log^a n)$, for a constant $a$, showing that the extremal number for a $10$-cycle behaves differently from any other cycle of zero Turán density.
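The every-other-layer construction mentioned in this abstract can be checked by brute force for small cubes. The sketch below assumes vertices of the hypercube are the integers 0..2^n - 1 read as bit strings, and it takes an edge layer to be the set of edges whose lower endpoint has a given Hamming weight; both conventions and the function names are assumptions made for illustration. It keeps the edges whose lower endpoint has even weight and confirms that no 4-cycle survives, while half of all edges do.

```python
from itertools import combinations


def alternate_layer_edges(n):
    """Edges (v, v | bit) of the n-cube whose lower endpoint v has even weight."""
    edges = set()
    for v in range(1 << n):
        if bin(v).count("1") % 2 == 0:
            for i in range(n):
                if not v & (1 << i):
                    edges.add((v, v | (1 << i)))
    return edges


def has_four_cycle(n, edges):
    """Every 4-cycle of the n-cube is v, v|a, v|a|b, v|b for two bits a, b
    that are both clear in v, so it suffices to test those corners."""
    def present(u, w):
        return (min(u, w), max(u, w)) in edges

    for v in range(1 << n):
        for i, j in combinations(range(n), 2):
            a, b = 1 << i, 1 << j
            if v & (a | b):
                continue
            if (present(v, v | a) and present(v, v | b)
                    and present(v | a, v | a | b)
                    and present(v | b, v | a | b)):
                return True
    return False


if __name__ == "__main__":
    n = 4
    kept = alternate_layer_edges(n)
    print(len(kept), n * 2 ** (n - 1), has_four_cycle(n, kept))
```

A 4-cycle always uses edges from two consecutive layers, which is why dropping every second layer destroys all of them while retaining a positive proportion of the edges.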

arXiv.org e-Print Archive

Fault-Tolerant Ring Embeddings in Hypercubes -- A Reconfigurable Approach

Author: Liu Jun-Lin
McMillin Bruce M.
Publication venue: Scholars' Mine
Publication date: 01/12/1993

We investigate the problem of designing reconfigurable embedding schemes for a fixed hypercube (without redundant processors and links). The fundamental idea for these schemes is to embed a basic network on the hypercube without fully utilizing the nodes on the hypercube. The remaining nodes can be used as spares to reconfigure the embeddings in case of faults. The result of this research shows that by carefully embedding the application graphs, the topological properties of the embedding can be preserved under fault conditions, and reconfiguration can be carried out efficiently. In this dissertation, we choose the ring as the basic network of interest, and propose several schemes for the design of reconfigurable embeddings with the aim of minimizing reconfiguration cost and performance degradation. The cost is measured by the number of node-state changes or reconfiguration steps needed to carry out the reconfiguration, and the performance degradation is characterized as the dilation of the new embedding after reconfiguration. Our schemes surpass existing ones in terms of applicability and the reconfiguration cost of the resulting embeddings.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Performance effects of node mapping on the IBM BlueGene/L machine

Author: Smith Brian Edward
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2005

The IBM BlueGene/L (BG/L) supercomputer is a new machine consisting of up to 65536 relatively modest compute nodes connected with three application-level networks -- a high-performance point-to-point 3D torus network, a global combining/broadcast tree network for collective operations, and a global interrupt/barrier network for extremely fast global barriers. The BG/L control system allows the user to assign MPI logical ranks to physical torus coordinates at run-time in an arbitrary manner as long as all nodes are uniquely included in the mapping. This presents the possibility of increasing application performance with very little effort. This thesis investigates the performance effects of node mapping with several benchmarks and scientific codes using a variety of existing and new mapping strategies. The benchmarks are the NAS parallel benchmarks, the Ames Laboratory Classical Molecular dynamics code (ALCMD), and the General Atomic and Molecular Electronic Structure System (GAMESS) application. The NAS benchmarks are short, easy to understand, and fairly well known. ALCMD has an interesting communication pattern that should benefit from a good mapping strategy. GAMESS is one application that is not necessarily well-suited for running on BlueGene because it requires a large amount of compute power and memory per node. However, it provides an interesting data point for performance of applications that were not designed for a particular system and the possible benefits of mapping on such applications. The mappings investigated were the stock permutations (XYZ, XZY, etc), Gray-code based mesh mappings, random maps, variations on Gray-code maps for embedding 2D meshes in the 3D torus, and three maps designed for GAMESS. Performance results are presented for node mappings on several BG/L partition sizes
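The stock permutation mappings (XYZ, XZY, and so on) can be pictured as different orders in which a logical rank is unrolled into torus coordinates. The sketch below is an illustration only: it assumes that the first letter of the order string names the fastest-varying axis, which may not match the actual BG/L convention, and the helper names `rank_to_coords` and `torus_hops` are hypothetical.

```python
def rank_to_coords(rank, dims, order="XYZ"):
    """Map a logical rank to (x, y, z) torus coordinates.

    dims = (size_x, size_y, size_z). Assumption: the first letter of
    `order` is the axis that varies fastest as the rank increases.
    """
    sizes = dict(zip("XYZ", dims))
    coords = {}
    for axis in order:
        rank, coords[axis] = divmod(rank, sizes[axis])
    return coords["X"], coords["Y"], coords["Z"]


def torus_hops(a, b, dims):
    """Shortest hop count between two nodes of a torus with wraparound links."""
    return sum(min(abs(p - q), d - abs(p - q)) for p, q, d in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 2, 2)
    for order in ("XYZ", "ZYX"):
        a = rank_to_coords(0, dims, order)
        b = rank_to_coords(1, dims, order)
        print(order, a, b, torus_hops(a, b, dims))
```

Comparing hop counts between ranks that communicate frequently under different orders gives a rough first estimate of how much a mapping could matter for a given communication pattern.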

Digital Repository @ Iowa State University (ISU)

Hypercube-Based Topologies With Incremental Link Redundancy.

Author: Latifi Shahram
Publication venue: LSU Digital Commons
Publication date: 01/01/1989

Hypercube structures have received a great deal of attention due to the attractive properties inherent to their topology. Parallel algorithms targeted at this topology can be partitioned into many tasks, each of which runs on one node processor. A high degree of performance is achievable by running every task individually and concurrently on each node processor available in the hypercube. Nevertheless, the performance can be greatly degraded if the node processors spend much time just communicating with one another. The goal in designing hypercubes is, therefore, to achieve a high ratio of computation time to communication time. The dissertation addresses primarily ways to enhance system performance by minimizing the communication time among processors. The need for improving the performance of hypercube networks is clearly explained. Three novel topologies related to hypercubes with improved performance are proposed and analyzed. Firstly, the Bridged Hypercube (BHC) is introduced. It is shown that this design is remarkably more efficient and cost-effective than the standard hypercube due to its low diameter. Basic routing algorithms such as one-to-one and broadcasting are developed for the BHC and proven optimal. Shortcomings of the BHC such as its asymmetry and limited application are clearly discussed. The Folded Hypercube (FHC), a symmetric network with low diameter and low node degree, is introduced. This new topology is shown to support highly efficient communications among the processors. For the FHC, optimal routing algorithms are developed and proven to be remarkably more efficient than those of the conventional hypercube. For both BHC and FHC, network parameters such as average distance, message traffic density, and communication delay are derived and comparatively analyzed. Lastly, to enhance the fault tolerance of the hypercube, a new design called Fault Tolerant Hypercube (FTH) is proposed. The FTH is shown to exhibit a graceful degradation in performance in the presence of faults. Probabilistic models based on Markov chains are employed to characterize the fault tolerance of the FTH. The results are verified by Monte Carlo simulation. The most attractive feature of all new topologies is the asymptotically zero overhead associated with them. The designs are simple and implementable. These designs can lend themselves to many parallel processing applications requiring a high degree of performance.
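The low diameter claimed for the Folded Hypercube can be observed numerically on small instances. The sketch below assumes the usual construction in which every node of the n-cube gets one extra link to its bitwise complement, and it measures the largest breadth-first distance from node 0; because both networks are vertex-symmetric, that value equals the diameter. The function name is hypothetical, and the Bridged and Fault Tolerant designs are not modelled here.

```python
from collections import deque


def diameter_from_zero(n, folded=False):
    """Largest BFS distance from node 0 in the n-cube, optionally with the
    complement links of the folded hypercube added."""
    mask = (1 << n) - 1
    dist = {0: 0}
    queue = deque([0])
    while queue:
        v = queue.popleft()
        neighbours = [v ^ (1 << i) for i in range(n)]
        if folded:
            neighbours.append(v ^ mask)  # link to the complementary node
        for w in neighbours:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, diameter_from_zero(n), diameter_from_zero(n, folded=True))
```

On these small cases the extra link per node roughly halves the diameter, from n to the ceiling of n/2.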

Louisiana State University

On embedding rectangular grids in hypercubes.

Author: Chan MY
Chin FYL
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1988

The following graph-embedding question is addressed: given a two-dimensional grid and the smallest hypercube with at least as many nodes as grid points, how can one assign grid points to hypercube nodes (with at most one grid point per node) so as to keep grid neighbors near each other in the cube? An embedding scheme for an infinite class of two-dimensional grids is given that keeps grid neighbors within a distance of two apart.
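The question is easiest to see in the special case where both grid sides are powers of two: labelling rows and columns with reflected Gray codes and concatenating the two labels places grid neighbors on adjacent hypercube nodes. The sketch below covers only that well-known special case and is not the scheme of the paper, which targets grids that need a distance of two; the function names are hypothetical.

```python
def gray(i):
    """Reflected binary Gray code of i."""
    return i ^ (i >> 1)


def embed_grid(row_bits, col_bits):
    """Map cell (r, c) of a 2^row_bits by 2^col_bits grid to a hypercube node
    label of row_bits + col_bits bits."""
    return {
        (r, c): (gray(r) << col_bits) | gray(c)
        for r in range(1 << row_bits)
        for c in range(1 << col_bits)
    }


def dilation(mapping):
    """Largest Hamming distance between the images of two grid neighbors."""
    worst = 0
    for (r, c), node in mapping.items():
        for neighbour in ((r + 1, c), (r, c + 1)):
            if neighbour in mapping:
                distance = bin(node ^ mapping[neighbour]).count("1")
                worst = max(worst, distance)
    return worst


if __name__ == "__main__":
    m = embed_grid(2, 3)  # a 4 x 8 grid in a 5-dimensional hypercube
    print(len(set(m.values())), dilation(m))
```

Consecutive Gray codes differ in exactly one bit, so stepping to a neighboring row or column flips a single bit of the node label.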

HKU Scholars Hub