14 research outputs found

    Embeddings in hypercubes

    One important aspect of efficient use of a hypercube computer to solve a given problem is the assignment of subtasks to processors in such a way that the communication overhead is low. The subtasks and their inter-communication requirements can be modeled by a graph, and the assignment of subtasks to processors viewed as an embedding of the task graph into the graph of the hypercube network. We survey the known results concerning such embeddings, including expansion/dilation tradeoffs for general graphs, embeddings of meshes and trees, packings of multiple copies of a graph, the complexity of finding good embeddings, and critical graphs which are minimal with respect to some property. In addition, we describe several open problems. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/27512/1/0000556.pd
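
    As a small illustration of the dilation measure mentioned in this abstract, the sketch below embeds a cycle of 2^d nodes into a d-dimensional hypercube in two ways and compares the results. The binary-reflected Gray code mapping is a standard textbook construction, not necessarily one taken from the survey, and the helper names `gray` and `dilation` are hypothetical.

```python
def gray(i: int) -> int:
    """Binary-reflected Gray code of i."""
    return i ^ (i >> 1)


def dilation(edges, phi):
    """Largest hypercube (Hamming) distance between the images of adjacent task nodes."""
    return max(bin(phi[u] ^ phi[v]).count("1") for u, v in edges)


d = 4
n = 1 << d  # cycle with 2^d nodes, hypercube with 2^d processors

# Task graph: a cycle 0 - 1 - ... - (n-1) - 0.
cycle_edges = [(i, (i + 1) % n) for i in range(n)]

identity_map = {i: i for i in range(n)}     # naive assignment: node i -> processor i
gray_map = {i: gray(i) for i in range(n)}   # Gray-code assignment

print(dilation(cycle_edges, identity_map))  # 4
print(dilation(cycle_edges, gray_map))      # 1
```

    With the naive assignment, some neighboring cycle nodes land on processors that differ in all d address bits; with the Gray code, every cycle edge maps onto a single hypercube link.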

    PERFECT DOMINATING SETS

    A dominating set S of a graph G is perfect if each vertex of G is dominated by exactly one vertex in S. We study the existence and construction of PDSs in families of graphs arising from the interconnection networks of parallel computers. These include trees, dags, series-parallel graphs, meshes, tori, hypercubes, cube-connected cycles, cube-connected paths, and de Bruijn graphs. For trees, dags, and series-parallel graphs we give linear time algorithms that determine if a PDS exists, and generate a PDS when one does. For 2- and 3-dimensional meshes, 2-dimensional tori, hypercubes, and cube-connected paths we completely characterize which graphs have a PDS, and the structure of all PDSs. For higher dimensional meshes and tori, cube-connected cycles, and de Bruijn graphs, we show the existence of a PDS in infinitely many cases, but our characterization is not complete. Our results include distance d-domination for arbitrary d.

    Shift-Product Networks (to appear in Mathematical and Computational Modelling)

    Economics is the principal driver of current trends to use commodity components in the construction of parallel systems for a range of sizes. One objective is to enable the user to purchase a small system initially, and then extend it through a range of sizes as needs dictate. A desirable feature is to have good system performance through the range of sizes; an undesirable feature is to require the user to purchase excess hardware that will not be used until the system is grown to its maximum allowable size. In this paper, we give a new construction which reconciles these two conflicting factors by introducing a way to interconnect several components of a given small network using only the routers for the small network, with one additional port per router. This construction, which we call shift-product, does not unduly raise the communication diameter of the resulting large network.

    Perfect Dominating Sets

    A dominating set S of a graph G is perfect if each vertex of G is dominated by exactly one vertex in S. We study the existence and construction of PDSs in families of graphs arising from the interconnection networks of parallel computers. These include trees, dags, series-parallel graphs, meshes, tori, hypercubes, cube-connected cycles, cube-connected paths, and de Bruijn graphs. For trees, dags, and series-parallel graphs we give linear time algorithms that determine if a PDS exists, and generate a PDS when one does. For 2- and 3-dimensional meshes, 2-dimensional tori, hypercubes, and cube-connected paths we completely characterize which graphs have a PDS, and the structure of all PDSs. For higher dimensional meshes and tori, cube-connected cycles, and de Bruijn graphs, we show the existence of a PDS in infinitely many cases, but our characterization is not complete. Our results include distance d-domination for arbitrary d. 1 Introduction Suppose G = (V; E) is a graph with vertex se..
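
    The definition in this abstract can be checked mechanically on small graphs. The sketch below assumes the usual convention that a vertex dominates itself and its neighbors (distance 1), which the abstract does not spell out; the helper names `hypercube` and `is_perfect_dominating_set` are hypothetical and are not the authors' algorithms.

```python
def hypercube(d):
    """Adjacency lists of the d-dimensional hypercube on vertices 0 .. 2^d - 1."""
    return {v: [v ^ (1 << b) for b in range(d)] for v in range(1 << d)}


def is_perfect_dominating_set(adj, s):
    """True if every vertex has exactly one member of s in its closed neighborhood."""
    s = set(s)
    for v, neighbors in adj.items():
        dominators = ({v} | set(neighbors)) & s
        if len(dominators) != 1:
            return False
    return True


q3 = hypercube(3)
print(is_perfect_dominating_set(q3, {0b000, 0b111}))  # True
print(is_perfect_dominating_set(q3, {0b000, 0b011}))  # False
```

    In the 3-dimensional hypercube, the antipodal pair 000 and 111 splits the eight vertices into two disjoint closed neighborhoods of four vertices each. The second set fails because vertex 001 is adjacent to both 000 and 011.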

    Parallel Allocation Algorithms For Hypercubes And Meshes

    We consider the problem of subsystem allocation in the mesh, torus, and hypercube multicomputers. Although the usual practice is to use a serial algorithm on the host processor to do the allocation, we show how the free and non-faulty processors can be used to perform the allocation in parallel. The algorithms we provide are dynamic, require very little storage, and work correctly even in the presence of faults. For the 2-dimensional mesh and torus with $n$ processors, we give an optimal $\Theta(\sqrt{n})$ time algorithm which identifies all rectangular subsystems that are not busy and not faulty. For the $d$-dimensional mesh and torus of size $n = m \times m \times \cdots \times m$, we show how to find all submeshes of dimensions $k \times k \times \cdots \times k$, for all $k \le m$, in optimal $\Theta(d n^{1/d})$ time. Since the number of subcubes in a hypercube of dimension $d$ is $3^d$, the current practice is to allocate only a fraction of the possible subcubes, which d..
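
    The subcube count quoted in this abstract can be reproduced by brute force. The sketch below assumes the common representation of a subcube as a length-d string over {0, 1, *}, where * marks a free address bit; the function name `subcubes` is hypothetical and the code is only an enumeration, not the allocation algorithm the paper describes.

```python
from itertools import product
from math import comb


def subcubes(d):
    """All subcubes of a d-dimensional hypercube as strings over '0', '1', '*'."""
    return ["".join(p) for p in product("01*", repeat=d)]


d = 4
all_subcubes = subcubes(d)
assert len(all_subcubes) == 3 ** d  # three choices per address bit

# Group by subcube dimension k, i.e. the number of free bits.
by_dim = {k: sum(1 for s in all_subcubes if s.count("*") == k) for k in range(d + 1)}
assert all(by_dim[k] == comb(d, k) * 2 ** (d - k) for k in range(d + 1))

print(by_dim)  # {0: 16, 1: 32, 2: 24, 3: 8, 4: 1}
```

    Choosing which k bits are free and fixing the remaining d - k bits gives C(d, k) * 2^(d-k) subcubes of dimension k, and these counts sum to 3^d.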

    Shift-Product Networks

    Economics is the principal driver of current trends to use commodity components in the construction of parallel systems for a range of sizes. One objective is to enable the user to purchase a small system initially, and then extend it through a range of sizes as needs dictate. A desirable feature is to have good system performance through the range of sizes; an undesirable feature is to require the user to purchase excess hardware that will not be used until the system is grown to its maximum allowable size. In this paper, we give a new construction which reconciles these two conflicting factors by introducing a way to interconnect several components of a given small network using only the routers for the small network, with one additional port per router. This construction, which we call shift-product, does not unduly raise the communication diameter of the resulting large network. Key Words: interconnection networks, scalable networks, shuffle-exchange networks, Cayley graphs, produ..

    Constant Time Computation of Minimum Dominating Sets

    Let $G$ be a graph and let $P(n)$ denote an element from a one-parameter family of graphs, such as a path of length $n$, a cycle of length $n$, or a complete binary tree of height $n$. We are concerned with determining minimum dominating sets of graphs of the form $G \times P(n)$. Using dynamic programming and properties of finite state spaces, we show a constant time algorithm to produce a minimum dominating set of $G \times P(n)$, for fixed $G$ and all $n$, for the one-parameter families mentioned. Previous researchers had used similar techniques but obtained only linear-time algorithms. We also show how a closed form expression can be obtained for the minimum domination number of $G \times P(n)$. We discuss extensions of the algorithm to the determination of all minimum dominating sets for $G \times P(n)$, and to related problems of coverings, packings, and codes. In addition, we discuss algorithm extensions to several different types of domination, including perfect domination, and to other ways of ..
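
    To make the idea of a closed form for the domination number concrete, the sketch below takes the simplest case, where G is a single vertex and the product is just a path. It compares an exhaustive search with the well-known formula ceil(n/3) for a path on n vertices. Two assumptions: n counts vertices here, whereas the abstract speaks of path length, and the brute-force search is a stand-in for illustration, not the paper's dynamic programming method. The names `path` and `domination_number` are hypothetical.

```python
from itertools import combinations


def path(n):
    """Adjacency lists of a path on n vertices, numbered 0 .. n-1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def domination_number(adj):
    """Size of a smallest dominating set, found by exhaustive search (exponential time)."""
    vertices = list(adj)
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            covered = set(candidate)
            for v in candidate:
                covered.update(adj[v])
            if len(covered) == len(vertices):
                return size
    return 0  # only reached for the empty graph


for n in range(1, 11):
    # -(-n // 3) is integer ceiling division, i.e. ceil(n / 3).
    assert domination_number(path(n)) == -(-n // 3)
```

    Once such a formula is known, the domination number for any n follows from arithmetic alone, without searching the graph.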

    Perfect Dominating Sets on Cube-Connected Cycles

    Cube-connected cycles are a family of cubic graphs with relatively small diameters and regular structure, making them attractive models for parallel architecture design. The existence of perfect dominating sets for any structural model of parallel computation is both useful for the construction of efficient algorithms for that structure and indicative of practical design constraints. This paper gives a simple algorithmic method for constructing perfect dominating sets on cube-connected cycles where they exist, and proves nonexistence for all other cases. Specifically, standard perfect dominating sets (distance equal to 1) are shown to exist for cube-connected cycles of order k, k not equal to 5. Moreover, the existence of perfect dominating sets for all distances greater than 1 is disproved (with the trivial exception of the distance equaling or exceeding the diameter of the graph). Keywords: Cube-Connected Cycles, Dominating Sets, Perfect Dominating Sets, Parallel Architecture, Paral..

    Fault Tolerance of the Cyclic Buddy Subcube Location Scheme in Hypercubes

    This paper examines the problem of locating large fault-free subcubes in multiuser hypercube systems. We analyze a new location strategy, the cyclic buddy system, and compare its performance to the buddy system, the Gray-coded buddy system, and several variants of them. We show that the cyclic buddy system gives a striking improvement in expected fault tolerance over the above schemes and, since it can easily be implemented in parallel with little overhead, it provides an attractive alternative to these schemes. We also investigate the behavior of these location systems in the folded, or projective, hypercube, and find that the cyclic buddy system, which adapts naturally to this enhancement, significantly outperforms the other schemes. A combination of analytic techniques and simulation is used to examine both worst case and expected case performance. Keywords: fault tolerance, subcube location, subcube allocation, hypercube computer, buddy system, Gray-coded system, folded hypercube. ..

    The Impact of Spatial Layout of Jobs on Parallel I/O Performance

    Input/Output is a big obstacle to effective use of teraflops-scale computing systems. Motivated by earlier parallel I/O measurements on an Intel TFLOPS machine, we conduct studies to determine the sensitivity of parallel I/O performance on multi-programmed mesh-connected machines with respect to number of I/O nodes, number of compute nodes, network link bandwidth, I/O node bandwidth, spatial layout of jobs, and read or write demands of applications. Our extensive simulations and analytical modeling yield important insights into the limitations on parallel I/O performance due to network contention, and into the possible gains in parallel I/O performance that can be achieved by tuning the spatial layout of jobs. Applying these results, we devise a new processor allocation strategy that is sensitive to parallel I/O traffic and the resulting network contention. In performance evaluations driven by synthetic workloads and by a real workload trace captured at the San Diego Supercomputing Cen..