106 research outputs found

    ONE BY ONE EMBEDDING THE CROSSED HYPERCUBE INTO PANCAKE GRAPH

    Let G and H be two simple undirected graphs. An embedding of the graph G into the graph H is an injective mapping f from the vertices of G to the vertices of H. The dilation of an embedding is the maximum distance between f(u) and f(v), taken over all edges (u, v) of G. The Pancake graph is a viable interconnection scheme for parallel computers and has been examined by a number of researchers. It was proposed as an alternative to the hypercube for interconnecting processors in parallel computers. Attractive properties of this interconnection network include vertex symmetry, small degree, sub-logarithmic diameter, extendability, high connectivity (robustness), easy routing, regularity of topology, fault tolerance, and embeddability of other topologies. In this paper, we give a construction of a one-by-one embedding of dilation 5 of the crossed hypercube into the Pancake graph
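
    The definition of dilation above can be checked mechanically on small instances. The following is a minimal sketch, not the construction from the paper: it assumes the usual prefix-reversal definition of the Pancake graph, and the function names, the 4-cycle guest graph and the mapping `f` are all hypothetical choices made for illustration.

```python
from collections import deque

def pancake_neighbors(p):
    """Neighbours of permutation p (a tuple): reverse a prefix of length 2..n."""
    return [p[:i][::-1] + p[i:] for i in range(2, len(p) + 1)]

def pancake_distance(src, dst):
    """Breadth-first search distance between two Pancake graph vertices."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return dist[v]
        for w in pancake_neighbors(v):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    raise ValueError("destination is not a permutation of the same symbols")

def dilation(guest_edges, f):
    """Maximum host distance between f(u) and f(v) over guest edges (u, v)."""
    if len(set(f.values())) != len(f):
        raise ValueError("f must be injective")
    return max(pancake_distance(f[u], f[v]) for u, v in guest_edges)

# Hypothetical guest: a 4-cycle a-b-c-d-a, mapped into the 3-symbol Pancake graph.
guest_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
f = {"a": (1, 2, 3), "b": (2, 1, 3), "c": (3, 1, 2), "d": (1, 3, 2)}
print(dilation(guest_edges, f))
```

    The 3-symbol Pancake graph is a 6-cycle, so three guest edges map to host edges while the edge (d, a) is stretched across the remaining three hops; the breadth-first search is only practical for small n because the host has n! vertices.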

    Selection, Routing and Sorting on the Star Graph

    We consider the problems of selection, routing and sorting on an n-star graph (with n! nodes), an interconnection network which has been proven to possess many special properties. We identify a tree-like subgraph of the star graph (which we call a '(k, l, k) chain network') that enables us to design efficient algorithms for these problems. We present an algorithm that performs a sequence of n prefix computations in $O(n^2)$ time. This algorithm is used as a subroutine in our other algorithms. In addition, we offer an efficient deterministic sorting algorithm that runs in $O(n^3 \lg n)$ steps. Though an algorithm with the same time bound has been proposed before, our algorithm is very simple and is based on a different approach. We also show that sorting can be performed on the n-star graph in time $O(n^3)$ and that selection of a set of uniformly distributed n keys can be performed in $O(n^2)$ time with high probability. Finally, we present a deterministic (non-oblivious) routing algorithm that realizes any permutation in $O(n^3)$ steps on the n-star graph. There exists an algorithm in the literature that can perform a single prefix computation in $O(n \lg n)$ time. The best known previous algorithm for sorting has a run time of $O(n^3 \lg n)$ and is deterministic. To our knowledge, the problem of selection has not been considered before on the star graph
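
    For readers unfamiliar with the topology, the n-star graph itself is easy to generate. The sketch below assumes the standard definition (vertices are permutations of n symbols, and an edge swaps the first symbol with the symbol in some other position); the function names are illustrative, and none of the paper's selection, routing or sorting algorithms is reproduced here.

```python
from collections import deque

def star_neighbors(p):
    """Neighbours of p in the n-star graph: swap position 0 with position i."""
    return [(p[i],) + p[1:i] + (p[0],) + p[i + 1:] for i in range(1, len(p))]

def explore(src):
    """BFS from src; returns (number of reachable nodes, eccentricity of src)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for w in star_neighbors(v):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return len(dist), max(dist.values())

# The 4-star graph: 4! nodes, each of degree n - 1 = 3.
nodes, ecc = explore((1, 2, 3, 4))
print(nodes, ecc)
```

    Because the star graph is vertex-symmetric, the eccentricity of any single vertex equals the diameter, which is why one BFS suffices in this sketch.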

    Properties and algorithms of the hyper-star graph and its related graphs

    The hyper-star interconnection network was proposed in 2002 to overcome the drawbacks of the hypercube and its variations concerning the network cost, which is defined as the product of the degree and the diameter. Some properties of the graph, such as connectivity, symmetry and embedding properties, have been studied by other researchers; routing and broadcasting algorithms have also been designed. This thesis studies the hyper-star graph from both the topological and the algorithmic point of view. For the topological properties, we try to establish relationships between hyper-star graphs and other known graphs. We also give a formal equation for the surface area of the graph. Another topological property we are interested in is the Hamiltonicity of this graph. For the algorithms, we design an all-port broadcasting algorithm and a single-port neighbourhood broadcasting algorithm for the regular form of the hyper-star graphs. These algorithms are both optimal time-wise. Furthermore, we prove that the folded hyper-star, a variation of the hyper-star, is maximally fault-tolerant

    Properties and algorithms of the (n, k)-arrangement graphs

    The (n, k)-arrangement interconnection topology was first introduced in 1992. The (n, k)-arrangement graph is a class of generalized star graphs. Compared with the well-known n-star, the (n, k)-arrangement graph is more flexible in degree and diameter. However, few algorithms have been designed for the (n, k)-arrangement graph to date. In this thesis, we focus on finding graph-theoretical properties of the (n, k)-arrangement graph and developing parallel algorithms that run on this network. The topological properties of the arrangement graph, including its cyclic properties, are studied first. We then study the communication problems of broadcasting and routing, followed by embedding problems. These are very useful for developing efficient algorithms on this network. We then study the (n, k)-arrangement network from the algorithmic point of view. Specifically, we investigate both fundamental and application algorithms such as prefix sums computation, sorting, merging and a basic geometric computation, finding the convex hull, on the (n, k)-arrangement graph. A literature review of the state of the art in relation to the (n, k)-arrangement network is also provided, as well as some open problems in this area
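
    The flexibility in degree mentioned above follows directly from the definition. As a minimal sketch, assuming the standard definition of the (n, k)-arrangement graph (vertices are the arrangements of k out of n symbols, and two vertices are adjacent when they differ in exactly one position), the graph can be generated as follows; the function names are illustrative.

```python
from itertools import permutations

def arrangement_vertices(n, k):
    """All arrangements of k distinct symbols chosen from 1..n."""
    return list(permutations(range(1, n + 1), k))

def arrangement_neighbors(v, n):
    """Replace the symbol in one position by a symbol not used in v."""
    unused = sorted(set(range(1, n + 1)) - set(v))
    return [v[:i] + (s,) + v[i + 1:] for i in range(len(v)) for s in unused]

n, k = 4, 2
vertices = arrangement_vertices(n, k)
degrees = {len(arrangement_neighbors(v, n)) for v in vertices}
print(len(vertices), degrees)
```

    Under this definition the graph has n!/(n-k)! vertices and is regular of degree k(n-k), so choosing k trades network size against degree.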

    Fixed Linear Crossing Minimization by Reduction to the Maximum Cut Problem

    Many real-life scheduling, routing and location problems can be formulated as combinatorial optimization problems whose goal is to find a linear layout of an input graph in such a way that the number of edge crossings is minimized. In this paper, we study a restricted version of the linear layout problem where the order of vertices on the line is fixed, the so-called fixed linear crossing number problem (FLCNP). We show that this NP-hard problem can be reduced to the well-known maximum cut problem. The latter problem has been studied intensively in the literature, and practically efficient exact algorithms based on the branch-and-cut technique have been developed. An experimental evaluation on a variety of graphs shows that using this reduction for solving FLCNP compares favorably to earlier branch-and-bound algorithms
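
    The idea behind the reduction can be illustrated on a toy instance. The sketch below assumes the usual FLCNP setting in which each edge is drawn as an arc above or below the line, so that two edges cross exactly when they lie on the same side and their endpoints interleave. Minimizing crossings is then the same as maximizing the cut in a "conflict graph" whose vertices are the edges. The brute-force search stands in for the branch-and-cut solver of the paper, and all names are hypothetical.

```python
from itertools import combinations, product

def interleave(e, f):
    """True if edges e and f have strictly alternating endpoints on the line."""
    a, b = sorted(e)
    c, d = sorted(f)
    return a < c < b < d or c < a < d < b

def conflict_pairs(edges):
    """Pairs of edge indices that cross when drawn on the same side."""
    return [(i, j) for i, j in combinations(range(len(edges)), 2)
            if interleave(edges[i], edges[j])]

def min_crossings(edges):
    """Exhaustive max cut on the conflict graph; crossings = conflicts - cut."""
    conflicts = conflict_pairs(edges)
    best = len(conflicts)
    for sides in product((0, 1), repeat=len(edges)):
        cut = sum(sides[i] != sides[j] for i, j in conflicts)
        best = min(best, len(conflicts) - cut)
    return best

# Complete graph on 5 vertices placed in the order 0, 1, 2, 3, 4.
k5_edges = list(combinations(range(5), 2))
print(len(conflict_pairs(k5_edges)), min_crossings(k5_edges))
```

    For this instance the conflict graph is a 5-cycle, whose maximum cut has 4 edges, so exactly one crossing is unavoidable. The exhaustive search is exponential in the number of edges and is meant only to make the correspondence concrete.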

    Interconnection networks for parallel and distributed computing

    Parallel computers are generally either shared-memory machines or distributed-memory machines. There are currently technological limitations on shared-memory architectures, and so parallel computers utilizing a large number of processors tend to be distributed-memory machines. We are concerned solely with distributed-memory multiprocessors. In such machines, the dominant factor inhibiting faster global computations is inter-processor communication. Communication is dependent upon the topology of the interconnection network, the routing mechanism, the flow control policy, and the method of switching. We are concerned with issues relating to the topology of the interconnection network. The choice of how we connect processors in a distributed-memory multiprocessor is a fundamental design decision. There are numerous, often conflicting, considerations to bear in mind. However, there does not exist an interconnection network that is optimal on all counts, and trade-offs have to be made. A multitude of interconnection networks have been proposed, each having some good (topological) properties and some not so good. Existing noteworthy networks include trees, fat-trees, meshes, cube-connected cycles, butterflies, Möbius cubes, hypercubes, augmented cubes, k-ary n-cubes, twisted cubes, n-star graphs, (n, k)-star graphs, alternating group graphs, de Bruijn networks, and bubble-sort graphs, to name but a few. We will mainly focus on k-ary n-cubes and (n, k)-star graphs in this thesis. Meanwhile, we propose a new interconnection network called augmented k-ary n-cubes. The following results are given in the thesis.
    1. Let $k \geq 4$ be even and let $n \geq 2$. Consider a faulty k-ary n-cube $Q_n^k$ in which the number of node faults $f_n$ and the number of link faults $f_e$ are such that $f_n + f_e \leq 2n - 2$. We prove that given any two healthy nodes s and e of $Q_n^k$, there is a path from s to e of length at least $k^n - 2f_n - 1$ (resp. $k^n - 2f_n - 2$) if the nodes s and e have different (resp. the same) parities (the parity of a node in $Q_n^k$ is the sum modulo 2 of the elements in the n-tuple over $\{0, 1, \ldots, k - 1\}$ representing the node). Our result is optimal in the sense that there are pairs of nodes and fault configurations for which these bounds cannot be improved, and it answers questions recently posed by Yang, Tan and Hsu, and by Fu. Furthermore, we extend known results, obtained by Kim and Park, for the case when $n = 2$.
    2. We give precise solutions to problems posed by Wang, An, Pan, Wang and Qu and by Hsieh, Lin and Huang. In particular, we show that $Q_n^k$ is bi-panconnected and edge-bipancyclic when $k \geq 3$ and $n \geq 2$, and we also show that when k is odd, $Q_n^k$ is m-panconnected, for $m = \frac{n(k - 1) + 2k - 6}{2}$, and $(k - 1)$-pancyclic (these bounds are optimal). We introduce a path-shortening technique, called progressive shortening, and strengthen existing results, showing that when paths are formed using progressive shortening, these paths can be efficiently constructed and used to solve a problem relating to the distributed simulation of linear arrays and cycles in a parallel machine whose interconnection network is $Q_n^k$, even in the presence of a faulty processor.
    3. We define an interconnection network $AQ_n^k$, which we call the augmented k-ary n-cube, by extending a k-ary n-cube in a manner analogous to the existing extension of an n-dimensional hypercube to an n-dimensional augmented cube. We prove that the augmented k-ary n-cube $AQ_n^k$ has a number of attractive properties (in the context of parallel computing). For example, we show that $AQ_n^k$ is a Cayley graph (and so is vertex-symmetric); has connectivity $4n - 2$, and is such that we can build a set of $4n - 2$ mutually disjoint paths joining any two distinct vertices so that the path of maximal length has length at most $\max\{(n - 1)k - (n - 2),\ k + 7\}$; has diameter $\lfloor k/3 \rfloor + \lceil (k - 1)/3 \rceil$ when $n = 2$; and has diameter at most $\frac{k}{4}(n + 1)$ for $n \geq 3$ and k even, and at most $\frac{k}{4}(n + 1) + \frac{n}{4}$ for $n \geq 3$ and k odd.
    4. We present an algorithm which, given a source node and a set of $n - 1$ target nodes in the (n, k)-star graph $S_{n,k}$ where all nodes are distinct, builds a collection of $n - 1$ node-disjoint paths, one from each target node to the source. The collection of paths output by the algorithm is such that each path has length at most $6k - 7$, and the algorithm has time complexity $O(k^3 n^4)$
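
    The role of node parity in the first result can be seen on a small example. This sketch assumes the standard definition of the k-ary n-cube (nodes are n-tuples over 0..k-1, and two nodes are adjacent when they differ by 1 modulo k in exactly one coordinate); the function names are illustrative. It checks that every link joins nodes of opposite parity when k is even, which is not the case when k is odd.

```python
from itertools import product

def nodes(k, n):
    """All nodes of the k-ary n-cube as n-tuples over 0..k-1."""
    return list(product(range(k), repeat=n))

def neighbors(v, k):
    """Nodes differing from v by +1 or -1 (mod k) in exactly one coordinate."""
    out = []
    for i, x in enumerate(v):
        for step in (1, -1):
            out.append(v[:i] + ((x + step) % k,) + v[i + 1:])
    return out

def parity(v):
    """Sum of the coordinates modulo 2, as in the abstract."""
    return sum(v) % 2

def links_flip_parity(k, n):
    """True if every link joins two nodes of different parity."""
    return all(parity(v) != parity(w) for v in nodes(k, n) for w in neighbors(v, k))

print(links_flip_parity(4, 2), links_flip_parity(3, 2))
```

    When k is even the graph is therefore bipartite with the two parity classes as its sides, which is why the attainable path length between two nodes depends on whether their parities agree.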

    Models for Type I X-Ray Bursts Nucleosynthesis with Parallelisation and Improved Nuclear Physics

    Type I X-ray bursts (XRBs) are thermonuclear flashes on the surface of neutron stars (NS) associated with mass accretion from a companion star. Models of type I XRBs and their associated nucleosynthesis are physically complex and computationally demanding: modelling the physical processes with the precision needed to be truly representative requires very large computing power. Until recently, because of these computational limitations, studies of XRB nucleosynthesis have been performed using limited nuclear reaction networks. To overcome this hurdle, parallel computing has emerged as the main enabler of more precise and computationally intensive simulations, as it offers the potential to concentrate computational resources on demanding problems. In this work, we present a parallelisation of two different applications: a one-zone (i.e. parameterized) nucleosynthesis code, and a one-dimensional (spherically symmetric) hydrodynamic code in Lagrangian formulation (hereafter the SHIVA code), built originally to model classical nova outbursts (José 1996; José & Hernanz 1998). The codes have been parallelised using the MPICH2 implementation of the Message Passing Interface (MPI) specification for the design of parallel applications on clusters of distributed workstations. As an example, to execute a hydrodynamic simulation over 200k time-steps, the SHIVA code requires (in its sequential, single-node version) about 147 hours (6.1 days) when using a reduced nuclear network with 324 isotopes and 1392 nuclear reactions, and 688 hours (28.6 days) when using a network with 606 nuclides and 3551 nuclear reactions for the same number of time-steps. The post-processing nucleosynthesis code is a time-step loosely synchronous application with a very small problem size (limited by the number of isotopes of the nuclear network). 
As shown by the performance tests, this results in the worst possible scenario for parallelisation: the performance of the parallel application is much worse than that of the sequential, 1-node version of the code. Our results show that it is therefore not possible to parallelise a post-processing nucleosynthesis code efficiently, and efforts in this regard should be avoided. By contrast, the parallelised version of the SHIVA code yields excellent performance results. A speed-up factor of 26 is achieved in a simulation with a reduced network consisting of 324 isotopes and 1392 nuclear reactions when 42 processors are used in parallel to execute the application over 200k time-steps. Likewise, a speed-up factor of 35 is achieved in a simulation with a reaction network of up to 606 nuclides and 3551 nuclear reactions. Maximum speed-ups of ~41 and ~85 are predicted by the performance models when using 200 processors, for the reduced and extended simulations respectively. Our results will not only improve the quality of the simulations (and hence publications) in terms of better numerical approaches, finer approximations, and a considerably shorter time-to-publication, but will also allow taking advantage, if desired, of parallel supercomputing facilities like the Mare Nostrum at the Supercomputing Centre in Barcelona (BSC)
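
    The figures quoted above can be turned into wall-clock times with a little arithmetic. The sketch below uses only the numbers stated in the abstract (147 h and 688 h sequential run times, speed-ups of 26 and 35, and 42 processors for the reduced network); the helper names are illustrative, and it assumes that speed-up means sequential time divided by parallel time.

```python
def parallel_hours(sequential_hours, speedup):
    """Wall-clock time of the parallel run, given speed-up = T_seq / T_par."""
    return sequential_hours / speedup

def efficiency(speedup, processors):
    """Fraction of ideal linear speed-up actually obtained."""
    return speedup / processors

reduced = parallel_hours(147, 26)    # reduced network, 324 isotopes
extended = parallel_hours(688, 35)   # extended network, 606 nuclides
print(f"reduced: {reduced:.1f} h, extended: {extended:.1f} h")
print(f"efficiency on 42 processors (reduced network): {efficiency(26, 42):.2f}")
```

    Under these assumptions the reduced-network run drops from about six days to under six hours, and the extended-network run from about four weeks to under a day.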

    Study of robotics systems applications to the space station program

    Applications of robotics systems to potential uses of the Space Station as an assembly facility, and secondarily as a servicing facility, are considered. A typical robotics system mission is described along with the pertinent application guidelines and Space Station environmental assumptions utilized in developing the robotic task scenarios. A functional description of a supervised dual-robot space structure construction system is given, and four key areas of robotic technology are defined, described, and assessed. Alternate technologies for implementing the more routine space technology support subsystems that will be required to support the Space Station robotic systems in assembly and servicing tasks are briefly discussed. The environmental conditions impacting on the robotic configuration design and operation are reviewed

    Combinatorial Design and Analysis of Optimal Multiple Bus Systems for Parallel Algorithms.

    This dissertation develops a formal and systematic methodology for designing optimal, synchronous multiple bus systems (MBSs) realizing given (classes of) parallel algorithms. Our approach utilizes graph and group theoretic concepts to develop the necessary model and procedural tools. By partitioning the vertex set of the graphical representation CFG of the algorithm, we extract a set of interconnection functions that represents the interprocessor communication requirement of the algorithm. We prove that the optimal partitioning problem is NP-Hard. However, we show how to obtain polynomial time solutions by exploiting certain regularities present in many well-behaved parallel algorithms. The extracted set of interconnection functions is represented by an edge-colored, directed graph called the interconnection function graph (IFG). We show that the problem of constructing an optimal MBS to realize an IFG is NP-Hard. We show important special cases where polynomial time solutions exist. In particular, we prove that polynomial time solutions exist when the IFG is vertex symmetric. This is the case of interest for the vast majority of important interconnection function sets, whether extracted from algorithms or corresponding to existing interconnection networks. We show that an IFG is vertex symmetric if and only if it is the Cayley color graph of a finite group $\Gamma$ and its generating set $\Delta$. Using this property, we present a particular scheme to construct a symmetric MBS $M(\Gamma, \Delta)$ with minimum number of buses as well as minimum number of interfaces realizing a vertex symmetric IFG. We demonstrate several advantages of the optimal MBS $M(\Gamma, \Delta)$ in terms of its symmetry, number of ports per processor, number of neighbors per processor, and the diameter. 
We also investigate the fault tolerant capabilities and performance degradation of $M(\Gamma, \Delta)$ in the case of a single bus failure, single driver failure, single receiver failure, and single processor failure. Further, we address the problem of designing an optimal MBS realizing a class of algorithms when the number of buses and/or processors in the target MBS are specified. The optimality criteria are maximizing the speed and minimizing the number of interfaces