343 research outputs found

    General Routing Algorithms for Star Graphs

    In designing algorithms for a specific parallel architecture, a programmer has to cope with topological and cardinality variations, both of which increase the programmer's effort. However, an ideal shared-memory abstract parallel model, the parallel random access machine (PRAM) [KRUS86, KRUS88], has been proposed that avoids these problems and is also simple to program. Unfortunately, the PRAM does not seem to be realizable in present or even foreseeable technologies. On the other hand, a packet routing technique can be employed to simulate the PRAM on a feasible parallel architecture without significant loss of efficiency. The routing problem is also important because of its intrinsic significance in distributed processing and its role in simulations among parallel models. The routing problem is defined as follows: we are given a specific network and a set of packets of information, where a packet is an (origin, destination) pair. Initially, the packets are placed at their origins, one per node. These packets must be routed in parallel to their destinations such that at most one packet passes through any link of the network at any time and all packets arrive at their destinations as quickly as possible. We are interested in a special case of the general routing problem called permutation routing, in which the destinations form some permutation of the origins. A routing algorithm is said to be oblivious if the path taken by each packet depends only on its source and destination. An oblivious routing strategy is preferable since it leads to a simple control structure for the individual processing elements. Oblivious routing algorithms can also be used in a distributed environment. In this paper we are concerned only with oblivious routing strategies
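The notion of an oblivious route can be illustrated with a minimal sketch. It assumes the usual definition of the star graph (nodes are permutations of n symbols, and an edge swaps the symbol in position 0 with the symbol in some position i ≥ 1) and uses the well-known greedy rule for sorting a permutation by such swaps. The function name `star_route` and the tie-breaking choice (lowest misplaced position) are illustrative assumptions, and this is not necessarily the routing algorithm proposed in the paper.

```python
def star_route(source, dest):
    """Return the list of nodes visited from source to dest in a star graph.

    Nodes are tuples holding a permutation of the same n symbols. The path is
    a deterministic function of (source, dest) only, so the route is oblivious.
    """
    n = len(source)
    # Relabel symbols so that dest plays the role of the identity permutation:
    # cur[j] is the position that the symbol currently at position j must reach.
    rank = {sym: i for i, sym in enumerate(dest)}
    cur = [rank[s] for s in source]
    path = [tuple(source)]
    while True:
        front = cur[0]
        if front != 0:
            # The front symbol is away from home: swap it into its home position.
            i = front
        else:
            # The front symbol is home: pull out the lowest misplaced symbol
            # (an assumed tie-breaking rule), or stop if none is left.
            i = next((j for j in range(1, n) if cur[j] != j), None)
            if i is None:
                break
        cur[0], cur[i] = cur[i], cur[0]
        # Translate the relabeled permutation back to the original symbols.
        path.append(tuple(dest[r] for r in cur))
    return path


if __name__ == "__main__":
    print(star_route((1, 2, 0), (0, 1, 2)))
    # prints [(1, 2, 0), (2, 1, 0), (0, 1, 2)]
```

Because each packet's path is fixed by its own (origin, destination) pair, a node needs no knowledge of other packets to forward it. The sketch does not model link contention, which is what a full permutation routing analysis has to bound.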

    Randomized Parallel Selection

    We show that selection on an input of size N can be performed on a P-node hypercube (P = N/(log N)) in time O(N/P) with high probability, provided each node can process all its incident edges in one unit of time (this model is called the parallel model and has been assumed by previous researchers, e.g., [17]). This result is important in view of a lower bound of Plaxton, which implies that selection takes Ω((N/P) log log P + log P) time on a P-node hypercube if each node can process only one edge at a time (this model is referred to as the sequential model)

    Aspects of practical implementations of PRAM algorithms

    The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple, architecture-independent model and a good programming environment. Theoreticians in the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large-scale general-purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme can achieve scalable parallel performance and portable parallel software, and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we present the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimensional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation: balanced tree computations, Fast Fourier Transforms and matrix multiplications

    A Consensus-based Distributed Temperature Priority Control of Air Conditioner Clusters for Voltage Regulation in Distribution Networks

    High penetration of photovoltaic (PV) generation in the distribution network may cause under-voltage and over-voltage issues, limiting the PV hosting capacity. Air conditioners (AC) in grid-interactive buildings can support voltage regulation by manipulating flexible energy consumption. This paper develops a novel voltage control strategy to regulate the AC clusters’ on/off states for distribution network voltage regulation under high PV penetration. The novelty lies in the distributed formulation of temperature priority-based on/off control (TPC) of AC clusters and the strategic selection and permutation of demand response technologies, including the real-time optimal dispatch of demand response resources, distributed sensing of ACs based on an average consensus algorithm, and the local implementation of the TPC strategy and a trial calculation scheme for flexibility capacity estimation. Finally, the distributed TPC is validated to be effective for system rebalancing with no comfort violations and an acceptable ON/OFF switching frequency. The theoretical and numerical analysis also proves its scalability and robustness to communication delays and link failures. It is then incorporated into a novel hierarchical control framework for smart grid voltage control in a four-bus three-phase test grid, considering the voltage sensitivities to power injections in different locations and phases
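The average consensus step mentioned in this abstract can be sketched generically. The code below is the standard linear consensus iteration on an undirected, connected communication graph, not the paper's specific formulation. The function name `average_consensus`, the ring topology, the step size and the temperature readings are all hypothetical values chosen for illustration.

```python
def average_consensus(values, neighbors, step=0.2, iterations=200):
    """Run the linear average consensus iteration and return the final states.

    values     -- initial local measurement of each node
    neighbors  -- neighbors[i] lists the nodes that node i can talk to
                  (assumed symmetric: j in neighbors[i] iff i in neighbors[j])
    step       -- update gain, assumed smaller than 1 / (maximum node degree)
    """
    x = list(values)
    for _ in range(iterations):
        # Every node moves toward its neighbors using only local information.
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


if __name__ == "__main__":
    # Four hypothetical air conditioners on a ring, each knowing only its own
    # room temperature (degrees Celsius) and its two neighbors' states.
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(average_consensus([22.0, 24.0, 26.0, 28.0], ring))
```

With symmetric links the sum of the states is preserved at every step, so each node converges to the network-wide mean without any central collector.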

    Destination Tag Routing Techniques Based on a State Model for the IADM Network

    A state model is proposed for solving the problem of routing and rerouting messages in the Inverse Augmented Data Manipulator (IADM) network. Using this model, necessary and sufficient conditions for the reroutability of messages are established, and then destination tag schemes are derived. These schemes are simpler, more efficient and require less complex hardware than previously proposed routing schemes. Two destination tag schemes are proposed. For one of the schemes, rerouting is totally transparent to the sender of the message and any blocked link of a given type can be avoided. Compared with previous work that deals with the same type of blockage, the time × space complexity is reduced from O(log N) to O(1). For the other scheme, rerouting is possible for any type of link blockage. A universal rerouting algorithm is constructed based on the second scheme, which finds a blockage-free path for any combination of multiple blockages if such a path exists, and indicates its absence otherwise. In addition, the state model is used to derive constructively a lower bound on the number of subgraphs which are isomorphic to the Indirect Binary N-Cube network in the IADM network. This knowledge can be used to characterize properties of the IADM networks and for permutation routing in the IADM networks

    Processor allocation strategies for modified hypercubes

    Parallel processing has been widely accepted to be the future of high-speed computing. Among the various parallel architectures proposed or implemented, the hypercube has shown a lot of promise because of its powerful properties, such as regular topology, fault tolerance, low diameter, simple routing, and the ability to efficiently emulate other architectures. The major drawback of the hypercube network is that it cannot be expanded in practice because the number of communication ports for each processor grows as the logarithm of the total number of processors in the system. Therefore, once a hypercube supercomputer of a certain dimensionality has been built, any future expansion can be accomplished only by replacing the VLSI chips. This is an undesirable feature, and considerable work is in progress to remove this obstacle and provide a platform for easier expansion. Modified hypercubes (MHs) have been proposed as the building blocks of hypercube-based systems supporting incremental growth techniques without introducing extra resources for individual hypercubes. However, processor allocation on MHs proves to be a challenge due to a slight deviation in their topology from that of the standard hypercube network. This thesis addresses the issue of processor allocation on MHs and proposes various strategies which are based, partially or entirely, on table look-up approaches. A study of the various task allocation strategies for standard hypercubes is conducted and their suitability for MHs is evaluated. It is shown that the proposed strategies have a perfect subcube recognition ability and superior performance. Existing processor allocation strategies for pure hypercube networks are demonstrated to be ineffective for MHs, in light of their inability to recognize all available subcubes. A comparative analysis that involves the buddy strategy and the new strategies is carried out using simulation results
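For context on the buddy strategy named above, here is a minimal sketch of its usual form on a standard hypercube: a request for a k-dimensional subcube is granted the first aligned block of 2^k consecutive node addresses that are all free. The function name `buddy_allocate` and the list-of-flags representation are illustrative assumptions; the thesis's table look-up strategies for MHs are not reproduced here.

```python
def buddy_allocate(busy, k):
    """Allocate a k-dimensional subcube using the buddy rule.

    busy -- list of booleans indexed by node address (length 2**n); it is
            updated in place when an allocation succeeds
    k    -- dimension of the requested subcube

    Returns the first address of the allocated block, or None on failure.
    """
    size = 1 << k
    # Only blocks aligned to a multiple of 2**k are examined.
    for start in range(0, len(busy), size):
        block = range(start, start + size)
        if not any(busy[p] for p in block):
            for p in block:
                busy[p] = True
            return start
    return None


if __name__ == "__main__":
    # In a 3-cube, nodes 0, 1, 4 and 5 (binary x0x) form a free 2-subcube,
    # but they are not one aligned block, so the buddy rule cannot see it.
    busy = [False, False, True, True, False, False, True, True]
    print(buddy_allocate(busy, 2))
    # prints None
```

The example shows the kind of incomplete subcube recognition that motivates strategies with better recognition ability: a free subcube exists, yet the request is rejected.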