24 research outputs found

    Aspects of practical implementations of PRAM algorithms

    Get PDF
    The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple architecture independent model and provides a good programming environment. Theoreticians of the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large scale general purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme crn achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we presented the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimcnsional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation; balanced tree computations. Fast Fourier Transforms and matrix multiplications

    Broadcasting in highly connected graphs

    Get PDF
    Throughout history, spreading information has been an important task. With computer networks expanding, fast and reliable dissemination of messages became a problem of interest for computer scientists. Broadcasting is one category of information dissemination that transmits a message from a single originator to all members of the network. In the past five decades the problem has been studied by many researchers and all have come to demonstrate that despite its easy definition, the problem of broadcasting does not have trivial properties and symmetries. For general graphs, and even for some very restricted classes of graphs, the question of finding the broadcast time and scheme remains NP-hard. This work uses graph theoretical concepts to explore mathematical bounds on how fast information can be broadcast in a network. The connectivity of a graph is a measure to assess how separable the graph is, or in other words how many machines in a network will have to fail to disrupt communication between all machines in the network. We initiate the study of finding upper bounds on broadcast time b(G) in highly connected graphs. In particular, we give upper bounds on b(G) for k-connected graphs and graphs with a large minimum degree. We explore 2-connected (biconnected) graphs and broadcasting in them. Using Whitney's open ear decomposition in an inductive proof we propose broadcast schemes that achieve an upper bound of ceil(n/2) for classical broadcasting as well as similar bounds for multiple originators. Exploring further, we use a matching-based approach to prove an upper bound of ceil(log(k)) + ceil(n/k) - 1 for all k-connected graphs. For many infinite families of graphs, these bounds are tight. Discussion of broadcasting in highly connected graphs leads to an exploration of dependence between the minimum degree in the graph and the broadcast time of the latter. By using similar techniques and arguments we show that if all vertices of the graph are neighboring linear numbers of vertices, then information dissemination in the graph can be achieved in ceil(log(n)) + C time. To the best of our knowledge, the bounds presented in our work are a novelty. Methods and questions proposed in this thesis open new pathways for research in broadcasting

    Designs and Analyses in Structured Peer-To-Peer Systems

    Get PDF
    Peer-to-Peer (P2P) computing is a recent hot topic in the areas of networking and distributed systems. Work on P2P computing was triggered by a number of ad-hoc systems that made the concept popular. Later, academic research efforts started to investigate P2P computing issues based on scientific principles. Some of that research produced a number of structured P2P systems that were collectively referred to by the term "Distributed Hash Tables" (DHTs). However, the research occurred in a diversified way leading to the appearance of similar concepts yet lacking a common perspective and not heavily analyzed. In this thesis we present a number of papers representing our research results in the area of structured P2P systems grouped as two sets labeled respectively "Designs" and "Analyses". The contribution of the first set of papers is as follows. First, we present the princi- ple of distributed k-ary search and argue that it serves as a framework for most of the recent P2P systems known as DHTs. That is, given this framework, understanding existing DHT systems is done simply by seeing how they are instances of that frame- work. We argue that by perceiving systems as instances of that framework, one can optimize some of them. We illustrate that by applying the framework to the Chord system, one of the most established DHT systems. Second, we show how the frame- work helps in the design of P2P algorithms by two examples: (a) The DKS(n; k; f) system which is a system designed from the beginning on the principles of distributed k-ary search. (b) Two broadcast algorithms that take advantage of the distributed k-ary search tree. The contribution of the second set of papers is as follows. We account for two approaches that we used to evaluate the performance of a particular class of DHTs, namely the one adopting periodic stabilization for topology maintenance. The first approach was of an intrinsic empirical nature. In this approach, we tried to perceive a DHT as a physical system and account for its properties in a size-independent manner. The second approach was of a more analytical nature. In this approach, we applied the technique of Master Equations, which is a widely used technique in the analysis of natural systems. The application of the technique lead to a highly accurate description of the behavior of structured overlays. Additionally, the thesis contains a primer on structured P2P systems that tries to capture the main ideas prevailing in the field

    Interconnection networks for parallel and distributed computing

    Get PDF
    Parallel computers are generally either shared-memory machines or distributed- memory machines. There are currently technological limitations on shared-memory architectures and so parallel computers utilizing a large number of processors tend tube distributed-memory machines. We are concerned solely with distributed-memory multiprocessors. In such machines, the dominant factor inhibiting faster global computations is inter-processor communication. Communication is dependent upon the topology of the interconnection network, the routing mechanism, the flow control policy, and the method of switching. We are concerned with issues relating to the topology of the interconnection network. The choice of how we connect processors in a distributed-memory multiprocessor is a fundamental design decision. There are numerous, often conflicting, considerations to bear in mind. However, there does not exist an interconnection network that is optimal on all counts and trade-offs have to be made. A multitude of interconnection networks have been proposed with each of these networks having some good (topological) properties and some not so good. Existing noteworthy networks include trees, fat-trees, meshes, cube-connected cycles, butterflies, Möbius cubes, hypercubes, augmented cubes, k-ary n-cubes, twisted cubes, n-star graphs, (n, k)-star graphs, alternating group graphs, de Bruijn networks, and bubble-sort graphs, to name but a few. We will mainly focus on k-ary n-cubes and (n, k)-star graphs in this thesis. Meanwhile, we propose a new interconnection network called augmented k-ary n- cubes. The following results are given in the thesis.1. Let k ≥ 4 be even and let n ≥ 2. Consider a faulty k-ary n-cube Q(^k_n) in which the number of node faults f(_n) and the number of link faults f(_e) are such that f(_n) + f(_e) ≤ 2n - 2. We prove that given any two healthy nodes s and e of Q(^k_n), there is a path from s to e of length at least k(^n) - 2f(_n) - 1 (resp. k(^n) - 2f(_n) - 2) if the nodes s and e have different (resp. the same) parities (the parity of a node Q(^k_n) in is the sum modulo 2 of the elements in the n-tuple over 0, 1, ∙∙∙ , k - 1 representing the node). Our result is optimal in the sense that there are pairs of nodes and fault configurations for which these bounds cannot be improved, and it answers questions recently posed by Yang, Tan and Hsu, and by Fu. Furthermore, we extend known results, obtained by Kim and Park, for the case when n = 2.2. We give precise solutions to problems posed by Wang, An, Pan, Wang and Qu and by Hsieh, Lin and Huang. In particular, we show that Q(^k_n) is bi-panconnected and edge-bipancyclic, when k ≥ 3 and n ≥ 2, and we also show that when k is odd, Q(^k_n) is m-panconnected, for m = (^n(k - 1) + 2k - 6’ / ‘_2), and (k -1) pancyclic (these bounds are optimal). We introduce a path-shortening technique, called progressive shortening, and strengthen existing results, showing that when paths are formed using progressive shortening then these paths can be efficiently constructed and used to solve a problem relating to the distributed simulation of linear arrays and cycles in a parallel machine whose interconnection network is Q(^k_n) even in the presence of a faulty processor.3. We define an interconnection network AQ(^k_n) which we call the augmented k-ary n-cube by extending a k-ary n-cube in a manner analogous to the existing extension of an n-dimensional hypercube to an n-dimensional augmented cube. We prove that the augmented k-ary n-cube Q(^k_n) has a number of attractive properties (in the context of parallel computing). For example, we show that the augmented k-ary n-cube Q(^k_n) - is a Cayley graph (and so is vertex-symmetric); has connectivity 4n - 2, and is such that we can build a set of 4n - 2 mutually disjoint paths joining any two distinct vertices so that the path of maximal length has length at most max{{n- l)k- (n-2), k + 7}; has diameter [(^k) / (_3)] + [(^k - 1) /( _3)], when n = 2; and has diameter at most (^k) / (_4) (n+ 1), for n ≥ 3 and k even, and at most [(^k)/ (_4) (n + 1) + (^n) / (_4), for n ^, for n ≥ 3 and k odd.4. We present an algorithm which given a source node and a set of n - 1 target nodes in the (n, k)-star graph S(_n,k) where all nodes are distinct, builds a collection of n - 1 node-disjoint paths, one from each target node to the source. The collection of paths output from the algorithm is such that each path has length at most 6k - 7, and the algorithm has time complexity O(k(^3)n(^4))

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Get PDF
    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.https://digitalcommons.chapman.edu/scs_books/1014/thumbnail.jp

    Safety and Reliability - Safe Societies in a Changing World

    Get PDF
    The contributions cover a wide range of methodologies and application areas for safety and reliability that contribute to safe societies in a changing world. These methodologies and applications include: - foundations of risk and reliability assessment and management - mathematical methods in reliability and safety - risk assessment - risk management - system reliability - uncertainty analysis - digitalization and big data - prognostics and system health management - occupational safety - accident and incident modeling - maintenance modeling and applications - simulation for safety and reliability analysis - dynamic risk and barrier management - organizational factors and safety culture - human factors and human reliability - resilience engineering - structural reliability - natural hazards - security - economic analysis in risk managemen
    corecore