169 research outputs found

    Aspects of practical implementations of PRAM algorithms

    The PRAM is a shared-memory model of parallel computation that abstracts away from inessential engineering details. It provides a very simple, architecture-independent model and a good programming environment. Theoretical computer scientists have shown that the PRAM model can be emulated using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large-scale general-purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme can achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we present the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimensional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation: balanced tree computations, Fast Fourier Transforms and matrix multiplications.

    Broadcasting in Hyper-cylinder graphs

    Broadcasting in computer networking means the dissemination of information, which is known initially only at some nodes, to all network members. The goal is to inform every node in the minimum possible time. There are a few models for broadcasting; the simplest and oldest is called the Classical model. In the Classical model, dissemination happens in synchronous rounds, in which each informed node may inform at most one of its neighbors per round. The broadcast question is: what is the minimum number of rounds needed for broadcasting, and what broadcast scheme achieves it? For general graphs, these questions are NP-hard, and the broadcast time is known to be NP-hard to approximate within a factor of 3 - ε for any real ε > 0. Even for some very restricted classes of graphs, the questions remain NP-hard. Little is known about broadcasting in restricted graphs, and only a few classes have a polynomial solution. Parallel and distributed computing is one of the important domains that rely on efficient broadcasting. The hypercube and the torus are the most used network topologies in this domain. Their widespread use is due not only to their simplicity but also to their efficiency and high robustness (e.g., fault tolerance) while having an acceptable number of links. In this thesis, it is observed that the Cartesian product of a number of path and cycle graphs produces a valuable set of topologies, which we call hyper-cylinders, that contains the hypercube and the torus as well. Any hyper-cylinder shares many of the beneficial features of the hypercube and the torus and might be a suitable substitute in some cases. Some hyper-cylinders are also similar to other practically used topologies such as cube-connected cycles. In this thesis, the effect of the Cartesian product on broadcasting, and the broadcasting of hyper-cylinders under the Classical and Messy models, is studied. This adds a valuable class of graphs to the limited classes of graphs that have a polynomially computable broadcast time. Finally, the relation between worst-case originators and diameters in trees is studied, which may help in the broadcast study of a larger class of graphs where any tree is allowed instead of a path in the Cartesian product.
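    As an illustration of the Classical model described above, the following minimal sketch simulates broadcasting on a d-dimensional hypercube, where in round i every informed node calls its neighbor across dimension i. The function name `hypercube_broadcast_rounds` and the integer node encoding are assumptions made for this example; this is the standard dimension-order scheme, not an algorithm taken from the thesis.

```python
def hypercube_broadcast_rounds(d, originator=0):
    """Simulate classical broadcasting on the d-dimensional hypercube.

    Nodes are the integers 0 .. 2**d - 1; two nodes are adjacent when their
    binary labels differ in exactly one bit. In round i every informed node
    calls the single neighbor obtained by flipping bit i.
    Returns the set of informed nodes after each round.
    """
    informed = {originator}
    history = []
    for i in range(d):
        # Each informed node makes exactly one call in this round.
        informed = informed | {v ^ (1 << i) for v in informed}
        history.append(set(informed))
    return history


rounds = hypercube_broadcast_rounds(4)
sizes = [len(s) for s in rounds]  # number of informed nodes after each round
```

    The informed set doubles every round, so all 2^d nodes are reached in d rounds, which matches the general lower bound of ceil(log2(n)) rounds for any graph on n nodes.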

    Broadcasting in highly connected graphs

    Throughout history, spreading information has been an important task. With computer networks expanding, fast and reliable dissemination of messages became a problem of interest for computer scientists. Broadcasting is one category of information dissemination that transmits a message from a single originator to all members of the network. In the past five decades the problem has been studied by many researchers, who have shown that, despite its easy definition, the problem of broadcasting does not have trivial properties and symmetries. For general graphs, and even for some very restricted classes of graphs, the question of finding the broadcast time and scheme remains NP-hard. This work uses graph-theoretical concepts to explore mathematical bounds on how fast information can be broadcast in a network. The connectivity of a graph is a measure of how separable the graph is, or in other words how many machines in a network have to fail to disrupt communication between all machines in the network. We initiate the study of finding upper bounds on the broadcast time b(G) in highly connected graphs. In particular, we give upper bounds on b(G) for k-connected graphs and graphs with a large minimum degree. We explore 2-connected (biconnected) graphs and broadcasting in them. Using Whitney's open ear decomposition in an inductive proof, we propose broadcast schemes that achieve an upper bound of ceil(n/2) for classical broadcasting as well as similar bounds for multiple originators. Exploring further, we use a matching-based approach to prove an upper bound of ceil(log(k)) + ceil(n/k) - 1 for all k-connected graphs. For many infinite families of graphs, these bounds are tight. Discussion of broadcasting in highly connected graphs leads to an exploration of the dependence between the minimum degree of a graph and its broadcast time. By using similar techniques and arguments we show that if every vertex of the graph is adjacent to a linear number of vertices, then information dissemination in the graph can be achieved in ceil(log(n)) + C time. To the best of our knowledge, the bounds presented in our work are new. The methods and questions proposed in this thesis open new pathways for research in broadcasting.
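    The bounds quoted above can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the logarithm is assumed to be base 2, which the abstract does not state. It also simulates a greedy classical broadcast on the cycle C_n, a 2-connected graph on which the ceil(n/2) bound is attained.

```python
def biconnected_bound(n):
    """Upper bound ceil(n/2) quoted for 2-connected graphs on n vertices."""
    return -(-n // 2)


def k_connected_bound(n, k):
    """Upper bound ceil(log(k)) + ceil(n/k) - 1, assuming log is base 2."""
    ceil_log2_k = (k - 1).bit_length()  # exact ceil(log2(k)) for k >= 1
    return ceil_log2_k + -(-n // k) - 1


def cycle_broadcast_time(n):
    """Greedy classical broadcast on the cycle C_n starting from vertex 0."""
    informed = {0}
    rounds = 0
    while len(informed) < n:
        newly = set()
        for v in sorted(informed):
            for u in ((v + 1) % n, (v - 1) % n):
                if u not in informed and u not in newly:
                    newly.add(u)
                    break  # a node makes at most one call per round
        informed |= newly
        rounds += 1
    return rounds


# On cycles the simulated broadcast time coincides with the ceil(n/2) bound.
tight_on_cycles = all(
    cycle_broadcast_time(n) == biconnected_bound(n) for n in range(3, 30)
)
```

    With k = 2 the k-connected bound reduces to ceil(n/2), so the two formulas agree on biconnected graphs under the base-2 assumption.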

    Three Hopf algebras from number theory, physics & topology, and their common background I: operadic & simplicial aspects

    We consider three a priori totally different setups for Hopf algebras from number theory, mathematical physics and algebraic topology. These are the Hopf algebra of Goncharov for multiple zeta values, that of Connes-Kreimer for renormalization, and a Hopf algebra constructed by Baues to study double loop spaces. We show that these examples can be successively unified by considering simplicial objects, co-operads with multiplication and Feynman categories at the ultimate level. These considerations open the door to new constructions and reinterpretations of known constructions in a large common framework, which is presented step-by-step with examples throughout. In this first part of two papers, we concentrate on the simplicial and operadic aspects. Peer reviewed. Postprint (author's final draft).

    Three Hopf algebras from number theory, physics & topology, and their common background I: operadic & simplicial aspects

    We consider three a priori totally different setups for Hopf algebras from number theory, mathematical physics and algebraic topology. These are the Hopf algebra of Goncharov for multiple zeta values, that of Connes-Kreimer for renormalization, and a Hopf algebra constructed by Baues to study double loop spaces. We show that these examples can be successively unified by considering simplicial objects, co-operads with multiplication and Feynman categories at the ultimate level. These considerations open the door to new constructions and reinterpretations of known constructions in a large common framework, which is presented step-by-step with examples throughout. In this first part of two papers, we concentrate on the simplicial and operadic aspects. Comment: This replacement is part I of the final version of the paper, which has been split into two parts. The second part is available from the arXiv under the title "Three Hopf algebras from number theory, physics & topology, and their common background II: general categorical formulation" arXiv:2001.0872

    Free Probability Theory

    The workshop brought together leading experts, as well as promising young researchers, in areas related to recent developments in free probability theory. Particular emphasis was placed on the relation of free probability with random matrix theory.

    On the implementation of P-RAM algorithms on feasible SIMD computers

    The P-RAM model of computation has proved to be a very useful theoretical model for exploiting and extracting inherent parallelism in problems and thus for designing parallel algorithms. It is therefore very important to examine whether results obtained for such a model can be translated onto machines considered to be more realistic in the face of current technological constraints. In this thesis, we show how many techniques and algorithms designed for the P-RAM can be implemented on the feasible SIMD class of computers. The first investigation concerns classes of problems solvable on the P-RAM model using the recursive techniques of compression, tree contraction and 'divide and conquer'. For such problems, specific methods are emphasised to achieve efficient implementations on some SIMD architectures. Problems such as list ranking and polynomial and expression evaluation are shown to have efficient solutions on the 2-dimensional mesh-connected computer. The balanced binary tree technique is widely employed to solve many problems in the P-RAM model. By proposing an implicit embedding of the binary tree of size n on a (√n × √n) mesh-connected computer (in contrast to the usual H-tree approach, which requires a mesh of size ≈ (2√n × 2√n)), we show that many of the problems solvable using this technique can be efficiently implemented on this architecture. Two efficient O(√n) algorithms for solving the bracket matching problem are presented. Consequently, the problems of expression evaluation (where the expression is given in an array form), evaluating algebraic expressions with a carrier of constant bounded size and parsing expressions of both bracket and input-driven languages are all shown to have efficient solutions on the 2-dimensional mesh-connected computer. Dealing with non-tree-structured computations, we show that the Eulerian tour problem for a given graph with m edges and maximum vertex degree d can be solved in O(d√n) parallel time on the 2-dimensional mesh-connected computer. A way to increase the processor utilisation on the 2-dimensional mesh-connected computer is also presented. The method suggested consists of pipelining sets of iteratively solvable problems, each of which at each step of its execution uses only a fraction of the available PEs.
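    List ranking, mentioned above, is commonly solved on the P-RAM by pointer jumping. The sketch below simulates that standard technique sequentially, one synchronous round per loop iteration; it is not the mesh-connected implementation developed in the thesis, and the name `list_rank` and the successor-array encoding are assumptions made for this example.

```python
def list_rank(succ):
    """Rank the nodes of a linked list by pointer jumping.

    succ[i] is the successor of node i; the tail points to itself.
    Returns (rank, rounds), where rank[i] is the distance from i to the tail
    and rounds is the number of synchronous pointer-jumping steps used.
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    rounds = 0
    # Stop once every pointer has reached the tail (a fixed point).
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # Both updates read the old arrays, mimicking one synchronous
        # P-RAM step in which all processors act simultaneously.
        rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        nxt = [nxt[nxt[i]] for i in range(n)]
        rounds += 1
    return rank, rounds


# The list 2 -> 1 -> 0 -> 3, stored as a successor array (3 is the tail).
ranks, steps = list_rank([3, 0, 1, 3])
```

    Each round doubles the distance covered by every pointer, so a list of n nodes is ranked in O(log n) rounds with one processor per node.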

    Problems Related to Classical and Universal List Broadcasting

    Broadcasting is a fundamental problem in the information dissemination area. In classical broadcasting, a message must be sent from one network member to all other members as rapidly as feasible. Although it has been demonstrated that this problem is NP-hard for arbitrary graphs, it has several applications in various fields. As a result, the universal lists model, replicating real-world restrictions like the memory limits of nodes in large networks, is introduced as a branch of this problem in the literature. In the universal lists model, each node is equipped with a fixed list and has to follow the list regardless of the originator. In this study, we focus on both classical and universal lists broadcasting. Classical broadcasting is solvable for a few families of networks, such as trees, unicyclic graphs, trees of cycles, and trees of cliques. In this study, we begin by presenting an optimal algorithm that finds the broadcast time of any vertex in a Fully Connected Tree (FCT_n) in O(|V| log log n) time. An FCT_n is formed by attaching arbitrary trees to the vertices of a complete graph of size n, where |V| is the total number of vertices in the graph. Then, we replace the complete graph with a hypercube H_k and propose a new heuristic for the Hypercube of Trees (HT_k). Not only does this heuristic have the same approximation ratio as the best-known algorithm, but our numerical results also show its superiority in most experiments. Our heuristic is able to outperform the current upper bound in up to 90% of the situations, resulting in an average speedup of 30%. Most importantly, our results illustrate that it can maintain its performance even as the network size grows, making the proposed heuristic practically useful. Afterward, we focus on broadcasting with universal lists, in which once a vertex is informed, it must follow its corresponding list, regardless of the originator and the neighbor from which it received the message. The problem of broadcasting with universal lists can be categorized into two sub-models: non-adaptive and adaptive. In the latter model, a sender skips the vertices on its list from which it has received the message, while those vertices are not skipped in the former model. In this study, we present another sub-model called fully adaptive. Not only does this model benefit from a significantly better space complexity compared to the classical model, but, as will be proved, it is faster than the two other sub-models. Since the suggested model fits real-world network architectures, we design optimal broadcast algorithms for well-known interconnection networks such as trees, grids, and cube-connected cycles. We also present an upper bound for tori under the same model. Then we focus on designing broadcast graphs (bg's) under this model. A bg is a graph with the minimum possible broadcast time from any originator. Additionally, a minimum broadcast graph (mbg) is a bg with the minimum possible number of edges. We propose mbg's on n vertices for n ≤ 10 and sparse bg's for 11 ≤ n ≤ 14 under the fully adaptive model. Afterward, we introduce the first infinite families of bg's under this model, and we prove that hypercubes are mbg's under this model. Later, we establish the optimal broadcast time of k-ary trees and binomial trees under the non-adaptive model and provide an upper bound for complete bipartite graphs. We also improve a general upper bound for trees under the same model. We then suggest several general upper bounds for the universal lists model by comparing it with the messy broadcasting model. Finally, we propose the first heuristic for this problem, namely HUB-GA: a Heuristic for Universal lists Broadcasting with Genetic Algorithm. We undertake various numerical experiments on interconnection networks frequently used in the literature, graphs with clique-like structures, and synthetic instances in order to cover many possibilities of industrial topologies. We also compare our results with state-of-the-art methods for classical broadcasting, which is proved to be the fastest model among all. Although the universal lists model utilizes less memory than the classical model, our algorithm finds the same broadcast time as the classical model in diverse situations.
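    Trees are among the families for which classical broadcasting is solvable, as noted above. A minimal sketch of the well-known greedy rule for trees follows: an informed vertex calls its children in non-increasing order of the broadcast times of their subtrees. The function name and the adjacency-dictionary input are assumptions made for this example, and the code does not reproduce the FCT_n algorithm or the heuristics from the thesis.

```python
def tree_broadcast_time(adj, originator):
    """Classical broadcast time of a tree from a given originator.

    adj maps each vertex to the list of its neighbors; the graph must be a
    tree. A vertex serves its children in non-increasing order of their
    subtree broadcast times, so the i-th child called (i = 1, 2, ...) with
    subtree time t finishes at round i + t.
    """
    def subtree_time(v, parent):
        child_times = sorted(
            (subtree_time(c, v) for c in adj[v] if c != parent),
            reverse=True,
        )
        return max(
            (i + t for i, t in enumerate(child_times, start=1)),
            default=0,  # a leaf has nothing left to inform
        )

    return subtree_time(originator, None)


# Path on five vertices: 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
from_middle = tree_broadcast_time(path, 2)
from_end = tree_broadcast_time(path, 0)
```

    The recursion is a sketch suited to small trees; very deep trees would need an iterative traversal to avoid Python's recursion limit.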

    Fault-tolerance embedding of rings and arrays in star and pancake graphs

    The star and pancake graphs are useful interconnection networks for connecting processors in a parallel and distributed computing environment. The star network has been widely studied and is shown to possess attractive features like sublogarithmic diameter, node and edge symmetry and high resilience. The star/pancake interconnection graphs $S_n/P_n$ of dimension n have n! nodes connected by $\frac{(n-1)\cdot n!}{2}$ edges. Due to their large number of nodes and interconnections, they are prone to failure of one or more nodes/edges. In this thesis, we present methods to embed Hamiltonian paths (H-path) and Hamiltonian cycles (H-cycle) in a star graph $S_n$ and pancake graph $P_n$ in a faulty environment. Such embeddings are important for solving computational problems, formulated for array and ring topologies, on star and pancake graphs. The models considered include single-processor failure, double-processor failure, and multiple-processor failures. All the models are applied to an H-cycle which is formed by visiting all the $\frac{n!}{4!}$ $S_4/P_4$s in an $S_n/P_n$ in a particular order. Each $S_4/P_4$ has an entry node where the cycle/path enters that particular $S_4/P_4$ and an exit node where the path leaves it. Distributed algorithms for embedding a Hamiltonian cycle in the presence of multiple faults are also presented for both $S_n$ and $P_n$.
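    The node and edge counts stated above can be checked by constructing small instances. The sketch below uses the standard definitions (in the star graph a permutation is adjacent to those obtained by swapping its first symbol with another; in the pancake graph, to those obtained by reversing a prefix of length at least 2). The function names are hypothetical and the code is not taken from the thesis.

```python
from itertools import permutations
from math import factorial


def star_graph(n):
    """Star graph S_n: swap the first symbol with the symbol in position i."""
    nodes = list(permutations(range(1, n + 1)))
    edges = set()
    for p in nodes:
        for i in range(1, n):
            q = list(p)
            q[0], q[i] = q[i], q[0]
            edges.add(frozenset((p, tuple(q))))  # undirected edge
    return nodes, edges


def pancake_graph(n):
    """Pancake graph P_n: reverse a prefix of length 2 .. n."""
    nodes = list(permutations(range(1, n + 1)))
    edges = set()
    for p in nodes:
        for length in range(2, n + 1):
            q = p[:length][::-1] + p[length:]
            edges.add(frozenset((p, q)))  # undirected edge
    return nodes, edges


n = 4
expected_nodes = factorial(n)                  # n! nodes
expected_edges = (n - 1) * factorial(n) // 2   # (n-1) * n! / 2 edges
s_nodes, s_edges = star_graph(n)
p_nodes, p_edges = pancake_graph(n)
```

    For n = 4 both graphs have 24 nodes and 36 edges, in line with the formulas in the abstract.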