481 research outputs found

    Aspects of practical implementations of PRAM algorithms

    The PRAM is a shared memory model of parallel computation which abstracts away from inessential engineering details. It provides a very simple architecture-independent model and a good programming environment. Theoreticians in the computer science community have proved that it is possible to emulate the theoretical PRAM model using current technology. Solutions have been found for effectively interconnecting processing elements, for routing data on these networks and for distributing the data among memory modules without hotspots. This thesis reviews this emulation and the possibilities it provides for large scale general purpose parallel computation. The emulation employs a bridging model which acts as an interface between the actual hardware and the PRAM model. We review the evidence that such a scheme can achieve scalable parallel performance and portable parallel software and that PRAM algorithms can be optimally implemented on such practical models. In the course of this review we present the following new results: 1. Concerning parallel approximation algorithms, we describe an NC algorithm for finding an approximation to a minimum weight perfect matching in a complete weighted graph. The algorithm is conceptually very simple and it is also the first NC-approximation algorithm for the task with a sub-linear performance ratio. 2. Concerning graph embedding, we describe dense edge-disjoint embeddings of the complete binary tree with n leaves in the following n-node communication networks: the hypercube, the de Bruijn and shuffle-exchange networks and the 2-dimensional mesh. In the embeddings the maximum distance from a leaf to the root of the tree is asymptotically optimally short. The embeddings facilitate efficient implementation of many PRAM algorithms on networks employing these graphs as interconnection networks. 3. Concerning bulk synchronous algorithmics, we describe scalable transportable algorithms for the following three commonly required types of computation: balanced tree computations, Fast Fourier Transforms and matrix multiplications.

    Random induced subgraphs of Cayley graphs induced by transpositions

    In this paper we study random induced subgraphs of Cayley graphs of the symmetric group induced by an arbitrary minimal generating set of transpositions. A random induced subgraph of this Cayley graph is obtained by selecting permutations with independent probability $\lambda_n$. Our main result is that for any minimal generating set of transpositions, for probabilities $\lambda_n=\frac{1+\epsilon_n}{n-1}$ where $n^{-1/3+\delta}\le \epsilon_n<1$ and $\delta>0$, a random induced subgraph has a.s. a unique largest component of size $\wp(\epsilon_n)\frac{1+\epsilon_n}{n-1}n!$, where $\wp(\epsilon_n)$ is the survival probability of a specific branching process. Comment: 18 pages, 1 figure

    Lower bounds for dilation, wirelength, and edge congestion of embedding graphs into hypercubes

    Interconnection networks provide an effective mechanism for exchanging data between processors in a parallel computing system. One of the most efficient interconnection networks is the hypercube due to its structural regularity, potential for parallel computation of various algorithms, and high degree of fault tolerance. Thus it becomes the first choice of topological structure for parallel processing and computing systems. In this paper, lower bounds for the dilation, wirelength, and edge congestion of an embedding of a graph into a hypercube are proved. Two of these bounds are expressed in terms of the bisection width. Applying these results, the dilation and wirelength of embeddings of certain complete multipartite graphs, folded hypercubes, wheels, and specific Cartesian products are computed.
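
    To make the quantities in this abstract concrete, the following is a minimal sketch of how the dilation and wirelength of one given embedding into a hypercube can be evaluated. It is not taken from the paper: the function names and the example placements are illustrative assumptions, hypercube nodes are assumed to be labeled by integers read as bit strings, and every guest edge is assumed to be routed along a shortest path, so its length is the Hamming distance between its endpoints. The paper's lower bounds concern the minimum of these values over all embeddings, which this sketch does not compute.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hypercube labels differ."""
    return bin(a ^ b).count("1")


def dilation_and_wirelength(edges, placement):
    """Evaluate one embedding of a guest graph into a hypercube.

    edges:     iterable of (u, v) guest edges
    placement: dict mapping each guest vertex to a hypercube label (int)

    Returns (dilation, wirelength), assuming shortest-path routing:
    dilation is the longest routed edge, wirelength the sum of all lengths.
    """
    if len(set(placement.values())) != len(placement):
        raise ValueError("placement must map distinct vertices to distinct nodes")
    lengths = [hamming(placement[u], placement[v]) for u, v in edges]
    return max(lengths), sum(lengths)


# Hypothetical example: a 4-cycle placed in the 2-dimensional hypercube.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
gray_placement = {0: 0b00, 1: 0b01, 2: 0b11, 3: 0b10}      # Gray-code order
identity_placement = {0: 0b00, 1: 0b01, 2: 0b10, 3: 0b11}  # binary order

print(dilation_and_wirelength(cycle_edges, gray_placement))
print(dilation_and_wirelength(cycle_edges, identity_placement))
```

    Edge congestion is left out on purpose: it depends on which hypercube path each guest edge uses, not only on the placement of the vertices.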

    Fixed Linear Crossing Minimization by Reduction to the Maximum Cut Problem

    Many real-life scheduling, routing and locating problems can be formulated as combinatorial optimization problems whose goal is to find a linear layout of an input graph in such a way that the number of edge crossings is minimized. In this paper, we study a restricted version of the linear layout problem where the order of vertices on the line is fixed, the so-called fixed linear crossing number problem (FLCNP). We show that this NP-hard problem can be reduced to the well-known maximum cut problem. The latter problem has been intensively studied in the literature; practically efficient exact algorithms based on the branch-and-cut technique have been developed. By an experimental evaluation on a variety of graphs, we show that using this reduction for solving FLCNP compares favorably to earlier branch-and-bound algorithms.
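
    The idea behind such a reduction can be sketched on tiny instances. The sketch below is an illustration under stated assumptions, not the paper's algorithm: vertices are assumed to be identified with their fixed positions on the line, each edge is drawn as an arc either above or below the line, and two edges cross exactly when they lie on the same side and their endpoints interleave. Under that model the number of crossings equals the number of interleaving pairs minus the number of such pairs separated by the side assignment, so minimizing crossings amounts to a maximum cut in the "conflict graph" of interleaving pairs. The maximum cut is found here by brute force, which only works for very small graphs; all function names are hypothetical.

```python
from itertools import combinations, product


def conflicts(edges):
    """Index pairs of edges whose endpoints interleave along the line."""
    norm = [tuple(sorted(e)) for e in edges]
    pairs = []
    for i, j in combinations(range(len(norm)), 2):
        (a, b), (c, d) = norm[i], norm[j]
        if a < c < b < d or c < a < d < b:
            pairs.append((i, j))
    return pairs


def crossings(edges, side):
    """Crossings for a given side assignment (side[i] is 0 or 1 for edge i)."""
    return sum(1 for i, j in conflicts(edges) if side[i] == side[j])


def min_crossings_via_maxcut(edges):
    """Minimum crossings = (number of conflicts) - (max cut of conflict graph).

    The max cut is computed by exhaustive search over all side assignments,
    standing in for the exact max-cut solvers mentioned in the abstract.
    """
    pairs = conflicts(edges)
    best_cut = 0
    for side in product((0, 1), repeat=len(edges)):
        cut = sum(1 for i, j in pairs if side[i] != side[j])
        best_cut = max(best_cut, cut)
    return len(pairs) - best_cut


# Hypothetical examples: complete graphs with vertices at positions 0..n-1.
k4 = list(combinations(range(4), 2))
k5 = list(combinations(range(5), 2))
print(min_crossings_via_maxcut(k4))
print(min_crossings_via_maxcut(k5))
```

    Replacing the exhaustive loop with a dedicated max-cut solver is what would make this approach usable beyond toy instances.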

    Fault-Tolerant Ring Embeddings in Hypercubes -- A Reconfigurable Approach

    We investigate the problem of designing reconfigurable embedding schemes for a fixed hypercube (without redundant processors and links). The fundamental idea for these schemes is to embed a basic network on the hypercube without fully utilizing the nodes on the hypercube. The remaining nodes can be used as spares to reconfigure the embeddings in case of faults. The result of this research shows that by carefully embedding the application graphs, the topological properties of the embedding can be preserved under fault conditions, and reconfiguration can be carried out efficiently. In this dissertation, we choose the ring as the basic network of interest, and propose several schemes for the design of reconfigurable embeddings with the aim of minimizing reconfiguration cost and performance degradation. The cost is measured by the number of node-state changes or reconfiguration steps needed to carry out the reconfiguration, and the performance degradation is characterized as the dilation of the new embedding after reconfiguration. Our schemes surpass the existing ones in terms of applicability and the reconfiguration cost of the resulting embeddings.
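
    As a point of reference for the setting described here, the sketch below shows the textbook way to embed a ring in a hypercube with dilation 1 using the reflected Gray code, and a simple way to leave half of the nodes unused as spares. This is only an assumed baseline for illustration: it is not one of the dissertation's schemes, it performs no reconfiguration, and the function names are hypothetical.

```python
def gray_ring(dim: int) -> list[int]:
    """Ring through all 2**dim hypercube nodes in reflected Gray-code order."""
    return [i ^ (i >> 1) for i in range(1 << dim)]


def is_ring_in_hypercube(nodes: list[int]) -> bool:
    """True if consecutive nodes (cyclically) differ in exactly one bit
    and no node is repeated, i.e. the ring is embedded with dilation 1."""
    n = len(nodes)
    distinct = len(set(nodes)) == n
    adjacent = all(
        bin(nodes[k] ^ nodes[(k + 1) % n]).count("1") == 1 for k in range(n)
    )
    return distinct and adjacent


def ring_with_spares(dim: int) -> tuple[list[int], list[int]]:
    """Place the ring on the subcube whose top bit is 0 and keep the
    other half of the dim-dimensional hypercube as spare nodes."""
    ring = gray_ring(dim - 1)
    top_bit = 1 << (dim - 1)
    spares = [top_bit | node for node in range(1 << (dim - 1))]
    return ring, spares


ring, spares = ring_with_spares(4)
print(ring)
print(is_ring_in_hypercube(ring), len(spares))
```

    In this layout every ring node has exactly one spare as a hypercube neighbor (the node with the top bit flipped), which is one simple reason a partially used hypercube lends itself to reconfiguration.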

    Efficient Interconnection Schemes for VLSI and Parallel Computation

    This thesis is primarily concerned with two problems of interconnecting components in VLSI technologies. In the first case, the goal is to construct efficient interconnection networks for general-purpose parallel computers. The second problem is a more specialized problem in the design of VLSI chips, namely multilayer channel routing. In addition, a final part of this thesis provides lower bounds on the area required for VLSI implementations of finite-state machines. This thesis shows that networks based on Leiserson's fat-tree architecture are nearly as good as any network built in a comparable amount of physical space. It shows that these universal networks can efficiently simulate competing networks by means of an appropriate correspondence between network components and efficient algorithms for routing messages on the universal network. In particular, a universal network of area A can simulate competing networks with O(lg^3 A) slowdown (in bit-times), using a very simple randomized routing algorithm and simple network components. Alternatively, a packet routing scheme of Leighton, Maggs, and Rao can be used in conjunction with more sophisticated switching components to achieve O(lg^2 A) slowdown. Several other important aspects of universality are also discussed. It is shown that universal networks can be constructed in area linear in the number of processors, so that there is no need to restrict the density of processors in competing networks. Also results are presented for comparisons between networks of different size or with processors of different sizes (as determined by the amount of attached memory). Of particular interest is the fact that a universal network built from sufficiently small processors can simulate (with the slowdown already quoted) any competing network of comparable size regardless of the size of processors in the competing network. In addition, many of the results given do not require the usual assumption of unit wire delay. 
Finally, though most of the discussion is in the two-dimensional world, the results are shown to apply in three dimensions by way of a simple demonstration of general results on graph layout in three dimensions. The second main problem considered in this thesis is channel routing when many layers of interconnect are available, a scenario that is becoming more and more meaningful as chip fabrication technologies advance. This thesis describes a system MulCh for multilayer channel routing which extends the Chameleon system developed at U. C. Berkeley. Like Chameleon, MulCh divides a multilayer problem into essentially independent subproblems of at most three layers, but unlike Chameleon, MulCh considers the possibility of using partitions comprised of a single layer instead of only partitions of two or three layers. Experimental results show that MulCh often performs better than Chameleon in terms of channel width, total net length, and number of vias. In addition to a description of MulCh as implemented, this thesis provides improved algorithms for subtasks performed by MulCh, thereby indicating potential improvements in the speed and performance of multilayer channel routing. In particular, a linear time algorithm is given for determining the minimum width required for a single-layer channel routing problem, and an algorithm is given for maintaining the density of a collection of nets in logarithmic time per net insertion. The last part of this thesis shows that straightforward techniques for implementing finite-state machines are optimal in the worst case. Specifically, for any s and k, there is a deterministic finite-state machine with s states and k symbols such that any layout algorithm requires Ω(ks lg s) area to lay out its realization. For nondeterministic machines, there is an analogous lower bound of Ω(ks^2) area.

    First-order limits, an analytical perspective

    In this paper we present a novel approach to graph (and structural) limits based on model theory and analysis. The role of Stone and Gelfand dualities is displayed prominently and leads to a general theory, which we believe is naturally emerging. This approach covers all the particular examples of structural convergence and puts the whole in a new context. As an application, it leads to new intermediate examples of structural convergence and to a "grand conjecture" dealing with sparse graphs. We survey the recent developments.

    Efficient structural outlooks for vertex product networks

    In this thesis, a new classification for a large set of interconnection networks, referred to as "Vertex Product Networks" (VPN), is provided and a number of related issues are discussed including the design and evaluation of efficient structural outlooks for algorithm development on this class of networks. The importance of studying the VPN can be attributed to the following two main reasons: first, an unlimited number of new networks can be defined under the umbrella of the VPN, and second, some known networks can be studied and analysed more deeply. Examples of the VPN include the newly proposed arrangement-star and the existing Optical Transpose Interconnection Systems (OTIS-networks). Over the past two decades many interconnection networks have been proposed in the literature, including the star, hyperstar, hypercube, arrangement, and OTIS-networks. Most existing research on these networks has focused on analysing their topological properties. Consequently, there has been relatively little work devoted to designing efficient parallel algorithms for important parallel applications. In an attempt to fill this gap, this research aims to propose efficient structural outlooks for algorithm development. These structural outlooks are based on grid and pipeline views as popular structures that support a vast body of applications that are encountered in many areas of science and engineering, including matrix computation, divide-and-conquer type of algorithms, sorting, and Fourier transforms. The proposed structural outlooks are applied to the VPN, notably the arrangement-star and OTIS-networks. In this research, we argue that the proposed arrangement-star is a viable candidate as an underlying topology for future high-speed parallel computers. 
Not only does the arrangement-star bring a solution to the scalability limitations from which the existing star graph suffers, but it also enables the development of parallel algorithms based on the proposed structural outlooks, such as matrix computation, linear algebra, divide-and-conquer algorithms, sorting, and Fourier transforms. Results from a performance study conducted in this thesis reveal that the proposed arrangement-star efficiently supports applications based on the grid or pipeline structural outlooks. OTIS-networks are another example of the VPN. This type of network has the important advantage of combining both optical and electronic interconnect technology. A number of studies have recently explored the topological properties of OTIS-networks. Although there has been some work on designing parallel algorithms for image processing and sorting, hardly any work has considered the suitability of these networks for an important class of scientific problems such as matrix computation, sorting, and Fourier transforms. In this study, we present and evaluate two structural outlooks for algorithm development on OTIS-networks. The proposed structural outlooks are general in the sense that no specific factor network or problem domain is assumed. Timing models for measuring the performance of the proposed structural outlooks are provided. Through these models, the performance of various algorithms on OTIS-networks is evaluated and compared with that of their counterparts on conventional electronic interconnection systems. The obtained results reveal that OTIS-networks are an attractive candidate for future parallel computers due to their superior performance characteristics over networks using traditional electronic interconnects.

    A QUBO formulation for the Tree Containment problem

    Phylogenetic (evolutionary) trees and networks are leaf-labeled graphs that are widely used to represent the evolutionary relationships between entities such as species, languages, cancer cells, and viruses. To reconstruct and analyze phylogenetic networks, the problem of deciding whether or not a given rooted phylogenetic network embeds a given rooted phylogenetic tree is of recurring interest. This problem, formally known as Tree Containment, is NP-complete in general and polynomial-time solvable for certain classes of phylogenetic networks. In this paper, we connect ideas from quantum computing and phylogenetics to present an efficient Quadratic Unconstrained Binary Optimization formulation for Tree Containment in the general setting. For an instance (N,T) of Tree Containment, where N is a phylogenetic network with n_N vertices and T is a phylogenetic tree with n_T vertices, the number of logical qubits that are required for our formulation is O(n_N n_T). Comment: final version accepted for publication in Theoretical Computer Science