9 research outputs found

    Efficient weighted multiselection in parallel architectures

    ©2002 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
    We study parallel solutions to the problem of weighted multiselection: selecting r elements at given weighted ranks from a set S of n weighted elements, where an element is at weighted rank k if it is the smallest element such that the aggregated weight of all elements not greater than it in S is not smaller than k. We propose efficient algorithms on two of the most popular parallel architectures, hypercube and mesh. For a hypercube with $p < n$ processors, we present a parallel algorithm running in $O(n^{\varepsilon} \min\{r, \log p\})$ time for $p = n^{1-\varepsilon}$, $0 < \varepsilon < 1$, which is cost optimal when $r \geq p$. Our algorithm on a $\sqrt{p} \times \sqrt{p}$ mesh runs in $O(\sqrt{p} + \frac{n}{p} \log^3 p)$ time, which is the same as multiselection on mesh when $r \geq \log p$, and thus has the same optimality as multiselection in this case.
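
    The weighted-rank definition in this abstract can be illustrated with a small sequential sketch. It is not the hypercube or mesh algorithm of the paper; it only sorts the elements, accumulates weights, and looks up each requested rank. The function name `weighted_multiselect` and the `(value, weight)` input layout are assumptions made for illustration, and weights are assumed to be positive.

```python
from bisect import bisect_left
from itertools import accumulate


def weighted_multiselect(items, ranks):
    """Sequential illustration of weighted multiselection.

    items: non-empty list of (value, weight) pairs with positive weights
           (hypothetical input layout, not taken from the paper).
    ranks: weighted ranks k with 0 < k <= total weight.
    Returns, for each k, the smallest value whose aggregated weight over
    all elements not greater than it is at least k.
    """
    if not items:
        raise ValueError("items must be non-empty")
    ordered = sorted(items, key=lambda pair: pair[0])  # sort by value
    values = [value for value, _ in ordered]
    prefix = list(accumulate(weight for _, weight in ordered))  # running weight
    selected = []
    for k in ranks:
        if not 0 < k <= prefix[-1]:
            raise ValueError(f"weighted rank {k} is out of range")
        # First position whose cumulative weight reaches k.
        selected.append(values[bisect_left(prefix, k)])
    return selected
```

    With all weights equal to 1, this reduces to ordinary multiselection by rank.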

    A Randomized Algorithm for Multiselection


    Efficient parallel computation on multiprocessors with optical interconnection networks

    This dissertation studies optical interconnection networks, their architecture, address schemes, and computation and communication capabilities. We focus on a simple but powerful optical interconnection network model, the Linear Array with Reconfigurable Pipelined Bus System (LARPBS). We extend the LARPBS model to a simplified higher-dimensional LARPBS and provide a set of basic computation operations. We then study the following two groups of parallel computation problems on both one-dimensional and multi-dimensional LARPBSs: parallel comparison problems, including sorting, merging, and selection; and Boolean matrix multiplication, transitive closure, and their applications to connected component problems.
    We implement an optimal sorting algorithm on an n-processor LARPBS. With this optimal sorting algorithm at our disposal, we study the sorting problem for higher-dimensional LARPBSs and obtain the following results:
    • An optimal basic Columnsort algorithm on a 2D LARPBS.
    • Two optimal two-way merge sort algorithms on a 2D LARPBS.
    • An optimal multi-way merge sorting algorithm on a 2D LARPBS.
    • An optimal generalized column sort algorithm on a 2D LARPBS.
    • An optimal generalized column sort algorithm on a 3D LARPBS.
    • An optimal 5-phase sorting algorithm on a 3D LARPBS.
    Results for selection problems are as follows:
    • A constant time maximum-finding algorithm on an LARPBS.
    • An optimal maximum-finding algorithm on an LARPBS.
    • An O((log log n)^2) time parallel selection algorithm on an LARPBS.
    • An O(k (log log n)^2) time parallel multi-selection algorithm on an LARPBS.
    While studying the computation and communication properties of the LARPBS model, we find that Boolean matrix multiplication and its applications to graph problems are another set of problems that can be solved efficiently on the LARPBS. The following results were obtained in this area:
    • A constant time Boolean matrix multiplication algorithm.
    • An O(log n)-time transitive closure algorithm.
    • An O(log n)-time connected components algorithm.
    • An O(log n)-time strongly connected components algorithm.
    The results provided in this dissertation show the strong computation and communication power of optical interconnection networks.
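
    One standard way a constant-time Boolean matrix multiplication can lead to an O(log n)-time transitive closure is repeated squaring: about log n multiplications suffice. The abstract does not say that the dissertation uses this construction, so the sequential sketch below is only an illustration of the general idea. The names `bool_matmul` and `transitive_closure` are hypothetical, and the result is the reflexive-transitive closure (every vertex reaches itself).

```python
def bool_matmul(a, b):
    """Boolean product of two n x n matrices given as lists of lists."""
    n = len(a)
    return [
        [any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(adj):
    """Reflexive-transitive closure by repeated Boolean squaring.

    adj[i][j] is truthy when there is an edge i -> j.
    """
    n = len(adj)
    # Start from (A OR I): entry (i, j) covers paths of length at most 1.
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    covered = 1  # longest path length currently covered
    while covered < n - 1:  # a simple path has at most n - 1 edges
        reach = bool_matmul(reach, reach)  # squaring doubles the covered length
        covered *= 2
    return reach
```

    The loop performs roughly log2(n) squarings, so if each Boolean multiplication takes constant parallel time, the whole closure takes O(log n) parallel time.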

    Parallelization of Reconstructability Analysis Algorithms.

    Bush Jones published a series of papers providing sequential algorithms that are key to reconstructability analysis. These algorithms include the determination of unbiased reconstructions and a greedy algorithm for a generalization of the reconstruction problem. The implementation of these sequential algorithms provides scientists and mathematicians with the means of utilizing reconstructability analysis in systems modeling. The algorithms, however, are so computationally intensive that the system is limited to a very small set of variables. Many papers have been written applying reconstructability analysis and maximum entropy methods to various disciplines. Reconstructability analysis has the potential to dramatically impact the scientific community, but the sequential algorithms leave the utilization of reconstructability analysis infeasible. The author has parallelized the reconstructability analysis algorithms developed by Jones, thereby bridging the gap between theoretical application and feasible implementation. Since the goal of parallelizing these reconstructability analysis algorithms is to make them feasible for as many researchers as possible, a specific architecture is not assumed. It is assumed only that the architecture employed is a multiple data architecture; that is, the architectural design needed for the implementation of these algorithms must have memory local to each processing element (PE). The parallel algorithms developed and presented here do not address the problems of communication between processors of particular architectures. These algorithms assume a reconfigurable bus system, which is a bus system whose configuration can be dynamically altered, thus allowing broadcasting and long-distance communications to be completed in constant time. It is noted that processor arrays with such reconfigurable bus systems have been designed.
Frequently, parallel algorithms do not address the situation in which the number of values on which to operate is larger than the number of processors. However, since the purpose of the parallelization of these reconstructability analysis algorithms is to make them feasible for large structure systems, the parallelization given does address the situation in which the number of values on which to operate is larger than the number of processors available. Therefore, implementation of the algorithms involves simply incorporating the communication protocols between processors for the particular architecture employed

    Optimal parallel algorithms for multiselection on mesh-connected computers

    Multiselection is the problem of selecting multiple elements at specified ranks from a set of arbitrary elements. In this paper, we first present an efficient algorithm for single-element selection that runs in $O(\sqrt{p} + (n/p) \log p \log(kp/n))$ time for selecting the kth smallest element from n elements on a $\sqrt{p} \times \sqrt{p}$ mesh-connected computer of $p \leq n$ processors, where the first component is for communication and the second is for computation (data comparisons). Our algorithm is more computationally efficient than the existing result when $p \geq n^{1/2 + \varepsilon}$ for any $0 < \varepsilon < 1/2$. Combining our result for $p = \Omega(\sqrt{n})$ with the existing result for $p = O(\sqrt{n})$ yields an improved computation time complexity for the selection problem on mesh, $t_{\mathrm{comp}}^{\mathrm{sel}} = O(\min\{(n/p) \log p \log(kp/n), (n/p + p) \log(n/p)\})$. Using this algorithm as a building block, we then present two efficient parallel algorithms for multiselection on the mesh-connected computers. For selecting r elements from a set of n elements on a $\sqrt{p} \times \sqrt{p}$ mesh, $p, r \leq n$, our first algorithm runs in time $O(p^{1/2} + t_{\mathrm{comp}}^{\mathrm{sel}} \min\{r \log r, \log p\})$ with processors operating in the SIMD mode, which is time-optimal when $p \leq r$. Allowing processors to operate in the MIMD mode, our second algorithm runs in $O(p^{1/2} + t_{\mathrm{comp}}^{\mathrm{sel}} \log r)$ time and is time-optimal for any r and p.
    Hong Shen, Yijie Han, Yi Pan & David Evan
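
    To make the multiselection problem itself concrete, here is a minimal sequential sketch that partitions around a random pivot and recurses only into the parts that still contain requested ranks. It is not the mesh algorithm described in the abstract and says nothing about communication cost; the names `multiselect` and `_solve` are hypothetical, and ranks are assumed to be 1-based.

```python
import random


def multiselect(data, ranks):
    """Return {k: k-th smallest element of data} for each 1-based rank k."""
    items = list(data)
    wanted = sorted(set(ranks))
    if wanted and not (1 <= wanted[0] and wanted[-1] <= len(items)):
        raise ValueError("ranks must lie between 1 and len(data)")
    found = {}
    _solve(items, wanted, 0, found)
    return found


def _solve(items, ranks, offset, found):
    # items hold the elements occupying global ranks offset+1 .. offset+len(items);
    # ranks are the requested global ranks that fall inside that range.
    if not ranks:
        return
    pivot = random.choice(items)
    less = [x for x in items if x < pivot]
    equal_count = sum(1 for x in items if x == pivot)
    greater = [x for x in items if x > pivot]
    low = offset + len(less)   # last global rank held by elements below the pivot
    high = low + equal_count   # last global rank held by copies of the pivot
    for k in ranks:
        if low < k <= high:
            found[k] = pivot
    _solve(less, [k for k in ranks if k <= low], offset, found)
    _solve(greater, [k for k in ranks if k > high], high, found)
```

    Sharing one partition among all requested ranks is what distinguishes multiselection from running a single-element selection once per rank.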

    Optimal Parallel Algorithms for Multiselection on Mesh-Connected Computers

    Multiselection is the problem of selecting multiple elements at specified ranks from a set of arbitrary elements. In this paper, we first present an efficient algorithm for single-element selection that runs in $O(\sqrt{p} + (n/p) \log p \log(kp/n))$ time for selecting the kth smallest element from n elements on a $\sqrt{p} \times \sqrt{p}$ mesh-connected computer of $p \leq n$ processors, where the first component is for communication and the second is for computation (data comparisons). Our algorithm is more computationally efficient than the existing result when $p \geq n^{1/2 + \varepsilon}$ for any $0 < \varepsilon < 1/2$. Combining our result for $p = \Omega(\sqrt{n})$ with the existing result for $p = O(\sqrt{n})$ yields an improved computation time complexity for the selection problem on mesh, $t_{\mathrm{comp}}^{\mathrm{sel}} = O(\min\{(n/p) \log p \log(kp/n), (n/p + p) \log(n/p)\})$. Using this algorithm as a building block, we then present two efficient parallel algorithms for multiselection on the mesh-connected computers. For selecting r elements from a set of n elements on a $\sqrt{p} \times \sqrt{p}$ mesh, $p, r \leq n$, our first algorithm runs in time $O(p^{1/2} + t_{\mathrm{comp}}^{\mathrm{sel}} \min\{r \log r, \log p\})$ with processors operating in the SIMD mode, which is time-optimal when $p \leq r$. Allowing processors to operate in the MIMD mode, our second algorithm runs in $O(p^{1/2} + t_{\mathrm{comp}}^{\mathrm{sel}} \log r)$ time and is time-optimal for any r and p.
    Key words: Computation time, mesh, multiselection, parallel algorithm, routing, selection.
    Questions regarding this paper should be sent to shen@jaist.ac.jp.

    Fundamental Characteristics of Turbulent Opposed Impinging Jets

    A fundamental study of two turbulent, directly opposed impinging jets in a stagnant ambient fluid, unconfined and uninfluenced by walls, is presented. Through experimental investigation and numerical modeling, the main characteristics of direct impingement of two turbulent axisymmetric round jets under seven different geometrical and flow rate configurations (L* = L/d = {5, 10, 20}, where L is the nozzle-to-nozzle separation distance and d is the nozzle diameter, and Re = {1500, 4500, 7500, 11000}) are discussed. Flow visualization and velocity measurements performed using various laser-based techniques have revealed the effects of Reynolds number, Re, and nozzle-to-nozzle separation, L, on the complex flow structure. Although locally valid, the classical analysis of turbulence is found unable to provide reliable results within the highly unstable and unsteady impingement region. When three distinct k-ε turbulence models were used to simulate the present flow, the assessment of their performance showed little disagreement between computed and experimental mean velocities but poor predictions as far as turbulence parameters are concerned.