    A methodology for exploiting parallelism in the finite element process

    A methodology is described for developing a parallel system using a top-down approach that takes the user's requirements into account. Substructuring, a popular technique in structural analysis, is used to illustrate this approach.

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

    Where are the parallel algorithms?

    Four paradigms that can be useful in developing parallel algorithms are discussed: computational complexity analysis, changing the order of computation, asynchronous computation, and divide and conquer. Each is illustrated with an example from scientific computation, and it is shown that computational complexity must be used with great care or an inefficient algorithm may be selected.
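As a rough illustration of two of the paradigms named above (changing the order of computation, and divide and conquer), the sketch below sums a list pairwise instead of left to right. It is not taken from the paper; the function names `tree_sum` and `reduction_depth` are hypothetical. The point is only that reordering the additions into a balanced tree leaves the total work at n - 1 additions while shortening the chain of dependent additions from n - 1 to about log2(n), which is what a parallel machine could exploit.

```python
import math


def tree_sum(values):
    """Sum a sequence by recursively splitting it in half (divide and conquer).

    The two halves are independent, so on a parallel machine they could be
    summed at the same time; this serial sketch only shows the ordering.
    """
    if not values:
        return 0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return tree_sum(values[:mid]) + tree_sum(values[mid:])


def reduction_depth(n):
    """Number of dependent addition steps in the balanced tree for n items."""
    return 0 if n <= 1 else math.ceil(math.log2(n))


if __name__ == "__main__":
    data = list(range(1, 9))
    print(tree_sum(data), reduction_depth(len(data)))
```

A left-to-right loop over the same eight values would need seven dependent steps; the tree ordering needs three.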

    Partially Ordered Two-way Büchi Automata

    We introduce partially ordered two-way Büchi automata and characterize their expressive power in terms of fragments of first-order logic FO[<]. Partially ordered two-way Büchi automata are Büchi automata that can change the direction in which the input is processed, with the constraint that once a state is left, it is never re-entered. Nondeterministic partially ordered two-way Büchi automata coincide with the first-order fragment Sigma2. Our main contribution is that deterministic partially ordered two-way Büchi automata are expressively complete for the first-order fragment Delta2. As an intermediate step, we show that deterministic partially ordered two-way Büchi automata are effectively closed under Boolean operations. A small model property yields coNP-completeness of the emptiness problem and the inclusion problem for deterministic partially ordered two-way Büchi automata. Comment: The results of this paper were presented at CIAA 2010; University of Stuttgart, Computer Scienc

    A hierarchically blocked Jacobi SVD algorithm for single and multiple graphics processing units

    We present a hierarchically blocked one-sided Jacobi algorithm for the singular value decomposition (SVD), targeting both single and multiple graphics processing units (GPUs). The blocking structure reflects the levels of the GPU memory hierarchy. The algorithm may outperform MAGMA's dgesvd, while retaining high relative accuracy. To this end, we developed a family of parallel pivot strategies for the GPU's shared address space that are also applicable to inter-GPU communication. Unlike common hybrid approaches, our algorithm in a single-GPU setting needs a CPU for control purposes only, while utilizing the GPU's resources to the fullest extent permitted by the hardware. When required by the problem size, the algorithm, in principle, scales to an arbitrary number of GPU nodes. The scalability is demonstrated by a more than twofold speedup for sufficiently large matrices on a Tesla S2050 system with four GPUs vs. a single Fermi card. Comment: Accepted for publication in SIAM Journal on Scientific Computin
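For readers unfamiliar with the one-sided Jacobi method mentioned above, the following is a minimal serial sketch of the textbook (Hestenes) variant in plain Python. It is an assumption-laden illustration, not the paper's algorithm: there is no blocking, no GPU code, and no parallel pivot strategy, and the function name, tolerance, and sweep limit are hypothetical choices. It repeatedly rotates pairs of columns until all pairs are numerically orthogonal; the column norms are then the singular values.

```python
import math


def one_sided_jacobi_singular_values(a, tol=1e-12, max_sweeps=30):
    """Singular values of a (list of rows), largest first, via column rotations."""
    m, n = len(a), len(a[0])
    # Store the matrix column-wise so each rotation touches two lists.
    cols = [[a[i][j] for i in range(m)] for j in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        # Simple row-cyclic pivot order over all column pairs (p, q).
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])
                beta = sum(x * x for x in cols[q])
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))
                # Skip pairs that are already orthogonal to working precision.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(m):
                    xp, xq = cols[p][i], cols[q][i]
                    cols[p][i] = c * xp - s * xq
                    cols[q][i] = s * xp + c * xq
        if not rotated:
            break  # a full sweep made no rotation: converged
    norms = (math.sqrt(sum(x * x for x in col)) for col in cols)
    return sorted(norms, reverse=True)


if __name__ == "__main__":
    print(one_sided_jacobi_singular_values([[2.0, 1.0], [1.0, 2.0]]))
```

Rotations on disjoint column pairs are independent of one another, which is the property that parallel pivot strategies build on.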