4,718 research outputs found

    A bibliography on parallel and vector numerical algorithms

    Get PDF
    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are listed also

    Solution of partial differential equations on vector and parallel computers

    Get PDF
    The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed

    Minimizing synchronizations in sparse iterative solvers for distributed supercomputers

    Get PDF
    Eliminating synchronizations is one of the important techniques related to minimizing communications for modern high performance computing. This paper discusses principles of reducing communications due to global synchronizations in sparse iterative solvers on distributed supercomputers. We demonstrates how to minimizing global synchronizations by rescheduling a typical Krylov subspace method. The benefit of minimizing synchronizations is shown in theoretical analysis and is verified by numerical experiments using up to 900 processors. The experiments also show the communication complexity for some structured sparse matrix vector multiplications and global communications in the underlying supercomputers are in the order P1/2.5 and P4/5 respectively, where P is the number of processors and the experiments were carried on a Dawning 5000A

    A Cellular, Language Directed Computer Architecture

    Get PDF
    If a VLSI computer architecture is to influence the field of computing in some major way, it must have attractive properties in all important aspects affecting the design, production, and the use of the resulting computers. A computer architecture that is believed to have such properties is briefly discussed

    Scalable Coordinated Beamforming for Dense Wireless Cooperative Networks

    Full text link
    To meet the ever growing demand for both high throughput and uniform coverage in future wireless networks, dense network deployment will be ubiquitous, for which co- operation among the access points is critical. Considering the computational complexity of designing coordinated beamformers for dense networks, low-complexity and suboptimal precoding strategies are often adopted. However, it is not clear how much performance loss will be caused. To enable optimal coordinated beamforming, in this paper, we propose a framework to design a scalable beamforming algorithm based on the alternative direction method of multipliers (ADMM) method. Specifically, we first propose to apply the matrix stuffing technique to transform the original optimization problem to an equivalent ADMM-compliant problem, which is much more efficient than the widely-used modeling framework CVX. We will then propose to use the ADMM algorithm, a.k.a. the operator splitting method, to solve the transformed ADMM-compliant problem efficiently. In particular, the subproblems of the ADMM algorithm at each iteration can be solved with closed-forms and in parallel. Simulation results show that the proposed techniques can result in significant computational efficiency compared to the state- of-the-art interior-point solvers. Furthermore, the simulation results demonstrate that the optimal coordinated beamforming can significantly improve the system performance compared to sub-optimal zero forcing beamforming

    Using the VBARMS method in parallel computing

    Get PDF
    The paper describes an improved parallel MPI-based implementation of VBARMS, a variable block variant of the pARMS preconditioner proposed by Li, Saad and Sosonkina [NLAA, 2003] for solving general nonsymmetric linear systems. The parallel VBARMS solver can detect automatically exact or approximate dense structures in the linear system, and exploits this information to achieve improved reliability and increased throughput during the factorization. A novel graph compression algorithm is discussed that finds these approximate dense blocks structures and requires only one simple to use parameter. A complete study of the numerical and parallel performance of parallel VBARMS is presented for the analysis of large turbulent Navier-Stokes equations on a suite of three- dimensional test cases

    Stability Analysis in Spanwise-Periodic Double-Sided Lid-Driven Cavity Flows With Complex Cross-Sectional Profiles

    Get PDF
    Three-dimensional linear instability analyses are presented of steady two-dimensional laminar flows in the lid-driven cavity defined by [15] and further analyzed in the present volume [1], as well as in a derivative of the same geometry. It is shown that in both of the geometries considered three-dimensional BiGlobal instability leads to deviation of the flow from the two-dimensional solution; the analysis results are used to define low- and high-Reynolds number solutions by reference to the flow physics. Critical conditions for linear global instability and neutral loops are presented in both geometries
    • …
    corecore