4,718 research outputs found
A bibliography on parallel and vector numerical algorithms
This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.
Solution of partial differential equations on vector and parallel computers
The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.
Minimizing synchronizations in sparse iterative solvers for distributed supercomputers
Eliminating synchronizations is one of the important techniques for minimizing communication in modern high performance computing. This paper discusses principles for reducing communication caused by global synchronizations in sparse iterative solvers on distributed supercomputers. We demonstrate how to minimize global synchronizations by rescheduling a typical Krylov subspace method. The benefit of minimizing synchronizations is shown in theoretical analysis and is verified by numerical experiments using up to 900 processors. The experiments also show that the communication complexities of some structured sparse matrix-vector multiplications and of global communications on the underlying supercomputer are on the order of P^{1/2.5} and P^{4/5}, respectively, where P is the number of processors; the experiments were carried out on a Dawning 5000A.
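The abstract does not say which Krylov method was rescheduled or how. As a hedged illustration of the general idea only, the sketch below uses the Chronopoulos-Gear variant of conjugate gradients as a stand-in: it rearranges the recurrences so that both inner products of an iteration are available at the same point and can share a single global reduction. The function names (`matvec`, `fused_dots`, `cg_single_sync`) and the dense list-of-lists matrix are assumptions made for this example; in a distributed code `fused_dots` would correspond to one all-reduce carrying two scalars.

```python
def matvec(A, v):
    """Dense matrix-vector product (stands in for a distributed sparse one)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def fused_dots(r, w):
    """Compute (r, r) and (w, r) together.

    In a distributed setting both partial sums would be sent in ONE
    global reduction, i.e. one synchronization point per iteration.
    """
    gamma = sum(ri * ri for ri in r)
    delta = sum(wi * ri for wi, ri in zip(w, r))
    return gamma, delta


def cg_single_sync(A, b, tol=1e-10, max_iter=200):
    """Chronopoulos-Gear CG for a symmetric positive definite A.

    Returns the approximate solution and the number of fused reductions.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual for the initial guess x = 0
    w = matvec(A, r)
    gamma, delta = fused_dots(r, w)  # the only reduction before the loop
    syncs = 1
    if gamma == 0.0:
        return x, syncs              # b is zero, so x = 0 is already exact
    alpha, beta = gamma / delta, 0.0
    p = [0.0] * n
    s = [0.0] * n                    # s tracks A*p without an extra matvec
    for _ in range(max_iter):
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        s = [wi + beta * si for wi, si in zip(w, s)]
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * si for ri, si in zip(r, s)]
        w = matvec(A, r)
        gamma_new, delta = fused_dots(r, w)  # single reduction per iteration
        syncs += 1
        if gamma_new ** 0.5 < tol:
            break
        beta = gamma_new / gamma
        alpha = gamma_new / (delta - beta * gamma_new / alpha)
        gamma = gamma_new
    return x, syncs


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    x, syncs = cg_single_sync(A, b)
    print(x, syncs)
```

Textbook CG needs two separate reductions per iteration because the second inner product depends on a vector updated with the result of the first; the rearrangement trades one extra vector recurrence (`s`) for the removed synchronization.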
A Cellular, Language Directed Computer Architecture
If a VLSI computer architecture is to influence the field of computing in some major way, it must have attractive properties in all important aspects affecting the design, production, and the use of the resulting computers. A computer architecture that is believed to have such properties is briefly discussed.
Scalable Coordinated Beamforming for Dense Wireless Cooperative Networks
To meet the ever growing demand for both high throughput and uniform coverage
in future wireless networks, dense network deployment will be ubiquitous, for
which cooperation among the access points is critical. Considering the
computational complexity of designing coordinated beamformers for dense
networks, low-complexity and suboptimal precoding strategies are often adopted.
However, it is not clear how much performance loss will be caused. To enable
optimal coordinated beamforming, in this paper, we propose a framework to
design a scalable beamforming algorithm based on the alternating direction
method of multipliers (ADMM). Specifically, we first propose to apply
the matrix stuffing technique to transform the original optimization problem to
an equivalent ADMM-compliant problem, which is much more efficient than the
widely used modeling framework CVX. We then propose to use the ADMM
algorithm, a.k.a. the operator splitting method, to solve the transformed
ADMM-compliant problem efficiently. In particular, the subproblems of the ADMM
algorithm at each iteration can be solved in closed form and in parallel.
Simulation results show that the proposed techniques can yield significant
gains in computational efficiency over state-of-the-art interior-point
solvers. Furthermore, the simulation results demonstrate that the optimal
coordinated beamforming can significantly improve the system performance
compared to suboptimal zero-forcing beamforming.
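The remark that the ADMM subproblems have closed-form solutions and can be solved in parallel can be illustrated on a much simpler problem than coordinated beamforming. The sketch below is a toy example, not the formulation in the abstract: it applies ADMM to minimize 0.5*||x - a||^2 + lam*||z||_1 subject to x = z, a problem chosen here only because both updates are explicit formulas that act on each component independently. The names `admm_l1_denoise`, `soft_threshold`, `rho`, and `lam` are assumptions made for this example.

```python
def soft_threshold(v, kappa):
    """Closed-form minimizer of kappa*|z| + 0.5*(z - v)^2."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_l1_denoise(a, lam=1.0, rho=1.0, iters=100):
    """Toy ADMM: minimize 0.5*||x - a||^2 + lam*||z||_1 subject to x = z.

    Every update below is a closed-form expression applied component by
    component, so the components could be processed in parallel.
    """
    n = len(a)
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: minimizer of a quadratic, available in closed form
        x = [(ai + rho * (zi - ui)) / (1.0 + rho)
             for ai, zi, ui in zip(a, z, u)]
        # z-update: soft thresholding, also closed form
        z = [soft_threshold(xi + ui, lam / rho) for xi, ui in zip(x, u)]
        # dual update: accumulate the constraint violation x - z
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return z


if __name__ == "__main__":
    print(admm_l1_denoise([3.0, -0.5, 1.2, -2.0], lam=1.0))
```

For this toy problem the exact answer is the soft threshold of `a` at level `lam`, which makes the iteration easy to check. The beamforming problem in the abstract is a larger conic program, and its transformed ADMM-compliant form is not reproduced here.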
Using the VBARMS method in parallel computing
The paper describes an improved parallel MPI-based implementation of VBARMS, a variable block variant of the pARMS preconditioner proposed by Li, Saad and Sosonkina [NLAA, 2003] for solving general nonsymmetric linear systems. The parallel VBARMS solver can automatically detect exact or approximate dense structures in the linear system, and exploits this information to achieve improved reliability and increased throughput during the factorization. A novel graph compression algorithm is discussed that finds these approximate dense block structures and requires only one simple-to-use parameter. A complete study of the numerical and parallel performance of parallel VBARMS is presented for the analysis of large turbulent Navier-Stokes equations on a suite of three-dimensional test cases.
Stability Analysis in Spanwise-Periodic Double-Sided Lid-Driven Cavity Flows With Complex Cross-Sectional Profiles
Three-dimensional linear instability analyses are presented of steady two-dimensional laminar flows in the lid-driven cavity defined by [15] and further analyzed in the present volume [1], as well as in a derivative of the same geometry. It is shown that, in both of the geometries considered, three-dimensional BiGlobal instability leads to deviation of the flow from the two-dimensional solution; the analysis results are used to define low- and high-Reynolds number solutions by reference to the flow physics. Critical conditions for linear global instability and neutral loops are presented in both geometries.