40 research outputs found

The behavior of conjugate gradient algorithms on a multivector processor with a hierarchical memory

Author: Meier Ulrike
Sameh Ahmed
Publication venue: Published by Elsevier B.V.
Publication date: 30/11/1988

In this paper, an analysis of some of the tradeoffs involved in the design and efficient implementation of conjugate gradient-based algorithms for a multivector processor with a two-level memory hierarchy is presented and supplemented by experimental results obtained on an Alliant FX/8. The algorithms considered consist of the classical conjugate gradient method, preconditioning techniques that are well suited for parallel computers such as polynomial preconditioners and several versions of the incomplete Cholesky preconditioners, as well as the reduced system approach. For linear systems arising from the 5-point finite difference discretization of 2-D self-adjoint elliptic PDEs, the analysis shows that conjugate gradient methods do not perform as well as algorithms for dense matrix computations on the considered architecture due to lack of data locality. By using the reduced system approach, however, a significant decrease in time could be obtained.

Elsevier - Publisher Connector

Solving large sparse eigenvalue problems on supercomputers

Author: Philippe Bernard
Saad Youcef

An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix-by-vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method, are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one-processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed.

NASA Technical Reports Server

On the parallel solution of parabolic equations

Author: Gallopoulos E.
Saad Youcef
Publication date: 01/01/1989

Parallel algorithms for the solution of linear parabolic problems are proposed. The first of these methods is based on using polynomial approximation to the exponential. It does not require solving any linear systems and is highly parallelizable. The two other methods proposed are based on Padé and Chebyshev approximations to the matrix exponential. The parallelization of these methods is achieved by using partial fraction decomposition techniques to solve the resulting systems and thus offers the potential for increased time parallelism in time-dependent problems. Experimental results from the Alliant FX/8 and the Cray Y-MP/832 vector multiprocessors are also presented.

Crossref

NASA Technical Reports Server

An experiment in hurricane track prediction using parallel computing methods

Author: Dhall S. K.
Jwo Jung-Sing
Lakshmivarahan S.
Lewis John M.
Song Chang G.
Velden Christopher S.

The barotropic model is used to explore the advantages of parallel processing in deterministic forecasting. We apply this model to the track forecasting of Hurricane Elena (1985). In this particular application, solutions to systems of elliptic equations are the essence of the computational mechanics. One set of equations is associated with the decomposition of the wind into irrotational and nondivergent components; this determines the initial nondivergent state. Another set is associated with recovery of the streamfunction from the forecasted vorticity. We demonstrate that direct parallel methods based on accelerated block cyclic reduction (BCR) significantly reduce the computational time required to solve the elliptic equations germane to this decomposition and forecast problem. A 72-h track prediction was made using incremental time steps of 16 min on a network of 3000 grid points nominally separated by 100 km. The prediction took 30 s on the 8-processor Alliant FX/8 computer. This was a speed-up of 3.7 when compared to the one-processor version. The 72-h prediction of Elena's track was made as the storm moved toward Florida's west coast. Approximately 200 km west of Tampa Bay, Elena executed a dramatic recurvature that ultimately changed its course toward the northwest. Although the barotropic track forecast was unable to capture the hurricane's tight cycloidal looping maneuver, the subsequent northwesterly movement was accurately forecasted, as was the location and timing of landfall near Mobile Bay.

NASA Technical Reports Server

Vectorization of the odd-even hopscotch scheme and the alternating direction implicit scheme for the two-dimensional Burgers equations

Author: Goede E.D. (Erik) de
Thije Boonkkamp J.H.M. ten
Publication date: 01/01/1990

CWI's Institutional Repository

Pure OAI Repository

Parallel Gaussian elimination of a block tridiagonal matrix using multiple microcomputers

Author: Blech Richard A.

The solution of a block tridiagonal matrix using parallel processing is demonstrated. The multiprocessor system on which results were obtained and the software environment used to program that system are described. Theoretical partitioning and resource allocation for the Gaussian elimination method used to solve the matrix are discussed. The results obtained from running 1-, 2-, and 3-processor versions of the block tridiagonal solver are presented. The PASCAL source code for these solvers is given in the appendix, and may be transportable to other shared-memory parallel processors provided that the synchronization outlines are reproduced on the target system.

University of Michigan Library Repository

NASA Technical Reports Server

An asymptotic induced numerical method for the convection-diffusion-reaction equation

Author: Scroggs Jeffrey S.
Sorensen Danny C.

A parallel algorithm for the efficient solution of a time-dependent reaction-convection-diffusion equation with a small parameter on the diffusion term is presented. The method is based on a domain decomposition that is dictated by singular perturbation analysis. The analysis is used to determine regions where certain reduced equations may be solved in place of the full equation. Parallelism is evident at two levels. Domain decomposition provides parallelism at the highest level, and within each domain there is ample opportunity to exploit parallelism. Run-time results demonstrate the viability of the method.

NASA Technical Reports Server

Non‐stationary parallel multisplitting algorithms for almost linear systems

Author: Josep Arnal
José Penadés
Violeta Migallón
Publication venue: Wiley
Publication date: 01/01/2002

Crossref

A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

Vectorization and parallelization of the finite strip method for dynamic Mindlin plate problems

Author: Chen Hsin-Chu
He Ai-Fang

The finite strip method is a semi-analytical finite element process which allows for a discrete analysis of certain types of physical problems by discretizing the domain of the problem into finite strips. This method decomposes a single large problem into m smaller independent subproblems when m harmonic functions are employed, thus yielding natural parallelism at a very high level. In this paper we address vectorization and parallelization strategies for the dynamic analysis of simply supported Mindlin plate bending problems and show how to prevent potential conflicts in memory access during the assemblage process. The vector and parallel implementations of this method and the performance results of a test problem under scalar, vector, and vector-concurrent execution modes on the Alliant FX/80 are also presented.

NASA Technical Reports Server