An efficient parallel immersed boundary algorithm using a pseudo-compressible fluid solver

Authors: Stockie John M.; Wiens Jeffrey K.
Publication venue: 'Elsevier BV'
Publication date: 19/05/2014

We propose an efficient algorithm for the immersed boundary method on distributed-memory architectures, with the computational complexity of a completely explicit method and excellent parallel scaling. The algorithm uses the pseudo-compressibility method recently proposed by Guermond and Minev [Comptes Rendus Mathematique, 348:581-585, 2010], which applies a directional splitting strategy to discretize the incompressible Navier-Stokes equations, thereby reducing the linear systems to a series of one-dimensional tridiagonal systems. We perform numerical simulations of several fluid-structure interaction problems in two and three dimensions and study the accuracy and convergence rates of the proposed algorithm. For these problems, we compare the proposed algorithm against other second-order projection-based fluid solvers. Lastly, the strong and weak scaling properties of the proposed algorithm are investigated.
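The reduction to one-dimensional tridiagonal systems mentioned in this abstract is what keeps each solve cheap. As a rough illustration only (this is the classical Thomas algorithm, not the authors' parallel solver), a serial tridiagonal solve might look like the sketch below. The function name `solve_tridiagonal` and the array layout are assumptions made for this example, and the matrix is assumed to be diagonally dominant so that no pivoting is needed.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused),
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    No pivoting is done, so diagonal dominance is assumed.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution from the last unknown upwards.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Small example: rows of the form x[i-1] + 4*x[i] + x[i+1] = rhs[i].
solution = solve_tridiagonal(
    lower=[0.0, 1.0, 1.0, 1.0],
    diag=[4.0, 4.0, 4.0, 4.0],
    upper=[1.0, 1.0, 1.0, 0.0],
    rhs=[6.0, 12.0, 18.0, 19.0],
)
```

The cost is linear in the number of unknowns, which is why a splitting that yields only tridiagonal systems can approach the per-step cost of an explicit method.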

arXiv.org e-Print Archive

CiteSeerX

A simple parallel prefix algorithm for compact finite-difference schemes

Authors: Joslin Ronald D.; Sun Xian-He

A compact scheme is a discretization scheme that is advantageous in obtaining highly accurate solutions. However, the systems resulting from compact schemes are tridiagonal systems that are difficult to solve efficiently on parallel computers. Considering the almost symmetric Toeplitz structure, a parallel algorithm, simple parallel prefix (SPP), is proposed. The SPP algorithm requires less memory than the conventional LU decomposition and is highly efficient on parallel machines. It consists of a prefix communication pattern and AXPY operations. Both the computation and the communication can be truncated without degrading the accuracy when the system is diagonally dominant. A formal accuracy study was conducted to provide a simple truncation formula. Experimental results were measured on a MasPar MP-1 SIMD machine and on a Cray 2 vector machine. They show that the simple parallel prefix algorithm is well suited to compact schemes on high-performance computers.

NASA Technical Reports Server

An adaptive grid algorithm for one-dimensional nonlinear equations

Authors: Gutierrez William E.; Hills Richard G.

Richards' equation, which models the flow of liquid through unsaturated porous media, is highly nonlinear and difficult to solve. Steep gradients in the field variables require the use of fine grids and small time step sizes. The numerical instabilities caused by the nonlinearities often require the use of iterative methods such as Picard or Newton iteration. These difficulties result in large CPU requirements in solving Richards' equation. With this in mind, adaptive and multigrid methods are investigated for use with nonlinear equations such as Richards' equation. Attention is focused on one-dimensional transient problems. To investigate the use of multigrid and adaptive grid methods, a series of problems is studied. First, a multigrid program is developed and used to solve an ordinary differential equation, demonstrating the efficiency with which low and high frequency errors are smoothed out. The multigrid algorithm and an adaptive grid algorithm are then used to solve one-dimensional transient partial differential equations, such as the diffusion and convection-diffusion equations. The performance of these programs is compared to that of the Gauss-Seidel and tridiagonal methods. The adaptive and multigrid schemes outperformed the Gauss-Seidel algorithm, but were not as fast as the tridiagonal method. The adaptive grid scheme solved the problems slightly faster than the multigrid method. To solve nonlinear problems, Picard iterations are introduced into the adaptive grid and tridiagonal methods. Burgers' equation is used as a test problem for the two algorithms. Both methods obtain solutions of comparable accuracy for similar time increments. For Burgers' equation, the adaptive grid method finds the solution approximately three times faster than the tridiagonal method. Finally, both schemes are used to solve the water content formulation of Richards' equation. For this problem, the adaptive grid method obtains a more accurate solution in fewer work units and less computation time than required by the tridiagonal method. The performance of the adaptive grid method tends to degrade as the solution process proceeds in time, but still remains faster than the tridiagonal scheme.
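To make the combination of Picard iteration and a tridiagonal solve concrete, here is a minimal sketch for one implicit time step of the viscous Burgers' equation. It is not the report's code: the backward Euler and central-difference discretization, the zero boundary values, the function names `thomas` and `picard_burgers_step`, and all parameter values are assumptions chosen for illustration.

```python
import math


def thomas(lower, diag, upper, rhs):
    """Tridiagonal solve without pivoting (assumes diagonal dominance)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def picard_burgers_step(u_old, dt, h, nu, tol=1e-10, max_iter=50):
    """Advance u_t + u u_x = nu u_xx by one backward Euler step.

    The advecting velocity is frozen at the previous Picard iterate, so
    every iteration is a linear tridiagonal solve. Boundary values are
    held at zero. Returns the new solution and the iteration count.
    """
    n = len(u_old)
    diffusion = nu * dt / h**2
    u_iter = list(u_old)
    for iteration in range(1, max_iter + 1):
        lower, diag, upper, rhs = [0.0] * n, [1.0] * n, [0.0] * n, [0.0] * n
        for i in range(1, n - 1):
            advection = dt * u_iter[i] / (2.0 * h)
            lower[i] = -diffusion - advection
            diag[i] = 1.0 + 2.0 * diffusion
            upper[i] = -diffusion + advection
            rhs[i] = u_old[i]
        u_new = thomas(lower, diag, upper, rhs)
        change = max(abs(a - b) for a, b in zip(u_new, u_iter))
        u_iter = u_new
        if change < tol:
            return u_new, iteration
    raise RuntimeError("Picard iteration did not converge")


# Illustrative setup: sine profile on [0, 1] with 21 grid points.
n_points = 21
h = 1.0 / (n_points - 1)
u0 = [math.sin(math.pi * i * h) for i in range(n_points)]
u1, iterations = picard_burgers_step(u0, dt=0.01, h=h, nu=0.1)
```

Freezing the advecting velocity keeps the matrix tridiagonal at every iteration. With the values chosen here the cell Peclet number is below one, so the system stays diagonally dominant and the unpivoted solve is safe; a coarser grid or smaller viscosity would need upwinding or pivoting.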

NASA Technical Reports Server

Alternating-Direction Line-Relaxation Methods on Multicomputers

Authors: Hofhaus Jörn; Van de Velde Eric
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/1996

We study the multicomputer performance of a three-dimensional Navier–Stokes solver based on alternating-direction line-relaxation methods. We compare several multicomputer implementations, each of which combines a particular line-relaxation method and a particular distributed block-tridiagonal solver. In our experiments, the problem size was determined by resolution requirements of the application. As a result, the granularity of the computations of our study is finer than is customary in the performance analysis of concurrent block-tridiagonal solvers. Our best results were obtained with a modified half-Gauss–Seidel line-relaxation method implemented by means of a new iterative block-tridiagonal solver that is developed here. Most computations were performed on the Intel Touchstone Delta, but we also used the Intel Paragon XP/S, the Parsytec SC-256, and the Fujitsu S-600 for comparison.

Caltech Authors

Publikationsserver der RWTH Aachen University

Development of iterative techniques for the solution of unsteady compressible viscous flows

Authors: Hixon Duane; Sankar L. N.

During the past two decades, there has been significant progress in the field of numerical simulation of unsteady compressible viscous flows. At present, a variety of solution techniques exist, such as transonic small disturbance (TSD) analyses, transonic full potential equation-based methods, unsteady Euler solvers, and unsteady Navier-Stokes solvers. These advances have been made possible by developments in three areas: (1) improved numerical algorithms; (2) automation of body-fitted grid generation schemes; and (3) advanced computer architectures with vector processing and massively parallel processing features. In this work, the GMRES scheme has been considered as a candidate for acceleration of a Newton iteration time marching scheme for unsteady 2-D and 3-D compressible viscous flow calculations; preliminary calculations indicate that this will provide up to a 65 percent reduction in the computer time requirements over the existing class of explicit and implicit time marching schemes. The proposed method has been tested on structured grids, but is flexible enough for extension to unstructured grids. The described scheme has been tested only on the current generation of vector processor architecture of the Cray Y/MP class, but should be suitable for adaptation to massively parallel machines.

NASA Technical Reports Server

Improved Accuracy and Parallelism for MRRR-based Eigensolvers -- A Mixed Precision Approach

Authors: Bientinesi Paolo; Petschow Matthias; Quintana-Orti Enrique
Publication date: 01/01/2013

The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR) is among the fastest methods. Although fast, the solvers based on MRRR do not deliver the same accuracy as competing methods like Divide & Conquer or the QR algorithm. In this paper, we demonstrate that the use of mixed precisions leads to improved accuracy of MRRR-based eigensolvers with limited or no performance penalty. As a result, we obtain eigensolvers that are not only equally or more accurate than the best available methods, but also, in most circumstances, faster and more scalable than the competition.
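For readers unfamiliar with the symmetric tridiagonal eigenproblem named in this abstract, the sketch below computes individual eigenvalues by Sturm-count bisection. This is a much simpler classical method, not MRRR, and it does not show the paper's mixed-precision idea; the function names and the zero-pivot guard are assumptions made for this example.

```python
import math


def count_eigenvalues_below(diag, off, x):
    """Count eigenvalues of a symmetric tridiagonal matrix that are < x.

    diag holds the n diagonal entries and off the n-1 off-diagonal
    entries. The count equals the number of negative pivots in the
    LDL^T factorization of T - x*I (Sylvester's law of inertia).
    """
    count = 0
    pivot = 1.0
    for i in range(len(diag)):
        coupling = off[i - 1] ** 2 / pivot if i > 0 else 0.0
        pivot = diag[i] - x - coupling
        if pivot == 0.0:
            pivot = -1e-300  # assumed guard against division by zero
        if pivot < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, tol=1e-12):
    """Return the k-th smallest eigenvalue (k = 0 is the smallest)."""
    n = len(diag)
    # Gershgorin discs give an interval containing every eigenvalue.
    radius = [
        (abs(off[i - 1]) if i > 0 else 0.0)
        + (abs(off[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lo = min(d - r for d, r in zip(diag, radius))
    hi = max(d + r for d, r in zip(diag, radius))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_eigenvalues_below(diag, off, mid) > k:
            hi = mid  # at least k+1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Example: the 4x4 matrix with 2 on the diagonal and -1 beside it.
diag = [2.0, 2.0, 2.0, 2.0]
off = [-1.0, -1.0, -1.0]
eigenvalues = [kth_eigenvalue(diag, off, k) for k in range(4)]
```

For this example matrix the exact eigenvalues are known in closed form, 2 - 2 cos(j*pi/5) for j = 1, ..., 4, which makes it a convenient check. Because each eigenvalue is found independently, bisection parallelizes easily, but it is typically slower than MRRR or Divide & Conquer when the whole spectrum is needed.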

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Publikationsserver der RWTH Aachen University