10,350 research outputs found

An Arnoldi-frontal approach for the stability analysis of flows in a collapsible channel

Author: Cai Zongxi
Hao Yujue
Luo Xiaoyu
Roper Steven
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/09/2016

In this paper, we present a new approach based on a combination of the Arnoldi and frontal methods for solving large sparse asymmetric and generalized complex eigenvalue problems. The new eigensolver seeks the most unstable eigensolution in the Krylov subspace and makes use of the efficiency of the frontal solver developed for finite element methods. The approach is used for a stability analysis of flows in a collapsible channel and is found to significantly improve the computational efficiency compared to the traditionally used QZ solver or a standard Arnoldi method. With the new approach, we are able to validate the previous results, which were either obtained on a much coarser mesh or estimated from unsteady simulations. New neutral stability solutions of the system have been obtained that lie beyond the limits of previously used methods.

Enlighten

h-multigrid agglomeration based solution strategies for discontinuous Galerkin discretizations of incompressible flow problems

Author: Bassi Francesco
Botti Lorenzo
Colombo Alessandro
Publication venue: 'Elsevier BV'
Publication date: 10/03/2017

In this work we exploit agglomeration based h-multigrid preconditioners to speed up the iterative solution of discontinuous Galerkin discretizations of the Stokes and Navier-Stokes equations. As a distinctive feature, h-coarsened mesh sequences are generated by recursive agglomeration of a fine grid, admitting arbitrarily unstructured grids of complex domains, and agglomeration based discontinuous Galerkin discretizations are employed to deal with agglomerated elements of coarse levels. Both the expense of building coarse grid operators and the performance of the resulting multigrid iteration are investigated. For the sake of efficiency, coarse grid operators are inherited through element-by-element L^2 projections, avoiding the cost of numerical integration over agglomerated elements. Specific care is devoted to the projection of viscous terms discretized by means of the BR2 dG method. We demonstrate that enforcing the correct amount of stabilization on coarse grid levels is mandatory for achieving uniform convergence with respect to the number of levels. The numerical solution of steady and unsteady, linear and non-linear problems is considered, tackling challenging 2D test cases and 3D real-life computations on parallel architectures. Significant execution time gains are documented. Comment: 78 pages, 7 figures

arXiv.org e-Print Archive

Parallel Factorizations in Numerical Analysis

Author: Amodio Pierluigi
Brugnano Luigi
Publication date: 01/01/2009

In this paper we review the parallel solution of sparse linear systems, usually arising from the discretization of ODE-IVPs or ODE-BVPs. The approach is based on the concept of parallel factorization of a (block) tridiagonal matrix. This makes it possible to obtain efficient parallel extensions of many known matrix factorizations and to derive, as a by-product, a unifying approach to the parallel solution of ODEs. Comment: 15 pages, 5 figures

arXiv.org e-Print Archive

CiteSeerX

Archivio istituzionale della ricerca - Università di Bari

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK (STRUctured Matrices PACKage), which also has a distributed memory component for dense rank-structured matrices.

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion

A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures

Author: Buttari Alfredo
Dongarra Jack
Kurzak Jakub
Langou Julien
Publication date: 01/01/2007

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated, or new algorithms have to be developed, in order to take advantage of the architectural features of these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents an algorithm for the Cholesky, LU and QR factorizations in which the operations can be represented as a sequence of small tasks that operate on square blocks of data. These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in an out-of-order execution of the tasks, which will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented with the LAPACK algorithms, where parallelism can only be exploited at the level of the BLAS operations, and with vendor implementations.
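The task decomposition described in this abstract can be illustrated with a minimal sequential sketch of a tiled Cholesky factorization. It is not the paper's implementation: the function name `tiled_cholesky`, the tile size `nb`, and the use of NumPy are assumptions made for illustration, and the tasks run in a fixed loop order rather than under a dynamic scheduler. Each commented step corresponds to one small task on a square tile; a runtime could reorder them subject to the data dependencies.

```python
import numpy as np

def tiled_cholesky(A, nb):
    """Lower Cholesky factor of a symmetric positive definite matrix,
    computed tile by tile (hypothetical helper; nb is the tile size)."""
    n = A.shape[0]
    assert n % nb == 0, "sketch assumes the tile size divides n"
    t = n // nb
    W = np.array(A, dtype=float)  # working copy, overwritten tile by tile

    def tile(i, j):
        # Index of the (i, j) tile inside the working matrix.
        return (slice(i * nb, (i + 1) * nb), slice(j * nb, (j + 1) * nb))

    for k in range(t):
        # Task 1: factor the diagonal tile (POTRF-like step).
        W[tile(k, k)] = np.linalg.cholesky(W[tile(k, k)])
        Lkk = W[tile(k, k)]
        # Task 2: triangular solves for the tiles below it (TRSM-like step).
        # np.linalg.solve is a general solver; a real code would use a
        # dedicated triangular solve here.
        for i in range(k + 1, t):
            W[tile(i, k)] = np.linalg.solve(Lkk, W[tile(i, k)].T).T
        # Task 3: update the trailing tiles (SYRK/GEMM-like step).
        for i in range(k + 1, t):
            for j in range(k + 1, i + 1):
                W[tile(i, j)] -= W[tile(i, k)] @ W[tile(j, k)].T
    # Tiles above the diagonal were never updated; discard them.
    return np.tril(W)

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A_spd = M @ M.T + 6.0 * np.eye(6)   # symmetric positive definite test matrix
L_tiled = tiled_cholesky(A_spd, nb=2)
```

The three task types depend on each other only through the tiles they read and write, which is what makes out-of-order execution possible in principle.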

arXiv.org e-Print Archive

CiteSeerX

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

MIMS EPrints

The University of Manchester - Institutional Repository

A weakly stable algorithm for general Toeplitz systems

Author: Adam W. Bojanczyk
Richard P. Brent
Frank R. de Hoog
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1995

We show that a fast algorithm for the QR factorization of a Toeplitz or Hankel matrix A is weakly stable in the sense that R^T.R is close to A^T.A. Thus, when the algorithm is used to solve the semi-normal equations R^T.Rx = A^Tb, we obtain a weakly stable method for the solution of a nonsingular Toeplitz or Hankel linear system Ax = b. The algorithm also applies to the solution of the full-rank Toeplitz or Hankel least squares problem. Comment: 17 pages. An old Technical Report with postscript added. For further details, see http://wwwmaths.anu.edu.au/~brent/pub/pub143.htm
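The semi-normal equations mentioned in this abstract can be illustrated with a small dense sketch. It does not reproduce the paper's fast Toeplitz QR algorithm: here R comes from NumPy's general-purpose QR routine, and the helper names `solve_seminormal` and `toeplitz` are assumptions introduced for illustration. The point is only that, once R is available, x follows from two solves with R^T and R, without ever using Q.

```python
import numpy as np

def toeplitz(first_col, first_row):
    """Build a dense Toeplitz matrix from its first column and first row
    (hypothetical helper; first_col[0] is used for the diagonal)."""
    n = len(first_col)
    return np.array(
        [[first_col[i - j] if i >= j else first_row[j - i] for j in range(n)]
         for i in range(n)],
        dtype=float,
    )

def solve_seminormal(A, b):
    """Solve A x = b through the semi-normal equations R^T R x = A^T b."""
    # Only the triangular factor R is computed; Q is not returned.
    R = np.linalg.qr(A, mode="r")
    # Solve R^T y = A^T b, then R x = y. np.linalg.solve is a general
    # solver; a real code would use triangular solves for both steps.
    y = np.linalg.solve(R.T, A.T @ b)
    return np.linalg.solve(R, y)

A_toep = toeplitz([4.0, 1.0, 0.5, 0.25], [4.0, 2.0, 1.0, 0.5])
x_true = np.array([1.0, -2.0, 3.0, 0.5])
b_vec = A_toep @ x_true
x_sne = solve_seminormal(A_toep, b_vec)
```

Because the right-hand side is A^T b, the same routine also returns the least squares solution when A has more rows than columns and full column rank.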

arXiv.org e-Print Archive

CiteSeerX

Crossref