Augmented Block-Arnoldi Recycling CFD Solvers
One of the limitations of recycled GCRO methods is the large amount of
computation required to orthogonalize the basis vectors of the newly generated
Krylov subspace for the approximate solution when combined with those of the
recycle subspace. Recent advancements in low synchronization Gram-Schmidt and
generalized minimal residual algorithms, Swirydowicz et
al.~\cite{2020-swirydowicz-nlawa}, Carson et al.~\cite{Carson2022}, and
Lund~\cite{Lund2022}, can be incorporated, thereby mitigating the loss of
orthogonality of the basis vectors. An augmented Arnoldi formulation of
recycling leads to a matrix decomposition and the associated algorithm can also
be viewed as a {\it block} Krylov method. Generalizations of both classical and
modified block Gram-Schmidt algorithms have been proposed, Carson et
al.~\cite{Carson2022}. Here, an inverse compact modified Gram-Schmidt
algorithm is applied for the inter-block orthogonalization scheme, with a block
lower triangular correction matrix at each iteration. When combined with a
weighted (oblique inner product) projection step, the inverse compact
scheme leads to significant reductions (over 10 in certain cases) in
the number of solver iterations per linear system. The weight is also
interpreted in terms of the angle between restart residuals in LGMRES, as
defined by Baker et al.~\cite{Baker2005}. In many cases, the recycle subspace
eigenspectrum can substitute for a preconditioner.
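As a concrete illustration of the correction-matrix idea (a generic sketch, not the paper's exact algorithm; the function name and setup are illustrative), the following NumPy snippet orthogonalizes one new vector against an existing basis using a single fused block reduction and the triangular correction T = (I + L)^{-1}:

```python
import numpy as np

def lowsync_project(Q, w):
    """Orthogonalize w against the columns of Q with one fused reduction.

    A single matrix product Q^T [Q | w] supplies both the strictly lower
    triangular part L of Q^T Q and the coefficients c = Q^T w; the
    correction matrix T = (I + L)^{-1} compensates for loss of
    orthogonality among the columns of Q, which is what allows one
    synchronization per iteration in a distributed setting.
    """
    m = Q.shape[1]
    R = Q.T @ np.column_stack([Q, w])   # one global reduction
    L = np.tril(R[:, :m], k=-1)         # strictly lower triangular part
    c = R[:, m]                         # projection coefficients Q^T w
    T = np.linalg.inv(np.eye(m) + L)    # small m-by-m correction
    return w - Q @ (T @ c)
```

In practice only the newest row of Q^T Q is computed each step; the full product above simply keeps the sketch short.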
Flexible Enlarged Conjugate Gradient Methods
Enlarged Krylov subspace methods and their s-step versions were introduced
[7] with the aim of reducing communication when solving systems of linear
equations Ax = b. These enlarged CG methods enlarge the Krylov subspace by a
maximum of t vectors per iteration, based on the domain decomposition of the
graph of A. As for the s-step versions, s iterations of the enlarged
conjugate gradient method are merged into one iteration. The enlarged CG
methods and their s-step versions converge in fewer iterations than classical
CG, but at the expense of requiring more memory storage than CG. Thus, in
this paper we explore different options for reducing the memory requirements
of these enlarged CG methods without significantly affecting their
convergence.
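The enlargement step can be sketched as follows: the current residual is split into t vectors according to a partition of the unknowns induced by the domain decomposition of the graph of A (the names and the partition format here are illustrative, not the notation of [7]):

```python
import numpy as np

def split_residual(r, domains):
    """Split the residual r into t columns: column j keeps the entries of
    r belonging to domain j and is zero elsewhere, so the t columns sum
    back to r and each new search direction is localized to one
    subdomain."""
    T = np.zeros((r.size, len(domains)))
    for j, idx in enumerate(domains):
        T[idx, j] = r[idx]
    return T
```

All t directions can then be expanded and orthogonalized together, which is the source of both the faster convergence and the extra storage.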
Recycling Krylov Subspaces for Efficient Partitioned Solution of Aerostructural Adjoint Systems
Robust and efficient solvers for coupled-adjoint linear systems are crucial
to successful aerostructural optimization. Monolithic and partitioned
strategies can be applied. The monolithic approach is expected to offer better
robustness and efficiency for strong fluid-structure interactions. However, it
requires a high implementation cost and convergence may depend on appropriate
scaling and initialization strategies. On the other hand, the modularity of the
partitioned method enables a straightforward implementation while its
convergence may require relaxation. In addition, a partitioned solver
typically requires more iterations to reach the same level of convergence as
the monolithic one.
The objective of this paper is to accelerate the fluid-structure
coupled-adjoint partitioned solver by considering techniques borrowed from
approximate invariant subspace recycling strategies adapted to sequences of
linear systems with varying right-hand sides. Indeed, in a partitioned
framework, the structural source term attached to the fluid block of equations
affects the right-hand side, which has the convenient property of converging
quickly to a constant value. We also consider deflation of approximate eigenvectors in
conjunction with advanced inner-outer Krylov solvers for the fluid block
equations. We demonstrate the benefit of these techniques by computing the
coupled derivatives of an aeroelastic configuration of the ONERA-M6 fixed wing
in transonic flow. For this exercise the fluid grid was coupled to a structural
model specifically designed to exhibit a high flexibility. All computations are
performed using RANS flow modeling and a fully linearized one-equation
Spalart-Allmaras turbulence model. Numerical simulations show up to a 39%
reduction in matrix-vector products for GCRO-DR and up to 19% for the nested
FGCRO-DR solver.
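One common way a recycled approximate invariant subspace is used across a sequence of right-hand sides (a generic Galerkin projection, not necessarily the exact variant in the paper) is to project each new right-hand side onto the retained subspace before the Krylov iterations start:

```python
import numpy as np

def deflated_initial_guess(A, W, b):
    """Galerkin projection onto the recycled subspace span(W):
    x0 = W (W^T A W)^{-1} W^T b, so the subsequent Krylov solve only has
    to resolve the components of the solution outside span(W)."""
    H = W.T @ (A @ W)                     # small projected operator
    return W @ np.linalg.solve(H, W.T @ b)
```

When W approximates eigenvectors associated with the smallest eigenvalues, this removes the slowly converging components from the start of the iteration.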
Adaptively restarted block Krylov subspace methods with low-synchronization skeletons
With the recent realization of exascale performance by Oak Ridge National
Laboratory's Frontier supercomputer, reducing communication in kernels like QR
factorization has become even more imperative. Low-synchronization Gram-Schmidt
methods, first introduced in [K. \'{S}wirydowicz, J. Langou, S. Ananthan, U.
Yang, and S. Thomas, Low Synchronization Gram-Schmidt and Generalized Minimum
Residual Algorithms, Numer. Lin. Alg. Appl., Vol. 28(2), e2343, 2020], have
been shown to improve the scalability of the Arnoldi method in high-performance
distributed computing. Block versions of low-synchronization Gram-Schmidt show
further potential for speeding up algorithms, as column-batching allows for
maximizing cache usage with matrix-matrix operations. In this work,
low-synchronization block Gram-Schmidt variants from [E. Carson, K. Lund, M.
Rozlo\v{z}n\'{i}k, and S. Thomas, Block Gram-Schmidt algorithms and their
stability properties, Lin. Alg. Appl., 638, pp. 150--195, 2022] are transformed
into block Arnoldi variants for use in block full orthogonalization methods
(BFOM) and block generalized minimal residual methods (BGMRES). An adaptive
restarting heuristic is developed to handle instabilities that arise with the
increasing condition number of the Krylov basis. The performance, accuracy, and
stability of these methods are assessed via a flexible benchmarking tool
written in MATLAB. The modularity of the tool additionally permits generalized
block inner products, such as the global inner product.
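A minimal sketch of the inter-/intra-block split that block Gram-Schmidt variants share (plain BCGS here, not one of the low-synchronization variants):

```python
import numpy as np

def bcgs_step(Q, W):
    """One block classical Gram-Schmidt step: project the n-by-s block W
    against the orthonormal basis Q with two matrix-matrix products (the
    column-batched, cache-friendly kernels that motivate block methods),
    then orthonormalize within the block via a QR factorization."""
    S = Q.T @ W                  # inter-block coefficients
    Y = W - Q @ S                # inter-block orthogonalization
    Qnew, R = np.linalg.qr(Y)    # intra-block orthogonalization
    return Qnew, S, R
```

It is the growing condition number of the accumulated basis [Q, Qnew, ...] that the adaptive restarting heuristic above is designed to detect.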
A block minimum residual norm subspace solver with partial convergence management for sequences of linear systems
We are concerned with the iterative solution of linear systems with multiple right-hand sides available one group after another, with possibly slowly varying left-hand sides. For such sequences of linear systems, we first develop a new block minimum norm residual approach that combines two main ingredients. The first component exploits ideas from GCRO-DR [SIAM J. Sci. Comput., 28(5) (2006), pp. 1651-1674], enabling information to be recycled from one solve to the next. The second component is the numerical mechanism for managing the partial convergence of the right-hand sides, referred to as inexact breakdown detection in IB-BGMRES [Linear Algebra Appl., 419 (2006), pp. 265-285], which monitors the rank deficiency in the residual space basis expanded block-wise. Secondly, for the class of block minimum norm residual approaches that rely on a block Arnoldi-like equality between the search space and the residual space (e.g., any block GMRES or block GCRO variant), we introduce new search space expansion policies defined on novel criteria to detect partial convergence. These detection criteria are tuned to the selected normwise backward error stopping criterion and targeted convergence threshold, enabling the computational effort to be monitored while ensuring the final accuracy of each individual solution. Numerical experiments are reported to illustrate the numerical and computational features of both the new block Krylov solvers and the new search space block expansion policies.
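The rank monitoring at the heart of inexact breakdown detection can be sketched with an SVD of the block residual (a deliberately simplified criterion; the paper's tests are tied to the backward error stopping criterion):

```python
import numpy as np

def residual_split(Rblock, tol):
    """Flag partially converged right-hand sides: singular values of the
    block residual below tol mark directions that need not expand the
    search space further; only the left singular vectors associated with
    singular values above tol are kept."""
    U, s, _ = np.linalg.svd(Rblock, full_matrices=False)
    keep = s > tol
    return U[:, keep], int(np.count_nonzero(~keep))
```

Expanding the block basis only with the retained directions is what keeps the cost per iteration proportional to the number of unconverged right-hand sides.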
Tensor Train Decomposition for solving high-dimensional Mutual Hazard Networks
We describe the process of enabling the Mutual Hazard Network model for large data sets, i.e., for high dimensions, by using the Tensor Train decomposition. We first briefly review the Mutual Hazard Network model and explain its limitations when classical methods are used. We then introduce the Tensor Train format and explain how to perform the required operations in it, with a particular emphasis on solving systems of linear equations. Next, we explain how to apply the Tensor Train format to the Mutual Hazard Network. Furthermore, we describe some technical aspects of the software implementation. Finally, we present numerical results of different methods used to solve the linear systems which occur in the Mutual Hazard Network model. These methods allow the complexity in the number of events to be reduced from to , thereby enabling the Mutual Hazard Network model to be applied to larger data sets.
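The Tensor Train format mentioned above can be computed by the standard TT-SVD sweep (a generic construction, not the paper's implementation): each unfolding of the tensor is truncated by an SVD, and the left factor becomes a three-way core.

```python
import numpy as np

def tt_svd(T, eps=1e-12):
    """Factor a d-way tensor into TT cores G_k of shape
    (r_{k-1}, n_k, r_k) by sweeping truncated SVDs over successive
    unfoldings; storage drops from prod(n_k) entries to roughly
    sum(r_{k-1} * n_k * r_k)."""
    dims = T.shape
    cores, r, M = [], 1, np.asarray(T)
    for n in dims[:-1]:
        M = M.reshape(r * n, -1)
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        rank = max(1, int(np.count_nonzero(s > eps)))
        cores.append(U[:, :rank].reshape(r, n, rank))
        M = s[:rank, None] * Vt[:rank]   # carry the remainder rightwards
        r = rank
    cores.append(M.reshape(r, dims[-1], 1))
    return cores
```

For tensors with low TT ranks, linear-algebra operations (including the linear solves discussed above) can then act on the small cores instead of the full tensor.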
KSPHPDDM and PCHPDDM: Extending PETSc with advanced Krylov methods and robust multilevel overlapping Schwarz preconditioners
Contemporary applications in computational science and engineering often require the solution of linear systems which may be of different sizes, shapes, and structures. The goal of this paper is to explain how two libraries, PETSc and HPDDM, have been interfaced in order to offer end-users robust overlapping Schwarz preconditioners and advanced Krylov methods featuring recycling and the ability to deal with multiple right-hand sides. The flexibility of the implementation is showcased and explained with minimalist, easy-to-run, and reproducible examples, to ease the integration of these algorithms into more advanced frameworks. The examples provided cover applications from eigenanalysis, elasticity, combustion, and electromagnetism. Jose E. Roman was supported by the Spanish Agencia Estatal de Investigación (AEI) under project SLEPc-DA (PID2019-107379RB-I00). Jolivet, P.; Roman, J. E.; Zampini, S. (2021). KSPHPDDM and PCHPDDM: Extending PETSc with advanced Krylov methods and robust multilevel overlapping Schwarz preconditioners. Computers & Mathematics with Applications, 84:277-295. https://doi.org/10.1016/j.camwa.2021.01.003
NEP: A Module for the Parallel Solution of Nonlinear Eigenvalue Problems in SLEPc
SLEPc is a parallel library for the solution of various types of large-scale eigenvalue problems. Over the past few years, we have been developing a module within SLEPc, called NEP, that is intended for solving nonlinear eigenvalue problems. These problems can be defined by means of a matrix-valued function that depends nonlinearly on a single scalar parameter. We do not consider the particular case of polynomial eigenvalue problems (which are implemented in a different module in SLEPc) and focus here on rational eigenvalue problems and other general nonlinear eigenproblems involving square roots or any other nonlinear function. The article discusses how the NEP module has been designed to fit the needs of applications and provides a description of the available solvers, including some implementation details such as parallelization. Several test problems coming from real applications are used to evaluate the performance and reliability of the solvers. This work was partially funded by the Spanish Agencia Estatal de Investigación (AEI), http://ciencia.gob.es, under grants TIN2016-75985-P and PID2019-107379RB-I00 (including European Commission FEDER funds). Campos, C.; Roman, J. E. (2021). NEP: A Module for the Parallel Solution of Nonlinear Eigenvalue Problems in SLEPc. ACM Transactions on Mathematical Software, 47(3):1-29. https://doi.org/10.1145/3447544
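As a toy illustration of the kind of problem NEP targets (generic Newton-based nonlinear inverse iteration on T(λ)x = 0, not one of SLEPc's solvers):

```python
import numpy as np

def nonlinear_inverse_iteration(T, Tp, lam, x, tol=1e-8, maxit=50):
    """Newton-based inverse iteration for T(lam) x = 0.

    T and Tp are callables returning the matrix T(lam) and its
    derivative T'(lam); the eigenvector is normalized against the fixed
    vector u so that u^T x = 1 throughout."""
    u = x / np.linalg.norm(x)
    x = u.copy()
    for _ in range(maxit):
        y = np.linalg.solve(T(lam), Tp(lam) @ x)   # inverse iteration step
        lam = lam - (u @ x) / (u @ y)              # Newton eigenvalue update
        x = y / (u @ y)                            # keep u^T x = 1
        if np.linalg.norm(T(lam) @ x) < tol * np.linalg.norm(x):
            break
    return lam, x
```

For a diagonal quadratic eigenproblem T(λ) = λ²I − A, this reduces to Newton's method on the scalar equation λ² = a_ii near a decoupled eigenvalue, which makes its local quadratic convergence easy to observe.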