Numerically Stable Recurrence Relations for the Communication Hiding Pipelined Conjugate Gradient Method
Pipelined Krylov subspace methods (also referred to as communication-hiding
methods) have been proposed in the literature as a scalable alternative to
classic Krylov subspace algorithms for iteratively computing the solution to a
large linear system in parallel. For symmetric and positive definite system
matrices the pipelined Conjugate Gradient method outperforms its classic
Conjugate Gradient counterpart on large scale distributed memory hardware by
overlapping global communication with essential computations like the
matrix-vector product, thus hiding global communication. A well-known drawback
of the pipelining technique is the (possibly significant) loss of numerical
stability. In this work a numerically stable variant of the pipelined Conjugate
Gradient algorithm is presented that avoids the propagation of local rounding
errors in the finite precision recurrence relations that construct the Krylov
subspace basis. The multi-term recurrence relation for the basis vector is
replaced by two-term recurrences, improving stability without increasing the
overall computational cost of the algorithm. The proposed modification ensures
that the pipelined Conjugate Gradient method is able to attain a highly
accurate solution independently of the pipeline length. Numerical experiments
demonstrate a combination of excellent parallel performance and improved
maximal attainable accuracy for the new pipelined Conjugate Gradient algorithm.
This work thus resolves one of the major practical restrictions for the
usability of pipelined Krylov subspace methods.
Comment: 15 pages, 5 figures, 1 table, 2 algorithms
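As a point of reference for the recurrences discussed above, the following is a minimal NumPy sketch of the classic Conjugate Gradient iteration, with comments marking the global reductions that pipelined variants overlap with the matrix-vector product; it is not the stable pipelined algorithm proposed in this work, and the test problem is illustrative only.

import numpy as np

def cg(A, b, tol=1e-10, maxiter=500):
    # Classic CG. On distributed-memory hardware each dot product below is a
    # global reduction; pipelined variants reorganise the recurrences so these
    # reductions overlap with the sparse matrix-vector product.
    x = np.zeros_like(b)
    r = b - A @ x                      # initial residual
    p = r.copy()                       # first search direction
    rr = r @ r                         # global reduction (residual norm)
    for _ in range(maxiter):
        Ap = A @ p                     # matrix-vector product (neighbour communication)
        alpha = rr / (p @ Ap)          # global reduction (step length)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r                 # global reduction for the next iteration
        if np.sqrt(rr_new) <= tol * np.linalg.norm(b):
            break
        p = r + (rr_new / rr) * p      # two-term recurrence for the search direction
        rr = rr_new
    return x

# Small SPD test problem: 1D Laplacian.
n = 100
A = np.diag(2.0 * np.ones(n)) + np.diag(-np.ones(n - 1), 1) + np.diag(-np.ones(n - 1), -1)
b = np.ones(n)
print(np.linalg.norm(A @ cg(A, b) - b))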
Accelerated Discontinuous Galerkin Solvers with the Chebyshev Iterative Method on the Graphics Processing Unit
This work demonstrates implementations of the discontinuous Galerkin (DG) method on graphics processing units (GPUs), which deliver improved computational times compared to the conventional central processing unit (CPU). The linear system arising when the DG method is applied to an elliptic problem is solved on the GPU. The conjugate gradient (CG) method and the Chebyshev iterative method are compared as linear system solvers, to determine which is more efficient on the GPU's parallel architecture. With both methods, computational times decreased for large problems executed on the GPU compared to the CPU; however, CG proves more efficient than the Chebyshev iterative method. In addition, a constant-free upper bound for the spectrum of the DG discretization of the elliptic problem is developed. Few previous works combine the DG method and the GPU. This thesis provides useful guidelines for the numerical solution of elliptic problems using DG on the GPU.
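For context on the solver comparison above, here is a minimal NumPy sketch of the classical three-term Chebyshev iteration; lam_min and lam_max are assumed spectral bounds, and the DG discretization, the GPU kernels, and the constant-free bound developed in the thesis are not reproduced.

import numpy as np

def chebyshev(A, b, lam_min, lam_max, tol=1e-10, maxiter=2000):
    # Chebyshev iteration for an SPD system, given bounds 0 < lam_min <= lam(A) <= lam_max.
    # Unlike CG it needs no inner products, which suits GPUs, but its convergence
    # depends on the quality of the spectral bounds.
    theta = 0.5 * (lam_max + lam_min)      # centre of the spectral interval
    delta = 0.5 * (lam_max - lam_min)      # half-width of the interval
    sigma = theta / delta
    rho = 1.0 / sigma
    x = np.zeros_like(b)
    r = b - A @ x
    d = r / theta
    for _ in range(maxiter):
        x = x + d
        r = r - A @ d
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        rho_new = 1.0 / (2.0 * sigma - rho)
        d = rho_new * rho * d + (2.0 * rho_new / delta) * r
        rho = rho_new
    return x

# 1D Laplacian test: eigenvalues lie between 4*sin^2(pi/(2(n+1))) and 4.
n = 100
A = np.diag(2.0 * np.ones(n)) + np.diag(-np.ones(n - 1), 1) + np.diag(-np.ones(n - 1), -1)
b = np.ones(n)
x = chebyshev(A, b, lam_min=4.0 * np.sin(np.pi / (2 * (n + 1))) ** 2, lam_max=4.0)
print(np.linalg.norm(A @ x - b))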
Matrix-equation-based strategies for convection-diffusion equations
We are interested in the numerical solution of nonsymmetric linear systems
arising from the discretization of convection-diffusion partial differential
equations with separable coefficients and dominant convection. Preconditioners
based on the matrix equation formulation of the problem are proposed, which
naturally approximate the original discretized problem. For certain types of
convection coefficients, we show that the explicit solution of the matrix
equation can effectively replace the linear system solution. Numerical
experiments with data stemming from two and three dimensional problems are
reported, illustrating the potential of the proposed methodology.
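To make the matrix-equation viewpoint concrete, the following NumPy/SciPy sketch treats a two-dimensional convection-diffusion model problem with constant (hence separable) coefficients; the helper conv_diff_1d and the parameters eps and w are illustrative assumptions, and the preconditioners proposed in the paper are not reproduced here.

import numpy as np
from scipy.linalg import solve_sylvester

def conv_diff_1d(n, eps, w):
    # Centred-difference 1D convection-diffusion operator on a uniform grid;
    # eps is the diffusion coefficient, w a constant convection speed.
    h = 1.0 / (n + 1)
    diff = eps / h**2 * (np.diag(2.0 * np.ones(n))
                         - np.diag(np.ones(n - 1), 1)
                         - np.diag(np.ones(n - 1), -1))
    conv = w / (2.0 * h) * (np.diag(np.ones(n - 1), 1)
                            - np.diag(np.ones(n - 1), -1))
    return diff + conv

n = 40
T1 = conv_diff_1d(n, eps=0.01, w=1.0)      # acts along the x-direction
T2 = conv_diff_1d(n, eps=0.01, w=0.5)      # acts along the y-direction
F = np.ones((n, n))                        # right-hand side on the grid

# Matrix-equation form of the discretization: T1 X + X T2^T = F ...
X = solve_sylvester(T1, T2.T, F)

# ... which is equivalent to the Kronecker-structured linear system
# (I kron T1 + T2 kron I) vec(X) = vec(F), with column-major vec.
K = np.kron(np.eye(n), T1) + np.kron(T2, np.eye(n))
x_vec = np.linalg.solve(K, F.flatten(order="F"))
print(np.linalg.norm(x_vec - X.flatten(order="F")))   # agreement check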
Preconditioners for Krylov subspace methods: An overview
When simulating a mechanism from science or engineering, or an industrial process, one is frequently required to construct a mathematical model and then solve it numerically. If accurate numerical solutions are necessary or desirable, this can involve solving large-scale systems of equations. One major class of solution methods is that of preconditioned iterative methods, involving preconditioners which are computationally cheap to apply while also capturing information contained in the linear system. In this article, we give a short survey of the field of preconditioning. We introduce a range of preconditioners for partial differential equations, followed by optimization problems, before discussing preconditioners constructed with less standard objectives in mind.
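As a small illustration of the trade-off described above, the following SciPy sketch compares CG iteration counts with and without a Jacobi (diagonal) preconditioner, which is cheap to apply yet captures the bad scaling of the test matrix; the test problem and its parameters are illustrative assumptions.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Badly scaled SPD test matrix: a 2D Laplacian rescaled by widely varying weights.
n = 50
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
L = sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))
D = sp.diags(np.logspace(0, 4, n * n))
A = (D @ L @ D).tocsr()
b = np.ones(n * n)

# Jacobi preconditioner: the inverse of the diagonal of A.
M = sp.diags(1.0 / A.diagonal())

def cg_iterations(A, b, M=None, maxiter=5000):
    count = 0
    def cb(xk):
        nonlocal count
        count += 1
    spla.cg(A, b, M=M, callback=cb, maxiter=maxiter)
    return count

print("CG iterations, no preconditioner    :", cg_iterations(A, b))   # may hit the iteration cap
print("CG iterations, Jacobi preconditioner:", cg_iterations(A, b, M=M))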
A Bayesian conjugate gradient method (with Discussion)
A fundamental task in numerical computation is the solution of large linear
systems. The conjugate gradient method is an iterative method which offers
rapid convergence to the solution, particularly when an effective
preconditioner is employed. However, for more challenging systems a substantial
error can be present even after many iterations have been performed. The
estimates obtained in this case are of little value unless further information
can be provided about the numerical error. In this paper we propose a novel
statistical model for this numerical error set in a Bayesian framework. Our
approach is a strict generalisation of the conjugate gradient method, which is
recovered as the posterior mean for a particular choice of prior. The estimates
obtained are analysed with Krylov subspace methods and a contraction result for
the posterior is presented. The method is then analysed in a simulation study
as well as being applied to a challenging problem in medical imaging.
CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra
Many areas of machine learning and science involve large linear algebra
problems, such as eigendecompositions, solving linear systems, computing matrix
exponentials, and trace estimation. The matrices involved often have Kronecker,
convolutional, block diagonal, sum, or product structure. In this paper, we
propose a simple but general framework for large-scale linear algebra problems
in machine learning, named CoLA (Compositional Linear Algebra). By combining a
linear operator abstraction with compositional dispatch rules, CoLA
automatically constructs memory and runtime efficient numerical algorithms.
Moreover, CoLA provides memory efficient automatic differentiation, low
precision computation, and GPU acceleration in both JAX and PyTorch, while also
accommodating new objects, operations, and rules in downstream packages via
multiple dispatch. CoLA can accelerate many algebraic operations, while making
it easy to prototype matrix structures and algorithms, providing an appealing
drop-in tool for virtually any computational effort that requires linear
algebra. We showcase its efficacy across a broad range of applications,
including partial differential equations, Gaussian processes, equivariant model
construction, and unsupervised learning.
Comment: Code available at https://github.com/wilson-labs/col
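To illustrate the compositional-dispatch idea in plain NumPy (this is deliberately not CoLA's actual API), the sketch below represents a Kronecker-structured operator by its factors and lets a solve routine dispatch on the operator type, so the structure is exploited instead of forming the dense matrix.

import numpy as np
from functools import singledispatch

class Dense:
    def __init__(self, A):
        self.A = np.asarray(A)

class Kronecker:
    def __init__(self, A, B):      # represents kron(A.A, B.A) without forming it
        self.A, self.B = A, B

@singledispatch
def solve(op, b):
    raise NotImplementedError(type(op))

@solve.register
def _(op: Dense, b):
    return np.linalg.solve(op.A, b)                 # generic dense rule

@solve.register
def _(op: Kronecker, b):
    # kron(A, B) x = b  <=>  A X B^T = mat(b): two small solves replace one huge one.
    m = op.A.A.shape[0]
    k = op.B.A.shape[0]
    X = np.linalg.solve(op.A.A, b.reshape(m, k))
    return np.linalg.solve(op.B.A, X.T).T.reshape(-1)

rng = np.random.default_rng(0)
A, B = rng.standard_normal((50, 50)), rng.standard_normal((60, 60))
b = rng.standard_normal(50 * 60)
x_structured = solve(Kronecker(Dense(A), Dense(B)), b)
x_dense = np.linalg.solve(np.kron(A, B), b)          # reference: forms a 3000 x 3000 matrix
print(np.linalg.norm(x_structured - x_dense))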
Faster Randomized Interior Point Methods for Tall/Wide Linear Programs
Linear programming (LP) is an extremely useful tool which has been
successfully applied to solve various problems in a wide range of areas,
including operations research, engineering, economics, or even more abstract
mathematical areas such as combinatorics. It is also used in many machine
learning applications, such as ℓ1-regularized SVMs, basis pursuit,
nonnegative matrix factorization, etc. Interior Point Methods (IPMs) are one of
the most popular methods to solve LPs both in theory and in practice. Their
underlying complexity is dominated by the cost of solving a system of linear
equations at each iteration. In this paper, we consider both feasible and
infeasible IPMs for the special case where the number of variables is much
larger than the number of constraints. Using tools from Randomized Linear
Algebra, we present a preconditioning technique that, when combined with
iterative solvers such as Conjugate Gradient or Chebyshev Iteration, provably
guarantees that IPM algorithms (suitably modified to account for the error
incurred by the approximate solver) converge to a feasible, approximately
optimal solution, without increasing their iteration complexity. Our empirical
evaluations verify our theoretical results on both real-world and synthetic
data.
Comment: Extended version of the NeurIPS 2020 submission. arXiv admin note: substantial text overlap with arXiv:2003.0807
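A hedged sketch of the sketch-and-precondition idea behind this approach is given below: each IPM step requires solving normal equations of the form (A D^2 A^T) y = r with n >> m, and a Gaussian sketch of A D yields an effective preconditioner for CG. The dimensions, the sketch size, and all variable names are illustrative assumptions rather than the paper's exact construction.

import numpy as np
import scipy.sparse.linalg as spla

rng = np.random.default_rng(0)
m, n = 100, 10000                              # wide LP: far more variables than constraints
A = rng.standard_normal((m, n)) / np.sqrt(n)
d = rng.uniform(1e-3, 1e3, size=n)             # ill-conditioned IPM-style scaling
rhs = rng.standard_normal(m)

AD = A * d                                     # A @ diag(d); so AD @ AD.T = A D^2 A^T
G = spla.LinearOperator((m, m), matvec=lambda y: AD @ (AD.T @ y), dtype=np.float64)

# Randomized preconditioner: sketch AD from the right with k = 4m Gaussian columns,
# then invert the small sketched Gram matrix via an SVD of the sketch.
k = 4 * m
S = AD @ rng.standard_normal((n, k)) / np.sqrt(k)
U, sig, _ = np.linalg.svd(S, full_matrices=False)
M = spla.LinearOperator((m, m), matvec=lambda y: U @ ((U.T @ y) / sig**2), dtype=np.float64)

def cg_iterations(M=None, maxiter=1000):
    count = 0
    def cb(xk):
        nonlocal count
        count += 1
    spla.cg(G, rhs, M=M, callback=cb, maxiter=maxiter)
    return count

print("CG iterations without preconditioner    :", cg_iterations())   # typically hits the cap
print("CG iterations with sketched preconditioner:", cg_iterations(M))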
Asynchronous and Multiprecision Linear Solvers - Scalable and Fault-Tolerant Numerics for Energy Efficient High Performance Computing
Asynchronous methods minimize idle times by removing synchronization barriers, and therefore allow the efficient usage of computer systems. The resulting high tolerance of communication latencies also improves fault tolerance. As asynchronous methods further enable the use of the power- and energy-saving mechanisms provided by the hardware, they are suitable candidates for the highly parallel and heterogeneous hardware platforms expected in the near future.
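As a toy illustration of removing synchronization between updates (not the solvers developed in this work), the following NumPy sketch contrasts a synchronous Jacobi sweep, where every component is updated from the same iterate, with a chaotic, asynchronous-style relaxation in which each component update immediately uses whatever values are currently available.

import numpy as np

rng = np.random.default_rng(0)
n = 500
A = np.diag(4.0 * np.ones(n)) + np.diag(-np.ones(n - 1), 1) + np.diag(-np.ones(n - 1), -1)
b = np.ones(n)
D = A.diagonal()

def jacobi_sync(x, sweeps):
    for _ in range(sweeps):
        x = x + (b - A @ x) / D            # all components use the same iterate (barrier)
    return x

def jacobi_async(x, sweeps):
    x = x.copy()
    for _ in range(sweeps):
        for i in rng.permutation(n):       # random order, no barrier between updates
            x[i] += (b[i] - A[i] @ x) / D[i]   # uses the freshest available values
    return x

x0 = np.zeros(n)
print("synchronous residual :", np.linalg.norm(b - A @ jacobi_sync(x0, 50)))
print("asynchronous residual:", np.linalg.norm(b - A @ jacobi_async(x0, 50)))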