
    Multigrid preconditioning of linear systems for interior point methods applied to a class of box-constrained optimal control problems

    In this article we construct and analyze multigrid preconditioners for discretizations of operators of the form D+K* K, where D is multiplication by a relatively smooth positive function and K is a compact linear operator. These systems arise when applying interior point methods to the minimization problem min_u (||K u-f||^2 +b||u||^2) with box-constraints on the controls u. The presented preconditioning technique is closely related to the one developed by Draganescu and Dupont in [11] for the associated unconstrained problem, and is intended for large-scale problems. As in [11], the quality of the resulting preconditioners is shown to increase with increasing resolution but to decrease as the diagonal of D becomes less smooth. We test this algorithm first on a Tikhonov-regularized backward parabolic equation with box-constraints on the control, and then on a standard elliptic-constrained optimization problem. In both cases it is shown that the number of linear iterations per optimization step, as well as the total number of fine-scale matrix-vector multiplications, decreases with increasing resolution, thus showing the method to be potentially very efficient for truly large-scale problems. Comment: 29 pages, 8 figures
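
    As a rough illustration of where the operator D+K* K comes from, the sketch below solves the unconstrained version of the minimization problem above, whose minimizer satisfies (K^T K + b I) u = K^T f, so that D reduces to b times the identity. The Gaussian kernel matrix K, the grid size, and the value of b are hypothetical choices made only for this example; the sketch uses a dense direct solve and does not implement the interior point method or the multigrid preconditioner of the article.

```python
import numpy as np

def solve_tikhonov(K, f, b):
    """Minimize ||K u - f||^2 + b ||u||^2 via the normal equations
    (K^T K + b I) u = K^T f, using a dense direct solve."""
    n = K.shape[1]
    A = K.T @ K + b * np.eye(n)  # D + K^T K with D = b * I
    return np.linalg.solve(A, K.T @ f)

# Hypothetical test operator: a discretized Gaussian smoothing kernel on [0, 1].
n = 64
x = np.linspace(0.0, 1.0, n)
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.01) / n
u_true = np.sin(2.0 * np.pi * x)
f = K @ u_true
b = 1e-6  # regularization weight (assumed value)

u = solve_tikhonov(K, f, b)

# At the minimizer, the gradient of the objective vanishes.
grad = 2.0 * K.T @ (K @ u - f) + 2.0 * b * u
```

    With box constraints handled by an interior point method, D would instead be a positive diagonal that varies from entry to entry, which is the case the article's preconditioner targets.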

    Schwarz Methods: To Symmetrize or Not to Symmetrize

    A preconditioning theory is presented which establishes sufficient conditions for multiplicative and additive Schwarz algorithms to yield self-adjoint positive definite preconditioners. It allows for the analysis and use of non-variational and non-convergent linear methods as preconditioners for conjugate gradient methods, and it is applied to domain decomposition and multigrid. It is illustrated why symmetrizing may be a bad idea for linear methods. It is conjectured that enforcing minimal symmetry achieves the best results when combined with conjugate gradient acceleration. Also, it is shown that absence of symmetry in the linear preconditioner is advantageous when the linear method is accelerated by using the Bi-CGstab method. Numerical examples are presented for two test problems which illustrate the theory and conjectures. Comment: Version of frequently requested article

    MgNet: A Unified Framework of Multigrid and Convolutional Neural Network

    We develop a unified model, known as MgNet, that simultaneously recovers some convolutional neural networks (CNN) for image classification and multigrid (MG) methods for solving discretized partial differential equations (PDEs). This model is based on close connections that we have observed and uncovered between the CNN and MG methodologies. For example, the pooling operation and feature extraction in CNN correspond directly to the restriction operation and iterative smoothers in MG, respectively. As the solution space is often the dual of the data space in PDEs, the analogous concept of feature space and data space (which are dual to each other) is introduced in CNN. With such connections and new concepts in the unified model, the function of the various convolution operations and pooling used in CNN can be better understood. As a result, modified CNN models (with fewer weights and hyperparameters) are developed that exhibit competitive and sometimes better performance in comparison with existing CNN models when applied to both CIFAR-10 and CIFAR-100 data sets. Comment: 30 pages

    Solving the Poisson equation on small aspect ratio domains using unstructured meshes

    We discuss the ill-conditioning of the matrix for the discretised Poisson equation in the small aspect ratio limit, and motivate this problem in the context of nonhydrostatic ocean modelling. Efficient iterative solvers for the Poisson equation in small aspect ratio domains are crucial for the successful development of nonhydrostatic ocean models on unstructured meshes. We introduce a new multigrid preconditioner for the Poisson problem which can be used with finite element discretisations on general unstructured meshes; this preconditioner is motivated by the fact that the Poisson problem has a condition number which is independent of aspect ratio when Dirichlet boundary conditions are imposed on the top surface of the domain. This leads to the first level in an algebraic multigrid solver (which can be extended by further conventional algebraic multigrid stages), and an additive smoother. We illustrate the method with numerical tests on unstructured meshes, which show that the preconditioner dramatically improves on a more standard multigrid preconditioner approach, and also show that the additive smoother produces better results than standard SOR smoothing. This new solver method makes it feasible to run nonhydrostatic unstructured mesh ocean models in small aspect ratio domains. Comment: submitted to Ocean Modelling

    A new extrapolation cascadic multigrid method for 3D elliptic boundary value problems on rectangular domains

    In this paper, we develop a new extrapolation cascadic multigrid (ECMG$_{jcg}$) method, which makes it possible to solve 3D elliptic boundary value problems on rectangular domains with over 100 million unknowns on a desktop computer in minutes. First, by combining Richardson extrapolation and tri-quadratic Serendipity interpolation techniques, we introduce a new extrapolation formula to provide a good initial guess for the iterative solution on the next finer grid, which is a third order approximation to the finite element (FE) solution. The resulting large sparse linear system from the FE discretization is then solved by the Jacobi-preconditioned Conjugate Gradient (JCG) method. Additionally, instead of performing a fixed number of iterations as cascadic multigrid (CMG) methods do, a relative residual stopping criterion is used in the iterative solvers, which enables us to conveniently obtain the numerical solution with the desired accuracy. Moreover, a simple Richardson extrapolation is used to cheaply get a fourth order approximate solution on the entire fine grid. Test results are reported to show that ECMG$_{jcg}$ has much better efficiency compared to the classical MG methods. Since the initial guess for the iterative solution is quite a good approximation to the FE solution, numerical results show that only a few iterations are required on the finest grid for ECMG$_{jcg}$ with an appropriate tolerance of the relative residual to achieve full second order accuracy, which is particularly important when solving large systems of equations and can greatly reduce the computational cost. It should be pointed out that when the tolerance becomes smaller, ECMG$_{jcg}$ still needs only a few iterations to obtain the fourth order extrapolated solution on each grid, except on the finest grid. Finally, we present the reason why our ECMG algorithms are so highly efficient for solving such problems. Comment: 20 pages, 4 figures, 10 tables; abbreviated abstract
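
    The inner solver described above, a Jacobi-preconditioned conjugate gradient iteration stopped on a relative residual, can be sketched as follows. This is a minimal dense NumPy version under stated assumptions: the relative residual is taken to be ||r|| / ||b||, the function name `jacobi_pcg` and its parameters are hypothetical, and the small tridiagonal test matrix stands in for the 3D FE systems of the paper. The extrapolation step that supplies the initial guess is not reproduced; it would enter through the `x0` argument.

```python
import numpy as np

def jacobi_pcg(A, b, x0=None, rtol=1e-8, max_iter=1000):
    """Conjugate gradients with a Jacobi (diagonal) preconditioner.
    Stops when ||r|| <= rtol * ||b|| (assumed definition of the
    relative residual). Returns the iterate and the iteration count."""
    x = np.zeros_like(b, dtype=float) if x0 is None else np.array(x0, dtype=float)
    inv_diag = 1.0 / np.diag(A)      # Jacobi preconditioner M^{-1}
    r = b - A @ x
    b_norm = np.linalg.norm(b)
    if b_norm == 0.0:
        b_norm = 1.0
    z = inv_diag * r
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        if np.linalg.norm(r) <= rtol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p    # new search direction
        rz = rz_new
    return x, max_iter

# Hypothetical SPD test matrix: 1D Laplacian stencil plus a varying diagonal.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + np.diag(np.linspace(0.0, 5.0, n))
rhs = np.ones(n)
x, iters = jacobi_pcg(A, rhs, rtol=1e-10)
```

    Passing a good `x0` shrinks the initial residual, which is the mechanism the abstract credits for needing only a few iterations on the finest grid.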

    Numerical Study of Geometric Multigrid Methods on CPU--GPU Heterogeneous Computers

    The geometric multigrid method (GMG) is one of the most efficient solving techniques for discrete algebraic systems arising from elliptic partial differential equations. GMG utilizes a hierarchy of grids or discretizations and reduces the error at a number of frequencies simultaneously. Graphics processing units (GPUs) have recently burst onto the scientific computing scene as a technology that has yielded substantial performance and energy-efficiency improvements. A central challenge in implementing GMG on GPUs, though, is that computational work on coarse levels cannot fully utilize the capacity of a GPU. In this work, we perform numerical studies of GMG on CPU--GPU heterogeneous computers. Furthermore, we compare our implementation with an efficient CPU implementation of GMG and with the most popular fast Poisson solver, Fast Fourier Transform, in the cuFFT library developed by NVIDIA.
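
    To make the grid hierarchy concrete, here is a minimal sketch of a geometric multigrid V-cycle for the 1D Poisson problem -u'' = f with zero Dirichlet boundary values. It is a CPU-only NumPy illustration, not the CPU--GPU implementation studied in the paper; the weighted Jacobi smoother, full-weighting restriction, linear interpolation, and all function names are assumptions chosen for this example.

```python
import numpy as np

def apply_A(u, h):
    """Apply the 3-point finite-difference Laplacian (-u'') with zero boundaries."""
    Au = 2.0 * u
    Au[1:] -= u[:-1]
    Au[:-1] -= u[1:]
    return Au / h**2

def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi: damps the high-frequency error components."""
    for _ in range(sweeps):
        u = u + omega * (h**2 / 2.0) * (f - apply_A(u, h))
    return u

def restrict(r):
    """Full weighting from 2m+1 fine points to m coarse points."""
    return 0.25 * r[0:-2:2] + 0.5 * r[1:-1:2] + 0.25 * r[2::2]

def prolong(e, n_fine):
    """Linear interpolation from m coarse points to 2m+1 fine points."""
    u = np.zeros(n_fine)
    u[1::2] = e
    u[0:-1:2] += 0.5 * e
    u[2::2] += 0.5 * e
    return u

def v_cycle(u, f, h):
    """One V-cycle; the number of unknowns must be of the form 2^k - 1."""
    n = u.size
    if n == 1:
        return f * h**2 / 2.0                  # exact solve on the coarsest grid
    u = smooth(u, f, h)                        # pre-smoothing
    r_c = restrict(f - apply_A(u, h))          # coarse-grid residual
    e_c = v_cycle(np.zeros_like(r_c), r_c, 2.0 * h)
    u = u + prolong(e_c, n)                    # coarse-grid correction
    return smooth(u, f, h)                     # post-smoothing

n = 127
h = 1.0 / (n + 1)
x = np.arange(1, n + 1) * h
f = np.pi**2 * np.sin(np.pi * x)               # exact solution is sin(pi x)
u = np.zeros(n)
r0 = np.linalg.norm(f - apply_A(u, h))
for _ in range(10):
    u = v_cycle(u, f, h)
r_final = np.linalg.norm(f - apply_A(u, h))
```

    The coarsest levels in this recursion hold only a handful of unknowns, which illustrates the utilization problem on GPUs that the abstract mentions.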

    A quantitative performance analysis for Stokes solvers at the extreme scale

    This article presents a systematic quantitative performance analysis for large finite element computations on extreme scale computing systems. Three parallel iterative solvers for the Stokes system, discretized by low order tetrahedral elements, are compared with respect to their numerical efficiency and their scalability running on up to $786\,432$ parallel threads. A genuine multigrid method for the saddle point system using an Uzawa-type smoother provides the best overall performance with respect to memory consumption and time-to-solution. The largest system solved on a Blue Gene/Q system has more than ten trillion ($1.1 \cdot 10^{13}$) unknowns and requires about 13 minutes compute time. Despite the matrix-free and highly optimized implementation, the memory requirement for the solution vector and the auxiliary vectors is about 200 TByte. Brandt's notion of "textbook multigrid efficiency" is employed to study the algorithmic performance of iterative solvers. A recent extension of this paradigm to "parallel textbook multigrid efficiency" makes it possible to assess also the efficiency of parallel iterative solvers for a given hardware architecture in absolute terms. The efficiency of the method is demonstrated for simulating incompressible fluid flow in a pipe filled with spherical obstacles.

    FFT, FMM, or Multigrid? A comparative Study of State-Of-the-Art Poisson Solvers for Uniform and Nonuniform Grids in the Unit Cube

    In this work, we benchmark and discuss the performance of the scalable methods for the Poisson problem which are used widely in practice: the fast Fourier transform (FFT), the fast multipole method (FMM), the geometric multigrid (GMG), and algebraic multigrid (AMG). In total we compare five different codes, three of which are developed in our group. Our FFT, GMG, and FMM are parallel solvers that use high-order approximation schemes for Poisson problems with continuous forcing functions (the source or right-hand side). We examine and report results for weak scaling, strong scaling, and time to solution for uniform and highly refined grids. We present results on the Stampede system at the Texas Advanced Computing Center and on the Titan system at the Oak Ridge National Laboratory. In our largest test case, we solved a problem with 600 billion unknowns on 229,379 cores of Titan. Overall, all methods scale quite well to these problem sizes. We have tested all of the methods with different source functions (the right-hand side in the Poisson problem). Our results indicate that FFT is the method of choice for smooth source functions that require uniform resolution. However, FFT loses its performance advantage when the source function has highly localized features like internal sharp layers. FMM and GMG considerably outperform FFT for those cases. The distinction between FMM and GMG is less pronounced and is sensitive to the quality (from a performance point of view) of the underlying implementations. The high-order accurate versions of GMG and FMM significantly outperform their low-order accurate counterparts. Comment: 25 pages; accepted paper in SISC journal

    On the optimality of shifted Laplacian in the class of expansion preconditioners for the Helmholtz equation

    This paper introduces and explores the class of expansion preconditioners EX(m), which forms a direct generalization of the classic complex shifted Laplace (CSL) preconditioner for Helmholtz problems. The construction of the EX(m) preconditioner is based upon a truncated Taylor series expansion of the original Helmholtz operator inverse. The expansion preconditioner is shown to significantly improve Krylov solver convergence rates for the Helmholtz problem for growing values of the number of series terms m. However, the addition of multiple terms in the expansion also increases the computational cost of applying the preconditioner. A thorough cost-benefit analysis of the addition of extra terms in the EX(m) preconditioner proves that the CSL or EX(1) preconditioner is the most efficient member of the expansion preconditioner class in practice. Additionally, possible extensions to the expansion preconditioner class that further increase preconditioner efficiency are suggested. Comment: 19 pages, 6 figures, 4 tables

    Unified computational framework for the efficient solution of n-field coupled problems with monolithic schemes

    In this paper, we propose and evaluate the performance of a unified computational framework for preconditioning systems of linear equations resulting from the solution of coupled problems with monolithic schemes. The framework is composed of promising application-specific preconditioners presented previously in the literature, with the common feature that they can be implemented for a generic coupled problem involving an arbitrary number of fields and used to solve a variety of applications. The first selected preconditioner is based on a generic block Gauss-Seidel iteration for uncoupling the fields, and standard algebraic multigrid (AMG) methods for solving the resulting uncoupled problems. The second preconditioner is based on the semi-implicit method for pressure-linked equations (SIMPLE), which is extended here to deal with an arbitrary number of fields, and also results in uncoupled problems that can be solved with standard AMG. Finally, a more sophisticated preconditioner is considered which enforces the coupling at all AMG levels, in contrast to the other two techniques, which resolve the coupling only at the finest level. Our purpose is to show that these methods perform satisfactorily in quite different scenarios apart from their original applications. To this end, we consider three very different coupled problems: thermo-structure interaction, fluid-structure interaction and a complex model of the human lung. Numerical results show that these general purpose methods are efficient and scalable in this range of applications.