103 research outputs found

A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet François-Henry
Publication date: 26/06/2015

We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by a rank-deficient matrix with low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, e.g., finite element methods and boundary element methods. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. This work is part of a more global effort, the STRUMPACK (STRUctured Matrices PACKage) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK (STRUctured Matrices PACKage), which also has a distributed memory component for dense rank-structured matrices.

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion

Conditioning Analysis of Incomplete Cholesky Factorizations with Orthogonal Dropping

Author: Artem Napov
Meijerink J. A.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'

Crossref

A Parallel Solver for Graph Laplacians

Author: Boman Erik G.
Brannick James
Kepner Jeremy
Napov Artem
Ruge John W.
Spielman Daniel A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/07/2018

Problems from graph drawing, spectral clustering, network flow and graph partitioning can all be expressed in terms of graph Laplacian matrices. There are a variety of practical approaches to solving these problems in serial. However, as problem sizes increase and single core speeds stagnate, parallelism is essential to solve such problems quickly. We present an unsmoothed aggregation multigrid method for solving graph Laplacians in a distributed memory setting. We introduce new parallel aggregation and low degree elimination algorithms targeted specifically at irregular degree graphs. These algorithms are expressed in terms of sparse matrix-vector products using generalized sum and product operations. This formulation is amenable to linear algebra using arbitrary distributions and allows us to operate on a 2D sparse matrix distribution, which is necessary for parallel scalability. Our solver outperforms the natural parallel extension of the current state of the art in an algorithmic comparison. We demonstrate scalability to 576 processes and graphs with up to 1.7 billion edges. Comment: PASC '18, Code: https://github.com/ligmg/ligm

arXiv.org e-Print Archive

Crossref

Krylov-based algebraic multigrid for edge elements

Author: Musy François
Napov Artem
Notay Yvan
Perrussel Ronan
Scorretti Riccardo
Publication venue: HAL CCSD
Publication date: 01/01/2009

This work tackles the evaluation of a multigrid cycling strategy using inner flexible Krylov subspace iterations. It provides a valuable improvement to the Reitzinger and Schöberl algebraic multigrid method for systems coming from edge element discretization.

Models for Metal Hydride Particle Shape, Packing, and Heat Transfer

Author: Asakuma
Beaulieu
Bershadsky
Castellanos
Checchetto
Da-Wen
Flueckiger
Fukai
Griesinger
Groisman
Hahne
Incropera
Jaoshvili
Josephy
Kennard
Kim
Klein
Kojima
Kyle C. Smith
Lee
Manley
Mujat
Napov
Notay
Notay
Ovshinsky
O’Hern
Pons
Ron
Rudman
Sakai
Schlapbach
Sen
Smith
Smith
Smith
Smith
Smith
Solomon
Sondheimer
Swartz
Sánchez
Timothy S. Fisher
Torquato
Tsotsas
Ueoka
Valverde
Yang
Yonezawa
Zhang
Publication venue: 'Elsevier BV'
Publication date: 04/05/2012

A multiphysics modeling approach for heat conduction in metal hydride powders is presented, including particle shape distribution, size distribution, granular packing structure, and effective thermal conductivity. A statistical geometric model is presented that replicates features of particle size and shape distributions observed experimentally that result from cyclic hydride decrepitation. The quasi-static dense packing of a sample set of these particles is simulated via energy-based structural optimization methods. These particles jam (i.e., solidify) at a density (solid volume fraction) of 0.665 ± 0.015, higher than prior experimental estimates. Effective thermal conductivity of the jammed system is simulated and found to follow the behavior predicted by granular effective medium theory. Finally, a theory is presented that links the properties of bi-porous cohesive powders to the present systems based on recent experimental observations of jammed packings of fine powder. This theory produces quantitative experimental agreement with metal hydride powders of various compositions. Comment: 12 pages, 12 figures, 2 tables

arXiv.org e-Print Archive

Crossref

Purdue E-Pubs

Recent advances in sparse direct solvers

Author: Agullo Emmanuel
Amestoy Patrick
Buttari Alfredo
Guermouche Abdou
Joslin Guillaume
L'Excellent Jean-Yves
Li Xiaoye
Napov Artem
Rouet François-Henry
Sid-Lakhdar Wissam M.
Wang Shen
Weisbecker Clement
Yamazaki Ichitaro
Publication venue: HAL CCSD
Publication date: 01/01/2013

Direct methods for the solution of sparse systems of linear equations of the form A x = b are used in a wide range of numerical simulation applications. Such methods are based on the decomposition of the matrix into a product of triangular factors (e.g., A = L U), followed by triangular solves. They are known for their numerical accuracy and robustness but are also characterized by high memory consumption and a large amount of computation. Here we survey some research directions that are being investigated by the sparse direct solver community to alleviate these issues: memory-aware scheduling techniques, low-rank approximations, and distributed/shared memory hybrid programming.

HAL-ENS-LYON

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

Open Archive Toulouse Archive Ouverte

DI-fusion

Hal-Diderot

Fast interior point solution of quadratic programming problems arising from PDE-constrained optimization

Author: A Borzì
A Borzì
A Borzì
A Drǎgǎnescu
A Napov
A Schiela
AJ Wathen
AJ Wathen
B Li
CC Paige
CT Kelley
F Tröltzsch
GH Golub
GH Golub
HC Elman
HD Mittelmann
ICF Ipsen
IS Duff
J Gondzio
J Gondzio
J Pestana
Jacek Gondzio
John W. Pearson
JW Pearson
JW Pearson
JW Pearson
L Bergamaschi
M Benzi
M Benzi
M Bergounioux
M Hinze
M Porcelli
M Ulbrich
M Weiser
M Weiser
M Weiser
MF Murphy
MJ Gander
MJ Grotte
R Herzog
SJ Wright
T Rusten
Y Notay
Y Notay
Y Saad
YA Kuznetsov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/05/2017

Interior point methods provide an attractive class of approaches for solving linear, quadratic and nonlinear programming problems, due to their excellent efficiency and wide applicability. In this paper, we consider PDE-constrained optimization problems with bound constraints on the state and control variables, and their representation on the discrete level as quadratic programming problems. To tackle complex problems and achieve high accuracy in the solution, one is required to solve matrix systems of huge scale resulting from Newton iteration, and hence fast and robust methods for these systems are required. We present preconditioned iterative techniques for solving a number of these problems using Krylov subspace methods, considering in what circumstances one may predict rapid convergence of the solvers in theory, as well as the solutions observed from practical computations.

Crossref

Edinburgh Research Explorer

Kent Academic Repository

Nonlinear multigrid methods for second order differential operators with nonlinear diffusion coefficient

Author: Baines
Bank
Barenblatt
Bastian
Braess
Bramble
Bramble
Bramble
Brandt
Brenner
Briggs
Brown
Deuflhard
Diening
Donatelli
Donatelli
Gaskell
Ge
Green
Gräser
Hackbusch
Hackbusch
Hackbusch
Hackbusch
Jouvet
K.J. Brabazon
Kim
Knoll
Kornhuber
Kornhuber
M.E. Hubbard
MacLachlan
Mavriplis
Mavriplis
Molenaar
Murray
Napov
Ortega
P.K. Jimack
Park
Reusken
Reusken
Rosam
Saad
Scheichl
Semplice
Shaidurov
Stals
Trottenberg
Vecharynski
Wienands
Xu
Publication venue: 'Elsevier BV'
Publication date: 01/12/2014

Nonlinear multigrid methods such as the Full Approximation Scheme (FAS) and Newton-multigrid (Newton-MG) are well established as fast solvers for nonlinear PDEs of elliptic and parabolic type. In this paper we consider Newton-MG and FAS iterations applied to second order differential operators with nonlinear diffusion coefficient. Under mild assumptions arising in practical applications, an approximation (shown to be sharp) of the execution time of the algorithms is derived, which demonstrates that Newton-MG can be expected to be a faster iteration than a standard FAS iteration for a finite element discretisation. Results are provided for elliptic and parabolic problems, demonstrating a faster execution time as well as greater stability of the Newton-MG iteration. Results are explained using current theory for the convergence of multigrid methods, giving a qualitative insight into how the nonlinear multigrid methods can be expected to perform in practice.

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

White Rose Research Online