

ParILUT - A parallel threshold ILU for GPUs

Author: Anzt Hartwig
Chow Edmond
Dongarra Jack
Flegar Goran
Ribizel Tobias
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2019

Crossref

KITopen

Combinatorial problems in solving linear systems

Author: Duff Iain
Uçar Bora
Publication venue: HAL CCSD
Publication date: 12/04/2011

42 pages, available as LIP research report RR-2009-15. Numerical linear algebra and combinatorial optimization are vast subjects, as is their interaction. In virtually all cases there should be a notion of sparsity for a combinatorial problem to arise. Sparse matrices therefore form the basis of the interaction of these two seemingly disparate subjects. As the core of many of today's numerical linear algebra computations consists of the solution of sparse linear systems by direct or iterative methods, we survey some combinatorial problems, ideas, and algorithms relating to these computations. On the direct methods side, we discuss issues such as matrix ordering; bipartite matching and matrix scaling for better pivoting; task assignment and scheduling for parallel multifrontal solvers. On the iterative method side, we discuss preconditioning techniques including incomplete factorization preconditioners, support graph preconditioners, and algebraic multigrid. In a separate part, we discuss the block triangular form of sparse matrices.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Recommended from our members

Parallel Algebraic Multigrid Methods - High Performance Preconditioners

Author: Yang U. M.
Publication venue: Lawrence Livermore National Laboratory
Publication date: 11/11/2004

The development of high performance, massively parallel computers and the increasing demands of computationally challenging applications have necessitated the development of scalable solvers and preconditioners. One of the most effective ways to achieve scalability is the use of multigrid or multilevel techniques. Algebraic multigrid (AMG) is a very efficient algorithm for solving large problems on unstructured grids. While much of it can be parallelized in a straightforward way, some components of the classical algorithm, particularly the coarsening process and some of the most efficient smoothers, are highly sequential, and require new parallel approaches. This chapter presents the basic principles of AMG and gives an overview of various parallel implementations of AMG, including descriptions of parallel coarsening schemes and smoothers, some numerical results as well as references to existing software packages

UNT Digital Library

Preconditioning for Sparse Linear Systems at the Dawn of the 21st Century: History, Current Developments, and Future Perspectives

Author: Massimiliano Ferronato
Publication venue
Publication date: 01/01/2012

Iterative methods are currently the solvers of choice for large sparse linear systems of equations. However, it is well known that the key factor for accelerating, or even allowing for, convergence is the preconditioner. The research on preconditioning techniques has characterized the last two decades. Nowadays, there are a number of different options to be considered when choosing the most appropriate preconditioner for the specific problem at hand. The present work provides an overview of the most popular algorithms available today, emphasizing the respective merits and limitations. The overview is restricted to algebraic preconditioners, that is, general-purpose algorithms requiring the knowledge of the system matrix only, independently of the specific problem it arises from. Along with the traditional distinction between incomplete factorizations and approximate inverses, the most recent developments are considered, including the scalable multigrid and parallel approaches which represent the current frontier of research. A separate section devoted to saddle-point problems, which arise in many different applications, closes the paper

Crossref

Directory of Open Access Journals

Open Access Repository

Archivio istituzionale della ricerca - Università di Padova

New Sequential and Scalable Parallel Algorithms for Incomplete Factor Preconditioning

Author: Hysom David A.
Publication venue: ODU Digital Commons
Publication date: 01/01/2001

The solution of large, sparse, linear systems of equations Ax = b is an important kernel, and the dominant term with regard to execution time, in many applications in scientific computing. The large size of the systems of equations being solved currently (millions of unknowns and equations) requires iterative solvers on parallel computers. Preconditioning, which is the process of translating a linear system into a related system that is easier to solve, is widely used to reduce solution time and is sometimes required to ensure convergence. Level-based preconditioning (ILU(ℓ)) has long been used in serial contexts and is widely recognized as robust and effective for a wide range of problems. However, the method has long been regarded as an inherently sequential technique. Parallelism, it has been thought, can be achieved primarily at the expense of increased iterations. We dispute these claims. The first half of this dissertation takes an in-depth look at structurally based ILU(ℓ) symbolic factorization. There are two definitions of fill level, “sum” and “max,” that have been proposed. Hitherto, these definitions have been cast in terms of matrix terminology. We develop a sequence of lemmas and theorems that provide graph theoretic characterizations of both definitions; these characterizations are based on the static graph of a matrix, G(A). Our Incomplete Fill Path Theorem characterizes fill levels per the sum definition; this is the definition that is used in most library implementations of the “classic” ILU(ℓ) factorization algorithm. Our theorem leads to several new graph-search algorithms that compute factors identical, or nearly identical, to those computed by the “classic” algorithm. Our analysis shows that the new algorithms have lower run time complexity than that of the previously existing algorithms for certain classes of matrices that are commonly encountered in scientific applications.
The second half of this dissertation presents a Parallel ILU algorithmic framework (PILU). This framework enables scalable parallel ILU preconditioning by combining concepts from domain decomposition and graph ordering. The framework can accommodate ILU(ℓ) factorization as well as threshold-based ILUT methods. A model implementation of the framework, the Euclid library, was developed as part of this dissertation. This library was used to obtain experimental results for Poisson's equation, the Convection-Diffusion equation, and a nonlinear Radiative Transfer problem. The experiments, which were conducted on a variety of platforms with up to 400 CPUs, demonstrate that our approach is highly scalable for arbitrary ILU(ℓ) fill levels.
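To make the "sum" definition of fill level concrete, here is a minimal sketch of the textbook row-by-row symbolic ILU(ℓ) procedure on a dense level table. It is not the graph-search algorithm from the dissertation and not code from the Euclid library; the function name `ilu_levels` and the set-of-pairs pattern format are assumptions chosen for illustration.

```python
import math

def ilu_levels(pattern, n, max_level):
    """Symbolic ILU(max_level) with the "sum" fill-level rule.

    pattern: set of (i, j) off-diagonal positions that are nonzero in A
             (the diagonal is assumed to be present).
    Returns {(i, j): level} for every position kept in the factors.
    """
    inf = math.inf
    # Original nonzeros have level 0; every other position starts at infinity.
    lev = [[0 if (i == j or (i, j) in pattern) else inf for j in range(n)]
           for i in range(n)]
    for i in range(1, n):
        for k in range(i):
            if lev[i][k] > max_level:
                continue  # a dropped entry never generates further fill
            for j in range(k + 1, n):
                # Sum rule: fill created through pivot k costs one extra level.
                cand = lev[i][k] + lev[k][j] + 1
                if cand < lev[i][j]:
                    lev[i][j] = cand
    return {(i, j): lev[i][j]
            for i in range(n) for j in range(n) if lev[i][j] <= max_level}

# Arrow matrix: dense first row and column, otherwise diagonal.
arrow = {(0, j) for j in range(1, 4)} | {(j, 0) for j in range(1, 4)}
kept0 = ilu_levels(arrow, 4, 0)  # level 0 keeps only the original pattern
kept1 = ilu_levels(arrow, 4, 1)  # level 1 admits the fill created by pivot 0
```

The dense table costs O(n^2) memory and is only meant to show the rule; a practical implementation would work on sparse rows.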

Old Dominion University

Astmeliste plaatide optimiseerimine siledate voolavuspindade korral (Optimization of stepped plates in the case of smooth yield surfaces)

Author: Vlassov Boriss
Publication venue
Publication date: 24/10/2013

This thesis examines questions related to the optimization of elastic-plastic stepped plates made of von Mises, Hill, and Tsai-Wu materials. The dissertation is based on seven scientific publications by the author, six of which were published within the last three years. It consists of four chapters, a bibliography, and the author's curriculum vitae. The first chapter is essentially a review of the application of numerical methods to the optimization of structural elements. It surveys work devoted to the optimization of plates and shells and describes the historical development of the finite element method and of parallel computing. In this study, the finite element method and the Haar wavelet method are used to solve ordinary and partial differential equations, and the principles of high-performance and parallel computing are applied. The second chapter considers the bending of a symmetric elastic-plastic circular sandwich plate under a uniformly distributed load and seeks the minimum-weight design for a prescribed maximum deflection. The plate material is assumed to obey the von Mises yield condition. The finite element method is used to find the optimal solution. The third chapter studies the problems posed in the previous chapter for symmetric elastic-plastic stepped annular plates. The optimal solution is found using the finite element method and the Haar wavelet method; the latter is also used to solve ordinary differential equations. The fourth chapter investigates the bending of anisotropic annular plates and finds minimum-weight designs under the Hill and Tsai-Wu yield conditions. The Haar wavelet method is used for the computations. The thesis develops a parallel computing methodology that makes it possible to solve optimization problems for elastic-plastic plates numerically. The solutions obtained are compared with the results of Ohashi and Murakami, Turvey, and Upadrasta. The results are in good agreement with the work of other authors. The research showed that for optimization problems it is more sensible to use the wavelet method, whose parallelization saves more computing resources.
The current work is devoted to the theory of analysis and optimization of stepped circular and annular plates subject to smooth yield surfaces. Chapter 1 provides a brief historical review of the problem and of the finite element method. The basic ideas of parallel computation and of the multigrid method are also presented there. In Chapter 2 a method for the numerical investigation of axisymmetric plates subjected to distributed transverse pressure loading is presented. The material of the plates studied is assumed to be an ideal elastic-plastic material obeying the non-linear yield condition of von Mises and the associated flow law. Strain hardening and geometrical non-linearity are neglected in the present investigation. The calculations showed that the results are in good agreement with those obtained by ABAQUS when solving the direct problem of determining the stress-strain state of the plate. In Chapter 3 an analytical-numerical study of annular plates operating in the range of elastic-plastic deformations is undertaken. The material of the plates is assumed to be an ideal elastic-plastic material obeying the Mises yield condition. The author succeeded in the analytical derivation of optimality conditions for this highly non-linear problem. The resulting systems of equations were solved by existing computer codes. In Chapter 4 the methods of analysis and optimization of plates with piecewise constant thickness, developed earlier for homogeneous isotropic materials, are extended to plates made of anisotropic materials. Plastic yielding of the material is assumed to take place according to the Tsai-Wu criterion and the associated gradientality law. The traditional bending theory is used; non-linear effects are neglected in the current study.

DSpace at Tartu University Library

Aspects of Ocean Circulation with Finite Element Modelling

Author: Harig Sven
Publication venue
Publication date: 01/01/2004

This thesis deals with the development and evaluation of the three-dimensional, nonstationary ocean model FEOM0 (the basic version of the Finite Element Ocean Model, FEOM). This model is based on the Finite Element Method (FEM), which allows for the use of unstructured grids with variable resolution. The first part of the thesis introduces the governing equations, the mathematical formulation, and the discretisation using FEM. After introducing the discrete form of the equations, some details on the numerical implementation are given. The second part of the thesis contains applications of FEOM0 to different oceanographic tasks under idealised conditions. Comparisons to analytical results as well as to results of other numerical models in corresponding experiments are presented. The first application investigates the propagation of waves in a stratified ocean. The model shows good correspondence to theoretically obtained wave properties as well as to results of the Modular Ocean Model (MOM). The second investigation considers the wind-driven ocean circulation, especially the resulting vertical structure of the flow field. The influence of topography is examined; the results coincide with the predictions of linear theory. Finally, an idealised overflow scenario is investigated. The flow of dense water on a slope poses a special problem for numerical ocean models. An international intercomparison study (DOME: Dynamics of Overflow Mixing and Entrainment) was conceived in order to gain insight into the capabilities of different numerical models in reproducing this process. FEOM0 is applied to the idealised DOME setup with and without interior density stratification. In the case of a homogeneous interior, a variability in the overflow rate of several days shows up; the model gives a reasonable path of the plume and reproduces the theoretically obtained dependence of the overflow transport on the Coriolis parameter and density structure.

E-LIB Dokumentserver - Staats und Universitätsbibliothek Bremen

Polynomial and rational approximation for electronic structure calculations

Author: Etter Simon
Publication venue
Publication date

Atomic-scale simulation of matter has become an important research tool in physics, chemistry, material science and biology as it allows for insights which neither theoretical nor experimental investigation can provide. The most accurate of these simulations are based on the laws of quantum mechanics, in which case the main computational bottleneck becomes the evaluation of functions f(H) of a sparse matrix H (the Hamiltonian). One way to evaluate such matrix functions is through polynomial and rational approximation, the theory of which is reviewed in Chapter 2 of this thesis. It is well known that rational functions can approximate the relevant functions with much lower degrees than polynomials, but they are more challenging to use in practice since they require fast algorithms for evaluating rational functions r(H) of a matrix argument H. Such an algorithm has recently been proposed in the form of the Pole Expansion and Selected Inversion (PEXSI) scheme, which evaluates r(H) by writing $r(x) = \sum_k \frac{c_k}{x - z_k}$ in partial-fraction-decomposed form and then employing advanced sparse factorisation techniques to evaluate only a small subset of the entries of the resolvents $(H - z)^{-1}$. This scheme scales better than cubically in the matrix dimension, but it is not a linear scaling algorithm in general. We overcome this limitation in Chapter 3 by devising a modified, linear-scaling PEXSI algorithm which exploits that most of the fill-in entries in the triangular factorisations computed by the PEXSI algorithm are negligibly small. Finally, Chapter 4 presents a novel algorithm for computing electric conductivities which requires evaluating a bivariate matrix function f(H, H). We show that the Chebyshev coefficients $c_{k_1 k_2}$ of the relevant function $f(x_1, x_2)$ concentrate along the diagonal $k_1 \sim k_2$ and that this allows us to approximate $f(x_1, x_2)$ much more efficiently than one would expect based on a straightforward tensor-product extension of the one-dimensional arguments.
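The pole-expansion idea in this abstract can be illustrated on a small scale: applying a rational function in partial-fraction form to a vector reduces to one shifted linear solve per pole. The sketch below assumes a tridiagonal H and uses the Thomas algorithm without pivoting. It does not perform selected inversion and is not the PEXSI implementation; the function names are hypothetical.

```python
def solve_shifted_tridiagonal(lower, diag, upper, shift, rhs):
    """Solve (H - shift*I) x = rhs for tridiagonal H (Thomas algorithm, no pivoting).

    lower[i] = H[i+1][i], diag[i] = H[i][i], upper[i] = H[i][i+1].
    """
    n = len(diag)
    d = [complex(v) - shift for v in diag]
    b = [complex(v) for v in rhs]
    # Forward elimination of the sub-diagonal.
    for i in range(1, n):
        w = lower[i - 1] / d[i - 1]
        d[i] -= w * upper[i - 1]
        b[i] -= w * b[i - 1]
    # Back substitution.
    x = [0j] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x

def apply_rational(lower, diag, upper, coeffs, poles, vec):
    """Return r(H) vec for r(x) = sum_k coeffs[k] / (x - poles[k])."""
    result = [0j] * len(vec)
    for c, z in zip(coeffs, poles):
        x = solve_shifted_tridiagonal(lower, diag, upper, z, vec)
        result = [r + c * xi for r, xi in zip(result, x)]
    return result

# Example: H = [[2, 1], [1, 2]] and r(x) = 1/(x + 1) - 1/(x + 2).
y = apply_rational([1.0], [2.0, 2.0], [1.0], [1.0, -1.0], [-1.0, -2.0], [1.0, 1.0])
```

Each pole costs one O(n) solve here because H is tridiagonal; for general sparse H the cost of the shifted factorisations dominates, which is the part the abstract addresses.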

Warwick Research Archives Portal Repository

High performance computing for multiphase fluid flows

Author: Kumar Bipin
Publication venue: Dublin City University. School of Computing
Publication date: 01/03/2010

Multiphase fluid flows are very common in engineering and science applications. Examples include air flow on a water surface, metallurgical flow and blood flow in the body. In these flows, fluids are separated by a sharp interface and form different phases. The flow is characterized by the movement of this interface. Accurate modelling of the interface movement is a fundamental problem in the numerical simulation of these flows. Velocities for the movement are provided by the numerical solution of the Navier-Stokes (N-S) equations. These equations are discretized and converted into linear systems of equations. Research in the direction towards solving these systems efficiently has been the main focus of many researchers in the field of Computational Fluid Dynamics (CFD). A modified Volume of Fluid (VOF) method for modelling two phase flows is implemented using an analytic relation for its reconstruction step. The Finite Volume Method (FVM) is utilized, by incorporating a staggered grid, to discretize the two-dimensional (2-D) N-S equations. A preconditioned Krylov-Subspace iterative method, namely, the Bi-Conjugate Gradient Stabilized (Bi-CGSTAB) method is employed to solve the linear systems of equations. Solving the linear system usually consumes most of the simulation time for multiphase flow problems. Novel algorithms for the Incomplete LU Threshold (ILUT) preconditioner, forward and backward substitution and other matrix operations for penta-diagonal matrices are proposed here by adopting a diagonal sparse matrix format. The novel algorithm for ILUT reduces the computational complexity from O(n^3 − n^2) to O(n) in comparison to dense format. Further, it brings down the communication overhead, consequently facilitating parallelization. Parallel versions of these algorithms are developed using a new load balancing scheme. The MPI C++ communication library is utilized to develop the parallel version.
The 2-D VOF code is applied to shape advection problems and results are found to be in good agreement with those available in literature. In the case of translation of a square box, it provides more accurate results than other VOF methods. The code for the VOF method and the parallel iterative solvers are integrated with 2-D N-S code in C++. The whole code is then implemented to simulate several two phase flow problems: dam breaking with and without an obstacle, rising of an air bubble and lid driven cavity flows. Speedup data from parallel programs implemented on these problems are generated
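As a rough illustration of why a diagonal storage format keeps penta-diagonal operations at O(n), the following sketch stores a matrix as five diagonals and multiplies it by a vector. It is not the thesis code (which is C++ with MPI) and does not implement ILUT; the names `dia_matvec` and `poisson_pentadiagonal`, and the use of a five-point Laplacian as the test matrix, are assumptions made for the example.

```python
def dia_matvec(offsets, diagonals, x):
    """y = A x where diagonals[d][i] holds A[i][i + offsets[d]]."""
    n = len(x)
    y = [0.0] * n
    for off, diag in zip(offsets, diagonals):
        # Only rows whose column index i + off lies inside the matrix contribute.
        start = max(0, -off)
        stop = min(n, n - off)
        for i in range(start, stop):
            y[i] += diag[i] * x[i + off]
    return y

def poisson_pentadiagonal(m):
    """Five-point Laplacian on an m-by-m grid in diagonal storage (illustrative)."""
    n = m * m
    offsets = [-m, -1, 0, 1, m]
    main = [4.0] * n
    # Couplings to left/right neighbours vanish across grid-row boundaries.
    west = [0.0 if i % m == 0 else -1.0 for i in range(n)]
    east = [0.0 if i % m == m - 1 else -1.0 for i in range(n)]
    south = [-1.0] * n  # entries with i < m are never read
    north = [-1.0] * n  # entries with i >= n - m are never read
    return offsets, [south, west, main, east, north]

offsets, diagonals = poisson_pentadiagonal(3)
y = dia_matvec(offsets, diagonals, [1.0] * 9)
```

The product touches at most five entries per row, so its cost grows linearly with n, whereas a dense product would cost O(n^2).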

Irish Universities

DCU Online Research Access Service