729 research outputs found

    Analysis of the discontinuous Galerkin method for elliptic problems on surfaces

    Get PDF
    We extend the discontinuous Galerkin (DG) framework to a linear second-order elliptic problem on a compact smooth connected and oriented surface. An interior penalty (IP) method is introduced on a discrete surface and we derive a-priori error estimates by relating the latter to the original surface via the lift introduced in Dziuk (1988). The estimates suggest that the geometric error terms arising from the surface discretisation do not affect the overall convergence rate of the IP method when using linear ansatz functions. This is then verified numerically for a number of test problems. An intricate issue is the approximation of the surface conormal required in the IP formulation, choices of which are investigated numerically. Furthermore, we present a generic implementation of test problems on surfaces.Comment: 21 pages, 4 figures. IMA Journal of Numerical Analysis 2013, Link to publication: http://imajna.oxfordjournals.org/cgi/content/abstract/drs033? ijkey=45b23qZl5oJslZQ&keytype=re

    Numerical solution of 3-D electromagnetic problems in exploration geophysics and its implementation on massively parallel computers

    Get PDF
    The growing significance, technical development and employment of electromagnetic (EM) methods in exploration geophysics have led to the increasing need for reliable and fast techniques of interpretation of 3-D EM data sets acquired in complex geological environments. The first and most important step to creating an inversion method is the development of a solver for the forward problem. In order to create an efficient, reliable and practical 3-D EM inversion, it is necessary to have a 3-D EM modelling code that is highly accurate, robust and very fast. This thesis focuses precisely on this crucial and very demanding step to building a 3-D EM interpretation method. The thesis presents as its main contribution a highly accurate, robust, very fast and extremely scalable numerical method for 3-D EM modelling in geophysics that is based on finite elements (FE) and designed to run on massively parallel computing platforms. Thanks to the fact that the FE approach supports completely unstructured tetrahedral meshes as well as local mesh refinements, the presented solver is able to represent complex geometries of subsurface structures very precisely and thus improve the solution accuracy and avoid misleading artefacts in images. Consequently, it can be successfully used in geological environments of arbitrary geometrical complexities. The parallel implementation of the method, which is based on the domain decomposition and a hybrid MPI-OpenMP scheme, has proved to be highly scalable - the achieved speed-up is close to the linear for more than a thousand processors. Thanks to this, the code is able to deal with extremely large problems, which may have hundreds of millions of degrees of freedom, in a very efficient way. The importance of having this forward-problem solver lies in the fact that it is now possible to create a 3-D EM inversion that can deal with data obtained in extremely complex geological environments in a way that is realistic for practical use in industry. So far, such imaging tool has not been proposed due to a lack of efficient, parallel FE solutions as well as the limitations of efficient solvers based on finite differences. In addition, the thesis discusses physical, mathematical and numerical aspects and challenges of 3-D EM modelling, which have been studied during my research in order to properly design the presented software for EM field simulations on 3-D areas of the Earth. Through this work, a physical problem formulation based on the secondary Coulomb-gauged EM potentials has been validated, proving that it can be successfully used with the standard nodal FE method to give highly accurate numerical solutions. Also, this work has shown that Krylov subspace iterative methods are the best solution for solving linear systems that arise after FE discretisation of the problem under consideration. More precisely, it has been discovered empirically that the best iterative method for this kind of problems is biconjugate gradient stabilised with an elaborate preconditioner. Since most commonly used preconditioners proved to be either unable to improve the convergence of the implemented solvers to the desired extent, or impractical in the parallel context, I have proposed a preconditioning technique for Krylov methods that is based on algebraic multigrid. Tests for various problems with different conductivity structures and characteristics have shown that the new preconditioner greatly improves the convergence of different Krylov subspace methods, which significantly reduces the total execution time of the program and improves the solution quality. Furthermore, the preconditioner is very practical for parallel implementation. Finally, it has been concluded that there are not any restrictions in employing classical parallel programming models, MPI and OpenMP, for parallelisation of the presented FE solver. Moreover, they have proved to be enough to provide an excellent scalability for it

    Efficient Multigrid Preconditioners for Atmospheric Flow Simulations at High Aspect Ratio

    Get PDF
    Many problems in fluid modelling require the efficient solution of highly anisotropic elliptic partial differential equations (PDEs) in "flat" domains. For example, in numerical weather- and climate-prediction an elliptic PDE for the pressure correction has to be solved at every time step in a thin spherical shell representing the global atmosphere. This elliptic solve can be one of the computationally most demanding components in semi-implicit semi-Lagrangian time stepping methods which are very popular as they allow for larger model time steps and better overall performance. With increasing model resolution, algorithmically efficient and scalable algorithms are essential to run the code under tight operational time constraints. We discuss the theory and practical application of bespoke geometric multigrid preconditioners for equations of this type. The algorithms deal with the strong anisotropy in the vertical direction by using the tensor-product approach originally analysed by B\"{o}rm and Hiptmair [Numer. Algorithms, 26/3 (2001), pp. 219-234]. We extend the analysis to three dimensions under slightly weakened assumptions, and numerically demonstrate its efficiency for the solution of the elliptic PDE for the global pressure correction in atmospheric forecast models. For this we compare the performance of different multigrid preconditioners on a tensor-product grid with a semi-structured and quasi-uniform horizontal mesh and a one dimensional vertical grid. The code is implemented in the Distributed and Unified Numerics Environment (DUNE), which provides an easy-to-use and scalable environment for algorithms operating on tensor-product grids. Parallel scalability of our solvers on up to 20,480 cores is demonstrated on the HECToR supercomputer.Comment: 22 pages, 6 Figures, 2 Table

    HONEI: A collection of libraries for numerical computations targeting multiple processor architectures.

    Get PDF
    We present HONEI, an open-source collection of libraries offering a hardware oriented approach to numerical calculations. HONEI abstracts the hardware, and applications written on top of HONEI can be executed on a wide range of computer architectures such as CPUs, GPUs and the Cell processor. We demonstrate the flexibility and performance of our approach with two test applications, a Finite Element multigrid solver for the Poisson problem and a robust and fast simulation of shallow water waves. By linking against HONEI's libraries, we achieve a two-fold speedup over straight forward C++ code using HONEI's SSE backend, and additional 3--4 and 4--16 times faster execution on the Cell and a GPU. A second important aspect of our approach is that the full performance capabilities of the hardware under consideration can be exploited by adding optimised application-specific operations to the HONEI libraries. HONEI provides all necessary infrastructure for development and evaluation of such kernels, significantly simplifying their development

    Advanced interface modelling for 2D shell & 3D continuum problems

    Get PDF
    This work is motivated by the need for an efficient yet accurate approach for static and dynamic contact analysis of large-scale structures which can a) capture the optimum con- tact position with a moderate number of contact elements, and b) enable across-partition adaptive contact analysis within a parallel processing environment. In addressing these two issues, a novel adaptive node-to-surface contact approach is proposed to discretise the contact boundaries and to trace the evolution of contact locations. Contact search is a demanding process that can become quite complicated for certain types of problem. In this work, an efficient and robust contact search method is proposed, which can a) locally track the master facet of a given slave node despite the appearance of highly non-smooth contact surface, including surfaces with concave/convex regions or with distinct boundaries as well as reversible normals, and b) globally reallocate the master-slave contact pairs based on the penetration state without an expensive global search, providing an effective adaptive contact approach. A dual-interface-based domain decomposition method emphasising across-partition con- tact coupling is proposed. A pair of fully decomposed node-to-surface contact element are proposed to discretise the across-partition contact boundaries. The assumption of small incremental displacements is adopted, which a) avoids the excessive coupling between the decomposed master and slave, b) reduces significantly the communication overhead, and c) facilitates a flexible across-partition adaptive analysis. This strategy is found to provide good results for a sufficiently small time- or load-step, and it also facilitates mix-dimensional contact simulation. Another interest in current thesis is the inaccuracy in non-smooth plates modelled us- ing 2D displacement-based shell elements. In this work the dominant factor causing the inaccuracy is recognised as the incompatible tangential rotations on the two sides of the in- tersection. A 3-noded coupling element is introduced to impose a continuous constraint to couple the incompatible rotations. The significance of the discontinuity in the shell-based folded structure and the effectiveness of the coupling element is demonstrated through numerical studies comparing shell-based models to high fidelity solid-based models.Open Acces

    Strategies for producing fast finite element solutions of the incompressible Navier-Stokes equations on massively parallel architectures

    Get PDF
    To take advantage of the inherent flexibility of the finite element method in solving for flows within complex geometries, it is necessary to produce efficient implementations of the method. Segregation of the solution scheme and the use of parallel computers are two ways of doing this. Here, the optimisation of a sequential segregated finite element algorithm is discussed, together with the various strategies by which this is done. Furthermore, the implications of parallelising the code onto a massively parallel computer, the MasPar, are explored. This machine is of Single Instruction Multiple Data type and so modifications to the computer code have been necessary. A general methodology for the implementation of finite element programs is presented based on projecting the levels of data within the algorithm into a form which is ideal for parallelisation. Application of this methodology, in a high level language, has resulted in a code which runs at just under 30MFlops (in double precision). The computations are performed with minimal inter-processor communication and this represents an efficiency of 20% of the theoretical peak speed. Even though only high level language constructs have been used, this efficiency is comparable with other work using low level constructs on machines of this architecture. In particular, the use of data parallel arrays and the utilisation of the non-unique machine specific features of the computer architecture have produced an efficient, fast program

    Communication-Avoiding Algorithms for a High-Performance Hyperbolic PDE Engine

    Get PDF
    The study of waves has always been an important subject of research. Earthquakes, for example, have a direct impact on the daily lives of millions of people while gravitational waves reveal insight into the composition and history of the Universe. These physical phenomena, despite being tackled traditionally by different fields of physics, have in common that they are modelled the same way mathematically: as a system of hyperbolic partial differential equations (PDEs). The ExaHyPE project (“An Exascale Hyperbolic PDE Engine") translates this similarity into a software engine that can be quickly adapted to simulate a wide range of hyperbolic partial differential equations. ExaHyPE’s key idea is that the user only specifies the physics while the engine takes care of the parallelisation and the interplay of the underlying numerical methods. Consequently, a first simulation code for a new hyperbolic PDE can often be realised within a few hours. This is a task that traditionally can take weeks, months, even years for researchers starting from scratch. My main contribution to ExaHyPE is the development of the core infrastructure. This comprises the development and implementation of ExaHyPE’s solvers and adaptive mesh refinement procedures, it’s MPI+X parallelisation as well as high-level aspects of ExaHyPE’s application-tailored code generation, which allows to adapt ExaHyPE to model many different hyperbolic PDE systems. Like any high-performance computing code, ExaHyPE has to tackle the challenges of the coming exascale computing era, notably network communication latencies and the growing memory wall. In this thesis, I propose memory-efficient realisations of ExaHyPE’s solvers that avoid data movement together with a novel task-based MPI+X parallelisation concept that allows to hide network communication behind computation in dynamically adaptive simulations
    corecore