The DUNE-ALUGrid Module

Author: Alkämper Martin
Dedner Andreas
Klöfkorn Robert
Nolte Martin
Publication date: 15/08/2015

In this paper we present the new DUNE-ALUGrid module. This module contains a major overhaul of the sources from the ALUGrid library and the binding to the DUNE software framework. The main changes include user-defined load balancing, parallel grid construction, and a redesign of the 2d grid, which can now also be used for parallel computations. In addition, many improvements have been introduced into the code to increase the parallel efficiency and to decrease the memory footprint. The original ALUGrid library is widely used within the DUNE community due to its good parallel performance for problems requiring local adaptivity and dynamic load balancing. Therefore, this new module will benefit a number of DUNE users. In addition, we have added features to increase the range of problems for which the grid manager can be used, for example, introducing a 3d tetrahedral grid using a parallel newest vertex bisection algorithm for conforming grid refinement. In this paper we will discuss the new features and the extensions to the DUNE interface, and explain, using various examples, how the code is used in parallel environments. Comment: 25 pages, 11 figures

arXiv.org e-Print Archive

UiS Brage

A generic finite element framework on parallel tree-based adaptive meshes

Author: Badia Santiago
Martín Alberto F.
Neiva Eric
Verdugo Francesc
Publication date: 01/01/2020

We present highly scalable parallel distributed-memory algorithms and associated data structures for a generic finite element framework that supports h-adaptivity on computational domains represented as multiple connected adaptive trees (forest-of-trees), thus providing multi-scale resolution on problems governed by partial differential equations. The framework is grounded on a rich representation of the adaptive mesh suitable for generic finite elements that is built on top of a low-level, light-weight forest-of-trees data structure handled by a specialized, highly parallel adaptive meshing engine. Along the way, we have identified the requirements that the forest-of-trees layer must fulfill to be coupled into our framework. Essentially, it must be able to describe neighboring relationships between cells in the adapted mesh (apart from hierarchical relationships) across the lower-dimensional objects at the boundary of the cells. Atop this two-layered mesh representation, we build the rest of the data structures required for the numerical integration and assembly of the discrete system of linear equations. We consider algorithms that are suitable for both subassembled and fully-assembled distributed data layouts of linear system matrices. The proposed framework has been implemented within the FEMPAR scientific software library, using p4est as a practical forest-of-octrees demonstrator. A comprehensive strong scaling study of this implementation when applied to Poisson and Maxwell problems reveals remarkable scalability up to 32.2K CPU cores and 482.2M degrees of freedom. Besides, the implementation in FEMPAR of the proposed approach is up to 2.6 and 3.4 times faster than the state-of-the-art deal.II finite element software in the h-adaptive approximation of a Poisson problem with first- and second-order Lagrangian finite elements, respectively (excluding the linear solver step from the comparison).

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Scipedia

Evaluation of an efficient stack-RLE clustering concept for dynamically adaptive grids

Author: Bungartz HJB
Neckel TN
Schreiber M
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2016

This is the author accepted manuscript. The final version is available from the Society for Industrial and Applied Mathematics via the DOI in this record. Abstract. One approach to tackle the challenge of efficient implementations for parallel PDE simulations on dynamically changing grids is the usage of space-filling curves (SFC). While SFC algorithms possess advantageous properties such as low memory requirements and close-to-optimal partitioning approaches with linear complexity, they require efficient communication strategies for keeping and utilizing the connectivity information, in particular for dynamically changing grids. Our approach is to use a sparse communication graph to store the connectivity information and to transfer data block-wise. This permits efficient generation of multiple partitions per memory context (denoted by clustering) which, in combination with a run-length encoding (RLE), directly leads to elegant solutions for shared, distributed and hybrid parallelization and allows cluster-based optimizations. While previous work focused on specific aspects, we present in this paper an overall compact summary of the stack-RLE clustering approach, completed by aspects of the vertex-based communication that ease understanding of the approach.
The central contribution of this work is the proof of suitability of the stack-RLE clustering approach for an efficient realization of different, relevant building blocks of Scientific Computing methodology and real-life CSE applications: We show 95% strong scalability for small-scale scalability benchmarks on 512 cores and weak scalability of over 90% on 8192 cores for finite-volume solvers and changing grid structure in every time step; optimizations of simulation data backends by writer tasks; comparisons of analytical benchmarks to analyze the adaptivity criteria; and a Tsunami simulation as a representative real-world showcase of a wave propagation for our approach which reduces the overall workload by 95% for parallel fully-adaptive mesh refinement and, based on a comparison with SFC-ordered regular grid cells, reduces the computation time by a factor of 7.6 with improved results and a factor of 62.2 with results of similar accuracy of buoy station data. This work was partly supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89).
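
The two ingredients named in this abstract, partitioning along a space-filling curve and run-length encoding, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a regular 2D grid and swaps in a Morton (Z-order) curve for simplicity, and the function names `morton_index`, `partition_along_curve`, and `run_length_encode` are hypothetical names chosen for this illustration.

```python
from itertools import groupby


def morton_index(x, y, bits=16):
    """Interleave the bits of x and y to get a Z-order (Morton) index.

    Assumption: a Morton curve stands in for whichever space-filling
    curve the paper actually uses.
    """
    index = 0
    for b in range(bits):
        index |= ((x >> b) & 1) << (2 * b)        # x bits go to even positions
        index |= ((y >> b) & 1) << (2 * b + 1)    # y bits go to odd positions
    return index


def partition_along_curve(cells, num_parts):
    """Order cells along the curve and cut the sequence into contiguous chunks.

    Returns a dict mapping each (x, y) cell to the id of its partition.
    """
    ordered = sorted(cells, key=lambda c: morton_index(c[0], c[1]))
    chunk = -(-len(ordered) // num_parts)  # ceiling division
    return {cell: i // chunk for i, cell in enumerate(ordered)}


def run_length_encode(values):
    """Compress a sequence into (value, run length) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]


if __name__ == "__main__":
    # A 4x4 grid split into 4 partitions: each partition is one 2x2 quadrant.
    grid = [(x, y) for y in range(4) for x in range(4)]
    owner = partition_along_curve(grid, 4)
    # Owners of the cells in the bottom row, stored as runs instead of per cell.
    print(run_length_encode([owner[(x, 0)] for x in range(4)]))
```

Because cells that are close along the curve tend to be close in space, neighbouring cells often share an owner, so per-cell ownership or communication metadata collapses into a few runs.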

Open Research Exeter

Albany: Using Component-based Design to Develop a Flexible, Generic Multiphysics Analysis Code

Author: Bartlett Roscoe A.
Bradley Andrew M.
Chen Qiushi
Demeshko Irina P.
Gao Xujiao
Hansen Glen A.
Mota Alejandro
Muller Richard P.
Nielsen Erik
Ostien Jakob T.
Pawlowski Roger P.
Perego Mauro
Phipps Eric T.
Salinger Andrew G.
Sun WaiChing
Tezaur Irina K.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2016

Abstract: Albany is a multiphysics code constructed by assembling a set of reusable, general components. It is an implicit, unstructured grid finite element code that hosts a set of advanced features that are readily combined within a single analysis run. Albany uses template-based generic programming methods to provide extensibility and flexibility; it employs a generic residual evaluation interface to support the easy addition and modification of physics. This interface is coupled to powerful automatic differentiation utilities that are used to implement efficient nonlinear solvers and preconditioners, and also to enable sensitivity analysis and embedded uncertainty quantification capabilities as part of the forward solve. The flexible application programming interfaces in Albany couple to two different adaptive mesh libraries; it internally employs generic integration machinery that supports tetrahedral, hexahedral, and hybrid meshes of user-specified order. We present the overall design of Albany, and focus on the specifics of the integration of many of its advanced features. As Albany and the components that form it are openly available on the internet, it is our goal that the reader might find some of the design concepts useful in their own work. Albany results in a code that enables the rapid development of parallel, numerically efficient multiphysics software tools. In discussing the features and details of the integration of many of the components involved, we show the reader the wide variety of solution components that are available and what is possible when they are combined within a simulation capability. Key Words: partial differential equations, finite element analysis, template-based generic programming

Columbia University Academic Commons

Performance and Optimization Abstractions for Large Scale Heterogeneous Systems in the Cactus/Chemora Framework

Author: Schnetter Erik
Publication date: 01/01/2013

We describe a set of lower-level abstractions to improve performance on modern large scale heterogeneous systems. These provide portable access to system- and hardware-dependent features, automatically apply dynamic optimizations at run time, and target stencil-based codes used in finite differencing, finite volume, or block-structured adaptive mesh refinement codes. These abstractions include a novel data structure to manage refinement information for block-structured adaptive mesh refinement, an iterator mechanism to efficiently traverse multi-dimensional arrays in stencil-based codes, and a portable API and implementation for explicit SIMD vectorization. These abstractions can either be employed manually, or be targeted by automated code generation, or be used via support libraries by compilers during code generation. The implementations described below are available in the Cactus framework, and are used e.g. in the Einstein Toolkit for relativistic astrophysics simulations

arXiv.org e-Print Archive

CiteSeerX

Design and Analysis of a Task-based Parallelization over a Runtime System of an Explicit Finite-Volume CFD Code with Adaptive Time Stepping

Author: Brenner Pierre
Carpaye Jean Marie Couteyen
Roman Jean
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

FLUSEPA (Registered trademark in France No. 134009261) is an advanced simulation tool which performs a wide range of aerodynamic studies. It is the unstructured finite-volume solver developed by the Airbus Safran Launchers company to calculate compressible, multidimensional, unsteady, viscous and reactive flows around bodies in relative motion. The time integration in FLUSEPA is done using an explicit temporal adaptive method. The current production version of the code is based on MPI and OpenMP. This implementation leads to significant synchronizations that must be reduced. To tackle this problem, we present the study of a task-based parallelization of the aerodynamic solver of FLUSEPA using the runtime system StarPU and combining up to three levels of parallelism. We validate our solution by the simulation (using a finite-volume mesh with 80 million cells) of a take-off blast wave propagation for the Ariane 5 launcher. Comment: Accepted manuscript of a paper in Journal of Computational Science

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server