The DUNE-ALUGrid Module
In this paper we present the new DUNE-ALUGrid module. This module contains a
major overhaul of the sources from the ALUGrid library and the bindings to the
DUNE software framework. The main changes include user-defined load balancing,
parallel grid construction, and a redesign of the 2d grid, which can now also
be used for parallel computations. In addition, many improvements have been
introduced into the code to increase the parallel efficiency and to decrease
the memory footprint.
The original ALUGrid library is widely used within the DUNE community due to
its good parallel performance for problems requiring local adaptivity and
dynamic load balancing. Therefore, this new module will benefit a number of DUNE
users. In addition we have added features to increase the range of problems for
which the grid manager can be used, for example, introducing a 3d tetrahedral
grid using a parallel newest vertex bisection algorithm for conforming grid
refinement. In this paper we will discuss the new features, extensions to the
DUNE interface, and explain for various examples how the code is used in
parallel environments.
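The conforming refinement mentioned in the abstract can be illustrated with a toy sketch of newest vertex bisection in 2d. The names and data layout below are illustrative assumptions, not the DUNE-ALUGrid API: each triangle stores its newest vertex last, and bisection splits the opposite (refinement) edge at its midpoint, which becomes the newest vertex of both children.

```python
# Minimal sketch of newest-vertex bisection for a single 2D triangle.
# A triangle is stored as (v0, v1, v2) with v2 the "newest" vertex; the
# refinement edge is the one opposite to it, (v0, v1). These names are
# illustrative, not the DUNE-ALUGrid API.

def bisect(tri, coords, new_vertex_id):
    """Split a triangle by inserting the midpoint of its refinement edge."""
    v0, v1, v2 = tri
    mid = new_vertex_id
    coords[mid] = tuple((a + b) / 2 for a, b in zip(coords[v0], coords[v1]))
    # Each child gets the midpoint as its newest vertex, which is what
    # keeps repeated bisection from degenerating the triangles.
    return [(v2, v0, mid), (v1, v2, mid)]

coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0)}
children = bisect((0, 1, 2), coords, new_vertex_id=3)
```

In a full implementation the refinement edge must be bisected compatibly in both neighbouring triangles to keep the grid conforming; the sketch only shows the single-element step.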
High-Quality Shared-Memory Graph Partitioning
Partitioning graphs into blocks of roughly equal size such that few edges run
between blocks is a frequently needed operation in processing graphs. Recently,
the size, variety, and structural complexity of these networks have grown
dramatically. Unfortunately, previous approaches to parallel graph partitioning
have problems in this context since they often show a negative trade-off
between speed and quality. We present an approach to multi-level shared-memory
parallel graph partitioning that guarantees balanced solutions, shows high
speed-ups for a variety of large graphs and yields very good quality
independently of the number of cores used. For example, on 31 cores, our
algorithm partitions our largest test instance into 16 blocks, cutting less than
half as many edges as our main competitor when both algorithms are
given the same amount of time. Important ingredients include parallel label
propagation for both coarsening and improvement, parallel initial partitioning,
a simple yet effective approach to parallel localized local search, and fast
locality-preserving hash tables.
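The two quantities this abstract optimizes, the edge cut between blocks and the balance of block sizes, can be stated concretely. The adjacency-dict layout below is an assumption for illustration, not the data structures of the partitioner itself:

```python
# Illustrative helpers for the two partition-quality measures named in
# the abstract: edge cut (edges running between blocks) and balance
# (largest block relative to the ideal size n/k). The graph layout is
# an assumption, not the paper's internal representation.
from collections import Counter

def edge_cut(adj, block):
    """Count edges whose endpoints lie in different blocks."""
    cut = 0
    for u, neighbours in adj.items():
        for v in neighbours:
            if u < v and block[u] != block[v]:  # count each edge once
                cut += 1
    return cut

def balance(block, k):
    """Largest block size divided by the ideal size n/k (1.0 is perfect)."""
    sizes = Counter(block.values())
    return max(sizes.values()) / (len(block) / k)

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
block = {0: 0, 1: 0, 2: 1, 3: 1}
```

A multilevel partitioner tries to minimize `edge_cut` subject to `balance` staying below a tolerance such as 1.03.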
PT-Scotch: A tool for efficient parallel graph ordering
The parallel ordering of large graphs is a difficult problem, because on the
one hand minimum degree algorithms do not parallelize well, and on the other
hand the obtainment of high quality orderings with the nested dissection
algorithm requires efficient graph bipartitioning heuristics, the best
sequential implementations of which are also hard to parallelize. This paper
presents a set of algorithms, implemented in the PT-Scotch software package,
which allow one to order large graphs in parallel, yielding orderings whose
quality is only slightly worse than that of state-of-the-art
sequential algorithms. Our implementation uses the classical nested dissection
approach but relies on several novel features to solve the parallel graph
bipartitioning problem. Thanks to these improvements, PT-Scotch produces
consistently better orderings than ParMeTiS on large numbers of processors.
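The nested dissection scheme the abstract builds on can be sketched in miniature: recursively bipartition the graph, order the two halves first, and number the separator vertices last. The crude index-halving bipartition and boundary-vertex separator below are toy assumptions; PT-Scotch uses far more sophisticated parallel bipartitioning heuristics.

```python
# Toy sketch of nested dissection ordering. The bipartition here is a
# naive split of the sorted vertex list, and the separator is simply
# the left-side vertices with a neighbour on the right; real tools
# compute much better separators.

def nested_dissection(vertices, adj, order):
    if len(vertices) <= 2:            # small subgraph: order it directly
        order.extend(sorted(vertices))
        return
    left = set(sorted(vertices)[: len(vertices) // 2])
    right = set(vertices) - left
    # Separator: vertices in `left` that touch `right`.
    sep = {u for u in left if any(v in right for v in adj[u])}
    nested_dissection(left - sep, adj, order)
    nested_dissection(right, adj, order)
    order.extend(sorted(sep))         # separator vertices come last

# Path graph 0-1-2-3-4-5.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
order = []
nested_dissection(set(range(6)), adj, order)
```

Ordering separators last is what limits fill-in during sparse factorization, which is why separator quality drives ordering quality.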
Parallel Graph Partitioning for Complex Networks
Processing large complex networks like social networks or web graphs has
recently attracted considerable interest. In order to do this in parallel, we
need to partition them into pieces of about equal size. Unfortunately, previous
parallel graph partitioners originally developed for more regular mesh-like
networks do not work well for these networks. This paper addresses this problem
by parallelizing and adapting the label propagation technique originally
developed for graph clustering. By introducing size constraints, label
propagation becomes applicable for both the coarsening and the refinement phase
of multilevel graph partitioning. We obtain very high quality by applying a
highly parallel evolutionary algorithm to the coarsened graph. The resulting
system is both more scalable and achieves higher quality than state-of-the-art
systems like ParMetis or PT-Scotch. For large complex networks the performance
differences are very big. For example, our algorithm can partition a web graph
with 3.3 billion edges in less than sixteen seconds using 512 cores of a high
performance cluster while producing a high quality partition -- none of the
competing systems can handle this graph on our system.
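The key adaptation this abstract describes, adding size constraints to label propagation so it can serve as a partitioning coarsener and refiner, can be sketched as follows. The data layout, round limit, and tie-breaking are illustrative assumptions, not the actual system:

```python
# Minimal sketch of size-constrained label propagation: a vertex adopts
# the most frequent label among its neighbours, but only if the target
# block has not yet reached the size limit. Parameters and layout are
# assumptions for illustration.
from collections import Counter

def label_propagation(adj, max_size, rounds=10):
    label = {u: u for u in adj}        # each vertex starts in its own block
    size = Counter(label.values())
    for _ in range(rounds):
        moved = False
        for u in adj:
            counts = Counter(label[v] for v in adj[u])
            for target, _ in counts.most_common():
                if target == label[u]:
                    break              # already in the preferred block
                if size[target] < max_size:   # the size constraint
                    size[label[u]] -= 1
                    size[target] += 1
                    label[u] = target
                    moved = True
                    break
        if not moved:
            break
    return label

# Two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj, max_size=3)
```

Without the `max_size` check this is plain clustering and can collapse everything into one block; the constraint is what makes the result usable as a balanced (coarse) partition.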
Parallel Mesh Partitioning in Alya
The Alya System is the BSC simulation code for multi-physics problems [1]. It is based on a Variational
Multiscale Finite Element Method for unstructured meshes.
Work distribution is achieved by partitioning the original mesh into subdomains (submeshes). Until now, this
pre-partitioning step has been done in serial by a single process, using the metis library [2]. This is a major
bottleneck when larger meshes with millions of elements have to be partitioned: either the data does not fit in
the memory of a single computing node, or, in the cases where it does fit, Alya takes too long in the
partitioning step.
In this document we explain the tasks done to design, implement and test a new parallel partitioning algorithm
for Alya. In this algorithm, a subset of the workers is in charge of partitioning the mesh in parallel, using the
parmetis library [3].
The partitioning workers load consecutive parts of the main mesh using a parallel space-partitioning bin
structure [4], which is capable of obtaining the adjacent boundary elements of their respective submeshes. With
this local mesh, each of the partitioning workers is able to create its local element adjacency graph and to
partition the mesh.
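The local step described above, building an element adjacency graph from a worker's submesh, can be sketched for a toy mesh. The connectivity layout and the "shared face means two common nodes" rule for 2D quads are assumptions for illustration, not Alya's internals:

```python
# Illustrative sketch of building an element adjacency graph: two mesh
# elements are adjacent when they share a face (for these 2D quads, an
# edge, i.e. at least two common nodes). This is the kind of graph a
# partitioning worker would hand to a graph partitioner; names and
# layout are assumptions, not Alya's data structures.

def element_adjacency(elements):
    """Map element id -> ids of elements sharing at least 2 nodes."""
    adj = {e: [] for e in elements}
    ids = list(elements)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if len(set(elements[a]) & set(elements[b])) >= 2:
                adj[a].append(b)
                adj[b].append(a)
    return adj

# Three quads in a row: element id -> its four corner nodes.
elements = {0: [0, 1, 5, 4], 1: [1, 2, 6, 5], 2: [2, 3, 7, 6]}
adj = element_adjacency(elements)
```

In the parallel setting each worker holds only a consecutive element range plus the adjacent boundary elements, so the pairwise check runs over a local submesh rather than the whole mesh.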
We have validated our new algorithm using a Navier-Stokes problem on a small cube mesh of 1000 elements.
Then we performed a scalability test on a 30M element mesh to check if the time to partition the mesh is reduced
proportionally with the number of partitioning workers.
We have also compared metis and parmetis with respect to the balancing of the element distribution among
the domains, to test how the use of many partitioning workers to partition the mesh affects the scalability of
Alya. These tests show that it is better to use fewer partitioning workers to partition the mesh.
Finally, we have two sections explaining the results and the future work that has to be done in order to finalise
and improve the parallel partitioning algorithm.