8,107 research outputs found

    Adaptive mesh refinement computation of acoustic radiation from an engine intake

    No full text
    A block-structured adaptive mesh refinement (AMR) method was applied to the computational problem of acoustic radiation from an aeroengine intake. The aim is to improve the computational and storage efficiency in aeroengine noise prediction through reduction of computational cells. A parallel implementation of the adaptive mesh refinement algorithm was achieved using message passing interface. It combined a range of 2nd- and 4th-order spatial stencils, a 4th-order low-dissipation and low-dispersion Runge–Kutta scheme for time integration and several different interpolation methods. Both the parallel AMR algorithms and numerical issues were introduced briefly in this work. To solve the problem of acoustic radiation from an aeroengine intake, the code was extended to support body-fitted grid structures. The problem of acoustic radiation was solved with linearised Euler equations. The AMR results were compared with the previous results computed on a uniformly fine mesh to demonstrate the accuracy and the efficiency of the current AMR strategy. As the computational load of the whole adaptively refined mesh has to be balanced between nodes on-line, the parallel performance of the existing code deteriorates along with the increase of processors due to the expensive inter-nodes memory communication costs. The potential solution was suggested in the end

    Vectorization and Parallelization of the Adaptive Mesh Refinement N-body Code

    Full text link
    In this paper, we describe our vectorized and parallelized adaptive mesh refinement (AMR) N-body code with shared time steps, and report its performance on a Fujitsu VPP5000 vector-parallel supercomputer. Our AMR N-body code puts hierarchical meshes recursively where higher resolution is required and the time step of all particles are the same. The parts which are the most difficult to vectorize are loops that access the mesh data and particle data. We vectorized such parts by changing the loop structure, so that the innermost loop steps through the cells instead of the particles in each cell, in other words, by changing the loop order from the depth-first order to the breadth-first order. Mass assignment is also vectorizable using this loop order exchange and splitting the loop into 2Ndim2^{N_{dim}} loops, if the cloud-in-cell scheme is adopted. Here, NdimN_{dim} is the number of dimension. These vectorization schemes which eliminate the unvectorized loops are applicable to parallelization of loops for shared-memory multiprocessors. We also parallelized our code for distributed memory machines. The important part of parallelization is data decomposition. We sorted the hierarchical mesh data by the Morton order, or the recursive N-shaped order, level by level and split and allocated the mesh data to the processors. Particles are allocated to the processor to which the finest refined cells including the particles are also assigned. Our timing analysis using the Λ\Lambda-dominated cold dark matter simulations shows that our parallel code speeds up almost ideally up to 32 processors, the largest number of processors in our test.Comment: 21pages, 16 figures, to be published in PASJ (Vol. 57, No. 5, Oct. 2005

    Relativistic MHD with Adaptive Mesh Refinement

    Get PDF
    This paper presents a new computer code to solve the general relativistic magnetohydrodynamics (GRMHD) equations using distributed parallel adaptive mesh refinement (AMR). The fluid equations are solved using a finite difference Convex ENO method (CENO) in 3+1 dimensions, and the AMR is Berger-Oliger. Hyperbolic divergence cleaning is used to control the B=0\nabla\cdot {\bf B}=0 constraint. We present results from three flat space tests, and examine the accretion of a fluid onto a Schwarzschild black hole, reproducing the Michel solution. The AMR simulations substantially improve performance while reproducing the resolution equivalent unigrid simulation results. Finally, we discuss strong scaling results for parallel unigrid and AMR runs.Comment: 24 pages, 14 figures, 3 table

    Refficientlib: an efficient load-rebalanced adaptive mesh refinement algorithm for high-performance computational physics meshes

    Get PDF
    No separate or additional fees are collected for access to or distribution of the work.In this paper we present a novel algorithm for adaptive mesh refinement in computational physics meshes in a distributed memory parallel setting. The proposed method is developed for nodally based parallel domain partitions where the nodes of the mesh belong to a single processor, whereas the elements can belong to multiple processors. Some of the main features of the algorithm presented in this paper are its capability of handling multiple types of elements in two and three dimensions (triangular, quadrilateral, tetrahedral, and hexahedral), the small amount of memory required per processor, and the parallel scalability up to thousands of processors. The presented algorithm is also capable of dealing with nonbalanced hierarchical refinement, where multirefinement level jumps are possible between neighbor elements. An algorithm for dealing with load rebalancing is also presented, which allows us to move the hierarchical data structure between processors so that load unbalancing is kept below an acceptable level at all times during the simulation. A particular feature of the proposed algorithm is that arbitrary renumbering algorithms can be used in the load rebalancing step, including both graph partitioning and space-filling renumbering algorithms. The presented algorithm is packed in the Fortran 2003 object oriented library \textttRefficientLib, whose interface calls which allow it to be used from any computational physics code are summarized. Finally, numerical experiments illustrating the performance and scalability of the algorithm are presented.Peer ReviewedPostprint (published version

    Hydra: A Parallel Adaptive Grid Code

    Full text link
    We describe the first parallel implementation of an adaptive particle-particle, particle-mesh code with smoothed particle hydrodynamics. Parallelisation of the serial code, ``Hydra'', is achieved by using CRAFT, a Cray proprietary language which allows rapid implementation of a serial code on a parallel machine by allowing global addressing of distributed memory. The collisionless variant of the code has already completed several 16.8 million particle cosmological simulations on a 128 processor Cray T3D whilst the full hydrodynamic code has completed several 4.2 million particle combined gas and dark matter runs. The efficiency of the code now allows parameter-space explorations to be performed routinely using 64364^3 particles of each species. A complete run including gas cooling, from high redshift to the present epoch requires approximately 10 hours on 64 processors. In this paper we present implementation details and results of the performance and scalability of the CRAFT version of Hydra under varying degrees of particle clustering.Comment: 23 pages, LaTex plus encapsulated figure

    Optimisation of patch distribution strategies for AMR applications

    Get PDF
    As core counts increase in the world's most powerful supercomputers, applications are becoming limited not only by computational power, but also by data availability. In the race to exascale, efficient and effective communication policies are key to achieving optimal application performance. Applications using adaptive mesh refinement (AMR) trade off communication for computational load balancing, to enable the focused computation of specific areas of interest. This class of application is particularly susceptible to the communication performance of the underlying architectures, and are inherently difficult to scale efficiently. In this paper we present a study of the effect of patch distribution strategies on the scalability of an AMR code. We demonstrate the significance of patch placement on communication overheads, and by balancing the computation and communication costs of patches, we develop a scheme to optimise performance of a specific, industry-strength, benchmark application

    A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing

    Get PDF
    This work introduces an innovative parallel, fully-distributed finite element framework for growing geometries and its application to metal additive manufacturing. It is well-known that virtual part design and qualification in additive manufacturing requires highly-accurate multiscale and multiphysics analyses. Only high performance computing tools are able to handle such complexity in time frames compatible with time-to-market. However, efficiency, without loss of accuracy, has rarely held the centre stage in the numerical community. Here, in contrast, the framework is designed to adequately exploit the resources of high-end distributed-memory machines. It is grounded on three building blocks: (1) Hierarchical adaptive mesh refinement with octree-based meshes; (2) a parallel strategy to model the growth of the geometry; (3) state-of-the-art parallel iterative linear solvers. Computational experiments consider the heat transfer analysis at the part scale of the printing process by powder-bed technologies. After verification against a 3D benchmark, a strong-scaling analysis assesses performance and identifies major sources of parallel overhead. A third numerical example examines the efficiency and robustness of (2) in a curved 3D shape. Unprecedented parallelism and scalability were achieved in this work. Hence, this framework contributes to take on higher complexity and/or accuracy, not only of part-scale simulations of metal or polymer additive manufacturing, but also in welding, sedimentation, atherosclerosis, or any other physical problem where the physical domain of interest grows in time

    An adaptive fixed-mesh ALE method for free surface flows

    Get PDF
    In this work we present a Fixed-Mesh ALE method for the numerical simulation of free surface flows capable of using an adaptive finite element mesh covering a background domain. This mesh is successively refined and unrefined at each time step in order to focus the computational effort on the spatial regions where it is required. Some of the main ingredients of the formulation are the use of an Arbitrary-Lagrangian–Eulerian formulation for computing temporal derivatives, the use of stabilization terms for stabilizing convection, stabilizing the lack of compatibility between velocity and pressure interpolation spaces, and stabilizing the ill-conditioning introduced by the cuts on the background finite element mesh, and the coupling of the algorithm with an adaptive mesh refinement procedure suitable for running on distributed memory environments. Algorithmic steps for the projection between meshes are presented together with the algebraic fractional step approach used for improving the condition number of the linear systems to be solved. The method is tested in several numerical examples. The expected convergence rates both in space and time are observed. Smooth solution fields for both velocity and pressure are obtained (as a result of the contribution of the stabilization terms). Finally, a good agreement between the numerical results and the reference experimental data is obtained.Postprint (published version
    corecore