11,231 research outputs found

    An odyssey into local refinement and multilevel preconditioning III: Implementation and numerical experiments

    Get PDF
    In this paper, we examine a number of additive and multiplicative multilevel iterative methods and preconditioners in the setting of two-dimensional local mesh refinement. While standard multilevel methods are effective for uniform refinement-based discretizations of elliptic equations, they tend to be less effective for algebraic systems, which arise from discretizations on locally refined meshes, losing their optimal behavior in both storage and computational complexity. Our primary focus here is on Bramble, Pasciak, and Xu (BPX)-style additive and multiplicative multilevel preconditioners, and on various stabilizations of the additive and multiplicative hierarchical basis (HB) method, and their use in the local mesh refinement setting. In parts I and II of this trilogy, it was shown that both BPX and wavelet stabilizations of HB have uniformly bounded condition numbers on several classes of locally refined two- and three-dimensional meshes based on fairly standard (and easily implementable) red and red-green mesh refinement algorithms. In this third part of the trilogy, we describe in detail the implementation of these types of algorithms, including detailed discussions of the data structures and traversal algorithms we employ for obtaining optimal storage and computational complexity in our implementations. We show how each of the algorithms can be implemented using standard data types, available in languages such as C and FORTRAN, so that the resulting algorithms have optimal (linear) storage requirements, and so that the resulting multilevel method or preconditioner can be applied with optimal (linear) computational costs. We have successfully used these data structure ideas for both MATLAB and C implementations using the FEtk, an open source finite element software package. We finish the paper with a sequence of numerical experiments illustrating the effectiveness of a number of BPX and stabilized HB variants for several examples requiring local refinement

    Parallel load balancing strategy for Volume-of-Fluid methods on 3-D unstructured meshes

    Get PDF
    © 2016. This version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/l Volume-of-Fluid (VOF) is one of the methods of choice to reproduce the interface motion in the simulation of multi-fluid flows. One of its main strengths is its accuracy in capturing sharp interface geometries, although requiring for it a number of geometric calculations. Under these circumstances, achieving parallel performance on current supercomputers is a must. The main obstacle for the parallelization is that the computing costs are concentrated only in the discrete elements that lie on the interface between fluids. Consequently, if the interface is not homogeneously distributed throughout the domain, standard domain decomposition (DD) strategies lead to imbalanced workload distributions. In this paper, we present a new parallelization strategy for general unstructured VOF solvers, based on a dynamic load balancing process complementary to the underlying DD. Its parallel efficiency has been analyzed and compared to the DD one using up to 1024 CPU-cores on an Intel SandyBridge based supercomputer. The results obtained on the solution of several artificially generated test cases show a speedup of up to similar to 12x with respect to the standard DD, depending on the interface size, the initial distribution and the number of parallel processes engaged. Moreover, the new parallelization strategy presented is of general purpose, therefore, it could be used to parallelize any VOF solver without requiring changes on the coupled flow solver. Finally, note that although designed for the VOF method, our approach could be easily adapted to other interface-capturing methods, such as the Level-Set, which may present similar workload imbalances. (C) 2014 Elsevier Inc. Allrights reserved.Peer ReviewedPostprint (author's final draft

    Matlab parallel codes for 3D slope stability benchmarks

    Get PDF
    This contribution is focused on a description of implementation details for solver related to the slope stability benchmarks in 3D. Such problems are formulated by the standard elastoplastic models containing the Mohr-Coulomb yield criterion and by the limit analysis of collapse states. The implicit Euler method and higher order finite elements are used for discretization. The discretized problem is solved by non-smooth Newton-like methods in combination with incremental methods of limit load analysis. In this standard approach, we propose several innovative techniques. Firstly, we use recently developed sub-differential based constitutive solution schemes. Such an approach is suitable for non-smooth yield criteria, and leads better return-mapping algorithms. For example, a priori decision criteria for each return-type or simplified construction of consistent tangent operators are applied. The parallel codes are developed in MATLAB using Parallel Computing Toolbox. For parallel implementation of linear systems, we use the TFETI domain decomposition method. It is a non-overlapping method where the Lagrange multipliers are used to enforce continuity on the subdomain interfaces and satisfaction of the Dirichlet boundary conditions

    A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing

    Get PDF
    This work introduces an innovative parallel, fully-distributed finite element framework for growing geometries and its application to metal additive manufacturing. It is well-known that virtual part design and qualification in additive manufacturing requires highly-accurate multiscale and multiphysics analyses. Only high performance computing tools are able to handle such complexity in time frames compatible with time-to-market. However, efficiency, without loss of accuracy, has rarely held the centre stage in the numerical community. Here, in contrast, the framework is designed to adequately exploit the resources of high-end distributed-memory machines. It is grounded on three building blocks: (1) Hierarchical adaptive mesh refinement with octree-based meshes; (2) a parallel strategy to model the growth of the geometry; (3) state-of-the-art parallel iterative linear solvers. Computational experiments consider the heat transfer analysis at the part scale of the printing process by powder-bed technologies. After verification against a 3D benchmark, a strong-scaling analysis assesses performance and identifies major sources of parallel overhead. A third numerical example examines the efficiency and robustness of (2) in a curved 3D shape. Unprecedented parallelism and scalability were achieved in this work. Hence, this framework contributes to take on higher complexity and/or accuracy, not only of part-scale simulations of metal or polymer additive manufacturing, but also in welding, sedimentation, atherosclerosis, or any other physical problem where the physical domain of interest grows in time
    corecore