242 research outputs found

    A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization

    We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by matrices of low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, e.g., finite element methods and boundary element methods. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm that we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm and also present the parallelization of structured matrix-vector product, structured factorization and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. This work is part of a more global effort, the STRUMPACK (STRUctured Matrices PACKage) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step in the direction of a distributed-memory sparse solver.
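
    The randomized compression idea in this abstract can be illustrated on a single off-diagonal block. The sketch below is not STRUMPACK's adaptive mechanism: it is a generic randomized range finder that simply adds Gaussian samples in batches until the block is captured to a tolerance. The function name, the batch size `step`, and the stopping test are assumptions made for illustration, and the residual is computed from the full block for simplicity, whereas a practical scheme would estimate the error from the samples themselves.

```python
import numpy as np


def adaptive_randomized_lowrank(block, tol=1e-8, step=8, seed=0):
    """Return (q, coeffs) with block ~= q @ coeffs and q having orthonormal columns.

    Hypothetical helper: samples the column space of `block` with random
    Gaussian vectors, `step` at a time, until the relative residual is
    below `tol` or the basis cannot grow any further.
    """
    rng = np.random.default_rng(seed)
    m, n = block.shape
    norm_block = np.linalg.norm(block)
    samples = np.empty((m, 0))
    while True:
        # Draw a new batch of random vectors and sample the block with them.
        omega = rng.standard_normal((n, step))
        samples = np.hstack([samples, block @ omega])
        # Orthonormalize all samples collected so far.
        q, _ = np.linalg.qr(samples)
        coeffs = q.T @ block
        # Exact residual (a simplification; see the note above).
        residual = np.linalg.norm(block - q @ coeffs)
        if residual <= tol * norm_block or q.shape[1] >= min(m, n):
            return q, coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # A 50 x 40 block of exact rank 5 as a stand-in for an off-diagonal block.
    block = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 40))
    q, coeffs = adaptive_randomized_lowrank(block)
    print("basis size:", q.shape[1])
    print("relative error:", np.linalg.norm(block - q @ coeffs) / np.linalg.norm(block))
```

    Growing the sample count in batches means the rank does not have to be known in advance, which is the general motivation for adaptive sampling.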

    Parallel computation of 3-D soil-structure interaction in time domain with a coupled FEM/SBFEM approach

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10915-011-9551-x. This paper introduces a parallel algorithm for the scaled boundary finite element method (SBFEM). The application code is designed to run on clusters of computers, and it enables the analysis of large-scale soil-structure-interaction problems, where an unbounded domain has to fulfill the radiation condition for wave propagation to infinity. The main focus of the paper is on the mathematical description and numerical implementation of the SBFEM. In particular, we describe in detail the algorithm to compute the acceleration unit impulse response matrices used in the SBFEM as well as the solvers for the Riccati and Lyapunov equations. Finally, two test cases validate the new code, illustrating the numerical accuracy of the results and the parallel performance. © Springer Science+Business Media, LLC 2011. Jose E. Roman and Enrique S. Quintana-Orti were partially supported by the Spanish Ministerio de Ciencia e Innovacion under grants TIN2009-07519 and TIN2008-06570-C04-01, respectively. Schauer, M.; Román Moltó, JE.; Quintana Orti, ES.; Langer, S. (2012). Parallel computation of 3-D soil-structure interaction in time domain with a coupled FEM/SBFEM approach. Journal of Scientific Computing. 52(2):446-467. doi:10.1007/s10915-011-9551-x

    Implementation of an efficient coupled fem-sbfem approach for soil-structure-interaction analysis

    Buildings are grounded in the surrounding soil, so that soil and structure interact with each other. Consequently, vibrations induced in the soil are transmitted to the structures. Neighbouring buildings and structures also interact with each other, as they are connected by the soil. Nowadays numerical simulation of soil-structure interaction is of great interest and is applied to very different problems. These include, for example, the construction of reliable earthquake-resistant structures in seismically active areas, and the increase of comfort in buildings by decoupling them from surrounding emissions such as vibrations induced by traffic or machine foundations. This work shows that the simulation of soil-structure interaction taking unbounded domains into account, in a way that fulfils the Sommerfeld radiation condition exactly, is possible not only for academic examples but for large-scale real-life problems as well. To this end, two numerical methods were coupled to create an efficient coupled method, which can be used to simulate soil-structure interaction in the time domain. The numerical implementation of this coupled approach is based on a combination of the finite element method [1] and the scaled boundary finite element method [2]. The finite element method is used to discretise the near-field, containing the structures and their surrounding soil. The coupled infinite half-space, the far-field, is realised by the scaled boundary finite element method. A parallel implementation of the coupling algorithms is used, since the simulation of soil-structure interaction in the time domain is very time and memory consuming [3]. Subsequently, the numerical performance of the implemented software is discussed in terms of speed-up and efficiency. Different geotechnical applications are illustrated, and the applicability of the coupled method is shown and discussed on chosen examples.

    Accelerating Atomic Orbital-based Electronic Structure Calculation via Pole Expansion and Selected Inversion

    We describe how to apply the recently developed pole expansion and selected inversion (PEXSI) technique to Kohn-Sham density functional theory (DFT) electronic structure calculations that are based on atomic orbital discretization. We give analytic expressions for evaluating the charge density, the total energy, the Helmholtz free energy and the atomic forces (including both the Hellmann-Feynman force and the Pulay force) without using the eigenvalues and eigenvectors of the Kohn-Sham Hamiltonian. We also show how to update the chemical potential without using Kohn-Sham eigenvalues. The advantage of using PEXSI is that it has a much lower computational complexity than that associated with the matrix diagonalization procedure. We demonstrate the performance gain by comparing the timing of PEXSI with that of diagonalization on insulating and metallic nanotubes. For these quasi-1D systems, the complexity of PEXSI is linear with respect to the number of atoms. This linear scaling can be observed in our computational experiments when the number of atoms in a nanotube is larger than a few hundred. Both the wall clock time and the memory requirement of PEXSI are modest. This makes it possible to perform Kohn-Sham DFT calculations for 10,000-atom nanotubes even with a sequential implementation of the selected inversion algorithm. We also perform an accurate geometry optimization calculation on a truncated (8,0) boron-nitride nanotube system containing 1024 atoms. Numerical results indicate that the use of PEXSI does not lead to loss of the accuracy required in a practical DFT calculation.

    Computing the k-th Eigenvalue of Symmetric $\mathcal{H}^2$-Matrices

    The numerical solution of eigenvalue problems is essential in various application areas of scientific and engineering domains. In many problem classes, the practical interest is only a small subset of eigenvalues so it is unnecessary to compute all of the eigenvalues. Notable examples are the electronic structure problems where the $k$-th smallest eigenvalue is closely related to the electronic properties of materials. In this paper, we consider the $k$-th eigenvalue problems of symmetric dense matrices with low-rank off-diagonal blocks. We present a linear time generalized LDL decomposition of $\mathcal{H}^2$ matrices and combine it with the bisection eigenvalue algorithm to compute the $k$-th eigenvalue with controllable accuracy. In addition, if more than one eigenvalue is required, some of the previous computations can be reused to compute the other eigenvalues in parallel. Numerical experiments show that our method is more efficient than the state-of-the-art dense eigenvalue solvers in LAPACK/ScaLAPACK and ELPA. Furthermore, tests on electronic state calculations of carbon nanomaterials demonstrate that our method outperforms the existing HSS-based bisection eigenvalue algorithm on 3D problems. Comment: 14 pages, 11 figures
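
    The bisection idea in this abstract rests on Sylvester's law of inertia: the number of negative pivots in an LDL^T factorization of A - sigma*I equals the number of eigenvalues of A below sigma, so bisecting on sigma isolates the k-th eigenvalue. The sketch below is not the paper's $\mathcal{H}^2$ algorithm. It assumes a symmetric tridiagonal matrix as a stand-in, for which the pivots follow a simple recurrence; the function names, the zero-pivot safeguard, and the tolerance are hypothetical.

```python
import math


def count_below(diag, off, sigma):
    """Count eigenvalues of a symmetric tridiagonal matrix that are < sigma.

    `diag` holds the n diagonal entries, `off` the n-1 off-diagonal entries.
    The pivots of the LDL^T factorization of (T - sigma*I) follow a
    recurrence; the number of negative pivots is the eigenvalue count.
    """
    count = 0
    pivot = 1.0
    for i, a in enumerate(diag):
        pivot = a - sigma - (off[i - 1] ** 2 / pivot if i > 0 else 0.0)
        if pivot == 0.0:
            pivot = 1e-30  # nudge an exactly zero pivot (assumed safeguard)
        if pivot < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, tol=1e-12):
    """Return the k-th smallest eigenvalue (k = 1..n) by bisection."""
    n = len(diag)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [(abs(off[i - 1]) if i > 0 else 0.0)
              + (abs(off[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(d - r for d, r in zip(diag, radius))
    hi = max(d + r for d, r in zip(diag, radius))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= k:
            hi = mid  # at least k eigenvalues lie below mid
        else:
            lo = mid  # fewer than k eigenvalues lie below mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n = 10
    # The matrix tridiag(-1, 2, -1) has eigenvalues 2 - 2*cos(j*pi/(n+1)).
    computed = kth_eigenvalue([2.0] * n, [-1.0] * (n - 1), k=3)
    reference = 2.0 - 2.0 * math.cos(3 * math.pi / (n + 1))
    print(computed, reference)
```

    Each bisection step needs only one factorization, and steps for different values of k are independent, which is consistent with the abstract's remark that several eigenvalues can be computed in parallel.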

    BCYCLIC: A parallel block tridiagonal matrix cyclic solver

    13 pages, 6 figures. A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as is its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited. This research has been sponsored by the US Department of Energy under Contract DE-AC05-00OR22725 with UT-Battelle, LLC. This research used resources of the National Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the Department of Energy under Contract DE-AC05-00OR22725.
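
    Cyclic reduction, as named in this abstract, can be sketched in its scalar form. The code below is not BCYCLIC: it treats an ordinary tridiagonal system (1x1 blocks) in serial, eliminating the odd-numbered unknowns, recursing on the even-numbered ones, and then back-substituting. The function name and the array layout are assumptions for illustration, and no pivoting is done, so a diagonally dominant matrix is assumed.

```python
def cyclic_reduction_solve(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i], with
    a[0] == 0 and c[n-1] == 0. No pivoting is performed, so the matrix
    is assumed to be diagonally dominant.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: fold the neighbouring odd rows into each even row. Every
    # even row is updated independently, which is the source of parallelism.
    ra, rb, rc, rd = [], [], [], []
    for i in range(0, n, 2):
        new_a, new_b, new_c, new_d = 0.0, b[i], 0.0, d[i]
        if i > 0:  # eliminate x[i-1] using the odd row above
            alpha = a[i] / b[i - 1]
            new_a = -alpha * a[i - 1]
            new_b -= alpha * c[i - 1]
            new_d -= alpha * d[i - 1]
        if i + 1 < n:  # eliminate x[i+1] using the odd row below
            gamma = c[i] / b[i + 1]
            new_b -= gamma * a[i + 1]
            new_c = -gamma * c[i + 1]
            new_d -= gamma * d[i + 1]
        ra.append(new_a)
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    # The even unknowns satisfy a tridiagonal system of about half the size.
    even = cyclic_reduction_solve(ra, rb, rc, rd)

    # Back-substitution: recover each odd unknown from its own row.
    x = [0.0] * n
    x[0::2] = even
    for i in range(1, n, 2):
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - a[i] * x[i - 1] - right) / b[i]
    return x


if __name__ == "__main__":
    n = 7  # deliberately not a power of two
    a = [0.0] + [-1.0] * (n - 1)
    b = [4.0] * n
    c = [-1.0] * (n - 1) + [0.0]
    d = [1.0] * n
    print(cyclic_reduction_solve(a, b, c, d))
```

    Because the recursion simply halves whatever row count it is given, the number of rows does not need to be a power of two. In a block version the scalar divisions would become solves with the diagonal blocks.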

    Parallel application on high performance computing platforms of 3D BEM/FEM based coupling model for dynamic analysis of SSI problems

    The implementation of an improved parallel computation algorithm in a coupled model based on the Finite Element and Boundary Element Methods for the analysis of three-dimensional Soil-Structure Interaction (SSI) problems on High-Performance Computing (HPC) platforms is presented. The model and the parallel computation algorithm are developed for the linear analysis of large-scale three-dimensional SSI problems. The Finite Element Method is used for modeling the finite region and the structure, and the Boundary Element Method is used for modeling the soil extending to infinity. The parallelization of the model is performed by the calculation of the impedance coefficients on the interaction nodes between the near- and the far-fields. The performance of the parallel computation algorithm is reported as elapsed time as a function of the number of processors. The efficiency of the proposed parallel algorithm of the coupled model is validated with one numerical example that confirms the consistent accuracy and applicability of the parallel algorithm, with considerable time savings for large-scale problems.

    A Shift Selection Strategy for Parallel Shift-invert Spectrum Slicing in Symmetric Self-consistent Eigenvalue Computation

    © 2020 ACM. The central importance of large-scale eigenvalue problems in scientific computation necessitates the development of massively parallel algorithms for their solution. Recent advances in dense numerical linear algebra have enabled the routine treatment of eigenvalue problems with dimensions on the order of hundreds of thousands on the world's largest supercomputers. In cases where dense treatments are not feasible, Krylov subspace methods offer an attractive alternative because they do not require storage of the problem matrices. However, demonstration of scalability of either of these classes of eigenvalue algorithms on computing architectures capable of expressing massive parallelism is non-trivial due to communication requirements and serial bottlenecks, respectively. In this work, we introduce the SISLICE method: a parallel shift-invert algorithm for the solution of the symmetric self-consistent field (SCF) eigenvalue problem. The SISLICE method drastically reduces the communication requirement of current parallel shift-invert eigenvalue algorithms through various shift selection and migration techniques based on density of states estimation and k-means clustering, respectively. This work demonstrates the robustness and parallel performance of the SISLICE method on a representative set of SCF eigenvalue problems and outlines research directions that will be explored in future work.

    Parallel sparse direct solvers for Poisson's equation in streamer discharges

    The aim of this paper is to examine whether a hybrid approach to parallel computing, a combination of the message passing model (MPI) with the threads model (OpenMP), can deliver good performance in streamer discharge simulations. Since one of the bottlenecks of almost all streamer models is the solution of Poisson's equation, we focused on several direct solvers that can solve large sparse systems in parallel. Our basic thought was to concentrate on 'easy to get' performance improvements, that is, improvements obtained without rewriting the code. We have investigated PARDISO, a shared memory solver, and CLUSTER SPARSE SOLVER and MUMPS, which can both apply hybrid parallelism; the latter two solvers can be called from a single core and do not require minor awareness of MPI. We show their performance for solving two- and three-dimensional Poisson's equations on the Dutch national supercomputer, Cartesius. A runtime study of a code developed for streamer propagation near a dielectric rod is included. We discuss various issues that appear to be critical in a mixed MPI-OpenMP environment.