6 research outputs found

    Orbital-enriched Flat-top Partition of Unity Method for the Schrödinger Eigenproblem

    Quantum mechanical calculations require the repeated solution of a Schrödinger equation for the wavefunctions of the system. Recent work has shown that enriched finite element methods significantly reduce the degrees of freedom required to obtain accurate solutions. However, time to solution has been adversely affected by the need to solve a generalized eigenvalue problem and by the ill-conditioning of the associated system matrices. In this work, we address both issues by proposing a stable and efficient orbital-enriched partition-of-unity method (PUM) to solve the Schrödinger boundary-value problem in a parallelepiped unit cell subject to Bloch-periodic boundary conditions. In our proposed PUM, the three-dimensional domain is covered by overlapping patches, each associated with a compactly supported, non-negative weight function that is identically equal to unity over some finite subset of its support. This so-called flat-top property provides a pathway to devise a stable approximation over the whole domain. On each patch, we use $p$-th degree orthogonal polynomials that ensure $p$-th order completeness, and in addition include eigenfunctions of the radial solution of the Schrödinger equation. Furthermore, we adopt a variational lumping approach to construct a block-diagonal overlap matrix that yields a standard eigenvalue problem, and we demonstrate the accuracy, stability and efficiency of the method. Comment: 24 pages, 12 figures
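
    The flat-top property described in this abstract can be illustrated with a minimal one-dimensional sketch. Everything below is an assumption made for illustration and is not taken from the paper: the quadratic weight, the four non-periodic patches on [0, 1], the Shepard normalization, and the names `patch_weight` and `flat_top_pu`. The paper itself works in three dimensions with Bloch-periodic boundary conditions.

```python
import numpy as np

def patch_weight(x, center, half_width):
    """Compactly supported, non-negative weight (illustrative choice):
    positive inside the patch, exactly zero outside."""
    t = np.abs(np.asarray(x, dtype=float) - center) / half_width
    return np.where(t < 1.0, (1.0 - t) ** 2, 0.0)

def flat_top_pu(x, centers, half_width):
    """Shepard normalization: phi_i = W_i / sum_j W_j.
    Where only patch i is active, phi_i equals 1 (the flat top)."""
    weights = np.array([patch_weight(x, c, half_width) for c in centers])
    return weights / weights.sum(axis=0)

# Four patches of half-width 0.15 whose centers are 0.25 apart,
# so neighbouring patches overlap only on strips of width 0.05.
centers = np.array([0.125, 0.375, 0.625, 0.875])
x = np.linspace(0.0, 1.0, 201)
phi = flat_top_pu(x, centers, half_width=0.15)

# The functions sum to one everywhere (partition of unity).
print(np.allclose(phi.sum(axis=0), 1.0))
# Away from the overlaps, the second function is identically one.
flat_region = (x > 0.28) & (x < 0.47)
print(np.allclose(phi[1][flat_region], 1.0))
```

    Because each function is identically one on most of its patch, local basis functions multiplied by it stay unchanged there, which is the mechanism the abstract credits for a stable approximation.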

    Algorithms and data structures for matrix-free finite element operators with MPI-parallel sparse multi-vectors

    Traditional solution approaches for problems in quantum mechanics scale as $\mathcal{O}(M^3)$, where $M$ is the number of electrons. Various methods have been proposed to address this issue and obtain linear scaling, $\mathcal{O}(M)$. One promising formulation is the direct minimization of energy. Such methods take advantage of the physical localization of the solution, namely that the solution can be sought in terms of non-orthogonal orbitals with local support. In this work, a numerically efficient implementation of sparse parallel vectors within the open-source finite element library deal.II is proposed. The main algorithmic ingredient is the matrix-free evaluation of the Hamiltonian operator by cell-wise quadrature. Based on an a priori chosen support for each vector, we develop algorithms and data structures to perform (i) matrix-free sparse matrix multivector products (SpMM), (ii) the projection of an operator onto a sparse subspace (inner products), and (iii) post-multiplication of a sparse multivector with a square matrix. The node-level performance is analyzed using a roofline model. Our matrix-free implementation of finite element operators with sparse multivectors achieves a performance of 157 GFlop/s on the Intel Cascade Lake architecture. Strong and weak scaling results are reported for a typical benchmark problem using quadratic and quartic finite element bases. Comment: 29 pages, 12 figures
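
    The idea of matrix-free evaluation by cell-wise quadrature can be sketched in a few lines. This is a simplified illustration, not the deal.II implementation from the paper: it assumes a one-dimensional mesh, linear elements, a two-point Gauss rule, the model operator $-\tfrac{1}{2}\,\mathrm{d}^2/\mathrm{d}x^2 + V(x)$, and a dense multivector (the paper's sparse supports and MPI parallelism are omitted). The name `apply_hamiltonian` is hypothetical.

```python
import numpy as np

# Two-point Gauss rule on the reference cell [0, 1].
QP = np.array([0.5 - 0.5 / np.sqrt(3.0), 0.5 + 0.5 / np.sqrt(3.0)])
QW = np.array([0.5, 0.5])
PHI = np.stack([1.0 - QP, QP])   # PHI[i, q]: linear shape function i at point q
DPHI = np.array([-1.0, 1.0])     # reference-cell derivatives (constant)

def apply_hamiltonian(U, nodes, potential):
    """Return H @ U for H = -1/2 d^2/dx^2 + V without assembling H.
    Each column of U is one vector (e.g. one orbital)."""
    Y = np.zeros(U.shape)
    for c in range(len(nodes) - 1):
        h = nodes[c + 1] - nodes[c]
        u_loc = U[c:c + 2, :]              # local dofs of all vectors
        xq = nodes[c] + h * QP             # quadrature points in real space
        u_q = PHI.T @ u_loc                # values at quadrature points
        du = (DPHI @ u_loc) / h            # gradient, constant on the cell
        kinetic = 0.5 * np.outer(DPHI, du)
        weighted = (QW * h * potential(xq))[:, None] * u_q
        Y[c:c + 2, :] += kinetic + PHI @ weighted
    return Y

nodes = np.linspace(0.0, 1.0, 6)
U = np.column_stack([np.sin(np.pi * nodes), nodes ** 2])  # two test vectors
Y = apply_hamiltonian(U, nodes, lambda x: 0.5 * x ** 2)
print(Y.shape)
```

    Only the small per-cell arrays are ever formed, so memory traffic is dominated by the vectors themselves; applying the operator to several columns at once is what turns the loop into an SpMM-like kernel.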

    Convergence study of the h-adaptive PUM and the hp-adaptive FEM applied to eigenvalue problems in quantum mechanics

    In this paper, the h-adaptive partition-of-unity method and the h- and hp-adaptive finite element methods are applied to eigenvalue problems arising in quantum mechanics, namely the Schrödinger equation with Coulomb and harmonic potentials, and all-electron Kohn–Sham density functional theory. The partition-of-unity method is equipped with an a posteriori error estimator, thus enabling the implementation of error-controlled adaptive mesh refinement strategies. To that end, local interpolation error estimates are derived for the partition-of-unity method enriched with a class of exponential functions. The efficiency of the h-adaptive partition-of-unity method is compared to that of the h- and hp-adaptive finite element methods. The latter is implemented by adopting the analyticity estimate from Legendre coefficients. An extension of this approach to multiple solution vectors is proposed. Numerical results confirm the theoretically predicted convergence rates and the remarkable accuracy of the h-adaptive partition-of-unity approach. Implementation details of the partition-of-unity method related to enforcing continuity with hanging nodes are discussed.
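
    One common way to realize an analyticity estimate from Legendre coefficients is to fit the exponential decay rate of the coefficients on a cell: fast decay suggests a locally smooth solution (raise the polynomial degree), slow decay suggests limited regularity (refine the mesh). The sketch below is a one-dimensional illustration under that assumption and is not claimed to be the exact estimator used in the paper; `legendre_coefficients` and `decay_rate` are hypothetical helper names.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_coefficients(f, degree, n_quad=64):
    """Coefficients a_k of f on [-1, 1], a_k = (2k+1)/2 * int f P_k dx,
    approximated with Gauss-Legendre quadrature."""
    x, w = legendre.leggauss(n_quad)
    V = legendre.legvander(x, degree)      # V[q, k] = P_k(x_q)
    k = np.arange(degree + 1)
    return (2 * k + 1) / 2 * (V.T @ (w * f(x)))

def decay_rate(coeffs, tol=1e-12):
    """Least-squares fit of log|a_k| ~ C - sigma * k; returns sigma.
    Coefficients below tol (e.g. vanishing by symmetry) are skipped."""
    k = np.arange(len(coeffs))
    mask = np.abs(coeffs) > tol
    slope, _ = np.polyfit(k[mask], np.log(np.abs(coeffs[mask])), 1)
    return -slope

smooth = decay_rate(legendre_coefficients(np.exp, degree=8))  # analytic
rough = decay_rate(legendre_coefficients(np.abs, degree=8))   # kink at x = 0
print(smooth > rough)
```

    In an hp-adaptive loop, a cell flagged for refinement by the error estimator would be p-refined when its decay rate exceeds a chosen threshold and h-refined otherwise; the threshold is a tuning parameter.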

    Algorithms for massively parallel generic hp-adaptive finite element methods

    Efficient algorithms for the numerical solution of partial differential equations are required to solve problems on an economically viable timescale. In general, this is achieved by adapting the resolution of the discretization to the investigated problem, as well as by exploiting hardware specifications. For the latter category, parallelization plays a major role for modern multi-core and multi-node architectures, especially in the context of high-performance computing. Using finite element methods, solutions are approximated by discretizing the function space of the problem with piecewise polynomials. With hp-adaptive methods, the polynomial degrees of these basis functions may vary on locally refined meshes. We present algorithms and data structures required for generic hp-adaptive finite element software applicable to both continuous and discontinuous Galerkin methods on distributed memory systems. Both the function space and the mesh may be adapted dynamically during the solution process. We cover details concerning the unique enumeration of degrees of freedom with continuous Galerkin methods, the communication of variable-size data, and load balancing. Furthermore, we present strategies to determine the type of adaptation based on error estimation and prediction, as well as smoothness estimation via the decay rate of coefficients of Fourier and Legendre series expansions. Both refinement and coarsening are considered. A reference implementation in the open-source library deal.II is provided and applied to the Laplace problem on a domain with a reentrant corner that induces a singularity. With this example, we demonstrate the benefits of the hp-adaptive methods in terms of error convergence and show that our algorithm scales up to 49,152 MPI processes.