205 research outputs found

Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) in hypre and PETSc

Author: Argentati M.E.
Knyazev A.V.
Lashuk I.
Ovtchinnikov E.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 17/05/2007

We describe our software package Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) publicly released recently. BLOPEX is available as a stand-alone serial library, as an external package to PETSc ("Portable, Extensible Toolkit for Scientific Computation", a general purpose suite of tools for the scalable solution of partial differential equations and related problems developed by Argonne National Laboratory), and is also built into hypre ("High Performance Preconditioners", a scalable linear solvers package developed by Lawrence Livermore National Laboratory). The present BLOPEX release includes only one solver, the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method for symmetric eigenvalue problems. hypre provides users with advanced high-quality parallel preconditioners for linear systems, in particular, with domain decomposition and multigrid preconditioners. With BLOPEX, the same preconditioners can now be efficiently used for symmetric eigenvalue problems. PETSc facilitates the integration of independently developed application modules with strict attention to component interoperability, and makes BLOPEX extremely easy to compile and use with preconditioners that are available via PETSc. We present the LOBPCG algorithm in BLOPEX for hypre and PETSc. We demonstrate numerically the scalability of BLOPEX by testing it on a number of distributed and shared memory parallel systems, including a Beowulf system, SUN Fire 880, an AMD dual-core Opteron workstation, and IBM BlueGene/L supercomputer, using PETSc domain decomposition and hypre multigrid preconditioning. We test BLOPEX on a model problem, the standard 7-point finite-difference approximation of the 3-D Laplacian, with the problem size in the range 10^5-10^8.

Comment: Submitted to SIAM Journal on Scientific Computing
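
To make the abstract above concrete, here is a minimal NumPy sketch of a LOBPCG-style iteration applied to a tiny dense version of the 7-point 3-D Laplacian model problem. It is not the BLOPEX, hypre, or PETSc code: the function names `laplacian_3d` and `lobpcg_sketch`, the dense storage, the 5 x 5 x 5 grid, and the optional `precond` callable are all assumptions made for illustration, and the basis is re-orthogonalized with a plain QR step rather than the safeguards a production solver would use.

```python
import numpy as np

def laplacian_3d(n):
    """Dense 7-point finite-difference Laplacian on an n x n x n grid (Dirichlet)."""
    T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    I = np.eye(n)
    return (np.kron(np.kron(T, I), I)
            + np.kron(np.kron(I, T), I)
            + np.kron(np.kron(I, I), T))

def lobpcg_sketch(A, X0, precond=None, iters=100):
    """Simplified LOBPCG-style iteration for the smallest eigenpairs of symmetric A.

    precond is a hypothetical callable applying a preconditioner to the
    residual block; None means no preconditioning.
    """
    k = X0.shape[1]
    X, _ = np.linalg.qr(X0)                   # orthonormal starting block
    P = None                                  # "conjugate direction" block
    for _ in range(iters):
        AX = A @ X
        lam = np.sum(X * AX, axis=0)          # Rayleigh quotients of the columns
        R = AX - X * lam                      # block residual
        W = R if precond is None else precond(R)
        blocks = [X, W] if P is None else [X, W, P]
        Q, _ = np.linalg.qr(np.hstack(blocks))  # orthonormal basis of span[X, W, P]
        theta, C = np.linalg.eigh(Q.T @ A @ Q)  # Rayleigh-Ritz on the trial subspace
        X_new = Q @ C[:, :k]                  # Ritz vectors for the k smallest values
        P = X_new - X @ (X.T @ X_new)         # part of the update outside span(X)
        X = X_new
    return theta[:k], X

A = laplacian_3d(5)                           # 125 unknowns, far below 10^5-10^8
rng = np.random.default_rng(0)
vals, vecs = lobpcg_sketch(A, rng.standard_normal((A.shape[0], 4)))
ref = np.linalg.eigvalsh(A)[:4]               # dense reference eigenvalues
```

In this sketch the preconditioner enters only through the line that forms `W`, which is the place where a multigrid or domain decomposition preconditioner for the linear system would be reused.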

arXiv.org e-Print Archive

WestminsterResearch

JupiterNCSM: A Pantheon of Nuclear Physics —an implementation of three-nucleon forces in the no-core shell model—

Author: Djärv Tor
Publication date: 01/01/2021

It is well established that three-nucleon forces (3NFs) are necessary for achieving realistic and accurate descriptions of atomic nuclei. In particular, such forces arise naturally when using chiral effective field theories (χEFT). However, due to the huge computational complexity associated with the inclusion of 3NFs in many-body methods, they are often approximated or neglected completely. In this thesis, three different methods to include the physics of 3NFs in the ab initio no-core shell model (NCSM) have been implemented and tested. In the first method, we approximate the 3NFs as effective two-body operators by exploiting Wick's theorem to normal order the 3NF relative to a harmonic-oscillator Slater determinant reference state and discarding the remaining three-body term. We explored the performance of this single-reference normal-ordered two-body approximation on the ground-state energies of the two smallest closed-core nuclei, 4He and 16O, in particular focusing on consequences of the breaking of translational symmetry. The second approach is a full implementation of 3NFs in a new NCSM code, named JupiterNCSM, that we provide as open-source research software. We have validated and benchmarked JupiterNCSM against other codes, and we have specifically used it to investigate the effects of different 3NFs on the light p-shell nuclei 6He and 6Li. Finally, we implement the eigenvector continuation (EVC) method to emulate the response of ground-state energies of the aforementioned A = 6 nuclei to variations in the low-energy constants of χEFT that parametrize the 3NFs. In this approach, the full Hamiltonian is projected onto a small subspace that is constructed from a few selected eigenvectors. These training vectors are computed with JupiterNCSM in a large model space for a small set of parameter values. This thesis provides the first EVC-based emulation of nuclei computed with a Slater-determinant basis.
After the training phase, we find that EVC predictions offer very high accuracy and more than seven orders of magnitude computational speedup. As a result, we are able to perform rigorous statistical inferences to explore the effects of 3NFs in nuclear many-body systems.
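
The eigenvector continuation step described above can be sketched on a toy problem. The example below is an illustration only, not JupiterNCSM: the matrices `H0` and `H1`, the single parameter `c` standing in for a low-energy constant, the training values, and all function names are assumptions. The training vectors are orthonormalized with QR so that a standard (rather than generalized) eigenproblem can be solved in the small subspace.

```python
import numpy as np

def ground_state(H):
    """Lowest eigenvalue and eigenvector of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(H)
    return vals[0], vecs[:, 0]

# Toy parametric Hamiltonian H(c) = H0 + c * H1 (hypothetical stand-in).
rng = np.random.default_rng(1)
n = 100
H0 = np.diag(np.arange(n, dtype=float))
G = rng.standard_normal((n, n))
H1 = (G + G.T) / (2.0 * n)

def hamiltonian(c):
    return H0 + c * H1

# Training phase: exact ground states at a few parameter values.
train_c = [0.0, 0.3, 0.7, 1.0]
snapshots = np.column_stack([ground_state(hamiltonian(c))[1] for c in train_c])
basis, _ = np.linalg.qr(snapshots)            # orthonormal basis of the training span

# Project each term once; the emulator then only touches 4 x 4 matrices.
H0_small = basis.T @ H0 @ basis
H1_small = basis.T @ H1 @ basis

def emulate(c):
    """Emulated ground-state energy from the projected Hamiltonian."""
    return np.linalg.eigvalsh(H0_small + c * H1_small)[0]

c_target = 0.5                                # not among the training values
e_exact, _ = ground_state(hamiltonian(c_target))
e_emul = emulate(c_target)
```

Because the emulator is a Rayleigh-Ritz projection, `e_emul` is an upper bound on `e_exact` up to rounding, and the speedup comes from replacing the full diagonalization by one in the subspace dimension.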

Chalmers Research

Dense and sparse parallel linear algebra algorithms on graphics processing units

Author: Lamas Daviña Alejandro
Publication venue: 'Universitat Politecnica de Valencia'
Publication date: 13/11/2018

One line of development followed in the field of supercomputing is the use of special-purpose processors to speed up certain types of computations. In this thesis we study the use of graphics processing units as computing accelerators and apply it to the field of linear algebra. In particular, we work with the SLEPc library to solve large-scale eigenvalue problems, and to apply matrix functions in scientific applications. SLEPc is a parallel library based on the MPI standard and is developed with the premise of being scalable, i.e. of allowing larger problems to be solved by increasing the number of processing units. We address the linear eigenvalue problem, Ax = lambda x in its standard form, using iterative techniques, in particular Krylov methods, with which we calculate a small portion of the eigenvalue spectrum. This type of algorithm is based on generating a subspace of reduced size (m) onto which to project the large-dimension problem (n), with m << n. Once the problem has been projected, it is solved by direct methods, which provide us with approximations of the eigenvalues of the initial problem we wanted to solve. The operations used in the expansion of the subspace vary depending on whether the desired eigenvalues are from the exterior or from the interior of the spectrum. When searching for exterior eigenvalues, the expansion is done by matrix-vector multiplications. We do this on the GPU, either by using libraries or by creating functions that take advantage of the structure of the matrix. For eigenvalues from the interior of the spectrum, the expansion requires solving linear systems of equations. In this thesis we implemented several algorithms to solve linear systems of equations for the specific case of matrices with a block-tridiagonal structure, which are run on the GPU. In the computation of matrix functions we have to distinguish between the direct application of a matrix function, f(A), and the action of a matrix function on a vector, f(A)b. The first case involves a dense computation that limits the size of the problem. The second allows us to work with large sparse matrices, and to solve it we also make use of Krylov methods. The expansion of the subspace is done by matrix-vector multiplication, and we use GPUs in the same way as when solving eigenvalue problems. In this case the projected problem starts with size m, but it is increased by m on each restart of the method. The projected problem is solved by directly applying a matrix function. We have implemented several algorithms to compute the square root and the exponential matrix functions, in which the use of GPUs allows us to speed up the computation.

Lamas Daviña, A. (2018). Dense and sparse parallel linear algebra algorithms on graphics processing units [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/112425
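
The projection idea behind f(A)b can be sketched as follows. This is a CPU-only NumPy illustration, not SLEPc code and not the GPU implementation from the thesis: the names `arnoldi` and `expm_action_sym`, the test matrix, and the subspace size m = 30 are assumptions, there is no restarting, and A is assumed symmetric so that the small projected function can be evaluated through an eigendecomposition.

```python
import numpy as np

def arnoldi(A, b, m):
    """Build an orthonormal Krylov basis V and projected matrix H (size <= m)."""
    n = b.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    for j in range(m):
        w = A @ V[:, j]                       # subspace expansion: one mat-vec
        for i in range(j + 1):                # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:               # invariant subspace found, stop early
            return V[:, :j + 1], H[:j + 1, :j + 1], beta
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m], beta

def expm_action_sym(A, b, m):
    """Approximate exp(A) b for symmetric A as beta * V exp(H) e1."""
    V, H, beta = arnoldi(A, b, m)
    theta, S = np.linalg.eigh(0.5 * (H + H.T))  # H is symmetric up to rounding
    f_e1 = S @ (np.exp(theta) * S[0, :])        # exp(H) applied to e1
    return beta * (V @ f_e1)

n, m = 200, 30                                # m << n
A = -(2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))  # negative 1-D Laplacian
rng = np.random.default_rng(0)
b = rng.standard_normal(n)
approx = expm_action_sym(A, b, m)

lam, Q = np.linalg.eigh(A)                    # dense reference, feasible only for small n
exact = Q @ (np.exp(lam) * (Q.T @ b))
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

The only operation involving the large matrix is `A @ V[:, j]`, which is the matrix-vector product the abstract describes offloading to the GPU; the dense function evaluation happens on the small m x m matrix.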

RiuNet

Electron Transport on the Nanoscale

Author: He Shenglai
Publication venue: VANDERBILT

The Eigenvalue Spectrum of the Fermion Matrix in Lattice Higgs Systems

Author: Henty David
Publication venue: ProQuest Dissertations & Theses
Publication date: 01/01/1990

In the first part of this thesis we consider the performance of various block algorithms for the inversion of large sparse matrices. By computing the eigenvalue spectra of the matrices under consideration we are able to directly relate the performance of the algorithms to the difficulty of the calculation. We find that the block Lanczos algorithm is superior to all others considered for the inversion of the Kogut-Susskind fermion matrix. Furthermore, we investigate the performance of the block Lanczos algorithm on matrices constructed to have specific eigenvalue spectra. From this study we are able to make quantitative predictive statements about the number of iterations that the algorithm will take to converge, given the form of the eigenvalue spectrum of the matrix whose inversion is attempted. The rest of this thesis is concerned with lattice Higgs systems. Specifically, we study a model where staggered fermions are coupled to Ising spins via an on-site Yukawa term with coupling constant y. This is a very simple model that seems to embody most of the relevant phenomena observed in more complicated systems. Most importantly, there are two symmetric regions, PM1 and PM2, where the renormalised fermion mass mf is non-zero for large y in the PM2 region despite the scalar field having zero expectation value. We study the model in the quenched approximation and, by examining the distribution of the eigenvalues of the fermion matrix M in the complex plane, we qualitatively explain the features of the model as being due to the transition of eigenvalues from the imaginary to the real axis via the origin as y is increased. An approximate method for calculating mf from the value of a fermion condensate is developed, and we reproduce the values for mf obtained by other authors who calculate it using the standard method involving the fermion propagator. However, our method has the advantage that it is applicable on very small volumes where the propagator definition breaks down.
We investigate the behaviour in the quenched infinite volume limit by evaluating the low lying eigenvalues of the matrix A+M. We show that the small eigenvalues observed in the spectrum of M at intermediate y on finite lattices imply that there is a finite density of zero modes in the infinite volume limit. By performing dynamical simulations on a small lattice we determine the phase diagram of the model and demonstrate the validity of mean field calculations of the phase boundaries. From calculations of mf we identify the PM1 and PM2 phases. It is shown that the inclusion of fermion dynamics eliminates the small eigenvalues of M present in the quenched model, and as y is increased the eigenvalues now transfer from the real to the imaginary axis via a path avoiding the origin. It is only by using the block Lanczos algorithm that simulations in certain regions of the phase plane are feasible, and only by our method of considering a fermion condensate can we calculate mf on such a small volume.
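
The link between a matrix's eigenvalue spectrum and the iteration count of a Krylov solver can be illustrated with a small experiment. The sketch below swaps in plain conjugate gradients for the block Lanczos algorithm studied in the thesis, and uses symmetric positive definite diagonal test matrices rather than a fermion matrix; the function name, the two spectra, and the tolerance are assumptions chosen for illustration.

```python
import numpy as np

def cg_iterations(A, b, tol=1e-10, max_iter=5000):
    """Run conjugate gradients on A x = b and return the iterations used."""
    x = np.zeros_like(b)
    r = b.copy()                              # residual for the zero initial guess
    p = r.copy()
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:   # relative residual test
            return it
        p = r + (rs_new / rs) * p
        rs = rs_new
    return max_iter

rng = np.random.default_rng(0)
n = 200
b = rng.standard_normal(n)

# Spectrum with only five distinct eigenvalues, each repeated n // 5 times.
clustered = np.diag(np.repeat([1.0, 2.0, 5.0, 10.0, 50.0], n // 5))
# Spectrum spread evenly over three orders of magnitude.
spread = np.diag(np.linspace(1.0, 1000.0, n))

iters_clustered = cg_iterations(clustered, b)
iters_spread = cg_iterations(spread, b)
```

In exact arithmetic conjugate gradients terminates in at most as many steps as there are distinct eigenvalues, so the clustered spectrum needs only a handful of iterations while the spread spectrum needs many more. That is the kind of spectrum-based prediction the abstract refers to, here shown for a simpler solver.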

Glasgow Theses Service

Development of explicitly correlated and many-body diagrammatic techniques for the investigation of electron-hole correlation in nanomaterials

Author: Bayne Michael Gray
Publication venue: SURFACE at Syracuse University
Publication date: 21/12/2018

The focus of this work is to develop theoretical methods that will accurately describe electron-electron and electron-hole correlation in nanoparticles using many-body diagrammatic techniques. Diagrammatic representation is a more complex representation of quantum mechanics; however, it becomes a more advantageous representation in its application to this work due to its ease of use. Diagrammatic techniques are essential to the five methods presented here, as they prove to be pivotal in theoretical development as well as providing useful information in extracting and visualizing fundamental physics to make useful approximations to the methods. In the projected congruent transformed Hamiltonian method with partial infinite order summation of diagrams (PCTH-PIOS), a diagrammatic summation approach was used. In the geminal projected configuration interaction (GPCI) method, diagrammatic factorization techniques were used. In the geminal screened electron-hole interaction kernel (GSIK) method, we conclude that only linked diagrams contribute to the exciton binding energy. The approximation is made to include only first order diagrams, which capture the essential physics of the electron-hole interaction. In the composite control-variate stratified sampling (CCSS) method, the vertices of the diagrams are calculated using stratified sampling. Lastly, we investigate the effect of an electromagnetic (EM) field on the generation of 2e-2h states from 1e-1h states. In this work, time independent diagrams are calculated once and used for the rest of the calculation. Diagrammatic techniques are essential to the theoretical development of the methods in this work for understanding the optical and electronic properties of nanoparticles.

Syracuse University Research Facility and Collaborative Environment

NASA/American Society for Engineering Education (ASEE) Summer Faculty Fellowship Program, 1991

Author: Tiwari Surendra N.

In a series of collaborations between NASA research and development centers and nearby universities, engineering faculty members spent 10 weeks working with professional peers on research. The Summer Faculty Program Committee of the American Society for Engineering Education supervises the programs. The objectives were the following: (1) to further the professional knowledge of qualified engineering and science faculty members; (2) to stimulate and exchange ideas between participants and NASA; (3) to enrich and refresh the research and teaching activities of the participants' institutions; and (4) to contribute to the research objectives of the NASA center.

NASA Technical Reports Server

Modeling EMI Resulting from a Signal Via Transition Through Power/Ground Layers

Author: Archambeault Bruce
Cui Wei
Drewniak James L.
Li Min
White Doug
Ye Xiaoning
Publication venue: Scholars' Mine
Publication date: 01/03/2000

Signals transitioning between layers through vias are very common in multi-layer printed circuit board (PCB) design. For a signal via transitioning through the internal power and ground planes, the return current must switch from one reference plane to another reference plane. The discontinuity of the return current at the via excites the power and ground planes, and results in noise on the power bus that can lead to signal integrity as well as EMI problems. Numerical methods, such as the finite-difference time-domain (FDTD) method, the method of moments (MoM), and the partial element equivalent circuit (PEEC) method, were employed herein to study this problem. The modeled results are supported by measurements. In addition, a common EMI mitigation approach of adding a decoupling capacitor was investigated with the FDTD method.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Applications

Publication venue: Walter de Gruyter GmbH
Publication date: 07/12/2020

Pure OAI Repository

A priori convergence analysis for Krylov subspace eigensolvers

Author: Zhou Ming (gnd: 1024105881)
Publication venue: Universität Rostock Rostock

This thesis contributes to the convergence theory of Krylov subspace eigensolvers for discretized self-adjoint elliptic differential operators. A central topic is a priori convergence estimates with weak assumptions and concise bounds, which can reasonably predict the convergence rate, in particular for clustered eigenvalues. By avoiding the dependence on current approximate eigenvalues, such estimates significantly improve certain state-of-the-art estimates with regard to their sharpness and applicability.

Rostocker Dokumentenserver