242 research outputs found

Structure Preserving Parallel Algorithms for Solving the Bethe-Salpeter Eigenvalue Problem

Author: Anderson
Bai
Bai
Benner
Benner
Bhatia
Bischof
Blackford
Byers
Chao Yang
Dancoff
Dhillon
Fahey
Faßbender
Felipe H. da Jornada
Golub
Granat
Grüning
Grüning
Jack Deslippe
Kressner
Lin
Luszczek
Mackey
Mackey
Marek
Mehl
Meiyue Shao
Puschnig
Rocca
Rohlfing
Salpeter
Steven G. Louie
Tamm
Ward
Ward
Willems
Xu
Xu
Zimmermann
Publication venue: 'Elsevier BV'
Publication date: 18/09/2015
Field of study

The Bethe-Salpeter eigenvalue problem is a dense structured eigenvalue problem arising from the discretized Bethe-Salpeter equation in the context of computing exciton energies and states. A computational challenge is that at least half of the eigenvalues and the associated eigenvectors are desired in practice. We establish the equivalence between Bethe-Salpeter eigenvalue problems and real Hamiltonian eigenvalue problems. Based on theoretical analysis, structure preserving algorithms for a class of Bethe-Salpeter eigenvalue problems are proposed. We also show that for this class of problems all eigenvalues obtained from the Tamm-Dancoff approximation are overestimated. In order to solve large scale problems of practical interest, we discuss parallel implementations of our algorithms targeting distributed memory systems. Several numerical examples are presented to demonstrate the efficiency and accuracy of our algorithms.
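
To make the structure in this abstract concrete, the following NumPy sketch builds a small real instance and checks two of the stated properties numerically. It assumes the block form H = [[A, B], [-B, -A]] with A and B real symmetric and both A + B and A - B positive definite; that form, the helper names `random_spd` and `bse_positive_eigs`, and the random test matrices are assumptions made for illustration. The Cholesky-based reduction is one simple way to exploit the structure and is not claimed to be the algorithm of the paper.

```python
import numpy as np

def random_spd(n, rng):
    """Hypothetical helper: a random symmetric positive definite matrix."""
    X = rng.standard_normal((n, n))
    return X @ X.T + n * np.eye(n)

def bse_positive_eigs(A, B):
    """Positive eigenvalues of H = [[A, B], [-B, -A]] (real case, assumed form).

    With M = A + B and K = A - B both positive definite, the squared
    eigenvalues of H are the eigenvalues of K M. Writing M = L L^T, the
    symmetric matrix L^T K L has the same eigenvalues as K M.
    """
    M = A + B
    K = A - B
    L = np.linalg.cholesky(M)
    S = L.T @ K @ L
    return np.sqrt(np.linalg.eigvalsh(S))

rng = np.random.default_rng(0)
n = 6
M, K = random_spd(n, rng), random_spd(n, rng)
A, B = (M + K) / 2, (M - K) / 2          # guarantees A + B and A - B are SPD
H = np.block([[A, B], [-B, -A]])

lam_struct = bse_positive_eigs(A, B)                   # structured route
lam_dense = np.sort(np.linalg.eigvals(H).real)[n:]     # generic dense solver
lam_tda = np.linalg.eigvalsh(A)                        # Tamm-Dancoff: drop B

print(np.allclose(lam_struct, lam_dense))      # both routes agree
print(np.all(lam_struct <= lam_tda + 1e-12))   # TDA values are not smaller
```

The structured route only ever calls a symmetric eigensolver on an n-by-n matrix, whereas the generic route treats the 2n-by-2n matrix H as unstructured and nonsymmetric.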

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Numerical Analysis

Author: Trefethen Lloyd N.
Publication venue: Princeton University Press
Publication date: 01/05/2006
Field of study

Acknowledgements: This article will appear in the forthcoming Princeton Companion to Mathematics, edited by Timothy Gowers with June Barrow-Green, to be published by Princeton University Press. In preparing this essay I have benefitted from the advice of many colleagues who corrected a number of errors of fact and emphasis. I have not always followed their advice, however, preferring as one friend put it, to "put my head above the parapet". So I must take full responsibility for errors and omissions here. With thanks to: Aurelio Arranz, Alexander Barnett, Carl de Boor, David Bindel, Jean-Marc Blanc, Mike Bochev, Folkmar Bornemann, Richard Brent, Martin Campbell-Kelly, Sam Clark, Tim Davis, Iain Duff, Stan Eisenstat, Don Estep, Janice Giudice, Gene Golub, Nick Gould, Tim Gowers, Anne Greenbaum, Leslie Greengard, Martin Gutknecht, Raphael Hauser, Des Higham, Nick Higham, Ilse Ipsen, Arieh Iserles, David Kincaid, Louis Komzsik, David Knezevic, Dirk Laurie, Randy LeVeque, Bill Morton, John C Nash, Michael Overton, Yoshio Oyanagi, Beresford Parlett, Linda Petzold, Bill Phillips, Mike Powell, Alex Prideaux, Siegfried Rump, Thomas Schmelzer, Thomas Sonar, Hans Stetter, Gil Strang, Endre Süli, Defeng Sun, Mike Sussman, Daniel Szyld, Garry Tee, Dmitry Vasilyev, Andy Wathen, Margaret Wright and Steve Wright.

Oxford University Research Archive

A hierarchically blocked Jacobi SVD algorithm for single and multiple graphics processing units

Author: Novaković Vedran
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 27/09/2014
Field of study

We present a hierarchically blocked one-sided Jacobi algorithm for the singular value decomposition (SVD), targeting both single and multiple graphics processing units (GPUs). The blocking structure reflects the levels of the GPU's memory hierarchy. The algorithm may outperform MAGMA's dgesvd, while retaining high relative accuracy. To this end, we developed a family of parallel pivot strategies on the GPU's shared address space, but applicable also to inter-GPU communication. Unlike common hybrid approaches, our algorithm in a single GPU setting needs a CPU for control purposes only, while utilizing the GPU's resources to the fullest extent permitted by the hardware. When required by the problem size, the algorithm, in principle, scales to an arbitrary number of GPU nodes. The scalability is demonstrated by a more than twofold speedup for sufficiently large matrices on a Tesla S2050 system with four GPUs vs. a single Fermi card. Comment: Accepted for publication in SIAM Journal on Scientific Computing.
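
For readers unfamiliar with the underlying iteration, here is a minimal serial sketch of a plain one-sided (Hestenes) Jacobi SVD in NumPy. It omits everything that is specific to the paper (hierarchical blocking, parallel pivot strategies, GPU kernels); the function name `jacobi_svd`, the tolerance, the cyclic pair ordering and the full-column-rank assumption are all choices made for this illustration only.

```python
import numpy as np

def jacobi_svd(A, tol=1e-12, max_sweeps=60):
    """One-sided Jacobi SVD sketch: returns U, sigma, V with A = U diag(sigma) V^T.

    Assumes A is real with full column rank. Columns of the working matrix
    are rotated pairwise until they are mutually orthogonal.
    """
    U = np.array(A, dtype=float)
    n = U.shape[1]
    V = np.eye(n)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = U[:, p] @ U[:, p]
                beta = U[:, q] @ U[:, q]
                gamma = U[:, p] @ U[:, q]
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal enough
                rotated = True
                # Rotation angle that zeroes the inner product of the two columns.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = np.copysign(1.0, zeta) / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to the working matrix and to V.
                for X in (U, V):
                    xp = X[:, p].copy()
                    X[:, p] = c * xp - s * X[:, q]
                    X[:, q] = s * xp + c * X[:, q]
        if not rotated:
            break
    sigma = np.linalg.norm(U, axis=0)        # singular values = column norms
    order = np.argsort(sigma)[::-1]          # sort in descending order
    sigma = sigma[order]
    return U[:, order] / sigma, sigma, V[:, order]

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
U, sigma, V = jacobi_svd(A)
print(np.allclose((U * sigma) @ V.T, A))
print(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)))
```

Because each rotation touches only two columns, disjoint column pairs can be processed independently, which is the property that parallel pivot strategies build on.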

arXiv.org e-Print Archive

CiteSeerX

A Parallel Solver for Graph Laplacians

Author: Boman Erik G.
Brannick James
Kepner Jeremy
Napov Artem
Ruge John W.
Spielman Daniel A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/07/2018
Field of study

Problems from graph drawing, spectral clustering, network flow and graph partitioning can all be expressed in terms of graph Laplacian matrices. There are a variety of practical approaches to solving these problems in serial. However, as problem sizes increase and single core speeds stagnate, parallelism is essential to solve such problems quickly. We present an unsmoothed aggregation multigrid method for solving graph Laplacians in a distributed memory setting. We introduce new parallel aggregation and low degree elimination algorithms targeted specifically at irregular degree graphs. These algorithms are expressed in terms of sparse matrix-vector products using generalized sum and product operations. This formulation is amenable to linear algebra using arbitrary distributions and allows us to operate on a 2D sparse matrix distribution, which is necessary for parallel scalability. Our solver outperforms the natural parallel extension of the current state of the art in an algorithmic comparison. We demonstrate scalability to 576 processes and graphs with up to 1.7 billion edges. Comment: PASC '18, Code: https://github.com/ligmg/ligm
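
As a small illustration of what "solving a graph Laplacian" means, the sketch below assembles the Laplacian of a toy graph and solves L x = b with plain conjugate gradients in NumPy. Conjugate gradients is swapped in here for brevity; it is not the unsmoothed aggregation multigrid method of the paper. The function names, the dense storage and the example graph are assumptions made for this illustration.

```python
import numpy as np

def graph_laplacian(n, edges):
    """Dense Laplacian L = D - A of an undirected, unweighted graph."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L

def conjugate_gradient(L, b, tol=1e-10, max_iter=1000):
    """Plain CG for a consistent system L x = b, started from x = 0.

    L is singular (constant vectors lie in its null space), so b must sum
    to zero on a connected graph; the iterates then stay orthogonal to the
    constant vector.
    """
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * np.linalg.norm(b):
            break
        Lp = L @ p
        step = rs / (p @ Lp)
        x += step * p
        r -= step * Lp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy graph: a 6-cycle with one chord between nodes 0 and 3.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
L = graph_laplacian(6, edges)
b = np.zeros(6)
b[0], b[3] = 1.0, -1.0          # unit flow injected at node 0, removed at node 3
x = conjugate_gradient(L, b)
print(np.allclose(L @ x, b))
```

The only operation the solver needs from L is the matrix-vector product `L @ p`, which is why formulations built on sparse matrix-vector products distribute naturally.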

arXiv.org e-Print Archive

Crossref

Computing subdominant unstable modes of turbulent plasma with a parallel Jacobi-Davidson eigensolver

Author: Arbenz
Baker
Dannert
Fokkema
Genseberger
Hernandez
Hernandez
Heuveline
Hochstenbach
Jacobi
Merz
Morgan
Morgan
Paige
Roman
Romero
Romero
Simoncini
Sleijpen
Sleijpen
Sleijpen
Sleijpen
Stathopoulos
Stathopoulos
Publication venue: 'Wiley'
Publication date: 10/12/2011
Field of study

In the numerical solution of large-scale eigenvalue problems, Davidson-type methods are an increasingly popular alternative to Krylov eigensolvers. The main motivation is to avoid the expensive factorizations that are often needed by Krylov solvers when the problem is generalized or interior eigenvalues are desired. In Davidson-type methods, the factorization is replaced by iterative linear solvers that can be accelerated by a smart preconditioner. Jacobi-Davidson is one of the most effective variants. However, parallel implementations of this method are not widely available, particularly for non-symmetric problems. We present a parallel implementation that has been included in SLEPc, the Scalable Library for Eigenvalue Problem Computations, and test it in the context of a highly scalable plasma turbulence simulation code. We analyze its parallel efficiency and compare it with a Krylov-Schur eigensolver. © 2011 John Wiley and Sons, Ltd. The authors are indebted to Florian Merz for providing us with the test cases and for his useful suggestions. The authors acknowledge the computer resources provided by the Barcelona Supercomputing Center (BSC). This work was supported by the Spanish Ministerio de Ciencia e Innovacion under project TIN2009-07519. Romero Alcalde, E.; Román Moltó, JE. (2011). Computing subdominant unstable modes of turbulent plasma with a parallel Jacobi-Davidson eigensolver. Concurrency and Computation: Practice and Experience. 23:2179-2191. https://doi.org/10.1002/cpe.1740
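
To illustrate the general idea of a Davidson-type method (expand a search subspace with a preconditioned residual instead of factorizing the matrix), here is a minimal serial NumPy sketch for the smallest eigenvalue of a real symmetric matrix. It uses the classical diagonal preconditioner rather than the Jacobi-Davidson correction equation, has no restarts, and is unrelated to the SLEPc implementation; the function name `davidson_smallest`, the tolerances and the test matrix are assumptions for this example.

```python
import numpy as np

def davidson_smallest(A, tol=1e-8, max_iter=None):
    """Basic Davidson iteration for the smallest eigenpair of symmetric A."""
    n = A.shape[0]
    max_iter = n if max_iter is None else max_iter
    d = np.diag(A)
    V = np.zeros((n, 0))                 # orthonormal basis of the search space
    t = np.zeros(n)
    t[np.argmin(d)] = 1.0                # start from the smallest diagonal entry
    theta, u = d.min(), t.copy()
    for _ in range(max_iter):
        # Orthogonalize the new direction against the basis (done twice).
        t = t - V @ (V.T @ t)
        t = t - V @ (V.T @ t)
        nt = np.linalg.norm(t)
        if nt < 1e-14:
            break                        # no new direction available
        V = np.column_stack([V, t / nt])
        # Rayleigh-Ritz: solve the small projected eigenproblem.
        w, S = np.linalg.eigh(V.T @ A @ V)
        theta, u = w[0], V @ S[:, 0]
        r = A @ u - theta * u
        if np.linalg.norm(r) < tol:
            break
        # Diagonal (Davidson) preconditioner applied to the residual.
        denom = d - theta
        denom[np.abs(denom) < 1e-8] = 1e-8
        t = -r / denom
    return theta, u

rng = np.random.default_rng(2)
n = 40
R = rng.standard_normal((n, n))
A = np.diag(np.arange(1.0, n + 1)) + 0.01 * (R + R.T) / 2   # diagonally dominant
theta, u = davidson_smallest(A)
print(abs(theta - np.linalg.eigvalsh(A)[0]) < 1e-6)
```

The diagonal preconditioner works well here only because the test matrix is strongly diagonally dominant; Jacobi-Davidson replaces that step with an approximately solved correction equation, which is what makes it usable on harder problems.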

Crossref

RiuNet

Fast and accurate con-eigenvalue algorithm for optimal rational approximations

Author: Beylkin G.
Haut T. S.
Publication venue
Publication date: 01/01/2012
Field of study

The need to compute small con-eigenvalues and the associated con-eigenvectors of positive-definite Cauchy matrices naturally arises when constructing rational approximations with a (near) optimally small $L^{\infty}$ error. Specifically, given a rational function with $n$ poles in the unit disk, a rational approximation with $m\ll n$ poles in the unit disk may be obtained from the $m$th con-eigenvector of an $n\times n$ Cauchy matrix, where the associated con-eigenvalue $\lambda_{m}>0$ gives the approximation error in the $L^{\infty}$ norm. Unfortunately, standard algorithms do not accurately compute small con-eigenvalues (and the associated con-eigenvectors) and, in particular, yield few or no correct digits for con-eigenvalues smaller than the machine roundoff. We develop a fast and accurate algorithm for computing con-eigenvalues and con-eigenvectors of positive-definite Cauchy matrices, yielding even the tiniest con-eigenvalues with high relative accuracy. The algorithm computes the $m$th con-eigenvalue in $\mathcal{O}(m^{2}n)$ operations and, since the con-eigenvalues of positive-definite Cauchy matrices decay exponentially fast, we obtain (near) optimal rational approximations in $\mathcal{O}(n(\log\delta^{-1})^{2})$ operations, where $\delta$ is the approximation error in the $L^{\infty}$ norm. We derive error bounds demonstrating high relative accuracy of the computed con-eigenvalues and the high accuracy of the unit con-eigenvectors. We also provide examples of using the algorithm to compute (near) optimal rational approximations of functions with singularities and sharp transitions, where approximation errors close to machine precision are obtained. Finally, we present numerical tests on random (complex-valued) Cauchy matrices to show that the algorithm computes all the con-eigenvalues and con-eigenvectors with nearly full precision.
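
For orientation, a con-eigenpair can be related to an ordinary eigenproblem by a short calculation. The derivation below assumes the usual definition of a con-eigenvalue, $C\bar{u} = \lambda u$ with $\lambda \ge 0$ real; the symbols $C$ and $u$ are introduced here for illustration and are not taken from the abstract.

```latex
\begin{align*}
  C\,\bar{u} &= \lambda\,u
    && \text{(assumed definition, } \lambda \ge 0 \text{ real)} \\
  \bar{C}\,u &= \lambda\,\bar{u}
    && \text{(complex conjugate of the line above)} \\
  C\,\bar{C}\,u &= \lambda\,C\,\bar{u} = \lambda^{2}\,u
    && \text{(apply } C \text{ and substitute the first line)}
\end{align*}
```

So $\lambda^{2}$ is an ordinary eigenvalue of $C\bar{C}$. Forming that product explicitly squares the spread between the largest and smallest values, which is one plausible reason a naive approach loses the small con-eigenvalues that the abstract is concerned with.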

arXiv.org e-Print Archive

CiteSeerX