17 research outputs found

    NEP: A Module for the Parallel Solution of Nonlinear Eigenvalue Problems in SLEPc

    SLEPc is a parallel library for the solution of various types of large-scale eigenvalue problems. Over the past few years, we have been developing a module within SLEPc, called NEP, that is intended for solving nonlinear eigenvalue problems. These problems can be defined by means of a matrix-valued function that depends nonlinearly on a single scalar parameter. We do not consider the particular case of polynomial eigenvalue problems (which are implemented in a different module in SLEPc) and focus here on rational eigenvalue problems and other general nonlinear eigenproblems involving square roots or any other nonlinear function. The article discusses how the NEP module has been designed to fit the needs of applications and provides a description of the available solvers, including some implementation details such as parallelization. Several test problems coming from real applications are used to evaluate the performance and reliability of the solvers.
    This work was partially funded by the Spanish Agencia Estatal de Investigacion (AEI, http://ciencia.gob.es) under grants TIN2016-75985-P and PID2019-107379RB-I00 (including European Commission FEDER funds).
    Campos, C.; Roman, JE. (2021). NEP: A Module for the Parallel Solution of Nonlinear Eigenvalue Problems in SLEPc. ACM Transactions on Mathematical Software. 47(3):1-29. https://doi.org/10.1145/3447544
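
    For orientation, the kind of problem described in the abstract above can be written as below. This is a sketch in commonly used notation, not a formula taken from the abstract: the symbols T, A_i, f_i, K, M, C_j and sigma_j are assumptions introduced here, and the rational function shown is only one illustrative instance.

```latex
% Nonlinear eigenvalue problem: find a scalar \lambda and a vector x \neq 0 with
T(\lambda)\, x = 0 ,
% where T is often given as a sum of constant matrices times scalar functions:
\qquad T(\lambda) = \sum_{i=1}^{\ell} A_i \, f_i(\lambda) .
% Illustrative rational instance (assumed, not from the source):
T(\lambda) = K - \lambda M + \sum_{j=1}^{k} \frac{\lambda}{\lambda - \sigma_j}\, C_j .
```

    With monomials f_i(lambda) = lambda^i this reduces to the polynomial case that the abstract sets aside; the rational and square-root cases correspond to other choices of f_i.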

    Parallel Calculation of the Electron Correlation Energy

    Calculating the electron correlation energy in molecules requires considerable computational effort, even in the simplest cases. Using the parallel libraries PETSc and SLEPc together with MPI, we can now perform this calculation faster and for much larger molecules. This is an important advance in computational chemistry.
    Ramos Peinado, E. (2014). Parallel Calculation of the Electron Correlation Energy. OALib Journal. (1):1-15. doi:10.4236/oalib.1100411

    Dense and sparse parallel linear algebra algorithms on graphics processing units

    One line of development followed in the field of supercomputing is the use of special-purpose processors to speed up certain types of computations. In this thesis we study the use of graphics processing units as computing accelerators and apply them to the field of linear algebra. In particular, we work with the SLEPc library to solve large-scale eigenvalue problems and to apply matrix functions in scientific applications. SLEPc is a parallel library based on the MPI standard and is developed with the premise of being scalable, i.e. of allowing larger problems to be solved as the number of processing units increases. We address the linear eigenvalue problem, Ax = lambda x in its standard form, using iterative techniques, in particular Krylov methods, with which we compute a small portion of the eigenvalue spectrum. This type of algorithm is based on generating a subspace of reduced size (m) onto which the large problem (of dimension n) is projected, with m << n. Once the problem has been projected, it is solved by direct methods, which provide approximations to the eigenvalues of the original problem. The operations used in the expansion of the subspace depend on whether the desired eigenvalues lie in the exterior or in the interior of the spectrum. For exterior eigenvalues, the expansion is done by matrix-vector multiplications. We perform this operation on the GPU, either by using libraries or by writing functions that take advantage of the structure of the matrix. For eigenvalues in the interior of the spectrum, the expansion requires solving linear systems of equations. In this thesis we implement several algorithms, which run on the GPU, for solving linear systems in the specific case of matrices with a block-tridiagonal structure. In the computation of matrix functions we have to distinguish between the direct application of a matrix function, f(A), and the action of a matrix function on a vector, f(A)b. The first case involves a dense computation that limits the size of the problem. The second allows us to work with large sparse matrices, and to solve it we also make use of Krylov methods. The expansion of the subspace is done by matrix-vector multiplications, and we use GPUs in the same way as when computing eigenvalues. In this case the projected problem initially has size m, but it grows by m at each restart of the method. The projected problem is solved by directly applying a matrix function. We have implemented several algorithms to compute the matrix square root and the matrix exponential, in which the use of GPUs allows us to speed up the computation.
    Lamas Daviña, A. (2018). Dense and sparse parallel linear algebra algorithms on graphics processing units [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/112425
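
    The projection idea described in the thesis abstract above (build a small subspace of size m with matrix-vector products, then solve the small projected problem) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code and not the SLEPc API: it uses the symmetric Lanczos variant of a Krylov method, runs in plain Python on the CPU, and all names (`lanczos`, `count_below`, `largest_eigenvalue`, `apply_a`) and the diagonal test operator are hypothetical choices made for this example. The small tridiagonal problem is solved here by Sturm-sequence bisection, one of several standard options.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lanczos(matvec, v0, m):
    """Run up to m Lanczos steps with full reorthogonalization.

    Returns the diagonal (alpha) and off-diagonal (beta) of the small
    tridiagonal matrix T obtained by projecting the operator onto the subspace.
    """
    norm0 = math.sqrt(dot(v0, v0))
    basis = [[x / norm0 for x in v0]]
    alpha, beta = [], []
    for j in range(m):
        w = matvec(basis[j])              # subspace expansion: one mat-vec
        alpha.append(dot(w, basis[j]))
        for _ in range(2):                # orthogonalize twice for robustness
            for q in basis:
                h = dot(w, q)
                w = [wi - h * qi for wi, qi in zip(w, q)]
        b = math.sqrt(dot(w, w))
        if j == m - 1 or b < 1e-12:       # done, or invariant subspace found
            break
        beta.append(b)
        basis.append([wi / b for wi in w])
    return alpha, beta


def count_below(alpha, beta, x):
    """Number of eigenvalues of T smaller than x (Sturm sequence count)."""
    count, d = 0, 1.0
    for i, a in enumerate(alpha):
        off = beta[i - 1] ** 2 if i > 0 else 0.0
        d = a - x - off / d
        if d == 0.0:
            d = 1e-30                     # avoid division by zero at next step
        if d < 0.0:
            count += 1
    return count


def largest_eigenvalue(alpha, beta):
    """Largest eigenvalue of T by bisection between Gershgorin bounds."""
    m = len(alpha)
    radius = [(beta[i - 1] if i > 0 else 0.0) + (beta[i] if i < m - 1 else 0.0)
              for i in range(m)]
    lo = min(a - r for a, r in zip(alpha, radius))
    hi = max(a + r for a, r in zip(alpha, radius))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if count_below(alpha, beta, mid) >= m:
            hi = mid                      # every eigenvalue lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Assumed test operator: a diagonal matrix of dimension n with entries
# 1, 2, ..., n-1 and one well-separated exterior eigenvalue 2n.
n, m = 500, 20
diag = [float(i + 1) for i in range(n - 1)] + [2.0 * n]


def apply_a(v):
    return [d * x for d, x in zip(diag, v)]


alpha, beta = lanczos(apply_a, [1.0] * n, m)
ritz_max = largest_eigenvalue(alpha, beta)
print("largest Ritz value:", ritz_max, "exact:", 2.0 * n)
```

    Only matrix-vector products touch the large operator, which is why that step is the natural candidate for offloading to an accelerator; the projected problem stays small (m = 20 against n = 500 in this sketch).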

    Parallel implementation of restarted Krylov methods for eigenvalue and singular value problems (Implementación paralela de métodos de Krylov con reinicio para problemas de valores propios y singulares)

    This thesis addresses the parallelization of restarted Krylov methods for eigenvalue and singular value (SVD) problems. These methods are iterative in nature and are well suited to finding a few eigenvalues or singular values of sparse problems. The orthogonalization procedure is usually the most expensive part of this kind of method, so it has received special attention in this thesis, which proposes and validates new algorithms to improve its parallel performance. The implementation has been carried out within the SLEPc library, which provides an object-oriented interface for the iterative solution of eigenvalue and singular value problems. SLEPc is based on the PETSc library, which offers parallel implementations of iterative methods for solving linear systems, preconditioners, sparse matrices and vectors. Both libraries are optimized for execution on distributed-memory parallel machines and for large sparse problems. The implementation includes the following eigenvalue methods: Arnoldi with explicit restart, Lanczos (including semiorthogonal variants) with explicit restart, and versions of Krylov-Schur (equivalent to implicit restart) for non-Hermitian and Hermitian problems (thick-restart Lanczos). These methods share a common interface, which makes them easy to compare, a feature that is not available in other implementations. The same techniques used for eigenvalue problems have been adapted to Golub-Kahan-Lanczos methods with explicit and thick restart for singular value problems, for which no other parallel message-passing implementation exists. Each method has been validated with a battery of tests using matrices from real applications. Parallel performance has been measured on cluster-type machines, showing good scalability [...]
    Tomás Domínguez, A. (2009). Implementación paralela de métodos de Krylov con reinicio para problemas de valores propios y singulares [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/5082

    Distributed Spectral Graph Methods for Analyzing Large-Scale Unstructured Biomedical Data

    There is an ever-expanding body of biological data, growing in size and complexity, outstripping the capabilities of standard database tools or traditional analysis techniques. Such examples include molecular dynamics simulations, drug-target interactions, gene regulatory networks, and high-throughput imaging. Large-scale acquisition and curation of biological data has already yielded results in the form of lower costs for genome sequencing and greater coverage in databases such as GenBank, and is viewed as the future of biocuration. The “big data” philosophy and its associated paradigms and frameworks have the potential to uncover solutions to problems otherwise intractable with more traditional investigative techniques. Here, we focus on two biological systems whose data form large, undirected graphs. First, we develop a quantitative model of ciliary motion phenotypes, using spectral graph methods for unsupervised latent pattern discovery. Second, we apply similar techniques to identify a mapping between physiochemical structure and odor percept in human olfaction. In both cases, we experienced computational bottlenecks in our statistical machinery, necessitating the creation of a new analysis framework. At the core of this framework is a distributed hierarchical eigensolver, which we compare directly to other popular solvers. We demonstrate its essential role in enabling the discovery of novel ciliary motion phenotypes and in identifying physiochemical-perceptual associations.

    Stable Sparse Orthogonal Factorization of Ill-Conditioned Banded Matrices for Parallel Computing

    Sequential and parallel algorithms based on the LU factorization or the QR factorization have been intensely studied and widely used in computations with large-scale ill-conditioned banded matrices. Major concerns with existing methods include ill-conditioning, sparsity of the factor matrices, computational complexity, and scalability. In this dissertation, we study a sparse orthogonal factorization of a banded matrix motivated by parallel computing. Specifically, we develop a process to factorize a banded matrix as a product of a sparse orthogonal matrix and a sparse matrix which can be transformed to an upper triangular matrix by column permutations. We prove that the proposed process has low complexity and is numerically stable, with stability results similar to those of the modified Gram-Schmidt process. On this basis, we develop a parallel algorithm for the factorization in a distributed computing environment. Through an analysis of its performance, we show that the communication costs reach the theoretical least upper bounds, while its parallel complexity or speedup approaches the optimal bound. For an ill-conditioned banded system, we construct a sequential solver that breaks it down into small-scale underdetermined systems, which are solved by the proposed factorization with high accuracy. We also implement a parallel solver with strategies to treat the memory issue appearing in extra-large-scale linear systems of size over one billion. Numerical experiments confirm the theoretical results derived in this thesis, and demonstrate the superior accuracy and scalability of the proposed solvers for ill-conditioned linear systems, compared to the most commonly used direct solvers.

    Strategies for spectrum slicing based on restarted Lanczos methods

    In the context of symmetric-definite generalized eigenvalue problems, it is often required to compute all eigenvalues contained in a prescribed interval. For large-scale problems, the method of choice is the so-called spectrum slicing technique: a shift-and-invert Lanczos method combined with a dynamic shift selection that sweeps the interval in a smart way. Strategies of this kind were first proposed in the context of unrestarted Lanczos methods, back in the 1990s. We propose variations that try to incorporate recent developments in the field of Krylov methods, including thick restarting in the Lanczos solver and a rational Krylov update when moving from one shift to the next. We discuss a parallel implementation in the SLEPc library and provide performance results. © 2012 Springer Science+Business Media, LLC.
    This work was supported by the Spanish Ministerio de Ciencia e Innovacion under grant TIN2009-07519.
    Campos González, MC.; Román Moltó, JE. (2012). Strategies for spectrum slicing based on restarted Lanczos methods. Numerical Algorithms. 60(2):279-295. https://doi.org/10.1007/s11075-012-9564-z
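
    The two ingredients named in the spectrum slicing abstract above, shift-and-invert and shift selection over an interval, rest on standard textbook relations that can be written as below. The notation (A, B, sigma, theta, nu, a, b) is assumed here for illustration and is not taken from the paper; the relations hold for a symmetric-definite pencil with B positive definite.

```latex
% Generalized symmetric-definite problem and its shift-and-invert transformation:
A x = \lambda B x
\quad\Longrightarrow\quad
(A - \sigma B)^{-1} B\, x = \theta\, x , \qquad \theta = \frac{1}{\lambda - \sigma} .
% Eigenvalues \lambda closest to the shift \sigma map to the largest |\theta|.

% Counting eigenvalues in an interval [a, b] from inertia
% (\nu(\cdot) = number of negative eigenvalues, e.g. read off an LDL^T factorization):
\#\{\lambda_i \in [a, b)\} = \nu(A - bB) - \nu(A - aB) .
```

    The count gives a way to check that no eigenvalue in the interval has been missed as the shift moves, since each factorization at a new shift also yields the number of eigenvalues to its left.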

    Restarted Q-Arnoldi-type methods exploiting symmetry in quadratic eigenvalue problems

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10543-016-0601-5.
    We investigate how to adapt the Q-Arnoldi method for the case of symmetric quadratic eigenvalue problems, that is, we are interested in computing a few eigenpairs (lambda, x) of (lambda^2 M + lambda C + K) x = 0 with M, C, K symmetric matrices. This problem has no particular structure, in the sense that eigenvalues can be complex or even defective. Still, symmetry of the matrices can be exploited to some extent. For this, we perform a symmetric linearization Ax = lambda Bx, where A, B are symmetric matrices but the pair (A, B) is indefinite and hence standard Lanczos methods are not applicable. We implement a symmetric-indefinite Lanczos method and enrich it with a thick-restart technique. This method uses pseudo inner products induced by matrix B for the orthogonalization of vectors (indefinite Gram-Schmidt). The projected problem is also an indefinite matrix pair. The next step is to write a specialized, memory-efficient version that exploits the block structure of A and B, referring only to the original problem matrices M, C, K as in the Q-Arnoldi method. This results in what we have called the Q-Lanczos method. Furthermore, we define a stabilized variant analogous to the TOAR method. We show results obtained with parallel implementations in SLEPc.
    This work was supported by the Spanish Ministry of Economy and Competitiveness under Grant TIN2013-41049-P. Carmen Campos was supported by the Spanish Ministry of Education, Culture and Sport through an FPU Grant with reference AP2012-0608.
    Campos, C.; Román Moltó, JE. (2016). Restarted Q-Arnoldi-type methods exploiting symmetry in quadratic eigenvalue problems. BIT Numerical Mathematics. 56(4):1213-1236. https://doi.org/10.1007/s10543-016-0601-5

    Computing subdominant unstable modes of turbulent plasma with a parallel Jacobi-Davidson eigensolver

    In the numerical solution of large-scale eigenvalue problems, Davidson-type methods are an increasingly popular alternative to Krylov eigensolvers. The main motivation is to avoid the expensive factorizations that are often needed by Krylov solvers when the problem is generalized or interior eigenvalues are desired. In Davidson-type methods, the factorization is replaced by iterative linear solvers that can be accelerated by a smart preconditioner. Jacobi-Davidson is one of the most effective variants. However, parallel implementations of this method are not widely available, particularly for non-symmetric problems. We present a parallel implementation that has been included in SLEPc, the Scalable Library for Eigenvalue Problem Computations, and test it in the context of a highly scalable plasma turbulence simulation code. We analyze its parallel efficiency and compare it with a Krylov-Schur eigensolver. © 2011 John Wiley and Sons, Ltd.
    The authors are indebted to Florian Merz for providing us with the test cases and for his useful suggestions. The authors acknowledge the computer resources provided by the Barcelona Supercomputing Center (BSC). This work was supported by the Spanish Ministerio de Ciencia e Innovacion under project TIN2009-07519.
    Romero Alcalde, E.; Román Moltó, JE. (2011). Computing subdominant unstable modes of turbulent plasma with a parallel Jacobi-Davidson eigensolver. Concurrency and Computation: Practice and Experience. 23:2179-2191. https://doi.org/10.1002/cpe.1740

    Numerical simulation of a highly underexpanded carbon dioxide jet

    Underexpanded jets occur in many processes such as rocket propulsion, mass spectrometry and fuel injection, as well as in the process called rapid expansion of supercritical solutions (RESS). In the RESS process a supercritical solution flows through a capillary nozzle into an expansion chamber, where the strong changes in the thermodynamic properties of the solvent are used to encapsulate the solute in very fine particles. The research project focused on the hydrodynamic modeling of a hypersonic carbon dioxide jet produced in the context of the RESS process. The mathematical model of the jet was developed using the compressible Navier-Stokes equations along with the generalized Bender equation of state. This set of PDEs was solved using an adaptive discontinuous Galerkin discretization in space and the exponential Rosenbrock-Euler method for the time integration. The numerical solver was implemented in C++ using several libraries such as deal.II and Sacado-Trilinos.