    Efficient evaluation of matrix polynomials

    This paper presents a new family of methods for evaluating matrix polynomials more efficiently than the state-of-the-art Paterson-Stockmeyer method. Examples of the application of the methods to the Taylor polynomial approximation of matrix functions like the matrix exponential and matrix cosine are given. Their efficiency is compared with that of the best existing evaluation schemes for general polynomial and rational approximations, and also with a recent method based on mixed rational and polynomial approximants. For many years, the Paterson-Stockmeyer method has been considered the most efficient general method for the evaluation of matrix polynomials. In this paper we show that this statement is no longer true. Moreover, for many years rational approximations have been considered more efficient than polynomial approximations, although recently it has been shown that often this is not the case in the computation of the matrix exponential and matrix cosine. In this paper we show that in fact polynomial approximations provide a higher order of approximation than the state-of-the-art computational methods for rational approximations for the same cost in terms of matrix products. (C) 2017 Elsevier Inc. All rights reserved.
    This work has been supported by Spanish Ministerio de Economia y Competitividad and European Regional Development Fund (ERDF) grant TIN2014-59294-P. We thank the anonymous referee who revised this paper so thoroughly and carefully.
    Sastre, J. (2018). Efficient evaluation of matrix polynomials. Linear Algebra and its Applications. 539:229-250. https://doi.org/10.1016/j.laa.2017.11.010

    Efficient Evaluation of Matrix Polynomials beyond the Paterson-Stockmeyer Method

    Recently, two general methods for evaluating matrix polynomials requiring one fewer matrix product than the Paterson-Stockmeyer method were proposed, where the cost of evaluating a matrix polynomial is given asymptotically by the total number of matrix product evaluations. An analysis of the stability of those methods was given and the methods have been applied to Taylor-based implementations for computing the exponential, the cosine and the hyperbolic tangent matrix functions. Moreover, a particular example for the evaluation of the matrix exponential Taylor approximation of degree 15 requiring four matrix products was given, whereas the maximum polynomial degree available using the Paterson-Stockmeyer method with four matrix products is 9. Based on this example, a new family of methods for evaluating matrix polynomials more efficiently than the Paterson-Stockmeyer method was proposed, having the potential to achieve a much higher efficiency, i.e., requiring fewer matrix products for evaluating a matrix polynomial of a certain degree, or increasing the available degree for the same cost. However, the difficulty of this family of methods lies in the calculation of the coefficients involved in the evaluation of general matrix polynomials and approximations. In this paper, we provide a general matrix polynomial evaluation method requiring two fewer matrix products than the Paterson-Stockmeyer method for degrees higher than 30. Moreover, we provide general methods for evaluating matrix polynomial approximations of degrees 15 and 21 with four and five matrix product evaluations, respectively, whereas the maximum available degrees for the same cost with the Paterson-Stockmeyer method are 9 and 12, respectively. Finally, practical examples for evaluating Taylor approximations of the matrix cosine and the matrix logarithm accurately and efficiently with these new methods are given.
    This research was partially funded by the European Regional Development Fund (ERDF) and the Spanish Ministerio de Economia y Competitividad grant TIN2017-89314-P, and by the Programa de Apoyo a la Investigacion y Desarrollo 2018 of the Universitat Politecnica de Valencia grant PAID-06-18-SP20180016.
    Sastre, J.; Ibáñez González, JJ. (2021). Efficient Evaluation of Matrix Polynomials beyond the Paterson-Stockmeyer Method. Mathematics. 9(14):1-23. https://doi.org/10.3390/math9141600
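
The Paterson-Stockmeyer baseline that both abstracts above compare against can be sketched as follows. This is a minimal illustration of the classical scheme only, not the new methods from either paper. The function names (`matmul`, `lincomb`, `paterson_stockmeyer`), the block size parameter `s`, and the use of plain nested lists instead of a linear algebra library are all assumptions made for this example. The routine splits the polynomial into blocks of `s` coefficients, forms the powers of `A` up to `A^s`, applies Horner's rule in `A^s`, and counts the matrix products it performs.

```python
from math import factorial

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lincomb(terms, n):
    """Return the sum of c * M over (c, M) pairs; uses no matrix products."""
    out = [[0.0] * n for _ in range(n)]
    for c, M in terms:
        for i in range(n):
            for j in range(n):
                out[i][j] += c * M[i][j]
    return out

def paterson_stockmeyer(coeffs, A, s):
    """Evaluate p(A) = sum_k coeffs[k] * A^k with block size s.

    Returns (p(A), number of matrix products used).
    """
    n = len(A)
    m = len(coeffs) - 1  # polynomial degree
    identity = [[float(i == j) for j in range(n)] for i in range(n)]

    # powers[j] = A^j for j = 0..s, costing s - 1 products.
    powers = [identity, A]
    products = 0
    for _ in range(2, s + 1):
        powers.append(matmul(powers[-1], A))
        products += 1

    def block(k):
        # B_k = sum_{j<s} coeffs[s*k + j] * A^j, a product-free combination.
        return lincomb(
            [(coeffs[s * k + j], powers[j]) for j in range(s) if s * k + j <= m], n
        )

    r = m // s
    if m % s == 0 and r >= 1:
        # The top block is the scalar coeffs[m], so the first Horner step
        # coeffs[m] * A^s + B_{r-1} needs no matrix product.
        P = lincomb([(coeffs[m], powers[s]), (1.0, block(r - 1))], n)
        start = r - 2
    else:
        P = block(r)
        start = r - 1

    # Horner's rule in the variable A^s: one product per remaining block.
    for k in range(start, -1, -1):
        P = lincomb([(1.0, matmul(P, powers[s])), (1.0, block(k))], n)
        products += 1
    return P, products

# Degree-9 Taylor polynomial of the exponential with block size 3.
A = [[0.1, 0.2], [0.3, 0.4]]
taylor = [1.0 / factorial(k) for k in range(10)]
P, products = paterson_stockmeyer(taylor, A, s=3)
print(products)  # 4
```

With `s = 3`, a degree-9 polynomial costs two products for `A^2` and `A^3` plus two Horner steps, which matches the figure of degree 9 for four products quoted in the abstract. The same routine applied to a degree-12 polynomial with `s = 3` uses five products.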

    On the expressive power of planar perfect matching and permanents of bounded treewidth matrices

    Valiant introduced some 25 years ago an algebraic model of computation along with the complexity classes VP and VNP, which can be viewed as analogues of the classical classes P and NP. They are defined using non-uniform sequences of arithmetic circuits and provide a framework to study the complexity of sequences of polynomials. Prominent examples of difficult (that is, VNP-complete) problems in this model include the permanent and Hamiltonian polynomials. While the permanent and Hamiltonian polynomials in general are difficult to evaluate, there has been research on which special cases of these polynomials admit efficient evaluation. For instance, Barvinok has shown that if the underlying matrix has bounded rank, both the permanent and the Hamiltonian polynomials can be evaluated in polynomial time, and thus are in VP. Courcelle, Makowsky and Rotics have shown that for matrices of bounded treewidth several difficult problems (including evaluating the permanent and Hamiltonian polynomials) can be solved efficiently. An earlier result of this flavour is Kasteleyn's theorem, which states that the sum of weights of perfect matchings of a planar graph can be computed in polynomial time, and thus is also in VP. For general graphs this problem is VNP-complete. In this paper we investigate the expressive power of the above results. We show that the permanent and Hamiltonian polynomials for matrices of bounded treewidth are both equivalent to arithmetic formulas. Also, arithmetic weakly skew circuits are shown to be equivalent to the sum of weights of perfect matchings of planar graphs.
    Comment: 14 pages

    A numerical method to compute derivatives of functions of large complex matrices and its application to the overlap Dirac operator at finite chemical potential

    We present a method for the numerical calculation of derivatives of functions of general complex matrices. The method can be used in combination with any algorithm that evaluates or approximates the desired matrix function, in particular with implicit Krylov-Ritz-type approximations. An important use case for the method is the evaluation of the overlap Dirac operator in lattice Quantum Chromodynamics (QCD) at finite chemical potential, which requires the application of the sign function of a non-Hermitian matrix to some source vector. While the sign function of non-Hermitian matrices in practice cannot be efficiently approximated with source-independent polynomials or rational functions, sufficiently good approximating polynomials can still be constructed for each particular source vector. Our method allows for an efficient calculation of the derivatives of such implicit approximations with respect to the gauge field or other external parameters, which is necessary for the calculation of conserved lattice currents or the fermionic force in Hybrid Monte-Carlo or Langevin simulations. We also give an explicit deflation prescription for the case when one knows several eigenvalues and eigenvectors of the matrix being the argument of the differentiated function. We test the method for the two-sided Lanczos approximation of the finite-density overlap Dirac operator on realistic $SU(3)$ gauge field configurations on lattices with sizes as large as $14\times14^3$ and $6\times18^3$.
    Comment: 26 pages, elsarticle style, 5 figures; minor text changes, journal version

    Heterogeneous computation of matrix products

    Proceedings of: Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016). Sofia (Bulgaria), October 6-7, 2016.
    The work presented here is an experimental study of performance in execution time and energy consumption of matrix multiplications on a heterogeneous server. The server features three different devices: a multicore CPU, an NVIDIA Tesla GPU, and an Intel Xeon Phi coprocessor. Matrix multiplication is one of the most used linear algebra kernels and, consequently, applications that make intensive use of this operation can greatly benefit from efficient implementations. This is the case for the evaluation of matrix polynomials, a core operation used to calculate many matrix functions, which involves a very large number of products of square matrices. Although there exist many proposals for efficient implementations of matrix multiplication in heterogeneous environments, it is still difficult to find packages providing a matrix multiplication routine that is as easy to use, efficient, and versatile as its homogeneous counterparts. Our approach here is based on a simple implementation using OpenMP sections. We have also devised a functional model for the execution time that has been successfully applied to the evaluation of matrix polynomials of large degree, so that the workload can be balanced and the runtime cost minimized.

    A matrix-free ILU realization based on surrogates

    Matrix-free techniques play an increasingly important role in large-scale simulations. Schur complement techniques and massively parallel multigrid solvers for second-order elliptic partial differential equations can significantly benefit from reduced memory traffic and consumption. The matrix-free approach often restricts solver components to purely local operations, for instance, the Jacobi or Gauss-Seidel smoothers in multigrid methods. An incomplete LU (ILU) decomposition cannot be calculated from local information and is therefore not amenable to the on-the-fly computation typically needed for matrix-free calculations. It generally requires the storage and factorization of a sparse matrix, which conflicts with the low memory requirements of large-scale scenarios. In this work, we propose a matrix-free ILU realization. More precisely, we introduce a memory-efficient, matrix-free ILU(0) smoother component for low-order conforming finite elements on tetrahedral hybrid grids. Hybrid grids consist of an unstructured macro-mesh which is subdivided into a structured micro-mesh. The ILU(0) is used for degrees of freedom assigned to the interior of macro-tetrahedra. This ILU(0) smoother can be used for the efficient matrix-free evaluation of the Steklov-Poincaré operator from domain-decomposition methods. After introducing and formally defining our smoother, we first investigate its performance on refined macro-tetrahedra. Second, the ILU(0) smoother on the macro-tetrahedra is implemented via surrogate matrix polynomials in conjunction with a fast on-the-fly evaluation scheme, resulting in an efficient matrix-free algorithm. The polynomial coefficients are obtained by solving a least-squares problem on a small part of the factorized ILU(0) matrices to stay memory efficient. The convergence rates of this smoother with respect to the polynomial order are thoroughly studied.