
    Computing eigenvalue bounds for iterative subspace matrix methods

    Abstract

    A procedure is presented for the computation of bounds to eigenvalues of the generalized hermitian eigenvalue problem and to the standard hermitian eigenvalue problem. This procedure is applicable to iterative subspace eigenvalue methods and to both outer and inner eigenvalues. The Ritz values and their corresponding residual norms, all of which are computable quantities, are needed by the procedure. Knowledge of the exact eigenvalues is not needed, but it must be known that the computed Ritz values are isolated from exact eigenvalues outside of the Ritz spectrum and that there are no skipped eigenvalues within the Ritz spectrum range. A multipass refinement procedure is described to compute the bounds for each Ritz value. This procedure requires O(m) effort per pass, where m is the subspace dimension.

    Published by Elsevier B.V.
    PACS: 02.10; 02.60; 02.70; 89.80
    Keywords: Bounds; Eigenvalue; Subspace; Ritz; Hermitian; Generalized; Gap; Spread

    Background

    The generalized hermitian eigenvalue problem

    (1.1) Hz = λMz

    occurs in many applications. In this equation the matrices are of dimension N, H = H†, M = M†, and M is positive definite. An important special case is the standard hermitian eigenvalue problem, for which M = 1. All of the results in this manuscript apply to both the generalized and the standard problems. One approach to the numerical solution of Eq. (1.1) is to expand the approximate eigenvectors in a linearly independent basis {x_j; j = 1...m}. These vectors may be collected into the matrix X = (x_1, x_2, ..., x_m). The projected subspace representations of the matrices H and M are denoted H̄ = X†HX and M̄ = X†MX. In many cases the basis vectors in X will be chosen such that M̄ = 1, but we consider the general case hereafter. An approximation y to an eigenvector is written as a linear combination of the basis vectors, y = Xc (for simplicity, the normalization y†My = 1 is assumed hereafter).
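The construction above can be sketched with a small dense example. This is a minimal numpy illustration with invented H, M, and X; the reduction of the projected problem to standard form via a Cholesky factor of M̄ is one common way to solve it, not necessarily the method used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 8, 3

# Invented test matrices: hermitian (here real symmetric) H and a
# positive definite metric M, both of dimension N.
A = rng.standard_normal((N, N))
H = (A + A.T) / 2
B = rng.standard_normal((N, N))
M = B @ B.T + N * np.eye(N)

# Linearly independent basis vectors collected into the matrix X.
X = rng.standard_normal((N, m))

# Projected subspace representations Hbar = X^T H X, Mbar = X^T M X.
Hbar = X.T @ H @ X
Mbar = X.T @ M @ X

# One way to solve the subspace problem Hbar c = rho Mbar c: reduce to
# a standard problem using a Cholesky factor of the positive definite Mbar.
L = np.linalg.cholesky(Mbar)
Linv = np.linalg.inv(L)
rho, w = np.linalg.eigh(Linv @ Hbar @ Linv.T)   # Ritz values, ascending
C = Linv.T @ w                                  # expansion coefficients c_j
Y = X @ C                                       # Ritz vectors y_j = X c_j
```

By construction the columns of Y are M-orthonormal, so each Ritz vector satisfies the normalization y†My = 1 assumed in the text.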
A measure of the error of the vector y is the residual vector

(1.3) r = (H − ρM)y.

The residual r = 0 if and only if y is an exact eigenvector and ρ is the corresponding exact eigenvalue. Otherwise, as discussed in more detail in Appendix A, the quantity √(r†M⁻¹r), hereafter called the residual norm, is a measure of the error in both the eigenvector and the eigenvalue. The expansion coefficients c are usually determined from the subspace eigenvalue equation H̄c = ρM̄c. It is convenient to assume that the subspace eigenvalues {ρ_j; j = 1...m}, called the Ritz values, and the exact eigenvalues {λ_k; k = 1...N} are indexed in increasing order. This is not a requirement for applying the results of this manuscript, but the assumption simplifies the subsequent notation and discussion. It is convenient to use the Parlett index convention [1], in which negative integers index the eigenvalues ordered from highest to lowest (i.e. λ_{−1} = λ_N is the highest eigenvalue). We refer to a set of the lowest eigenvalues {λ_k; k = 1...m}, or to a set of the highest eigenvalues {λ_{−k}; k = 1...m}, as outer eigenvalues; we refer to either or both of the highest and lowest eigenvalues {λ_1, λ_N} as extremal eigenvalues; and we refer to a sequence of eigenvalues in the interior of the spectrum {λ_k; k = m...n; 1 < m ≤ n < N} as inner eigenvalues. The above choice for the expansion coefficients c and approximate eigenvalue ρ is optimal in several respects. As discussed in more detail below, the mapping between the Ritz value index j and the eigenvalue index k depends on whether the eigenvalues are inner or outer, and on whether the highest or lowest eigenvalues are computed. The further goal is that these bounds may be computed during the iterative procedure, so that they may be used not only to assess the accuracy of the final results but also to allow the iteration to be terminated when a predetermined accuracy has been achieved, avoiding unnecessary computational effort.
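The residual vector and residual norm defined above are directly computable from the subspace quantities. Continuing the same kind of small dense setup (the test matrices are invented for illustration):

```python
import numpy as np

# Small dense setup, for illustration only: invented test matrices.
rng = np.random.default_rng(0)
N, m = 8, 3
A = rng.standard_normal((N, N)); H = (A + A.T) / 2
B = rng.standard_normal((N, N)); M = B @ B.T + N * np.eye(N)
X = rng.standard_normal((N, m))
Hbar, Mbar = X.T @ H @ X, X.T @ M @ X
L = np.linalg.cholesky(Mbar)
Linv = np.linalg.inv(L)
rho, w = np.linalg.eigh(Linv @ Hbar @ Linv.T)   # Ritz values, ascending
Y = X @ (Linv.T @ w)                            # M-orthonormal Ritz vectors

# Residual vectors r_j = (H - rho_j M) y_j, one column per Ritz pair
# (Eq. (1.3)), and the residual norms sqrt(r_j^T M^{-1} r_j).
R = H @ Y - (M @ Y) * rho
res_norms = np.sqrt(np.einsum('ij,ij->j', R, np.linalg.solve(M, R)))
```

Each residual norm bounds the distance from the corresponding Ritz value to the nearest exact eigenvalue of the pencil, which is the Residual Norm Bound discussed next.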
Several standard inequalities are used to this end. The first is the Ritz Bound from the Ritz variational principle,

λ_k ≤ ρ_k for the lowest eigenvalues and λ_{−k} ≥ ρ_{−k} for the highest, k = 1...m.

This bound relation places strict upper bounds on each of the lowest m exact outer eigenvalues, and it places strict lower bounds on the highest m exact outer eigenvalues. No additional information beyond the Ritz values themselves is required, so these bounds are computable. The second inequality used is the Residual Norm Bound, which guarantees that at least one exact eigenvalue λ satisfies

|λ − ρ_j| ≤ |r_j|, with |r_j| = √(r_j†M⁻¹r_j).

These bounds are computable, but they are often very conservative.

Y. Zhou et al. / Computer Physics Communications 167 (2005) 90-102

The next inequality used in this procedure is the Gap Theorem Bound [1,10] (see Eqs. (A.29)-(A.46)),

|λ_j − ρ_j| ≤ |r_j|²/γ_j,

where the gap γ_j is the distance from ρ_j to the closest exact eigenvalue other than the one being approximated. The Gap Bound is tighter than the Residual Norm Bound when |r_j| < γ_j, but it requires knowledge of the exact eigenvalues, and therefore it cannot be computed during the iterative process. However, if a Ritz value of interest ρ_j is separated, in the sense that the interval [ρ_j − |r_j|, ρ_j + |r_j|] overlaps none of the corresponding intervals of the other Ritz values, and if it is further assumed that there are no skipped eigenvalues (i.e. eigenvalues that are poorly represented or not represented at all) within the computed spectrum range, then the distance from ρ_j to the nearest neighboring interval is a lower bound to the exact gap γ_j. This separation is shown in the accompanying figure. When it is necessary to distinguish which bound is being discussed, the bound obtained from this computable gap estimate is referred to as the Computed Gap Bound.

The last inequality used is the Spread Theorem Bound. The Spread Theorem requires knowledge of the exact matrix spread, σ = (λ_N − λ_1), which, although computable in some situations, is usually not available. However, if an upper bound σ+ is available such that σ ≤ σ+, then useful inner bounds can be computed for the two extremal eigenvalues. For example, when computing the largest eigenvalue of a positive definite matrix, the upper bound to the eigenvalue is itself an upper bound to the matrix spread. Combining this upper bound with Eq. (1.13) results in the computable bound (1.14).

Subspace methods are often used in situations in which the matrices H and M are not explicitly computed and stored; instead, the products of these matrices with the basis vectors X are computed in operator form. Bounds that require information about individual matrix elements cannot be used in these important applications, because the individual matrix elements are either not available or would be prohibitively expensive to extract from the matrix-vector product operation. The goal of the present work is to find the best computable eigenvalue bounds given a set of Ritz values, their residual norms, some knowledge of the eigenvalue spacings, and, optionally, an estimate of the matrix spread. This information is available for subspace methods even when the individual matrix elements are not explicitly available for examination. Several issues must be addressed toward this goal, and they are addressed in order below.

In general, given only the computed Ritz values and their corresponding residual norms, it cannot be guaranteed that there is no skipped eigenvalue; whether an eigenvalue has been skipped depends on information outside of the current subspace X. However, in many applications, the general qualitative structure of the eigenvalues of interest is known beforehand. For example, the eigenpairs may be known at a lower approximation level, on a coarser grid, with a smaller underlying model basis set, at nearby values of some set of external parameters, or from other physical insights known to the user. In addition, the eigenvalue spacings may be known more accurately than the actual eigenvalues themselves. The matrix may be known to be, for example, positive semidefinite, positive definite, or negative definite, or other upper or lower bounds may be known from physical insight into the model from which the numerical eigenproblem is derived.
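The separation test just described can be phrased directly in terms of the computable quantities, the Ritz values and their residual norms. The helper names below are hypothetical, not from the paper:

```python
import numpy as np

def residual_intervals(rho, res_norms):
    """Residual Norm Bound intervals [rho_j - |r_j|, rho_j + |r_j|]
    for Ritz values rho and residual norms |r_j| (in ascending order)."""
    rho = np.asarray(rho, float)
    r = np.asarray(res_norms, float)
    return rho - r, rho + r

def is_separated(rho, res_norms, j):
    """True if interval j overlaps neither neighboring interval, i.e.
    the Ritz value is isolated in the sense required by the Gap Bound."""
    lo, hi = residual_intervals(rho, res_norms)
    left_ok = (j == 0) or (hi[j - 1] < lo[j])
    right_ok = (j == len(lo) - 1) or (lo[j + 1] > hi[j])
    return bool(left_ok and right_ok)
```

When `is_separated` returns True for every Ritz value of interest, and when the nonskipped-eigenvalue assumption holds, the distances between adjacent intervals supply the computable gap estimates used below.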
Furthermore, the rank of any degeneracies may be known for the eigenvalue spectrum based on, for example, physical or mathematical invariance properties, group theory, or other physical or mathematical insights. In these situations, the goal of the numerical calculation is to compute the eigenpairs quantitatively, their qualitative values being already known to some extent. This additional knowledge of the eigenvalue spectrum may be used to allow application of the Gap and Spread Bounds even though the exact eigenvalues are not known.

When computing the initial bounds from the residual norms, two situations occur for each of the Ritz values: either a Ritz value is separated from the other values, or it is not. If it is separated, and if there are no skipped eigenvalues, then one and only one exact eigenvalue will occur in the range [ρ_j − |r_j|, ρ_j + |r_j|]. If two (or more) Ritz values are not separated, then in the general case it is known only that a single eigenvalue exists in the combined residual norm range. That is, the approximate eigenvectors associated with the two Ritz values may either both be approximating the same exact eigenpair, or they may be approximating two distinct eigenpairs.

Consider, for example, a standard eigenvalue problem with three exact eigenvalues λ_1 < λ_2 = 0 < λ_{−1} and a two-dimensional subspace characterized by a parameter ε. Consider first the situation with ε ≈ 0.1. Both Ritz values are near the single exact eigenvalue λ_2 = 0, and it is only for this single exact eigenvalue that the Residual Norm Bound, for either Ritz value, is seen to apply. The two Ritz values are not separated by the Residual Norm Bound, so the eigenvalue bounds cannot be refined with the Gap Bound. Furthermore, λ_1 < (ρ_1 − |r_1|) and λ_{−1} > (ρ_{−1} + |r_{−1}|), so it is clear that this subspace representation does not satisfy the nonskipped-eigenvalue requirement for either of the nonzero eigenvalues. On the other hand, consider this same problem with ε ≈ 10. In this case the Ritz values are separated by the Residual Norm Bound, and the two Ritz values are good approximations to the two nonzero extremal exact eigenvalues. However, there is a skipped eigenvalue in this case, and the Computed Gap Bounds cannot be applied because lower bounds to the exact gaps cannot be determined from the subspace information. As noted above, it cannot be known simply by examining the residual norms or the subspace eigenvalues that an eigenvalue has been skipped; this information must be determined separately.

The coupling of the bounds for one eigenvalue to the computed bounds for other, usually adjacent, eigenvalues makes it necessary to compute a sequence of bound values. At present, a closed-form solution to this problem is not known to the authors; consequently, we adopt a straightforward bootstrap approach. We begin the procedure by assigning Residual Norm Bounds to each of the eigenvalues (along with the Ritz and Spread Bounds if appropriate). Then, given the current bounds, the bounds for each isolated eigenvalue are refined where possible, and any improvements are retained for subsequent passes. Depending on the eigenvalue spacings and the corresponding bounds, the final bounds may be computed in a single refinement pass or may require multiple passes. The bounds computed in this process are valid at any intermediate step, which allows the process to be truncated while still returning rigorous bounds. For example, once the bounds fall below a given threshold, indicating convergence of the eigenvalue to some level of accuracy, the process could be terminated if desired. The application of Eqs.
(1.9)-(1.12) appears to require a scan of m elements to compute the upper and lower gap bounds for each Ritz value; this in turn implies O(m²) effort per pass to refine the bounds for all m Ritz values. We avoid this effort by precomputing the δ+ array before the process begins, and then updating these elements, along with the corresponding elements of δ−, in decreasing order as each eigenvalue bound is refined. This results in only O(m) overall effort for each pass.

Because the outer eigenvalue bounds are not symmetric about the computed Ritz values, an asymmetry is introduced into the calculation procedure. Consider, for example, the computation of bounds for the lowest m eigenvalues. First, note that the gap associated with ρ_m cannot be estimated, because there is no ρ_{m+1} estimate available that could be used to approximate this gap. The refinement process therefore begins with ρ_{m−1} and proceeds downward, in some order, eventually to the bounds for ρ_1. Second, note that the computed gap for ρ_1 depends only on the bound δ+_1, whereas the computed gaps for the remaining eigenvalues must be estimated by examining both the lower bounds to the higher eigenvalues and the upper bounds for the lower eigenvalues. Finally, the upper bounds are given by the Ritz variational principle for all of the eigenvalues except λ_1; for this eigenvalue the computed Spread Bound may optionally be applied (provided that an upper bound to the matrix spread is available), resulting in the situation in which the Ritz value lies outside of the computed eigenvalue range. The same considerations apply to the computation of the highest m eigenvalues. As a practical matter, the computation of the residual norm r_j†M⁻¹r_j for the above bounds calculations requires either the explicit inversion of the metric matrix M or the solution of the linear equation Ms = r_j.
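The bootstrap refinement and the downward sweep can be sketched as follows. This is an illustrative simplification for the lowest m eigenvalues, not the authors' exact δ± bookkeeping: it assumes separated Ritz values and no skipped eigenvalues, and it forms a computable gap from the Ritz upper bound on the eigenvalue below (λ_{j−1} ≤ ρ_{j−1}) and the current error bound on the eigenvalue above (λ_{j+1} ≥ ρ_{j+1} − err_{j+1}).

```python
import numpy as np

def refine_bounds(rho, res_norms, passes=3):
    """Bootstrap refinement of error bounds for the lowest m eigenvalues.
    A sketch: assumes separated Ritz values and no skipped eigenvalues.
    Returns err with |lambda_j - rho_j| <= err[j]."""
    rho = np.asarray(rho, float)
    r = np.asarray(res_norms, float)
    err = r.copy()                      # start from the Residual Norm Bounds
    m = len(rho)
    for _ in range(passes):
        # The gap for rho_{m-1} (0-based: index m-1) cannot be estimated,
        # so the sweep starts at the next-highest value and proceeds down.
        for j in reversed(range(m - 1)):
            # Computable lower bound to the gap gamma_j:
            gap = rho[j + 1] - err[j + 1] - rho[j]   # eigenvalue above
            if j > 0:
                gap = min(gap, rho[j] - rho[j - 1])  # Ritz bound on the one below
            if gap > r[j]:              # Gap Bound tighter than |r_j| here
                err[j] = min(err[j], r[j] ** 2 / gap)
    return err
```

Because improvements to `err` are retained between passes, the gap estimates (and hence the bounds) can tighten further on later passes, and the bounds are valid if the loop is stopped early.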
The inverse may be computed with relatively little effort when the metric M is a unit matrix (i.e. the standard eigenvalue problem), a diagonal matrix, or a block-diagonal matrix with small diagonal blocks. The linear equation solution is relatively simple in some other applications, including, for example, the tridiagonal or narrow-banded forms that occur in finite difference methods, for which sparse LU or sparse Cholesky methods may be applied. In other applications, for which the matrix M has no exploitable structure, the linear equation solution requires a more general iterative approach, the effort for which might range from minimal (e.g., using a preconditioned conjugate gradient approach with a very efficient preconditioner) to something comparable to that required for the eigenproblem itself. It is assumed herein that some suitable method for computing the above residual norm is available for the target problem, such that the bounds refinement process described above is practical.

Examples

In this section a few examples are given to demonstrate the use of the bounds computations described in the previous sections.
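For a positive definite M, one way to evaluate the residual norm √(r†M⁻¹r) without forming M⁻¹ is to factor M once and reuse the factor for every residual vector (a dense-Cholesky sketch; for the banded M of finite difference methods, a banded or sparse Cholesky would play the same role):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
T = rng.standard_normal((N, N))
M = T @ T.T + N * np.eye(N)      # invented positive definite metric
r = rng.standard_normal(N)       # some residual vector

L = np.linalg.cholesky(M)        # M = L L^T, computed once
s = np.linalg.solve(L, r)        # forward solve L s = r
res_norm = np.linalg.norm(s)     # r^T M^{-1} r = ||L^{-1} r||^2
```

The factorization cost is paid once; each subsequent residual norm needs only a triangular solve, which is the cheap per-pass work assumed by the refinement procedure.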