
    A biconjugate gradient type algorithm on massively parallel architectures

    The biconjugate gradient (BCG) method is the natural generalization of the classical conjugate gradient algorithm for Hermitian positive definite matrices to general non-Hermitian linear systems. Unfortunately, the original BCG algorithm is susceptible to breakdowns and numerical instabilities. Recently, Freund and Nachtigal proposed a novel BCG-type approach, the quasi-minimal residual method (QMR), which overcomes these problems of BCG. Here, an implementation of QMR is presented that is based on an s-step version of the nonsymmetric look-ahead Lanczos algorithm. The main feature of the s-step Lanczos algorithm is that, in general, all inner products except one can be computed in parallel at the end of each block, unlike the standard Lanczos process, in which inner products are generated sequentially. The resulting implementation of QMR is particularly attractive on massively parallel SIMD architectures, such as the Connection Machine.

    A Subspace Shift Technique for Nonsymmetric Algebraic Riccati Equations

    The worst situation in computing the minimal nonnegative solution of a nonsymmetric algebraic Riccati equation associated with an M-matrix occurs when the corresponding linearizing matrix has two very small eigenvalues, one with positive and one with negative real part. When both of these eigenvalues are exactly zero, the problem is called critical or null recurrent. In this case the problem is ill-conditioned and the algorithms based on matrix iterations converge slowly, but there exist techniques that remove the singularity and transform the problem into a well-behaved one. Ill-conditioning and slow convergence also appear in close-to-critical problems, but when neither eigenvalue is exactly zero the techniques used for the critical case cannot be applied. In this paper, we introduce a new method that accelerates the convergence of the iterations in close-to-critical cases as well, by working on the invariant subspace associated with the problematic eigenvalues as a whole. We present a theoretical analysis and several numerical experiments which confirm the efficiency of the new method.

    Many Masses on One Stroke: Economic Computation of Quark Propagators

    The computational effort in the calculation of Wilson fermion quark propagators in Lattice Quantum Chromodynamics can be considerably reduced by exploiting the Wilson fermion matrix structure in inversion algorithms based on the non-symmetric Lanczos process. We consider two such methods: QMR (quasi minimal residual) and BCG (biconjugate gradients). Based on the decomposition $M/\kappa = \mathbf{1}/\kappa - D$ of the Wilson mass matrix, using QMR, one can carry out inversions on a whole trajectory of masses simultaneously, merely at the computational expense of a single propagator computation. In other words, one has to compute the propagator corresponding to the lightest mass only, while all the heavier masses are given for free, at the price of extra storage. Moreover, the symmetry $\gamma_5 M = M^{\dagger} \gamma_5$ can be used to cut the computational effort in QMR and BCG by a factor of two. We show that both methods then become, in the critical regime of small quark masses, competitive with BiCGStab and significantly better than the standard MR method, with optimal relaxation factor, and CG as applied to the normal equations. Comment: 17 pages, uuencoded compressed postscrip
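The "many masses for the price of one" idea rests on the fact that the Krylov subspaces of $D$ and of $\mathbf{1}/\kappa - D$ coincide, so one basis can serve every value of $\kappa$. The following is a minimal sketch of that shift invariance only. It swaps in a plain Arnoldi basis with a Galerkin (FOM-type) projection rather than the QMR or BCG recurrences of the paper, and the names `arnoldi`, `solve_dense`, and `solve_all_shifts` are hypothetical helpers introduced for illustration.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def arnoldi(A, b, m):
    """Orthonormal basis V of K_m(A, b), the m x m matrix H = V^T A V, and ||b||."""
    beta = dot(b, b) ** 0.5
    V = [[bi / beta for bi in b]]
    H = [[0.0] * m for _ in range(m)]
    for j in range(m):
        w = matvec(A, V[j])
        for i in range(j + 1):  # modified Gram-Schmidt
            H[i][j] = dot(V[i], w)
            w = [wk - H[i][j] * vk for wk, vk in zip(w, V[i])]
        if j + 1 < m:
            H[j + 1][j] = dot(w, w) ** 0.5
            V.append([wk / H[j + 1][j] for wk in w])
    return V, H, beta


def solve_dense(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    y = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * y[k] for k in range(r + 1, n))
        y[r] = (aug[r][n] - s) / aug[r][r]
    return y


def solve_all_shifts(D, b, sigmas, m):
    """Approximate (sigma*I - D) x = b for every sigma from ONE Krylov basis of D."""
    V, H, beta = arnoldi(D, b, m)  # the only step that multiplies by D
    rhs = [beta] + [0.0] * (m - 1)
    solutions = {}
    for sigma in sigmas:
        # Projected system: (sigma*I - H) y = beta * e_1, then x = V y.
        small = [[(sigma if i == j else 0.0) - H[i][j] for j in range(m)]
                 for i in range(m)]
        y = solve_dense(small, rhs)
        solutions[sigma] = [sum(y[j] * V[j][k] for j in range(m))
                            for k in range(len(b))]
    return solutions
```

Here `sigma` plays the role of $1/\kappa$. All matrix-vector products with `D` happen once inside `arnoldi`; each additional shift costs only a small dense solve and the storage of the basis, which mirrors the "free heavier masses at the price of extra storage" trade-off described above.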

    On the Generation of Large Passive Macromodels for Complex Interconnect Structures

    This paper addresses some issues related to the passivity of interconnect macromodels computed from measured or simulated port responses. The generation of such macromodels is usually performed via suitable least squares fitting algorithms. When the number of ports and the dynamic order of the macromodel are large, the inclusion of passivity constraints in the fitting process is cumbersome and results in excessive computational and storage requirements. Therefore, we consider in this work a post-processing approach for passivity enforcement, aimed at the detection and compensation of passivity violations without compromising the model accuracy. Two complementary issues are addressed. First, we consider the enforcement of asymptotic passivity at high frequencies based on the perturbation of the direct coupling term in the transfer matrix. We show how potential problems may arise when off-band poles are present in the model. Second, the enforcement of uniform passivity throughout the entire frequency axis is performed via an iterative perturbation scheme on the purely imaginary eigenvalues of associated Hamiltonian matrices. A special formulation of this spectral perturbation using possibly large but sparse matrices allows the passivity compensation to be performed at a cost which scales only linearly with the order of the system. This formulation involves a restarted Arnoldi iteration combined with a complex frequency hopping algorithm for the selective computation of the imaginary eigenvalues to be perturbed. Some examples of interconnect models are used to illustrate the performance of the proposed technique.

    QCD simulations with staggered fermions on GPUs

    We report on our implementation of the RHMC algorithm for the simulation of lattice QCD with two staggered flavors on Graphics Processing Units, using the NVIDIA CUDA programming language. The main feature of our code is that the GPU is not used just as an accelerator; instead, the whole Molecular Dynamics trajectory is performed on it. After pointing out the main bottlenecks and how to circumvent them, we discuss the performance obtained. We present some preliminary results regarding OpenCL and multi-GPU extensions of our code and discuss future perspectives. Comment: 22 pages, 14 eps figures, final version to be published in Computer Physics Communication

    Conjugate gradient type methods for linear systems with complex symmetric coefficient matrices

    We consider conjugate gradient type methods for the solution of large sparse linear systems $Ax = b$ with complex symmetric coefficient matrices $A = A^T$. Such linear systems arise in important applications, such as the numerical solution of the complex Helmholtz equation. Furthermore, most complex non-Hermitian linear systems which occur in practice are actually complex symmetric. We investigate conjugate gradient type iterations which are based on a variant of the nonsymmetric Lanczos algorithm for complex symmetric matrices. We propose a new approach with iterates defined by a quasi-minimal residual property. The resulting algorithm presents several advantages over the standard biconjugate gradient method. We also include some remarks on the obvious approach to general complex linear systems by solving equivalent real linear systems for the real and imaginary parts of x. Finally, numerical experiments for linear systems arising from the complex Helmholtz equation are reported.
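For a complex symmetric matrix, the biconjugate gradient recurrences collapse to a single sequence once the Hermitian inner product is replaced by the unconjugated bilinear form $x^T y$. The sketch below illustrates that baseline, often called conjugate orthogonal CG (COCG). It is not the quasi-minimal residual variant proposed in the abstract and it has no look-ahead, so it simply stops on breakdown; the function name `cocg` and its parameters are assumptions made for this example.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def bilinear(x, y):
    # Unconjugated product x^T y, NOT the Hermitian inner product x^H y.
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    return sum(abs(xi) ** 2 for xi in x) ** 0.5


def cocg(A, b, tol=1e-12, max_iter=100):
    """BCG-type iteration for A x = b with A complex symmetric (A == A^T)."""
    x = [0j] * len(b)
    r = list(b)              # residual for the zero initial guess
    p = list(r)              # search direction
    rho = bilinear(r, r)
    b_norm = norm(b)
    for _ in range(max_iter):
        if norm(r) <= tol * b_norm:
            break
        Ap = matvec(A, p)
        sigma = bilinear(p, Ap)
        if rho == 0 or sigma == 0:
            # These quantities can vanish for nonzero vectors; no look-ahead here.
            raise ZeroDivisionError("breakdown of the unconjugated recurrence")
        alpha = rho / sigma
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rho_new = bilinear(r, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rho = rho_new
    return x
```

Because $r^T r$ can be zero for a nonzero complex vector, this simple iteration can break down and its residual norms may behave erratically, which is the kind of weakness of the standard biconjugate gradient method that a quasi-minimal residual property is meant to address.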