70,669 research outputs found

    Diagonal and normal with Toeplitz-block splitting iteration method for space fractional coupled nonlinear Schr\"odinger equations with repulsive nonlinearities

    Full text link
    By applying the linearly implicit conservative difference scheme proposed in [D.-L. Wang, A.-G. Xiao, W. Yang. J. Comput. Phys. 2014;272:670-681], the system of repulsive space fractional coupled nonlinear Schr\"odinger equations leads to a sequence of linear systems with complex symmetric and Toeplitz-plus-diagonal structure. In this paper, we propose the diagonal and normal with Toeplitz-block splitting iteration method to solve the above linear systems. The new iteration method is proved to converge unconditionally, and the optimal iteration parameter is deducted. Naturally, this new iteration method leads to a diagonal and normal with circulant-block preconditioner which can be executed efficiently by fast algorithms. In theory, we provide sharp bounds for the eigenvalues of the discrete fractional Laplacian and its circulant approximation, and further analysis indicates that the spectral distribution of the preconditioned system matrix is tight. Numerical experiments show that the new preconditioner can significantly improve the computational efficiency of the Krylov subspace iteration methods. Moreover, the corresponding preconditioned GMRES method shows space mesh size independent and almost fractional order parameter insensitive convergence behaviors

    Parallel Factorizations in Numerical Analysis

    Full text link
    In this paper we review the parallel solution of sparse linear systems, usually deriving by the discretization of ODE-IVPs or ODE-BVPs. The approach is based on the concept of parallel factorization of a (block) tridiagonal matrix. This allows to obtain efficient parallel extensions of many known matrix factorizations, and to derive, as a by-product, a unifying approach to the parallel solution of ODEs.Comment: 15 pages, 5 figure

    Matrix Scaling and Balancing via Box Constrained Newton's Method and Interior Point Methods

    Full text link
    In this paper, we study matrix scaling and balancing, which are fundamental problems in scientific computing, with a long line of work on them that dates back to the 1960s. We provide algorithms for both these problems that, ignoring logarithmic factors involving the dimension of the input matrix and the size of its entries, both run in time O~(mlogκlog2(1/ϵ))\widetilde{O}\left(m\log \kappa \log^2 (1/\epsilon)\right) where ϵ\epsilon is the amount of error we are willing to tolerate. Here, κ\kappa represents the ratio between the largest and the smallest entries of the optimal scalings. This implies that our algorithms run in nearly-linear time whenever κ\kappa is quasi-polynomial, which includes, in particular, the case of strictly positive matrices. We complement our results by providing a separate algorithm that uses an interior-point method and runs in time O~(m3/2log(1/ϵ))\widetilde{O}(m^{3/2} \log (1/\epsilon)). In order to establish these results, we develop a new second-order optimization framework that enables us to treat both problems in a unified and principled manner. This framework identifies a certain generalization of linear system solving that we can use to efficiently minimize a broad class of functions, which we call second-order robust. We then show that in the context of the specific functions capturing matrix scaling and balancing, we can leverage and generalize the work on Laplacian system solving to make the algorithms obtained via this framework very efficient.Comment: To appear in FOCS 201

    On large-scale diagonalization techniques for the Anderson model of localization

    Get PDF
    We propose efficient preconditioning algorithms for an eigenvalue problem arising in quantum physics, namely the computation of a few interior eigenvalues and their associated eigenvectors for large-scale sparse real and symmetric indefinite matrices of the Anderson model of localization. We compare the Lanczos algorithm in the 1987 implementation by Cullum and Willoughby with the shift-and-invert techniques in the implicitly restarted Lanczos method and in the Jacobi–Davidson method. Our preconditioning approaches for the shift-and-invert symmetric indefinite linear system are based on maximum weighted matchings and algebraic multilevel incomplete LDLT factorizations. These techniques can be seen as a complement to the alternative idea of using more complete pivoting techniques for the highly ill-conditioned symmetric indefinite Anderson matrices. We demonstrate the effectiveness and the numerical accuracy of these algorithms. Our numerical examples reveal that recent algebraic multilevel preconditioning solvers can accelerate the computation of a large-scale eigenvalue problem corresponding to the Anderson model of localization by several orders of magnitude

    Three-Level Parallel J-Jacobi Algorithms for Hermitian Matrices

    Get PDF
    The paper describes several efficient parallel implementations of the one-sided hyperbolic Jacobi-type algorithm for computing eigenvalues and eigenvectors of Hermitian matrices. By appropriate blocking of the algorithms an almost ideal load balancing between all available processors/cores is obtained. A similar blocking technique can be used to exploit local cache memory of each processor to further speed up the process. Due to diversity of modern computer architectures, each of the algorithms described here may be the method of choice for a particular hardware and a given matrix size. All proposed block algorithms compute the eigenvalues with relative accuracy similar to the original non-blocked Jacobi algorithm.Comment: Submitted for publicatio

    Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes

    Get PDF
    The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of computing resources. The pressure to maintain reasonable levels of performance and portability forces application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. In this paper, we study the benefits and limits of replacing the highly specialized internal scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and StarPU. The tasks graph of the factorization step is made available to the two runtimes, providing them the opportunity to process and optimize its traversal in order to maximize the algorithm efficiency for the targeted hardware platform. A comparative study of the performance of the PaStiX solver on top of its native internal scheduler, PaRSEC, and StarPU frameworks, on different execution environments, is performed. The analysis highlights that these generic task-based runtimes achieve comparable results to the application-optimized embedded scheduler on homogeneous platforms. Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the programmer.Comment: Heterogeneity in Computing Workshop (2014

    Computational linear algebra over finite fields

    Get PDF
    We present here algorithms for efficient computation of linear algebra problems over finite fields
    corecore