Diagonal and normal with Toeplitz-block splitting iteration method for space fractional coupled nonlinear Schrödinger equations with repulsive nonlinearities
By applying the linearly implicit conservative difference scheme proposed in
[D.-L. Wang, A.-G. Xiao, W. Yang. J. Comput. Phys. 2014;272:670-681], the
system of repulsive space fractional coupled nonlinear Schrödinger equations
leads to a sequence of linear systems with complex symmetric and
Toeplitz-plus-diagonal structure. In this paper, we propose the diagonal and
normal with Toeplitz-block splitting iteration method to solve the above linear
systems. The new iteration method is proved to converge unconditionally, and
the optimal iteration parameter is derived. Naturally, this new iteration
method leads to a diagonal and normal with circulant-block preconditioner which
can be executed efficiently by fast algorithms. In theory, we provide sharp
bounds for the eigenvalues of the discrete fractional Laplacian and its
circulant approximation, and further analysis indicates that the spectral
distribution of the preconditioned system matrix is tight. Numerical
experiments show that the new preconditioner can significantly improve the
computational efficiency of the Krylov subspace iteration methods. Moreover,
the corresponding preconditioned GMRES method shows convergence behavior that
is independent of the spatial mesh size and almost insensitive to the
fractional order parameter.
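The abstract above notes that a circulant-block preconditioner can be applied efficiently by fast algorithms. The following is a minimal sketch of that general idea only, not the preconditioner proposed in the paper: it builds Strang's circulant approximation to a real symmetric Toeplitz matrix and solves the circulant system with FFTs. The function names and the restriction to a real symmetric Toeplitz matrix are assumptions made for illustration.

```python
import numpy as np

def strang_circulant_first_column(t):
    """First column of Strang's circulant approximation to the real
    symmetric Toeplitz matrix whose first column is t (illustrative helper)."""
    t = np.asarray(t, dtype=float)
    n = len(t)
    c = t.copy()
    # Keep the central diagonals and wrap them around: c[k] = t[n - k] for k > n/2.
    for k in range(n // 2 + 1, n):
        c[k] = t[n - k]
    return c

def circulant_solve(c, b):
    """Solve C x = b, where C is the circulant matrix with first column c.

    C acts as a circular convolution with c, so it is diagonalized by the
    discrete Fourier transform and its eigenvalues are fft(c)."""
    eigenvalues = np.fft.fft(c)
    return np.fft.ifft(np.fft.fft(b) / eigenvalues).real

# Hypothetical example data: a diagonally dominant symmetric Toeplitz column.
t = np.array([4.0] + [2.0 ** -k for k in range(1, 8)])
c = strang_circulant_first_column(t)
x = circulant_solve(c, np.ones(len(c)))
```

Each solve costs O(n log n) operations, which is why circulant approximations are a common choice for preconditioning Toeplitz-structured systems inside Krylov methods such as GMRES.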
Parallel Factorizations in Numerical Analysis
In this paper we review the parallel solution of sparse linear systems,
usually arising from the discretization of ODE-IVPs or ODE-BVPs. The approach is
based on the concept of parallel factorization of a (block) tridiagonal matrix.
This allows one to obtain efficient parallel extensions of many known matrix
factorizations, and to derive, as a by-product, a unifying approach to the
parallel solution of ODEs. Comment: 15 pages, 5 figures
Matrix Scaling and Balancing via Box Constrained Newton's Method and Interior Point Methods
In this paper, we study matrix scaling and balancing, which are fundamental
problems in scientific computing, with a long line of work on them that dates
back to the 1960s. We provide algorithms for both these problems that, ignoring
logarithmic factors involving the dimension of the input matrix and the size of
its entries, both run in time where is the amount of error we are willing to
tolerate. Here, represents the ratio between the largest and the
smallest entries of the optimal scalings. This implies that our algorithms run
in nearly-linear time whenever is quasi-polynomial, which includes, in
particular, the case of strictly positive matrices. We complement our results
by providing a separate algorithm that uses an interior-point method and runs
in time .
In order to establish these results, we develop a new second-order
optimization framework that enables us to treat both problems in a unified and
principled manner. This framework identifies a certain generalization of linear
system solving that we can use to efficiently minimize a broad class of
functions, which we call second-order robust. We then show that in the context
of the specific functions capturing matrix scaling and balancing, we can
leverage and generalize the work on Laplacian system solving to make the
algorithms obtained via this framework very efficient. Comment: To appear in FOCS 201
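For readers unfamiliar with the matrix scaling problem itself, the sketch below uses the classical Sinkhorn alternating normalization, which is a much simpler first-order baseline and not the box-constrained Newton or interior-point algorithms of the paper. Given a strictly positive square matrix A, it looks for positive vectors r and c such that diag(r) A diag(c) has all row and column sums equal to 1. The function name, tolerance, and iteration cap are illustrative assumptions.

```python
import numpy as np

def sinkhorn_scale(A, tol=1e-9, max_iter=10_000):
    """Return (r, c) so that diag(r) @ A @ diag(c) is nearly doubly stochastic.

    Classical Sinkhorn iteration; assumes A is square with strictly
    positive entries, for which the iteration is known to converge."""
    A = np.asarray(A, dtype=float)
    r = np.ones(A.shape[0])
    c = np.ones(A.shape[1])
    for _ in range(max_iter):
        c = 1.0 / (A.T @ r)  # rescale columns so every column sum equals 1
        r = 1.0 / (A @ c)    # rescale rows so every row sum equals 1
        scaled = r[:, None] * A * c[None, :]
        # Rows are exact after the last update; measure the column error.
        if np.abs(scaled.sum(axis=0) - 1.0).max() < tol:
            break
    return r, c

# Hypothetical example input.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
r, c = sinkhorn_scale(A)
S = r[:, None] * A * c[None, :]
```

The number of Sinkhorn sweeps can grow quickly as the requested error shrinks, which is part of the motivation for second-order methods of the kind the abstract describes.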
On large-scale diagonalization techniques for the Anderson model of localization
We propose efficient preconditioning algorithms for an eigenvalue problem arising in quantum physics, namely the computation of a few interior eigenvalues and their associated eigenvectors for large-scale sparse real and symmetric indefinite matrices of the Anderson model
of localization. We compare the Lanczos algorithm in the 1987 implementation by Cullum and Willoughby with the shift-and-invert techniques in the implicitly restarted Lanczos method and in the Jacobi–Davidson method. Our preconditioning approaches for the shift-and-invert symmetric indefinite linear system are based on maximum weighted matchings and algebraic multilevel incomplete
LDL^T factorizations. These techniques can be seen as a complement to the alternative idea of using more complete pivoting techniques for the highly ill-conditioned symmetric indefinite Anderson matrices. We demonstrate the effectiveness and the numerical accuracy of these algorithms. Our numerical examples reveal that recent algebraic multilevel preconditioning solvers can accelerate the computation of a large-scale eigenvalue problem corresponding to the Anderson model of localization
by several orders of magnitude.
Three-Level Parallel J-Jacobi Algorithms for Hermitian Matrices
The paper describes several efficient parallel implementations of the
one-sided hyperbolic Jacobi-type algorithm for computing eigenvalues and
eigenvectors of Hermitian matrices. By appropriate blocking of the algorithms
an almost ideal load balancing between all available processors/cores is
obtained. A similar blocking technique can be used to exploit local cache
memory of each processor to further speed up the process. Due to diversity of
modern computer architectures, each of the algorithms described here may be the
method of choice for a particular hardware and a given matrix size. All
proposed block algorithms compute the eigenvalues with relative accuracy
similar to the original non-blocked Jacobi algorithm. Comment: Submitted for publication
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
The ongoing hardware evolution exhibits an escalation in the number, as well
as in the heterogeneity, of computing resources. The pressure to maintain
reasonable levels of performance and portability forces application developers
to leave the traditional programming paradigms and explore alternative
solutions. PaStiX is a parallel sparse direct solver, based on a dynamic
scheduler for modern hierarchical manycore architectures. In this paper, we
study the benefits and limits of replacing the highly specialized internal
scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and
StarPU. The task graph of the factorization step is made available to the two
runtimes, providing them the opportunity to process and optimize its traversal
in order to maximize the algorithm efficiency for the targeted hardware
platform. A comparative study of the performance of the PaStiX solver on top of
its native internal scheduler, PaRSEC, and StarPU frameworks, on different
execution environments, is performed. The analysis highlights that these
generic task-based runtimes achieve comparable results to the
application-optimized embedded scheduler on homogeneous platforms. Furthermore,
they are able to significantly speed up the solver on heterogeneous
environments by taking advantage of the accelerators while hiding the
complexity of their efficient manipulation from the programmer. Comment: Heterogeneity in Computing Workshop (2014)
Computational linear algebra over finite fields
We present here algorithms for efficient computation of linear algebra
problems over finite fields.
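As a small illustration of what linear algebra over a finite field involves, the sketch below computes the rank of an integer matrix over GF(p) by plain Gauss-Jordan elimination with modular inverses. It is a textbook baseline under the assumption that p is prime, not one of the efficient algorithms the abstract refers to, and the function name is hypothetical.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over GF(p), assuming p is prime.

    Uses Gauss-Jordan elimination; every division is a multiplication by a
    modular inverse, so all arithmetic is exact (no rounding error)."""
    M = [[x % p for x in row] for row in rows]
    n_cols = len(M[0]) if M else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry here.
        pivot = next((i for i in range(rank, len(M)) if M[i][col]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][col], -1, p)  # modular inverse (Python 3.8+)
        M[rank] = [(x * inv) % p for x in M[rank]]
        # Eliminate this column from every other row.
        for i in range(len(M)):
            if i != rank and M[i][col]:
                f = M[i][col]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[rank])]
        rank += 1
    return rank
```

The same matrix can have different ranks over different fields: [[1, 2], [3, 4]] has determinant -2, so it is singular over GF(2) but invertible over GF(5).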
- …