A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices
We present the submatrix method, a highly parallelizable method for the
approximate calculation of inverse p-th roots of large sparse symmetric
matrices which are required in different scientific applications. We follow the
idea of Approximate Computing, allowing imprecision in the final result in
order to be able to utilize the sparsity of the input matrix and to allow
massively parallel execution. For an n x n matrix, the proposed algorithm
allows the calculations to be distributed over n nodes with little
communication overhead. The approximate result matrix exhibits the same
sparsity pattern as the input matrix, allowing for efficient reuse of allocated
data structures.
We evaluate the algorithm with respect to the error that it introduces into
calculated results, as well as its performance and scalability. We demonstrate
that the error is relatively limited for well-conditioned matrices and that
results are still valuable for error-resilient applications like
preconditioning even for ill-conditioned matrices. We discuss the execution
time and scaling of the algorithm on a theoretical level and present a
distributed implementation of the algorithm using MPI and OpenMP. We
demonstrate the scalability of this implementation by running it on a
high-performance compute cluster comprising 1024 CPU cores, showing a speedup
of 665x compared to single-threaded execution.
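
The abstract does not spell out the construction, so the following is only a minimal sketch under stated assumptions: for each column, the nonzero rows of that column select a dense principal submatrix, its inverse p-th root is computed by eigendecomposition, and only the column belonging to the original index is copied back. The function names and the use of NumPy are illustrative choices, not the authors' MPI/OpenMP implementation, and the input is assumed to be symmetric positive definite with a nonzero diagonal.

```python
import numpy as np


def inverse_pth_root_dense(block, p):
    """Inverse p-th root of a small symmetric positive definite block."""
    eigenvalues, eigenvectors = np.linalg.eigh(block)
    # V * diag(lambda^(-1/p)) * V^T
    return (eigenvectors * eigenvalues ** (-1.0 / p)) @ eigenvectors.T


def submatrix_inverse_pth_root(matrix, p):
    """Approximate A^(-1/p) column by column, keeping the sparsity pattern of A.

    Assumption (not taken from the abstract): the submatrix for column `col`
    is the principal submatrix induced by the nonzero rows of that column.
    """
    n = matrix.shape[0]
    result = np.zeros_like(matrix, dtype=float)
    for col in range(n):  # iterations are independent, one per node in principle
        rows = np.flatnonzero(matrix[:, col])     # sparsity pattern of this column
        block = matrix[np.ix_(rows, rows)]        # dense principal submatrix
        block_root = inverse_pth_root_dense(block, p)
        local = int(np.searchsorted(rows, col))   # position of `col` inside the block
        result[rows, col] = block_root[:, local]  # keep only the matching column
    return result


if __name__ == "__main__":
    # Small tridiagonal test matrix (hypothetical example data).
    a = np.diag([4.0] * 6) + np.diag([1.0] * 5, 1) + np.diag([1.0] * 5, -1)
    approx = submatrix_inverse_pth_root(a, 2)
    exact = inverse_pth_root_dense(a, 2)
    print("max abs deviation from dense result:", np.abs(approx - exact).max())
```

Because every column is processed without reference to any other, the loop body is the natural unit of work to hand to a separate node, which matches the claim that the calculation can be spread over n nodes.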
Distributing the Kalman Filter for Large-Scale Systems
This paper derives a distributed Kalman filter to estimate a sparsely
connected, large-scale, n-dimensional, dynamical system monitored by a
network of N sensors. Local Kalman filters are implemented on the
(n_l-dimensional, where n_l << n) sub-systems that are obtained after
spatially decomposing the large-scale system. The resulting sub-systems
overlap, which, along with an assimilation procedure on the local Kalman
filters, preserves an Lth order Gauss-Markovian structure of the centralized
error processes. The information loss due to the Lth order Gauss-Markovian
approximation is controllable as it can be characterized by a divergence that
decreases as L increases. The order of the approximation, L, leads to a lower
bound on the dimension of the sub-systems, hence providing a criterion for
sub-system selection. The assimilation procedure is carried out on the local
error covariances with a distributed iterate collapse inversion (DICI)
algorithm that we introduce. The DICI algorithm computes the (approximated)
centralized Riccati and Lyapunov equations iteratively with only local
communication and low-order computation. We fuse the observations that are
common among the local Kalman filters using bipartite fusion graphs and
consensus averaging algorithms. The proposed algorithm achieves full
distribution of the Kalman filter that is coherent with the centralized Kalman
filter with an Lth order Gauss-Markovian structure on the centralized
error processes. Nowhere is storage, communication, or computation of
n-dimensional vectors and matrices needed; only n_l-dimensional (n_l << n)
vectors and matrices are communicated or used in the computation at the
sensors.
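
The consensus averaging mentioned above can be illustrated with a generic textbook iteration. This is only a sketch, not the paper's fusion scheme: each node repeatedly moves its value toward those of its neighbors, using only local communication. The function name, the step size, and the example graph are assumptions chosen for illustration; the step is assumed to be smaller than one over the maximum node degree, and the graph is assumed to be connected and undirected.

```python
def consensus_average(values, neighbors, step, iterations):
    """Run synchronous consensus averaging on an undirected graph.

    values:     initial scalar held by each node
    neighbors:  neighbors[i] lists the nodes that node i can talk to
    step:       update weight (assumed < 1 / max degree for convergence)
    """
    state = list(values)
    for _ in range(iterations):
        # Every node updates from the previous round's neighbor values only.
        state = [
            x + step * sum(state[j] - x for j in neighbors[i])
            for i, x in enumerate(state)
        ]
    return state


if __name__ == "__main__":
    # Hypothetical 4-node path graph: 0 - 1 - 2 - 3
    path = [[1], [0, 2], [1, 3], [2]]
    print(consensus_average([1.0, 2.0, 3.0, 6.0], path, step=0.3, iterations=200))
```

On an undirected graph the sum of the node values is preserved at every round, so the common limit is the average of the initial values.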
ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers
Solving the electronic structure from a generalized or standard eigenproblem
is often the bottleneck in large scale calculations based on Kohn-Sham
density-functional theory. This problem must be addressed by essentially all
current electronic structure codes, based on similar matrix expressions, and by
high-performance computation. We here present a unified software interface,
ELSI, to access different strategies that address the Kohn-Sham eigenvalue
problem. Currently supported algorithms include the dense generalized
eigensolver library ELPA, the orbital minimization method implemented in
libOMM, and the pole expansion and selected inversion (PEXSI) approach with
lower computational complexity for semilocal density functionals. The ELSI
interface aims to simplify the implementation and optimal use of the different
strategies, by offering (a) a unified software framework designed for the
electronic structure solvers in Kohn-Sham density-functional theory; (b)
reasonable default parameters for a chosen solver; (c) automatic conversion
between input and internal working matrix formats, and in the future (d)
recommendation of the optimal solver depending on the specific problem.
Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800
basis functions) on distributed memory supercomputing architectures.
Comment: 55 pages, 14 figures, 2 tables
Parallel eigensolvers in plane-wave Density Functional Theory
We consider the problem of parallelizing electronic structure computations in
plane-wave Density Functional Theory. Because of the limited scalability of
Fourier transforms, parallelism has to be found at the eigensolver level. We
show how a recently proposed algorithm based on Chebyshev polynomials can scale
into the tens of thousands of processors, outperforming block conjugate
gradient algorithms for large computations.
Contour integral method for obtaining the self-energy matrices of electrodes in electron transport calculations
We propose an efficient computational method for evaluating the self-energy
matrices of electrodes to study ballistic electron transport properties in
nanoscale systems. To reduce the high computational cost incurred in large
systems, a contour integral eigensolver based on the Sakurai-Sugiura method
combined with the shifted biconjugate gradient method is developed to solve
the exponential-type eigenvalue problem for complex wave vectors. A remarkable
feature of the proposed algorithm is that the numerical procedure is very
similar to that of conventional band structure calculations. We implement the
developed method in the framework of the real-space higher-order finite
difference scheme with nonlocal pseudopotentials. Numerical tests for a wide
variety of materials validate the robustness, accuracy, and efficiency of the
proposed method. As an illustration of the method, we present the electron
transport properties of free-standing silicene with a line defect
originating from the reversed buckled phases.
Comment: 36 pages, 13 figures, 2 tables
Alternative methods for representing the inverse of linear programming basis matrices
Methods for representing the inverse of Linear Programming (LP) basis matrices are closely related to techniques for solving a system of sparse unsymmetric linear equations by direct methods. It is now well accepted that for these problems the static process of reordering the matrix into lower block triangular (LBT) form constitutes the initial step. We introduce a combined static and dynamic factorisation of a basis matrix and derive its inverse, which we call the partial elimination form of the inverse (PEFI). This factorisation takes advantage of the LBT structure and produces a sparser representation of the inverse than the elimination form of the inverse (EFI). In doing so, we make use of the original columns (of the constraint matrix) which are in the basis. To represent the factored inverse it is, however, necessary to introduce special data structures which are used in the forward and the backward transformations (the two major algorithmic steps) of the simplex method. These correspond to solving a system of equations and solving a system of equations with the transposed matrix, respectively. In this paper we compare the nonzero build-up of PEFI with that of EFI. We have also investigated alternative methods for updating the basis inverse in the PEFI representation. The results of our experimental investigation are presented in this paper.
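
To make the two transformations concrete, here is a minimal sketch for the simplest special case, assuming the basis has already been reordered to be fully lower triangular (1 x 1 diagonal blocks). It is not the PEFI or EFI representation described above, and the function names `ftran` and `btran` are illustrative. The forward transformation solves L x = a by forward substitution; the backward transformation solves L^T y = c by back substitution.

```python
def ftran(lower, rhs):
    """Forward transformation: solve L x = rhs for lower triangular L."""
    n = len(rhs)
    x = [0.0] * n
    for i in range(n):
        # Entries x[0..i-1] are already known at this point.
        partial = sum(lower[i][j] * x[j] for j in range(i))
        x[i] = (rhs[i] - partial) / lower[i][i]
    return x


def btran(lower, rhs):
    """Backward transformation: solve L^T y = rhs, reading L column-wise."""
    n = len(rhs)
    y = [0.0] * n
    for i in reversed(range(n)):
        # Row i of L^T is column i of L, so only y[i+1..n-1] contribute.
        partial = sum(lower[j][i] * y[j] for j in range(i + 1, n))
        y[i] = (rhs[i] - partial) / lower[i][i]
    return y


if __name__ == "__main__":
    # Hypothetical 3 x 3 lower triangular basis.
    basis = [[2.0, 0.0, 0.0],
             [1.0, 3.0, 0.0],
             [4.0, 5.0, 6.0]]
    print(ftran(basis, [2.0, 7.0, 32.0]))
    print(btran(basis, [16.0, 21.0, 18.0]))
```

Both routines work on the same stored matrix, so no explicit transpose is needed for the backward pass.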