A Massively Parallel Algorithm for the Approximate Calculation of Inverse p-th Roots of Large Sparse Matrices

Author: Kühne Thomas D.
Lass Michael
Mohr Stephan
Plessl Christian
Wiebeler Hendrik
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/04/2018

We present the submatrix method, a highly parallelizable method for the approximate calculation of inverse p-th roots of large sparse symmetric matrices, which are required in different scientific applications. We follow the idea of Approximate Computing, allowing imprecision in the final result in order to exploit the sparsity of the input matrix and to allow massively parallel execution. For an n x n matrix, the proposed algorithm allows the calculations to be distributed over n nodes with little communication overhead. The approximate result matrix exhibits the same sparsity pattern as the input matrix, allowing for efficient reuse of allocated data structures. We evaluate the algorithm with respect to the error that it introduces into calculated results, as well as its performance and scalability. We demonstrate that the error is relatively limited for well-conditioned matrices and that results are still valuable for error-resilient applications like preconditioning even for ill-conditioned matrices. We discuss the execution time and scaling of the algorithm on a theoretical level and present a distributed implementation of the algorithm using MPI and OpenMP. We demonstrate the scalability of this implementation by running it on a high-performance compute cluster comprising 1024 CPU cores, showing a speedup of 665x compared to single-threaded execution.
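The idea in the abstract above can be illustrated with a small serial sketch. It assumes one plausible reading of the method: for each column, the rows with nonzero entries select a dense principal submatrix, the inverse p-th root of that submatrix is computed, and only the matching column is copied back. The function names (`inv_pth_root_dense`, `submatrix_inv_pth_root`) are hypothetical, the dense root uses an eigendecomposition as a stand-in for whatever solver the authors use, and the input is assumed symmetric positive definite with a nonzero diagonal. It is not the authors' MPI/OpenMP implementation.

```python
import numpy as np


def inv_pth_root_dense(m, p):
    """Inverse p-th root of a dense symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(m)
    # V * diag(eigvals^(-1/p)) * V^T
    return (eigvecs * eigvals ** (-1.0 / p)) @ eigvecs.T


def submatrix_inv_pth_root(a, p):
    """Approximate A^(-1/p) column by column (illustrative sketch).

    `a` is a dense NumPy array holding a sparse symmetric matrix;
    zeros in `a` define the sparsity pattern.
    """
    n = a.shape[0]
    result = np.zeros_like(a, dtype=float)
    for i in range(n):  # each iteration is independent of the others
        idx = np.flatnonzero(a[:, i])        # rows with nonzeros in column i
        sub = a[np.ix_(idx, idx)]            # dense principal submatrix
        root = inv_pth_root_dense(sub, p)
        local = int(np.where(idx == i)[0][0])  # position of column i in sub
        result[idx, i] = root[:, local]      # keep the input sparsity pattern
    return result
```

Because every column is processed independently, the loop body is the unit of work that could be assigned to a separate node; the result is exact when the matrix is diagonal or fully dense and approximate otherwise.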

arXiv.org e-Print Archive

Crossref

UPCommons. Portal del coneixement obert de la UPC

Distributing the Kalman Filter for Large-Scale Systems

Author: Khan Usman A.
Moura Jose M. F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/02/2008

This paper derives a distributed Kalman filter to estimate a sparsely connected, large-scale, $n$-dimensional dynamical system monitored by a network of $N$ sensors. Local Kalman filters are implemented on the $n_l$-dimensional sub-systems (where $n_l \ll n$) that are obtained after spatially decomposing the large-scale system. The resulting sub-systems overlap, which, along with an assimilation procedure on the local Kalman filters, preserves an $L$th order Gauss-Markovian structure of the centralized error processes. The information loss due to the $L$th order Gauss-Markovian approximation is controllable, as it can be characterized by a divergence that decreases as $L$ increases. The order of the approximation, $L$, leads to a lower bound on the dimension of the sub-systems, hence providing a criterion for sub-system selection. The assimilation procedure is carried out on the local error covariances with a distributed iterate collapse inversion (DICI) algorithm that we introduce. The DICI algorithm computes the (approximated) centralized Riccati and Lyapunov equations iteratively with only local communication and low-order computation. We fuse the observations that are common among the local Kalman filters using bipartite fusion graphs and consensus averaging algorithms. The proposed algorithm achieves full distribution of the Kalman filter that is coherent with the centralized Kalman filter with an $L$th order Gauss-Markovian structure on the centralized error processes. Nowhere is storage, communication, or computation of $n$-dimensional vectors and matrices needed; only $n_l \ll n$ dimensional vectors and matrices are communicated or used in the computation at the sensors.
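For orientation, the centralized Kalman filter that the abstract above sets out to distribute can be sketched as a single predict/update step. This is the textbook recursion, not the paper's distributed algorithm; the function name `kalman_step` and the symbols `F`, `H`, `Q`, `R` (state transition, observation model, process and measurement noise covariances) are conventional choices assumed here, not taken from the abstract.

```python
import numpy as np


def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update step of the standard (centralized) Kalman filter."""
    # Predict: propagate the state estimate and its error covariance.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: weigh the innovation by the Kalman gain.
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(P.shape[0]) - K @ H) @ P_pred
    return x_new, P_new
```

Every quantity here has the full state dimension, which is the storage and computation cost the distributed formulation is described as avoiding.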

arXiv.org e-Print Archive

Crossref

ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers

Author: Blum Volker
Corsetti Fabiano
García Alberto
Huhn William P.
Jacquelin Mathias
Jia Weile
Lange Björn
Lin Lin
Lu Jianfeng
Mi Wenhui
Seifitokaldani Ali
Vázquez-Mayagoitia Álvaro
Yang Chao
Yang Haizhao
Yu Victor Wen-zhe
Publication venue: 'Elsevier BV'
Publication date: 31/05/2017

Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large-scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. Here we present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats; and, in the future, (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures. Comment: 55 pages, 14 figures, 2 tables

arXiv.org e-Print Archive

eScholarship - University of California

Parallel eigensolvers in plane-wave Density Functional Theory

Author: Levitt Antoine
Torrent Marc
Publication venue: 'Elsevier BV'
Publication date: 07/10/2014

We consider the problem of parallelizing electronic structure computations in plane-wave Density Functional Theory. Because of the limited scalability of Fourier transforms, parallelism has to be found at the eigensolver level. We show how a recently proposed algorithm based on Chebyshev polynomials can scale into the tens of thousands of processors, outperforming block conjugate gradient algorithms for large computations.
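The abstract above does not spell out the Chebyshev-based algorithm. As a hedged illustration of the general technique, the sketch below applies a degree-`m` Chebyshev polynomial filter to a block of vectors using the three-term recurrence, which is one common form of Chebyshev filtering for eigensolvers. The function name `chebyshev_filter` and the parameters `lower` and `upper` (bounds of the unwanted part of the spectrum) are assumptions for this example, and the filter is unscaled.

```python
import numpy as np


def chebyshev_filter(H, X, m, lower, upper):
    """Apply T_m((H - c I) / e) to the columns of X, with m >= 1.

    Eigencomponents with eigenvalues inside [lower, upper] stay bounded
    by 1 in magnitude; those outside the interval are amplified.
    """
    e = (upper - lower) / 2.0   # half-width of the damped interval
    c = (upper + lower) / 2.0   # centre of the damped interval
    Y = (H @ X - c * X) / e     # degree-1 term, T_1
    for _ in range(2, m + 1):
        # T_k(t) = 2 t T_{k-1}(t) - T_{k-2}(t)
        Y_next = 2.0 * (H @ Y - c * Y) / e - X
        X, Y = Y, Y_next
    return Y
```

The only operation involving `H` is a product with a block of vectors, and all columns of `X` are filtered at once, which is what makes this kind of scheme attractive for parallel execution.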

arXiv.org e-Print Archive

CiteSeerX

HAL-CEA

Contour integral method for obtaining the self-energy matrices of electrodes in electron transport calculations

Author: Futamura Yasunori
Imakura Akira
Iwase Shigeru
Ono Tomoya
Sakurai Tetsuya
Tsukamoto Shigeru
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2018

We propose an efficient computational method for evaluating the self-energy matrices of electrodes to study ballistic electron transport properties in nanoscale systems. To reduce the high computational cost incurred in large systems, a contour integral eigensolver based on the Sakurai-Sugiura method combined with the shifted biconjugate gradient method is developed to solve the exponential-type eigenvalue problem for complex wave vectors. A remarkable feature of the proposed algorithm is that the numerical procedure is very similar to that of conventional band structure calculations. We implement the developed method in the framework of the real-space higher-order finite difference scheme with nonlocal pseudopotentials. Numerical tests for a wide variety of materials validate the robustness, accuracy, and efficiency of the proposed method. As an illustration of the method, we present the electron transport property of free-standing silicene with a line defect originating from the reversed buckled phases. Comment: 36 pages, 13 figures, 2 tables

arXiv.org e-Print Archive

Juelich Shared Electronic Resources

A linear algebra processor using Monte Carlo methods

Author: Alexandrov Vassil Nikolov
Cadenas Medina Jose Oswaldo
Megson Graham M
Plaks T P
Publication date: 11/09/2003

Central Archive at the University of Reading

Alternative methods for representing the inverse of linear programming basis matrices

Author: Mitra G
Tamiz M
Publication venue: Brunel University
Publication date: 01/01/1988

Methods for representing the inverse of Linear Programming (LP) basis matrices are closely related to techniques for solving a system of sparse unsymmetric linear equations by direct methods. It is now well accepted that for these problems the static process of reordering the matrix in the lower block triangular (LBT) form constitutes the initial step. We introduce a combined static and dynamic factorisation of a basis matrix and derive its inverse, which we call the partial elimination form of the inverse (PEFI). This factorisation takes advantage of the LBT structure and produces a sparser representation of the inverse than the elimination form of the inverse (EFI). In this we make use of the original columns (of the constraint matrix) which are in the basis. To represent the factored inverse it is, however, necessary to introduce special data structures which are used in the forward and the backward transformations (the two major algorithmic steps) of the simplex method. These correspond to solving a system of equations and solving a system of equations with the transposed matrix, respectively. In this paper we compare the nonzero build-up of PEFI with that of EFI. We have also investigated alternative methods for updating the basis inverse in the PEFI representation. The results of our experimental investigation are presented in this paper.

Brunel University Research Archive