A Novel Partitioning Method for Accelerating the Block Cimmino Algorithm
We propose a novel block-row partitioning method in order to improve the
convergence rate of the block Cimmino algorithm for solving general sparse
linear systems of equations. The convergence rate of the block Cimmino
algorithm depends on the orthogonality among the block rows obtained by the
partitioning method. The proposed method takes numerical orthogonality among
block rows into account by proposing a row inner-product graph model of the
coefficient matrix. In the graph partitioning formulation defined on this graph
model, the partitioning objective of minimizing the cutsize directly
corresponds to minimizing the sum of inter-block inner products between block
rows, thus leading to an improvement in the eigenvalue spectrum of the iteration
matrix. This in turn leads to a significant reduction in the number of
iterations required for convergence. Extensive experiments conducted on a large
set of matrices confirm the validity of the proposed method against a
state-of-the-art method.
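
To make the cutsize objective concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that the edge weight between rows i and j is the absolute value of their inner product, and that the cutsize is the sum of the weights of edges whose endpoints fall in different blocks. The function names `row_inner_product_graph` and `cutsize` and the small dense example matrix are hypothetical, chosen only for illustration. A real sparse matrix would need a sparse representation and a graph partitioner.

```python
import numpy as np

def row_inner_product_graph(A):
    """Dense weight matrix with W[i, j] = |a_i . a_j| for rows i != j.

    Assumption: absolute inner products are used as edge weights.
    """
    W = np.abs(A @ A.T)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def cutsize(W, parts):
    """Sum of weights of edges whose endpoints lie in different blocks."""
    parts = np.asarray(parts)
    crossing = parts[:, None] != parts[None, :]
    return W[crossing].sum() / 2.0  # each undirected edge appears twice in W

# Rows 0 and 1 are mutually orthogonal; rows 2 and 3 are not.
A = np.array([[1.0,  1.0, 0.0, 0.0],
              [1.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 1.0, 2.0],
              [0.0,  0.0, 2.0, 1.0]])
W = row_inner_product_graph(A)

# Keeping the non-orthogonal rows 2 and 3 together leaves no inner product cut.
good = cutsize(W, [0, 0, 1, 1])
# Separating rows 2 and 3 places their inner product on the cut.
bad = cutsize(W, [0, 1, 0, 1])
print(good, bad)
```

In this toy case the first partition yields mutually orthogonal block rows, which is the situation the abstract associates with faster block Cimmino convergence.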
Parallel Minimum Norm Solution of Sparse Block Diagonal Column Overlapped Underdetermined Systems
Underdetermined systems of equations in which the minimum norm solution needs to be computed arise in many applications, such as geophysics, signal processing, and biomedical engineering. In this article, we introduce a new parallel algorithm for obtaining the minimum 2-norm solution of an underdetermined system of equations. The proposed algorithm is based on the Balance scheme, which was originally developed for the parallel solution of banded linear systems. The proposed scheme assumes a generalized banded form in which the coefficient matrix has a column-overlapped block structure and the blocks can be dense or sparse. In this article, we implement the more general sparse case. The blocks can be handled independently by any existing sequential or parallel QR factorization library. A smaller reduced system is formed and solved before obtaining the minimum norm solution of the original system in parallel. We experimentally compare and confirm the error bound of the proposed method against QR-factorization-based techniques using true single-precision arithmetic. We implement the proposed algorithm using the message passing paradigm. We demonstrate the numerical effectiveness as well as the parallel scalability of the proposed algorithm on both shared and distributed memory architectures for solving various types of problems.
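
For reference, the minimum 2-norm solution itself can be sketched with a plain sequential QR factorization. This is the standard textbook approach, not the Balance-based parallel algorithm described above. It assumes a small dense matrix with full row rank, and the function name `min_norm_solution` is hypothetical. With the reduced factorization A^T = QR, the minimum norm solution is x = Q R^{-T} b.

```python
import numpy as np

def min_norm_solution(A, b):
    """Minimum 2-norm solution of A x = b for A with full row rank (m < n).

    Uses the reduced QR factorization A^T = Q R, so that x = Q R^{-T} b.
    """
    Q, R = np.linalg.qr(A.T)     # Q is n x m, R is m x m upper triangular
    y = np.linalg.solve(R.T, b)  # solve R^T y = b
    return Q @ y

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))  # 3 equations, 6 unknowns
b = rng.standard_normal(3)
x = min_norm_solution(A, b)

# x satisfies the system and agrees with the pseudoinverse solution.
residual = np.linalg.norm(A @ x - b)
x_pinv = np.linalg.pinv(A) @ b
```

Because x lies in the row space of A, any other solution differs from it by a null-space component and therefore has a larger 2-norm.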