15,508 research outputs found

Domain Decomposition Based High Performance Parallel Computing

Author: Khaitan Siddhartha
Raju Mandhapati P.
Publication venue: International Journal of Computer Science Issues, IJCSI
Publication date: 01/10/2009

The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. Although the performance of sparse direct solvers has improved significantly, parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition techniques. A highly efficient sparse direct solver, PARDISO, is used in this study. The scalability of both Newton and modified Newton algorithms is tested.

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices

Author: Baruch E.
Cannon L. E.
Choi J
Choi J.
Choi J.
Solomonik E.
Szabo A.
van de Geijn R. A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 09/10/2015

A task-based formulation of the Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of iterative computation of square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework). Comment: 8 pages, 6 figures, accepted to IA3 2015. arXiv admin note: text overlap with arXiv:1504.0504

arXiv.org e-Print Archive

Crossref

Objective multiscale analysis of random heterogeneous materials

Author: Everdij F. P. X.
Lloberas Valls Oriol
Rixen D. J.
Simone A.
Sluys L. J.
Publication venue: CIMNE
Publication date: 01/01/2013

The multiscale framework presented in [1, 2] is assessed in this contribution for a study of random heterogeneous materials. Results are compared to direct numerical simulations (DNS) and the sensitivity to user-defined parameters such as the domain decomposition type and initial coarse scale resolution is reported. The parallel performance of the implementation is studied for different domain decompositions.

UPCommons. Portal del coneixement obert de la UPC

Distributing the Kalman Filter for Large-Scale Systems

Author: Khan Usman A.
Moura Jose M. F.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/02/2008

This paper derives a distributed Kalman filter to estimate a sparsely connected, large-scale, $n$-dimensional, dynamical system monitored by a network of $N$ sensors. Local Kalman filters are implemented on the ($n_l$-dimensional, where $n_l \ll n$) sub-systems that are obtained after spatially decomposing the large-scale system. The resulting sub-systems overlap, which along with an assimilation procedure on the local Kalman filters, preserve an $L$th order Gauss-Markovian structure of the centralized error processes. The information loss due to the $L$th order Gauss-Markovian approximation is controllable as it can be characterized by a divergence that decreases as $L \uparrow$. The order of the approximation, $L$, leads to a lower bound on the dimension of the sub-systems, hence, providing a criterion for sub-system selection. The assimilation procedure is carried out on the local error covariances with a distributed iterate collapse inversion (DICI) algorithm that we introduce. The DICI algorithm computes the (approximated) centralized Riccati and Lyapunov equations iteratively with only local communication and low-order computation. We fuse the observations that are common among the local Kalman filters using bipartite fusion graphs and consensus averaging algorithms. The proposed algorithm achieves full distribution of the Kalman filter that is coherent with the centralized Kalman filter with an $L$th order Gaussian-Markovian structure on the centralized error processes. Nowhere storage, communication, or computation of $n$-dimensional vectors and matrices is needed; only $n_l \ll n$ dimensional vectors and matrices are communicated or used in the computation at the sensors.

arXiv.org e-Print Archive

Crossref

Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication

Author: Azad Ariful
Ballard Grey
Buluc Aydin
Demmel James
Grigori Laura
Schwartz Oded
Toledo Sivan
Williams Samuel
Publication venue: Society for Industrial & Applied Mathematics (SIAM)
Publication date: 01/01/2016

Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdős-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first ever implementation of the 3D SpGEMM formulation that also exploits multiple (intra-node and inter-node) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrency. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

eScholarship - University of California

Hal-Diderot