Search CORE

1,243 research outputs found

Optimising Sparse Matrix Vector multiplication for large scale FEM problems on FPGA

Author: Burovskiy P
Grigoras P
Luk W
Sherwin S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/08/2016
Field of study

Sparse Matrix Vector multiplication (SpMV) is an important kernel in many scientific applications. In this work we propose an architecture and an automated customisation method to detect and optimise the architecture for block diagonal sparse matrices. We evaluate the proposed approach in the context of the spectral/hp Finite Element Method, using the local matrix assembly approach. This problem leads to a large sparse system of linear equations with block diagonal matrix which is typically solved using an iterative method such as the Preconditioned Conjugate Gradient. The efficiency of the proposed architecture combined with the effectiveness of the proposed customisation method reduces BRAM resource utilisation by as much as 10 times, while achieving identical throughput with existing state of the art designs and requiring minimal development effort from the end user. In the context of the Finite Element Method, our approach enables the solution of larger problems than previously possible, enabling the applicability of FPGAs to more interesting HPC problems

Spiral - Imperial College Digital Repository

Numerical Simulation in Automotive Design

Author: Lonsdale G
Publication venue: CERN
Publication date: 01/01/1996
Field of study

CERN Document Server

Research summary, January 1989 - June 1990

Author
Publication venue
Publication date
Field of study

The Research Institute for Advanced Computer Science (RIACS) was established at NASA ARC in June of 1983. RIACS is privately operated by the Universities Space Research Association (USRA), a consortium of 62 universities with graduate programs in the aerospace sciences, under a Cooperative Agreement with NASA. RIACS serves as the representative of the USRA universities at ARC. This document reports our activities and accomplishments for the period 1 Jan. 1989 - 30 Jun. 1990. The following topics are covered: learning systems, networked systems, and parallel systems

NASA Technical Reports Server

A 2D algorithm with asymmetric workload for the UPC conjugate gradient method

Author: DH Bailey
H Shan
J González-Domínguez
JC Pichel
Jorge González-Domínguez
Juan Touriño
María J. Martín
Osni A. Marques
R Barrett
R Vuduc
Y Saad
Y Zheng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

This is a post-peer-review, pre-copyedit version of an article published in Journal of Supercomputing. The final authenticated version is available online at: https://doi.org/10.1007/s11227-014-1300-0[Abstract] This paper examines four different strategies, each one with its own data distribution, for implementing the parallel conjugate gradient (CG) method and how they impact communication and overall performance. Firstly, typical 1D and 2D distributions of the matrix involved in CG computations are considered. Then, a new 2D version of the CG method with asymmetric workload, based on leaving some threads idle during part of the computation to reduce communication, is proposed. The four strategies are independent of sparse storage schemes and are implemented using Unified Parallel C (UPC), a Partitioned Global Address Space (PGAS) language. The strategies are evaluated on two different platforms through a set of matrices that exhibit distinct sparse patterns, demonstrating that our asymmetric proposal outperforms the others except for one matrix on one platform.Ministerio de Economía y Competitividad; TIN2013-42148-PXunta de Galicia; GRC2013/055United States. Department of Energy; DEAC03-76SF0009

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref