In this paper we will concentrate on distributed memory mesh multiprocessors. Multiprocessor systems with mesh topology have a simple interconnection network, which makes them attractive for massively parallel computation. A significant number of real machines based on this architecture are currently available. Multiprocessors with mesh topology consist of a network of processing elements (PEs) arranged as a d-dimensional matrix A(p_{d-1}, p_{d-2}, ..., p_0), where p_i is the size in dimension i. A PE located at A(i_{d-1}, i_{d-2}, ..., i_0) is connected to the PEs located at A(i_{d-1}, ..., i_j ± 1, ..., i_0) with 0 ≤ j < d (if they exist). In this work we will consider two-dimensional meshes of size p×q. In what follows, we will call the processor located in row r and column s of the mesh PE[r,s]. Using simple indexing, PE[t] represents the t-th PE, 0 ≤ t < p×q. In designing parallel sparse algorithms, a key issue is the distribution of the workload among the PEs. Usually we have to resolve the tradeoff between a balanced distribution of the workload and minimal communication and synchronization overhead. The complexity of the parallel algorithm for the SpMxV product is strongly conditioned by the distribution of the data. Choosing a good partitioning for the sparse matrix is crucial in order to balance the load and minimize communications. In this work we present a new distribution method that we call Multiple Recursive Decomposition (MRD). The MRD method performs the data partitioning using the prime factor decomposition of the dimensions of the multiprocessor. Furthermore, we introduce a new variant of the Scatter distribution, which we call Block Row Scatter (BRS) and which organizes the storage of data by rows of blocks. We will analyze and compare the performan..
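The mesh connectivity and PE indexing described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the row-major mapping t = r×q + s is an assumption: the abstract only says "simple indexing" without giving the formula. Coordinates are passed as a tuple in the order written in the abstract, so for a two-dimensional mesh the tuple is (r, s).

```python
from typing import List, Tuple


def mesh_neighbors(coords: Tuple[int, ...], dims: Tuple[int, ...]) -> List[Tuple[int, ...]]:
    """Return the coordinates of the PEs directly linked to `coords`.

    `dims` holds the mesh size in each dimension. A neighbor differs by
    exactly +1 or -1 in one coordinate and is kept only if it exists
    (a mesh has no wraparound links).
    """
    neighbors = []
    for j in range(len(dims)):
        for step in (-1, 1):
            c = coords[j] + step
            if 0 <= c < dims[j]:
                neighbors.append(coords[:j] + (c,) + coords[j + 1:])
    return neighbors


def linear_index(r: int, s: int, q: int) -> int:
    """Map PE[r,s] in a p x q mesh to PE[t]. Assumption: row-major order."""
    return r * q + s


def grid_position(t: int, q: int) -> Tuple[int, int]:
    """Inverse of linear_index under the same row-major assumption."""
    return divmod(t, q)
```

For example, in a 2×3 mesh the corner PE[0,0] has only two neighbors, PE[1,0] and PE[0,1], while under the assumed mapping PE[1,2] corresponds to PE[5].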

Year: 1995

OAI identifier:
oai:CiteSeerX.psu:10.1.1.41.2538

Provided by:
CiteSeerX
