Multiplication of Matrices of Arbitrary Shape on a Data Parallel Computer
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been implemented on the Connection Machine system CM-200 are described. No assumption is made on the shape or size of the operands. For matrix-matrix multiplication, both the nonsystolic and the systolic algorithms are outlined. A systolic algorithm that computes the product matrix in-place is described in detail. We show that a level-3 DBLAS yields better performance than a level-2 DBLAS. On the Connection Machine system CM-200, blocking yields a performance improvement by a factor of up to three over level-2 DBLAS. For certain matrix shapes the systolic algorithms offer both improved performance and significantly reduced temporary storage requirements compared to the nonsystolic block algorithms. We show that, in order to minimize the communication time, an algorithm that leaves the largest operand matrix stationary should be chosen for matrix-matrix multiplication. Furthermore, it is shown both analytically and experimentally that the optimum shape of the processor array yields square stationary submatrices in each processor, i.e., the ratio between the lengths of the axes of the processing array must be the same as the ratio between the corresponding axes of the stationary matrix. The optimum processor array shape may also improve performance for square matrices. For rectangular matrices a factor of 30 improvement was observed for an optimum processor array shape compared to a poorly chosen processor array shape.
Engineering and Applied Science
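The shape rule stated in the abstract (square stationary submatrices in each processor) can be illustrated with a small sketch. This is not code from the paper or from CMSSL; the function name `best_grid_shape` and its parameters are hypothetical, and it assumes the processors form a two-dimensional grid whose axis lengths multiply to the processor count.

```python
def best_grid_shape(rows, cols, num_procs):
    """Pick a processor grid (p_r, p_c) with p_r * p_c == num_procs whose
    local block of the stationary matrix is as close to square as possible.

    Hypothetical helper for illustration only; not part of any library.
    """
    best = None
    for p_r in range(1, num_procs + 1):
        if num_procs % p_r:
            continue  # p_r must divide the processor count
        p_c = num_procs // p_r
        local_rows = rows / p_r  # rows of the stationary matrix per processor
        local_cols = cols / p_c  # columns of the stationary matrix per processor
        # Aspect ratio >= 1.0; exactly 1.0 means a perfectly square local block.
        aspect = max(local_rows, local_cols) / min(local_rows, local_cols)
        if best is None or aspect < best[0]:
            best = (aspect, p_r, p_c)
    return best[1], best[2]


if __name__ == "__main__":
    # Square stationary matrix: a square grid gives square local blocks.
    print(best_grid_shape(1024, 1024, 64))
    # A 16:1 stationary matrix: the grid axes take the same 16:1 ratio.
    print(best_grid_shape(4096, 256, 64))
```

In the second call the grid axes are in the same ratio as the matrix axes, which is the condition the abstract gives for the optimum processor array shape.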
Efficient algorithms for the optical multi-trees (OMULT) architecture.
In this thesis, we report our investigations into efficiently implementing algorithms on the recently proposed Optical Multi-Trees (OMULT) multiprocessor interconnection architecture, which uses both electronic and optical links among processors. We have investigated algorithms for the multiplication of two matrices of size n² × n² and of two matrices of arbitrary size, the prefix-sum of a series, and some fundamental computational geometry problems. We show that some common computational geometry problems on n data points (finding the convex hull, the smallest enclosing box, the empirical cumulative distribution function, and all nearest neighbors) can be solved on the OMULT network in O(log n) time, compared to O(√n) time for algorithms on the Optical Transpose Interconnection System (OTIS) mesh for each of these problems. Finally, we have implemented our algorithm for matrix multiplication using the SimJava simulation tool and feel that this is a convenient environment for testing such parallel algorithms.
Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .I85. Source: Masters Abstracts International, Volume: 43-05, page: 1751. Adviser: Subir Bandyopadhyay. Thesis (M.Sc.)--University of Windsor (Canada), 2004
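To give a feel for how a prefix-sum can finish in a logarithmic number of parallel rounds, here is a minimal sketch of the generic doubling-stride scan, simulated sequentially. It is an assumption chosen for illustration, not the thesis's OMULT algorithm, which the abstract does not describe; the function name `prefix_sum_log_steps` is hypothetical.

```python
def prefix_sum_log_steps(values):
    """Inclusive prefix sum using doubling strides.

    Each pass of the while loop stands for one parallel round in which
    every position could be updated by its own processor. Returns the
    prefix sums and the number of rounds used.
    """
    result = list(values)
    n = len(result)
    stride = 1
    rounds = 0
    while stride < n:
        prev = result[:]  # all updates in a round read the previous round's values
        for i in range(stride, n):
            result[i] = prev[i] + prev[i - stride]
        stride *= 2
        rounds += 1
    return result, rounds


if __name__ == "__main__":
    sums, rounds = prefix_sum_log_steps([1, 2, 3, 4, 5, 6, 7, 8])
    print(sums, rounds)
```

For 8 inputs the loop runs 3 rounds, i.e. ceil(log2 n); on a real parallel machine the cost of each round also depends on how far the data must travel over the interconnect.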
Multiplication of Matrices of Arbitrary Shape on a Data Parallel Computer
Some level-2 and level-3 Distributed Basic Linear Algebra Subroutines (DBLAS) that have been implemented on the Connection Machine system CM-200 are described. For matrix-matrix multiplication, both the nonsystolic and the systolic algorithms are outlined. A systolic algorithm that computes the product matrix in-place is described in detail. All algorithms that are presented here are part of the Connection Machine Scientific Software Library, CMSSL. We show that a level-3 DBLAS yields better performance than a level-2 DBLAS. On the Connection Machine system CM-200, blocking yields a performance improvement by a factor of up to three over level-2 DBLAS. For certain matrix shapes the systolic algorithms offer both improved performance and significantly reduced temporary storage requirements compared to the nonsystolic block algorithms. The performance improvement over the blocked nonsystolic algorithms may be as much as a factor of seven, or more than a factor of 20 over the lev..