
    BLAS-3 for the Quadrics parallel computer

    No full text
    A scalable parallel algorithm for matrix multiplication on SISAMD computers is presented. Our method enables us to implement an efficient BLAS library on the Italian APE100/Quadrics SISAMD massively parallel computer, on which scalable parallel BLAS-3 routines were hitherto not available. The proposed approach is based on one-dimensional ring connectivity. The flow of data is hyper-systolic. The communication overhead is competitive with that of established algorithms for SIMD and MIMD machines. Advantages are that (i) the layout of the matrices is preserved during the computation, (ii) BLAS-2 routines fit well into this layout, and (iii) indexed addressing is avoided, which renders the algorithm suitable for SISAMD machines and, in this way, for all other types of parallel computers. On the APE100/Quadrics, a performance of nearly 25% of peak performance is achieved for multiplications of complex matrices.

    1 Introduction

    The efficient implementation of basic linear algebra subroutines (BLAS) fo..
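
    The abstract only sketches the data flow, so the following is a minimal NumPy sketch of how a block matrix product can be organised on a one-dimensional ring while the block-row layout stays in place. The function name ring_matmul, the pure-Python simulation of the P processors, and the plain one-step rotation schedule are assumptions made for illustration; this is not the paper's hyper-systolic algorithm or its BLAS-2 building blocks.

import numpy as np

def ring_matmul(A, B, P):
    """Simulate a 1D-ring parallel block matrix multiply C = A @ B.

    Each of the P virtual processors owns one block row of A, one block
    row of B and one block row of C.  At every step the B blocks are
    rotated one position around the ring, so after P steps every
    processor has seen every block row of B exactly once and its block
    row of C is complete.  Illustrative sketch only: a plain systolic
    ring rotation with no real communication, not the hyper-systolic
    schedule of the paper.
    """
    N = A.shape[0]
    assert N % P == 0, "matrix order must be divisible by the ring size"
    b = N // P                                     # block size

    # Block-row layout: processor p owns rows p*b .. (p+1)*b - 1.
    A_loc = [A[p*b:(p+1)*b, :].copy() for p in range(P)]
    B_loc = [B[p*b:(p+1)*b, :].copy() for p in range(P)]
    C_loc = [np.zeros((b, N), dtype=A.dtype) for p in range(P)]
    owner = list(range(P))                         # index of the B block row each processor currently holds

    for step in range(P):
        # Local rank-b update: C_p += A_p[:, k-th column block] @ B_k with k = owner[p].
        for p in range(P):
            k = owner[p]
            C_loc[p] += A_loc[p][:, k*b:(k+1)*b] @ B_loc[p]
        # "Communication" step: rotate the B blocks one position around the ring.
        B_loc = B_loc[1:] + B_loc[:1]
        owner = owner[1:] + owner[:1]

    return np.vstack(C_loc)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
    B = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
    C = ring_matmul(A, B, P=4)
    print(np.allclose(C, A @ B))                   # expected: True

    In this sketch the A and C block rows never move and only the B blocks circulate, which is the sense in which a ring scheme can preserve the matrix layout during the computation, as in advantage (i) of the abstract. The paper's hyper-systolic data flow organises the shifts differently, so the sketch illustrates only the ring layout, not the actual communication schedule.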