Recommended from our members
CMSSL: A scalable scientific software library
Massively parallel processors introduce new demands on software systems with respect to performance, scalability, robustness and portability. The increased complexity of the memory systems and the increased range of problem sizes for which a given piece of software is used pose serious challenges for software developers. The Connection Machine Scientific Software Library, CMSSL, uses several novel techniques to meet these challenges. The CMSSL contains routines for managing the data distribution and provides data distribution independent functionality. High performance is achieved through careful scheduling of operations and data motion, and through the automatic selection of algorithms at run-time. We discuss some of the techniques used, and provide evidence that CMSSL has reached the goals of performance and scalability for an important set of applications.
Engineering and Applied Science
Local Basic Linear Algebra Subroutines (LBLAS) for Distributed Memory Architectures and Languages with Array Syntax
We describe a subset of the level-1, level-2, and level-3 BLAS implemented for each node of the Connection Machine system CM-200. The routines, collectively called LBLAS, have interfaces consistent with languages with an array syntax such as Fortran 90. One novel feature, important for distributed memory architectures, is the capability of performing computations on multiple instances of objects in a single call. The number of instances and their allocation across memory units, and the strides for the different axes within the local memories, are derived from an array descriptor that contains type, shape, and data distribution information. Another novel feature of the LBLAS is a selection of loop order for rank-1 updates and matrix-matrix multiplication based upon array shapes, strides, and DRAM page faults. The peak efficiencies for the routines are in excess of 75%. Matrix-vector multiplication achieves a peak efficiency of 92%. The optimization of loop ordering has a success rate exceeding 99.8% for matrices for which the sum of the lengths of the axes is at most 60. The success rate is even higher for all possible matrix shapes. The performance loss when a nonoptimal choice is made is less than ~15% of peak and typically less than 1% of peak. We also show that the performance gain for high rank updates may be as much as a factor of 6 over rank-1 updates.
Engineering and Applied Science
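The loop-order selection mentioned in the LBLAS abstract can be illustrated with a small sketch. This is not the LBLAS implementation: the abstract only says that the choice depends on array shapes, strides, and DRAM page faults. The function names `rank1_update` and `choose_loop_order`, and the rule "run the innermost loop over the axis with the smaller stride", are assumptions made purely for illustration.

```python
def rank1_update(a, x, y, order):
    """Compute A += x * y^T in place on a list-of-lists matrix.

    `order` names the loop nesting: "ij" puts the row loop outside and
    the column loop inside; "ji" is the reverse. Both give the same
    result; only the memory access pattern differs.
    """
    m, n = len(x), len(y)
    if order == "ij":
        for i in range(m):
            for j in range(n):
                a[i][j] += x[i] * y[j]
    elif order == "ji":
        for j in range(n):
            for i in range(m):
                a[i][j] += x[i] * y[j]
    else:
        raise ValueError("order must be 'ij' or 'ji'")
    return a


def choose_loop_order(strides):
    """Hypothetical heuristic (an assumption, not the LBLAS rule):
    make the innermost loop traverse the axis with the smaller stride,
    so consecutive iterations touch nearby memory locations.
    """
    stride_i, stride_j = strides
    return "ij" if stride_j <= stride_i else "ji"


# Usage: a 2 x 3 matrix stored row by row has strides (3, 1).
a = [[0.0] * 3 for _ in range(2)]
order = choose_loop_order((3, 1))
rank1_update(a, [1.0, 2.0], [1.0, 2.0, 3.0], order)
```

A production routine would also account for array shape and page-fault behaviour, as the abstract indicates; the stride comparison here only conveys the general idea of picking a loop nest from the array descriptor.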
GRASP: a data analysis package for gravitational wave detection
GRASP (Gravitational Radiation Analysis & Simulation Package) is a public-domain software tool-kit designed for analysis and simulation of data from gravitational wave detectors. This user's manual describes the use and features of this package. Note: an up-to-date version of this manual may be obtained at: http://www.lsc-group.phys.uwm.edu/ ballen/grasp-distribution/. The software package is also available from this site.