A bibliography on parallel and vector numerical algorithms
This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.
Index Transformation Algorithms in a Linear Algebra Framework
We present a linear algebraic formulation for a class of index transformations such as Gray code encoding and decoding, matrix transpose, bit reversal, vector reversal, shuffles, and other index or dimension permutations. This formulation unifies and simplifies these transformations and can be used to derive algorithms for hypercube multiprocessors. We show how all the widely known properties of Gray codes, and some not so well-known properties as well, can be derived using this framework. Using this framework, we relate hypercube communication algorithms to Gauss-Jordan elimination on a matrix of 0's and 1's.
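The Gray code encoding and decoding mentioned above can be illustrated with the standard bit-level identities (a sketch of the well-known binary-reflected Gray code, not the paper's matrix formulation):

```python
def gray_encode(n: int) -> int:
    # Binary-reflected Gray code: XOR the integer with its right shift.
    return n ^ (n >> 1)

def gray_decode(g: int) -> int:
    # Invert by taking the cumulative XOR of all right shifts (prefix XOR).
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Adjacent codewords differ in exactly one bit, the property that makes
# Gray codes useful for hypercube embeddings:
codes = [gray_encode(i) for i in range(8)]
# codes == [0, 1, 3, 2, 6, 7, 5, 4]
```

The single-bit-change property is what lets consecutive indices map to neighboring hypercube nodes.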
Multi-Dimensional Astrophysical Structural and Dynamical Analysis I. Development of a Nonlinear Finite Element Approach
A new field of numerical astrophysics is introduced which addresses the
solution of large, multidimensional structural or slowly-evolving problems
(rotating stars, interacting binaries, thick advective accretion disks, four
dimensional spacetimes, etc.). The technique employed is the Finite Element
Method (FEM), commonly used to solve engineering structural problems. The
approach developed herein has the following key features:
1. The computational mesh can extend into the time dimension, as well as
space, perhaps only a few cells, or throughout spacetime.
2. Virtually all equations describing the astrophysics of continuous media,
including the field equations, can be written in a compact form similar to that
routinely solved by most engineering finite element codes.
3. The transformations that occur naturally in the four-dimensional FEM
possess both coordinate and boost features, such that
(a) although the computational mesh may have a complex, non-analytic,
curvilinear structure, the physical equations still can be written in a simple
coordinate system independent of the mesh geometry.
(b) if the mesh has a complex flow velocity with respect to coordinate space,
the transformations will form the proper arbitrary Lagrangian- Eulerian
advective derivatives automatically.
4. The complex difference equations on the arbitrary curvilinear grid are
generated automatically from encoded differential equations.
This first paper concentrates on developing a robust and widely-applicable
set of techniques using the nonlinear FEM and presents some examples.
Comment: 28 pages, 9 figures; added integral boundary conditions, allowing very rapidly-rotating stars; accepted for publication in Ap.
Trivariate polynomial approximation on Lissajous curves
We study Lissajous curves in the 3-cube that generate algebraic cubature
formulas on a special family of rank-1 Chebyshev lattices. These formulas are
used to construct trivariate hyperinterpolation polynomials via a single 1-d
Fast Chebyshev Transform (by the Chebfun package), and to compute discrete
extremal sets of Fekete and Leja type for trivariate polynomial interpolation.
Applications could arise in the framework of Lissajous sampling for MPI
(Magnetic Particle Imaging).
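A Lissajous curve in the 3-cube can be sampled along its parameter as sketched below; the frequency triple (1, 2, 3) and the equispaced parameter grid are illustrative choices, not the paper's specific rank-1 Chebyshev lattice family:

```python
import math

def lissajous_3d(t, freqs=(1, 2, 3)):
    # A point on a Lissajous curve in the 3-cube [-1, 1]^3.
    a, b, c = freqs
    return (math.cos(a * t), math.cos(b * t), math.cos(c * t))

def sample_curve(n, freqs=(1, 2, 3)):
    # n equispaced parameter values on [0, 2*pi): candidate nodes for a
    # cubature or hyperinterpolation rule supported on the curve.
    return [lissajous_3d(2 * math.pi * k / n, freqs) for k in range(n)]
```

Because each coordinate is a cosine of the parameter, the sampled nodes lie on Chebyshev-type point sets along each axis.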
Efficient Generation of Correctness Certificates for the Abstract Domain of Polyhedra
Polyhedra form an established abstract domain for inferring runtime
properties of programs using abstract interpretation. Computations on them need
to be certified for the whole static analysis results to be trusted. In this
work, we look at how far we can get down the road of a posteriori verification
to lower the overhead of certification of the abstract domain of polyhedra. We
demonstrate methods for making the cost of inclusion certificate generation
negligible. From a performance point of view, our single-representation,
constraints-based implementation is competitive with state-of-the-art
implementations.
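An inclusion certificate for polyhedra can be checked a posteriori via Farkas' lemma: a nonnegative combination of the constraints of P = {x : Ax <= b} that yields the target constraint cx <= d proves P ⊆ {x : cx <= d}. The sketch below (dense lists, a hypothetical `check_inclusion_certificate` helper) shows only the shape of such a check, not the paper's certified implementation:

```python
def check_inclusion_certificate(A, b, c, d, lam, tol=1e-9):
    """Check lam >= 0, lam^T A == c, and lam^T b <= d: by Farkas' lemma
    this certifies {x : Ax <= b} is included in {x : cx <= d}."""
    if any(l < -tol for l in lam):
        return False
    m, n = len(A), len(A[0])
    # lam^T A must reproduce the target constraint's coefficients.
    for j in range(n):
        if abs(sum(lam[i] * A[i][j] for i in range(m)) - c[j]) > tol:
            return False
    # The combined right-hand side must not exceed the target bound.
    return sum(lam[i] * b[i] for i in range(m)) <= d + tol
```

Checking a certificate like this is linear in the size of the constraint system, which is why a posteriori verification can be much cheaper than certifying the computation that produced it.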
Communication and Matrix Computations on Large Message Passing Systems
This paper is concerned with the consequences for matrix computations
of having a rather large number of general purpose processors, say
ten or twenty thousand, connected in a network in such a way that a
processor can communicate only with its immediate neighbors. Certain
communication tasks associated with most matrix algorithms are
defined and formulas developed for the time required to perform them
under several communication regimes. The results are compared with
the times for nominal floating point operations. The results
suggest that it is possible to use a large number of processors to
solve matrix problems at a relatively fine granularity, provided fine
grain communication is available.
Additional figures are available via ftp from thales.cs.umd.edu in
the directory pub/reports.
(Also cross-referenced as UMIACS-TR-88-81.)
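The per-task time formulas developed in papers of this kind are typically built on a linear latency-plus-bandwidth cost model. The sketch below uses that standard model with illustrative constants and hypothetical function names; it is not the paper's actual formulas:

```python
def transfer_time(n_words, alpha=50e-6, beta=1e-6):
    # Linear cost model for one message: alpha is per-message startup
    # (latency), beta is per-word transfer time. Constants illustrative.
    return alpha + beta * n_words

def nearest_neighbor_broadcast(n_words, diameter, alpha=50e-6, beta=1e-6):
    # Store-and-forward broadcast when a processor can talk only to its
    # immediate neighbors: the message is relayed hop by hop along a path
    # of length `diameter`, paying the full message cost at each hop.
    return diameter * transfer_time(n_words, alpha, beta)
```

Comparing such communication times against floating point operation times is what determines the granularity at which a large processor network remains efficient.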
Hypercube matrix computation task
A major objective of the Hypercube Matrix Computation effort at the Jet Propulsion Laboratory (JPL) is to investigate the applicability of a parallel computing architecture to the solution of large-scale electromagnetic scattering problems. Three scattering analysis codes are being implemented and assessed on a JPL/California Institute of Technology (Caltech) Mark 3 Hypercube. The codes, which utilize different underlying algorithms, provide a means of evaluating the general applicability of this parallel architecture. The three analysis codes being implemented are a frequency domain method of moments code, a time domain finite difference code, and a frequency domain finite elements code. These analysis capabilities are being integrated into an electromagnetics interactive analysis workstation which can serve as a design tool for the construction of antennas and other radiating or scattering structures. The first two years of work on the Hypercube Matrix Computation effort are summarized, including both new developments and results as well as work previously reported in the Hypercube Matrix Computation Task: Final Report for 1986 to 1987 (JPL Publication 87-18).
A Computational Model for Tensor Core Units
To respond to the need of efficient training and inference of deep neural
networks, a plethora of domain-specific hardware architectures have been
introduced, such as Google Tensor Processing Units and NVIDIA Tensor Cores. A
common feature of these architectures is a hardware circuit for efficiently
computing a dense matrix multiplication of a given small size. In order to
broaden the class of algorithms that exploit these systems, we propose a
computational model, named the TCU model, that captures the ability to natively
multiply small matrices. We then use the TCU model for designing fast
algorithms for several problems, including matrix operations (dense and sparse
multiplication, Gaussian Elimination), graph algorithms (transitive closure,
all pairs shortest distances), Discrete Fourier Transform, stencil
computations, integer multiplication, and polynomial evaluation. We finally
highlight a relation between the TCU model and the external memory model.
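The TCU model's core primitive, natively multiplying small matrices, can be sketched as a tiled matrix multiplication in which a tile-multiply routine stands in for the hardware circuit. All names below are hypothetical and the code assumes the tile size s divides n; it illustrates the model's structure, not the paper's algorithms:

```python
def tile_mm(A, B):
    # Stand-in for the hardware s x s multiply circuit (the TCU primitive).
    s = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(s))
             for j in range(s)] for i in range(s)]

def tcu_matmul(A, B, s):
    # n x n product expressed entirely as s x s tile multiplications,
    # the only multiplication the model charges as a native operation.
    # Assumes s divides n.
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for I in range(0, n, s):
        for J in range(0, n, s):
            for K in range(0, n, s):
                At = [row[K:K + s] for row in A[I:I + s]]
                Bt = [row[J:J + s] for row in B[K:K + s]]
                Ct = tile_mm(At, Bt)
                for i in range(s):
                    for j in range(s):
                        C[I + i][J + j] += Ct[i][j]
    return C
```

Counting the calls to the tile primitive, rather than scalar operations, is what distinguishes cost analysis in such a model from the standard RAM accounting.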
Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices
A task-based formulation of Scalable Universal Matrix Multiplication
Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is
applied to the multiplication of hierarchy-free, rank-structured matrices that
appear in the domain of quantum chemistry (QC). The novel features of our
formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and
(2) fine-grained task-based composition. These features make it tolerant of the
load imbalance due to the irregular matrix structure and eliminate all
artifactual sources of global synchronization. Scalability of iterative
computation of square-root inverse of block-rank-sparse QC matrices is
demonstrated; for full-rank (dense) matrices the performance of our SUMMA
formulation usually exceeds that of the state-of-the-art dense MM
implementations (ScaLAPACK and Cyclops Tensor Framework).
Comment: 8 pages, 6 figures, accepted to IA3 2015. arXiv admin note: text overlap with arXiv:1504.0504
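SUMMA's iteration structure, which the abstract's task-based formulation builds on, can be sketched sequentially: each step broadcasts a column panel of A and a row panel of B and applies a local rank-nb (outer-product) update. The sketch below elides all communication and process decomposition, so it shows only the loop structure, not the paper's concurrent, task-based scheme:

```python
def summa_sequential(A, B, nb):
    # One SUMMA step per panel: step K would broadcast A[:, K:K+nb] and
    # B[K:K+nb, :], after which each process accumulates a local
    # rank-nb update. Here all updates are applied in place, in order.
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for K in range(0, n, nb):
        hi = min(K + nb, n)
        for i in range(n):
            for j in range(n):
                C[i][j] += sum(A[i][k] * B[k][j] for k in range(K, hi))
    return C
```

Because each panel step is independent apart from the shared accumulation into C, multiple such iterations can be scheduled concurrently, which is the overlap the task-based formulation exploits.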