Parallel Sparse Matrix Solver on the GPU Applied to Simulation of Electrical Machines

Author: Dekeyser Jean-Luc
Guyomarch Frédéric
Menach Yvonnick Le
Rodrigues Antonio Wendell De Oliveira
Publication date: 22/11/2009

Nowadays, several industrial applications are being ported to parallel architectures, because these platforms make it possible to obtain higher performance for system modelling and simulation. In the electric machines area, there are many problems whose solution needs to be accelerated. This paper examines the parallelization of a sparse matrix solver on graphics processors. More specifically, we implement the conjugate gradient technique with the input matrix stored in CSR, Symmetric CSR, and CSC formats. This method is one of the most efficient iterative methods available for solving the systems that arise from finite-element formulations of Maxwell's equations. The GPU (Graphics Processing Unit), which is used for the implementation, provides mechanisms to parallelize the algorithm. Thus, it significantly increases computation speed relative to serial code on CPU-based systems.
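
For readers unfamiliar with the technique named in this abstract, the following is a minimal serial sketch of a conjugate gradient solver whose matrix is stored in CSR format. It is not the authors' GPU implementation: the function names (`csr_matvec`, `dot`, `conjugate_gradient`), the tolerance, and the small test matrix are all assumptions chosen for illustration, and the code assumes a symmetric positive-definite matrix.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        s = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y


def dot(u, v):
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite CSR matrix A."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Illustrative 3x3 SPD matrix [[4, 1, 0], [1, 3, 1], [0, 1, 2]] in CSR form.
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [1.0, 2.0, 3.0]
x = conjugate_gradient(values, col_idx, row_ptr, b)
```

The sparse matrix-vector product in `csr_matvec` dominates each iteration. Its outer loop over rows has no dependencies between iterations, so it is the natural candidate for parallelization on a GPU.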

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Implementing the conjugate gradient algorithm on multi-core systems

Author: Bakker V.
Kokkeler A.B.J.
Smit G.J.M.
Wiggers W.A.
Publication venue: IEEE
Publication date: 01/01/2007

In linear solvers, like the conjugate gradient algorithm, sparse-matrix vector multiplication is an important kernel. Due to the sparseness of the matrices, the solver runs relatively slowly. For digital optical tomography (DOT), a large set of linear equations has to be solved, which currently takes on the order of hours on desktop computers. Our goal was to speed up the conjugate gradient solver. In this paper we present the results of applying multiple optimization techniques and exploiting multi-core solutions offered by two recently introduced architectures: Intel’s Woodcrest general purpose processor and NVIDIA’s G80 graphical processing unit. Using these techniques for these architectures, a speedup by a factor of three has been achieved.

CiteSeerX

University of Twente Research Information

Research in computer science

Author: Ortega J. M.

Synopses are given for NASA supported work in computer science at the University of Virginia. Some areas of research include: error seeding as a testing method; knowledge representation for engineering design; analysis of faults in a multi-version software experiment; implementation of a parallel programming environment; two computer graphics systems for visualization of pressure distribution and convective density particles; task decomposition for multiple robot arms; vectorized incomplete conjugate gradient; and iterative methods for solving linear equations on the Flex/32

NASA Technical Reports Server

Hypercube matrix computation task

Author: Calalo Ruel H.
Imbriale William A.
Jacobi Nathan
Liewer Paulett C.
Lockhart Thomas G.
Lyons James R.
Lyzenga Gregory A.
Manshadi Farzin
Patterson Jean E.

A major objective of the Hypercube Matrix Computation effort at the Jet Propulsion Laboratory (JPL) is to investigate the applicability of a parallel computing architecture to the solution of large-scale electromagnetic scattering problems. Three scattering analysis codes are being implemented and assessed on a JPL/California Institute of Technology (Caltech) Mark 3 Hypercube. The codes, which utilize different underlying algorithms, give a means of evaluating the general applicability of this parallel architecture. The three analysis codes being implemented are a frequency domain method of moments code, a time domain finite difference code, and a frequency domain finite elements code. These analysis capabilities are being integrated into an electromagnetics interactive analysis workstation which can serve as a design tool for the construction of antennas and other radiating or scattering structures. The first two years of work on the Hypercube Matrix Computation effort are summarized. The summary includes both new developments and results as well as work previously reported in the Hypercube Matrix Computation Task: Final Report for 1986 to 1987 (JPL Publication 87-18).

NASA Technical Reports Server

A Modeling Approach based on UML/MARTE for GPU Architecture

Author: Dekeyser Jean-Luc
Guyomarc'h Frédéric
Rodrigues Antonio Wendell De Oliveira
Publication date: 10/05/2011

Nowadays, High Performance Computing is part of the embedded systems context. Graphics Processing Units (GPUs) are increasingly used to accelerate a large share of algorithms and applications. Over the past years, few efforts have been made to describe abstractions of applications in relation to their target architectures. Thus, when developers need to map applications onto GPUs, for example, they encounter difficulties and prefer to use the APIs of these architectures. This paper presents a metamodel extension for the MARTE profile and a model for GPU architectures. The main goal is to specify the task and data allocation in the memory hierarchy of these architectures. The results show that this approach will help to generate code for GPUs based on model transformations using Model Driven Engineering (MDE). Comment: Symposium en Architectures nouvelles de machines (SympA'14) (2011)

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

Distributed Finite Element Analysis Using a Transputer Network

Author: Baehmann Peggy
Danial Albert
Favenesi James
Reynolds Brian
Shephard Mark
Tombrello Joseph
Turrentine Ronald
Watson James
Yang Dabby

The principal objective of this research effort was to demonstrate the extraordinarily cost effective acceleration of finite element structural analysis problems using a transputer-based parallel processing network. This objective was accomplished in the form of a commercially viable parallel processing workstation. The workstation is a desktop size, low-maintenance computing unit capable of supercomputer performance yet costs two orders of magnitude less. To achieve the principal research objective, a transputer based structural analysis workstation termed XPFEM was implemented with linear static structural analysis capabilities resembling commercially available NASTRAN. Finite element model files, generated using the on-line preprocessing module or external preprocessing packages, are downloaded to a network of 32 transputers for accelerated solution. The system currently executes at about one third Cray X-MP24 speed but additional acceleration appears likely. For the NASA selected demonstration problem of a Space Shuttle main engine turbine blade model with about 1500 nodes and 4500 independent degrees of freedom, the Cray X-MP24 required 23.9 seconds to obtain a solution while the transputer network, operated from an IBM PC-AT compatible host computer, required 71.7 seconds. Consequently, the $80,000 transputer network demonstrated a cost-performance ratio about 60 times better than the $15,000,000 Cray X-MP24 system.
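
The factor of about 60 can be checked from the figures quoted in this abstract. The short calculation below is a sketch that assumes 80,000 and 15,000,000 are the system costs in US dollars and that performance is taken as the inverse of the quoted solution time; the symbols C and T are introduced here for illustration only.

```latex
\[
\frac{C_{\text{Cray}}}{C_{\text{transputer}}} = \frac{15{,}000{,}000}{80{,}000} = 187.5,
\qquad
\frac{T_{\text{transputer}}}{T_{\text{Cray}}} = \frac{71.7}{23.9} = 3.0,
\]
\[
\frac{187.5}{3.0} = 62.5 \approx 60 .
\]
```

The transputer network is about three times slower but roughly 190 times cheaper, which gives the cost-performance advantage of about 60 stated in the abstract.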

NASA Technical Reports Server

Optical tomography using the SCIRun problem solving environment: Preliminary results for three-dimensional geometries and parallel processing

Author: Arridge SR
Johnson CR
Schweiger M
Zhukov L
Publication venue: OPTICAL SOC AMER
Publication date: 12/04/1999

We present a 3D implementation of the UCL imaging package for absorption and scatter reconstruction from time-resolved data (TOAST), embedded in the SCIRun interactive simulation and visualization package developed at the University of Utah. SCIRun is a scientific programming environment that allows the interactive construction, debugging, and steering of large-scale scientific computations. While the capabilities of SCIRun's interactive approach are not yet fully exploited in the current TOAST implementation, an immediate benefit of the combined TOAST/SCIRun package is the availability of optimized parallel finite element forward solvers, and the use of SCIRun's existing 3D visualisation tools. A reconstruction of a segmented 3D head model is used as an example for demonstrating the capability of TOAST/SCIRun of simulating anatomically shaped meshes

UCL Discovery