COSMIC/NASTRAN on the Cray Computer Systems
COSMIC/NASTRAN was converted to the CRAY computer systems. The CRAY version is currently available and provides users with access to all of the machine-independent source code of COSMIC/NASTRAN. Future releases of COSMIC/NASTRAN will be made available on the CRAY soon after they are released by COSMIC.
A parallel nearly implicit time-stepping scheme
Across-the-space parallelism remains the most mature, convenient, and natural way to parallelize large-scale problems. A major obstacle is that implicit time stepping is often difficult to parallelize because of the structure of the resulting system. Approximate implicit schemes have been suggested to circumvent this problem; they have attractive stability properties and parallelize very well. The purpose of this article is to give an overall assessment of the parallelism of the method.
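The abstract does not spell out the scheme, so the following is only a minimal sketch of one generic "nearly implicit" idea: replace the exact backward-Euler solve for the 1D heat equation with a fixed number of Jacobi sweeps, each of which updates all grid points independently and thus parallelizes across space. The grid size, sweep count, and boundary conditions are illustrative assumptions, not the authors' method.

```c
/* Minimal sketch (not the authors' scheme): a "nearly implicit" step for
   the 1D heat equation u_t = u_xx.  Backward Euler requires solving
   (1 + 2r) u_i - r (u_{i-1} + u_{i+1}) = u^n_i with r = dt/dx^2; instead
   of solving exactly, we apply a fixed number of Jacobi sweeps.  Each
   sweep updates every grid point independently, so it parallelizes
   across space. */
#include <string.h>

#define N      1000   /* interior grid points (illustrative) */
#define SWEEPS 3      /* fixed number of Jacobi iterations   */

void nearly_implicit_step(double u[N], double dt, double dx)
{
    double r = dt / (dx * dx);
    double unew[N], tmp[N];

    memcpy(unew, u, sizeof unew);          /* initial guess: u^n */
    for (int s = 0; s < SWEEPS; ++s) {
        memcpy(tmp, unew, sizeof tmp);
        #pragma omp parallel for           /* across-the-space parallelism */
        for (int i = 0; i < N; ++i) {
            double left  = (i > 0)     ? tmp[i - 1] : 0.0;  /* Dirichlet BC */
            double right = (i < N - 1) ? tmp[i + 1] : 0.0;
            unew[i] = (u[i] + r * (left + right)) / (1.0 + 2.0 * r);
        }
    }
    memcpy(u, unew, sizeof unew);
}
```

Because the sweep count is fixed, each time step costs a bounded, fully parallel amount of work, at the price of only approximately satisfying the implicit system.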
Experiences in porting mini-applications to OpenACC and OpenMP on heterogeneous systems
This article studies mini-applications—Minisweep, GenASiS, GPP, and FF—that use computational methods commonly encountered in HPC. We have ported these applications to develop OpenACC and OpenMP versions, and evaluated their performance on Titan (Cray XK7 with K20x GPUs), Cori (Cray XC40 with Intel KNL), Summit (IBM AC922 with Volta GPUs), and Cori-GPU (Cray CS-Storm 500NX with Intel Skylake and Volta GPUs). Our goals are threefold: to make these new ports useful to both application and compiler developers; to document the lessons learned and the methodology used to create optimized OpenMP and OpenACC versions; and to describe possible migration paths between the two specifications. Cases where specific directives or code patterns result in improved performance for a given architecture are highlighted. We also discuss the functionality and maturity of the latest compilers available on the above platforms with respect to their OpenACC and OpenMP implementations.
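To make the migration path concrete, here is an illustrative example (not taken from the mini-applications themselves) of the same vector update expressed once with an OpenACC directive and once with an OpenMP target-offload directive:

```c
/* Illustrative only: one loop, two directive dialects.  The function
   names are hypothetical; the directives and map/copy clauses are
   standard OpenACC 2.x and OpenMP 4.5+ syntax. */
void daxpy_acc(int n, double a, const double *restrict x, double *restrict y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] += a * x[i];
}

void daxpy_omp(int n, double a, const double *restrict x, double *restrict y)
{
    #pragma omp target teams distribute parallel for \
                map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

The loop body is untouched in both versions; porting effort concentrates in the directive and in the data-movement clauses, which is the pattern such migration studies typically exploit.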
Flow Analysis of Space Shuttle Feed Line 17-inch Disconnect Valve
A steady incompressible three-dimensional viscous flow analysis has been conducted for the Space Shuttle External Tank/Orbiter propellant feed line disconnect flapper valves with upstream elbows. The full Navier-Stokes code, INS3D, is modified to handle interior obstacles. Grids are generated by the SVTGD3D code. Two-dimensional initial grids in the flow cross section, with and without the flappers, are improved by elliptic smoothing to provide better orthogonality, clustering, and smoothness in the three-dimensional grid. The flow solver is tested for stability and convergence in the presence of interior flappers. An under-relaxation scheme has been incorporated to improve solution stability. Important flow characteristics such as secondary flows, recirculation, vortex and wake regions, and separated flows are observed. Computed values for forces, moments, and pressure drop are in satisfactory agreement with water flow test data covering a maximum tube Reynolds number of 3.5 × 10^6. The results will serve as a guide to improved design and enhanced testing of the disconnect.
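As a sketch of the under-relaxation idea mentioned above (a generic formulation, not the INS3D implementation): rather than accepting each newly computed field outright, the solver blends it with the previous iterate to damp oscillations, for example near interior obstacles.

```c
/* Generic under-relaxed update (assumed form, not the INS3D code):
   q <- q + omega * (q_new - q), with omega in (0, 1].
   omega = 1 recovers the unrelaxed update; smaller omega trades
   convergence speed for stability. */
void under_relax(double *q, const double *q_new, int n, double omega)
{
    for (int i = 0; i < n; ++i)
        q[i] += omega * (q_new[i] - q[i]);
}
```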
Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems
We evaluate optimized parallel sparse matrix-vector operations for several representative application areas on widespread multicore-based cluster configurations. First, the single-socket baseline performance is analyzed and modeled with respect to basic architectural properties of standard multicore chips. Beyond the single node, the performance of parallel sparse matrix-vector operations is often limited by communication overhead. Starting from the observation that nonblocking MPI is not able to hide communication cost using standard MPI implementations, we demonstrate that explicit overlap of communication and computation can be achieved by using a dedicated communication thread, which may run on a virtual core. Moreover, we identify performance benefits of hybrid MPI/OpenMP programming due to improved load balancing even without explicit communication overlap. We compare performance results for pure MPI, the widely used "vector-like" hybrid programming strategies, and explicit overlap on a modern multicore-based cluster and a Cray XE6 system.
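The dedicated-communication-thread pattern described in the abstract can be sketched as follows (an assumed structure, not the authors' code; the three extern helpers are hypothetical placeholders): one OpenMP thread drives the MPI halo exchange while the remaining threads compute the purely local part of y = A*x, and only the rows that reference remote x entries wait for the exchange to finish.

```c
/* Explicit communication/computation overlap via a dedicated thread.
   Requires MPI initialized with MPI_THREAD_FUNNELED or higher. */
#include <mpi.h>
#include <omp.h>

extern void spmv_local(int nthreads, int tid);    /* rows using local x only */
extern void spmv_nonlocal(int nthreads, int tid); /* rows needing remote x   */
extern void exchange_halo(void);                  /* blocking MPI halo swap  */

void spmv_overlapped(void)
{
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int nth = omp_get_num_threads();

        if (tid == 0) {
            exchange_halo();                 /* communication thread     */
        } else {
            spmv_local(nth - 1, tid - 1);    /* computation threads      */
        }
        #pragma omp barrier                  /* halo is now complete     */
        spmv_nonlocal(nth, tid);             /* all threads finish rows
                                                that need remote entries */
    }
}
```

The "virtual core" remark in the abstract corresponds to pinning the communication thread to an SMT sibling, so it steals little compute capacity from the worker threads.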
Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming
We evaluate optimized parallel sparse matrix-vector operations for two representative application areas on widespread multicore-based cluster configurations. First, the single-socket baseline performance is analyzed and modeled with respect to basic architectural properties of standard multicore chips. Going beyond the single node, parallel sparse matrix-vector operations often suffer from an unfavorable communication-to-computation ratio. Starting from the observation that nonblocking MPI is not able to hide communication cost using standard MPI implementations, we demonstrate that explicit overlap of communication and computation can be achieved by using a dedicated communication thread, which may run on a virtual core. We compare our approach to pure MPI and the widely used "vector-like" hybrid programming strategy.
Parallel computing for the finite element method
A finite element method is presented to compute time-harmonic microwave fields in three-dimensional configurations. Nodal-based finite elements have been coupled with an absorbing boundary condition to solve open-boundary problems. This paper describes how the modeling of large devices has been made possible using parallel computation. New algorithms are then proposed to implement this formulation on a cluster of workstations (10 DEC ALPHA 300X) and on a CRAY C98. Analysis of the computational efficiency is performed using simple problems. The electromagnetic scattering of a plane wave by a perfectly electrically conducting airplane is finally given as an example.
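The abstract does not detail the parallel algorithms, so the following is only a generic sketch of one naturally parallel FEM stage, assembly of the global system, parallelized across elements; elem_matrix(), the connectivity array, and the dense global matrix are illustrative assumptions, not the paper's data structures.

```c
/* Element-by-element FEM assembly, parallelized across elements.
   Each element's local matrix is computed independently; scattering
   into the shared global matrix uses atomic updates to avoid write
   conflicts between elements that share nodes. */
#define NODES_PER_ELEM 4   /* linear tetrahedra, for illustration */

extern void elem_matrix(int e, double ke[NODES_PER_ELEM][NODES_PER_ELEM]);
extern int  conn[][NODES_PER_ELEM];   /* element-to-node connectivity */

void assemble(int nelem, int nnode, double *K /* nnode x nnode, dense */)
{
    #pragma omp parallel for
    for (int e = 0; e < nelem; ++e) {
        double ke[NODES_PER_ELEM][NODES_PER_ELEM];
        elem_matrix(e, ke);                     /* local element matrix */
        for (int a = 0; a < NODES_PER_ELEM; ++a)
            for (int b = 0; b < NODES_PER_ELEM; ++b) {
                int i = conn[e][a], j = conn[e][b];
                #pragma omp atomic
                K[i * nnode + j] += ke[a][b];   /* scatter-add to global K */
            }
    }
}
```

A production code would assemble into a sparse structure and, on a distributed machine such as the workstation cluster mentioned above, partition elements across processes instead of threads, but the independence of per-element work is the same in both settings.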