89,962 research outputs found

Integral formulation of the measured equation of invariance

Author: Rius Casals Juan Manuel
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/1996

A novel integral formulation of the measured equation of invariance is derived from the reciprocity theorem. This formulation leads to a sparse matrix equation for the induced surface current, resulting in large savings in CPU time and memory over conventional approaches. The algorithm has been implemented for two-dimensional perfectly conducting scatterers. Peer Reviewed. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Integral equation MEI applied to three-dimensional arbitrary surfaces

Author: Mosig J R
Parrón Granados Josep
Rius Casals Juan Manuel
Úbeda Farré Eduard
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/1997

The authors present a new formulation of the integral equation of the measured equation of invariance (MEI) as a confined field integral equation discretised by the method of moments. The use of numerically derived testing functions yields an approximately sparse linear system whose storage requirements, and the CPU time for computing the matrix coefficients, are proportional to the number of unknowns. Peer Reviewed. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

A Second-Order Distributed Trotter-Suzuki Solver with a Hybrid Kernel

Author: Cucchietti
Dagum
De Raedt
De Raedt
Fernando M. Cucchietti
Lewenstein
Peter Wittek
Poulin
Suzuki
Suzuki
Suzuki
Trotter
Publication venue: 'Elsevier BV'
Publication date: 12/08/2012

The Trotter-Suzuki approximation leads to an efficient algorithm for solving the time-dependent Schrödinger equation. Using existing highly optimized CPU and GPU kernels, we developed a distributed version of the algorithm that runs efficiently on a cluster. Our implementation also improves single-node performance, and is able to use multiple GPUs within a node. The scaling is close to linear using the CPU kernels, whereas the efficiency of the GPU kernels improves with larger matrices. We also introduce a hybrid kernel that simultaneously uses multicore CPUs and GPUs in a distributed system. This kernel is shown to be efficient when the matrix would not fit in the GPU memory. Larger quantum systems scale especially well with a high number of nodes. The code is available under an open source license. Comment: 11 pages, 10 figures
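The second-order splitting named in the abstract can be illustrated on a small one-dimensional tight-binding lattice. The sketch below is not the paper's kernel. The lattice model, the function names (`apply_bonds`, `trotter_step`, `evolve`), the hopping parameter `hop`, and units with ħ = 1 are all assumptions made for illustration. The hopping term is split into even and odd bonds, each of which is a set of independent 2x2 blocks that can be exponentiated exactly, and the pieces are composed symmetrically so that the step is second-order accurate in `dt`.

```python
import cmath
import math


def apply_bonds(psi, start, angle):
    """Apply exp(+i*angle*sigma_x) to the disjoint pairs (j, j+1), j = start, start+2, ..."""
    c, s = math.cos(angle), math.sin(angle)
    for j in range(start, len(psi) - 1, 2):
        a, b = psi[j], psi[j + 1]
        psi[j] = c * a + 1j * s * b
        psi[j + 1] = 1j * s * a + c * b


def apply_potential(psi, v, dt):
    """Apply the diagonal factor exp(-i*v[j]*dt) to every site."""
    for j in range(len(psi)):
        psi[j] *= cmath.exp(-1j * v[j] * dt)


def trotter_step(psi, v, hop, dt):
    """One symmetric (second-order) step for H = -hop * (nearest-neighbour hopping) + v."""
    apply_potential(psi, v, dt / 2)
    apply_bonds(psi, 0, hop * dt / 2)   # even bonds, half step
    apply_bonds(psi, 1, hop * dt)       # odd bonds, full step
    apply_bonds(psi, 0, hop * dt / 2)   # even bonds, half step
    apply_potential(psi, v, dt / 2)


def evolve(psi0, v, hop, dt, steps):
    """Return a copy of psi0 advanced by `steps` Trotter-Suzuki steps."""
    psi = [complex(z) for z in psi0]
    for _ in range(steps):
        trotter_step(psi, v, hop, dt)
    return psi


def norm2(psi):
    """Squared norm of the state; every factor above is unitary, so it is conserved."""
    return sum(abs(z) ** 2 for z in psi)
```

Because each factor is unitary, the norm is conserved up to rounding regardless of `dt`; the splitting error only affects the phase and shape of the wave function. The pairwise updates within one bond set are independent of each other, which is the property that makes this kind of kernel amenable to parallel execution.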

arXiv.org e-Print Archive

CiteSeerX

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

University of Borås

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes

Author: Bader M
Brito Gadeschi G
Weinzierl T
Wille M
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2023

We identify and show how to overcome an OpenMP bottleneck in the administration of GPU memory. It arises for a wave equation solver on dynamically adaptive block-structured Cartesian meshes, which keeps all CPU threads busy and allows all of them to offload sets of patches to the GPU. Our studies show that multithreaded, concurrent, non-deterministic access to the GPU leads to performance breakdowns, since the GPU memory bookkeeping as offered through OpenMP’s map clause, i.e., the allocation and freeing, becomes another runtime challenge besides expensive data transfer and actual computation. We therefore propose to retain the memory management responsibility on the host: a caching mechanism acquires memory on the accelerator for all CPU threads, keeps hold of this memory and hands it out to the offloading threads upon demand. We show that this user-managed, CPU-based memory administration helps us to overcome the GPU memory bookkeeping bottleneck and speeds up the time-to-solution of Finite Volume kernels by more than an order of magnitude.
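The caching idea described in the abstract can be sketched as a host-side pool that allocates accelerator buffers once and reuses them. This is only an illustration of the pattern, not the authors' implementation: the class `DeviceBufferCache`, its methods, and the stand-in allocator `fake_device_alloc` (which returns host memory instead of calling a real device allocator) are hypothetical names introduced here.

```python
import threading
from collections import defaultdict


class DeviceBufferCache:
    """Host-managed cache of accelerator buffers, keyed by size in bytes."""

    def __init__(self, allocate):
        self._allocate = allocate        # stand-in for a real device allocator
        self._idle = defaultdict(list)   # size -> buffers that are currently unused
        self._lock = threading.Lock()    # many CPU threads may offload concurrently
        self.device_allocations = 0      # how often the slow allocation path was taken

    def acquire(self, size):
        """Hand out a cached buffer if one is idle, otherwise allocate a new one."""
        with self._lock:
            if self._idle[size]:
                return self._idle[size].pop()
            self.device_allocations += 1
        return self._allocate(size)

    def release(self, size, buffer):
        """Return a buffer to the cache instead of freeing it on the device."""
        with self._lock:
            self._idle[size].append(buffer)


def fake_device_alloc(size):
    """Hypothetical placeholder: a real code would allocate memory on the GPU here."""
    return bytearray(size)


cache = DeviceBufferCache(fake_device_alloc)
first = cache.acquire(1024)
cache.release(1024, first)
second = cache.acquire(1024)   # served from the cache, no new device allocation
```

The point of the design is that the expensive allocate and free calls leave the hot path: after a warm-up phase, offloading threads only take a short host-side lock to obtain a buffer.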

Durham Research Online

Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes

Author: Baboulin M.
Bader M
Bhatele A.
Brito Gadeschi G
Hammond J.
Kruse C.
Weinzierl T
Wille M
Publication venue: Springer Verlag
Publication date: 10/05/2023

Durham Research Online

An efficient and robust algorithm for two dimensional time dependent incompressible Navier-Stokes equations: High Reynolds number flows

Author: Goodrich John W.

An algorithm is presented for unsteady two-dimensional incompressible Navier-Stokes calculations. This algorithm is based on the fourth-order partial differential equation for incompressible fluid flow which uses the streamfunction as the only dependent variable. The algorithm is second-order accurate in both time and space. It uses a multigrid solver at each time step. It is extremely efficient with respect to the use of both CPU time and physical memory. It is extremely robust with respect to Reynolds number.
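For orientation, the standard streamfunction form of the two-dimensional incompressible Navier-Stokes equations is given below. It is presumably the fourth-order equation the abstract refers to, but the nondimensionalisation by the Reynolds number $Re$ and the sign convention $u = \partial\psi/\partial y$, $v = -\partial\psi/\partial x$ are assumptions, since the abstract does not state them.

```latex
\[
\frac{\partial}{\partial t}\nabla^{2}\psi
+ \frac{\partial \psi}{\partial y}\,\frac{\partial}{\partial x}\nabla^{2}\psi
- \frac{\partial \psi}{\partial x}\,\frac{\partial}{\partial y}\nabla^{2}\psi
= \frac{1}{Re}\,\nabla^{4}\psi ,
\qquad
u = \frac{\partial \psi}{\partial y}, \quad
v = -\frac{\partial \psi}{\partial x}.
\]
```

With this convention the vorticity is $\omega = -\nabla^{2}\psi$, so the equation is the vorticity transport equation written in terms of $\psi$ alone; continuity is satisfied identically and no pressure variable is needed.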

NASA Technical Reports Server

Two-dimensional Euler and Navier-Stokes Time accurate simulations of fan rotor flows

Author: Boretti A. A.

Two numerical methods are presented which describe the unsteady flow field in the blade-to-blade plane of an axial fan rotor. These methods solve the compressible, time-dependent, Euler and the compressible, turbulent, time-dependent, Navier-Stokes conservation equations for mass, momentum, and energy. The Navier-Stokes equations are written in Favre-averaged form and are closed with an approximate two-equation turbulence model with low Reynolds number and compressibility effects included. The unsteady aerodynamic component is obtained by superposing inflow or outflow unsteadiness on the steady conditions through time-dependent boundary conditions. The integration in space is performed by using a finite volume scheme, and the integration in time is performed by using k-stage Runge-Kutta schemes, k = 2,5. The numerical integration algorithm reduces the computational cost of an unsteady simulation involving high frequency disturbances in both CPU time and memory requirements. Less than 200 sec of CPU time are required to advance the Euler equations on a computational grid of about 2000 grid points for 10,000 time steps on a CRAY Y-MP computer, with a required memory of less than 0.3 megawords.
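A low-storage k-stage Runge-Kutta update of the kind commonly paired with finite volume schemes can be sketched as follows. The abstract does not give the stage coefficients, so the values in `TWO_STAGE` and `FIVE_STAGE` are assumptions (the five-stage set is the one often attributed to Jameson), and `multistage_step` and `residual` are hypothetical names. The semi-discrete system is taken to be du/dt = residual(u).

```python
def multistage_step(u, dt, residual, alphas):
    """Advance u by one time step with a low-storage multistage Runge-Kutta scheme.

    Each stage restarts from the old solution u:
        u_stage = u + alpha * dt * residual(previous u_stage)
    so only u and the current stage need to be stored.
    """
    u_stage = list(u)
    for alpha in alphas:
        r = residual(u_stage)
        u_stage = [u_n + alpha * dt * r_i for u_n, r_i in zip(u, r)]
    return u_stage


# Assumed stage coefficients; the last one must be 1 for consistency.
TWO_STAGE = (0.5, 1.0)
FIVE_STAGE = (0.25, 1.0 / 6.0, 0.375, 0.5, 1.0)


def decay(u):
    """Toy residual for du/dt = -u, standing in for a finite volume flux balance."""
    return [-x for x in u]


u_new = multistage_step([1.0], 0.1, decay, TWO_STAGE)
```

For the toy problem, the two-stage scheme reproduces the Taylor expansion of the exact decay through second order in `dt`. In a flow solver, `residual` would evaluate the net flux into each cell divided by the cell volume.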

NASA Technical Reports Server

Numerical solution of a three-dimensional cubic cavity flow by using the Boltzmann equation

Author: Hwang Danny P.

A three-dimensional cubic cavity flow has been analyzed for diatomic gases by using the Boltzmann equation with the Bhatnagar-Gross-Krook (B-G-K) model. The method of discrete ordinates was applied, and the diffuse reflection boundary condition was assumed. The results, which show a consistent trend toward the Navier-Stokes solution as the Knudsen number is reduced, give us confidence to apply the method to a three-dimensional geometry for practical predictions of rarefied-flow characteristics. The CPU time and the main memory required for a three-dimensional geometry using this method seem reasonable.
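As a reminder of what the B-G-K model replaces, the textbook form of the model equation is shown below. The symbols are assumptions, since the abstract defines none: $f(\mathbf{x}, \boldsymbol{\xi}, t)$ is the distribution function, $\boldsymbol{\xi}$ the molecular velocity, $\nu$ the collision frequency, and $f_{\mathrm{eq}}$ the local equilibrium distribution. The treatment of internal degrees of freedom needed for diatomic gases is not shown.

```latex
\[
\frac{\partial f}{\partial t}
+ \boldsymbol{\xi}\cdot\nabla_{\mathbf{x}} f
= \nu\,\bigl(f_{\mathrm{eq}} - f\bigr)
\]
```

In a discrete ordinates method, $\boldsymbol{\xi}$ is restricted to a finite set of velocities, which turns this equation into a coupled system of transport equations in $\mathbf{x}$ and $t$, one per discrete velocity.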

NASA Technical Reports Server