Search CORE

Surface Reconstruction from Scattered Point via RBF Interpolation on GPU

Author: Cuomo Salvatore
Galletti Ardelio
Giunta Giulio
Starace Alfredo
Publication date: 01/01/2013

In this paper we describe a parallel implicit method based on radial basis functions (RBF) for surface reconstruction. The applicability of RBF methods is hindered by their computational demand, since they require the solution of linear systems of size equal to the number of data points. Our reconstruction implementation relies on parallel scientific libraries and targets massively multi-core architectures, namely Graphics Processing Units (GPUs). The performance of the proposed method, in terms of reconstruction accuracy and computing time, shows that the RBF interpolant can be very effective for such problems. Comment: arXiv admin note: text overlap with arXiv:0909.5413 by other authors
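
The dense system mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' GPU implementation: the Gaussian kernel, the shape parameter `eps`, the jittered 7 x 7 point set and the function names are all assumptions chosen to keep the example small and well conditioned.

```python
import numpy as np

def gaussian_kernel(r, eps):
    """Gaussian RBF phi(r) = exp(-(eps * r)^2); the kernel choice is an assumption."""
    return np.exp(-(eps * r) ** 2)

def rbf_fit(centers, values, eps=3.0):
    """Solve the dense N x N system A w = f for the interpolation weights."""
    # Pairwise Euclidean distances between all centers, shape (N, N).
    diff = centers[:, None, :] - centers[None, :, :]
    A = gaussian_kernel(np.linalg.norm(diff, axis=-1), eps)
    return np.linalg.solve(A, values)

def rbf_eval(points, centers, weights, eps=3.0):
    """Evaluate the interpolant s(x) = sum_j w_j phi(|x - x_j|) at the given points."""
    diff = points[:, None, :] - centers[None, :, :]
    return gaussian_kernel(np.linalg.norm(diff, axis=-1), eps) @ weights

# Hypothetical data: a slightly jittered 7 x 7 grid standing in for scattered points.
rng = np.random.default_rng(0)
g = np.linspace(-1.0, 1.0, 7)
centers = np.array([(x, y) for x in g for y in g])
centers += rng.uniform(-0.05, 0.05, size=centers.shape)
values = np.sin(centers[:, 0]) * np.cos(centers[:, 1])

weights = rbf_fit(centers, values)
# The interpolant should reproduce the data at the centers up to rounding.
residual = np.max(np.abs(rbf_eval(centers, centers, weights) - values))
```

The matrix `A` has one row and one column per data point, which is why the cost of the solve grows quickly with the size of the point cloud.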

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli "Parthenope"

Archivio della ricerca - Università degli studi di Napoli Federico II

Optimal, scalable forward models for computing gravity anomalies

Author: Amestoy
Asgharzadeh
Boroomand
Briggs
Cai
Cruz
Dave A. May
Farquharson
Hughes
Johnson
Karypis
Knepley
Li
Li
Matthew G. Knepley
Saad
Smith
Trottenbert
Tufo
Wesseling
Yokota
Zienkiewicz
Publication venue: 'Wiley'
Publication date: 29/07/2011

We describe three approaches for computing a gravity signal from a density anomaly. The first approach consists of the classical "summation" technique, whilst the remaining two methods solve the Poisson problem for the gravitational potential using either a Finite Element (FE) discretization employing a multilevel preconditioner, or a Green's function evaluated with the Fast Multipole Method (FMM). The methods utilizing the PDE formulation described here differ from previously published approaches used in gravity modeling in that they are optimal, implying that both the memory and computational time required scale linearly with respect to the number of unknowns in the potential field. Additionally, all of the implementations presented here are developed such that the computations can be performed in a massively parallel, distributed memory computing environment. Through numerical experiments, we compare the methods on the basis of their discretization error, CPU time and parallel scalability. We demonstrate the parallel scalability of all these techniques by running forward models with up to $10^8$ voxels on 1000s of cores. Comment: 38 pages, 13 figures; accepted by Geophysical Journal International
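
As a rough illustration of the classical "summation" approach named in the abstract, the sketch below adds up the vertical attraction of every cell at every observation point. Treating each voxel as a point mass at its centre, the z-up sign convention and the function name `gravity_z` are assumptions made for brevity and are not taken from the paper.

```python
import numpy as np

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2

def gravity_z(obs, cell_centers, cell_volumes, density):
    """Vertical component of gravitational attraction by direct summation.

    obs:          (P, 3) observation points
    cell_centers: (M, 3) cell centres
    cell_volumes: (M,)   cell volumes
    density:      (M,)   density anomaly per cell
    Each cell is approximated by a point mass at its centre (an assumption).
    """
    mass = density * cell_volumes                    # (M,)
    d = cell_centers[None, :, :] - obs[:, None, :]   # (P, M, 3), observation -> cell
    r = np.linalg.norm(d, axis=-1)                   # (P, M)
    # z-component of G * m * d / r^3, summed over all cells.
    return G * np.sum(mass * d[..., 2] / r**3, axis=1)

# Hypothetical check: one 1 m^3 cell of 1000 kg/m^3 located 100 m below the origin.
obs = np.array([[0.0, 0.0, 0.0]])
gz = gravity_z(obs,
               np.array([[0.0, 0.0, -100.0]]),
               np.array([1.0]),
               np.array([1000.0]))
```

Every observation point visits every cell, so the work grows with the product of the two counts, in contrast to the linear scaling the abstract reports for the PDE-based methods.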

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Optimal, scalable forward models for computing gravity anomalies

Author: Knepley Matthew G.
May Dave A.
Publication date: 02/08/2017

We describe three approaches for computing a gravity signal from a density anomaly. The first approach consists of the classical 'summation' technique, while the remaining two methods solve the Poisson problem for the gravitational potential using either a finite-element (FE) discretization employing a multilevel preconditioner, or a Green's function evaluated with the fast multipole method (FMM). The methods using the Poisson formulation described here differ from previously published approaches used in gravity modelling in that they are optimal, implying that both the memory and computational time required scale linearly with respect to the number of unknowns in the potential field. Additionally, all of the implementations presented here are developed such that the computations can be performed in a massively parallel, distributed-memory computing environment. Through numerical experiments, we compare the methods on the basis of their discretization error, CPU time and parallel scalability. We demonstrate the parallel scalability of all these techniques by running forward models with up to $10^8$ voxels on 1000s of cores.

RERO DOC Digital Library

Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

Author: Zaspel Peter
Publication date: 01/01/2017

In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations by many-core processors, we here aim at totally relying on that processor type. As main contribution, we introduce the necessary parallel algorithmic patterns allowing to map the full $\mathcal{H}$ matrix construction and the fast matrix-vector product to many-core hardware. Here, crucial ingredients are space filling curves, parallel tree traversal and batching of linear algebra operations. The resulting model GPU implementation hmglib is the, to the best of the authors' knowledge, first entirely GPU-based Open Source $\mathcal{H}$ matrix library of this kind. We conclude this work by an in-depth performance analysis and a comparative performance study against a standard $\mathcal{H}$ matrix library, highlighting profound speedups of our many-core parallel approach.
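
One of the ingredients listed in the abstract, the space filling curve, can be sketched with a Z-order (Morton) index. The abstract does not say which curve hmglib uses, so the Morton ordering, the 2-D setting and the helper names `morton2d` and `z_order` are assumptions for illustration only.

```python
def morton2d(ix, iy, bits=16):
    """Interleave the bits of two non-negative integers into a Z-order index."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)       # x bits go to even positions
        code |= ((iy >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return code

def z_order(points, bits=16):
    """Return the indices of 2-D points in [0, 1)^2 sorted along the Z-order curve."""
    scale = 1 << bits
    keys = [
        morton2d(min(int(x * scale), scale - 1),
                 min(int(y * scale), scale - 1),
                 bits)
        for x, y in points
    ]
    return sorted(range(len(points)), key=keys.__getitem__)

# Hypothetical usage: one point per quadrant of the unit square.
order = z_order([(0.75, 0.75), (0.1, 0.1), (0.75, 0.1), (0.1, 0.75)])
```

Sorting points by such a key places spatially close points next to each other in memory, so contiguous index ranges can serve as the clusters of a tree without pointer-based data structures.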

arXiv.org e-Print Archive

edoc

Block preconditioners for linear systems arising from multiscale collocation with compactly supported RBFs

Author: Farrell Patricio
Pestana Jennifer
Publication venue: Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik
Publication date: 01/01/2014

Symmetric collocation methods with radial basis functions allow approximation of the solution of a partial differential equation, even if the right-hand side is only known at scattered data points, without needing to generate a grid. However, the benefit of a guaranteed symmetric positive definite block system comes at a high computational cost. This cost can be alleviated somewhat by considering compactly supported radial basis functions and a multiscale technique. But the condition number and sparsity will still deteriorate with the number of data points. Therefore, we study certain block diagonal and triangular preconditioners. We investigate ideal preconditioners and determine the spectra of the preconditioned matrices before proposing more practical preconditioners based on a restricted additive Schwarz method with coarse grid correction (ARASM). Numerical results verify the effectiveness of the preconditioners

Repositorium für Naturwissenschaften und Technik

PARALLEL MESHLESS RADIAL BASIS FUNCTION COLLOCATION METHOD FOR NEUTRON DIFFUSION PROBLEMS

Author: Tayfun Tanbay
Publication venue: Bursa Uludag University
Publication date: 01/04/2024

The meshless global radial basis function (RBF) collocation method is widely used to model physical phenomena in science and engineering. The method produces highly accurate solutions with an exponential convergence rate. However, due to the global approximation structure of the method, dense node distributions lead to long computation times and hinder the applicability of the technique. In order to overcome this issue, this study proposes a parallel meshless global RBF collocation algorithm. The algorithm is applied to 2-D neutron diffusion problems. The multiquadric is used as the RBF. The algorithm is developed with Mathematica and eight virtual processors are used in calculations on a multicore computer with four physical cores. The method provides accurate numerical results in a stable manner. Parallel speedup increases with the number of processors up to five and seven processors for external and fission source problems, respectively. The speedup values are limited by the constrained resource sharing of the multicore computer’s memory. On the other hand, significant time savings are achieved with parallel computation. For the four-group fission source problem, when 4316 interpolation nodes are employed, the utilization of seven processors instead of sequential computation decreases the computation time of the meshless approach by 716 s
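
To make the global collocation idea concrete, here is a serial 1-D sketch with the multiquadric. It is not the paper's parallel 2-D Mathematica algorithm: the model problem $-D u'' + \Sigma_a u = S$ on $[0, 1]$ with zero boundary values, the manufactured solution $\sin(\pi x)$, the shape parameter `c` and all function names are assumptions.

```python
import numpy as np

def mq(r, c):
    """Multiquadric phi(r) = sqrt(r^2 + c^2)."""
    return np.sqrt(r**2 + c**2)

def mq_dxx(r, c):
    """Second derivative of the multiquadric with respect to x in 1-D."""
    return c**2 / (r**2 + c**2) ** 1.5

def solve_diffusion_1d(n=21, c=0.2, D=1.0, sigma_a=1.0):
    """Collocate -D u'' + sigma_a u = S on [0, 1] with u(0) = u(1) = 0."""
    x = np.linspace(0.0, 1.0, n)
    r = x[:, None] - x[None, :]             # signed node differences, shape (n, n)
    # Interior rows apply the differential operator to every basis function.
    A = -D * mq_dxx(r, c) + sigma_a * mq(r, c)
    # First and last rows enforce the Dirichlet boundary conditions.
    A[0, :] = mq(r[0, :], c)
    A[-1, :] = mq(r[-1, :], c)
    # Source chosen so that the exact solution is sin(pi * x).
    rhs = (D * np.pi**2 + sigma_a) * np.sin(np.pi * x)
    rhs[0] = rhs[-1] = 0.0
    coeffs = np.linalg.solve(A, rhs)        # dense global system
    return x, mq(r, c) @ coeffs

x, u = solve_diffusion_1d()
max_error = np.max(np.abs(u - np.sin(np.pi * x)))
```

Because every basis function is nonzero at every node, the matrix is full, which is the source of the long computation times for dense node distributions that the abstract describes.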

Directory of Open Access Journals

FMM-based vortex method for simulation of isotropic turbulence on GPUs, compared with a spectral method

Author: Barba
Beale
Board
Cheng
Christiansen
Cottet
Ishihara
L.A. Barba
Leonard
Ould-Salihi
Rio Yokota
Rossi
Schumacher
Van Rees
Yokota
Yokota
Yokota
Yokota
Publication venue: 'Elsevier BV'
Publication date: 20/08/2012

The Lagrangian vortex method offers an alternative numerical approach for direct numerical simulation of turbulence. The fact that it uses the fast multipole method (FMM)--a hierarchical algorithm for N-body problems with highly scalable parallel implementations--as numerical engine makes it a potentially good candidate for exascale systems. However, there have been few validation studies of Lagrangian vortex simulations and the insufficient comparisons against standard DNS codes have left ample room for skepticism. This paper presents a comparison between a Lagrangian vortex method and a pseudo-spectral method for the simulation of decaying homogeneous isotropic turbulence. This flow field is chosen despite the fact that it is not the most favorable flow problem for particle methods (which shine in wake flows or where vorticity is compact), due to the fact that it is ideal for the quantitative validation of DNS codes. We use a $256^3$ grid with $Re_\lambda = 50$ and $100$ and look at the turbulence statistics, including high-order moments. The focus is on the effect of the various parameters in the vortex method, e.g., order of FMM series expansion, frequency of reinitialization, overlap ratio and time step. The vortex method uses an FMM code (exaFMM) that runs on GPU hardware using CUDA, while the spectral code (hit3d) runs on CPU only. Results indicate that, for this application (and with the current code implementations), the spectral method is an order of magnitude faster than the vortex method when using a single GPU for the FMM and six CPU cores for the FFT.

arXiv.org e-Print Archive

Crossref

Pole assignment control design for time–varying time–delay systems using radial basis functions

Author: Albrecht Olivia
Taylor C. James
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/03/2022

Systems with time-varying time delays present a particularly challenging control problem. They have been observed across a wide array of domains, from hydraulic actuators to insulin delivery control systems. Control systems that address system time-delays, nonlinearities and uncertainty are the subject of much research but, whilst the specific concept of varying time delays is sometimes acknowledged (for example in the control of hydraulic manipulators), this appears to be less widely investigated than some other types of nonlinearity. In part motivated by recent research into internal multi-model control, as similarly applied to systems with unknown time-varying delays, the present work utilises a Gaussian radial basis function to switch between two or more partial controllers. Each partial controller is based on a linear model with a (time-invariant) time delay. The new algorithm is developed and evaluated via simulation using a non-minimal state space (NMSS) framework, with pole assignment as the design criterion. Simulation results suggest that it yields improved performance in comparison to a simpler switching approach and the equivalent linear control system. However, laboratory examples and further research into robustness and stability are required in the next step.
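
One way to picture switching between partial controllers with a Gaussian radial basis function is as a soft blend of their outputs. The abstract does not give the switching law, so the normalised weights, the use of an estimated delay as the scheduling variable and the names `rbf_weights` and `blended_control` are all assumptions rather than the authors' NMSS design.

```python
import numpy as np

def rbf_weights(delay, centres, width):
    """Normalised Gaussian weights, one per partial controller.

    centres holds the (time-invariant) delay each partial controller was
    designed for; width sets how sharply the blend switches (both hypothetical).
    """
    w = np.exp(-((delay - centres) ** 2) / (2.0 * width**2))
    return w / np.sum(w)

def blended_control(delay, centres, width, partial_outputs):
    """Weighted sum of the partial controller outputs for the current delay."""
    return float(np.dot(rbf_weights(delay, centres, width), partial_outputs))

# Hypothetical usage: two partial controllers designed for delays of 1 and 3 samples.
centres = np.array([1.0, 3.0])
w_near_first = rbf_weights(1.0, centres, width=1.0)
u_mid = blended_control(2.0, centres, 1.0, np.array([10.0, 20.0]))
```

A small `width` makes the blend approach a hard switch to the nearest partial controller, while a large `width` averages the controllers more evenly.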

Lancaster E-Prints