
    Surface Reconstruction from Scattered Point via RBF Interpolation on GPU

    In this paper we describe a parallel implicit method based on radial basis functions (RBFs) for surface reconstruction. The applicability of RBF methods is hindered by their computational demand, which requires the solution of linear systems of size equal to the number of data points. Our reconstruction implementation relies on parallel scientific libraries and runs on massively multi-core architectures, namely graphics processing units (GPUs). The performance of the proposed method, in terms of reconstruction accuracy and computing time, shows that the RBF interpolant can be very effective for this problem. Comment: arXiv admin note: text overlap with arXiv:0909.5413 by other author
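The cost noted in the abstract above comes from solving one dense linear system with as many unknowns as there are data points. The following is a minimal serial sketch of that step, not the paper's GPU implementation. The Gaussian kernel, the shape parameter `eps`, and the helper names `fit_rbf`, `eval_rbf` and `solve_linear` are all assumptions made for illustration.

```python
import math

def gaussian_rbf(r, eps=1.0):
    """Gaussian radial basis function (an assumed kernel choice)."""
    return math.exp(-(eps * r) ** 2)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_rbf(points, values, eps=1.0):
    """Build the N x N kernel matrix and solve for the N RBF weights."""
    A = [[gaussian_rbf(math.dist(p, q), eps) for q in points] for p in points]
    return solve_linear(A, values)

def eval_rbf(points, weights, x, eps=1.0):
    """Evaluate the interpolant s(x) = sum_j w_j * phi(|x - x_j|)."""
    return sum(w * gaussian_rbf(math.dist(x, p), eps)
               for w, p in zip(weights, points))

# Hypothetical sample data: four scattered 2-D points with scalar values.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [0.0, 1.0, 1.0, 2.0]
weights = fit_rbf(points, values)
```

The dense solve scales cubically with the number of points, which is why the abstract turns to parallel libraries and GPUs for realistic data sizes.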

    Optimal, scalable forward models for computing gravity anomalies

    We describe three approaches for computing a gravity signal from a density anomaly. The first approach consists of the classical "summation" technique, whilst the remaining two methods solve the Poisson problem for the gravitational potential using either a Finite Element (FE) discretization employing a multilevel preconditioner, or a Green's function evaluated with the Fast Multipole Method (FMM). The methods utilizing the PDE formulation described here differ from previously published approaches used in gravity modeling in that they are optimal, implying that both the memory and computational time required scale linearly with respect to the number of unknowns in the potential field. Additionally, all of the implementations presented here are developed such that the computations can be performed in a massively parallel, distributed memory computing environment. Through numerical experiments, we compare the methods on the basis of their discretization error, CPU time and parallel scalability. We demonstrate the parallel scalability of all these techniques by running forward models with up to $10^8$ voxels on 1000s of cores. Comment: 38 pages, 13 figures; accepted by Geophysical Journal International
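As a rough illustration of the "summation" technique named in the abstract above, the sketch below adds up the vertical attraction of every voxel at one observation point. It treats each voxel as a point mass at its centre, which is a simplifying assumption and not necessarily the formula used in the paper. The function name `gravity_z` and the tuple layout are hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def gravity_z(obs, voxels):
    """Downward-positive vertical gravity at `obs` from a set of voxels.

    obs    : (x, y, z) observation point in metres, z pointing up
    voxels : iterable of (x, y, z, density, volume) with the voxel centre
             in metres, density in kg/m^3 and volume in m^3
    Each voxel is approximated by a point mass at its centre (assumption).
    """
    ox, oy, oz = obs
    gz = 0.0
    for x, y, z, rho, vol in voxels:
        dx, dy, dz = x - ox, y - oy, z - oz
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        # The attraction points from the observer to the mass; a mass
        # below the observer (dz < 0) gives a positive downward value.
        gz += G * rho * vol * (-dz) / r ** 3
    return gz

# Hypothetical example: one 1 m^3 voxel of 1000 kg/m^3, 100 m below.
g = gravity_z((0.0, 0.0, 0.0), [(0.0, 0.0, -100.0, 1000.0, 1.0)])
```

Every observation point visits every voxel, so the work grows with the product of the two counts. The abstract contrasts this with the FE and FMM approaches, which it describes as scaling linearly.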

    Optimal, scalable forward models for computing gravity anomalies

    We describe three approaches for computing a gravity signal from a density anomaly. The first approach consists of the classical 'summation' technique, while the remaining two methods solve the Poisson problem for the gravitational potential using either a finite-element (FE) discretization employing a multilevel pre-conditioner, or a Green's function evaluated with the fast multipole method (FMM). The methods using the Poisson formulation described here differ from previously published approaches used in gravity modelling in that they are optimal, implying that both the memory and computational time required scale linearly with respect to the number of unknowns in the potential field. Additionally, all of the implementations presented here are developed such that the computations can be performed in a massively parallel, distributed memory-computing environment. Through numerical experiments, we compare the methods on the basis of their discretization error, CPU time and parallel scalability. We demonstrate the parallel scalability of all these techniques by running forward models with up to $10^8$ voxels on 1000s of cores.

    Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

    In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations with many-core processors, we here aim to rely entirely on that processor type. As our main contribution, we introduce the parallel algorithmic patterns needed to map the full $\mathcal{H}$ matrix construction and the fast matrix-vector product to many-core hardware. Here, the crucial ingredients are space-filling curves, parallel tree traversal and batching of linear algebra operations. The resulting model GPU implementation, hmglib, is, to the best of the authors' knowledge, the first entirely GPU-based open-source $\mathcal{H}$ matrix library of this kind. We conclude this work with an in-depth performance analysis and a comparative performance study against a standard $\mathcal{H}$ matrix library, highlighting profound speedups of our many-core parallel approach.

    PARALLEL MESHLESS RADIAL BASIS FUNCTION COLLOCATION METHOD FOR NEUTRON DIFFUSION PROBLEMS

    The meshless global radial basis function (RBF) collocation method is widely used to model physical phenomena in science and engineering. The method produces highly accurate solutions with an exponential convergence rate. However, due to the global approximation structure of the method, dense node distributions lead to long computation times and hinder the applicability of the technique. In order to overcome this issue, this study proposes a parallel meshless global RBF collocation algorithm. The algorithm is applied to 2-D neutron diffusion problems. The multiquadric is used as the RBF. The algorithm is developed with Mathematica, and eight virtual processors are used in calculations on a multicore computer with four physical cores. The method provides accurate numerical results in a stable manner. Parallel speedup increases with the number of processors up to five and seven processors for external and fission source problems, respectively. The speedup values are limited by the constrained resource sharing of the multicore computer's memory. Nevertheless, significant time savings are achieved with parallel computation. For the four-group fission source problem, when 4316 interpolation nodes are employed, the use of seven processors instead of sequential computation decreases the computation time of the meshless approach by 716 s.

    FMM-based vortex method for simulation of isotropic turbulence on GPUs, compared with a spectral method

    The Lagrangian vortex method offers an alternative numerical approach for direct numerical simulation (DNS) of turbulence. The fact that it uses the fast multipole method (FMM), a hierarchical algorithm for N-body problems with highly scalable parallel implementations, as its numerical engine makes it a potentially good candidate for exascale systems. However, there have been few validation studies of Lagrangian vortex simulations, and the insufficient comparisons against standard DNS codes have left ample room for skepticism. This paper presents a comparison between a Lagrangian vortex method and a pseudo-spectral method for the simulation of decaying homogeneous isotropic turbulence. This flow field is chosen because it is ideal for the quantitative validation of DNS codes, even though it is not the most favorable flow problem for particle methods (which shine in wake flows or where vorticity is compact). We use a $256^3$ grid with $Re_\lambda = 50$ and $100$ and look at the turbulence statistics, including high-order moments. The focus is on the effect of the various parameters in the vortex method, e.g., order of FMM series expansion, frequency of reinitialization, overlap ratio and time step. The vortex method uses an FMM code (exaFMM) that runs on GPU hardware using CUDA, while the spectral code (hit3d) runs on CPU only. Results indicate that, for this application (and with the current code implementations), the spectral method is an order of magnitude faster than the vortex method when using a single GPU for the FMM and six CPU cores for the FFT.

    Pole assignment control design for time–varying time–delay systems using radial basis functions

    Systems with time-varying time delays present a particularly challenging control problem. They have been observed across a wide array of domains, from hydraulic actuators to insulin delivery control systems. Control systems that address system time delays, nonlinearities and uncertainty are the subject of much research but, whilst the specific concept of varying time delays is sometimes acknowledged (for example in the control of hydraulic manipulators), this appears to be less widely investigated than some other types of nonlinearity. Motivated in part by recent research into internal multi-model control, as similarly applied to systems with unknown time-varying delays, the present work utilises a Gaussian radial basis function to switch between two or more partial controllers. Each partial controller is based on a linear model with a (time-invariant) time delay. The new algorithm is developed and evaluated via simulation using a non-minimal state space (NMSS) framework, with pole assignment as the design criterion. Simulation results suggest that it yields improved performance in comparison to a simpler switching approach and the equivalent linear control system. However, laboratory examples and further research into robustness and stability are required as the next step.
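One way to picture Gaussian RBF switching between partial controllers is as a smooth, normalised weighting of their outputs. The sketch below is an illustration under stated assumptions, not the algorithm from the paper: it assumes the scheduling variable is an estimate of the current delay, that each partial controller has a nominal delay used as an RBF centre, and that a single shared `width` is used. The names `rbf_weights` and `blended_control` are hypothetical.

```python
import math

def rbf_weights(delay, centres, width):
    """Normalised Gaussian RBF weights for an estimated delay (assumed input).

    centres : nominal delay of each partial controller (assumption)
    width   : shared Gaussian width, same units as the delay (assumption)
    """
    raw = [math.exp(-((delay - c) ** 2) / (2.0 * width ** 2)) for c in centres]
    total = sum(raw)
    if total == 0.0:
        # All kernels underflowed: fall back to the nearest centre.
        nearest = min(range(len(centres)), key=lambda i: abs(delay - centres[i]))
        return [1.0 if i == nearest else 0.0 for i in range(len(centres))]
    return [w / total for w in raw]

def blended_control(delay, centres, width, partial_outputs):
    """Weighted sum of the partial controllers' outputs."""
    weights = rbf_weights(delay, centres, width)
    return sum(w * u for w, u in zip(weights, partial_outputs))

# Hypothetical example: two partial controllers designed for delays of
# 1 and 2 samples, currently proposing inputs 10.0 and 20.0.
u = blended_control(1.5, [1.0, 2.0], 0.5, [10.0, 20.0])
```

A small `width` makes the blend approach a hard switch to the nearest partial controller, while a large `width` averages the controllers more evenly.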