A new parallel algorithm for Lagrange interpolation on a hypercube

Author: Katti C.P.
Kumari R.
Publication venue: Elsevier Ltd.
Publication date: 30/04/2006

We present a new parallel algorithm for computing N-point Lagrange interpolation on an n-dimensional hypercube with a total of $p = 2^n$ nodes. Initially, we consider the case N = p. The algorithm is then extended to the case where only p (p fixed) processors are available, with p < N. We assume that N is exactly divisible by p. By dividing the hypercube into subcubes of dimension two, we compute the products and sums appearing in Lagrange's formula in a novel way that avoids wasteful repetition in forming products. The speedup and efficiency of our algorithm are calculated both theoretically and by simulating it over a network of PCs.
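As an illustration of the sums and products in Lagrange's formula, here is a minimal Python sketch. It is not the authors' hypercube algorithm: the function names `lagrange_eval` and `lagrange_eval_blocked` are hypothetical, the "processors" are simulated by a loop, and the final pairwise summation only mimics the log2(p) combining rounds one would expect on a hypercube. It assumes p is a power of two that divides N, as in the abstract.

```python
from math import prod


def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolant through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial L_i(x) = prod_{j != i} (x - x_j) / (x_i - x_j)
        basis = prod((x - xj) / (xi - xj) for j, xj in enumerate(xs) if j != i)
        total += yi * basis
    return total


def lagrange_eval_blocked(xs, ys, x, p):
    """Same sum, split into p blocks of N/p terms (one block per simulated node)."""
    n_points = len(xs)
    if p <= 0 or p & (p - 1) != 0 or n_points % p != 0:
        raise ValueError("p must be a power of two that divides N")
    block = n_points // p
    # Each simulated node computes the partial sum over its own block of terms.
    partials = [
        lagrange_eval_terms(xs, ys, x, range(r * block, (r + 1) * block))
        for r in range(p)
    ]
    # Combine partial sums pairwise; this takes log2(p) rounds.
    while len(partials) > 1:
        partials = [partials[k] + partials[k + 1] for k in range(0, len(partials), 2)]
    return partials[0]


def lagrange_eval_terms(xs, ys, x, indices):
    """Sum of y_i * L_i(x) over the given term indices only."""
    s = 0.0
    for i in indices:
        basis = prod((x - xj) / (xs[i] - xj) for j, xj in enumerate(xs) if j != i)
        s += ys[i] * basis
    return s


xs = [0.0, 1.0, 2.0, 3.0]
ys = [v * v for v in xs]          # samples of x^2
print(lagrange_eval(xs, ys, 1.5))
print(lagrange_eval_blocked(xs, ys, 1.5, p=2))
```

Note that this sketch recomputes every basis product from scratch, which is exactly the kind of repetition the abstract says the authors avoid.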

A massively parallel semi-Lagrangian solver for the six-dimensional Vlasov-Poisson equation

Author: Kormann Katharina
Rampp Markus
Reuter Klaus
Publication venue: SAGE Publications
Publication date: 01/01/2019

This paper presents an optimized and scalable semi-Lagrangian solver for the Vlasov-Poisson system in six-dimensional phase space. Grid-based solvers of the Vlasov equation are known to give accurate results. At the same time, these solvers are challenged by the curse of dimensionality, which results in very high memory requirements and calls for highly efficient parallelization schemes. In this paper, we consider the 6d Vlasov-Poisson problem discretized by a split-step semi-Lagrangian scheme, using successive 1d interpolations on 1d stripes of the 6d domain. Two parallelization paradigms are compared: a remapping scheme and a classical domain decomposition approach applied to the full 6d problem. From numerical experiments, the latter approach is found to be superior in the massively parallel case in various respects. We address the challenge of artificial time step restrictions due to the decomposition of the domain by introducing a blocked one-sided communication scheme for the purely electrostatic case and a rotating mesh for the case with a constant magnetic field. In addition, we propose a pipelining scheme that efficiently hides the cost of halo communication between neighboring processes behind useful computation. Parallel scalability on up to 65k processes is demonstrated for benchmark problems on a supercomputer.
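The basic building block mentioned above, a 1d semi-Lagrangian interpolation step, can be sketched as follows. This is a simplified illustration only, under stated assumptions: a periodic grid, constant advection velocity, and linear interpolation (the abstract does not say which interpolant the paper uses). The function name `semi_lagrangian_step` and its parameters are hypothetical.

```python
import numpy as np


def semi_lagrangian_step(f, velocity, dt, dx):
    """Advance f_t + velocity * f_x = 0 by one step on a periodic 1d grid."""
    n = f.size
    # Follow each grid point back along its characteristic (in grid-cell units).
    departure = (np.arange(n) - velocity * dt / dx) % n
    base = np.floor(departure)
    left = base.astype(int) % n        # grid index to the left of the departure point
    right = (left + 1) % n             # periodic right neighbour
    weight = departure - base          # fractional position inside the cell
    # Linear interpolation of the old values at the departure points.
    return (1.0 - weight) * f[left] + weight * f[right]


f0 = np.sin(2 * np.pi * np.arange(16) / 16)
f1 = semi_lagrangian_step(f0, velocity=1.0, dt=0.5, dx=1.0)
print(f1[:4])
```

Because the departure point may lie several cells away, the step size is not limited by a CFL condition on a single process. In a domain-decomposed setting, however, the departure point must stay within the available halo cells, which is presumably the artificial time step restriction the abstract refers to.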

arXiv.org e-Print Archive

Parallel Deterministic and Stochastic Global Minimization of Functions with Very Many Minima

Author: Castle Brent S.
Easterling David R.
Madigan Michael L.
Trosset Michael W.
Watson Layne T.
Publication date: 01/01/2011

The optimization of three problems with high dimensionality and many local minima is investigated under five different optimization algorithms: DIRECT, simulated annealing, Spall’s SPSA algorithm, the KNITRO package, and QNSTOP, a new algorithm developed at Indiana University.

Computer Science Technical Reports @Virginia Tech

Computational methods and software systems for dynamics and control of large space structures

Author: Farhat C.
Felippa C. A.
Park K. C.
Pramono E.

Two key areas of crucial importance to the computer-based simulation of large space structures are discussed. The first area involves multibody dynamics (MBD) of flexible space structures, with applications directed to deployment, construction, and maneuvering. The second area deals with advanced software systems, with emphasis on parallel processing. The latest research thrust in the second area involves massively parallel computers.

Rapid evaluation of radial basis functions

Author: Baxter Brad J.C.
Roussos George
Publication venue: Elsevier BV
Publication date: 01/01/2005

Over the past decade, the radial basis function method has been shown to produce high-quality solutions to the multivariate scattered data interpolation problem. However, this method has been associated with very high computational cost compared to alternative methods such as finite element or multivariate spline interpolation. For example, the direct evaluation at M locations of a radial basis function interpolant with N centres requires O(M N) floating-point operations. In this paper we introduce a fast evaluation method based on the Fast Gauss Transform and suitable quadrature rules. This method has been applied to the Hardy multiquadric, the inverse multiquadric and the thin-plate spline to reduce the computational complexity of the interpolant evaluation to O(M + N) floating-point operations. By using certain localisation properties of conditionally negative definite functions, this method has several performance advantages over traditional hierarchical rapid summation methods, which we discuss in detail.
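To make the O(M N) baseline concrete, the following sketch shows direct evaluation of a Hardy multiquadric interpolant, which forms all M × N point-to-centre distances. It illustrates only the costly direct method, not the fast Gauss transform approach of the paper. The function name `multiquadric_direct` and the shape parameter `c` are assumptions made for illustration.

```python
import numpy as np


def multiquadric_direct(centres, coeffs, points, c=1.0):
    """Evaluate s(x) = sum_j coeffs[j] * sqrt(|x - centres[j]|^2 + c^2) at every point.

    centres: (N, d) array, coeffs: (N,) array, points: (M, d) array.
    Cost and memory are O(M * N), since every point is paired with every centre.
    """
    diff = points[:, None, :] - centres[None, :, :]   # shape (M, N, d)
    r2 = np.sum(diff * diff, axis=-1)                 # squared distances, shape (M, N)
    return np.sqrt(r2 + c * c) @ coeffs               # shape (M,)


rng = np.random.default_rng(0)
centres = rng.random((50, 2))
coeffs = rng.standard_normal(50)
points = rng.random((200, 2))
values = multiquadric_direct(centres, coeffs, points)
print(values.shape)
```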

Birkbeck Institutional Research Online

On sequential and parallel solution of initial value problems

Author: Kacewicz Bolesław Z
Publication venue: Elsevier Inc.
Publication date: 30/06/1990

We deal with the solution of systems $z'(x) = f(x, z(x))$, $x \in [0, 1]$, $z(0) = \eta$, where the function $f: [0, 1] \times \mathbb{R}^s \to \mathbb{R}^s$ has $r$ continuous bounded partial derivatives. We assume that available information about the problem consists of evaluations of $n$ linear functionals at $f$. If an adaptive choice of these functionals is allowed (which is suitable for sequential processing), then the minimal error of an algorithm is of order $n^{-(r+1)}$, for any dimension $s$. We show that if nonadaptive information (well-suited for parallel computation) is used, then the minimal error cannot be essentially less than $n^{-(r+1)/(s+1)}$. Thus, adaption is significantly better, and the advantage of using it grows with $s$. It follows that the ε-complexity in sequential computation is smaller for adaptive information. For parallel computation, nonadaptive information is more efficient only if the number of processors is very large, depending exponentially on the dimension $s$. We conclude that using parallelism by computing the information nonadaptively is not feasible.
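The gap between the two error rates can be made explicit with a short worked calculation. This is a rough sketch that ignores constants and reads the nonadaptive bound as $n^{-(r+1)/(s+1)}$. It inverts each rate to find the number of functional evaluations $n$ needed for error at most $\varepsilon$.

```latex
\begin{align*}
  \text{adaptive:}    \quad & n^{-(r+1)} \le \varepsilon
      \iff n \ge \varepsilon^{-1/(r+1)}, \\
  \text{nonadaptive:} \quad & n^{-(r+1)/(s+1)} \le \varepsilon
      \iff n \ge \varepsilon^{-(s+1)/(r+1)}, \\
  \text{ratio:}       \quad & \frac{\varepsilon^{-(s+1)/(r+1)}}{\varepsilon^{-1/(r+1)}}
      = \varepsilon^{-s/(r+1)}.
\end{align*}
```

For instance, with $r = 1$, $s = 3$ and $\varepsilon = 10^{-2}$, the adaptive count is on the order of $10$ evaluations while the nonadaptive count is on the order of $10^{4}$. The ratio grows exponentially in $s$, consistent with the abstract's remark that the required number of processors depends exponentially on the dimension.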