15,183 research outputs found

Parallel Space Decomposition of the Mesh Adaptive Direct Search Algorithm

Author: Charles Audet
J. E. Dennis Jr.
Sébastien Le Digabel
Publication date: 01/01/2008

This paper describes a Parallel Space Decomposition (PSD) technique for the Mesh Adaptive Direct Search (MADS) algorithm. MADS extends Generalized Pattern Search for constrained nonsmooth optimization problems. The objective here is to solve larger problems more efficiently. The new method (PSD-MADS) is an asynchronous parallel algorithm in which the processes solve problems over subsets of variables. The convergence analysis based on the Clarke calculus is essentially the same as for the MADS algorithm. A practical implementation is described and some numerical results on problems with up to 500 variables illustrate advantages and limitations of PSD-MADS
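The idea of solving subproblems over subsets of variables can be illustrated with a minimal sketch. It makes several assumptions: a plain compass (coordinate) search stands in for MADS, the subproblems run one after another rather than asynchronously in parallel, and the variable blocks are fixed by hand. All function names are hypothetical and none of this is taken from the PSD-MADS implementation.

```python
def compass_search_on_subset(f, x, subset, step=1.0, min_step=1e-6):
    """Minimise f by polling +/- step along the variables in `subset` only."""
    x = list(x)
    fx = f(x)
    while step >= min_step:
        improved = False
        for i in subset:
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta          # all other variables stay fixed
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
                    break
        if not improved:
            step *= 0.5                    # refine the step after a failed poll
    return x, fx


def psd_sketch(f, x0, blocks, rounds=3):
    """Sequential stand-in for the parallel scheme: each block of variables
    defines a subproblem, and an improving result replaces the incumbent."""
    x, fx = list(x0), f(x0)
    for _ in range(rounds):
        for block in blocks:
            candidate, f_candidate = compass_search_on_subset(f, x, block)
            if f_candidate < fx:
                x, fx = candidate, f_candidate
    return x, fx


def shifted_sphere(x):
    # Illustrative smooth test function with its minimum at (1, ..., 1).
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = psd_sketch(shifted_sphere, [0.0] * 6,
                            blocks=[[0, 1], [2, 3], [4, 5]])
```

In a parallel version each block would be handed to a separate worker, and the incumbent would be updated as results arrive.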

CiteSeerX

PolyPublie

DSpace at Rice University

Extensions à l'algorithme de recherche directe mads pour l'optimisation non lisse

Author: Le Digabel Sébastien
Publication date: 01/01/2008

Literature review of direct search methods for nonsmooth optimization -- Approach and organization of the thesis -- Nonsmooth optimization through mesh adaptive direct search and variable neighborhood search -- Parallel space decomposition of the mesh adaptive direct search algorithm -- OrthoMADS: a deterministic MADS instance with orthogonal directions

PolyPublie

A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing

Author: Ayachit U
Burstedde C
Carslaw HS
Cole KD
Ern A
Kaufman L
Kergaßner A
Lindgren LE
Mozaffar M
Schroeder WJ
Wohlers Associates Inc
Publication venue: 'Wiley'
Publication date: 01/01/2019

This work introduces an innovative parallel, fully-distributed finite element framework for growing geometries and its application to metal additive manufacturing. It is well-known that virtual part design and qualification in additive manufacturing requires highly-accurate multiscale and multiphysics analyses. Only high performance computing tools are able to handle such complexity in time frames compatible with time-to-market. However, efficiency, without loss of accuracy, has rarely held the centre stage in the numerical community. Here, in contrast, the framework is designed to adequately exploit the resources of high-end distributed-memory machines. It is grounded on three building blocks: (1) Hierarchical adaptive mesh refinement with octree-based meshes; (2) a parallel strategy to model the growth of the geometry; (3) state-of-the-art parallel iterative linear solvers. Computational experiments consider the heat transfer analysis at the part scale of the printing process by powder-bed technologies. After verification against a 3D benchmark, a strong-scaling analysis assesses performance and identifies major sources of parallel overhead. A third numerical example examines the efficiency and robustness of (2) in a curved 3D shape. Unprecedented parallelism and scalability were achieved in this work. Hence, this framework contributes to take on higher complexity and/or accuracy, not only of part-scale simulations of metal or polymer additive manufacturing, but also in welding, sedimentation, atherosclerosis, or any other physical problem where the physical domain of interest grows in time

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

Scipedia

Strict lower bounds with separation of sources of error in non-overlapping domain decomposition methods

Author: Gosselet Pierre
Rey Christian
Rey Valentine
Publication venue: 'Wiley'
Publication date: 01/01/2016

This article deals with the computation of guaranteed lower bounds of the error in the framework of finite element (FE) and domain decomposition (DD) methods. In addition to a fully parallel computation, the proposed lower bounds separate the algebraic error (due to the use of a DD iterative solver) from the discretization error (due to the FE), which enables the steering of the iterative solver by the discretization error. These lower bounds are also used to improve the goal-oriented error estimation in a substructured context. Assessments on 2D static linear mechanic problems illustrate the relevance of the separation of sources of error and the lower bounds' independence from the substructuring. We also steer the iterative solver by an objective of precision on a quantity of interest. This strategy consists in a sequence of solvings and takes advantage of adaptive remeshing and recycling of search directions. Comment: International Journal for Numerical Methods in Engineering, Wiley, 201

arXiv.org e-Print Archive

Crossref

Hal-Diderot

The cosmological simulation code GADGET-2

Author: Volker Springel
Publication venue: 'Wiley'
Publication date: 01/01/2005

We discuss the cosmological simulation code GADGET-2, a new massively parallel TreeSPH code, capable of following a collisionless fluid with the N-body method, and an ideal gas by means of smoothed particle hydrodynamics (SPH). Our implementation of SPH manifestly conserves energy and entropy in regions free of dissipation, while allowing for fully adaptive smoothing lengths. Gravitational forces are computed with a hierarchical multipole expansion, which can optionally be applied in the form of a TreePM algorithm, where only short-range forces are computed with the `tree'-method while long-range forces are determined with Fourier techniques. Time integration is based on a quasi-symplectic scheme where long-range and short-range forces can be integrated with different timesteps. Individual and adaptive short-range timesteps may also be employed. The domain decomposition used in the parallelisation algorithm is based on a space-filling curve, resulting in high flexibility and tree force errors that do not depend on the way the domains are cut. The code is efficient in terms of memory consumption and required communication bandwidth. It has been used to compute the first cosmological N-body simulation with more than 10^10 dark matter particles, reaching a homogeneous spatial dynamic range of 10^5 per dimension in a 3D box. It has also been used to carry out very large cosmological SPH simulations that account for radiative cooling and star formation, reaching total particle numbers of more than 250 million. We present the algorithms used by the code and discuss their accuracy and performance using a number of test problems. GADGET-2 is publicly released to the research community. Comment: submitted to MNRAS, 31 pages, 20 figures (reduced resolution), code available at http://www.mpa-garching.mpg.de/gadge
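The statement that long-range and short-range forces can be integrated with different timesteps can be pictured with a small kick-drift-kick sketch. This is a toy under stated assumptions, not GADGET-2's integrator: it is one-dimensional, two linear springs stand in for the short-range and long-range forces, and the names `kdk_step`, `K_SHORT` and `K_LONG` are hypothetical.

```python
def kdk_step(x, v, acc_short, acc_long, dt_long, n_sub):
    """One long step: half kick from the long-range force, n_sub leapfrog
    sub-steps driven by the short-range force, then the closing half kick."""
    v += 0.5 * dt_long * acc_long(x)
    dt = dt_long / n_sub
    for _ in range(n_sub):
        v += 0.5 * dt * acc_short(x)   # kick
        x += dt * v                    # drift
        v += 0.5 * dt * acc_short(x)   # kick
    v += 0.5 * dt_long * acc_long(x)
    return x, v


# Stiff spring = "short-range" force, soft spring = "long-range" force.
K_SHORT, K_LONG = 25.0, 1.0


def energy(x, v):
    return 0.5 * v * v + 0.5 * (K_SHORT + K_LONG) * x * x


x, v = 1.0, 0.0
e0 = energy(x, v)
max_drift = 0.0
for _ in range(2000):
    x, v = kdk_step(x, v,
                    lambda q: -K_SHORT * q,
                    lambda q: -K_LONG * q,
                    dt_long=0.05, n_sub=4)
    max_drift = max(max_drift, abs(energy(x, v) - e0) / e0)
```

Tracking `max_drift` is a simple way to check that the energy error stays bounded rather than growing, which is the behaviour expected of a symplectic-type scheme.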

arXiv.org e-Print Archive

A Parallel Adaptive P3M code with Hierarchical Particle Reordering

Author: Robert J. Thacker
H.M.P. Couchman
Publication venue: 'Elsevier BV'
Publication date: 01/01/2005

We discuss the design and implementation of HYDRA_OMP, a parallel implementation of the Smoothed Particle Hydrodynamics-Adaptive P3M (SPH-AP3M) code HYDRA. The code is designed primarily for conducting cosmological hydrodynamic simulations and is written in Fortran77+OpenMP. A number of optimizations for RISC processors and SMP-NUMA architectures have been implemented, the most important optimization being hierarchical reordering of particles within chaining cells, which greatly improves data locality thereby removing the cache misses typically associated with linked lists. Parallel scaling is good, with a minimum parallel scaling of 73% achieved on 32 nodes for a variety of modern SMP architectures. We give performance data in terms of the number of particle updates per second, which is a more useful performance metric than raw MFlops. A basic version of the code will be made available to the community in the near future. Comment: 34 pages, 12 figures, accepted for publication in Computer Physics Communication
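The data-locality argument can be made concrete with a small sketch of particle reordering by chaining cell. It assumes a single-level reorder in a cubic box, which is simpler than the hierarchical scheme the abstract describes, and it is written in Python rather than Fortran77. The function name and arguments are hypothetical.

```python
def reorder_by_cell(positions, box, n_cells):
    """Sort particles so that each chaining cell's particles are contiguous.

    Returns the reordered positions and an offsets table in which
    start[c]:start[c + 1] is the slice holding the particles of cell c.
    The offsets table replaces the per-cell linked list.
    """
    def cell_of(p):
        ix, iy, iz = (min(int(c / box * n_cells), n_cells - 1) for c in p)
        return (ix * n_cells + iy) * n_cells + iz

    cells = [cell_of(p) for p in positions]
    order = sorted(range(len(positions)), key=cells.__getitem__)
    sorted_positions = [positions[i] for i in order]

    # Prefix sum of per-cell counts gives the first index of every cell.
    start = [0] * (n_cells ** 3 + 1)
    for c in cells:
        start[c + 1] += 1
    for c in range(n_cells ** 3):
        start[c + 1] += start[c]
    return sorted_positions, start


particles = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.95, 0.9, 0.9), (0.1, 0.2, 0.1)]
sorted_particles, start = reorder_by_cell(particles, box=1.0, n_cells=2)
```

After the reorder, a loop over one cell walks a contiguous slice instead of chasing list pointers through memory.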

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

Scalable parallel simulation of small-scale structures in cold dark matter

Author: Shirokov Alexander V. (Alexander Victorovich)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2005

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Physics, 2005. Includes bibliographical references (p. 179-181). We present a parallel implementation of the particle-particle/particle-mesh (P³M) algorithm for distributed memory clusters. The llp3m-hc code uses a hybrid method for both computation and domain decomposition. Long-range forces are computed using a Fourier transform gravity solver on a regular mesh; the mesh is distributed across parallel processes using a static one-dimensional slab domain decomposition. Short-range forces are computed by direct summation of close pairs; particles are distributed using a dynamic domain decomposition based on a space-filling Hilbert curve. A nearly-optimal method was devised to dynamically repartition the particle distribution so as to maintain load balance even for extremely inhomogeneous mass distributions. Tests using 800³ simulations on a 40-processor Beowulf cluster showed good load balance and scalability up to 80 processes. We discuss the limits on scalability imposed by communication and extreme clustering and suggest how they may be removed by extending our algorithm to include a new adaptive P³M technique, which we then introduce and present as a new llap3m-hc code. We optimize free parameters of adaptive P³M to minimize force errors and the timing required to compute short range forces. We apply our codes to simulate small scale structure of the universe at redshift z > 50. We observe and analyze the formation of caustics in the structure and compare it with the predictions of semi-analytic models of structure formation. The current limits on neutralino detection experiments assume a Maxwell-Boltzmann velocity distribution and smooth spatial distribution of dark matter. It is shown in this thesis that inhomogeneous distribution of dark matter on small scales significantly changes the predicted event rates in direct detection dark matter experiments. The effect of spatial inhomogeneity weakens the upper limits on neutralino cross section produced in the Cryogenic Dark Matter Search Experiment. By Alexander V. Shirokov. Ph.D
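A domain decomposition along a Hilbert curve can be sketched in a few lines. The sketch below is two-dimensional and uses the standard bit-manipulation mapping from cell coordinates to curve index. Its greedy cut by cumulative weight is an assumption made for illustration and is not the nearly-optimal repartitioning method of the thesis. All names are hypothetical.

```python
def hilbert_key(n, x, y):
    """Index of cell (x, y) along a Hilbert curve on an n x n grid (n a power of 2)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/flip the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_weight(key_weight_pairs, n_parts):
    """Cut the key-ordered sequence where the cumulative weight crosses
    k * total / n_parts, so each part carries roughly equal work."""
    items = sorted(key_weight_pairs)
    total = sum(w for _, w in items)
    parts, current, acc = [], [], 0.0
    for key, w in items:
        current.append(key)
        acc += w
        if len(parts) < n_parts - 1 and acc >= total * (len(parts) + 1) / n_parts:
            parts.append(current)
            current = []
    parts.append(current)
    while len(parts) < n_parts:          # pad if a few heavy cells used up the cuts
        parts.append([])
    return parts


N = 4
cells = [(x, y) for x in range(N) for y in range(N)]
by_key = sorted(cells, key=lambda c: hilbert_key(N, c[0], c[1]))
domains = partition_by_weight([(hilbert_key(N, x, y), 1.0) for x, y in cells], 4)
```

Because consecutive cells on the curve are spatial neighbours, each contiguous key range maps to a compact region, and shifting the cut points is enough to rebalance when the weights change.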

DSpace@MIT

A Fast Parallel Poisson Solver on Irregular Domains Applied to Beam Dynamic Simulations

Author: A. Adelmann
P. Arbenz
Y. Ineichen
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We discuss the scalable parallel solution of the Poisson equation within a Particle-In-Cell (PIC) code for the simulation of electron beams in particle accelerators of irregular shape. The problem is discretized by Finite Differences. Depending on the treatment of the Dirichlet boundary the resulting system of equations is symmetric or `mildly' nonsymmetric positive definite. In all cases, the system is solved by the preconditioned conjugate gradient algorithm with smoothed aggregation (SA) based algebraic multigrid (AMG) preconditioning. We investigate variants of the implementation of SA-AMG that lead to considerable improvements in the execution times. We demonstrate good scalability of the solver on distributed memory parallel processor with up to 2048 processors. We also compare our SAAMG-PCG solver with an FFT-based solver that is more commonly used for applications in beam dynamics
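The solver structure described here, a finite-difference Poisson system handed to preconditioned conjugate gradients, can be sketched on a much smaller problem. The assumptions are substantial: the domain is a one-dimensional interval with homogeneous Dirichlet boundaries rather than an irregular 3D geometry, a Jacobi (diagonal) preconditioner stands in for SA-AMG, and everything runs serially. The function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def poisson_1d(u, h):
    """Apply the finite-difference operator for -u'' with zero Dirichlet ends."""
    n = len(u)
    return [(2.0 * u[i]
             - (u[i - 1] if i > 0 else 0.0)
             - (u[i + 1] if i < n - 1 else 0.0)) / h ** 2
            for i in range(n)]


def pcg(apply_a, b, apply_m_inv, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)
    z = apply_m_inv(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        ap = apply_a(p)
        alpha = rz / dot(p, ap)
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * ap_i for r_i, ap_i in zip(r, ap)]
        z = apply_m_inv(r)
        rz_new = dot(r, z)
        p = [z_i + (rz_new / rz) * p_i for z_i, p_i in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 31
h = 1.0 / (n + 1)
rhs = [1.0] * n                                   # -u'' = 1 on (0, 1)


def jacobi(r):
    # Inverse of the constant diagonal 2 / h^2 (stand-in preconditioner).
    return [r_i * h ** 2 / 2.0 for r_i in r]


u, iterations = pcg(lambda w: poisson_1d(w, h), rhs, jacobi)
exact = [0.5 * t * (1.0 - t) for t in (h * (i + 1) for i in range(n))]
max_err = max(abs(a - b) for a, b in zip(u, exact))
```

Only `apply_m_inv` would change when swapping in a stronger preconditioner such as multigrid; the PCG loop itself stays the same.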

arXiv.org e-Print Archive

CiteSeerX

Crossref