A Parallel Mesh-Adaptive Framework for Hyperbolic Conservation Laws
We report on the development of a computational framework for the parallel,
mesh-adaptive solution of systems of hyperbolic conservation laws like the
time-dependent Euler equations in compressible gas dynamics or
Magneto-Hydrodynamics (MHD) and similar models in plasma physics. Local mesh
refinement is realized by the recursive bisection of grid blocks along each
spatial dimension. Implemented numerical schemes include standard
finite-differences as well as shock-capturing central schemes, both in
connection with Runge-Kutta type integrators. Parallel execution is achieved
through a configurable hybrid of POSIX-multi-threading and MPI-distribution
with dynamic load balancing. One-, two-, and three-dimensional test computations
for the Euler equations have been carried out and show good parallel scaling
behavior. The Racoon framework is currently used to study the formation of
singularities in plasmas and fluids.
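The refinement strategy described above, recursive bisection of grid blocks along each spatial dimension, can be sketched as follows. This is a minimal illustration only; the `Block` class and `refine` method are hypothetical names, not taken from the Racoon source.

```python
from itertools import product

class Block:
    """A rectangular grid block in d dimensions (illustrative, not Racoon code)."""
    def __init__(self, lo, hi, level=0):
        self.lo, self.hi, self.level = lo, hi, level   # bounding box per dimension
        self.children = []

    def refine(self):
        """Bisect this block along every spatial dimension, yielding 2^d children."""
        mids = [(l + h) / 2 for l, h in zip(self.lo, self.hi)]
        for corner in product(*[(0, 1)] * len(self.lo)):
            lo = [self.lo[d] if c == 0 else mids[d] for d, c in enumerate(corner)]
            hi = [mids[d] if c == 0 else self.hi[d] for d, c in enumerate(corner)]
            self.children.append(Block(lo, hi, self.level + 1))
        return self.children

root = Block([0.0, 0.0], [1.0, 1.0])   # a 2D block covering the unit square
kids = root.refine()                    # 4 children in 2D
kids[0].refine()                        # refine one child again for local resolution
```

Repeating `refine` only where an error indicator demands it yields the locally refined block tree the abstract describes.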
Distributed-memory parallelization of an explicit time-domain volume integral equation solver on Blue Gene/P
Two distributed-memory schemes for efficiently parallelizing the explicit marching-on-in-time based solution of the time-domain volume integral equation on the IBM Blue Gene/P platform are presented. In the first scheme, each processor stores the time history of all source fields, and only the computationally dominant step of the tested field computations is distributed among processors. This scheme requires all-to-all global communications to update the time history of the source fields from the tested fields. In the second scheme, the source fields as well as all steps of the tested field computations are distributed among processors. This scheme requires sequential global communications to update the time history of the distributed source fields from the tested fields. Numerical results demonstrate that both schemes scale well on the IBM Blue Gene/P platform, and the memory-efficient second scheme allows for the characterization of transient wave interactions on composite structures discretized using three million spatial elements without an acceleration algorithm.
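The marching-on-in-time recursion that both schemes parallelize can be sketched serially as follows. This is a hedged toy model: the interaction matrices `Z[k]`, the field sizes, and the history depth are illustrative stand-ins, and the distribution over MPI ranks is only indicated in comments.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_hist, n_steps = 8, 4, 10

# Z[k]: interaction matrix coupling sources k steps in the past to the
# tested fields at the current step (random stand-ins here)
Z = [rng.standard_normal((n_src, n_src)) * 0.1 for _ in range(n_hist)]

history = [np.zeros(n_src) for _ in range(n_steps + 1)]
history[0] = rng.standard_normal(n_src)      # initial excitation

for n in range(1, n_steps + 1):
    # tested fields: discrete convolution over the stored time history.
    # In the first scheme this sum of matrix-vector products is the part
    # distributed among processors, followed by an all-to-all update of
    # the replicated history; in the second scheme the history itself is
    # also distributed.
    tested = sum(Z[k] @ history[n - 1 - k] for k in range(min(n_hist, n)))
    history[n] = tested                      # explicit update of the sources
```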
A scalable H-matrix approach for the solution of boundary integral equations on multi-GPU clusters
In this work, we consider the solution of boundary integral equations by
means of a scalable hierarchical matrix approach on clusters equipped with
graphics hardware, i.e. graphics processing units (GPUs). To this end, we
extend our existing single-GPU hierarchical matrix library hmglib such that it
is able to scale on many GPUs and such that it can be coupled to arbitrary
application codes. Using a model GPU implementation of a boundary element
method (BEM) solver, we are able to achieve more than 67 percent relative
parallel speed-up going from 128 to 1024 GPUs for a model geometry test case
with 1.5 million unknowns and a real-world geometry test case with almost 1.2
million unknowns. On 1024 GPUs of the cluster Titan, it takes less than 6
minutes to solve the 1.5 million unknowns problem, with 5.7 minutes for the
setup phase and 20 seconds for the iterative solver. To the best of the
authors' knowledge, we here discuss the first fully GPU-based
distributed-memory parallel hierarchical matrix Open Source library using the
traditional H-matrix format and adaptive cross approximation with an
application to BEM problems.
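The compression scheme named above, adaptive cross approximation, can be illustrated with a small self-contained sketch. The `aca` function below is a toy version, not the hmglib implementation: it keeps the full residual matrix for clarity, whereas real H-matrix codes evaluate only the pivot rows and columns of the kernel.

```python
import numpy as np

def aca(A, tol=1e-10, max_rank=40):
    """Adaptive cross approximation with partial pivoting (toy version)."""
    R = np.array(A, dtype=float, copy=True)   # explicit residual, for clarity only
    U, V = [], []
    i, used = 0, {0}
    for _ in range(max_rank):
        j = int(np.argmax(np.abs(R[i])))      # column pivot in row i
        pivot = R[i, j]
        if abs(pivot) < tol:
            break
        u = R[:, j] / pivot
        v = R[i].copy()
        R -= np.outer(u, v)                   # deflate the rank-1 "cross"
        U.append(u)
        V.append(v)
        if len(used) == R.shape[0]:
            break
        # next row pivot: largest entry of u among rows not yet used
        order = np.argsort(-np.abs(u))
        i = next(int(r) for r in order if int(r) not in used)
        used.add(i)
    return np.array(U).T, np.array(V)

# well-separated 1/|x - y| kernel block: numerically low rank,
# the typical admissible block in a BEM H-matrix
x = np.linspace(0.0, 1.0, 40)
y = np.linspace(3.0, 4.0, 40)
A = 1.0 / np.abs(x[:, None] - y[None, :])
U, V = aca(A)
```

The product U @ V then approximates the admissible block at a small fraction of its storage, which is what makes the hierarchical format scale.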
Coupled Kinetic-Fluid Simulations of Ganymede's Magnetosphere and Hybrid Parallelization of the Magnetohydrodynamics Model
The largest moon in the solar system, Ganymede, is the only moon known to possess a strong intrinsic magnetic field.
The interaction between the Jovian plasma and Ganymede's magnetic field creates a mini-magnetosphere with periodically varying upstream conditions, which creates a perfect laboratory in nature for studying magnetic reconnection and magnetospheric physics.
Using the latest version of the Space Weather Modeling Framework (SWMF), we study the upstream plasma interactions and dynamics in this subsonic, sub-Alfvénic system.
We have developed a coupled fluid-kinetic model, Hall magnetohydrodynamics with embedded particle-in-cell (MHD-EPIC), for Ganymede's magnetosphere, with a self-consistently coupled resistive body representing the electrical properties of the moon's interior, improved inner boundary conditions, and a high-resolution, charge- and energy-conserving PIC scheme.
I reimplemented the boundary condition setup in SWMF for more versatile control and functionality, and developed a new user module for the Ganymede simulation.
Results from the models are validated with Galileo magnetometer data of all close encounters and compared with Plasma Subsystem (PLS) data.
The energy fluxes associated with the upstream reconnection in the model are estimated to be about 10^-7 W/cm^2, which accounts for about 40% of the total peak auroral emissions observed by the Hubble Space Telescope.
We find that under steady upstream conditions, magnetopause reconnection in our fluid-kinetic simulations occurs in a non-steady manner.
Flux ropes with lengths comparable to Ganymede's radius form on the magnetopause at a rate of about 3 per minute and create spatiotemporal variations in plasma and field properties.
At sufficient grid resolution, the MHD-EPIC model can resolve both electron and ion kinetics at the magnetopause, showing localized crescent-shaped distributions in both ion and electron phase space and non-gyrotropic, non-isotropic behavior inside the diffusion regions.
The estimated global reconnection rate from the models is about 80 kV with 60% efficiency.
There is weak evidence of minute-scale periodicity in the temporal variations of the reconnection rate due to the dynamic reconnection process.
The requirement of high-fidelity results motivates the development of a hybrid-parallel numerical model strategy and faster data-processing techniques.
The state-of-the-art finite volume/difference MHD code Block Adaptive Tree Solarwind Roe Upwind Scheme (BATS-R-US) was originally designed with pure MPI parallelization.
The maximum problem size achievable was limited by the storage requirements of the block tree structure.
To mitigate this limitation, we have added multithreaded OpenMP parallelization to the previous pure MPI implementation.
We opt to use a coarse-grained approach by making the loops over grid blocks multithreaded and have succeeded in making BATS-R-US an efficient hybrid parallel code with modest changes in the source code while preserving the performance.
Good weak scaling up to 500,000 and 250,000 cores is achieved for the explicit and implicit time-stepping schemes, respectively.
This parallelization strategy greatly extends the possible simulation scale by an order of magnitude, and paves the way for future GPU-portable code development.
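The coarse-grained strategy described above, threading the loop over grid blocks while MPI distributes blocks across ranks, can be sketched as follows. This is an illustrative Python analogue of the Fortran/OpenMP approach with a toy per-block kernel; none of the names come from BATS-R-US, and the MPI layer is not shown.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def update_block(block, dt=0.1):
    """Toy per-block explicit diffusion step; stands in for the real
    stencil kernel applied to one grid block."""
    lap = block[:-2] - 2.0 * block[1:-1] + block[2:]
    block[1:-1] = block[1:-1] + dt * lap
    return block

# the blocks owned by one MPI rank; each rank holds its own disjoint set
local_blocks = [np.linspace(0.0, 1.0, 16) for _ in range(8)]

# thread the loop over local grid blocks, mirroring an OpenMP
# "parallel do" wrapped around the block loop in the source code
with ThreadPoolExecutor(max_workers=4) as pool:
    local_blocks = list(pool.map(update_block, local_blocks))
```

Because threads share the block tree, its storage cost is paid once per node rather than once per core, which is what relaxes the memory limit described above.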
To improve visualization and data processing, I have developed a new data-processing workflow in the Julia programming language for efficient data analysis and visualization.
In summary:
1. I built a single-fluid Hall MHD-EPIC model of Ganymede's magnetosphere;
2. I performed a detailed analysis of the upstream reconnection;
3. I developed an MPI+OpenMP hybrid parallel MHD model with BATS-R-US;
4. I wrote a package for data analysis and visualization.
PhD dissertation, Climate and Space Sciences and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies.
http://deepblue.lib.umich.edu/bitstream/2027.42/163032/1/hyzhou_1.pd
A Parallel Iterative Method for Computing Molecular Absorption Spectra
We describe a fast parallel iterative method for computing molecular
absorption spectra within TDDFT linear response and using the LCAO method. We
use a local basis of "dominant products" to parametrize the space of orbital
products that occur in the LCAO approach. In this basis, the dynamical
polarizability is computed iteratively within an appropriate Krylov subspace.
The iterative procedure uses a matrix-free GMRES method to determine the
(interacting) density response. The resulting code is about one order of
magnitude faster than our previous full-matrix method. This acceleration makes
the speed of our TDDFT code comparable with codes based on Casida's equation.
The implementation of our method uses hybrid MPI and OpenMP parallelization in
which load balancing and memory access are optimized. To validate our approach
and to establish benchmarks, we compute spectra of large molecules on various
types of parallel machines.
The methods developed here are fairly general and we believe they will find
useful applications in molecular physics/chemistry, even for problems that are
beyond TDDFT, such as organic semiconductors, particularly in photovoltaics.
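The matrix-free idea at the core of the method, that GMRES needs only the action of the response operator on a vector, can be sketched as follows. This is a hedged toy example: the operator is a random stand-in, not a TDDFT kernel.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(1)
n = 200
K = rng.standard_normal((n, n)) / n   # stand-in for the interaction kernel
b = rng.standard_normal(n)            # stand-in for the non-interacting response

def matvec(v):
    # apply A = I - K to a vector without ever forming A explicitly;
    # this matvec is all that GMRES requires from the physics code
    return v - K @ v

A = LinearOperator((n, n), matvec=matvec)
x, info = gmres(A, b)                 # info == 0 signals convergence
```

Since only matvecs are needed, the full dense response matrix is never built, which is the source of the order-of-magnitude speed-up over the full-matrix method.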
Parallel Adaptive Monte Carlo Integration with the Event Generator WHIZARD
We describe a new parallel approach to the evaluation of phase space for
Monte-Carlo event generation, implemented within the framework of the WHIZARD
package. The program realizes a twofold self-adaptive multi-channel
parameterization of phase space and makes use of the standard OpenMP and MPI
protocols for parallelization. The modern MPI3 feature of asynchronous
communication is an essential ingredient of the computing model. Parallel
numerical evaluation applies both to phase-space integration and to event
generation, thus covering the most computing-intensive parts of physics
simulation for a realistic collider environment.
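The self-adaptive multi-channel idea can be illustrated with a one-dimensional toy integrand. The channels, mappings, and adaptation rule below are simplified stand-ins, not WHIZARD code.

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: 0.5 / np.sqrt(x)          # peaked integrand on (0, 1), exact integral 1

# each channel: (mapping u -> x, density g_i(x) of the mapped points)
channels = [
    (lambda u: u,      lambda x: np.ones_like(x)),    # flat channel
    (lambda u: u ** 2, lambda x: 0.5 / np.sqrt(x)),   # channel adapted to the peak
]
alpha = np.array([0.5, 0.5])            # channel weights, adapted below

for _ in range(4):                       # adaptation iterations
    n = 20_000
    pick = rng.choice(2, size=n, p=alpha)
    u = rng.uniform(1e-12, 1.0, n)
    x = np.where(pick == 0, channels[0][0](u), channels[1][0](u))
    g = alpha[0] * channels[0][1](x) + alpha[1] * channels[1][1](x)
    w = f(x) / g                         # multi-channel weights
    estimate = w.mean()
    # variance-based update: channels that see more variance gain weight
    W = np.array([np.mean(w ** 2 * ch[1](x) / g) for ch in channels])
    alpha = alpha * np.sqrt(W)
    alpha = alpha / alpha.sum()
```

The weight update steadily shifts sampling toward the channel matched to the peak, which is the one-dimensional analogue of the self-adaptive multi-channel phase-space parameterization.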
An Efficient Parallel Computing Method for Processing Large Amounts of Sensor-Collected Data
In recent years we have witnessed the advent of the Internet of Things and the wide deployment of sensors in many applications for collecting and aggregating data. Efficient techniques are required to analyze these massive data to support intelligent decision making. Partial differential problems involving large data are common in engineering and scientific research. For simulations of large-scale three-dimensional partial differential equations, the intensive computational demands and large memory requirements are the main research problems. To address these two challenges, this paper provides an effective parallel method for partial differential equations. The proposed approach combines an overlapping domain decomposition strategy with multi-core cluster technology to achieve parallel simulations of partial differential equations, uses the finite difference method to discretize the equations, and adopts the hybrid MPI/OpenMP programming model to exploit two-level parallelism on a multi-core cluster. A three-dimensional groundwater flow model with the parallel finite-difference overlapping domain decomposition strategy was successfully set up and run with the parallel MPI/OpenMP implementation on a multi-core cluster with two nodes. The experimental results show that the proposed parallel approach can efficiently simulate partial differential problems with large amounts of data.
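The combination named above, finite-difference discretization with an overlapping domain decomposition, can be sketched serially on a one-dimensional model problem. The subdomain split and solver are illustrative only; in the paper each subdomain maps to an MPI process with OpenMP threads inside.

```python
import numpy as np

n = 101                                  # grid points on [0, 1]
h = 1.0 / (n - 1)
u = np.zeros(n)                          # Dirichlet data: u[0] = u[-1] = 0
f = np.ones(n)                           # right-hand side of -u'' = f

def solve_subdomain(u, lo, hi):
    """Direct solve of the finite-difference system on points lo..hi,
    taking Dirichlet data at lo and hi from the current iterate u."""
    m = hi - lo - 1                      # number of interior unknowns
    A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h ** 2
    b = f[lo + 1:hi].copy()
    b[0] += u[lo] / h ** 2
    b[-1] += u[hi] / h ** 2
    u[lo + 1:hi] = np.linalg.solve(A, b)

# alternating Schwarz sweeps over two subdomains overlapping on [0.4, 0.6];
# exchanging the overlap values stands in for MPI halo communication
for _ in range(30):
    solve_subdomain(u, 0, 60)            # subdomain 1: [0.0, 0.6]
    solve_subdomain(u, 40, n - 1)        # subdomain 2: [0.4, 1.0]

x = np.linspace(0.0, 1.0, n)
exact = 0.5 * x * (1.0 - x)              # exact solution of -u'' = 1
```

The overlap region is what makes the alternating sweeps contract toward the global solution; the wider the overlap, the faster the convergence.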
Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current successes and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.