Hierarchical Dynamic Loop Self-Scheduling on Distributed-Memory Systems Using an MPI+MPI Approach

Author: Ciorba Florina M.; Eleliemy Ahmed
Publication date: 01/01/2019

Computationally-intensive loops are the primary source of parallelism in scientific applications. Such loops are often irregular and a balanced execution of their loop iterations is critical for achieving high performance. However, several factors may lead to an imbalanced load execution, such as problem characteristics, algorithmic, and systemic variations. Dynamic loop self-scheduling (DLS) techniques are devised to mitigate these factors, and consequently, improve application performance. On distributed-memory systems, DLS techniques can be implemented using a hierarchical master-worker execution model and are, therefore, called hierarchical DLS techniques. These techniques self-schedule loop iterations at two levels of hardware parallelism: across and within compute nodes. Hybrid programming approaches that combine the message passing interface (MPI) with open multi-processing (OpenMP) dominate the implementation of hierarchical DLS techniques. The MPI-3 standard includes the feature of sharing memory regions among MPI processes. This feature introduced the MPI+MPI approach that simplifies the implementation of parallel scientific applications. The present work designs and implements hierarchical DLS techniques by exploiting the MPI+MPI approach. Four well-known DLS techniques are considered in the evaluation proposed herein. The results indicate certain performance advantages of the proposed approach compared to the hybrid MPI+OpenMP approach.
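
The two-level idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation and uses no MPI: it only computes chunk sizes in plain Python. The abstract does not name the four DLS techniques, so the chunking rule here, guided self-scheduling (each chunk is the remaining iterations divided by the number of workers, rounded up), is an assumption chosen for illustration, and the function names `gss_chunks` and `hierarchical_schedule` are hypothetical.

```python
import math


def gss_chunks(total_iterations, workers):
    """Yield chunk sizes under guided self-scheduling (assumed rule):
    each chunk is ceil(remaining / workers)."""
    remaining = total_iterations
    while remaining > 0:
        chunk = math.ceil(remaining / workers)
        yield chunk
        remaining -= chunk


def hierarchical_schedule(total_iterations, nodes, workers_per_node):
    """Return a list of (node_chunk, local_chunks) pairs.

    Level 1: the loop is cut into node-level chunks across compute nodes.
    Level 2: each node-level chunk is cut again among that node's workers.
    """
    return [
        (node_chunk, list(gss_chunks(node_chunk, workers_per_node)))
        for node_chunk in gss_chunks(total_iterations, nodes)
    ]


# Hypothetical sizes: 100 iterations, 2 nodes, 4 workers per node.
schedule = hierarchical_schedule(100, nodes=2, workers_per_node=4)
```

The sketch only shows how chunk sizes shrink at both levels. In an actual self-scheduling run, whichever node or worker becomes idle first requests the next chunk, which this sketch does not model.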

arXiv.org e-Print Archive

Crossref

edoc

From carbon nanotubes and silicate layers to graphene platelets for polymer nanocomposites

Author: Ajayan; Alex Sovi; Andrew Michelmore; Banhart; Bose; Cao; Caprino; Case; Dai; Fukushima; Ganguli; Gu; Hsiao; Hsu-Chiang Kuan; Huang; Izzuddin Zaman; Jia; Jingfei Dai; Jun Ma; Kuan; Le; Lee; Lee Luong; Li; Liu; Lu; Ma; Nemes-Incze; Nobuyuki Kawashima; Park; Rochefort; Schnorr; Shen; Songyi Dong; Su; Suhr; Tang; Wang; Wu; Zaman; Zhang
Publication venue: 'Royal Society of Chemistry (RSC)'
Publication date: 01/01/2012

In spite of extensive studies conducted on carbon nanotubes and silicate layers for their polymer-based nanocomposites, the rise of graphene now provides a more promising candidate due to its exceptionally high mechanical performance and electrical and thermal conductivities. The present study developed a facile approach to fabricate epoxy–graphene nanocomposites by thermally expanding a commercial product followed by ultrasonication and solution-compounding with epoxy, and investigated their morphologies, mechanical properties, electrical conductivity and thermal mechanical behaviour. Graphene platelets (GnPs) of 3.5

UTHM Institutional Repository

Crossref

An adaptive hierarchical domain decomposition method for parallel contact dynamics simulations of granular materials

Author: Allen; Anitescu; Brendel; Calvetti; Cundall; Deng; Dietrich E. Wolf; Fleissner; Haff; Iglberger; Jean; Joer; Jourdan; János Török; Kadau; Kaufman; Knudsen; Lothar Brendel; Luding; Lötstedt; M. Reza Shaebani; McNamara; Miller; Moreau; Mueth; Nassi; Nyland; Plimpton; Press; Radjai; Rapaport; Renouf; Revathi; Rock; Shaebani; Stewart; Unger; Wackenhut; Walton; Zahra Shojaaee
Publication venue: 'Elsevier BV'
Publication date: 28/12/2011

A fully parallel version of the contact dynamics (CD) method is presented in this paper. For large enough systems, 100% efficiency has been demonstrated for up to 256 processors using a hierarchical domain decomposition with dynamic load balancing. The iterative scheme to calculate the contact forces is left domain-wise sequential, with data exchange after each iteration step, which ensures its stability. The number of additional iterations required for convergence by the partially parallel updates at the domain boundaries becomes negligible with increasing number of particles, which allows for an effective parallelization. Compared to the sequential implementation, we found no influence of the parallelization on simulation results. Comment: 19 pages, 15 figures, published in Journal of Computational Physics (2011)

arXiv.org e-Print Archive

Crossref