Search CORE

706 research outputs found

Parallelized reliability estimation of reconfigurable computer networks

Author: Das Subhendu
Nicol David M.
Palumbo Dan
Publication venue
Publication date
Field of study

A parallelized system, ASSURE, for computing the reliability of embedded avionics flight control systems which are able to reconfigure themselves in the event of failure is described. ASSURE accepts a grammar that describes a reliability semi-Markov state-space. From this it creates a parallel program that simultaneously generates and analyzes the state-space, placing upper and lower bounds on the probability of system failure. ASSURE is implemented on a 32-node Intel iPSC/860, and has achieved high processor efficiencies on real problems. Through a combination of improved algorithms, exploitation of parallelism, and use of an advanced microprocessor architecture, ASSURE has reduced the execution time on substantial problems by a factor of one thousand over previous workstation implementations. Furthermore, ASSURE's parallel execution rate on the iPSC/860 is an order of magnitude faster than its serial execution rate on a Cray-2 supercomputer. While dynamic load balancing is necessary for ASSURE's good performance, it is needed only infrequently; the particular method of load balancing used does not substantially affect performance

NASA Technical Reports Server

Multilevel Parallelization of AutoDock 4.2

Author: Andrew P Norgan
B Collignon
BK Shoichet
C McInnes
Carlos P Sosa
CP Sosa
David J Katzmann
DS Goodsell
DT Moustakas
G Morris
GM Morris
H Köppen
H Park
Jean-Pierre A Kocher
JJ Irwin
JR Schames
M Mezei
M Sanner
M Thormann
MW Chang
ND Prakhov
P Khodade
P Kolb
Paul K Coffman
S Cosconati
S Vetter
S Zhang
SF Sousa
X Jiang
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Abstract Background Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Results Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Conclusions Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarkian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance

Author: Hager Georg
Kreutzer Moritz
Shahzad Faisal
Thies Jonas
Wellein Gerhard
Zeiser Thomas
Publication venue
Publication date: 07/08/2017
Field of study

In order to efficiently use the future generations of supercomputers, fault tolerance and power consumption are two of the prime challenges anticipated by the High Performance Computing (HPC) community. Checkpoint/Restart (CR) has been and still is the most widely used technique to deal with hard failures. Application-level CR is the most effective CR technique in terms of overhead efficiency but it takes a lot of implementation effort. This work presents the implementation of our C++ based library CRAFT (Checkpoint-Restart and Automatic Fault Tolerance), which serves two purposes. First, it provides an extendable library that significantly eases the implementation of application-level checkpointing. The most basic and frequently used checkpoint data types are already part of CRAFT and can be directly used out of the box. The library can be easily extended to add more data types. As means of overhead reduction, the library offers a build-in asynchronous checkpointing mechanism and also supports the Scalable Checkpoint/Restart (SCR) library for node level checkpointing. Second, CRAFT provides an easier interface for User-Level Failure Mitigation (ULFM) based dynamic process recovery, which significantly reduces the complexity and effort of failure detection and communication recovery mechanism. By utilizing both functionalities together, applications can write application-level checkpoints and recover dynamically from process failures with very limited programming effort. This work presents the design and use of our library in detail. The associated overheads are thoroughly analyzed using several benchmarks

arXiv.org e-Print Archive

Institute of Transport Research:Publications

A comparison using APPL and PVM for a parallel implementation of an unstructured grid generation program

Author: Arthur Trey
Bockelie Michael J.
Publication venue
Publication date
Field of study

Efforts to parallelize the VGRIDSG unstructured surface grid generation program are described. The inherent parallel nature of the grid generation algorithm used in VGRIDSG was exploited on a cluster of Silicon Graphics IRIS 4D workstations using the message passing libraries Application Portable Parallel Library (APPL) and Parallel Virtual Machine (PVM). Comparisons of speed up are presented for generating the surface grid of a unit cube and a Mach 3.0 High Speed Civil Transport. It was concluded that for this application, both APPL and PVM give approximately the same performance, however, APPL is easier to use

NASA Technical Reports Server

Tem_357 Harnessing the Power of Digital Transformation, Artificial Intelligence and Big Data Analytics with Parallel Computing

Author: Bong Chin Wei
Hoe Min
Jamal Ahmad Dargham
Publication venue
Publication date: 01/01/2020
Field of study

Traditionally, 2D and especially 3D forward modeling and inversion of large geophysical datasets are performed on supercomputing clusters. This was due to the fact computing time taken by using PC was too time consuming. With the introduction of parallel computing, attempts have been made to perform computationally intensive tasks on PC or clusters of personal computers where the computing power was based on Central Processing Unit (CPU). It is further enhanced with Graphical Processing Unit (GPU) as the GPU has become affordable with the launch of GPU based computing devices. Therefore this paper presents a didactic concept in learning and applying parallel computing with the use of General Purpose Graphical Processing Unit (GPGPU) was carried out and perform preliminary testing in migrating existing sequential codes for solving initially 2D forward modeling of geophysical dataset. There are many challenges in performing these tasks mainly due to lack of some necessary development software tools, but the preliminary findings are promising. Traditionally, 2D and especially 3D forward modeling and inversion of large geophysical datasets are performed on supercomputing clusters. This was due to the fact computing time taken by using PC was too time consuming. With the introduction of parallel computing, attempts have been made to perform computationally intensive tasks on PC or clusters of personal computers where the computing power was based on Central Processing Unit (CPU). It is further enhanced with Graphical Processing Unit (GPU) as the GPU has become affordable with the launch of GPU based computing devices. Therefore this paper presents a didactic concept in learning and applying parallel computing with the use of General Purpose Graphical Processing Unit (GPGPU) was carried out and perform preliminary testing in migrating existing sequential codes for solving initially 2D forward modeling of geophysical dataset. There are many challenges in performing these tasks mainly due to lack of some necessary development software tools, but the preliminary findings are promising.Traditionally, 2D and especially 3D forward modeling and inversion of large geophysical datasets are performed on supercomputing clusters. This was due to the fact computing time taken by using PC was too time consuming. With the introduction of parallel computing, attempts have been made to perform computationally intensive tasks on PC or clusters of personal computers where the computing power was based on Central Processing Unit (CPU). It is further enhanced with Graphical Processing Unit (GPU) as the GPU has become affordable with the launch of GPU based computing devices. Therefore this paper presents a didactic concept in learning and applying parallel computing with the use of General Purpose Graphical Processing Unit (GPGPU) was carried out and perform preliminary testing in migrating existing sequential codes for solving initially 2D forward modeling of geophysical dataset. There are many challenges in performing these tasks mainly due to lack of some necessary development software tools, but the preliminary findings are promising

UMS Institutional Repository

GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

Author: Abraham Mark James
Hess Berk
Lindahl Erik
Murtola Teemu
Páll Szilárd
Schulz Roland
Smith Jeremy C.
Publication venue: The Authors. Published by Elsevier B.V.
Publication date: 01/01/2015
Field of study

AbstractGROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. These work on every level; SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. The latest best-in-class compressed trajectory storage format is supported

Publikationer från KTH

Elsevier - Publisher Connector

Crossref

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line