Search CORE

581,208 research outputs found

PETRI NET BASED MODELING OF PARALLEL PROGRAMS EXECUTING ON DISTRIBUTED MEMORY MULTIPROCESSOR SYSTEMS

Author: Ferscha A.
Haring G.
Publication venue: Periodica Polytechnica Electrical Engineering (Archives)
Publication date: 01/01/1991
Field of study

The development of parallel programs following the paradigm of communicating sequen- tial processes to be executed on distributed memory multiprocessor systems is addressed. The key issue in programming parallel machines today is to provide computerized tools supporting the development of efficient parallel software, i.e. software effectively har- nessing the power of parallel processing systems. The critical situations where a parallel programmer needs help is in expressing a parallel algorithm in a programming language, in getting a parallel program to work and in tuning it to get optimum performance (for example speedup). . We show that the Petri net formalism is higly suitable as a performance modeling technique for asynchronous parallel systems, by introducing a model taking care of the parallel program, parallel architecture and mapping influences on overall system perfor- mance. PRM -net (Program-Resource- Mapping) models comprise a Petri net model of the multiple flows of control in a parallel program, a Petri net model of the parallel hardware and the process-to-processor mapping information into a single integrated performance model. Automated analysis of PRM-net models addresses correctness and performance of parallel programs mapped to parallel hardware. Questions upon the correctness of parallel programs can be answered by investigating behavioural properties of Petri net programs like liveness, reachability, boundedness, mutualy exclusiveness etc. Peformance of parallel programs is usefully considered only in concern with a dedicated target hard- ware. For this reason it is essential to integrate multiprocessor hardware characteristics into the specification of a parallel program. The integration is done by assigning the concurrent processes to physical processing devices and communication patterns among parallel processes to communication media connecting processing elements yielding an in- tegrated, Petri net based performance model. Evaluation of the integrated model applies simulation and markovian analysis to derive expressions characterising the peformance of the program being developed. Synthesis and decomposition rules for hierarchical models naturally give raise to use PRM-net models for graphical, performance oriented parallel programming, support- ing top-down (stepwise refinement) as well as bottom-up development approaches. The graphical representation of Petri net programs visualizes phenomena like parallelism, syn- chronisation, communication, sequential and alternative execution. Modularity of pro- gram blocks aids reusability, prototyping is promoted by automated code generation on the basis of high level program specifications

Periodica Polytechnica (Budapest University of Technology and Economics)

A communication-ordered task graph allocation algorithm

Author: Evans John
Publication venue: University of Utah
Publication date: 01/01/1992
Field of study

technical reportThe inherently asynchronous nature of the data flow computation model allows the exploitation of maximum parallelism in program execution?? While this computational model holds great promise several problems must be solved in order to achieve a high degree of program performance?? The allocation and scheduling of programs on MIMD distributed memory parallel hardware is necessary for the implementation of e cient parallel systems?? Finding optimal solutions requires that maxi mum parallelism be achieved consistent with resource limits and minimizing communication costs and has been proven to be in the class of NP complete problems?? This paper addresses the problem of static allocation of tasks to distributed memory MIMD systems where simultaneous computation and communication is a factor?? This paper discusses similarities and di erences between several recent heuristic allocation approaches and identi es common problems inherent in these approaches?? This paper presents a new algorithm scheme and heuristics that resolves the identi ed problems and shows signi cant performance bene ts?

The University of Utah: J. Willard Marriott Digital Library

A communication-ordered task graph allocation algorithm

Author: Evans John D.
Kessler Robert R.
Publication venue: University of Utah
Publication date: 01/01/1992
Field of study

technical reportThe inherently asynchronous nature of the data flow computation model allows the exploitation of maximum parallelism in program execution. While this computational model holds great promise, several problems must be solved in order to achieve a high degree of program performance. The allocation and scheduling of programs on MIMD distributed memory parallel hardware, is necessary for the implementation of efficient parallel systems. Finding optimal solutions requires that maximum parallelism be achieved consistent with resource limits and minimizing communication costs, and has been proven to be in the class of NP-complete problems. This paper addresses the problem of static allocation of tasks to distributed memory MIMD systems where simultaneous computation and communication is a factor. This paper discusses similarities and differences between several recent heuristic allocation approaches and identifies common problems inherent in these approaches. This paper presents a new algorithm scheme and heuristics that resolves the identified problems and shows significant performance benefits

The University of Utah: J. Willard Marriott Digital Library

libcppa - Designing an Actor Semantic for C++11

Author: Charousset Dominik
Schmidt Thomas C.
Publication venue
Publication date: 01/01/2013
Field of study

Parallel hardware makes concurrency mandatory for efficient program execution. However, writing concurrent software is both challenging and error-prone. C++11 provides standard facilities for multiprogramming, such as atomic operations with acquire/release semantics and RAII mutex locking, but these primitives remain too low-level. Using them both correctly and efficiently still requires expert knowledge and hand-crafting. The actor model replaces implicit communication by sharing with an explicit message passing mechanism. It applies to concurrency as well as distribution, and a lightweight actor model implementation that schedules all actors in a properly pre-dimensioned thread pool can outperform equivalent thread-based applications. However, the actor model did not enter the domain of native programming languages yet besides vendor-specific island solutions. With the open source library libcppa, we want to combine the ability to build reliable and distributed systems provided by the actor model with the performance and resource-efficiency of C++11.Comment: 10 page

arXiv.org e-Print Archive

REPOSIT

Hybrid performance modeling and prediction of large-scale computing systems

Author: Barolli Leonard
Benkner Siegfried
Pllana Sabri
Xhafa Xhafa Fatos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008
Field of study

Performance is a key feature of large-scale computing systems. However, the achieved performance when a certain program is executed is significantly lower than the maximal theoretical performance of the large-scale computing system. The model-based performance evaluation may be used to support the performance-oriented program development for large-scale computing systems. In this paper we present a hybrid approach for performance modeling and prediction of parallel and distributed computing systems, which combines mathematical modeling and discrete-event simulation. We use mathematical modeling to develop parameterized performance models for components of the system. Thereafter, we use discrete-event simulation to describe the structure of system and the interaction among its components. As a result, we obtain a high-level performance model, which combines the evaluation speed of mathematical models with the structure awareness and fidelity of the simulation model. We evaluate empirically our approach with a real-world material science program that comprises more than 15,000 lines of codePeer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

Verification of MPI programs using Spin

Author: Barrus Steven
Gopalakrishnan Ganesh
Publication venue: University of Utah
Publication date: 01/01/2004
Field of study

technical reportVerification of distributed systems is a complex yet important process. Concurrent systems are vulnerable to problems such as deadlock, starvation, and race conditions. Parallel programs written using the MPI (Message Passing Interface) Standard are no exception. Spin can be used to formally verify a parallel program if it is given an accurate model written is Spin's process meta language (Promela). In this paper, we describe a generalized framework for verification of MPI-based parallel programs using the Spin model checker. Only select MPI calls are covered, but this framework could potentially be extended to include all of the MPI Standard. Our reduced MPI implementation (written in Promela) is designed to follow the MPI Standard as well as allow for the flexibility provided in certain aspects (like buffering). We also present a few examples to illustrate the use of our MPI implementation in Promela

The University of Utah: J. Willard Marriott Digital Library