35 research outputs found

    Performance evaluation and enhancement of Dendro

    Dendro is a collection of tools for solving finite element problems in parallel. The package is written in C++ using the Standard Template Library (STL) and the Message Passing Interface (MPI). Dendro uses an octree data structure to solve image-registration problems with finite element techniques. To analyze the package's speedup and scalability, it is important to know which parts of the package consume most of the execution time. Both the single-node performance and the overall performance of the package depend on the code organization and class hierarchy. We used the PETSc profiler to collect performance statistics and instrumented the code to find out which parts take the most time. Along with function-specific execution timings, the PETSc profiler reports how many floating-point operations are performed, both in total and on average (FLOP/second). PETSc also reports memory usage and the number of MPI messages and reductions performed while executing a particular function. We analyzed these performance statistics to provide guidelines on how Dendro can be made more efficient by optimizing certain functions. We obtained around 12X speedup over the default Dendro build by using compiler-provided optimizations, and achieved a further speedup of more than 65% over the compiler-optimized build (20X over the naive Dendro performance) by manually tuning some blocks of code in addition to the compiler optimizations.
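
    The two reported gains compose multiplicatively, which is how the figures in the abstract fit together. As a rough check, reading "more than 65% speedup" as a factor of at least 1.65 (the symbols below are illustrative and do not come from the paper):

```latex
S_{\text{total}} = S_{\text{compiler}} \times S_{\text{manual}}
                 \approx 12 \times 1.65 = 19.8 \approx 20
```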

    Compositional Performance Modelling with the TIPPtool

    Stochastic process algebras have been proposed as compositional specification formalisms for performance models. In this paper, we describe the TIPPtool, a tool which aims at realising all beneficial aspects of compositional performance modelling. It incorporates methods for compositional specification as well as solution, based on state-of-the-art techniques, and wrapped in a user-friendly graphical front end. Apart from highlighting the general benefits of the tool, we also discuss some lessons learned during development and application of the TIPPtool. A non-trivial model of a real-life communication system serves as a case study to illustrate benefits and limitations.

    Adaptive relaxation for the steady-state analysis of Markov chains

    We consider a variant of the well-known Gauss-Seidel method for the solution of Markov chains in steady state. Whereas the standard algorithm visits each state exactly once per iteration in a predetermined order, the alternative approach uses a dynamic strategy. A set of states to be visited is maintained, which can grow and shrink as the computation progresses. In this manner, we hope to concentrate the computational work in those areas of the chain in which maximum improvement in the solution can be achieved. We consider the adaptive approach both as a solver in its own right and as a relaxation method within the multi-level algorithm. Experimental results show significant computational savings in both cases.
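
    The idea of relaxing only a dynamically maintained set of states can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not say how states are selected, so the sketch assumes a discrete-time chain given as a dense row-stochastic matrix, a FIFO worklist, and a rule that re-enqueues a state's successors whenever its value changes by more than a tolerance. The function name and parameters are hypothetical.

```python
from collections import deque


def adaptive_gauss_seidel(P, tol=1e-12, max_updates=1_000_000):
    """Approximate the stationary vector pi = pi * P of an irreducible chain.

    P is a row-stochastic matrix (list of lists). Assumes P[i][i] < 1 for
    every state, i.e. there are no absorbing states.
    """
    n = len(P)
    pi = [1.0 / n] * n

    # preds[i]: states whose value is needed to recompute pi[i].
    # succs[i]: states whose equation depends on pi[i].
    preds = [[j for j in range(n) if j != i and P[j][i] > 0] for i in range(n)]
    succs = [[j for j in range(n) if j != i and P[i][j] > 0] for i in range(n)]

    # The active set starts with every state and then grows and shrinks.
    active = deque(range(n))
    queued = [True] * n
    updates = 0

    while active and updates < max_updates:
        i = active.popleft()
        queued[i] = False

        # Gauss-Seidel relaxation of the balance equation for state i,
        # using the most recent values of its predecessors.
        new_value = sum(pi[j] * P[j][i] for j in preds[i]) / (1.0 - P[i][i])
        change = abs(new_value - pi[i])
        pi[i] = new_value
        updates += 1

        # Only a significant change makes the successors worth revisiting.
        if change > tol:
            for j in succs[i]:
                if not queued[j]:
                    queued[j] = True
                    active.append(j)

    # The iteration fixes pi only up to a scale factor, so normalise.
    total = sum(pi)
    return [x / total for x in pi]
```

    Calling `adaptive_gauss_seidel([[0.9, 0.1], [0.5, 0.5]])` relaxes each of the two states once before the worklist empties. A fixed-order sweep would instead visit every state in every iteration until a global convergence test passes.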

    Process algebra for performance evaluation

    This paper surveys the theoretical developments in the field of stochastic process algebras, that is, process algebras in which action occurrences may be subject to a delay determined by a random variable. A huge class of resource-sharing systems, such as large-scale computers, client–server architectures and networks, can accurately be described using such stochastic specification formalisms. The main emphasis of this paper is the treatment of operational semantics, notions of equivalence, and (sound and complete) axiomatisations of these equivalences for different types of Markovian process algebras, where delays are governed by exponential distributions. Starting from a simple actionless algebra for describing time-homogeneous continuous-time Markov chains, we consider the integration of actions and random delays both as a single entity (as in known Markovian process algebras such as TIPP, PEPA and EMPA) and as separate entities (as in the timed process algebras timed CSP and TCCS). In total we consider four related calculi and investigate their relationship to existing Markovian process algebras. We also briefly indicate how one can profit from the separation of time and actions when incorporating more general, non-Markovian distributions.

    Performance analysis and optimization of asynchronous circuits

    Journal Article. Asynchronous/self-timed circuits are beginning to attract renewed attention as a promising means of dealing with the complexity of modern VLSI designs. However, there are very few analysis techniques or tools available for estimating the performance of asynchronous circuits. In this paper we adapt the theory of Generalized Timed Petri-nets (GTPN) for analyzing and comparing a wide variety of asynchronous circuits, ranging from purely control-oriented circuits such as cross-bar arbiters to large asynchronous systems with data-dependent control such as asynchronous processors. Experiments with the GTPN analyzer are found to track the observed performance of actual asynchronous circuits, thereby offering empirical evidence towards the soundness of the modeling approach. Our main contribution is in demonstrating how a quantitative design methodology for asynchronous circuits can be developed based on Timed Petri-nets.