
Predictive analysis and optimisation of pipelined wavefront applications using reusable analytic models

By Gihan R. Mudalige


Pipelined wavefront computations are a ubiquitous class of high performance parallel algorithms used for the solution of many scientific and engineering applications. In order to aid the design and optimisation of these applications, and to ensure that platforms chosen during procurement are best suited to these codes, there has been considerable research in analysing and evaluating their operational performance.

Wavefront codes exhibit complex computation, communication and synchronisation patterns, and as a result there exists a large variety of such codes and possible optimisations. The problem is compounded by each new generation of high performance computing system, which has often introduced a previously unexplored architectural trait, requiring previous performance models to be rewritten and re-evaluated.

In this thesis, we address the performance modelling and optimisation of this class of application as a whole. This differs from previous studies, in which bespoke models are applied to specific applications. The analytic performance models are generalised and reusable, and we demonstrate their application to the predictive analysis and optimisation of pipelined wavefront computations running on modern high performance computing systems.

The performance model is based on the LogGP parameterisation, and uses a small number of input parameters to specify the particular behaviour of most wavefront codes. The new parameters and model equations capture the key structural and behavioural differences among different wavefront application codes, providing a succinct summary of the operations for each application and insights into alternative wavefront application design.

The models are applied to three industry-strength wavefront codes and are validated on several systems, including a Cray XT3/XT4 and an InfiniBand commodity cluster. Model predictions show high quantitative accuracy (less than 20% error) for all high performance configurations and excellent qualitative accuracy.

The thesis presents applications, projections and insights for optimisations using the model, which show the utility of reusable analytic models for performance engineering of high performance computing codes. In particular, we demonstrate the use of the model for: (1) evaluating application configuration and resulting performance; (2) evaluating hardware platform issues, including platform sizing and configuration; (3) exploring hardware platform design alternatives and system procurement; and (4) considering possible code and algorithmic optimisations.

Topics: QA76

