Search CORE

7 research outputs found

A Performance Comparison Using HPC Benchmarks: Windows HPC Server 2008 and Red Hat Enterprise Linux 5

Author: Doleschal J.
Henschel R.
Li H.
Mueller M.S.
Teige S.
Publication venue
Publication date: 01/10/2011
Field of study

This document was developed with support from the National Science Foundation (NSF) under Grant No. 0910812 to Indiana University for ”FutureGrid: An Experimental, High-Performance Grid Test-bed.” Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.A collection of performance benchmarks have been run on an IBM System X iDataPlex cluster using two different operating systems. Windows HPC Server 2008 (WinHPC) and Red Hat Enterprise Linux v5.4 (RHEL5) are compared using SPEC MPI2007 v1.1, the High Performance Computing Challenge (HPCC) and National Science Foundation (NSF) acceptance test benchmark suites. Overall, we find the performance of WinHPC and RHEL5 to be equivalent but significant performance differences exist when analyzing specific applications. We focus on presenting the results from the application benchmarks and include the results of the HPCC microbenchmark for completeness

IUScholarWorks (University of Indiana)

Predicting Execution Readiness of MPI Binaries with FEAM, a Framework for Efficient Application Migration

Author: Andrew Grimshaw
Karolina Sarnowska-Upton
Publication venue
Publication date: 01/05/2020
Field of study

Abstract-As computational science becomes increasingly relevant for performing research, shared computing resources made accessible by cyberinfrastructures emerge as especially valuable for the majority of scientists who have not traditionally been the dominant users of such resources. However, in order to provide these newer computational scientists the opportunities to do great research, the ease-of-use of shared computing resources needs to be increased. In this paper, we present techniques that aim to make the migration to (and between) shared computing resources more efficient. Specifically, we focus on determining whether a computing site is a good fit for running an MPI binary. We present our methods and a Linux-based implementation called FEAM (a Framework for Efficient Application Migration). FEAM predicts execution readiness, resolves missing shared libraries, and composes site-specific configurations. We show that FEAM is more than 90% accurate at predicting execution readiness of MPI application binaries from the NAS Parallel and SPEC MPI2007 benchmark suites. In our evaluation, only half of the migrated binaries execute successfully at sites only configured with a matching MPI implementation. We show that by automatically resolving shared libraries requirements, FEAM is able to increase the number of successful executions by a third

CiteSeerX

Predicting Execution Readiness of MPI Binaries with FEAM, a Framework for Efficient Application Migration

Author: Andrew Grimshaw
Karolina Sarnowska-Upton
Publication venue
Publication date: 01/05/2020
Field of study

Abstract-Today's scientific computing infrastructures provide scientists with easy access to a wide variety of computing resources. However, migrating applications to new computing sites can be tedious and time consuming. When optimal performance is not a concern, scientists can benefit by moving binaries instead of source code. Our work aims to make migration of MPI application binaries more efficient by automation. We present general methods that assess if binaries are a good match for execution at computing sites. We also present methods for increasing execution readiness by resolving missing shared libraries. Our work aims to free scientists from extensive manual preparation at new sites. To evaluate the effectiveness of our methods, we present an automated Linux-based implementation called FEAM, a Framework for Efficient Application Migration. We show that FEAM is more than 90% accurate at predicting execution readiness of MPI application binaries from the NAS Parallel and SPEC MPI2007 benchmark suites. In our evaluation, only half of the migrated binaries execute successfully at sites configured with a matching MPI implementation. We show that by automatically resolving shared libraries requirements, FEAM is able to increase the number of successful executions by a third

CiteSeerX

Statistical Techniques to Model and Optimize Performance of Scientific, Numerically Intensive Workloads

Author: Steven Monteiro Steena Dominica
Publication venue: DigitalCommons@USU
Publication date: 01/12/2016
Field of study

Projecting performance of applications and hardware is important to several market segments—hardware designers, software developers, supercomputing centers, and end users. Hardware designers estimate performance of current applications on future systems when designing new hardware. Software developers make performance estimates to evaluate performance of their code on different architectures and input datasets. Supercomputing centers try to optimize the process of matching computing resources to computing needs. End users requesting time on supercomputers must provide estimates of their application’s run time, and incorrect estimates can lead to wasted supercomputing resources and time. However, application performance is challenging to predict because it is affected by several factors in application code, specifications of system hardware, choice of compilers, compiler flags, and libraries. This dissertation uses statistical techniques to model and optimize performance of scientific applications across different computer processors. The first study in this research offers statistical models that predict performance of an application across different input datasets prior to application execution. These models guide end users to select parameters that produce optimal application performance during execution. The second study offers a suite of statistical models that predict performance of a new application on a new processor. Both studies present statistical techniques that can be generalized to analyze, optimize, and predict performance of diverse computation- and data-intensive applications on different hardware

DigitalCommons@USU

Runtime MPI Correctness Checking with a Scalable Tools Infrastructure

Author: Hilbrich Tobias
Publication venue
Publication date: 08/06/2015
Field of study

Increasing computational demand of simulations motivates the use of parallel computing systems. At the same time, this parallelism poses challenges to application developers. The Message Passing Interface (MPI) is a de-facto standard for distributed memory programming in high performance computing. However, its use also enables complex parallel programing errors such as races, communication errors, and deadlocks. Automatic tools can assist application developers in the detection and removal of such errors. This thesis considers tools that detect such errors during an application run and advances them towards a combination of both precise checks (neither false positives nor false negatives) and scalability. This includes novel hierarchical checks that provide scalability, as well as a formal basis for a distributed deadlock detection approach. At the same time, the development of parallel runtime tools is challenging and time consuming, especially if scalability and portability are key design goals. Current tool development projects often create similar tool components, while component reuse remains low. To provide a perspective towards more efficient tool development, which simplifies scalable implementations, component reuse, and tool integration, this thesis proposes an abstraction for a parallel tools infrastructure along with a prototype implementation. This abstraction overcomes the use of multiple interfaces for different types of tool functionality, which limit flexible component reuse. Thus, this thesis advances runtime error detection tools and uses their redesign and their increased scalability requirements to apply and evaluate a novel tool infrastructure abstraction. The new abstraction ultimately allows developers to focus on their tool functionality, rather than on developing or integrating common tool components. The use of such an abstraction in wide ranges of parallel runtime tool development projects could greatly increase component reuse. Thus, decreasing tool development time and cost. An application study with up to 16,384 application processes demonstrates the applicability of both the proposed runtime correctness concepts and of the proposed tools infrastructure

Technische Universität Dresden: Qucosa

A Unified Infrastructure for Monitoring and Tuning the Energy Efficiency of HPC Applications

Author: Schöne Robert
Publication venue
Publication date: 19/09/2017
Field of study

High Performance Computing (HPC) has become an indispensable tool for the scientific community to perform simulations on models whose complexity would exceed the limits of a standard computer. An unfortunate trend concerning HPC systems is that their power consumption under high-demanding workloads increases. To counter this trend, hardware vendors have implemented power saving mechanisms in recent years, which has increased the variability in power demands of single nodes. These capabilities provide an opportunity to increase the energy efficiency of HPC applications. To utilize these hardware power saving mechanisms efficiently, their overhead must be analyzed. Furthermore, applications have to be examined for performance and energy efficiency issues, which can give hints for optimizations. This requires an infrastructure that is able to capture both, performance and power consumption information concurrently. The mechanisms that such an infrastructure would inherently support could further be used to implement a tool that is able to do both, measuring and tuning of energy efficiency. This thesis targets all steps in this process by making the following contributions: First, I provide a broad overview on different related fields. I list common performance measurement tools, power measurement infrastructures, hardware power saving capabilities, and tuning tools. Second, I lay out a model that can be used to define and describe energy efficiency tuning on program region scale. This model includes hardware and software dependent parameters. Hardware parameters include the runtime overhead and delay for switching power saving mechanisms as well as a contemplation of their scopes and the possible influence on application performance. Thus, in a third step, I present methods to evaluate common power saving mechanisms and list findings for different x86 processors. Software parameters include their performance and power consumption characteristics as well as the influence of power-saving mechanisms on these. To capture software parameters, an infrastructure for measuring performance and power consumption is necessary. With minor additions, the same infrastructure can later be used to tune software and hardware parameters. Thus, I lay out the structure for such an infrastructure and describe common components that are required for measuring and tuning. Based on that, I implement adequate interfaces that extend the functionality of contemporary performance measurement tools. Furthermore, I use these interfaces to conflate performance and power measurements and further process the gathered information for tuning. I conclude this work by demonstrating that the infrastructure can be used to manipulate power-saving mechanisms of contemporary x86 processors and increase the energy efficiency of HPC applications

Technische Universität Dresden: Qucosa