Workflow Critical Path: A Data-Oriented Critical Path Metric for Holistic HPC Workflows
Current trends in HPC, such as the push to exascale, convergence with Big Data, and the growing complexity of HPC applications, have created gaps that traditional performance tools do not cover. One example is Holistic HPC Workflows: HPC workflows comprising multiple codes, paradigms, or platforms that are not developed using a workflow management system. To diagnose the performance of these applications, we define a new metric called Workflow Critical Path (WCP), a data-oriented metric for Holistic HPC Workflows. WCP constructs graphs that span the workflow's codes and platforms, using data states as vertices and data mutations as edges. Using cloud-based technologies, we implement a prototype called Crux, a distributed analysis tool for calculating and visualizing WCP. Our experiments with a workflow simulator on Amazon Web Services show that Crux is scalable and correctly calculates WCP for common Holistic HPC workflow patterns. We explore the use of WCP and discuss how Crux could be used in a production HPC environment.
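The graph structure the abstract describes lends itself to a brief sketch. The following Python is a minimal illustration, not Crux's actual implementation or API (all names are hypothetical): data states become vertices, data mutations become weighted edges, and the workflow critical path is the longest-duration path through the resulting DAG.

# Minimal sketch of the WCP idea: data states are vertices, data
# mutations are weighted edges, and the workflow critical path is the
# longest-duration path through the DAG. Names are illustrative only.
from collections import defaultdict

def workflow_critical_path(edges):
    """edges: iterable of (src_state, dst_state, mutation_seconds)."""
    graph = defaultdict(list)
    indegree = defaultdict(int)
    states = set()
    for src, dst, cost in edges:
        graph[src].append((dst, cost))
        indegree[dst] += 1
        states.update((src, dst))
    # Topological order via Kahn's algorithm.
    order, queue = [], [s for s in states if indegree[s] == 0]
    while queue:
        s = queue.pop()
        order.append(s)
        for dst, _ in graph[s]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    # Longest path by edge relaxation in topological order.
    dist = {s: 0.0 for s in states}
    pred = {}
    for s in order:
        for dst, cost in graph[s]:
            if dist[s] + cost > dist[dst]:
                dist[dst] = dist[s] + cost
                pred[dst] = s
    end = max(dist, key=dist.get)
    path = [end]
    while path[-1] in pred:
        path.append(pred[path[-1]])
    return list(reversed(path)), dist[end]

For example, workflow_critical_path([("raw", "sim_out", 120.0), ("sim_out", "analysis", 45.0), ("raw", "viz", 30.0)]) returns the path raw -> sim_out -> analysis with a length of 165 seconds, the longest chain of data mutations in that toy workflow.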
Modeling the Office of Science Ten Year Facilities Plan: The PERI Architecture Tiger Team
The Performance Engineering Research Institute (PERI) originally proposed a tiger team activity as a mechanism to focus significant effort on optimizing key Office of Science applications, a model that was successfully realized with the assistance of two JOULE metric teams. However, the Office of Science requested a new focus beginning in 2008: assistance in forming its ten-year facilities plan. To meet this request, PERI formed the Architecture Tiger Team, which is modeling the performance of key science applications on future architectures, with S3D, FLASH, and GTC chosen as the first application targets. In this activity, we have measured the performance of these applications on current systems in order to understand their baseline performance and to ensure that our modeling activity focuses on the right versions and inputs of the applications. We have applied a variety of modeling techniques to project the performance of these applications onto a range of anticipated systems. While our initial findings predict that Office of Science applications will continue to perform well on future machines from major hardware vendors, we have also encountered several areas in which we must extend our modeling techniques in order to fulfill our mission accurately and completely. In addition, we anticipate that models of a wider range of applications will reveal critical differences between expected future systems, thus providing guidance for future Office of Science procurement decisions and enabling DOE applications to fully exploit machines in future facilities.
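The abstract does not name the specific modeling techniques used. As a rough illustration of the genre only, the sketch below shows a first-order analytic projection combining compute, memory-traffic, and alpha-beta communication bounds; every parameter and number is a placeholder, not PERI's model.

# Generic first-order performance projection, not PERI's actual model.
def projected_runtime(flops, bytes_moved, msgs, msg_bytes,
                      peak_flops, mem_bw, latency, net_bw):
    compute = flops / peak_flops                 # compute-bound lower bound
    memory = bytes_moved / mem_bw                # memory-traffic lower bound
    comm = msgs * latency + msg_bytes / net_bw   # alpha-beta communication model
    # Compute and memory traffic overlap on most machines;
    # communication is assumed not to overlap here.
    return max(compute, memory) + comm

# Example: project a kernel onto a hypothetical future node.
t = projected_runtime(flops=1e15, bytes_moved=2e13, msgs=1e4, msg_bytes=1e10,
                      peak_flops=1e14, mem_bw=4e11, latency=1e-6, net_bw=2.5e10)

Even this crude model makes the procurement question concrete: in the example, the kernel is memory-bound (50 s of memory traffic versus 10 s of compute), so a future system's memory bandwidth matters far more than its peak FLOPS.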
Evaluating Similarity-based Trace Reduction Techniques for Scalable Performance Analysis
Event traces are required to correctly diagnose a number of performance problems that arise on today's highly parallel systems. Unfortunately, collecting event traces can produce a volume of data that is difficult, or even impossible, to store and analyze. One approach to compressing a trace is to identify repeating trace patterns and retain only one representative of each pattern. However, determining the similarity of sections of traces, i.e., identifying patterns, is not straightforward. In this paper, we investigate pattern-based methods for reducing traces that will be used for performance analysis. We evaluate the different methods against several criteria, including size reduction, introduced error, and retention of performance trends, using both benchmarks with carefully chosen performance behaviors and a real application.
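To make the pattern-based idea concrete, here is a minimal sketch; the segment length, distance measure, and tolerance are assumptions for illustration, not the paper's methods. It splits a trace of event durations into fixed-size segments, groups segments whose timing profiles match within a relative tolerance, and keeps one representative per group along with a repetition count.

# Illustrative similarity-based trace reduction: keep one representative
# segment per repeating pattern, plus a count of how often it recurred.
def reduce_trace(durations, segment_len=64, tol=0.05):
    segments = [durations[i:i + segment_len]
                for i in range(0, len(durations), segment_len)]
    representatives, counts = [], []
    for seg in segments:
        for idx, rep in enumerate(representatives):
            if len(rep) == len(seg) and all(
                    abs(a - b) <= tol * max(abs(a), abs(b), 1e-12)
                    for a, b in zip(rep, seg)):
                counts[idx] += 1   # repeated pattern: count it, don't store it
                break
        else:
            representatives.append(seg)  # new pattern: keep this segment
            counts.append(1)
    return representatives, counts

Keeping counts alongside representatives is what lets such a reduction preserve aggregate performance trends, while the within-group variation it discards is one source of the introduced error that criteria like the paper's would measure.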
Updating an Introductory Performance Course with PDC Topics
We report on our development of a new course: Introduction to Performance Measurement, Modeling and Analysis (IPMMA). The first offering was in Fall 2014; a second offering, in Winter 2015, is in finals week at this writing. The course focuses on the fundamentals of measuring, analyzing, and modeling computer performance. As we cover the basics, we move through a set of case studies, applying the techniques to increasingly complex problems. Case studies used in Fall 2014 include multithreaded code, web servers, MPI code, virtualized servers, and MapReduce (Hadoop). In Winter 2015 we added a case study on single-server queue simulation. These case studies include hands-on programming exercises both during class time and as take-home work. We use a variety of performance tools throughout the course to teach the state of the art in performance techniques and practices. We have successfully incorporated a number of PDC topics into this course, with relatively little background needed for students to succeed in the hands-on exercises. However, we have found a total lack of textbook support for including PDC topics in an introductory performance course.
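To give a flavor of the queueing case study, the sketch below simulates a single-server (M/M/1) queue and estimates the mean wait in queue; the parameters and structure are illustrative, not the course's actual assignment.

# Illustrative single-server (M/M/1) queue simulation via the Lindley
# recursion: each job waits if it arrives before the server frees up.
import random

def mm1_mean_wait(arrival_rate, service_rate, n_jobs=100_000, seed=1):
    rng = random.Random(seed)
    clock = 0.0           # arrival time of the current job
    server_free = 0.0     # time at which the server next becomes idle
    total_wait = 0.0
    for _ in range(n_jobs):
        clock += rng.expovariate(arrival_rate)   # next arrival
        start = max(clock, server_free)          # wait if the server is busy
        total_wait += start - clock
        server_free = start + rng.expovariate(service_rate)
    return total_wait / n_jobs

# Sanity check against theory: E[Wq] = rho / (mu - lambda) for M/M/1.
print(mm1_mean_wait(arrival_rate=0.8, service_rate=1.0))  # ~ 4.0

An exercise of this shape pairs naturally with measurement: students can compare the simulated mean wait against the closed-form M/M/1 result and watch the queue blow up as the arrival rate approaches the service rate.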