33 research outputs found

    Power monitoring with PAPI for extreme scale architectures and dataflow-based programming models

    For more than a decade, the PAPI performance-monitoring library has provided a clear, portable interface to the hardware performance counters available on all modern CPUs and other components of interest (e.g., GPUs, network, and I/O systems). Most major end-user tools that application developers use to analyze the performance of their applications rely on PAPI to gain access to these performance counters. Energy efficiency constraints have been widely identified as one of the critical roadblocks on the way to larger, more complex high performance systems. With modern extreme scale machines having hundreds of thousands of cores, the ability to reduce power consumption for each CPU at the software level becomes critically important, both for economic and environmental reasons. For PAPI to continue playing its well-established role in HPC, it must provide performance data that not only originates from within the processing cores but also delivers insight into the power consumption of the system as a whole. An extensive effort has been made to extend the Performance API to support power monitoring capabilities for various platforms. This paper provides detailed information about three components that allow power monitoring on the Intel Xeon Phi and Blue Gene/Q. Furthermore, we discuss the integration of PAPI in PARSEC, a task-based dataflow-driven execution engine, enabling hardware performance counter and power monitoring at true task granularity.

    Evaluation of Dataflow Programming Models for Electronic Structure Theory

    Dataflow programming models have been growing in popularity as a means to deliver a good balance between performance and portability in the post-petascale era. In this paper we evaluate different dataflow programming models for electronic structure methods and compare them in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms for expressing scientific applications in a dataflow form: (1) explicit dataflow, where the dataflow is specified explicitly by the developer, and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. We discuss our findings and present a thorough experimental analysis using methods from the NWCHEM quantum chemistry application as our case study, and OPENMP, STARPU and PARSEC as the task-based runtimes that enable the different forms of dataflow execution. Furthermore, we derive an abstract model to explore the limits of the different dataflow programming paradigms.

    PAPI-EX Poster Feb 2017-FINAL.pdf

    Poster presenting the work done in the PAPI-EX project.

    SI2-SSE: PAPI Unifying Layer for Software-Defined Events (PULSE)

    PULSE builds on the latest Performance API (PAPI) project and extends it with software-defined events (SDE) that originate from the HPC software stack and are currently treated as black boxes (e.g., communication libraries, math libraries, task-based runtime systems, applications). The objective is to enable monitoring of both types of performance events, hardware-related and software-related, in a uniform way through one consistent PAPI interface. Third-party tools and application developers therefore have to handle only a single hook to PAPI to access all hardware performance counters in a system, including the new software-defined events.
