45 research outputs found

    Prediction of the impact of network switch utilization on application performance via active measurement

    Get PDF
    Although one of the key characteristics of High Performance Computing (HPC) infrastructures are their fast interconnecting networks, the increasingly large computational capacity of HPC nodes and the subsequent growth of data exchanges between them constitute a potential performance bottleneck. To achieve high performance in parallel executions despite network limitations, application developers require tools to measure their codes’ network utilization and to correlate the network’s communication capacity with the performance of their applications. This paper presents a new methodology to measure and understand network behavior. The approach is based in two different techniques that inject extra network communication. The first technique aims to measure the fraction of the network that is utilized by a software component (an application or an individual task) to determine the existence and severity of network contention. The second injects large amounts of network traffic to study how applications behave on less capable or fully utilized networks. The measurements obtained by these techniques are combined to predict the performance slowdown suffered by a particular software component when it shares the network with others. Predictions are obtained by considering several training sets that use raw data from the two measurement techniques. The sensitivity of the training set size is evaluated by considering 12 different scenarios. Our results find the optimum training set size to be around 200 training points. When optimal data sets are used, the proposed methodology provides predictions with an average error of 9.6% considering 36 scenarios.With the support of the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the 7th R&D Framework Programme of the European Union (Expedient 2013BP_B00243). The research leading to these results has received funding from the European Research Council under the European Union’s 7th FP (FP/2007-2013) /ERC GA n. 321253. Work partially supported by the Spanish Ministry of Science and Innovation (TIN2012-34557)Peer ReviewedPostprint (author's final draft

    Automated Application-level Checkpointing of MPI Programs

    Full text link
    Because of increasing hardware and software complexity, the running time of many computational science applications is now more than the mean-time-to-failure of high-performance computing platforms. Therefore, computational science applications need to tolerate hardware failures. In this paper, we focus on the stopping failure model in which a faulty process hangs and stops responding to the rest of the system. We argue that tolerating such faults is best done by an approach called application-level coordinated non-blocking checkpointing, and that existing fault-tolerance protocols in teh literature are not suitable for implementing this approach. In this paper, we present a suitable protocol, and show how it can be used with a precompiler that instruments C/MPI programs to save application and MPI library state. An advantage of our approach is that it is independent of the MPI implementation. We present experimental results that argue that the overhead of using our system can be small

    Active Measurement of Memory Resource Consumption

    Get PDF
    Hierarchical memory is a cornerstone of modern hardware design because it provides high memory performance and capacity at a low cost. However, the use of multiple levels of memory and complex cache management policies makes it very difficult to optimize the performance of applications running on hierarchical memories. As the number of compute cores per chip continues to rise faster than the total amount of available memory, applications will become increasingly starved for memory storage capacity and bandwidth, making the problem of performance optimization even more critical. We propose a new methodology for measuring and modeling the performance of hierarchical memories in terms of the application’s utilization of the key memory resources: capacity of a given memory level and bandwidth between two levels. This is done by actively interfering with the application’s use of these resources. The application’s sensitivity to reduced resource availability is measured by observing the effect of interference on application performance. The resulting resource-oriented model of performance both greatly simplifies application performance analysis and makes it possible to predict an application’s performance when running with various resource constraints. This is useful to predict performance for future memory-constrained architectures.The research leading to these results has received funding from the European Research Council under the European Union’s 7th FP (FP/2007-2013) / ERC GA n. 321253. Work partially supported by the Spanish Ministry of Science and Innovation (TIN2012-34557). This article has been authored in part by Lawrence Livermore National Security, LLC under Contract DE-AC52-07NA27344 with the U.S. Department of Energy. Accordingly, the United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this article or allow others to do so, for United States Government purposes. This work was partially supported by the Department of Energy Office of Science (Advanced Scientific Computing Research) Early Career Grant, award number NA27344.Peer ReviewedPostprint (author's final draft

    Subtitling for deaf and hard of hearing people

    Get PDF
    SIGLEAvailable from British Library Document Supply Centre-DSC:m01/12168 / BLDSC - British Library Document Supply CentreGBUnited Kingdo

    Prediction of the impact of network switch utilization on application performance via active measurement

    No full text
    Although one of the key characteristics of High Performance Computing (HPC) infrastructures are their fast interconnecting networks, the increasingly large computational capacity of HPC nodes and the subsequent growth of data exchanges between them constitute a potential performance bottleneck. To achieve high performance in parallel executions despite network limitations, application developers require tools to measure their codes’ network utilization and to correlate the network’s communication capacity with the performance of their applications. This paper presents a new methodology to measure and understand network behavior. The approach is based in two different techniques that inject extra network communication. The first technique aims to measure the fraction of the network that is utilized by a software component (an application or an individual task) to determine the existence and severity of network contention. The second injects large amounts of network traffic to study how applications behave on less capable or fully utilized networks. The measurements obtained by these techniques are combined to predict the performance slowdown suffered by a particular software component when it shares the network with others. Predictions are obtained by considering several training sets that use raw data from the two measurement techniques. The sensitivity of the training set size is evaluated by considering 12 different scenarios. Our results find the optimum training set size to be around 200 training points. When optimal data sets are used, the proposed methodology provides predictions with an average error of 9.6% considering 36 scenarios.With the support of the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the 7th R&D Framework Programme of the European Union (Expedient 2013BP_B00243). The research leading to these results has received funding from the European Research Council under the European Union’s 7th FP (FP/2007-2013) /ERC GA n. 321253. Work partially supported by the Spanish Ministry of Science and Innovation (TIN2012-34557)Peer Reviewe

    Active measurement of memory resource consumption

    No full text
    Hierarchical memory is a cornerstone of modern hardware design because it provides high memory performance and capacity at a low cost. However, the use of multiple levels of memory and complex cache management policies makes it very difficult to optimize the performance of applications running on hierarchical memories. As the number of compute cores per chip continues to rise faster than the total amount of available memory, applications will become increasingly starved for memory storage capacity and bandwidth, making the problem of performance optimization even more critical. We propose a new methodology for measuring and modeling the performance of hierarchical memories in terms of the application’s utilization of the key memory resources: capacity of a given memory level and bandwidth between two levels. This is done by actively interfering with the application’s use of these resources. The application’s sensitivity to reduced resource availability is measured by observing the effect of interference on application performance. The resulting resource-oriented model of performance both greatly simplifies application performance analysis and makes it possible to predict an application’s performance when running with various resource constraints. This is useful to predict performance for future memory-constrained architectures.The research leading to these results has received funding from the European Research Council under the European Union’s 7th FP (FP/2007-2013) / ERC GA n. 321253. Work partially supported by the Spanish Ministry of Science and Innovation (TIN2012-34557). This article has been authored in part by Lawrence Livermore National Security, LLC under Contract DE-AC52-07NA27344 with the U.S. Department of Energy. Accordingly, the United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this article or allow others to do so, for United States Government purposes. This work was partially supported by the Department of Energy Office of Science (Advanced Scientific Computing Research) Early Career Grant, award number NA27344.Peer Reviewe

    Evaluation of HPC applications’ Memory Resource Consumption via Active Measurement

    No full text
    As the number of compute cores per chip continues to rise faster than the total amount of available memory, applications will become increasingly starved for memory storage capacity and bandwidth, making the problem of performance optimization even more critical. Also, understanding and optimizing the usage of an increasing number of hierarchical memory levels and complex cache management policies is becoming a very hard task. We propose a methodology for measuring and modeling the performance of hierarchical memories in terms of the application’s utilization of the key memory resources: capacity of a given memory level and bandwidth between two levels. This is done by actively interfering with the application’s use of these resources. The application’s sensitivity to reduced resource availability is measured by observing the effect of interference on application performance. The resulting resource-oriented model of performance both greatly simplifies application performance analysis and makes it possible to predict an application’s performance when running with various resource constraints. This is useful to predict performance for future memory-constrained architectures. This paper applies the proposed methodology to 6 important and well known High Performance Computing (HPC) codes to show the strength and the potential of analysis based on resource-oriented measurements.Peer Reviewe
    corecore