2,015 research outputs found
ScALPEL: A Scalable Adaptive Lightweight Performance Evaluation Library for application performance monitoring
As supercomputers continue to grow in scale and capabilities, it is becoming
increasingly difficult to isolate processor and system level causes of
performance degradation. Over the last several years, a significant number of
performance analysis and monitoring tools have been built/proposed. However,
these tools suffer from several important shortcomings, particularly in
distributed environments. In this paper we present ScALPEL, a Scalable Adaptive
Lightweight Performance Evaluation Library for application performance
monitoring at the functional level. Our approach provides several distinct
advantages. First, ScALPEL is portable across a wide variety of architectures,
and its ability to selectively monitor functions presents low run-time
overhead, enabling its use for large-scale production applications. Second, it
is run-time configurable, enabling both dynamic selection of functions to
profile as well as events of interest on a per function basis. Third, our
approach is transparent in that it requires no source code modifications.
Finally, ScALPEL is implemented as a pluggable unit by reusing existing
performance monitoring frameworks such as Perfmon and PAPI and extending them
to support both sequential and MPI applications.Comment: 10 pages, 4 figures, 2 table
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our
Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated
quarterly here (Send me your new published papers to be added in the
subsequent version) History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
Exploiting Performance Counters to Predict and Improve Energy Performance of HPC Systems
International audienceHardware monitoring through performance counters is available on almost all modern processors. Although these counters are originally designed for performance tuning, they have also been used for evaluating power consumption. We propose two approaches for modelling and understanding the behaviour of high performance computing (HPC) systems relying on hardware monitoring counters. We evaluate the effectiveness of our system modelling approach considering both optimising the energy usage of HPC systems and predicting HPC applications' energy consumption as target objectives. Although hardware monitoring counters are used for modelling the system, other methods -- including partial phase recognition and cross platform energy prediction -- are used for energy optimisation and prediction. Experimental results for energy prediction demonstrate that we can accurately predict the peak energy consumption of an application on a target platform; whereas, results for energy optimisation indicate that with no a priori knowledge of workloads sharing the platform we can save up to 24\% of the overall HPC system's energy consumption under benchmarks and real-life workloads
An innovative approach to performance metrics calculus in cloud computing environments: a guest-to-host oriented perspective
In virtualized systems, the task of profiling and resource monitoring is not straight-forward. Many datacenters perform CPU overcommittment using hypervisors, running multiple virtual machines on a single computer where the total number of virtual CPUs exceeds the total number of physical CPUs available.
From a customer point of view, it could be indeed interesting to know if the purchased service levels are effectively respected by the cloud provider. The innovative approach to performance profiling described in this work is based on the use of virtual performance counters, only recently made available by some hypervisors to their virtual machines, to implement guest-wide profiling. Although it isn't possible for the virtual machine to access Virtual Machine Monitor, with this method it is able to gather interesting informations to deduce the state of resource overcommittment of the virtualization host where it is executed. Tests have been carried out inside the compute nodes of FIWARE Genoa Node, an instance of a widely distributed federated community cloud, based on OpenStack and KVM. AgiLab-DITEN, the laboratory I belonged to and where I conducted my studies, together with TnT-Lab\u2013DITEN and CNIT-GE-Unit designed, installed and configured the whole Genoa Node, that was hosted on DITEN-UniGE equipment rooms. All the software measuring instruments, operating systems and programs used in this research are publicly available and free, and can be easily installed in a micro
instance of virtual machine, rapidly deployable also in public clouds
- …