5 research outputs found
CloudProphet: A Machine Learning-Based Performance Prediction for Public Clouds
Computing servers have played a key role in developing and processing
emerging compute-intensive applications in recent years. Consolidating multiple
virtual machines (VMs) inside one server to run various applications introduces
severe competition for limited resources among VMs. Many techniques such as VM
scheduling and resource provisioning are proposed to maximize the
cost-efficiency of the computing servers while alleviating the performance
interference between VMs. However, these management techniques require accurate
performance prediction of the application running inside the VM, which is
challenging to get in the public cloud due to the black-box nature of the VMs.
From this perspective, this paper proposes a novel machine learning-based
performance prediction approach for applications running in the cloud. To
achieve high accuracy predictions for black-box VMs, the proposed method first
identifies the running application inside the virtual machine. It then selects
highly-correlated runtime metrics as the input of the machine learning approach
to accurately predict the performance level of the cloud application.
Experimental results with state-of-the-art cloud benchmarks demonstrate that
our proposed method outperforms the existing prediction methods by more than 2x
in terms of worst prediction error. In addition, we successfully tackle the
challenge in performance prediction for applications with variable workloads by
introducing the performance degradation index, which other comparison methods
fail to consider. The workflow versatility of the proposed approach has been
verified with different modern servers and VM configurations.
Comment: 15 pages, 11 figures, submitted to IEEE Transactions on Sustainable
Computing
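The metric-selection step described in this abstract (choosing runtime metrics that are highly correlated with application performance as model inputs) can be illustrated with a minimal sketch. It is not the paper's implementation: the metric names, the sample values, the use of Pearson correlation, and the 0.8 threshold are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_metrics(metrics, performance, threshold=0.8):
    """Keep metrics whose |correlation| with performance meets the threshold.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    scores = {name: pearson(values, performance) for name, values in metrics.items()}
    return [name for name, r in scores.items() if abs(r) >= threshold]


# Hypothetical runtime metrics sampled from outside a black-box VM.
samples = {
    "cpu_util": [10, 20, 30, 40, 50],
    "cache_misses": [5, 4, 3, 2, 1],
    "disk_io": [3, 1, 4, 1, 5],
}
# Hypothetical performance measurements (e.g. latency) for the same intervals.
latency = [1.0, 2.1, 2.9, 4.2, 5.0]

selected = select_metrics(samples, latency)
print(selected)
```

With these made-up samples, `cpu_util` and `cache_misses` pass the threshold while `disk_io` is dropped; the selected metrics would then be fed to whichever regression model is used for prediction.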
Deriving Goal-oriented Performance Models by Systematic Experimentation
Performance modelling can require substantial effort when creating and maintaining performance models for software systems that are based on existing software. Therefore, this thesis addresses the challenge of performance prediction in such scenarios. It proposes a novel goal-oriented method for experimental, measurement-based performance modelling. We validated the approach in a number of case studies including standard industry benchmarks as well as a real development scenario at SAP
Power Bounded Computing on Current & Emerging HPC Systems
Power has become a critical constraint for the evolution of large scale High Performance Computing (HPC) systems and commercial data centers. This constraint spans almost every level of computing technologies, from IC chips all the way up to data centers due to physical, technical, and economic reasons. To cope with this reality, it is necessary to understand how available or permissible power impacts the design and performance of emergent computer systems. For this reason, we propose power bounded computing and corresponding technologies to optimize performance on HPC systems with limited power budgets.
This dissertation has several research objectives, centered on understanding the interaction between performance, power bounds, and a hierarchical power management strategy. First, we develop heuristics and application-aware power allocation methods to improve application performance on a single node. Second, we develop algorithms to coordinate power across nodes and components on a cluster, based on application characteristics and the power budget. Third, we investigate performance interference induced by hardware and power contention, and propose contention-aware job scheduling to maximize system throughput under given power budgets for node-sharing systems. Fourth, we extend the work to GPU-accelerated systems and workloads and develop an online dynamic performance and power approach to meet both performance requirements and power-efficiency goals.
Power bounded computing improves performance scalability and power efficiency and decreases operation costs of HPC systems and data centers. This dissertation opens up several new avenues for research in power bounded computing to address the power challenges in HPC systems. The proposed power and resource management techniques provide new directions and guidelines for green exascale computing and other computing systems.
Deriving Goal-oriented Performance Models by Systematic Experimentation
Performance modelling can require substantial effort when creating and maintaining performance models for software systems that are based on existing software. Therefore, this thesis addresses the challenge of performance prediction in such scenarios. It proposes a novel goal-oriented method for experimental, measurement-based performance modelling. We validated the approach in a number of case studies including standard industry benchmarks as well as a real development scenario at SAP
Experiment management and analysis with perfbase
Achieving the desired performance with application software, middleware or operating system components on a parallel computer like a cluster is a complex task. Typically, a high-dimensional parameter space has to be reduced to a small number of core parameters that influence the performance most significantly, but a large number of experiments is still necessary to determine the optimal performance. Keeping track of these experiments to derive the correct conclusions is a major task. This paper presents perfbase, a set of frontend tools and an SQL database as backend, which together form a system for the management and analysis of the output of experiments. In this context, an experiment is an execution of an application or library on a computer system. The output of such an experiment is one or more text files containing information on the execution of the application. This output is the input for perfbase, which extracts specified information, stores it in the database, and makes it available for management and analysis in a consistent, fast and flexible manner.
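The workflow this abstract describes (extract specified values from an experiment's text output, store them in an SQL database, then query across runs) can be sketched as follows. This is not perfbase's actual interface: the `name = value` output format, the `results` table schema, and the function names are hypothetical, and SQLite stands in for the SQL backend.

```python
import re
import sqlite3

# Hypothetical output format: lines of the form "name = number".
PATTERN = re.compile(r"^(\w+)\s*=\s*([-+]?\d+(?:\.\d+)?)$")


def parse_output(text):
    """Extract name/value pairs from an experiment's text output."""
    values = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            values[match.group(1)] = float(match.group(2))
    return values


def store_run(conn, run_id, values):
    """Insert one row per extracted value for the given run."""
    conn.executemany(
        "INSERT INTO results (run_id, name, value) VALUES (?, ?, ?)",
        [(run_id, name, value) for name, value in values.items()],
    )
    conn.commit()


# In-memory database with a hypothetical one-table schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (run_id INTEGER, name TEXT, value REAL)")

# Made-up outputs of two experiment runs.
run1 = "benchmark started\nnodes = 4\nbandwidth = 812.5\ndone"
run2 = "benchmark started\nnodes = 8\nbandwidth = 1490.0\ndone"
for run_id, output in enumerate([run1, run2], start=1):
    store_run(conn, run_id, parse_output(output))

# Analysis query: which run achieved the highest bandwidth?
best = conn.execute(
    "SELECT run_id, value FROM results "
    "WHERE name = 'bandwidth' ORDER BY value DESC LIMIT 1"
).fetchone()
print(best)
```

Storing one row per extracted value keeps the schema independent of which parameters a given experiment reports, at the cost of more verbose queries.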