Scheduling, Characterization and Prediction of HPC Workloads for Distributed Computing Environments
As High Performance Computing (HPC) has grown considerably and is expected to grow further, effective resource management for distributed computing systems is more strongly motivated than ever. As computational workloads grow in quantity, it becomes more crucial to apply efficient resource management and workload scheduling so that resources are used efficiently while computational performance remains reasonably good. The problem of efficiently scheduling workloads on resources while meeting performance standards is hard. Additionally, the non-clairvoyance of job dimensions makes resource management even harder in real-world scenarios. Our research methodology investigates the scheduling problem as it applies to HPC and examines the challenges of deploying such scheduling in real-world scenarios using state-of-the-art machine learning and data science techniques. To this end, this Ph.D. dissertation makes the following core contributions: a) We perform a theoretical analysis of space-sharing, non-preemptive scheduling: we studied this scheduling problem and proposed scheduling algorithms with polynomial computation time. We also proved constant upper bounds on the performance of these algorithms. b) We studied the sensitivity of scheduling algorithms to the accuracy of runtime estimates and devised a meta-learning approach to estimate prediction accuracy for newly submitted jobs to the HPC system. c) We studied the runtime prediction problem for HPC applications. For this purpose, we studied the distribution of available public workloads and proposed two different solutions that can predict multi-modal distributions: switching state-space models and Mixture Density Networks. d) We studied the effectiveness of recent recurrent neural network models for CPU usage trace prediction, for individual VM traces as well as aggregate CPU usage traces.
In this dissertation, we explore solutions to improve the performance of scheduling workloads on distributed systems. We begin by looking at the problem from the theoretical perspective. Modeling the problem mathematically, we first propose a scheduling algorithm that finds a constant approximation of the optimal solution in polynomial time. We prove that the performance of the algorithm (average completion time) is a constant approximation of that of the optimal schedule. We next look at the problem in real-world scenarios. Considering High-Performance Computing (HPC) workload computing environments as the closest real-world equivalent of our mathematical model, we explore the problem of predicting application runtime. We propose an algorithm to handle the uncertainties that exist in the real world and demonstrate its effectiveness in terms of response time and resource utilization. After looking at the uncertainty problem, we focus on improving the accuracy of existing prediction approaches for HPC application runtime. We propose two solutions, one based on Kalman filters and one based on deep mixture density networks. We showcase the effectiveness of our prediction approaches by comparing them with previous approaches in terms of prediction accuracy and impact on scheduling performance. Finally, we focus on predicting resource usage for individual applications during their execution. We explore the application of recurrent neural networks to predicting resource usage of applications deployed on individual virtual machines. To validate our proposed models and solutions, we performed extensive trace-driven simulation and measured the effectiveness of our approaches.
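The space-sharing, non-preemptive setting described above can be illustrated with a toy batch scheduler: each job requests a number of cores and a runtime, and jobs are started greedily, in submission order, as soon as enough cores are free. This is only a minimal sketch of the problem setting, not the dissertation's approximation algorithm; the job list and core count below are made up for illustration.

```python
import heapq

def greedy_space_share(jobs, total_cores):
    """Toy non-preemptive space-sharing scheduler.

    jobs: list of (cores, runtime) tuples, all available at time 0,
    started greedily in submission order. Returns each job's
    completion time; averaging these gives the objective the
    dissertation's algorithms bound."""
    free = total_cores
    running = []          # min-heap of (finish_time, cores_held)
    now = 0.0
    completions = []
    for cores, runtime in jobs:
        # Wait (advance simulated time) until enough cores free up.
        while free < cores:
            finish, held = heapq.heappop(running)
            now = max(now, finish)
            free += held
        heapq.heappush(running, (now + runtime, cores))
        free -= cores
        completions.append(now + runtime)
    return completions

# On a 4-core machine: two 2-core jobs run together, the 4-core
# job must wait for both to finish.
print(greedy_space_share([(2, 1.0), (2, 2.0), (4, 3.0)], 4))
# → [1.0, 2.0, 5.0]
```

Note that in the non-clairvoyant setting the abstract describes, the true `runtime` values are unknown at submission, which is exactly why the later chapters turn to runtime prediction.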
A case study of proactive auto-scaling for an ecommerce workload
Preliminary data obtained from a partnership between the Federal University
of Campina Grande and an ecommerce company indicates that some applications
have issues when dealing with variable demand. This happens because a delay
in scaling resources leads to performance degradation, a problem the
literature usually addresses by improving auto-scaling. To better understand
the current state of the art on this subject, we re-evaluate an auto-scaling
algorithm proposed in the literature, in the context of ecommerce, using a
long-term real workload. Experimental results show that our proactive
approach achieves an accuracy of up to 94 percent and led the auto-scaler to
better performance than the reactive approach currently used by the
ecommerce company.
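The proactive idea the abstract describes — forecast demand ahead of time and scale before the load arrives, rather than reacting after it — can be sketched roughly as follows. The moving-average-plus-trend predictor and the per-replica capacity figure are illustrative assumptions, not the paper's actual model.

```python
import math

def forecast_next(history, window=3):
    """Naive proactive forecast: mean of the last `window` observations
    plus the linear trend across them (a toy stand-in for whatever
    predictor a real proactive auto-scaler would use)."""
    recent = history[-window:]
    trend = (recent[-1] - recent[0]) / max(len(recent) - 1, 1)
    return sum(recent) / len(recent) + trend

def replicas_needed(predicted_rps, rps_per_replica=100, min_replicas=1):
    """Convert a demand forecast (requests/sec) into a replica count.
    The 100 rps-per-replica capacity is a made-up example figure."""
    return max(min_replicas, math.ceil(predicted_rps / rps_per_replica))

# Rising demand: scale up *before* the next interval arrives.
demand = [100, 120, 140]
print(replicas_needed(forecast_next(demand)))  # → 2
```

A reactive scaler would size for the last observed value (140 rps here) only after latency has already degraded; the proactive version provisions for the predicted next value ahead of time, which is the behavior the paper's accuracy comparison is measuring.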
Workload-Aware Performance Tuning for Autonomous DBMSs
Optimal configuration is vital for a DataBase Management System (DBMS) to achieve high performance. There is no one-size-fits-all configuration that works for different workloads, since each workload has varying patterns with different resource requirements. There is a relationship between configuration, workload, and system performance. If a configuration cannot adapt to the dynamic changes of a workload, the overall performance of the DBMS can degrade significantly unless a sophisticated administrator continuously re-configures it. In this tutorial, we focus on autonomous workload-aware performance tuning, which is expected to automatically and continuously tune the configuration as the workload changes. We survey three research directions: 1) workload classification, 2) workload forecasting, and 3) workload-based tuning. While the first two topics address the issue of obtaining accurate workload information, the third tackles the problem of how to properly use that information to optimize performance. We also identify research challenges and open problems, and give real-world examples of leveraging workload information for database tuning in commercial products (e.g., Amazon Redshift). We will demonstrate workload-aware performance tuning in Amazon Redshift in the presentation.
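A minimal sketch of how the first and third directions fit together — a workload classifier whose output keys into per-class configurations — might look like the following. The thresholds, class names, and knob presets are entirely hypothetical; they are not taken from the tutorial, from Amazon Redshift, or from any real DBMS defaults.

```python
def classify_workload(read_ratio, scan_ratio):
    """Toy rule-based workload classifier (hypothetical thresholds):
    maps a query-mix summary to a coarse workload class."""
    if scan_ratio > 0.5:
        return "analytical"
    return "oltp-read" if read_ratio > 0.7 else "oltp-write"

# Hypothetical per-class knob presets — illustrative names and values
# only, not real tuning advice for any particular DBMS.
PRESETS = {
    "analytical": {"work_mem_mb": 512,  "parallel_workers": 8},
    "oltp-read":  {"buffer_pool_mb": 4096, "parallel_workers": 2},
    "oltp-write": {"wal_buffers_mb": 64, "checkpoint_secs": 300},
}

def tune(read_ratio, scan_ratio):
    """Workload-based tuning step: pick the preset for the
    classified workload. A real tuner would search the
    configuration space rather than use a static table."""
    return PRESETS[classify_workload(read_ratio, scan_ratio)]

print(tune(0.9, 0.1))  # read-heavy OLTP mix → oltp-read preset
```

The second direction, workload forecasting, would sit in front of this: classify the *predicted* future mix rather than the current one, so the reconfiguration lands before the workload shift does.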
DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning
Autoscaling functions provide the foundation for achieving elasticity in the
modern cloud computing paradigm. They enable dynamic provisioning or
de-provisioning of resources for cloud software services and applications
without human intervention, adapting to workload fluctuations. However,
autoscaling microservices is challenging due to various factors. In
particular, complex, time-varying service dependencies are difficult to
quantify accurately and can lead to cascading effects when allocating
resources. This paper presents DeepScaler, a deep learning-based holistic
autoscaling approach for microservices that focuses on coping with service
dependencies to optimize service-level agreement (SLA) assurance and cost
efficiency. DeepScaler employs (i) an expectation-maximization-based
learning method to adaptively generate affinity matrices revealing service
dependencies and (ii) an attention-based graph convolutional network to
extract spatio-temporal features of microservices by aggregating neighbor
information over the graph-structured data. Thus, DeepScaler can capture
more potential service dependencies and accurately estimate the resource
requirements of all services under dynamic workloads. This allows DeepScaler
to reconfigure the resources of the interacting services simultaneously in
one resource provisioning operation, avoiding the cascading effect caused by
service dependencies. Experimental results demonstrate that our method
implements a more effective autoscaling mechanism for microservices that not
only allocates resources accurately but also adapts to dependency changes,
significantly reducing SLA violations by an average of 41% at lower costs.
Comment: To be published in the 38th IEEE/ACM International Conference on
Automated Software Engineering (ASE 2023).
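The aggregation step the abstract attributes to the graph convolutional network — each service mixing its own features with those of its dependency neighbors — can be sketched in a simplified, attention-free form. The row-normalized adjacency with self-loops and the single linear layer below are illustrative assumptions, not DeepScaler's actual architecture.

```python
import numpy as np

def gcn_layer(adj, feats, weights):
    """One simplified graph-convolution step over a service
    dependency graph: average each node's features with its
    neighbors' (self-loops added, rows normalized), then apply a
    linear transform and ReLU. A toy stand-in for an
    attention-based GCN, which would learn per-edge weights
    instead of uniform averaging."""
    a = adj + np.eye(adj.shape[0])           # add self-loops
    a = a / a.sum(axis=1, keepdims=True)     # row-normalize (mean aggregation)
    return np.maximum(a @ feats @ weights, 0.0)  # ReLU activation

# Two mutually dependent services with one-hot features: after one
# layer, each service's representation blends both services equally.
adj = np.array([[0.0, 1.0],
                [1.0, 0.0]])      # affinity matrix (dependency graph)
feats = np.eye(2)                 # per-service feature vectors
out = gcn_layer(adj, feats, np.eye(2))
print(out)                        # every entry is 0.5
```

Stacking such layers widens each service's receptive field over the dependency graph, which is how the paper's model can account for indirect dependencies when estimating all services' resource needs in one provisioning operation.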