379 research outputs found

    Data-Driven Intelligent Scheduling For Long Running Workloads In Large-Scale Datacenters

    Cloud computing is becoming a fundamental facility of society today. Large-scale public and private cloud datacenters, spanning millions of servers and operated as warehouse-scale computers, support most of the business of Fortune 500 companies and serve billions of users around the world. Unfortunately, the industry-wide average datacenter utilization is as low as 6% to 12%. Low utilization not only hurts the operational and capital components of cost efficiency, but also becomes a scaling bottleneck because of the limits on electricity delivered by nearby utilities. Improving multi-resource efficiency for global datacenters is both critical and challenging. Additionally, with the great commercial success of diverse big data analytics services, enterprise datacenters are evolving to host heterogeneous computation workloads, including online web services, batch processing, machine learning, streaming computing, interactive query and graph computation, on shared clusters. Most of these are long-running workloads that use long-lived containers to execute tasks. We survey datacenter resource scheduling work from the last 15 years. Most previous work is designed to maximize cluster efficiency for short-lived tasks in batch processing systems such as Hadoop, and is not suitable for modern long-running workloads in systems such as microservices, Spark, Flink, Pregel, Storm or TensorFlow. New, effective scheduling and resource allocation approaches are urgently needed to improve efficiency in large-scale enterprise datacenters. In this dissertation, we are the first to define and identify the problems, challenges and scenarios of scheduling and resource management for diverse long-running workloads in modern datacenters. These workloads rely on predictive scheduling techniques to perform reservation, auto-scaling, migration or rescheduling, which pushes us to explore more intelligent scheduling techniques backed by adequate predictive knowledge. 
We specify what intelligent scheduling is, which abilities it requires, and how it can be used to transform NP-hard online scheduling problems into solvable offline scheduling problems. We designed and implemented an intelligent cloud datacenter scheduler that automatically performs resource-to-performance modeling, predictive optimal reservation estimation, and QoS (interference)-aware predictive scheduling to maximize resource efficiency across multiple dimensions (CPU, memory, network, disk I/O) while strictly guaranteeing service level agreements (SLAs) for long-running workloads. Finally, we introduce a large-scale co-location technique for executing long-running and other workloads on the shared global datacenter infrastructure of Alibaba Group. It improves cluster utilization from 10% to 50% on average. The effort goes far beyond scheduling and involves the evolution of IDC, network, physical datacenter topology, storage, server hardware, operating systems and containerization. We demonstrate its effectiveness through an analysis of the newest Alibaba public cluster trace, from 2017. We are also the first to reveal a global view of the scenarios, challenges and status of Alibaba's large-scale global datacenters through data, including big promotion events such as Double 11. Data-driven intelligent scheduling methodologies and effective infrastructure co-location techniques are critical and necessary for maximizing multi-resource efficiency in modern large-scale datacenters, especially for long-running workloads.

    epcAware: a game-based, energy, performance and cost efficient resource management technique for multi-access edge computing

    The Internet of Things (IoT) is producing an extraordinary volume of data daily, and it is possible that the data may become useless while on its way to the cloud for analysis, due to longer distances and delays. Fog/edge computing is a new model for analyzing and acting on time-sensitive data (real-time applications) at the network edge, adjacent to where it is produced. The model sends only selected data to the cloud for analysis and long-term storage. Furthermore, cloud services provided by large companies such as Google can also be localized to minimize the response time and increase service agility. This could be accomplished by deploying small-scale datacenters (referred to as cloudlets) where essential, closer to customers (IoT devices) and connected to a centralised cloud through networks, which together form a multi-access edge cloud (MEC). The MEC setup involves three different parties, i.e. service providers (IaaS), application providers (SaaS) and network providers (NaaS), which might have different goals, making resource management a difficult job. In the literature, various resource management techniques have been suggested that address what kind of services should be hosted and how the available resources should be allocated to customers’ applications, particularly if mobility is involved. However, the existing literature considers the resource management problem with respect to a single party. In this paper, we consider resource management with respect to all three parties, i.e. IaaS, SaaS and NaaS, and suggest a game-theoretic resource management technique that minimises infrastructure energy consumption and costs while ensuring application performance. Our empirical evaluation, using real workload traces from Google’s cluster, suggests that our approach could reduce energy consumption by up to 11.95% and user costs by approximately 17.86%, with negligible loss in performance. 
Moreover, IaaS providers can reduce their energy bills by up to 20.27% and NaaS providers can increase their cost savings by up to 18.52% compared to other methods.

    Classification and Performance Study of Task Scheduling Algorithms in Cloud Computing Environment

    Cloud computing has become very common in recent years and is growing rapidly thanks to attractive benefits and features such as resource pooling, accessibility, availability, scalability, reliability, cost saving, security, flexibility, on-demand services, pay-per-use services, use from anywhere, quality of service, resilience, etc. With this rapid growth, many users may require services or need to execute their tasks simultaneously on resources provided by service providers. To deliver these services with the best performance, minimum cost, response time and makespan, and effective use of resources, an intelligent and efficient task scheduling technique is required; it is considered one of the main and essential issues in the cloud computing environment. Such a technique is necessary for allocating tasks to the proper cloud resources and optimizing overall system performance. To this end, researchers have put great effort into developing several classes of scheduling algorithms to suit the various computing environments and satisfy the needs of various types of individuals and organizations. This research article provides a classification of the proposed scheduling strategies and algorithms in the cloud computing environment, along with an evaluation of their performance. A comparison of the performance of these algorithms with existing ones is also given. Additionally, the future research work identified in the reviewed articles (if available) is pointed out. This research work includes a review of 88 task scheduling algorithms in the cloud computing environment, distributed over the seven scheduling classes suggested in this study. Each article deals with a novel scheduling technique and the performance improvement it introduces compared with previously existing task scheduling algorithms. 
Keywords: Cloud computing, Task scheduling, Load balancing, Makespan, Energy-aware, Turnaround time, Response time, Cost of task, QoS, Multi-objective. DOI: 10.7176/IKM/12-5-03. Publication date: September 30th 2022

    An Algorithm for Network and Data-aware Placement of Multi-Tier Applications in Cloud Data Centers

    Today's Cloud applications are dominated by composite applications comprising multiple computing and data components with strong communication correlations among them. Although Cloud providers are deploying large numbers of computing and storage devices to address the ever-increasing demand for computing and storage resources, network resource demands are emerging as one of the key performance bottlenecks. This paper addresses network-aware placement of virtual components (computing and data) of multi-tier applications in data centers and formally defines the placement as an optimization problem. The simultaneous placement of Virtual Machines and data blocks aims at reducing the network overhead of the data center network infrastructure. A greedy heuristic is proposed for on-demand placement of application components that localizes network traffic in the data center interconnect. Such optimization helps reduce communication overhead in upper-layer network switches, which will eventually reduce the overall traffic volume across the data center. This, in turn, will help reduce packet transmission delay, increase network performance, and minimize the energy consumption of network components. Experimental results demonstrate the superiority of the proposed algorithm over other approaches: it outperforms the state-of-the-art network-aware application placement algorithm across all performance metrics, reducing the average network cost by up to 67% and network usage at core switches by up to 84%, and increasing the average number of application deployments by up to 18%. Comment: Submitted for publication consideration for the Journal of Network and Computer Applications (JNCA). Total pages: 28. Number of figures: 15 figures
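To make the idea of a traffic-localizing greedy placement concrete, the following is a minimal Python sketch. It is not the algorithm from the paper above, whose details the abstract does not give. The two-level host/rack topology, the hop costs (0, 1, 3), and the names `greedy_place` and `network_cost` are all assumptions chosen for illustration.

```python
def distance(a, b, rack_of):
    """Assumed hop cost: 0 on the same host, 1 within a rack, 3 across racks."""
    if a == b:
        return 0
    return 1 if rack_of[a] == rack_of[b] else 3


def added_cost(c, host, placement, traffic, rack_of):
    """Traffic cost that placing component c on host adds towards already placed peers."""
    cost = 0
    for (u, v), demand in traffic.items():
        if c not in (u, v):
            continue
        peer = v if u == c else u
        if peer in placement:
            cost += demand * distance(host, placement[peer], rack_of)
    return cost


def greedy_place(components, traffic, capacity, rack_of):
    """Greedy placement sketch (hypothetical, for illustration only).

    components: name -> size (VMs and data blocks treated alike)
    traffic:    (u, v) -> bandwidth demand between two components
    capacity:   host -> free capacity
    Returns a dict mapping component name -> host.
    """
    free = dict(capacity)
    placement = {}
    # Visit the heaviest-communicating pairs first so they end up close together.
    ordered = sorted(traffic.items(), key=lambda kv: -kv[1])
    pending = [c for (u, v), _ in ordered for c in (u, v)]
    # Components without any traffic are placed last.
    pending += [c for c in components if c not in pending]
    for c in pending:
        if c in placement:
            continue
        candidates = [h for h in free if free[h] >= components[c]]
        if not candidates:
            raise ValueError(f"no host can fit component {c!r}")
        # Pick the host that adds the least traffic cost; break ties by host name.
        best = min(
            candidates,
            key=lambda h: (added_cost(c, h, placement, traffic, rack_of), h),
        )
        placement[c] = best
        free[best] -= components[c]
    return placement


def network_cost(placement, traffic, rack_of):
    """Total demand-weighted hop cost of a placement."""
    return sum(
        demand * distance(placement[u], placement[v], rack_of)
        for (u, v), demand in traffic.items()
    )


if __name__ == "__main__":
    rack_of = {"h1": "r1", "h2": "r1", "h3": "r2"}
    capacity = {"h1": 2, "h2": 2, "h3": 2}
    components = {"web": 1, "app": 1, "db": 1, "data": 1}
    traffic = {("app", "db"): 10, ("web", "app"): 5, ("db", "data"): 8}
    placement = greedy_place(components, traffic, capacity, rack_of)
    print(placement, network_cost(placement, traffic, rack_of))
```

In this toy setup the heaviest pair (`app`, `db`) is co-located on one host and the remaining components stay in the same rack, so no traffic crosses the rack boundary. A real placement algorithm would also have to model link capacities, data block replication and the actual data center topology.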

    On the Importance of Infrastructure-Awareness in Large-Scale Distributed Storage Systems

    Big data applications put significant latency and throughput demands on distributed storage systems. Meeting these demands requires storage systems to use a significant amount of infrastructure resources, such as network capacity and storage devices. Resource demands largely depend on the workloads and can vary significantly over time. Moreover, demand hotspots can move rapidly between different infrastructure locations. Existing storage systems are largely infrastructure-oblivious as they are designed to support a broad range of hardware and deployment scenarios. Most only use basic configuration information about the infrastructure to make important placement and routing decisions. In the case of cloud-based storage systems, cloud services have their own infrastructure-specific limitations, such as minimum request sizes and maximum number of concurrent requests. By ignoring infrastructure-specific details, these storage systems are unable to react to resource demand changes and may have additional inefficiencies from performing redundant network operations. As a result, provisioning enough resources for these systems to address all possible workloads and scenarios would be cost prohibitive. This thesis studies the performance problems in commonly used distributed storage systems and introduces novel infrastructure-aware design methods to improve their performance. First, it addresses the problem of slow reads due to network congestion that is induced by disjoint replica and path selection. Selecting a read replica separately from the network path can perform poorly if all paths to the pre-selected endpoints are congested. Second, this thesis looks at scalability limitations of consensus protocols that are commonly used in geo-distributed key value stores and distributed ledgers. 
Due to their network-oblivious designs, existing protocols redundantly communicate over highly oversubscribed WAN links, which utilizes network resources poorly and limits consistent replication at large scale. Finally, this thesis addresses the need for a cloud-specific realtime storage system for capital market use cases. Public cloud infrastructures provide feature-rich and cost-effective storage services. However, existing realtime timeseries databases are not built to take advantage of cloud storage services. Therefore, they do not effectively utilize cloud services to provide high performance while minimizing deployment cost. This thesis presents three systems that address these problems by using infrastructure-aware design methods. Our performance evaluation of these systems shows that infrastructure-aware design is highly effective in improving the performance of large-scale distributed storage systems.

    Towards Power- and Energy-Efficient Datacenters

    As the Internet evolves, cloud computing is now a dominant form of computation in modern lives. Warehouse-scale computers (WSCs), or datacenters, which form the foundation of this cloud-centric web, have been able to deliver satisfactory performance to both the Internet companies and the customers. With the increased focus on and popularity of the cloud, however, datacenter loads are growing rapidly, and Internet companies need more computing capacity to serve such demand. Unfortunately, power and energy are often the major limiting factors prohibiting datacenter growth: it is often the case that no more servers can be added to datacenters without surpassing the capacity of the existing power infrastructure. This dissertation aims to investigate the issues of power and energy usage in a modern datacenter environment. We identify the sources of power and energy inefficiency at three levels in a modern datacenter environment and provide insights and solutions to address each of these problems, aiming to prepare datacenters for critical future growth. We start at the datacenter level and find that peak provisioning and improper service placement in multi-level power delivery infrastructures fragment the power budget inside production datacenters, degrading the compute capacity the existing infrastructure can support. We find that the heterogeneity among datacenter workloads is key to addressing this issue and design systematic methods to reduce the fragmentation and improve the utilization of the power budget. This dissertation then narrows the focus to examine the energy usage of individual servers running cloud workloads. In particular, we examine the power management mechanisms employed in these servers and find that the coarse time granularity of these mechanisms is one critical factor that leads to excessive energy consumption. 
We propose an intelligent, low-overhead solution built on top of the emerging finer-granularity voltage/frequency boosting circuit that effectively pinpoints and boosts queries that are likely to increase the tail distribution and can reap more benefit from the voltage/frequency boost, improving energy efficiency without sacrificing quality of service. The final focus of this dissertation takes a further step to investigate how using a fundamentally more efficient computing substrate, field programmable gate arrays (FPGAs), benefits datacenter power and energy efficiency. Unlike other types of hardware acceleration, FPGAs can be reconfigured on the fly to provide fine-grained control over hardware resource allocation, and they present a unique set of challenges for optimal workload scheduling and resource allocation. We aim to design a set of coordinated algorithms to manage these two key factors simultaneously and fully explore the benefit of deploying FPGAs in the highly varying cloud environment. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144043/1/hsuch_1.pd

    Modeling and simulation of data-driven applications in SDN-aware environments

    PhD Thesis. The popularity of Software-Defined Networking (SDN) is rising as it promises to offer a window of opportunity and new features in terms of network performance, configuration, and management. As such, SDN is exploited by several emerging applications and environments, such as cloud computing, edge computing, IoT, and data-driven applications. Although SDN has demonstrated significant improvements in industry, little research has explored the adoption of SDN for cross-layer optimization in different SDN-aware environments. Each application and computing environment has different functional and Quality of Service (QoS) requirements. For example, a typical MapReduce application would require data transmission at three different times, while the data transmission of stream-based applications would be unknown due to uncertainty about the number of required tasks and the dependencies among stream tasks. As such, deployments of SDN with different applications are not identical and require different deployment strategies and algorithms to meet different QoS requirements (e.g., high bandwidth, deadline). Further, each application and environment has a unique architecture, which imposes a different form of complexity in terms of computing, storage, and network. Due to such complexities, finding optimal solutions for SDN-aware applications and environments becomes very challenging. Therefore, this thesis presents multilateral research towards the optimization, modeling, and simulation of cross-layer optimization of SDN-aware applications and environments. Several tools and algorithms have been proposed, implemented, and evaluated, considering various environments and applications [1–4]. The main contributions of this thesis are as follows: • Proposing and modeling a new holistic framework that simulates MapReduce applications, big data management systems (BDMS), and SDN-aware networks in cloud-based environments. 
Theoretical and mathematical models of MapReduce in SDN-aware cloud datacenters are also proposed. The government of Saudi Arabia represented by Saudi Electronic University (SEU) and the Royal Embassy of Saudi Arabia Cultural Burea

    A cost-efficient QoS-aware analytical model of future software content delivery networks

    Freelance, part-time, work-at-home, and other flexible jobs are changing the concept of workplace, and bringing information and content exchange problems to companies. Geographically spread corporations may use remote distribution of software and data to attend employees' demands, by exploiting emerging delivery technologies. In this context, cost-efficient software distribution is crucial to allow business evolution and make IT infrastructures more agile. On the other hand, container based virtualization technology is shaping the new trends of software deployment and infrastructure design. We envision current and future enterprise IT management trends evolving towards container based software delivery over Hybrid CDNs. This paper presents a novel cost-efficient QoS aware analytical model and a Hybrid CDN-P2P architecture for enterprise software distribution. The model would allow delivery cost minimization for a wide range of companies, from big multinationals to SMEs, using CDN-P2P distribution under various industrial hypothetical scenarios. Model constraints guarantee acceptable deployment times and keep interchanged content amounts below the bandwidth and storage network limits in our scenarios. Indeed, key model parameters account for network bandwidth, storage limits and rental prices, which are empirically determined from their offered values by the commercial delivery networks KeyCDN, MaxCDN, CDN77 and BunnyCDN. This preliminary study indicates that MaxCDN offers the best cost-QoS trade-off. The model is implemented in the network simulation tool PeerSim, and then applied to diverse testing scenarios by varying company types, number and profile (either, technical or administrative) of employees and the number and size of content requests. 
Hybrid simulation results show overall economic savings between 5% and 20%, compared to just hiring resources from a commercial CDN, while guaranteeing satisfactory QoS levels in terms of deployment times and number of served requests. This work was partially supported by Generalitat de Catalunya under the SGR Program (2017-SGR-962) and the RIS3CAT DRAC Project (001-P-001723). We have also received funding from Ministry of Science and Innovation (Spain) under the project EQC2019-005653-P. Peer Reviewed. Postprint (author's final draft)

    Modeling virtualized application performance from hypervisor counters

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 61-64). Managing a virtualized datacenter has grown more challenging, as each virtual machine's service level agreement (SLA) must be satisfied even though service levels are generally inaccessible to the hypervisor. To aid in VM consolidation and service level assurance, we develop a modeling technique that generates accurate models of service level. Using only hypervisor counters as inputs, we train models to predict application response times and SLA violations. To collect training data, we conduct a simulation phase that stresses the application across many workload levels and records each response time. Simultaneously, hypervisor performance counters are collected. Afterwards, the data is synchronized and used as training data in ensemble-based genetic programming for symbolic regression. This modeling technique is quite efficient at dealing with high-dimensional datasets, and it also generates interpretable models. After training models for web servers and virtual desktops, we test generalization across different content. In our experiments, we found that our technique could distill small subsets of important hypervisor counters from over 700 counters. This was tested for both Apache web servers and Windows-based virtual desktop infrastructures. For the web servers, we accurately modeled the breakdown points and also the service levels. Our models could predict service levels with 90.5% accuracy on a test set. On an untrained scenario with completely different contending content, our models predict service levels with 70% accuracy, but predict SLA violations with 92.7% accuracy. For the virtual desktops, on test scenarios similar to training scenarios, model accuracy was 97.6%. 
Our main contribution is demonstrating that a completely data-driven approach to application performance modeling can be successful. In contrast to many other works, our models do not use workload level or response times as inputs, but nevertheless predict service level accurately. Our approach also lets the models determine which inputs are important to a particular model's performance, rather than hand-choosing a few inputs to train on. By Lawrence L. Chan. M.Eng