1 research outputs found

    Carbon-profit-aware job scheduling and load balancing in geographically distributed cloud for HPC and web applications

    Get PDF
    This thesis introduces two carbon-profit-aware control mechanisms that can be used to improve performance of job scheduling and load balancing in an interconnected system of geographically distributed data centers for HPC and web applications. These control mechanisms consist of three primary components that perform: 1) measurement and modeling, 2) job planning, and 3) plan execution. The measurement and modeling component provide information on energy consumption and carbon footprint as well as utilization, weather, and pricing information. The job planning component uses this information to suggest the best arrangement of applications as a possible configuration to the plan execution component to perform it on the system. For reporting and decision making purposes, some metrics need to be modeled based on directly measured inputs. There are two challenges in accurately modeling of these necessary metrics: 1) feature selection and 2) curve fitting (regression). First, to improve the accuracy of power consumption models of the underutilized servers, advanced fitting methodologies were used on the selected server features. The resulting model is then evaluated on real servers and is used as part of load balancing mechanism for web applications. We also provide an inclusive model for cooling system in data centers to optimize the power consumption of cooling system, which in turn is used by the planning component. Furthermore, we introduce another model to calculate the profit of the system based on the price of electricity, carbon tax, operational costs, sales tax, and corporation taxes. This model is used for optimized scheduling of HPC jobs. For position allocation of web applications, a new heuristic algorithm is introduced for load balancing of virtual machines in a geographically distributed system in order to improve its carbon awareness. This new heuristic algorithm is based on genetic algorithm and is specifically tailored for optimization problems of interconnected system of distributed data centers. A simple version of this heuristic algorithm has been implemented in the GSN project, as a carbon-aware controller. Similarly, for scheduling of HPC jobs on servers, two new metrics are introduced: 1) profitper-core-hour-GHz and 2) virtual carbon tax. In the HPC job scheduler, these new metrics are used to maximize profit and minimize the carbon footprint of the system, respectively. Once the application execution plan is determined, plan execution component will attempt to implement it on the system. Plan execution component immediately uses the hypervisors on physical servers to create, remove, and migrate virtual machines. It also executes and controls the HPC jobs or web applications on the virtual machines. For validating systems designed using the proposed modeling and planning components, a simulation platform using real system data was developed, and new methodologies were compared with the state-of-the-art methods considering various scenarios. The experimental results show improvement in power modeling of servers, significant carbon reduction in load balancing of web applications, and significant profit-carbon improvement in HPC job scheduling
    corecore