17 research outputs found

    Resource management for cost-effective cloud and edge systems

    With the booming of Internet-based and cloud/edge computing applications and services, datacenters hosting these services have become ubiquitous in every sector of our economy, which leads to tremendous research opportunities. Specifically, in cloud computing, all data are gathered and processed in centralized cloud datacenters, whereas in edge computing, the frontier of data and services is pushed away from the centralized cloud to the edge of the network. By fusing edge computing with cloud computing, Internet companies and end users can benefit from their respective merits: the abundant computation and storage resources of cloud computing and the data-gathering potential of edge computing. However, resource management in cloud and edge systems is complicated and challenging due to the large scale of cloud datacenters, diverse interconnected resource types, unpredictable generated workloads, and a range of performance objectives. This necessitates the systematic modeling of cloud and edge systems to achieve desired performance objectives. This dissertation presents a holistic system modeling and novel solution methodology to effectively solve the optimization problems formulated in three cloud and edge architectures: 1) cloud computing in colocation datacenters; 2) cloud computing in geographically distributed datacenters; 3) UAV-enabled mobile edge computing. First, we study resource management with the goal of overall cost minimization in the context of cloud computing systems. A cooperative game is formulated to model the scenario where a multi-tenant colocation datacenter collectively procures electricity in the wholesale electricity market. Then, a two-stage stochastic program is formulated to model the scenario where geographically distributed datacenters dispatch workload and procure electricity in the multi-timescale electricity markets. 
Last, we extend our focus to joint task offloading and resource management with the goal of overall cost minimization in the context of edge computing systems, where edge nodes with computing capabilities are deployed in proximity to end users. A nonconvex optimization problem is formulated in the UAV-enabled mobile edge computing system with the goal of minimizing both the energy consumption for computation and task offloading and the system response delay. Furthermore, a novel hybrid algorithm that unifies differential evolution and successive convex approximation is proposed to efficiently solve the problem with improved performance. This dissertation addresses several fundamental issues related to resource management in cloud and edge computing systems that will further in-depth investigations to improve cost-effective performance. The advanced modeling and efficient algorithms developed in this research enable the system operator to make optimal and strategic decisions in resource allocation and task offloading for cost savings.

    Artificial Intelligence and Machine Learning Approaches to Energy Demand-Side Response: A Systematic Review

    Recent years have seen an increasing interest in Demand Response (DR) as a means to provide flexibility, and hence improve the reliability of energy systems in a cost-effective way. Yet, the high complexity of the tasks associated with DR, combined with their use of large-scale data and the frequent need for near real-time decisions, means that Artificial Intelligence (AI) and Machine Learning (ML), a branch of AI, have recently emerged as key technologies for enabling demand-side response. AI methods can be used to tackle various challenges, ranging from selecting the optimal set of consumers to respond, learning their attributes and preferences, dynamic pricing, and scheduling and control of devices, to learning how to incentivise participants in the DR schemes and how to reward them in a fair and economically efficient way. This work provides an overview of AI methods utilised for DR applications, based on a systematic review of over 160 papers, 40 companies and commercial initiatives, and 21 large-scale projects. The papers are classified with regard to both the AI/ML algorithm(s) used and the application area in energy DR. Next, commercial initiatives (including both start-ups and established companies) and large-scale innovation projects where AI methods have been used for energy DR are presented. The paper concludes with a discussion of advantages and potential limitations of reviewed AI techniques for different DR tasks, and outlines directions for future research in this fast-growing area.

    Strategic and operational services for workload management in the cloud

    In hosting environments such as Infrastructure as a Service (IaaS) clouds, desirable application performance is typically guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated by a service provider for unencumbered use by customers to ensure proper operation of their workloads. Most IaaS offerings are presented to customers as fixed-size and fixed-price SLAs that do not match the needs of specific applications well. Furthermore, arbitrary colocation of applications with different SLAs may result in inefficient utilization of hosts' resources, resulting in economically undesirable customer behavior. In this thesis, we propose the design and architecture of a Colocation as a Service (CaaS) framework: a set of strategic and operational services that allow the efficient colocation of customer workloads. CaaS strategic services provide customers the means to specify their application workload using an SLA language that provides them the opportunity and incentive to take advantage of any tolerances they may have regarding the scheduling of their workloads. CaaS operational services provide the information necessary for, and carry out the reconfigurations mandated by, strategic services. We recognize that there may be multiple, yet functionally equivalent, ways to express an SLA. To that end, we present a service that allows the provably safe transformation of SLAs from one form to another for the purpose of achieving more efficient colocation. Our CaaS framework could be incorporated into an IaaS offering by providers, or it could be implemented as a value-added proposition by IaaS resellers. To establish the practicality of such offerings, we present a prototype implementation of our proposed CaaS framework.

    Artificial intelligence for decision making in energy demand-side response

    This thesis examines the role and application of data-driven Artificial Intelligence (AI) approaches for energy demand-side response (DR). It follows the point of view of a service provider company/aggregator looking to support its decision-making and operation. Overall, the study identifies data-driven AI methods as an essential tool and a key enabler for DR. The thesis is organised into two parts. It first provides an overview of AI methods utilised for DR applications based on a systematic review of over 160 papers, 40 commercial initiatives, and 21 large-scale projects. The reviewed work is categorised based on the type of AI algorithm(s) employed and the DR application area of the AI methods. The end of the first part of the thesis discusses the advantages and potential limitations of the reviewed AI techniques for different DR tasks and how they compare to traditional approaches. The second part of the thesis centres around designing machine learning algorithms for DR. The undertaken empirical work highlights the importance of data quality for providing fair, robust, and safe AI systems in DR, a high-stakes domain. It furthers the state of the art by providing a structured approach for data preparation and data augmentation in DR to minimise propagating effects in the modelling process. The empirical findings on residential response behaviour show better response behaviour in households with internet access, air-conditioning systems, power-intensive appliances, and lower gas usage. However, some insights raise questions about whether the reported levels of consumers’ engagement in DR schemes translate to actual curtailment behaviour, and about the individual rationale of customer response to DR signals. The thesis also proposes a reinforcement learning framework for the decision problem of an aggregator selecting a set of consumers for DR events. 
This approach can support an aggregator in leveraging small-scale flexibility resources by providing an automated end-to-end framework to select the set of consumers for demand curtailment during DR signals in a dynamic environment while considering a long-term view of their selection process.

    Intelligent middleware for HPC systems to improve performance and energy cost efficiency

    High-performance computing (HPC) systems play an essential role in large-scale scientific computations. As the number of nodes in HPC systems continues to increase, their power consumption leads to larger energy costs. The energy costs pose a financial burden on maintaining HPC systems, which will be more challenging on future extreme-scale systems where the number of nodes and power consumption are expected to grow further. To support this growth, higher degrees of network and memory resource sharing are implemented, causing a substantial increase in performance variation and degradation. These challenges call for innovations in HPC system middleware that reduce energy cost without trading off performance. By taking the performance of an HPC system as a first-order constraint, this thesis establishes that HPC systems can participate in demand response programs while providing performance guarantees through a novel design of the middleware. Well-designed middleware also enables enhanced performance by mitigating resource contention induced by energy or cost restrictions. This thesis aims to realize these goals through two complementary approaches. First, this thesis proposes novel policies for HPC systems to enable their participation in emerging power markets, where participants reduce their energy costs by following market requirements. Our policies guarantee that the Quality-of-Service (QoS) of jobs does not drop below given constraints and systematically optimize cost reduction based on large deviation analysis in queueing theory. Through experiments on a real-world cluster whose power consumption is regulated to follow a dynamically changing power target, this thesis claims that HPC systems can participate in emerging power programs without violating the QoS constraints of jobs. Second, this thesis proposes novel resource management strategies to improve the performance of HPC systems. 
Better resource management can mitigate contention that causes performance degradation and poor system utilization. To resolve network contention, we design an intelligent job allocation policy for HPC systems that incorporate the state-of-the-art dragonfly network topology. Our allocation policy mitigates network contention, reduces network communication latency, and consequently improves the performance of the systems. As some of the latest HPC systems support the collection of high-granularity network performance metrics at runtime, we also propose a method to quantify the impact of network congestion and demonstrate that a network-data-driven job allocation policy improves HPC performance by avoiding network traffic hot spots. (2022-01-18)

    Strategic and operational services for workload management in the cloud (PhD thesis)

    In hosting environments such as Infrastructure as a Service (IaaS) clouds, desirable application performance is typically guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated by a service provider for unencumbered use by customers to ensure proper operation of their workloads. Most IaaS offerings are presented to customers as fixed-size and fixed-price SLAs that do not match the needs of specific applications well. Furthermore, arbitrary colocation of applications with different SLAs may result in inefficient utilization of hosts’ resources, resulting in economically undesirable customer behavior. In this thesis, we propose the design and architecture of a Colocation as a Service (CaaS) framework: a set of strategic and operational services that allow the efficient colocation of customer workloads. CaaS strategic services provide customers the means to specify their application workload using an SLA language that provides them the opportunity and incentive to take advantage of any tolerances they may have regarding the scheduling of their workloads. CaaS operational services provide the information necessary for, and carry out the reconfigurations mandated by, strategic services. We recognize that there may be multiple, yet functionally equivalent, ways to express an SLA. To that end, we present a service that allows the provably safe transformation of SLAs from one form to another for the purpose of achieving more efficient colocation. Our CaaS framework could be incorporated into an IaaS offering by providers, or it could be implemented as a value-added proposition by IaaS resellers. To establish the practicality of such offerings, we present a prototype implementation of our proposed CaaS framework. (Major Advisor: Azer Bestavros)

    Energy-efficient Communications in Cloud, Mobile Cloud and Fog Computing

    This thesis studies the problem of energy efficiency of communications in distributed computing paradigms, including cloud computing, mobile cloud computing and fog/edge computing. Distributed computing paradigms have significantly changed the way of doing business. With cloud computing, companies and end users can access the vast majority of services online through a virtualized environment on a pay-as-you-go basis. The three main services typically consumed by cloud users are Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). Mobile cloud and fog/edge computing are the natural extension of the cloud computing paradigm for mobile and Internet of Things (IoT) devices. Based on offloading, the process of outsourcing computing tasks from mobile devices to the cloud, mobile cloud and fog/edge computing paradigms have become popular techniques to augment the capabilities of mobile devices and to reduce their battery drain. The proliferation of mobile and IoT devices, which are equipped with a number of sensors, has given rise to a new cloud-based paradigm for collecting data called mobile crowdsensing, so named because it requires a large number of participants for proper operation. A plethora of communication technologies is applicable to distributed computing paradigms. For example, cloud data centers typically implement wired technologies, while mobile cloud and fog/edge environments exploit wireless technologies such as 3G/4G, WiFi and Bluetooth. Communication technologies directly impact the performance and the energy drain of the system. This Ph.D. thesis analyzes from a global perspective the energy efficiency of communication systems in distributed computing paradigms. In particular, the following contributions are proposed: - A new framework of performance metrics for communication systems of cloud computing data centers. 
The proposed framework allows a fine-grained analysis and comparison of communication systems, processes, and protocols, defining their influence on the performance of cloud applications. - A novel model for the problem of computation offloading, which describes the workflow of mobile applications through a new Directed Acyclic Graph (DAG) technique. This methodology is suitable for IoT devices working in fog computing environments and was used to design an Android application, called TreeGlass, which performs recognition of trees using Google Glass. TreeGlass is evaluated experimentally in different offloading scenarios by measuring battery drain and time of execution as key performance indicators. - In mobile crowdsensing systems, novel performance metrics and a new framework for data acquisition, which exploits a new policy for user recruitment. The performance of the framework is validated through CrowdSenSim, a new simulator designed for mobile crowdsensing activities in large-scale urban scenarios.