30 research outputs found

    Resource Management In Cloud And Big Data Systems

    Cloud computing is a paradigm shift in computing, where services are offered and acquired on demand in a cost-effective way. These services are often virtualized, and they can handle the computing needs of big data analytics. The ever-growing demand for cloud services arises in many areas including healthcare, transportation, energy systems, and manufacturing. However, cloud resources such as computing power, storage, energy, dollars for infrastructure, and dollars for operations, are limited. Effective use of the existing resources raises several fundamental challenges that place cloud resource management at the heart of the cloud providers' decision-making process. One of these challenges faced by the cloud providers is to provision, allocate, and price the resources such that their profit is maximized and the resources are utilized efficiently. In addition, executing large-scale applications in clouds may require resources from several cloud providers. Another challenge when processing data-intensive applications is minimizing their energy costs. Electricity used in US data centers in 2010 accounted for about 2% of total electricity used nationwide. In addition, the energy consumed by the data centers is growing at over 15% annually, and the energy costs make up about 42% of the data centers' operating costs. Therefore, it is critical for the data centers to minimize their energy consumption when offering services to customers. In this Ph.D. dissertation, we address these challenges by designing, developing, and analyzing mechanisms for resource management in cloud computing systems and data centers. The goal is to allocate resources efficiently while optimizing a global performance objective of the system (e.g., maximizing revenue, maximizing social welfare, or minimizing energy). We improve the state of the art in both methodologies and applications. As for methodologies, we introduce novel resource management mechanisms based on mechanism design, approximation algorithms, cooperative game theory, and hedonic games. These mechanisms can be applied in cloud virtual machine (VM) allocation and pricing, cloud federation formation, and energy-efficient computing. In this dissertation, we outline our contributions and possible directions for future research in this field.

    A framework for allocating server time to spot and on-demand services in cloud computing

    Cloud computing delivers value to users by giving them access to computing capacity when the need arises. One approach is to provide both on-demand and spot services on shared servers. The former lets users access servers on demand at a fixed price, with different users occupying servers for different periods. The latter lets users bid for the remaining unoccupied periods via dynamic pricing; however, without appropriate design, such periods may be arbitrarily small, since on-demand users arrive randomly. This is also the current service model adopted by Amazon Elastic Compute Cloud (EC2). In this paper, we provide the first integral framework for sharing server time between on-demand and spot services while optimally pricing spot instances. It guarantees that on-demand users are served quickly while spot users, once accepted, can stably use servers for a suitably long period, which is a key feature in making both on-demand and spot services accessible. Simulation results show that, by complementing the on-demand market with a spot market, a cloud provider can improve revenue by up to 464.7%. The framework is designed under assumptions that are met in real environments. It is a new tool that cloud operators can use to quantify the advantage of a hybrid spot and on-demand service, eventually making the case for operating such a service model in their own infrastructures.
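
To make the headline figure concrete: an improvement of 464.7% means hybrid revenue is roughly 5.647 times the on-demand-only revenue, not 4.647 times. The short sketch below performs that conversion. The function name and the baseline of 100 revenue units are hypothetical, chosen only for illustration; they do not come from the paper.

```python
def revenue_after_improvement(baseline: float, improvement_pct: float) -> float:
    """Revenue after a relative improvement given in percent.

    A 464.7% improvement multiplies the baseline by (1 + 4.647).
    """
    return baseline * (1.0 + improvement_pct / 100.0)


# Hypothetical baseline: 100 revenue units from on-demand service alone.
on_demand_only = 100.0
hybrid = revenue_after_improvement(on_demand_only, 464.7)
multiplier = hybrid / on_demand_only
print(f"hybrid revenue: {hybrid:.1f}, multiplier: {multiplier:.3f}")
```

The 464.7% figure is the reported upper bound ("up to"), so the multiplier here is a best case under the paper's simulation settings.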

    Optimized Contract-based Model for Resource Allocation in Federated Geo-distributed Clouds

    In the era of Big Data, with data growing massively in scale and velocity, cloud computing and its pay-as-you-go model continue to provide significant cost benefits and a seamless service delivery model for cloud consumers. The evolution of small-scale and large-scale geo-distributed datacenters operated and managed by individual Cloud Service Providers (CSPs) raises new challenges in terms of effective global resource sharing and management of autonomously controlled individual datacenter resources towards a globally efficient resource allocation model. Earlier solutions for geo-distributed clouds have focused primarily on achieving global efficiency in resource sharing, which, although it tries to maximize the global resource allocation, results in significant inefficiencies in local resource allocation for individual datacenters and individual cloud providers, leading to unfairness in the revenue and profit they earn. In this paper, we propose a new contracts-based resource sharing model for federated geo-distributed clouds that allows CSPs to establish resource sharing contracts with individual datacenters a priori for defined time intervals during a 24-hour time period. Based on the established contracts, individual CSPs employ a contract cost- and duration-aware job scheduling and provisioning algorithm that enables jobs to complete and meet their response time requirements while achieving both global resource allocation efficiency and local fairness in the profit earned. The proposed techniques are evaluated through extensive experiments using realistic workloads generated from the SHARCNET cluster trace. The experiments demonstrate the effectiveness, scalability, and resource sharing fairness of the proposed model.

    Strategies to Manage Cloud Computing Operational Costs

    Information technology (IT) managers worldwide have adopted cloud computing because of its potential to improve reliability, scalability, security, business agility, and cost savings; however, the rapid adoption of cloud computing has created challenges for IT managers, who have reported an estimated 30% wastage of cloud resources. The purpose of this single case study was to explore successful strategies and processes for managing infrastructure operations costs in cloud computing. The sociotechnical systems (STS) approach was the conceptual framework for the study. Semistructured interviews were conducted with 6 IT managers directly involved in cloud cost management. The data were analyzed using qualitative data-analysis software to identify initial categories and emerging themes, which were refined in alignment with the STS framework. The key themes from the analysis indicated that successful cloud cost management began with assessing the current environment and architecting applications and systems to fit cloud services, using tools for monitoring and reporting, and actively managing costs in alignment with medium- and long-term goals. Findings also indicated that social considerations such as fostering collaboration among all stakeholders, employee training, and skills development were critical for success. The implications for positive social change that derive from effectively managing operational costs include improved financial posture, job stability, and environmental sustainability.

    Geo-distributed Edge and Cloud Resource Management for Low-latency Stream Processing

    The proliferation of Internet-of-Things (IoT) devices is rapidly increasing the demands for efficient processing of low latency stream data generated close to the edge of the network. Edge Computing provides a layer of infrastructure to fill latency gaps between the IoT devices and the back-end cloud computing infrastructure. A large number of IoT applications require continuous processing of data streams in real-time. Edge computing-based stream processing techniques that carefully consider the heterogeneity of the computing and network resources available in the geo-distributed infrastructure provide significant benefits in optimizing the throughput and end-to-end latency of the data streams. Managing geo-distributed resources operated by individual service providers raises new challenges in terms of effective global resource sharing and achieving global efficiency in the resource allocation process. In this dissertation, we present a distributed stream processing framework that optimizes the performance of stream processing applications through a careful allocation of computing and network resources available at the edge of the network. The proposed approach differentiates itself from the state-of-the-art through its careful consideration of data locality and resource constraints during physical plan generation and operator placement for the stream queries. Additionally, it considers co-flow dependencies that exist between the data streams to optimize the network resource allocation through an application-level rate control mechanism. The proposed framework incorporates resilience through a cost-aware partial active replication strategy that minimizes the recovery cost when applications incur failures. The framework employs a reinforcement learning-based online learning model for dynamically determining the level of parallelism to adapt to changing workload conditions. 
The second dimension of this dissertation proposes a novel model for allocating computing resources in edge and cloud computing environments. In edge computing environments, it allows service providers to establish resource sharing contracts with infrastructure providers a priori in a latency-aware manner. In geo-distributed cloud environments, it allows cloud service providers to establish resource sharing contracts with individual datacenters a priori for defined time intervals in a cost-aware manner. Based on these mechanisms, we develop a decentralized implementation of the contract-based resource allocation model for geo-distributed resources using Smart Contracts in Ethereum.

    DRIVE: A Distributed Economic Meta-Scheduler for the Federation of Grid and Cloud Systems

    The computational landscape is littered with islands of disjoint resource providers including commercial Clouds, private Clouds, national Grids, institutional Grids, clusters, and data centers. These providers are independent and isolated due to a lack of communication and coordination; they are also often proprietary, without standardised interfaces, protocols, or execution environments. The lack of standardisation and global transparency has the effect of binding consumers to individual providers. With the increasing ubiquity of computation providers, there is an opportunity to create federated architectures that span both Grid and Cloud computing providers, effectively creating a global computing infrastructure. In order to realise this vision, secure and scalable mechanisms to coordinate resource access are required. This thesis proposes a generic meta-scheduling architecture to facilitate federated resource allocation in which users can provision resources from a range of heterogeneous (service) providers. Efficient resource allocation is difficult in large-scale distributed environments due to the inherent lack of centralised control. In a Grid model, local resource managers govern access to a pool of resources within a single administrative domain but have only a local view of the Grid and are unable to collaborate when allocating jobs. Meta-schedulers act at a higher level and are able to submit jobs to multiple resource managers; however, they are most often deployed on a per-client basis and are therefore concerned with only their own allocations, essentially competing against one another. In a federated environment, the widespread adoption of utility computing models seen in commercial Cloud providers has re-motivated the need for economically aware meta-schedulers. Economies provide a way to represent the different goals and strategies that exist in a competitive distributed environment. 
The use of economic allocation principles effectively creates an open service market that provides efficient allocation and incentives for participation. The major contributions of this thesis are the architecture and prototype implementation of the DRIVE meta-scheduler. DRIVE is a Virtual Organisation (VO) based distributed economic meta-scheduler in which members of the VO collaboratively allocate services or resources. Providers joining the VO contribute obligation services to the VO. These contributed services are in effect membership “dues” and are used in the running of the VO's operations, for example allocation, advertising, and general management. DRIVE is independent of any particular class of provider (Service, Grid, or Cloud) or specific economic protocol. This independence enables allocation in federated environments composed of heterogeneous providers in vastly different scenarios. Protocol independence facilitates the use of arbitrary protocols based on specific requirements and infrastructural availability. For instance, within a single organisation where internal trust exists, users can achieve maximum allocation performance by choosing a simple economic protocol. In a global utility Grid no such trust exists. The same meta-scheduler architecture can be used with a secure protocol which ensures the allocation is carried out fairly in the absence of trust. DRIVE establishes contracts between participants as the result of allocation. A contract describes individual requirements and obligations of each party. A unique two-stage contract negotiation protocol is used to minimise the effect of allocation latency. In addition, due to the cooperative nature of the architecture and the use of secure privacy-preserving protocols, DRIVE can be deployed in a distributed environment without requiring large-scale dedicated resources. This thesis presents several other contributions related to meta-scheduling and open service markets. 
To overcome the perceived performance limitations of economic systems, four high-utilisation strategies have been developed and evaluated. Each strategy is shown to improve occupancy, utilisation, and profit using synthetic workloads based on a production Grid trace. The gRAVI service wrapping toolkit is presented to address the difficulty of web-enabling existing applications. The gRAVI toolkit has been extended for this thesis such that it creates economically aware (DRIVE-enabled) services that can be transparently traded in a DRIVE market without requiring developer input. The final contribution of this thesis is the definition and architecture of a Social Cloud: a dynamic Cloud computing infrastructure composed of virtualised resources contributed by members of a Social network. The Social Cloud prototype is based on DRIVE and highlights the ease with which dynamic DRIVE markets can be created and used in different domains.

    Cost-effective resource management for distributed computing

    Current distributed computing and resource management infrastructures (e.g., Cluster and Grid) suffer from a wide variety of problems related to resource management, which include scalability bottlenecks, resource allocation delays, limited quality-of-service (QoS) support, and a lack of cost-aware and service level agreement (SLA) mechanisms. This thesis addresses these issues by presenting a cost-effective resource management solution which introduces the possibility of managing geographically distributed resources in resource units that are under the control of a Virtual Authority (VA). A VA is a collection of resources controlled, but not necessarily owned, by a group of users or an authority representing a group of users. It leverages the fact that different resources in disparate locations will have varying usage levels. By creating smaller divisions of resources called VAs, users would be given the opportunity to choose between a variety of cost models, and each VA could rent resources from resource providers when necessary, or could potentially rent out its own resources when underloaded. The resource management is simplified since the user and owner of a resource recognize only the VA, because all permissions and charges are associated directly with the VA. The VA is controlled by a ‘rental’ policy which is supported by a pool of resources that the system may rent from external resource providers. As far as scheduling is concerned, the VA is independent from competitors and can instead concentrate on managing its own resources. As a result, the VA offers scalable resource management with minimal infrastructure and operating costs. We demonstrate the feasibility of the VA through both a practical implementation of the prototype system and an illustration of its quantitative advantages through the use of extensive simulations. First, the VA concept is demonstrated through a practical implementation of the prototype system. 
Further, we perform a cost-benefit analysis of current distributed resource infrastructures to demonstrate the potential cost benefit of such a VA system. We then propose a costing model for evaluating the cost effectiveness of the VA approach by using an economic approach that captures revenues generated from applications and expenses incurred from renting resources. Based on our costing methodology, we present rental policies that can potentially offer effective mechanisms for running distributed and parallel applications without a heavy upfront investment and without the cost of maintaining idle resources. By using real workload trace data, we test the effectiveness of our proposed rental approaches. Finally, we propose an extension to the VA framework that promotes long-term negotiations and rentals based on service level agreements or long-term contracts. Based on the extended framework, we present new SLA-aware policies and evaluate them using real workload traces to demonstrate their effectiveness in improving rental decisions.
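
The abstract describes the costing model only in outline: revenue from applications set against the expense of renting resources. A minimal sketch of that accounting, with a simple rental rule, might look as follows. Everything here is an illustrative assumption: the `Period` fields, the `profit` function, and the shortfall-only `units_to_rent` rule are hypothetical and are not the thesis's actual model or policies.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Period:
    """One accounting period for a Virtual Authority (hypothetical fields)."""
    revenue: float           # revenue generated by applications in this period
    rented_units: int        # resource units rented from external providers
    unit_rental_cost: float  # cost of one rented unit for this period


def profit(periods: Iterable[Period]) -> float:
    """Total revenue minus total rental expense across all periods."""
    return sum(p.revenue - p.rented_units * p.unit_rental_cost for p in periods)


def units_to_rent(demand: int, owned: int, max_rentable: int) -> int:
    """Assumed rule: rent only the shortfall, capped by what is available.

    Renting nothing when owned capacity covers demand avoids paying
    for idle resources.
    """
    shortfall = max(0, demand - owned)
    return min(shortfall, max_rentable)


history = [Period(10.0, 2, 3.0), Period(5.0, 0, 3.0)]
print(profit(history))          # (10 - 2*3) + (5 - 0) = 9.0
print(units_to_rent(12, 8, 3))  # shortfall of 4, capped at 3
```

A policy evaluated against workload traces, as in the thesis, would replace the fixed inputs here with per-period demand taken from the trace.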