3 research outputs found

    A multi-resource load balancing algorithm for cloud cache systems

    With the advent of the cloud computing model, distributed caches have become a cornerstone for building scalable applications. Popular systems like Facebook [1] and Twitter use Memcached [5], a highly scalable distributed object cache, to speed up applications by avoiding database accesses. Distributed object caches assign objects to cache instances with a hashing function, and objects are not moved from one cache instance to another unless more instances are added to the cache and objects are redistributed. This may lead to situations where some cache instances are overloaded because some of the objects they store are frequently accessed, while other cache instances are far less used. In this paper we propose a multi-resource load balancing algorithm for distributed cache systems. The algorithm aims to balance both CPU and memory resources among cache instances by redistributing stored data. Considering the possible conflict of balancing multiple resources at the same time, we give the CPU and memory resources weighted priorities based on the runtime load distributions: a scarcer resource is given a higher weight than a less scarce one when load balancing. The system imbalance degree is evaluated from monitoring information and from the utility load of a node, a unit of resource consumption. Moreover, since continuous rebalancing of the system may affect the QoS of applications using the cache, our data selection policy ensures that each data migration minimizes the system imbalance degree, so the total reconfiguration cost is also minimized. An extensive simulation compares our policy with other policies; our policy shows a significant improvement in time efficiency and a decrease in reconfiguration cost.
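
    As a rough illustration of the weighting idea described above, the following Python sketch scores a cluster's imbalance as a scarcity-weighted spread of per-node CPU and memory loads and greedily picks the single object migration that reduces it most. All names and the specific metric (a weighted standard deviation) are assumptions for illustration; the paper's exact imbalance degree, utility load, and selection policy may differ.

        # Illustrative sketch (hypothetical names): score cluster imbalance as a
        # scarcity-weighted spread of per-node CPU/memory loads and greedily pick
        # the object migration that reduces it most.
        from statistics import mean, pstdev

        def resource_weights(nodes):
            """Weight each resource by scarcity: the busier (scarcer) resource
            gets the larger weight; weights are normalized to sum to 1."""
            cpu = mean(n["cpu"] for n in nodes)
            mem = mean(n["mem"] for n in nodes)
            total = (cpu + mem) or 1.0
            return cpu / total, mem / total

        def imbalance(nodes):
            """Imbalance degree: weighted sum of per-resource load spreads."""
            w_cpu, w_mem = resource_weights(nodes)
            return (w_cpu * pstdev(n["cpu"] for n in nodes)
                    + w_mem * pstdev(n["mem"] for n in nodes))

        def best_migration(nodes, objects):
            """Pick the (object, source, target) move that most reduces the
            imbalance; objects maps id -> (node index, cpu load, mem load)."""
            base, best = imbalance(nodes), None
            for oid, (src, cpu, mem) in objects.items():
                for dst in range(len(nodes)):
                    if dst == src:
                        continue
                    # Tentatively apply the move, score it, then undo it.
                    for node, sign in ((nodes[src], -1), (nodes[dst], +1)):
                        node["cpu"] += sign * cpu
                        node["mem"] += sign * mem
                    score = imbalance(nodes)
                    for node, sign in ((nodes[src], +1), (nodes[dst], -1)):
                        node["cpu"] += sign * cpu
                        node["mem"] += sign * mem
                    if score < base:
                        base, best = score, (oid, src, dst)
            return best

    Calling best_migration repeatedly, stopping when no move improves the score, mirrors the abstract's goal of reducing the imbalance degree while keeping the number of migrations, and hence the reconfiguration cost, low.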

    Virtual Machine Management for Efficient Cloud Data Centers with Applications to Big Data Analytics

    Infrastructure-as-a-Service (IaaS) cloud data centers offer computing resources in the form of virtual machine (VM) instances as a service over the Internet. This allows cloud users to lease and manage computing resources on the pay-as-you-go model: users run their applications on the most appropriate VM instances and pay only for the resources actually used. To support the growing service demands of end users, cloud providers are building an increasing number of large-scale IaaS cloud data centers, consisting of many thousands of heterogeneous servers. The ever-increasing heterogeneity of both servers and VMs requires efficient management to balance the load in the data centers and, more importantly, to reduce the energy consumed by underutilized physical servers. The key to achieving these goals is to eliminate inefficiencies in the use of computing resources. This dissertation investigates the VM management problem for efficient IaaS cloud data centers. In particular, it considers VM placement and VM consolidation to achieve effective load balancing and energy efficiency in cloud infrastructures. VM placement allows cloud providers to allocate a set of requested or migrating VMs onto physical servers with the goal of balancing the load or minimizing the number of active servers. While addressing the VM placement problem is important, VM consolidation is even more important because it enables the continuous reorganization of already-placed VMs onto the fewest possible servers. It creates idle servers during periods of low resource utilization by exploiting the live VM migration provided by virtualization technologies; energy consumption is then reduced by dynamically switching idle servers into a power-saving state. As VM migrations and server switches consume additional energy themselves, their frequency must be limited as well. The dissertation concludes with a sample application of distributed computing to big data analytics.
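
    To make the consolidation step concrete, here is a generic greedy pass in Python in the spirit described above: it tries to drain the least-loaded servers onto the fullest servers that still have capacity (a best-fit flavor) and reports which servers became idle and could be switched into a power-saving state. The data layout and heuristic are illustrative assumptions, not the dissertation's actual algorithms.

        # Generic greedy consolidation pass (illustrative only): drain the
        # least-loaded servers onto the fullest servers that still have room,
        # then report servers left idle as candidates for a power-saving state.

        def consolidate(servers):
            """servers: list of {"cap": capacity, "vms": [vm_load, ...]}.
            Returns the migrations performed and the indices of idle servers."""
            load = lambda i: sum(servers[i]["vms"])
            migrations, idle = [], []
            for src in sorted(range(len(servers)), key=load):
                drained = True
                for vm in sorted(servers[src]["vms"], reverse=True):
                    # Best fit: the fullest other server that can host the VM.
                    fits = [d for d in range(len(servers))
                            if d != src and load(d) + vm <= servers[d]["cap"]]
                    if not fits:
                        drained = False
                        continue
                    dst = max(fits, key=load)
                    servers[dst]["vms"].append(vm)
                    servers[src]["vms"].remove(vm)
                    migrations.append((vm, src, dst))
                if drained and not servers[src]["vms"]:
                    idle.append(src)  # can be switched to a power-saving state
            return migrations, idle

    A production policy would also cap the length of the migrations list, since the abstract notes that migrations and server switches consume energy of their own.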

    Service-Level-Driven Load Scheduling and Balancing in Multi-Tier Cloud Computing

    Cloud computing environments often deal with randomly arriving computational workloads that vary in resource requirements and carry high Quality of Service (QoS) obligations. A Service Level Agreement (SLA) governs the QoS obligations of the cloud service provider to the client. The provider's conundrum is maintaining a balance between the limited computing resources available and the high QoS requirements of varying, random computing demands. Any imbalance between these conflicting objectives may result in either dissatisfied clients, which can incur significant commercial penalties, or an over-provisioned cloud computing environment that is costly to acquire and operate. To respond to such client demands, cloud service providers organize the cloud computing environment as a multi-tier architecture. Each tier executes its designated tasks and passes them to the next tier, in a fashion similar, but not identical, to traditional job-shop environments. Each tier consists of multiple computing resources, and an optimization process must assign and schedule the appropriate tasks of a job on the tier's resources so as to meet the job's QoS expectations. Scheduling client workloads as they arrive at the multi-tier cloud environment so that they execute on time has therefore been a central issue in cloud computing. Various approaches have been presented in the literature to address this problem: Join-Shortest-Queue (JSQ), Join-Idle-Queue (JIQ), enhanced Round Robin (RR) and Least Connection (LC), as well as enhanced MinMin and MaxMin, to name a few. This thesis presents a service-level-driven load scheduling and balancing framework for multi-tier cloud computing. A model quantifies the penalty a cloud service provider incurs as a function of the jobs' total waiting time and QoS violations, which facilitates penalty mitigation in situations of high demand and resource shortage. The framework accounts for multi-tier job execution dependencies when capturing QoS violation penalties as client jobs progress through subsequent tiers, thus optimizing performance at the multi-tier level. Scheduling and balancing operations distribute client jobs across resources so that the total waiting time, and hence the SLA violations, of client jobs are minimized. Optimal job allocation and scheduling is an NP-hard combinatorial problem, and the dynamics of random job arrival make optimality even harder to achieve and maintain as new jobs enter the environment. The thesis therefore proposes queue virtualization as an abstraction that allows jobs to migrate between resources within a given tier, as well as to be re-sequenced within a given resource, during the optimization process. Given the computational complexity of the job allocation and scheduling problem, a genetic algorithm is proposed to seek optimal solutions, with queue virtualization serving as the basis for defining the chromosome structure and operations. As computing jobs vary in delay tolerance, two SLA scenarios are tackled: equal cost of time delays and differentiated cost of time delays. Experimental work investigates the performance of the proposed approach at both the tier and system levels.
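
    As a minimal sketch of the penalty model described above (assumed linear in lateness; the thesis's actual formulation may differ), the Python function below charges each job for the amount by which its total waiting time exceeds its SLA bound, covering both the equal-cost and differentiated-cost scenarios.

        # Hypothetical SLA penalty model (assumed linear in lateness): each job
        # is charged for the time its total wait exceeds its SLA bound.

        def sla_penalty(jobs, differentiated=False):
            """jobs: list of {"wait": total waiting time across tiers,
            "deadline": SLA bound, "rate": per-unit lateness cost}.
            With differentiated=False, every job pays the same unit rate."""
            total = 0.0
            for job in jobs:
                lateness = max(0.0, job["wait"] - job["deadline"])
                total += (job["rate"] if differentiated else 1.0) * lateness
            return total

    A genetic algorithm over virtualized queues would then use such a penalty as the fitness to minimize, with each chromosome encoding a per-tier job-to-resource assignment and execution order.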