9 research outputs found

    QoS-Driven Job Scheduling: Multi-Tier Dependency Considerations

    Full text link
    For a cloud service provider, delivering optimal system performance while fulfilling Quality of Service (QoS) obligations is critical for maintaining a viable, profitable business. This goal is often hard to attain given the irregular nature of cloud computing jobs, which expect high QoS in an on-demand fashion, that is, upon random arrival. To optimize the response to such client demands, cloud service providers organize the cloud computing environment as a multi-tier architecture. Each tier executes its designated tasks and passes the job to the next tier, in a fashion similar, but not identical, to traditional job-shop environments. An optimization process must take place to schedule the appropriate tasks of the job on the resources of the tier, so as to meet the QoS expectations of the job. Existing approaches employ scheduling strategies that consider performance optimization at the individual resource level and produce optimal single-tier-driven schedules. Due to the sequential nature of the multi-tier environment, the impact of such schedules on the performance of other resources and tiers tends to be ignored, resulting in less than optimal performance when measured at the multi-tier level. In this paper, we propose a multi-tier-oriented job scheduling and allocation technique. The scheduling and allocation process is formulated as a problem of assigning jobs to the resource queues of the cloud computing environment, where each resource of the environment employs a queue to hold the jobs assigned to it. Since the scheduling problem is NP-hard, a biologically inspired genetic algorithm is proposed. The computing resources across all tiers of the environment are virtualized into one resource by means of a single virtual queue, and a chromosome that mimics the sequencing and allocation of the tasks in this virtual queue is proposed.
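The single-virtual-queue chromosome idea above can be sketched in a few lines. This is a hedged illustration, not the paper's exact encoding: the gene structure, the random initialization, and the total-waiting-time fitness function are all assumptions chosen to show how a permutation over one virtualized queue can represent both sequencing and allocation.

```python
import random

def make_chromosome(tasks, resources, rng):
    """A chromosome: an ordered list of (task, resource) genes over one
    virtual queue that spans all tiers' resources (illustrative encoding)."""
    order = list(tasks)
    rng.shuffle(order)                       # sequencing part of the gene
    return [(t, rng.choice(resources)) for t in order]  # allocation part

def fitness(chromosome, duration):
    """Total waiting time when each resource serves its queue in
    chromosome order; lower is fitter."""
    finish = {}    # resource -> time its queue frees up
    waiting = 0.0
    for task, res in chromosome:
        start = finish.get(res, 0.0)
        waiting += start                     # task waits until resource is free
        finish[res] = start + duration[task]
    return waiting

rng = random.Random(7)
tasks = ["t1", "t2", "t3", "t4"]
duration = {"t1": 2.0, "t2": 1.0, "t3": 3.0, "t4": 1.5}
resources = ["r1", "r2"]

# Evaluate a small random population and keep the fittest chromosome.
population = [make_chromosome(tasks, resources, rng) for _ in range(20)]
best = min(population, key=lambda c: fitness(c, duration))
print(fitness(best, duration))
```

A full genetic algorithm would add crossover and mutation over these genes; the sketch only shows the representation and fitness evaluation.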

    A Study on Load Management for Hierarchical Peer-to-Peer File Search

    Get PDF
    In a Peer-to-Peer (P2P) system, multiple interconnected peers or nodes contribute a portion of their resources (e.g., files, disk storage, network bandwidth) in order to inexpensively handle tasks that would normally require powerful servers. Since the emergence of P2P file sharing, load balancing has been a primary concern, alongside other issues such as autonomy, fault tolerance, and security. During file search, a heavily loaded peer may incur long latency or failure in query forwarding or responding. If there are many such peers in a system, they may cause link or path congestion and consequently degrade the performance of the overall system. To avoid such situations, some general techniques used in Web systems, such as caching and paging, have been adopted in P2P systems. However, these are highly insufficient for load balancing, since peers in P2P systems often exhibit high heterogeneity and dynamicity. To overcome this difficulty, the use of super-peers is currently the most promising approach to optimizing the allocation of system load to peers; that is, it allocates more system load to high-capacity, stable super-peers by assigning the tasks of index maintenance and retrieval to them. In this thesis, we focus on two kinds of super-peer-based hierarchical P2P architectures, distinguished by the organization of their super-peers. For each of them, we discuss system load allocation and propose novel load balancing algorithms for alleviating load imbalance among super-peers, aiming to decrease the average and variance of query response time during the index retrieval process.
More concretely, our contributions to load management for hierarchical P2P file search are the following: • In Qin's hierarchical architecture, indices of files held by the user peers in the bottom layer are stored at the super-peers in the middle layer, and the correlation between those two layers is controlled by the central server(s) in the top layer using the notion of tags. In Qin's system, a heavily loaded super-peer can move excess load to a lightly loaded super-peer through task migration. However, such a task migration approach is not sufficient to balance the load of super-peers if the size of tasks is highly imbalanced. To overcome this issue, we propose two task migration schemes for this architecture, aiming to ensure an even load distribution over the super-peers. The first scheme controls the load of each task in order to decrease the total cost of task migration. The second scheme directly balances the load over tasks by reordering the priority of the tags used in the query forwarding step. The effectiveness of the proposed schemes is evaluated by simulation. The simulation results indicate that the schemes work in coordination to alleviate bottlenecks at the super-peers. • In the DHT-based super-peer architecture, indices of files held by the user peers in the lower layer are stored at the DHT-connected super-peers in the upper layer. In DHT-based super-peer systems, the skewness of users' preferences regarding keywords contained in multi-keyword queries causes query load imbalance among super-peers, combining both routing and response load. Although index replication has great potential for alleviating this problem, existing schemes either did not explicitly address it or incurred high cost. To overcome this issue, we propose an integrated solution consisting of three replication schemes that alleviate query load imbalance while minimizing cost.
The first scheme is an active index replication that decreases routing load in the super-peer layer and distributes the response load of an index among the super-peers that store its replicas. The second scheme is a proactive pointer replication that places the location information of an index so as to reduce the maintenance cost between the index and its replicas. The third scheme is a passive index replication that bounds the maximum query load of super-peers. The simulation results indicate that the proposed schemes help alleviate the query load imbalance of super-peers. Moreover, comparison showed that our schemes are more cost-effective at placing replicas than other approaches. (Hiroshima University, Doctor of Engineering in Information Engineering)
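The passive replication idea above (bounding each super-peer's maximum query load) can be sketched as a simple greedy placement. This is an illustrative assumption, not the thesis's algorithm: the load cap, the even split of an index's query load across its replicas, and all names are invented for the sketch.

```python
def place_replicas(query_load, capacity_cap, peer_load):
    """Greedily replicate hot indices until no holder exceeds capacity_cap.
    query_load: index -> queries/sec at its current holder (hottest first).
    peer_load:  super-peer -> current total load (mutated in place)."""
    placements = {}
    for index, load in sorted(query_load.items(), key=lambda kv: -kv[1]):
        replicas = []
        # each replica takes an equal share of the index's query load
        while load / (1 + len(replicas)) > capacity_cap:
            target = min(peer_load, key=peer_load.get)  # least-loaded peer
            replicas.append(target)
            peer_load[target] += load / (1 + len(replicas))
        placements[index] = replicas
    return placements

peers = {"sp1": 10.0, "sp2": 2.0, "sp3": 4.0}
hot = {"idxA": 30.0, "idxB": 8.0}
plan = place_replicas(hot, capacity_cap=10.0, peer_load=peers)
print(plan)
```

Here the 30 queries/sec index gains two replicas so each holder serves 10, while the 8 queries/sec index already fits under the cap and gets none.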

    From reactive to proactive load balancing for task‐based parallel applications in distributed memory machines

    Get PDF
    Load balancing is often a challenge in task-parallel applications. Balancing problems are divided into static and dynamic. "Static" means that we have some prior knowledge about load information and perform balancing before execution, while "dynamic" must rely on partial information about the execution status to balance the load at runtime. Conventionally, work stealing is a practical approach used in almost all shared memory systems. In distributed memory systems, however, communication overhead can cause task stealing to come too late. To improve on this, a reactive approach has been proposed to relax communication when balancing load. The approach dedicates one thread per process to monitor queue status and offload tasks reactively from a slow to a fast process. However, reactive decisions might be mistaken in cases of high imbalance. First, this article proposes a performance model to analyze reactive balancing behavior and understand the bound that leads to incorrect decisions. Second, we introduce a proactive approach to further improve task balancing at runtime. The approach likewise exploits task-based programming models with a dedicated thread. The main idea, however, is to force this thread not only to monitor load, but also to characterize tasks and train load prediction models by online learning. "Proactive" indicates offloading tasks before each execution phase, with an appropriate number of tasks sent at once to a potential victim (an underloaded/fast process). The experimental results confirm speedup improvements in important use cases compared to previous solutions. Furthermore, this approach can support co-scheduling tasks across multiple applications.
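The proactive step above can be sketched as follows. This is a hedged illustration, not the article's trained model: the linear per-task cost predictor, the feature tuples, and the stop condition (move only while a move still shrinks the predicted gap) are all assumptions.

```python
def task_cost(weights, task):
    """Predicted execution time of one task (dot product of features)."""
    return sum(w * f for w, f in zip(weights, task))

def predict_load(weights, tasks):
    return sum(task_cost(weights, t) for t in tasks)

def proactive_offload(queues, weights):
    """Before a phase starts, move tasks from the predicted-slowest rank to
    the predicted-fastest rank (the victim) while each move still shrinks
    the predicted load gap; returns the number of tasks moved."""
    loads = {r: predict_load(weights, ts) for r, ts in queues.items()}
    slow = max(loads, key=loads.get)   # predicted overloaded rank
    fast = min(loads, key=loads.get)   # potential victim (underloaded rank)
    moved = 0
    while queues[slow]:
        cost = task_cost(weights, queues[slow][-1])
        if loads[slow] - loads[fast] <= cost:   # another move would overshoot
            break
        queues[fast].append(queues[slow].pop())
        loads[slow] -= cost
        loads[fast] += cost
        moved += 1
    return moved

# Two ranks; features are (task_size, tree_depth); weights are assumed.
weights = (1.0, 0.5)
queues = {0: [(4, 2)] * 4, 1: [(1, 1)]}
moved = proactive_offload(queues, weights)
print(moved)
```

In the real system the weights would be refitted online from measured task runtimes; here they are fixed for illustration.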

    Service-Level-Driven Load Scheduling and Balancing in Multi-Tier Cloud Computing

    Get PDF
    Cloud computing environments often deal with random-arrival computational workloads that vary in resource requirements and demand high Quality of Service (QoS) obligations. A Service Level Agreement (SLA) is employed to govern the QoS obligations of the cloud service provider to the client. The service provider's conundrum revolves around the desire to maintain a balance between the limited resources available for computing and the high QoS requirements of varying, random computing demands. Any imbalance in managing these conflicting objectives may result in either dissatisfied clients, which can incur potentially significant commercial penalties, or an over-provisioned cloud computing environment that can be significantly costly to acquire and operate. To optimize the response to such client demands, cloud service providers organize the cloud computing environment as a multi-tier architecture. Each tier executes its designated tasks and passes them to the next tier, in a fashion similar, but not identical, to traditional job-shop environments. Each tier consists of multiple computing resources, and an optimization process must take place to assign and schedule the appropriate tasks of the job on the resources of the tier, so as to meet the job's QoS expectations. Thus, scheduling clients' workloads as they arrive at the multi-tier cloud environment to ensure their timely execution has been a central issue in cloud computing. Various approaches have been presented in the literature to address this problem: Join-Shortest-Queue (JSQ), Join-Idle-Queue (JIQ), enhanced Round Robin (RR) and Least Connection (LC), as well as enhanced MinMin and MaxMin, to name a few. This thesis presents a service-level-driven load scheduling and balancing framework for multi-tier cloud computing. A model is used to quantify the penalty a cloud service provider incurs as a function of the jobs' total waiting time and QoS violations.
This model facilitates penalty mitigation in situations of high demand and resource shortage. The framework accounts for multi-tier job execution dependencies in capturing QoS violation penalties as the client jobs progress through subsequent tiers, thus optimizing performance at the multi-tier level. Scheduling and balancing operations are employed to distribute client jobs on resources such that the total waiting time and, hence, the SLA violations of client jobs are minimized. Optimal job allocation and scheduling is an NP-hard combinatorial problem. The dynamics of random job arrival make the optimality goal even harder to achieve and maintain as new jobs arrive at the environment. Thus, the thesis proposes a queue virtualization as an abstraction that allows jobs to migrate between resources within a given tier, as well as altering the sequencing of job execution within a given resource, during the optimization process. Given the computational complexity of the job allocation and scheduling problem, a genetic algorithm is proposed to seek optimal solutions. The queue virtualization is proposed as a basis for defining the chromosome structure and operations. As computing jobs tend to vary with respect to delay tolerance, two SLA scenarios are tackled, namely, equal cost of time delays and differentiated cost of time delays. Experimental work is conducted to investigate the performance of the proposed approach both at the tier and system level.
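A penalty model in the spirit described above can be sketched briefly. This is an illustrative assumption, not the thesis's exact formulation: the per-job delay weight covers the "differentiated cost of time delays" scenario (setting all weights equal gives the other scenario), and the flat violation charge is invented for the sketch.

```python
def provider_penalty(jobs, violation_charge=10.0):
    """Provider's penalty: weighted waiting time per job, plus a flat
    charge whenever a job finishes past its SLA deadline.
    jobs: list of dicts with arrival, start, finish, deadline, delay_cost."""
    total = 0.0
    for j in jobs:
        waiting = j["start"] - j["arrival"]
        total += j["delay_cost"] * waiting      # differentiated cost of delay
        if j["finish"] > j["deadline"]:         # SLA violation penalty
            total += violation_charge
    return total

jobs = [
    {"arrival": 0.0, "start": 1.0, "finish": 4.0, "deadline": 5.0, "delay_cost": 1.0},
    {"arrival": 0.0, "start": 3.0, "finish": 9.0, "deadline": 6.0, "delay_cost": 2.0},
]
print(provider_penalty(jobs))   # 1*1 + 2*3 + 10 = 17.0
```

A scheduler would use such a function as the objective to minimize when ordering and placing jobs across tiers.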

    Towards Automatic Load Balancing of a Parallel File System with Subfile Based Migration

    Get PDF
    Nowadays, scientific applications address complex problems in nature. As a consequence, they have a high demand for computational power and for an I/O infrastructure providing performant access to data. These demands are satisfied by various supercomputers in order to tackle grand challenges. From the operator's point of view, it is important to keep the available resources of such an expensive machine busy, while the end user is concerned about the runtime of the application. Evidently, it is vital to avoid idle times in an application and congestion of the network as well as of the I/O subsystem, to ensure maximum concurrency and thus efficiency. Load balancing is a technique that tackles these issues. While load balancing has been evaluated in detail for the computational parts of a program, analysis of load imbalance for complex storage environments in High Performance Computing must be addressed as well. Often, parallel file systems like Lustre, GPFS, or PVFS2 are deployed to meet the need for a fast I/O infrastructure. This thesis evaluates the impact of unbalanced workloads in such parallel file systems, using PVFS2 as an example, and extends the environment to allow dynamic (and adaptive) load balancing. Several cases leading to unbalanced workloads are discussed, namely unbalanced access patterns, inhomogeneous hardware, and rebuilds after crashes in an environment promising high availability. Important performance-related factors are described; this allows building simple performance models with which the impact of such load imbalances can be assessed. Potential countermeasures to fix these unbalanced workloads are discussed in the thesis. While most cases could be alleviated by static load balancing mechanisms, dynamic load balancing seems important to cope with environments with fluctuating performance characteristics.
In the thesis, extensions to the software environment are designed and realized that provide capabilities to detect bottlenecks and to fix them by moving data from higher-loaded servers to lower-loaded servers. To this end, further mechanisms are integrated into PVFS2 which allow and support dynamic decisions to move data by a load balancer. A first heuristic is implemented using the extensions to demonstrate how they can be used to build a dynamic load balancer. Experiments are run with balanced as well as unbalanced workloads to show the server behavior. A few experiments with the developed load balancer in a real environment are also conducted. These results reveal problematic issues and demonstrate that load balancing techniques can be successfully applied to increase productivity.
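A migration heuristic of the kind described (moving data from higher-loaded to lower-loaded servers) can be sketched as follows. This is a hedged illustration, not the thesis's implemented heuristic: the server/subfile representation, the imbalance threshold, and the rule of moving the largest subfile that still shrinks the gap are all assumptions.

```python
def rebalance(servers, threshold):
    """Repeatedly move a subfile from the most- to the least-loaded server
    while their load gap exceeds threshold. Returns the list of moves.
    servers: name -> list of (subfile, load); mutated in place."""
    moves = []
    def load(name):
        return sum(l for _, l in servers[name])
    while True:
        hot = max(servers, key=load)
        cold = min(servers, key=load)
        gap = load(hot) - load(cold)
        if gap <= threshold or not servers[hot]:
            break
        # a move shrinks the gap only if the subfile's load is below the gap
        candidates = [s for s in servers[hot] if s[1] < gap]
        if not candidates:
            break
        subfile = max(candidates, key=lambda s: s[1])  # biggest useful move
        servers[hot].remove(subfile)
        servers[cold].append(subfile)
        moves.append((subfile[0], hot, cold))
    return moves

state = {"s1": [("a", 6.0), ("b", 3.0)], "s2": [("c", 1.0)]}
moves = rebalance(state, threshold=2.0)
print(moves)
```

A real balancer would additionally weigh the I/O cost of each migration against the expected gain, as the thesis's performance models suggest.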

    Replication-based load balancing in distributed content-based publish/subscribe

    No full text
    In recent years, content-based publish/subscribe (pub/sub) has become a popular paradigm to decouple content producers and consumers for Internet-scale content services. Many real applications show that content workloads frequently exhibit very skewed distributions and incur unbalanced workloads. To balance the workloads, the literature on content-based pub/sub adopted a migration scheme (Mis) to move (a subset of) subscription filters from overloaded brokers to underloaded brokers. In this way, the publications that successfully match the moved filters are offloaded, leading to balanced workloads. Unfortunately, the Mis scheme cannot reduce the overall matching workloads. In the worst case, suppose that all brokers suffer from heavy workloads: Mis cannot find available brokers to offload the heavy workloads of the overloaded brokers, and fails to balance the workloads. To overcome this issue, the contribution of this paper is a set of novel load balancing algorithms, namely a similarity-based replication scheme (Sir). The novelty of Sir is that it not only balances the workloads of brokers but also reduces the overall workloads. Based on both simulation and emulation results, extensive experiments verify that Sir achieves much better performance than Mis, in terms of a 43.1% higher entropy value (i.e., more balanced workloads) and 46.39% lower workloads. © 2013 IEEE
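The core replication effect can be sketched with the entropy balance metric the abstract quotes. This is an illustrative assumption, not Sir itself: the even split of a hot broker's matching load across its replicas stands in for the similarity-based grouping, and the example loads are invented.

```python
import math

def entropy(loads):
    """Shannon entropy of the normalized broker-load distribution;
    higher means more balanced."""
    total = sum(loads)
    return -sum((l / total) * math.log2(l / total) for l in loads if l > 0)

def replicate(loads, hot_broker, replicas):
    """Split the hot broker's matching load evenly across itself and the
    brokers that received a replica of its hot filters."""
    share = loads[hot_broker] / (1 + len(replicas))
    loads = list(loads)
    loads[hot_broker] = share
    for r in replicas:
        loads[r] += share
    return loads

before = [90.0, 5.0, 5.0]               # one overloaded broker
after = replicate(before, hot_broker=0, replicas=[1, 2])
print(entropy(before), entropy(after))  # entropy rises after replication
```

Unlike migration, replication also lets similar filters be matched once per replica group, which is how Sir reduces overall work; the sketch only shows the balancing half.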

    Replication-based Load Balancing in Distributed Content-Based Publish /Subscribe Systems

    No full text
    In recent years, content-based Publish/Subscribe (pub/sub) has become a popular paradigm to decouple content producers and consumers for Internet-scale content services. Many real applications show that content workloads frequently follow very skewed distributions and incur unbalanced workloads. To balance the workloads, current content-based Publish/Subscribe systems normally adopt a migration scheme (Mis) to move (a subset of) subscription filters from overloaded brokers to underloaded brokers. In this way, the publications that successfully match the moved filters are offloaded, leading to balanced workloads. Unfortunately, the Mis scheme cannot reduce the overall matching workloads. In the worst case, suppose that all brokers suffer from heavy workloads: Mis cannot find available brokers to offload the heavy workloads of the overloaded brokers, and fails to balance their workloads. To overcome this issue, we develop a set of novel load balancing algorithms, namely a similarity-based replication scheme (Sir). The novelty of Sir is that it not only balances the workloads of brokers but also reduces the overall workloads. Based on both simulation and emulation results, extensive experiments verify that Sir achieves much better performance than Mis, in terms of a 43.10% higher entropy value (i.e., more balanced workloads) and 46.39% lower workloads

    Integrated Resource Management for Fog Networks

    No full text
    In this paper, we consider integrated resource management for fog networks, encompassing intelligent energy perception, service level agreement (SLA) planning, and replication-based hotspot offload (RHO). First, we propose an intelligent energy perception scheme which dynamically classifies the fog nodes into a hot set, a warm set, or a cold set, based on their load conditions. The fog nodes in the hot set are responsible for the quality of service (QoS) guarantee, while the fog nodes in the cold set are maintained in a low-energy state to reduce energy consumption. The fog nodes in the warm set are used to balance the QoS guarantee against energy consumption. Second, we propose an SLA mapping scheme which effectively identifies SLA elements with the same semantics. Third, we propose a replication-based load-balancing scheme, namely RHO, which leverages the skewed access patterns caused by hotspot services. In addition, RHO greatly reduces communication overhead because load conditions are updated only when the load variation exceeds a specific threshold. Finally, we use computer simulations to compare the performance of RHO with other schemes under a variety of load conditions. In summary, we propose a comprehensive and feasible solution that contributes to the integrated resource management of fog networks.
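The hot/warm/cold classification and the threshold-based load reporting described above can be sketched together. The utilization thresholds and the report delta below are illustrative assumptions, not the paper's tuned values.

```python
HOT, COLD = 0.7, 0.3   # assumed utilization thresholds for the three sets

def classify(nodes):
    """Partition fog nodes into hot/warm/cold sets by current load."""
    sets = {"hot": [], "warm": [], "cold": []}
    for name, load in nodes.items():
        if load >= HOT:
            sets["hot"].append(name)      # serves QoS-guaranteed requests
        elif load <= COLD:
            sets["cold"].append(name)     # parked in a low-energy state
        else:
            sets["warm"].append(name)     # balances QoS and energy saving
    return sets

def should_report(last_reported, current, delta=0.1):
    """Report load only when the variation exceeds the threshold,
    which is what keeps RHO's communication overhead low."""
    return abs(current - last_reported) > delta

nodes = {"f1": 0.9, "f2": 0.5, "f3": 0.1}
print(classify(nodes))
print(should_report(0.5, 0.52), should_report(0.5, 0.75))
```

In a running system the classification would be re-evaluated whenever a node's reported load changes, so the two functions work in tandem.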