
    Design and Evaluation of Resource Allocation and Job Scheduling Algorithms on Computational Grids

    The grid, an infrastructure for resource sharing, has shown its importance in many scientific applications that require tremendously high computational power. Grid computing enables the sharing, selection, and aggregation of resources for solving complex, large-scale scientific problems. Because grid resources are distributed, heterogeneous, and dynamic in nature, grid computing introduces a number of fascinating issues in resource management. Grid scheduling is the key issue: the scheduling system must meet the functional requirements of heterogeneous domains (user, application, and network), which are sometimes conflicting, and must also satisfy non-functional requirements such as reliability, efficiency, performance, effective resource utilization, and scalability. The overall aim of this research is therefore to introduce new grid scheduling algorithms, for both resource allocation and job scheduling, that enable highly efficient and effective utilization of resources when executing various applications. The four prime aspects of this work are: first, a model of the grid scheduling problem for a dynamic grid computing environment; second, the development of a new web-based simulator (SyedWSim) that enables grid users to conduct a statistical analysis of grid workload traces and provides a realistic basis for experimenting with resource allocation and job scheduling algorithms on a grid; third, a new grid resource allocation method of optimal computational cost, evaluated on synthetic and real workload traces against other allocation methods; and finally, new job scheduling algorithms of optimal performance with respect to parameters such as waiting time, turnaround time, response time, bounded slowdown, completion time, and stretch time. The issue is not only to develop new algorithms, but also to evaluate them on an experimental computational grid, using synthetic and real workload traces, alongside existing job scheduling algorithms. Experimental evaluation confirmed that the proposed grid scheduling algorithms possess a high degree of optimality in performance, efficiency, and scalability.
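    The performance parameters named in this abstract have standard definitions in the job scheduling literature. As a rough illustration (not code from the thesis), the Python sketch below computes waiting time, turnaround time, and bounded slowdown for jobs run first-come-first-served on a single resource; the Job fields and the 10-second slowdown threshold are assumptions.

```python
# Illustrative only: standard per-job scheduling metrics on a
# first-come-first-served, single-resource schedule. The Job fields
# and the 10 s bounded-slowdown threshold are assumptions.
from dataclasses import dataclass

@dataclass
class Job:
    submit: float   # submission time
    runtime: float  # execution time once started

def fcfs_metrics(jobs, tau=10.0):
    """Return (waiting, turnaround, bounded slowdown) per job."""
    clock, results = 0.0, []
    for job in sorted(jobs, key=lambda j: j.submit):
        start = max(clock, job.submit)
        end = start + job.runtime
        wait = start - job.submit
        turnaround = end - job.submit            # a.k.a. flow time
        bsld = max(1.0, turnaround / max(job.runtime, tau))
        results.append((wait, turnaround, bsld))
        clock = end
    return results

if __name__ == "__main__":
    jobs = [Job(0, 30), Job(5, 5), Job(6, 120)]
    for w, t, b in fcfs_metrics(jobs):
        print(f"wait={w:.0f}s turnaround={t:.0f}s bounded_slowdown={b:.2f}")
```

    Under these definitions, bounded slowdown caps the penalty that very short jobs would otherwise contribute to an average, which is why it appears alongside plain turnaround time in evaluations like the one described.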

    Predicting Scheduling Failures in the Cloud

    Cloud computing has emerged as a key technology for delivering and managing computing, platform, and software services over the Internet. Task scheduling algorithms play an important role in the efficiency of cloud computing services, as they aim to reduce the turnaround time of tasks and improve resource utilization. Several task scheduling algorithms have been proposed in the literature for cloud computing systems, the majority relying on the computational complexity of tasks and the distribution of resources. However, many tasks scheduled by these algorithms still fail because of unforeseen changes in the cloud environment. In this paper, using task execution and resource utilization data extracted from the execution traces of real-world applications at Google, we explore the possibility of predicting the scheduling outcome of a task using statistical models. If we can successfully predict task failures, we may be able to reduce the execution time of jobs by rescheduling failed tasks earlier (i.e., before their actual failing time). Our results show that statistical models can predict task failures with a precision of up to 97.4% and a recall of up to 96.2%. We simulate the potential benefits of such predictions using the GloudSim toolkit and find that they can improve the number of finished tasks by up to 40%. We also perform a case study using the Hadoop framework of Amazon Elastic MapReduce (EMR) and the jobs of a gene expression correlation analysis study from breast cancer research. We find that when the Hadoop scheduler is extended with our predictive models, the percentage of failed jobs can be reduced by up to 45%, with an overhead of less than 5 minutes.
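    The paper's exact features and models are not reproduced here, but the general recipe (extract per-task features from a trace, train a classifier to predict failure, then reschedule high-risk tasks early) can be sketched as follows; the feature names, the synthetic data, and the random-forest choice are illustrative assumptions.

```python
# Minimal sketch of trace-based task-failure prediction. The feature
# names, the synthetic labels, and the random-forest model are
# assumptions standing in for the paper's actual features and models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
n = 5000
# Hypothetical per-task features: requested CPU, mean CPU usage,
# mean memory usage, prior evictions of the same job, task priority.
X = rng.random((n, 5))
# Synthetic label: tasks overusing CPU relative to request fail more.
y = ((X[:, 1] - X[:, 0] + 0.3 * X[:, 3] + rng.normal(0, 0.1, n)) > 0.2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print(f"precision={precision_score(y_te, pred):.3f} "
      f"recall={recall_score(y_te, pred):.3f}")
```

    In a real deployment, the classifier's predicted failure probability would gate a rescheduling decision, for example resubmitting any task whose probability exceeds a threshold before it reaches its usual failing point.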

    A Proposed Scheduling Algorithm for IoT Applications in a Merged Environment of Edge, Fog, and Cloud

    With the rapid increase in Internet of Things (IoT) devices and applications, the ordinary cloud computing paradigm is quickly becoming outdated. The fog computing paradigm extends the services provided by a cloud to the edge of the network in order to satisfy the requirements of IoT applications, such as low latency, locality awareness, low network traffic, and mobility support. Task scheduling in a cloud-fog environment plays a major role in ensuring that diverse computational demands are met. However, finding an optimal solution for task scheduling in such an environment is exceedingly hard due to the diversity of IoT applications, the heterogeneity of computational resources, and the presence of multiple criteria. This study approaches the task scheduling problem with the aim of improving service quality and load balancing in a merged Edge-Fog-Cloud computing system. We propose a Multi-Objective Scheduling Algorithm (MOSA) that takes into account job characteristics and the utilization of different computational resources. The proposed solution is evaluated against three existing policies: LB, WRR, and MPSO. Numerical results show that the proposed algorithm improves the average response time while maintaining load balance relative to the three existing policies. Results obtained with real workloads validate these findings.
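    The abstract does not spell out MOSA's internals, so the sketch below only illustrates the general shape of weighted multi-objective placement across edge, fog, and cloud tiers: each candidate node is scored by a weighted sum of estimated response time and current load, and the task goes to the lowest-scoring node. The node attributes, the weights, and the normalization are assumptions.

```python
# Illustrative weighted multi-objective placement across edge/fog/cloud
# tiers. Node attributes, scoring weights, and the normalization are
# assumptions; the abstract does not specify MOSA's internals.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    mips: float        # processing capacity (million instructions/s)
    latency_ms: float  # network latency to the IoT device
    load: float        # current utilization in [0, 1]

def place(task_mi, nodes, w_time=0.6, w_load=0.4):
    """Pick the node minimizing weighted response time + load."""
    max_resp = max(m.latency_ms + 1000.0 * task_mi / m.mips for m in nodes)
    def score(n):
        exec_ms = 1000.0 * task_mi / n.mips   # estimated compute time
        resp = n.latency_ms + exec_ms         # estimated response time
        return w_time * resp / max_resp + w_load * n.load
    return min(nodes, key=score)

nodes = [Node("edge-1", 2000, 2, 0.7),
         Node("fog-1", 8000, 10, 0.4),
         Node("cloud-1", 40000, 80, 0.1)]
for mi in (50, 5000):  # a light task and a heavy task
    print(f"task of {mi} MI -> {place(mi, nodes).name}")
```

    With these made-up nodes, the light task lands on the lightly loaded fog node while the heavy task justifies the cloud's extra network latency, which is the kind of response-time versus load trade-off such schedulers must balance.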

    Learning Scheduling Algorithms for Data Processing Clusters

    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Our system, Decima, uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.
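    Decima's graph-embedding networks and training machinery are well beyond a snippet, but the core idea named in the abstract (a stochastic policy over runnable work, trained by policy gradients against a completion-time objective) can be shown in a toy form. Everything below, from the single job-size feature to the REINFORCE update, is a deliberately simplified assumption, not Decima's design.

```python
# Toy policy-gradient scheduler: a softmax policy over queued jobs,
# scored by a linear function of job size, trained with episodic
# REINFORCE to minimize total completion time. A sketch of the general
# RL-for-scheduling idea only, not Decima's GNN architecture.
import numpy as np

rng = np.random.default_rng(0)

def episode(theta, n_jobs=8):
    """Schedule jobs one at a time; return total completion time and
    the accumulated gradient of the episode's log-probability."""
    sizes = list(rng.exponential(5.0, n_jobs))  # job durations
    t = total_ct = grad = 0.0
    while sizes:
        feats = np.array(sizes)                 # per-job feature: size
        logits = theta * feats
        p = np.exp(logits - logits.max()); p /= p.sum()
        k = rng.choice(len(sizes), p=p)
        grad += feats[k] - p @ feats            # d/dtheta log pi(k)
        t += sizes.pop(k)
        total_ct += t                           # completion time of chosen job
    return total_ct, grad

theta, lr, baseline = 0.0, 1e-4, None
for _ in range(3000):
    ct, g = episode(theta)
    baseline = ct if baseline is None else 0.95 * baseline + 0.05 * ct
    theta += lr * (baseline - ct) * g           # ascend reward = -completion time
print(f"learned theta = {theta:.3f}  (negative favors short jobs, SJF-like)")
```

    A theta that drifts below zero reproduces shortest-job-first behavior, the classical rule for minimizing average completion time on one machine; Decima's contribution, per the abstract, is learning such policies over job dependency graphs and continuous stochastic arrivals, where no simple closed-form rule applies.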