A hyper-heuristic for adaptive scheduling in computational grids
In this paper we present the design and implementation of a hyper-heuristic for efficiently scheduling independent jobs in computational grids. An efficient scheduling of jobs to grid resources depends on many parameters, among others the characteristics of the resources and jobs (such as computing capacity, consistency of computing, workload, etc.). Moreover, these characteristics change over time due to the dynamic nature of the grid environment; therefore, the planning of jobs to resources should be done adaptively. Existing ad hoc scheduling methods (batch and immediate mode) have shown their efficacy for certain types of resource and job characteristics. However, as stand-alone methods, they are not able to produce the best planning of jobs to resources for different types of Grid resources and job characteristics. In this work we have designed and implemented a hyper-heuristic that uses a set of ad hoc (immediate and batch mode) scheduling methods to provide the scheduling of jobs to Grid resources according to the Grid and job characteristics. The hyper-heuristic is a high-level algorithm, which examines the state and characteristics of the Grid system (jobs and resources), and selects and applies the ad hoc method that yields the best planning of jobs. The resulting hyper-heuristic based scheduler can thus be used to develop network-aware applications that need efficient planning of jobs to resources. The hyper-heuristic has been tested and evaluated in a dynamic setting through a prototype of a Grid simulator. The experimental evaluation showed the usefulness of the hyper-heuristic for planning of jobs to resources as compared to planning without knowledge of the resource and job characteristics. Peer Reviewed. Postprint (author's final draft).
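The selection step described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two candidate heuristics (MCT as an immediate-mode method, Min-Min as a batch-mode method), the `etc` matrix of expected job times per machine, and the choice of makespan as the selection criterion. The sketch simply tries each candidate on the current system state and keeps the plan with the smallest makespan.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[int]]   # etc[j][m]: expected time of job j on machine m (assumed input)
Schedule = List[int]       # schedule[j]: machine assigned to job j


def makespan(etc: Matrix, ready: List[int], schedule: Schedule) -> int:
    """Finish time of the busiest machine under the given schedule."""
    finish = list(ready)
    for job, machine in enumerate(schedule):
        finish[machine] += etc[job][machine]
    return max(finish)


def mct(etc: Matrix, ready: List[int]) -> Schedule:
    """Immediate mode: place each job, in arrival order, on the machine
    that would complete it earliest."""
    finish = list(ready)
    schedule = []
    for row in etc:
        machine = min(range(len(finish)), key=lambda m: finish[m] + row[m])
        finish[machine] += row[machine]
        schedule.append(machine)
    return schedule


def min_min(etc: Matrix, ready: List[int]) -> Schedule:
    """Batch mode: repeatedly pick the (job, machine) pair with the
    smallest completion time among all pending jobs."""
    finish = list(ready)
    schedule = [-1] * len(etc)
    pending = set(range(len(etc)))
    while pending:
        _, job, machine = min(
            (finish[m] + etc[j][m], j, m)
            for j in pending
            for m in range(len(finish))
        )
        finish[machine] += etc[job][machine]
        schedule[job] = machine
        pending.remove(job)
    return schedule


def hyper_heuristic(
    etc: Matrix,
    ready: List[int],
    heuristics: Dict[str, Callable[[Matrix, List[int]], Schedule]],
) -> Tuple[str, Schedule]:
    """Run every candidate heuristic on the current state and return the
    name and schedule of the one with the smallest makespan (assumed criterion)."""
    plans = {name: h(etc, ready) for name, h in heuristics.items()}
    best = min(plans, key=lambda name: makespan(etc, ready, plans[name]))
    return best, plans[best]


etc = [[10, 10], [1, 1], [1, 1]]   # three jobs, two machines (made-up numbers)
ready = [0, 0]                     # both machines idle
name, plan = hyper_heuristic(etc, ready, {"mct": mct, "min_min": min_min})
```

A real scheduler would rerun this selection whenever the grid state changes, and could inspect job and resource characteristics directly instead of trial-running each method.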
Survey on job scheduling mechanisms in grid environment
Grid systems provide geographically distributed resources for both computationally intensive and data-intensive applications. These applications generate large data sets. However, the high latency imposed by the underlying technologies upon which the grid system is built (such as the Internet and WWW) impedes effective access to such huge and widely distributed data. To minimize this impediment, jobs need to be scheduled across grid environments to achieve efficient data access. Scheduling multiple data requests submitted by grid users onto the grid environment is NP-hard. Thus, there is no best scheduling algorithm that cuts across all grid computing environments. Job scheduling is one of the key research areas in grid computing. In the recent past, many researchers have proposed different mechanisms to help schedule user jobs in grid systems. Characteristic features of the grid components, such as machine types and the nature of the jobs at hand, mean that an appropriate scheduling algorithm must be chosen to match a given grid environment. The aim of scheduling is to achieve the maximum possible system throughput and to match the application needs with the available computing resources. This paper is motivated by the need to explore the various job scheduling techniques alongside their areas of implementation. The paper systematically analyzes the strengths and weaknesses of some selected approaches in the area of grid job scheduling. This helps researchers better understand the concept of scheduling and can contribute to developing more efficient and practical scheduling algorithms. It will also help interested researchers carry out further work in this dynamic research area.
Fair Resource Sharing for Dynamic Scheduling of Workflows on Heterogeneous Systems
Scheduling independent workflows on shared resources in a way that satisfies users' Quality of Service is a significant challenge. In this study, we describe methodologies for off-line scheduling, where a schedule is generated for a set of known workflows, and on-line scheduling, where users can submit workflows at any moment in time. We consider the on-line scheduling problem in more detail and present performance comparisons of state-of-the-art algorithms for a realistic model of a heterogeneous system.
Personal mobile grids with a honeybee inspired resource scheduler
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The overall aim of the thesis has been to introduce Personal Mobile Grids (PM-Grids) as a novel paradigm in grid computing that scales grid infrastructures to mobile devices and extends grid entities to individual personal users. In this thesis, architectural designs as well as simulation models for PM-Grids are developed.
The core of any grid system is its resource scheduler. However, virtually all current conventional grid schedulers do not address the non-clairvoyant scheduling problem, where job information is not available before the end of execution. Therefore, this thesis proposes a honeybee inspired resource scheduling heuristic for PM-Grids (HoPe) incorporating a radical approach to grid resource scheduling to tackle this problem. A detailed design and implementation of HoPe with a decentralised self-management and adaptive policy are initiated.
Among the other main contributions are a comprehensive taxonomy of grid systems as well as a detailed analysis of the honeybee colony and its nectar acquisition process (NAP), from the resource scheduling perspective, which have not been presented in any previous work, to the best of our knowledge.
PM-Grid designs and HoPe implementation were evaluated thoroughly through a strictly controlled empirical evaluation framework with a well-established heuristic in high throughput computing, the opportunistic scheduling heuristic (OSH), as a benchmark algorithm. Comparisons with optimal values and worst bounds are conducted to gain a clear insight into HoPe behaviour, in terms of stability, throughput, turnaround time and speedup, under different running conditions of number of jobs and grid scales.
Experimental results demonstrate the superiority of HoPe performance, where it has successfully maintained optimum stability and throughput in more than 95% of the experiments, with HoPe performing three times better than the OSH under extremely heavy loads. Regarding the turnaround time and speedup, HoPe has effectively achieved less than 50% of the turnaround time incurred by the OSH, while doubling its speedup in more than 60% of the experiments.
These results indicate the potential of both PM-Grids and HoPe in realising futuristic grid visions. Therefore, the deployment of PM-Grids in real-life scenarios and the utilisation of HoPe in other parallel processing and high throughput computing systems are recommended.
A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment
We adopt a systematic approach to investigate the efficiency of near-optimal deployment of large-scale CPU-intensive Bag-of-Task (BoT) applications running on cloud resources with non-proportional cost-to-performance ratios. Our analytical solutions apply whether the running time of the given application is known or unknown. They try to optimize users' utility by choosing the most desirable tradeoff between the make-span and the total incurred expense. We propose a schema to provide a near-optimal deployment of a BoT application according to users' preferences. Our approach is to provide the user with a set of Pareto-optimal solutions, from which she may select one of the possible scheduling points based on her internal utility function. Our framework can also cope with uncertainty in the tasks' execution time using two methods. First, an estimation method based on Monte Carlo sampling, called the AA algorithm, is presented. It uses the minimum possible number of samples to predict the average task running time. Second, assuming that we have access to some code analyzer, code profiling or estimation tools, a hybrid method is presented that evaluates the accuracy of each estimation tool over certain time intervals to improve resource allocation decisions. We propose approximate deployment strategies that run on a hybrid cloud. In essence, the proposed strategies first determine either an estimated or an exact optimal schema based on the information provided by the user and environmental parameters. Then, we exploit dynamic methods to assign tasks to resources so as to approach the optimal schema as closely as possible, using two methods: a fast yet simple method based on the First Fit Decreasing algorithm, and a more complex approach based on an approximate solution of the problem transformed into a subset sum problem. Extensive experimental results obtained on a hybrid cloud platform confirm that our framework can deliver a near-optimal solution respecting the user's utility function.
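For readers unfamiliar with First Fit Decreasing, which the abstract above names as the basis of its simpler assignment method, here is a minimal generic sketch. How the paper maps tasks and cloud resources onto it is not stated in the abstract; treating each resource as a bin with a fixed time budget (`capacity`) and each task as an item sized by its running time is an assumption made here for illustration.

```python
from typing import List


def first_fit_decreasing(task_times: List[int], capacity: int) -> List[List[int]]:
    """Pack tasks into as few bins as the heuristic finds.

    Each bin stands for one resource with a time budget of `capacity`
    (a hypothetical stand-in for a target make-span). Tasks are taken
    longest first and placed in the first bin that still has room.
    """
    bins: List[List[int]] = []
    loads: List[int] = []
    for t in sorted(task_times, reverse=True):
        if t > capacity:
            raise ValueError(f"task of length {t} exceeds capacity {capacity}")
        for i, load in enumerate(loads):
            if load + t <= capacity:
                bins[i].append(t)
                loads[i] += t
                break
        else:
            # No existing bin can hold the task, so open a new one.
            bins.append([t])
            loads.append(t)
    return bins


packed = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
# packed == [[8, 2], [4, 4, 1, 1]]
```

Sorting longest first is what distinguishes this from plain First Fit: large tasks are placed while bins are still empty, which tends to leave fewer awkward gaps.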
Offline Scheduling of Map and Reduce Tasks on Hadoop Systems
MapReduce is a model for managing massive quantities of data. It is based on the distributed and parallel execution of tasks over a cluster of machines. Hadoop is an implementation of the MapReduce model; it is used to offer Big Data services on the cloud. In this paper, we present the scheduling problem on Hadoop systems. We focus on offline scheduling, express the problem as a mathematical model, and use a time-indexed formulation. We aim to consider as many constraints of the MapReduce environment as possible. Solutions for the presented model would serve as a reference for on-line schedules in the case of small and medium instances. Our work is useful in terms of problem definition: constraints are based on observations and take into account resource consumption, data locality, heterogeneous machines and workflow management. This paper defines reference bounds for evaluating the online model.
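As background on what a time-indexed formulation looks like, the following is a generic textbook-style sketch and not the paper's model, which additionally covers data locality, heterogeneous machines and workflow constraints. The symbols are assumptions introduced here: $x_{jt} = 1$ if task $j$ starts at time slot $t$, $p_j$ is its processing time, $m$ is the number of identical slots available, $T$ is the planning horizon, and $C_{\max}$ is the makespan.

```latex
\begin{align}
\min\quad & C_{\max} \\
\text{s.t.}\quad
  & \sum_{t=0}^{T-p_j} x_{jt} = 1
      && \forall j
      && \text{(each task starts exactly once)} \\
  & \sum_{j} \sum_{s=\max(0,\,t-p_j+1)}^{t} x_{js} \le m
      && \forall t
      && \text{(at most $m$ tasks run in slot $t$)} \\
  & \sum_{t=0}^{T-p_j} (t + p_j)\, x_{jt} \le C_{\max}
      && \forall j
      && \text{(makespan bounds every completion time)} \\
  & x_{jt} \in \{0, 1\}
      && \forall j,\ t
\end{align}
```

The number of variables grows with the horizon $T$, which is consistent with the abstract's remark that exact solutions serve as references for small and medium instances.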
Improving Real-Time Data Dissemination Performance by Multi Path Data Scheduling in Data Grids
The performance of data grids for data-intensive, real-time applications is highly dependent on the data dissemination algorithm employed in the system. Motivated by this fact, this study first formally defines the real-time splittable data dissemination problem (RTS/DDP), where data transfer requests can be routed over multiple paths to maximize the number of data transfers completed before their deadlines. Since RTS/DDP is proved to be NP-hard, four different heuristic algorithms, namely kSP/ESMP, kSP/BSMP, kDP/ESMP, and kDP/BSMP, are proposed. The performance of these heuristic algorithms is analyzed through an extensive set of data grid system simulation scenarios. The simulation results reveal that a performance increase of up to 8% over a very competitive single-path data dissemination algorithm is possible.
Classification and Performance Study of Task Scheduling Algorithms in Cloud Computing Environment
Cloud computing is becoming very common in recent years and is growing rapidly due to its attractive benefits and features such as resource pooling, accessibility, availability, scalability, reliability, cost saving, security, flexibility, on-demand services, pay-per-use services, use from anywhere, quality of service, resilience, etc. With this rapid growth of cloud computing, there may exist too many users that require services or need to execute their tasks simultaneously by resources provided by service providers. To get these services with the best performance, and minimum cost, response time, makespan, effective use of resources, etc. an intelligent and efficient task scheduling technique is required and considered as one of the main and essential issues in the cloud computing environment. It is necessary for allocating tasks to the proper cloud resources and optimizing the overall system performance. To this end, researchers put huge efforts to develop several classes of scheduling algorithms to be suitable for the various computing environments and to satisfy the needs of the various types of individuals and organizations. This research article provides a classification of proposed scheduling strategies and developed algorithms in cloud computing environment along with the evaluation of their performance. A comparison of the performance of these algorithms with existing ones is also given. Additionally, the future research work in the reviewed articles (if available) is also pointed out. This research work includes a review of 88 task scheduling algorithms in cloud computing environment distributed over the seven scheduling classes suggested in this study. Each article deals with a novel scheduling technique and the performance improvement it introduces compared with previously existing task scheduling algorithms. 
Keywords: Cloud computing, Task scheduling, Load balancing, Makespan, Energy-aware, Turnaround time, Response time, Cost of task, QoS, Multi-objective. DOI: 10.7176/IKM/12-5-03 Publication date: September 30th 2022