Hybrid Meta-heuristic Algorithms for Static and Dynamic Job Scheduling in Grid Computing
The term ’grid computing’ is used to describe an infrastructure that connects geographically
distributed computers and heterogeneous platforms owned by multiple organizations
allowing their computational power, storage capabilities and other resources to be selected
and shared. Allocating jobs to computational grid resources in an efficient manner is one
of the main challenges facing any grid computing system; this allocation is called job
scheduling in grid computing. This thesis studies the application of hybrid meta-heuristics
to the job scheduling problem in grid computing, which is recognized as being one of
the most important and challenging issues in grid computing environments. Similar to
job scheduling in traditional computing systems, this allocation is known to be an NP-hard
problem. Meta-heuristic approaches such as the Genetic Algorithm (GA), Variable
Neighbourhood Search (VNS) and Ant Colony Optimisation (ACO) have all proven their
effectiveness in solving different scheduling problems. However, hybridising two or more
meta-heuristics shows better performance than applying a stand-alone approach. The new
high level meta-heuristic will inherit the best features of the hybridised algorithms, increasing
the chances of skipping away from local minima, and hence enhancing the overall
performance. In this thesis, the application of VNS for the job scheduling problem in grid
computing is introduced. Four new neighbourhood structures, together with a modified
local search, are proposed. The proposed VNS is hybridised using two meta-heuristic
methods, namely GA and ACO, in loosely and strongly coupled fashions, yielding four
new sequential hybrid meta-heuristic algorithms for the problem of static and dynamic
single-objective independent batch job scheduling in grid computing. For the static version
of the problem, several experiments were carried out to analyse the performance of the
proposed schedulers in terms of minimising the makespan using well known benchmarks.
The experiments show that the proposed schedulers achieved impressive results compared
to other traditional, heuristic and meta-heuristic approaches selected from the bibliography.
To model the dynamic version of the problem, a simple simulator, which uses
the rescheduling technique, is designed and new problem instances are generated, by
using a well-known methodology, to evaluate the performance of the proposed hybrid
schedulers. The experimental results show that the use of rescheduling provides significant
improvements in terms of the makespan compared to other non-rescheduling approaches.
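As a rough illustration of the variable neighbourhood search idea described in this abstract, the following is a minimal sketch for independent jobs with makespan as the objective. It is not the thesis's algorithm: the two neighbourhoods (`shake_move`, `shake_swap`), the first-improvement `local_search`, and the `etc` matrix of estimated job times per machine are all assumptions made for the example, and the thesis's four neighbourhood structures are not reproduced here.

```python
import random

def makespan(assign, etc, n_machines):
    """Largest machine load; assign[j] is the machine chosen for job j."""
    load = [0.0] * n_machines
    for job, m in enumerate(assign):
        load[m] += etc[job][m]
    return max(load)

def shake_move(assign, n_machines, rng):
    """Hypothetical neighbourhood 1: reassign one random job."""
    new = assign[:]
    new[rng.randrange(len(new))] = rng.randrange(n_machines)
    return new

def shake_swap(assign, n_machines, rng):
    """Hypothetical neighbourhood 2: swap the machines of two random jobs."""
    new = assign[:]
    i, j = rng.randrange(len(new)), rng.randrange(len(new))
    new[i], new[j] = new[j], new[i]
    return new

def local_search(assign, etc, n_machines):
    """Move single jobs while any move strictly lowers the makespan."""
    best, best_cost = assign[:], makespan(assign, etc, n_machines)
    improved = True
    while improved:
        improved = False
        for job in range(len(best)):
            for m in range(n_machines):
                if m == best[job]:
                    continue
                cand = best[:]
                cand[job] = m
                cost = makespan(cand, etc, n_machines)
                if cost < best_cost:
                    best, best_cost, improved = cand, cost, True
    return best, best_cost

def vns(etc, n_machines, iterations=50, seed=0):
    """Basic VNS loop: shake in neighbourhood k, descend, restart at k=0 on improvement."""
    rng = random.Random(seed)
    current = [rng.randrange(n_machines) for _ in etc]
    current_cost = makespan(current, etc, n_machines)
    neighbourhoods = [shake_move, shake_swap]
    for _ in range(iterations):
        k = 0
        while k < len(neighbourhoods):
            candidate = neighbourhoods[k](current, n_machines, rng)
            candidate, cost = local_search(candidate, etc, n_machines)
            if cost < current_cost:
                current, current_cost, k = candidate, cost, 0
            else:
                k += 1
    return current, current_cost

if __name__ == "__main__":
    etc = [[3, 5], [2, 4], [6, 1], [4, 4]]  # illustrative job-by-machine times
    print(vns(etc, n_machines=2))
```

Switching to the next neighbourhood only when the current one fails to improve is what lets the search leave a local minimum of one neighbourhood through another.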
Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment
Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in a cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving a task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule-based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing.
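To make two of the heuristics named above concrete, here is a minimal sketch of Min-min and Max-min in their usual textbook form. It is not the paper's implementation: the function name `schedule`, the `etc` matrix (estimated time of each task on each machine), and the tie-breaking by lowest task index are assumptions for illustration.

```python
def schedule(etc, largest_first=False):
    """Min-min (default) or Max-min (largest_first=True) on an ETC matrix.

    etc[t][m] is the estimated time to run task t on machine m.
    Returns (assignment, makespan), where assignment maps task -> machine.
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine becomes free
    unscheduled = set(range(len(etc)))
    assignment = {}
    while unscheduled:
        candidates = []
        for t in sorted(unscheduled):
            # Machine giving this task its minimum completion time.
            m = min(range(n_machines), key=lambda j: ready[j] + etc[t][j])
            candidates.append((ready[m] + etc[t][m], t, m))
        # Min-min picks the smallest of these minima, Max-min the largest.
        ct, t, m = max(candidates) if largest_first else min(candidates)
        assignment[t] = m
        ready[m] = ct
        unscheduled.remove(t)
    return assignment, max(ready)

if __name__ == "__main__":
    etc = [[3, 5], [2, 4], [6, 1]]      # illustrative values only
    print(schedule(etc))
    print(schedule(etc, largest_first=True))
```

Both variants compute the same per-task minimum completion times and differ only in which task is committed first. This is why Max-min tends to place long tasks early and Min-min tends to place short ones early.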
Scheduling of Dependent Tasks Application using Random Search Technique
Since the beginning of Grid computing, scheduling of dependent-task applications
has attracted the attention of researchers due to the NP-complete nature of the
problem. In a Grid environment, scheduling means deciding the assignment of tasks
to available resources. Scheduling in a Grid is challenging when the tasks have
dependencies and resources are heterogeneous. The main objective in scheduling
dependent tasks is minimizing make-span. Due to the NP-complete nature of the
scheduling problem, exact methods cannot generate schedules efficiently.
Therefore, researchers apply heuristic or random search techniques to get
optimal or near-optimal solutions to such problems. In this paper, we show
how a Genetic Algorithm can be used to solve the dependent task scheduling problem.
We describe how the initial population can be generated using random assignment and
height-based approaches. We also present the design of crossover and mutation
operators that enable scheduling of dependent-task applications without violating
dependency constraints. For the implementation of GA-based scheduling, we explore
and analyze the SimGrid and GridSim simulation toolkits. From the results, we found
that SimGrid is suitable, as it supports the SimDag API for DAG applications.
We found that the GA-based approach can generate a schedule for a dependent-task
application in reasonable time while trying to minimize make-span.
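One common way to read the height-based approach mentioned in this abstract is sketched below. It assumes a task's height is 0 for entry tasks and otherwise one more than the greatest height among its predecessors, so any ordering that lists lower heights first respects the dependencies. The chromosome layout (a task order plus a task-to-machine map) and all function names are assumptions for the example, not the paper's encoding.

```python
import random

def heights(preds):
    """preds[t] lists the tasks that must finish before t can start."""
    memo = {}
    def h(t):
        if t not in memo:
            memo[t] = 0 if not preds[t] else 1 + max(h(p) for p in preds[t])
        return memo[t]
    return {t: h(t) for t in preds}

def random_individual(preds, n_machines, rng):
    """Shuffle tasks within each height level, then assign machines at random."""
    levels = {}
    for t, lvl in heights(preds).items():
        levels.setdefault(lvl, []).append(t)
    order = []
    for lvl in sorted(levels):
        group = levels[lvl][:]
        rng.shuffle(group)              # tasks of equal height are independent
        order.extend(group)
    machines = {t: rng.randrange(n_machines) for t in preds}
    return order, machines

def is_valid(order, preds):
    """True if every task appears after all of its predecessors."""
    pos = {t: i for i, t in enumerate(order)}
    return all(pos[p] < pos[t] for t in preds for p in preds[t])

if __name__ == "__main__":
    dag = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}  # illustrative DAG
    rng = random.Random(0)
    population = [random_individual(dag, n_machines=2, rng=rng) for _ in range(5)]
    print(all(is_valid(order, dag) for order, _ in population))
```

Because two tasks of equal height can never depend on each other, shuffling inside a level gives diversity in the population without producing an invalid schedule.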
QoS-aware predictive workflow scheduling
This research lays the basis for QoS-aware predictive workflow scheduling. Its novel contributions will open up prospects for future research in handling complex big workflow applications with high uncertainty and dynamism. The results from the proposed workflow scheduling algorithm show significant improvement in terms of the performance and reliability of the workflow applications.
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research. Comment: 46 pages, 16 figures, Technical Report
Personal mobile grids with a honeybee inspired resource scheduler
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The overall aim of the thesis has been to introduce Personal Mobile Grids (PM-Grids)
as a novel paradigm in grid computing that scales grid infrastructures to mobile devices and extends grid entities to individual personal users. In this thesis, architectural designs as well as simulation models for PM-Grids are developed.
The core of any grid system is its resource scheduler. However, virtually all current conventional grid schedulers do not address the non-clairvoyant scheduling problem, where job information is not available before the end of execution. Therefore, this thesis proposes a honeybee inspired resource scheduling heuristic for PM-Grids (HoPe) incorporating a radical approach to grid resource scheduling to tackle this problem. A detailed design and implementation of HoPe with a decentralised self-management and adaptive policy are initiated.
Among the other main contributions are a comprehensive taxonomy of grid systems as well as a detailed analysis of the honeybee colony and its nectar acquisition process (NAP), from the resource scheduling perspective, which have not been presented in any previous work, to the best of our knowledge.
PM-Grid designs and HoPe implementation were evaluated thoroughly through a strictly controlled empirical evaluation framework with a well-established heuristic in high throughput computing, the opportunistic scheduling heuristic (OSH), as a benchmark algorithm. Comparisons with optimal values and worst bounds are conducted to gain a clear insight into HoPe behaviour, in terms of stability, throughput, turnaround time and speedup, under different running conditions of number of jobs and grid scales.
Experimental results demonstrate the superiority of HoPe performance where it
has successfully maintained optimum stability and throughput in more than 95%
of the experiments, with HoPe achieving three times better than the OSH under
extremely heavy loads. Regarding the turnaround time and speedup, HoPe has
effectively achieved less than 50% of the turnaround time incurred by the OSH, while doubling its speedup in more than 60% of the experiments.
These results indicate the potential of both PM-Grids and HoPe in realising futuristic grid visions. Therefore, considering the deployment of PM-Grids in real-life scenarios and the utilisation of HoPe in other parallel processing and high throughput computing systems is recommended.
A distributed platform for the volunteer execution of workflows on a local area network
Thesis submitted in fulfilment of the requirements for the Degree of Master of Science in Computer Science. Albatroz Engineering has developed a framework for overhead power line inspection data acquisition and analysis, which includes hardware and software. The framework’s software components include inspection data analysis and reporting tools, commonly known as the PLMI2 application/platform.
In PLMI2, the analysis of overhead power line maintenance inspection data consists
of a sequence of Automatic Tasks (ATs) interleaved with Manual Tasks (MTs). An AT
consists of a set of algorithms that receives as input one or more datasets, processes them and returns new datasets. In turn, an MT enables human supervisors (also known as line inspection operators) to correct, improve and validate the results of ATs. ATs run faster than MTs and in the overall work cycle, ATs take less than 10% of total processing time, but still take a few minutes. There is a data flow dependency among tasks, which can be modelled with a workflow, and even if MTs are omitted from this workflow, it is possible to carry out the sequence of ATs, postponing MTs.
In fact, if the computing cost and waiting time are negligible, it may be advantageous
to run ATs earlier in the workflow, prior to validation. To address this opportunity, Albatroz Engineering has invested in a new procedure to stream the data through all ATs
fully unattended.
Considering these scenarios, it could be useful to have a system capable of detecting
available workstations at a given instant and subsequently distribute the ATs to them.
In this way, operators could schedule the execution of future ATs for a given inspection data, while they are performing MTs of another.
The requirements of the system to implement fall within the field of Volunteer Computing
Systems and we will address some of the challenges posed by these kinds of systems,
namely host volatility and failures. Volunteer Computing is a type of distributed
computing which exploits idle CPU cycles from computing resources donated by volunteers and connected through the Internet/Intranet to compute large-scale simulations.
This thesis proposes and designs a new distributed task scheduling system in the context of Volunteer Computing Systems, able to schedule the ATs of PLMI2 and exploit
idle CPU cycles from workstations within the company’s local area network (LAN) to
accelerate the data analysis, being aware of data flow interdependencies.
To evaluate the proposed system, a prototype has been implemented, and the simulation
results have shown that it is scalable and supports fault tolerance of task execution
by employing the rescheduling mechanism.