343 research outputs found
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art Grid
workflow systems, but also identifies the areas that need further research.
Comment: 29 pages, 15 figures
QoS-aware predictive workflow scheduling
This research lays the basis for QoS-aware predictive workflow scheduling. Its novel contributions will open up prospects for future research in handling large, complex workflow applications with high uncertainty and dynamism. The results from the proposed workflow scheduling algorithm show significant improvement in the performance and reliability of workflow applications.
Partitioning workflow applications over federated clouds to meet non-functional requirements
PhD Thesis
With cloud computing, users can acquire computer resources when they need them
on a pay-as-you-go business model. Because of this, many applications are now being
deployed in the cloud, and there are many different cloud providers worldwide. Importantly,
all these various infrastructure providers offer services with different levels
of quality. For example, cloud data centres are governed by the privacy and security
policies of the country where the centre is located, while many organisations have
created their own internal "private cloud" to meet security needs.
With all these variations and uncertainties, application developers who decide to host their
system in the cloud face the issue of which cloud to choose to get the best operational
conditions in terms of price, reliability and security. The decision becomes even
more complicated if their application consists of a number of distributed components,
each with slightly different requirements.
Rather than trying to identify the single best cloud for an application, this thesis
considers an alternative approach, that is, combining different clouds to meet users'
non-functional requirements. Cloud federation offers the ability to distribute a single
application across two or more clouds, so that the application can benefit from the
advantages of each one of them. The key challenge for this approach is how to find the
distribution (or deployment) of application components which can yield the greatest
benefits. In this thesis, we tackle this problem and propose a set of algorithms, and a
framework, to partition a workflow-based application over federated clouds in order to
exploit the strengths of each cloud. The specific goal is to split a distributed application
structured as a workflow such that the security and reliability requirements of each
component are met, whilst the overall cost of execution is minimised.
To achieve this, we propose and evaluate a cloud broker for partitioning a workflow
application over federated clouds. The broker integrates with the e-Science Central
cloud platform to automatically deploy a workflow over public and private clouds.
We developed a deployment planning algorithm to partition a large workflow
application across federated clouds so as to meet security requirements and minimise the
monetary cost.
A more generic framework is then proposed to model, quantify and guide the partitioning
and deployment of workflows over federated clouds. This framework considers
the situation where changes in cloud availability (including cloud failure) arise during
workflow execution
A Novel Workload Allocation Strategy for Batch Jobs
The distribution of computational tasks across a diverse set of geographically distributed heterogeneous resources is a critical issue in the realisation of true computational grids. Conventionally, workload allocation algorithms are divided into static and dynamic approaches. Whilst dynamic approaches frequently outperform static schemes, they usually require the collection and processing of detailed system information at frequent intervals, a task that can be both time-consuming and unreliable in the real world. This paper introduces a novel workload allocation algorithm for optimally distributing the workload produced by the arrival of batches of jobs. Results show that, for the arrival of batches of jobs, this workload allocation algorithm outperforms other commonly used algorithms in the static case. A hybrid scheduling approach (using this workload allocation algorithm), where information about the speed of computational resources is inferred from previously completed jobs, is then introduced, and the efficiency of this approach is demonstrated using a real-world computational grid. These results are compared to the same workload allocation algorithm used in the static case, and it can be seen that this hybrid approach comprehensively outperforms the static approach.
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also help to provide an easy way for new practitioners to understand
this complex area of research.
Comment: 46 pages, 16 figures, Technical Report
Grid-centric scheduling strategies for workflow applications
Grid computing faces a great challenge because the resources are not localized, but distributed, heterogeneous and dynamic. Thus, it is essential to provide a set of programming tools that execute an application on the Grid resources with as little input from the user as possible. The thesis of this work is that Grid-centric scheduling techniques of workflow applications can provide good usability of the Grid environment by reliably executing the application on a large scale distributed system with good performance. We support our thesis with new and effective approaches in the following five aspects.
First, we modeled the performance of the existing scheduling approaches in a multi-cluster Grid environment. We implemented several widely-used scheduling algorithms and identified the best candidate. The study further introduced a new measurement, based on our experiments, which can improve the schedule quality of some scheduling algorithms by as much as 20-fold in a multi-cluster Grid environment.
Second, we studied the scalability of the existing Grid scheduling algorithms. To deal with Grid systems consisting of hundreds of thousands of resources, we designed and implemented a novel approach that performs explicit resource selection decoupled from scheduling. Our experimental evaluation confirmed that our decoupled approach can be scalable in such an environment without sacrificing the quality of the schedule by more than 10%.
Third, we proposed solutions to address the dynamic nature of Grid computing with a new cluster-based hybrid scheduling mechanism. Our experimental results collected from real executions on production clusters demonstrated that this approach produces programs running 30% to 100% faster than the other scheduling approaches we implemented on both reserved and shared resources.
Fourth, we improved the reliability of Grid computing by incorporating fault-tolerance and recovery mechanisms into the workflow application execution. Our experiments on a simulated multi-cluster Grid environment demonstrated the effectiveness of our approach and also characterized the three-way trade-off between reliability, performance and resource usage when executing a workflow application.
Finally, we reduced the large batch-queue wait time often found in production Grid clusters. We developed a novel approach to partition the workflow application and submit its parts judiciously to achieve less total batch-queue wait time. The experimental results derived from production site batch queue logs show that our approach can reduce total wait time by as much as 70%.
Our approaches combined can greatly improve the usability of Grid computing while increasing the performance of workflow applications on a multi-cluster Grid environment.