1,537 research outputs found

A Taxonomy of Workflow Management Systems for Grid Computing

Author: Buyya Rajkumar
Yu Jia
Publication date: 01/01/2005

With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art Grid workflow systems, but also identifies the areas that need further research.
Comment: 29 pages, 15 figures

arXiv.org e-Print Archive

CiteSeerX

A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

Author: Buyya Rajkumar
Ramamohanarao Kotagiri
Venugopal Srikumar
Publication date: 10/06/2005

Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research.
Comment: 46 pages, 16 figures, Technical Report

arXiv.org e-Print Archive

CiteSeerX

University of Melbourne Institutional Repository

Many-Task Computing and Blue Waters

Author: Armstrong Timothy G.
Katz Daniel S.
Wilde Michael
Wozniak Justin M.
Zhang Zhao
Publication date: 01/01/2012

This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.

arXiv.org e-Print Archive

CiteSeerX

Efficient Parallel Application Execution on Opportunistic Desktop Grids

Author: Alfredo Goldman
Daniel Batista
Fabio Costa
Fabio Kon
Francisco Silva
Raphael Camargo
Publication venue: IntechOpen
Publication date: 16/05/2012

IntechOpen

Survey On Fault Tolerance In Grid Computing

Publication venue: Academy and Industry Research Collaboration Center (AIRCC)

Crossref

DBRS: Directed Acyclic Graph based Reliable Scheduling Approach in Large Scale Computing

Author: Kumar Javalkar Dinesh
Singh Manjeet
Publication venue: Auricle Technologies, Pvt., Ltd.
Publication date: 30/11/2022

In large-scale environments, scheduling presents a significant challenge because it is an NP-hard problem. There are two types of tasks in execution: dependent tasks and independent tasks. The execution of dependent tasks must follow a strict order because the output of one activity is typically the input of another. In this paper, a reliable fault-tolerant approach is proposed for scheduling dependent tasks in large-scale computing environments. The workflow of dependent tasks is represented as a DAG (directed acyclic graph). The proposed methodology is evaluated over various parameters by applying it in a large-scale computing environment, namely grid computing. Grid computing is a high-performance computing paradigm for solving complex, large and data-intensive problems in various fields. The result analysis shows that the proposed DAG-based reliable scheduling (DBRS) approach improves system performance by decreasing the makespan and the number of failures and by increasing the performance improvement ratio (PIR).

International Journal on Recent and Innovation Trends in Computing and Communication

04451 Abstracts Collection -- Future Generation Grids

Author: Cosnard Michel
Getov Vladimir
Laforenza Domenico
Reinefeld Alexander
Publication venue: Dagstuhl Seminar Proceedings. 04451 - Future Generation Grids - FGG 2004
Publication date: 01/01/2005

The Dagstuhl Seminar 04451 "Future Generation Grid" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl from 1st to 5th November 2004. The focus of the seminar was on open problems and future challenges in the design of next generation Grid systems. A total of 45 participants presented their current projects, research plans, and new ideas in the area of Grid technologies. Several evening sessions with vivid discussions on future trends complemented the talks. This report gives an overview of the background and the findings of the seminar.

Dagstuhl Research Online Publication Server