334 research outputs found

    A Taxonomy of Workflow Management Systems for Grid Computing

    Full text link
    With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure

    Survey On Fault Tolerance In Grid Computing

    Full text link

    Grid-centric scheduling strategies for workflow applications

    Get PDF
    Grid computing faces a great challenge because the resources are not localized, but distributed, heterogeneous and dynamic. Thus, it is essential to provide a set of programming tools that execute an application on the Grid resources with as little input from the user as possible. The thesis of this work is that Grid-centric scheduling techniques of workflow applications can provide good usability of the Grid environment by reliably executing the application on a large scale distributed system with good performance. We support our thesis with new and effective approaches in the following five aspects. First, we modeled the performance of the existing scheduling approaches in a multi-cluster Grid environment. We implemented several widely-used scheduling algorithms and identified the best candidate. The study further introduced a new measurement, based on our experiments, which can improve the schedule quality of some scheduling algorithms as much as 20 fold in a multi-cluster Grid environment. Second, we studied the scalability of the existing Grid scheduling algorithms. To deal with Grid systems consisting of hundreds of thousands of resources, we designed and implemented a novel approach that performs explicit resource selection decoupled from scheduling Our experimental evaluation confirmed that our decoupled approach can be scalable in such an environment without sacrificing the quality of the schedule by more than 10%. Third, we proposed solutions to address the dynamic nature of Grid computing with a new cluster-based hybrid scheduling mechanism. Our experimental results collected from real executions on production clusters demonstrated that this approach produces programs running 30% to 100% faster than the other scheduling approaches we implemented on both reserved and shared resources. Fourth, we improved the reliability of Grid computing by incorporating fault- tolerance and recovery mechanisms into the workow application execution. Our experiments on a simulated multi-cluster Grid environment demonstrated the effectiveness of our approach and also characterized the three-way trade-off between reliability, performance and resource usage when executing a workflow application. Finally, we improved the large batch-queue wait time often found in production Grid clusters. We developed a novel approach to partition the workow application and submit them judiciously to achieve less total batch-queue wait time. The experimental results derived from production site batch queue logs show that our approach can reduce total wait time by as much as 70%. Our approaches combined can greatly improve the usability of Grid computing while increasing the performance of workow applications on a multi-cluster Grid environment

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Full text link
    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also helps to provide an easy way for new practitioners to understand this complex area of research.Comment: 46 pages, 16 figures, Technical Repor

    A Framework for an adaptive grid scheduling: an organizational perspective

    Get PDF
    Grid systems are complex computational organizations made of several interacting components evolving in an unpredictable and dynamic environment. In such context, scheduling is a key component and should be adaptive to face the numerous disturbances of the grid while guaranteeing its robustness and efficiency. In this context, much work remains at low-level focusing on the scheduling component taken individually. However, thinking the scheduling adaptiveness at a macro level with an organizational view, through its interactions with the other components, is also important. Following this view, in this paper we model a grid system as an agent-based organization and scheduling as a cooperative activity. Indeed, agent technology provides high level organizational concepts (groups, roles, commitments, interaction protocols) to structure, coordinate and ease the adaptation of distributed systems efficiently. More precisely, we make the following contributions. We provide a grid conceptual model that identifies the concepts and entities involved in the cooperative scheduling activity. This model is then used to define a typology of adaptation including perturbing events and actions to undertake in order to adapt. Then, we provide an organizational model, based on the Agent Group Role (AGR) meta-model of Freber, to support an adaptive scheduling at the organizational level. Finally, a simulator and an experimental evaluation have been realized to demonstrate the feasibility of our approach

    Workflow scheduling for service oriented cloud computing

    Get PDF
    Service Orientation (SO) and grid computing are two computing paradigms that when put together using Internet technologies promise to provide a scalable yet flexible computing platform for a diverse set of distributed computing applications. This practice gives rise to the notion of a computing cloud that addresses some previous limitations of interoperability, resource sharing and utilization within distributed computing. In such a Service Oriented Computing Cloud (SOCC), applications are formed by composing a set of services together. In addition, hierarchical service layers are also possible where general purpose services at lower layers are composed to deliver more domain specific services at the higher layer. In general an SOCC is a horizontally scalable computing platform that offers its resources as services in a standardized fashion. Workflow based applications are a suitable target for SOCC where workflow tasks are executed via service calls within the cloud. One or more workflows can be deployed over an SOCC and their execution requires scheduling of services to workflow tasks as the task become ready following their interdependencies. In this thesis heuristics based scheduling policies are evaluated for scheduling workflows over a collection of services offered by the SOCC. Various execution scenarios and workflow characteristics are considered to understand the implication of the heuristic based workflow scheduling

    BRAHMA : an intelligent framework for automated scaling of streaming and deadline-critical workflows

    Get PDF
    The prevalent use of multi-component, multi-tenant models for building novel Software-as-a-Service (SaaS) applications has resulted in wide-spread research on automatic scaling of the resultant complex application workflows. In this paper, we propose a holistic solution to Automatic Workflow Scaling under the combined presence of Streaming and Deadline-critical workflows, called AWS-SD. To solve the AWS-SD problem, we propose a framework BRAHMA, that learns workflow behavior to build a knowledge-base and leverages this info to perform intelligent automated scaling decisions. We propose and evaluate different resource provisioning algorithms through CloudSim. Our results on time-varying workloads show that the proposed algorithms are effective and produce good cost-quality trade-offs while preventing deadline violations. Empirically, the proposed hybrid algorithm combining learning and monitoring, is able to restrict deadline violations to a small fraction (3-5%), while only suffering a marginal increase in average cost per component of 1-2% over our baseline naive algorithm, which provides the least costly provisioning but suffers from a large number (35-45%) of deadline violations
    • …
    corecore