24,829 research outputs found

    A Taxonomy of Workflow Management Systems for Grid Computing

    Full text link
    With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure

    Bulk Scheduling with the DIANA Scheduler

    Full text link
    Results from the research and development of a Data Intensive and Network Aware (DIANA) scheduling engine, to be used primarily for data intensive sciences such as physics analysis, are described. In Grid analyses, tasks can involve thousands of computing, data handling, and network resources. The central problem in the scheduling of these resources is the coordinated management of computation and data at multiple locations and not just data replication or movement. However, this can prove to be a rather costly operation and efficient sing can be a challenge if compute and data resources are mapped without considering network costs. We have implemented an adaptive algorithm within the so-called DIANA Scheduler which takes into account data location and size, network performance and computation capability in order to enable efficient global scheduling. DIANA is a performance-aware and economy-guided Meta Scheduler. It iteratively allocates each job to the site that is most likely to produce the best performance as well as optimizing the global queue for any remaining jobs. Therefore it is equally suitable whether a single job is being submitted or bulk scheduling is being performed. Results indicate that considerable performance improvements can be gained by adopting the DIANA scheduling approach.Comment: 12 pages, 11 figures. To be published in the IEEE Transactions in Nuclear Science, IEEE Press. 200

    Q-Strategy: A Bidding Strategy for Market-Based Allocation of Grid Services

    Get PDF
    The application of autonomous agents by the provisioning and usage of computational services is an attractive research field. Various methods and technologies in the area of artificial intelligence, statistics and economics are playing together to achieve i) autonomic service provisioning and usage of Grid services, to invent ii) competitive bidding strategies for widely used market mechanisms and to iii) incentivize consumers and providers to use such market-based systems. The contributions of the paper are threefold. First, we present a bidding agent framework for implementing artificial bidding agents, supporting consumers and providers in technical and economic preference elicitation as well as automated bid generation by the requesting and provisioning of Grid services. Secondly, we introduce a novel consumer-side bidding strategy, which enables a goal-oriented and strategic behavior by the generation and submission of consumer service requests and selection of provider offers. Thirdly, we evaluate and compare the Q-strategy, implemented within the presented framework, against the Truth-Telling bidding strategy in three mechanisms – a centralized CDA, a decentralized on-line machine scheduling and a FIFO-scheduling mechanisms

    A Distributed Economics-based Infrastructure for Utility Computing

    Full text link
    Existing attempts at utility computing revolve around two approaches. The first consists of proprietary solutions involving renting time on dedicated utility computing machines. The second requires the use of heavy, monolithic applications that are difficult to deploy, maintain, and use. We propose a distributed, community-oriented approach to utility computing. Our approach provides an infrastructure built on Web Services in which modular components are combined to create a seemingly simple, yet powerful system. The community-oriented nature generates an economic environment which results in fair transactions between consumers and providers of computing cycles while simultaneously encouraging improvements in the infrastructure of the computational grid itself.Comment: 8 pages, 1 figur

    A batch scheduler with high level components

    Get PDF
    In this article we present the design choices and the evaluation of a batch scheduler for large clusters, named OAR. This batch scheduler is based upon an original design that emphasizes on low software complexity by using high level tools. The global architecture is built upon the scripting language Perl and the relational database engine Mysql. The goal of the project OAR is to prove that it is possible today to build a complex system for ressource management using such tools without sacrificing efficiency and scalability. Currently, our system offers most of the important features implemented by other batch schedulers such as priority scheduling (by queues), reservations, backfilling and some global computing support. Despite the use of high level tools, our experiments show that our system has performances close to other systems. Furthermore, OAR is currently exploited for the management of 700 nodes (a metropolitan GRID) and has shown good efficiency and robustness

    Managing Uncertainty: A Case for Probabilistic Grid Scheduling

    Get PDF
    The Grid technology is evolving into a global, service-orientated architecture, a universal platform for delivering future high demand computational services. Strong adoption of the Grid and the utility computing concept is leading to an increasing number of Grid installations running a wide range of applications of different size and complexity. In this paper we address the problem of elivering deadline/economy based scheduling in a heterogeneous application environment using statistical properties of job historical executions and its associated meta-data. This approach is motivated by a study of six-month computational load generated by Grid applications in a multi-purpose Grid cluster serving a community of twenty e-Science projects. The observed job statistics, resource utilisation and user behaviour is discussed in the context of management approaches and models most suitable for supporting a probabilistic and autonomous scheduling architecture
    • …
    corecore