11 research outputs found

    Scalable dimensioning of resilient Lambda Grids


    DIANA Scheduling Hierarchies for Optimizing Bulk Job Scheduling

    The use of meta-schedulers for resource management in large-scale distributed systems often leads to a hierarchy of schedulers. In this paper, we discuss why existing meta-scheduling hierarchies are sometimes not sufficient for Grid systems due to their inability to re-organise jobs already scheduled locally. Such a job re-organisation is required to adapt to evolving loads which are common in heavily used Grid infrastructures. We propose a peer-to-peer scheduling model and evaluate it using case studies and mathematical modelling. We detail the DIANA (Data Intensive and Network Aware) scheduling algorithm and its queue management system for coping with the load distribution and for supporting bulk job scheduling. We demonstrate that such a system is beneficial for dynamic, distributed and self-organizing resource management and can assist in optimizing load or job distribution in complex Grid infrastructures. Comment: 8 pages, 9 figures. Presented at the 2nd IEEE Int Conference on eScience & Grid Computing. Amsterdam Netherlands, December 200

    An efficient grid scheduling algorithm with fault tolerance and user satisfaction

    Problem Statement. Advances in human civilization lead to more complicated problems, and Grid computing serves as an efficient technology for solving them. In computational grids, the grid scheduler schedules each task and finds the appropriate resource for it. The scheduler must consider several factors such as user demand, communication time, failure handling mechanisms, and reduced makespan. Most of the existing algorithms do not consider user satisfaction. Thus a scheduling algorithm that handles failure of resources and achieves user satisfaction gains more importance. Approach. A new bicriteria scheduling algorithm (BSA) that considers user satisfaction along with fault tolerance has been introduced. The main contribution of this paper includes achieving user satisfaction along with fault tolerance and minimizing the makespan of jobs. Results. The performance of the proposed algorithm is evaluated using GridSim, based on makespan and the number of jobs completed successfully within the user deadline. Conclusions/Recommendations. The proposed BSA algorithm achieves reduced makespan and a better hit rate with higher user satisfaction and fault tolerance.

    DIANA Scheduling Hierarchies for Optimizing Bulk Job Scheduling

    The use of meta-schedulers for resource management in large-scale distributed systems often leads to a hierarchy of schedulers. In this paper, we discuss why existing meta-scheduling hierarchies are sometimes not sufficient for Grid systems due to their inability to re-organise jobs already scheduled locally. Such a job re-organisation is required to adapt to evolving loads which are common in heavily used Grid infrastructures. We propose a peer-to-peer scheduling model and evaluate it using case studies and mathematical modelling. We detail the DIANA (Data Intensive and Network Aware) scheduling algorithm and its queue management system for coping with the load distribution and for supporting bulk job scheduling. We demonstrate that such a system is beneficial for dynamic, distributed and self-organizing resource management and can assist in optimizing load or job distribution in complex Grid infrastructures.

    Paralelismo como “concern” en Java y su materialización en una herramienta de software

    Computing is undergoing a hardware revolution driven by the growing availability of multicore machines and distributed environments such as clusters and Grids. As a consequence, computational power is within easy reach, but many of today's programmers are not fully prepared to exploit parallelism in their applications so as to take full advantage of this new hardware. In particular, the Java language has helped mitigate the software heterogeneity inherent in programming sequential applications on these environments. Nevertheless, there is still a need for tools that parallelize applications in an easy and versatile way, so that a programmer with little experience in parallel programming can quickly run an application in parallel on several of these environments. Recently, one alternative proposed to achieve this is the concept of Parallelism as a “Concern” (PcC), which is based on ideas from aspect-oriented programming. This article lists and analyzes existing parallel programming tools, restricted to those implemented in Java, and presents a PcC-based alternative that aims to solve the problems of the tools analyzed. Sociedad Argentina de Informática e Investigación Operativ

    Fault-tolerant Scheduling of Fine-grained Tasks in Grid Environments

    Divide-and-conquer is a well-suited programming paradigm for parallel Grid applications. Our Satin system efficiently schedules the fine-grained tasks of a divide-and-conquer application across multiple clusters in a grid. To accommodate long-running applications, we present a fault-tolerance mechanism for Satin that has negligible overhead during normal execution, while minimizing the amount of redundant work done after a crash of one or more nodes. We study the impact of our fault-tolerance mechanism on application efficiency, both on the Dutch DAS-2 system and using the European testbed of the EC-funded project GridLab. © 2006 SAGE Publications

    Fault-tolerant Scheduling of Fine-grained Tasks in Grid Environments

    Divide-and-conquer is a well-suited programming paradigm for parallel Grid applications. Our Satin system efficiently schedules the fine-grained tasks of a divide-and-conquer application across multiple clusters in a grid. To accommodate long-running applications, we present a fault-tolerance mechanism for Satin that has negligible overhead during normal execution, while minimizing the amount of redundant work done after a crash of one or more nodes. We study the impact of our fault-tolerance mechanism on application efficiency, both on the Dutch DAS-2 system and using the European testbed of the EC-funded project GridLab.