11 research outputs found

Scalable dimensioning of resilient Lambda Grids

Author: De Leenheer Marc
De Turck Filip
Demeester Piet
Dhoedt Bart
Thysebaert Pieter
Volckaert Bruno
Publication venue
Publication date: 01/01/2007
Field of study

This article appeared in a journal published by Elsevier.

CiteSeerX

Ghent University Academic Bibliography

DIANA Scheduling Hierarchies for Optimizing Bulk Job Scheduling

Author: Ali A.
Alvi O.
Anjum A.
Hasham K.
McClatchey R.
Sagheer M.
Stockinger H.
Thomas M.
Willers I.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2006
Field of study

The use of meta-schedulers for resource management in large-scale distributed systems often leads to a hierarchy of schedulers. In this paper, we discuss why existing meta-scheduling hierarchies are sometimes not sufficient for Grid systems due to their inability to re-organise jobs already scheduled locally. Such a job re-organisation is required to adapt to evolving loads which are common in heavily used Grid infrastructures. We propose a peer-to-peer scheduling model and evaluate it using case studies and mathematical modelling. We detail the DIANA (Data Intensive and Network Aware) scheduling algorithm and its queue management system for coping with the load distribution and for supporting bulk job scheduling. We demonstrate that such a system is beneficial for dynamic, distributed and self-organizing resource management and can assist in optimizing load or job distribution in complex Grid infrastructures.

Comment: 8 pages, 9 figures. Presented at the 2nd IEEE Int Conference on eScience & Grid Computing, Amsterdam, Netherlands, December 200
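To make the "data intensive and network aware" idea concrete, the following is a minimal sketch of a site-selection rule that weighs local queue load, input data transfer time, and compute time. It is not the DIANA algorithm itself, which the abstract does not specify. The `Site` fields, the additive cost formula, and the example numbers are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    # All fields are hypothetical; the abstract does not define a site model.
    name: str
    queued_work_s: float   # seconds of work already waiting in the local queue
    bandwidth_mb_s: float  # usable bandwidth from the data source to this site
    speed_factor: float    # relative CPU speed (1.0 = reference machine)


def estimated_cost(site, input_mb, cpu_s):
    """Estimated completion time in seconds (assumed additive cost model)."""
    transfer_s = input_mb / site.bandwidth_mb_s
    compute_s = cpu_s / site.speed_factor
    return site.queued_work_s + transfer_s + compute_s


def pick_site(sites, input_mb, cpu_s):
    """Choose the site with the lowest estimated completion time."""
    return min(sites, key=lambda s: estimated_cost(s, input_mb, cpu_s))


sites = [
    Site("A", queued_work_s=600.0, bandwidth_mb_s=100.0, speed_factor=2.0),
    Site("B", queued_work_s=0.0, bandwidth_mb_s=5.0, speed_factor=1.0),
]

# A data-heavy job favours the well-connected site despite its longer queue.
best = pick_site(sites, input_mb=10_000.0, cpu_s=1_200.0)
```

Under this assumed model, a small job with little input data would instead go to the idle site B, which is the kind of load-dependent placement the abstract argues for.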

arXiv.org e-Print Archive

Crossref

Caltech Authors

An efficient grid scheduling algorithm with fault tolerance and user satisfaction

Author: N Kasthuri
P Keerthika
Publication venue
Publication date: 24/04/2020
Field of study

Problem Statement. Advances in human civilization lead to more complicated problems, and Grid computing is an efficient technology for solving them. In computational grids, the grid scheduler schedules the tasks and finds the appropriate resource for each task. The scheduler must consider several factors such as user demand, communication time, failure handling mechanisms, and reduced makespan. Most existing algorithms do not consider user satisfaction. Thus, a scheduling algorithm that handles resource failures and achieves user satisfaction becomes more important. Approach. A new bicriteria scheduling algorithm (BSA) that considers user satisfaction along with fault tolerance has been introduced. The main contributions of this paper include achieving user satisfaction along with fault tolerance and minimizing the makespan of jobs. Results. The performance of the proposed algorithm is evaluated using GridSim, based on makespan and the number of jobs completed successfully within the user deadline. Conclusions/Recommendations. The proposed BSA algorithm achieves reduced makespan and a better hit rate with higher user satisfaction and fault tolerance.
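The two evaluation metrics named in the abstract, makespan and the share of jobs finished within their deadline (the hit rate), can be computed for a given schedule as in the sketch below. This is not the BSA algorithm and does not use GridSim. It assumes each resource runs its jobs sequentially, that runtimes do not depend on the resource, and that no failures occur. The function and variable names are hypothetical.

```python
def evaluate(schedule, runtimes, deadlines):
    """Return (makespan, hit_rate) for a schedule.

    schedule  maps resource name -> ordered list of job ids
    runtimes  maps job id -> runtime (assumed identical on every resource)
    deadlines maps job id -> user deadline, measured from time 0
    """
    finish = {}
    for jobs in schedule.values():
        clock = 0.0
        for job in jobs:
            clock += runtimes[job]  # jobs on one resource run back to back
            finish[job] = clock

    makespan = max(finish.values(), default=0.0)
    hits = sum(1 for job, t in finish.items() if t <= deadlines[job])
    hit_rate = hits / len(finish) if finish else 0.0
    return makespan, hit_rate


runtimes = {"j1": 4.0, "j2": 2.0, "j3": 3.0}
deadlines = {"j1": 5.0, "j2": 5.0, "j3": 4.0}
schedule = {"r1": ["j1", "j2"], "r2": ["j3"]}

makespan, hit_rate = evaluate(schedule, runtimes, deadlines)
# j2 finishes at t=6 and misses its deadline of 5, so two of three jobs hit.
```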

CiteSeerX

DIANA Scheduling Hierarchies for Optimizing Bulk Job Scheduling

Author: Ali Arshad
Alvi Omer
Anjum Ashiq
Hasham Khawar
McClatchey Richard
Sagheer Muhammad
Stockinger Heinz
Thomas Michael
Willers Ian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2006
Field of study

The use of meta-schedulers for resource management in large-scale distributed systems often leads to a hierarchy of schedulers. In this paper, we discuss why existing meta-scheduling hierarchies are sometimes not sufficient for Grid systems due to their inability to re-organise jobs already scheduled locally. Such a job re-organisation is required to adapt to evolving loads which are common in heavily used Grid infrastructures. We propose a peer-to-peer scheduling model and evaluate it using case studies and mathematical modelling. We detail the DIANA (Data Intensive and Network Aware) scheduling algorithm and its queue management system for coping with the load distribution and for supporting bulk job scheduling. We demonstrate that such a system is beneficial for dynamic, distributed and self-organizing resource management and can assist in optimizing load or job distribution in complex Grid infrastructures.

Parallelism as a “concern” in Java and its materialization in a software tool [Paralelismo como “concern” en Java y su materialización en una herramienta de software]

Author: Fernández Mariano
Hirsch Matías
Publication venue
Publication date: 17/05/2023
Field of study

Computing is undergoing a hardware revolution driven by the growing availability of multicore machines and distributed environments such as clusters and Grids. As a result, computational power is within easy reach, but many of today's programmers are not fully prepared to exploit parallelism in their applications so as to get the most out of this new hardware. In particular, the Java language has helped mitigate the software heterogeneity inherent in programming sequential applications on these environments. Nevertheless, there is still a need for tools that parallelize applications easily and flexibly, so that a programmer with little experience in parallel programming can quickly run an application in parallel on several of these environments. One recently proposed alternative for achieving this is the concept of Parallelism as a “Concern” (PcC), which builds on ideas from aspect-oriented programming. This article lists and analyzes existing parallel programming tools, restricted to those implemented in Java, and presents a PcC-based alternative aimed at solving the problems of the analyzed tools.

Sociedad Argentina de Informática e Investigación Operativ

Servicio de Difusión de la Creación Intelectual

Fault-tolerant Scheduling of Fine-grained Tasks in Grid Environments

Author: Bal H.E.
Kielmann T.
Maassen J.
van Nieuwpoort R.V.
Wrzesinska G.
Publication venue
Publication date: 01/01/2006
Field of study

Divide-and-conquer is a well-suited programming paradigm for parallel Grid applications. Our Satin system efficiently schedules the fine-grained tasks of a divide-and-conquer application across multiple clusters in a grid. To accommodate long-running applications, we present a fault-tolerance mechanism for Satin that has negligible overhead during normal execution, while minimizing the amount of redundant work done after a crash of one or more nodes. We study the impact of our fault-tolerance mechanism on application efficiency, both on the Dutch DAS-2 system and using the European testbed of the EC-funded project GridLab. © 2006 SAGE Publications
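The sketch below illustrates the general idea of limiting redundant work in a divide-and-conquer computation: when a subtask is lost, only that subtask is re-executed, and results of sibling subtasks are kept. It is a sequential, single-process illustration and not Satin's mechanism, which the abstract does not describe. The crash is simulated, and `WorkerCrash`, `run_task`, `spawn`, and `crash_once` are hypothetical names.

```python
class WorkerCrash(Exception):
    """Simulated loss of the worker that was executing a subtask."""


def run_task(values, lo, hi, crash_once):
    """Sum values[lo:hi] by recursive splitting."""
    if (lo, hi) in crash_once:
        # Simulate a crash the first time this subtask is attempted.
        crash_once.remove((lo, hi))
        raise WorkerCrash((lo, hi))
    if hi - lo <= 2:
        return sum(values[lo:hi])  # small enough: solve directly
    mid = (lo + hi) // 2
    left = spawn(values, lo, mid, crash_once)
    right = spawn(values, mid, hi, crash_once)
    return left + right


def spawn(values, lo, hi, crash_once):
    """Run a subtask and re-execute it if its worker crashes.

    The retry is scoped to the lost subtask only, so already
    completed sibling results are not recomputed.
    """
    while True:
        try:
            return run_task(values, lo, hi, crash_once)
        except WorkerCrash:
            continue  # re-execute just this subtask


values = list(range(8))
crash_once = {(0, 4), (6, 8)}  # subtasks that fail on their first attempt
total = spawn(values, 0, 8, crash_once)
```

A real system would run subtasks on different nodes and detect crashes asynchronously; the retry-at-the-lost-subtask structure is the only point this example is meant to show.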

VU Research Portal

Fault-Tolerant Scheduling of Fine-Grained Tasks in Grid Environments

Author: Anderson T.
Baldeschwieler J.
Baratloo A.
Blumofe R.
Denis A.
Foster I.
Fredman S. I.
Gosia Wrzesińska
Goux J.-P.
Henri E. Bal
Jason Maassen
Lin F. C. H.
Litzkow M.
Rob V. van Nieuwpoort
Tamaki H.
Thilo Kielmann
van Nieuwpoort R. V.
Wrzesinska G.
Zhang L.
Publication venue: 'SAGE Publications'
Publication date
Field of study

Crossref

Fault-tolerant Scheduling of Fine-grained Tasks in Grid Environments

Author: Gosia Wrzesinska
Henri E. Bal
Jason Maassen
Rob V. Van Nieuwpoort
Thilo Kielmann
Publication venue
Publication date
Field of study

Divide-and-conquer is a well-suited programming paradigm for parallel Grid applications. Our Satin system efficiently schedules the fine-grained tasks of a divide-and-conquer application across multiple clusters in a grid. To accommodate long-running applications, we present a fault-tolerance mechanism for Satin that has negligible overhead during normal execution, while minimizing the amount of redundant work done after a crash of one or more nodes. We study the impact of our fault-tolerance mechanism on application efficiency, both on the Dutch DAS-2 system and using the European testbed of the EC-funded project GridLab.

CiteSeerX