11 research outputs found

    Discovering Job Preemptions in the Open Science Grid

    The Open Science Grid (OSG) is a worldwide computing system that facilitates distributed computing for scientific research. It can distribute a computationally intensive job to geo-distributed clusters and process the job's tasks in parallel. On OSG compute clusters, physical resources may be shared between OSG jobs and jobs submitted by the cluster's local users, with local jobs preempting OSG-based ones. As a result, job preemptions occur frequently in OSG, sometimes significantly delaying job completion. We have collected job data from OSG over a period of more than 80 days. We present an analysis of the data, characterizing the preemption patterns and the different types of jobs. Based on these observations, we group OSG jobs into 5 categories and analyze the runtime statistics for each category. We further choose different statistical distributions to estimate the probability density function of job runtime for the different classes. Comment: 8 pages

    Effective Scheduling of Grid Resources Using Failure Prediction

    In large-scale grid environments, accurate failure prediction is critical to achieving effective resource allocation while assuring specified QoS levels, such as reliability. Traditional methods, such as statistical estimation techniques, can be used to predict the reliability of resources. However, naive statistical methods often ignore critical characteristic behavior of the resources. In particular, periodic behaviors of grid resources are not captured well by statistical methods. In this paper, we present an alternative mechanism for failure prediction. In our approach, the periodic patterns of resource failures are determined and actively exploited for resource allocation with better QoS guarantees. The proposed scheme is evaluated in a realistic simulation environment of computational grids. The availability of computing resources is simulated according to a real trace collected from our large-scale monitoring experiment on campus computers. Our evaluation results show that the proposed approach enables significantly higher resource scheduling effectiveness under a variety of workloads compared to baseline approaches.

    Adaps – A three-phase adaptive prediction system for the run-time of jobs based on user behaviour

    In heterogeneous and distributed environments, it is necessary to create schedules that utilise resources efficiently. Generating these schedules often poses a problem for a scheduler, since several aspects have to be considered. One way of supporting a scheduler is to provide accurate predictions of the run-times of the submitted jobs. Many current techniques offer statistical models that are applied to previously filtered data. Because users have different jobs, and because the attributes of those jobs differ, the filtering of data and the choice of an appropriate prediction method have to cover these aspects. This article describes Adaps, a system for run-time prediction that works in three phases. Each phase adjusts independently to the jobs of a user, based on historical information. This leads to a user-specific clustering of data and to a flexible use of different prediction techniques in order to create a user-centred prediction model.

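    An evolutionary approach to scheduling in HPC environments based on the incorporation of subjective criteria (original title: Una aproximación evolutiva a la planificación en entornos HPC basada en la incorporación de criterios subjetivos)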

    In a supercomputing center, no matter how large the computational resources are, demand will always be higher. Users must therefore submit requests to run their jobs, and these wait in a queue until the system scheduler decides to execute them. However, out of ignorance or fear that jobs will be aborted, these requests are usually very imprecise, which hinders the scheduler's work. In addition, schedulers are difficult to configure, and they assume that a given schedule will satisfy all users equally at all times. This thesis proposes a scheduler for high performance computing systems that uses evolutionary computation techniques to allow scheduling policies to be defined more naturally and to estimate real resource needs in order to achieve more accurate schedules. Additionally, the concept of perceived quality of service is considered, enabling subjective criteria to be incorporated into the scheduling process so as to maintain a high level of satisfaction both among the users and in the supercomputing center itself. Finally, various aspects of the computational resources themselves are modeled to further improve scheduling accuracy, especially in heterogeneous systems.