
    DIRAC framework evaluation for the Fermi-LAT and CTA experiments

    DIRAC (Distributed Infrastructure with Remote Agent Control) is a general framework for managing tasks over distributed, heterogeneous computing environments. It was originally developed to support the production activities of the LHCb (Large Hadron Collider beauty) experiment and is today used extensively by several particle physics and biology communities. Current (Fermi Large Area Telescope -- Fermi-LAT) and planned (Cherenkov Telescope Array -- CTA) new-generation astrophysical/cosmological experiments, which have very large processing and storage needs, are investigating the usability of DIRAC in this context. Each of these use cases has its peculiarities: Fermi-LAT will interface DIRAC with its own workflow system to gain access to grid resources, while CTA is using DIRAC as the workflow management system for Monte Carlo production and analysis on the grid. We describe the prototyping effort we led toward deploying a DIRAC solution for some aspects of the Fermi-LAT and CTA needs. Comment: proceedings of the CHEP 2013 conference: http://www.chep2013.org
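
    As a minimal sketch of what driving DIRAC looks like in practice, the snippet below describes and submits a task through DIRAC's Python job API. The executable name, arguments, and job name are placeholders, not part of the actual Fermi-LAT or CTA setups; it assumes a configured DIRAC client and a valid grid proxy.

        # Sketch of a DIRAC job submission; script name and arguments are
        # placeholders. Requires a configured DIRAC client and a valid proxy.
        from DIRAC.Core.Base.Script import Script
        Script.parseCommandLine()  # initialize the DIRAC environment

        from DIRAC.Interfaces.API.Dirac import Dirac
        from DIRAC.Interfaces.API.Job import Job

        job = Job()
        job.setName("mc_production_test")            # hypothetical job name
        job.setExecutable("run_sim.sh", arguments="--events 1000")
        job.setInputSandbox(["run_sim.sh"])          # shipped with the job
        job.setOutputSandbox(["std.out", "std.err"]) # retrieved on completion

        result = Dirac().submitJob(job)
        print(result)  # S_OK/S_ERROR dict; contains the job ID on success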

    Multi-infrastructure workflow execution for medical simulation in the Virtual Imaging Platform

    This paper presents the architecture of the Virtual Imaging Platform supporting the execution of medical image simulation workflows on multiple computing infrastructures. The system relies on the MOTEUR engine for workflow execution and on the DIRAC pilot-job system for workload management. The jGASW code wrapper is extended to describe applications running on multiple infrastructures, and a DIRAC cluster agent that can securely involve personal cluster resources with no administrator intervention is proposed. Grid data management is complemented with local storage used as a failover in case of file transfer errors. Between November 2010 and April 2011 the platform was used by 10 users to run 484 workflow instances representing 10.8 CPU years. Tests show that a small personal cluster can significantly contribute to a simulation running on EGI and that the improved data manager can decrease the job failure rate from 7.7% to 1.5%.
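
    The failover behaviour described above can be pictured with a short sketch: attempt the grid storage element first, and fall back to local storage on a transfer error so the job can still complete. Everything here (function names, the LOCAL_STORE path, the exception type) is hypothetical and illustrative, not VIP's actual code.

        # Hypothetical sketch of the failover upload described above; names
        # (upload_to_grid_se, LOCAL_STORE, TransferError) are illustrative.
        import shutil
        from pathlib import Path

        LOCAL_STORE = Path("/var/vip/failover")  # assumed local failover area

        class TransferError(Exception):
            pass

        def upload_to_grid_se(path: Path) -> None:
            """Placeholder for the real grid transfer (e.g. an SRM/gfal call)."""
            raise TransferError("grid storage element unreachable")

        def upload_with_failover(path: Path) -> str:
            """Return where the file ended up: 'grid' or 'local'."""
            try:
                upload_to_grid_se(path)
                return "grid"
            except TransferError:
                LOCAL_STORE.mkdir(parents=True, exist_ok=True)
                shutil.copy2(path, LOCAL_STORE / path.name)
                return "local"  # flagged for later migration back to the grid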

    Evaluation of Meta-scheduler Architectures and Task Assignment Policies for High Throughput Computing

    In this paper we present a model and a simulator for many clusters of heterogeneous PCs belonging to a local network. These clusters are assumed to be connected to each other through a global network, and each cluster is managed by a local scheduler shared by many users. We validate our simulator by comparing its experimental results with the analytical results of an M/M/4 queuing system; these studies indicate that the simulator is consistent. We then compare it with a real batch system and obtain an average error of 10.5% for the response time and 12% for the makespan. We conclude that the simulator is realistic and describes well the behaviour of a large-scale system. We can therefore study the scheduling of our system, called DIRAC, in a high throughput context. We justify our decentralized, adaptive and opportunistic approach in comparison to a centralized approach in such a context.
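
    The M/M/4 validation rests on standard queueing results that the abstract does not spell out. For reference, for an M/M/c queue with arrival rate \(\lambda\), service rate \(\mu\), offered load \(a = \lambda/\mu\) and utilization \(\rho = a/c < 1\), the probability of waiting (Erlang C) and the mean response time are:

        \[
        C(c,a) = \frac{\dfrac{a^{c}}{c!}\,\dfrac{1}{1-\rho}}
                      {\displaystyle\sum_{k=0}^{c-1} \frac{a^{k}}{k!} + \frac{a^{c}}{c!}\,\frac{1}{1-\rho}},
        \qquad
        \mathbb{E}[T] = \frac{1}{\mu} + \frac{C(c,a)}{c\mu - \lambda}
        \]

    For example, with \(\lambda = 3\), \(\mu = 1\) and \(c = 4\) (illustrative values, not taken from the paper): \(a = 3\), \(\rho = 0.75\), \(C(4,3) \approx 0.509\), so \(\mathbb{E}[T] \approx 1.51\). A consistent simulator's measured mean response time should converge to this value.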

    A study of meta-scheduling architectures for high throughput computing

    In this paper we present a model and a simulator for large-scale systems. Such platforms are composed of heterogeneous clusters of PCs belonging to a local network; these clusters are then connected to each other through a global network. Moreover, each cluster is managed by a local scheduler and is shared by many users. We validate our simulator by comparing its experimental results with the analytical results of an M/M/4 queuing system; these studies indicate that the simulator is consistent. We then compare it with a real batch system and obtain a mean error of 10.5% for the response time and 12% for the makespan. We conclude that our simulator is realistic and describes well the behavior of a large-scale system. We can therefore study the scheduling of our system, called DIRAC, in a high throughput context. We justify our decentralized, adaptive and opportunistic approach in comparison to a centralized approach in such a context.
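
    As a companion to the analytic check above, here is a minimal sketch of such an M/M/4 cross-validation. It uses the simpy discrete-event library as a stand-in; the papers' simulator is their own and is not shown here, and the parameters (\(\lambda = 3\), \(\mu = 1\)) are illustrative.

        # Minimal M/M/4 simulation sketch (simpy as a stand-in simulator).
        import random
        import simpy

        LAM, MU, C = 3.0, 1.0, 4  # illustrative arrival/service rates, servers

        def job(env, servers, response_times):
            """One job: wait for a free server, hold it for an Exp(MU) service."""
            arrival = env.now
            with servers.request() as req:
                yield req
                yield env.timeout(random.expovariate(MU))
            response_times.append(env.now - arrival)

        def source(env, servers, response_times):
            """Poisson arrivals: exponential inter-arrival times with rate LAM."""
            while True:
                yield env.timeout(random.expovariate(LAM))
                env.process(job(env, servers, response_times))

        env = simpy.Environment()
        servers = simpy.Resource(env, capacity=C)
        response_times = []
        env.process(source(env, servers, response_times))
        env.run(until=100_000)

        # Should approach the Erlang C value E[T] ~ 1.51 for these parameters.
        print(sum(response_times) / len(response_times))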
