49 research outputs found

    DIRAC - Distributed Infrastructure with Remote Agent Control

    Full text link
    This paper describes DIRAC, the LHCb Monte Carlo production system. DIRAC has a client/server architecture based on: Compute elements distributed among the collaborating institutes; Databases for production management, bookkeeping (the metadata catalogue) and software configuration; Monitoring and cataloguing services for updating and accessing the databases. Locally installed software agents implemented in Python monitor the local batch queue, interrogate the production database for any outstanding production requests using the XML-RPC protocol and initiate the job submission. The agent checks and, if necessary, installs any required software automatically. After the job has processed the events, the agent transfers the output data and updates the metadata catalogue. DIRAC has been successfully installed at 18 collaborating institutes, including the DataGRID, and has been used in recent Physics Data Challenges. In the near- to medium-term future we must use a mixed environment with different types of grid middleware or no middleware. We describe how this flexibility has been achieved and how ubiquitously available grid middleware would improve DIRAC. Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 8 pages, Word, 5 figures. PSN TUAT00
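
    As a rough illustration of the agent cycle summarized above (watch the local batch queue, interrogate the production service over XML-RPC, install missing software, submit the job), the Python sketch below shows only the polling loop; the service endpoint, the remote method name (getProductionRequest) and the qsub-based submission are hypothetical stand-ins, not the actual DIRAC interfaces.

    # Minimal sketch of a DIRAC-style production agent. All endpoint and method
    # names are illustrative assumptions, not the real DIRAC production service API.
    import subprocess
    import time
    import xmlrpc.client

    PRODUCTION_SERVICE = "https://production.example.org:8443/RPC"  # hypothetical endpoint

    def count_free_slots():
        # Placeholder: a real agent would parse the local batch system status here.
        return 1

    def ensure_software(application, version):
        # Placeholder: check the shared software area and install the release if missing.
        pass

    def run_agent(poll_interval=300):
        proxy = xmlrpc.client.ServerProxy(PRODUCTION_SERVICE)
        while True:
            free_slots = count_free_slots()
            if free_slots > 0:
                # Ask the production database for an outstanding request sized to the free slots.
                request = proxy.getProductionRequest(free_slots)
                if request:
                    ensure_software(request["application"], request["version"])
                    # Hand the payload to the local batch system (PBS-style submission assumed).
                    subprocess.run(["qsub", request["job_script"]], check=True)
            time.sleep(poll_interval)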

    DIRAC framework evaluation for the Fermi-LAT and CTA experiments

    Full text link
    DIRAC (Distributed Infrastructure with Remote Agent Control) is a general framework for the management of tasks over distributed heterogeneous computing environments. It was originally developed to support the production activities of the LHCb (Large Hadron Collider Beauty) experiment and today is extensively used by several particle physics and biology communities. Current (Fermi Large Area Telescope -- LAT) and planned (Cherenkov Telescope Array -- CTA) new generation astrophysical/cosmological experiments, with very large processing and storage needs, are currently investigating the usability of DIRAC in this context. Each of these use cases has some peculiarities: Fermi-LAT will interface DIRAC to its own workflow system to allow access to the grid resources, while CTA is using DIRAC as the workflow management system for Monte Carlo production and analysis on the grid. We describe the prototype effort that we led toward deploying a DIRAC solution for some aspects of Fermi-LAT and CTA needs. Comment: proceedings to CHEP 2013 conference: http://www.chep2013.org
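
    To illustrate the kind of interfacing discussed above, the sketch below describes and submits a simple job through DIRAC's Python job API, the usual entry point for an experiment workflow system handing tasks to the grid; the executable, sandbox contents and CPU time are placeholders, and initialization and method names can differ between DIRAC releases.

    # Minimal job submission sketch using the DIRAC job API (details may vary by release).
    from DIRAC.Core.Base import Script
    Script.parseCommandLine()  # initialize the DIRAC environment (release-dependent)

    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    job = Job()
    job.setName("mc_test_job")                                    # placeholder job name
    job.setCPUTime(3600)                                          # requested CPU time in seconds
    job.setExecutable("simulate.sh", arguments="--events 1000")   # hypothetical payload script
    job.setInputSandbox(["simulate.sh"])
    job.setOutputSandbox(["std.out", "std.err"])

    result = Dirac().submitJob(job)   # older releases expose this as Dirac().submit(job)
    print(result)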

    Cherenkov Telescope Array Data Management

    Get PDF
    Very High Energy gamma-ray astronomy with the Cherenkov Telescope Array (CTA) is evolving towards the model of a public observatory. Handling, processing and archiving the large amount of data generated by the CTA instruments and delivering scientific products are some of the challenges in designing the CTA Data Management. The participation of scientists from within the CTA Consortium and from the greater worldwide scientific community necessitates a sophisticated scientific analysis system capable of providing unified and efficient user access to data, software and computing resources. Data Management is designed to respond to three main issues: (i) the treatment and flow of data from remote telescopes; (ii) "big-data" archiving and processing; and (iii) open data access. In this communication the overall technical design of the CTA Data Management, current major developments and prototypes are presented. Comment: 8 pages, 2 figures, in Proceedings of the 34th International Cosmic Ray Conference (ICRC2015), The Hague, The Netherlands. All CTA contributions at arXiv:1508.0589
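
    As a simplified example of the first two issues (data flow from remote sites and archiving), the sketch below uploads one raw data file to a grid storage element and registers it in the file catalogue using a DIRAC-style DataManager client; the logical file name, local path and storage element name are invented for illustration, and the client module path may differ between DIRAC releases.

    # Sketch of archiving one telescope data file on the grid; all names are placeholders.
    from DIRAC.Core.Base import Script
    Script.parseCommandLine()  # initialize the DIRAC environment (release-dependent)

    from DIRAC.DataManagementSystem.Client.DataManager import DataManager

    lfn = "/cta/raw/2015/run00042/events.fits"          # hypothetical logical file name
    local_path = "/data/onsite/run00042/events.fits"    # file produced at the telescope site
    storage_element = "CTA-ARCHIVE-SE"                  # hypothetical archive storage element

    result = DataManager().putAndRegister(lfn, local_path, storage_element)
    if result["OK"]:
        print("archived and registered:", lfn)
    else:
        print("transfer failed:", result["Message"])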

    Cloud flexibility using DIRAC interware

    Get PDF
    Communities of different locations are running their computing jobs on dedicated infrastructures without the need to worry about software, hardware or even the site where their programs are going to be executed. Nevertheless, this usually implies that they are restricted to certain types or versions of an operating system, because either their software needs a specific version of a system library or a specific platform is required by the collaboration to which they belong. In this scenario, if a data center wants to serve software to incompatible communities, it has to split its physical resources among those communities. This splitting inevitably leads to an underuse of resources, because the data center is bound to have periods where one or more of its subclusters are idle. It is in this situation that cloud computing provides the flexibility and reduction in computational cost that data centers are searching for. This paper describes a set of realistic tests that we ran on one such implementation. The tests comprise software from three different HEP communities (Auger, LHCb and QCD phenomenologists) and the Parsec benchmark suite, running on one or more of three Linux flavors (SL5, Ubuntu 10.04 and Fedora 13). The implemented infrastructure has, at the cloud level, CloudStack, which manages the virtual machines (VMs) and the hosts on which they run, and, at the user level, the DIRAC framework along with a VM extension that submits, monitors and keeps track of the user jobs and also requests CloudStack to start or stop the necessary VMs. In this infrastructure the community software is distributed via CernVM-FS, which has been proven to be a reliable and scalable software distribution system. With the resulting infrastructure, users can send their jobs transparently to the data center. The main purpose of this system is the creation of a flexible, multi-platform cluster with a scalable method of software distribution for several VOs. Users from different communities do not need to care about the installation of the standard software that is available at the nodes, nor about the operating system of the host machine, which is transparent to the user. This work was supported by projects FPA2007-66437-C02-01/02, assigned to UB, FPA2010-21885-C02-01/02 and TIN 2010-17541, assigned to USC, and FPA2007-66152-C02-01/02 and FPA2010-21816-C02-01/02, assigned to PICS.
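
    The start/stop logic that the DIRAC VM extension applies on top of CloudStack can be summarized, very schematically, as the loop below; the cloud and dirac_queue objects and their methods (waiting_jobs, running_vms, idle_vms, start_vm, stop_vm) are hypothetical placeholders for the real CloudStack and DIRAC interfaces.

    # Schematic elastic-scaling loop: grow the VM pool while jobs wait, shrink it when VMs idle.
    import time

    MAX_VMS = 50          # cap on simultaneously running virtual machines (illustrative)
    POLL_SECONDS = 120    # how often the loop re-evaluates the queue (illustrative)

    def elastic_loop(cloud, dirac_queue):
        while True:
            waiting = dirac_queue.waiting_jobs()      # jobs queued in the DIRAC WMS
            running = cloud.running_vms()
            if waiting > 0 and running < MAX_VMS:
                # Boot contextualized VMs (e.g. an SL5 image with CernVM-FS) for waiting payloads.
                for _ in range(min(waiting, MAX_VMS - running)):
                    cloud.start_vm(image="SL5-cernvmfs")   # image name is a placeholder
            for vm in cloud.idle_vms():
                # VMs whose job agents have drained are returned to the data center.
                cloud.stop_vm(vm)
            time.sleep(POLL_SECONDS)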

    Control and Monitoring of the Online Computer Farm for Offline Processing in LHCb

    No full text
    ISBN 978-3-95450-139-7 - http://accelconf.web.cern.ch/AccelConf/ICALEPCS2013/papers/tuppc063.pdf
    LHCb, one of the 4 experiments at the LHC accelerator at CERN, uses approximately 1500 PCs (averaging 12 cores each) for processing the High Level Trigger (HLT) during physics data taking. During periods when data acquisition is not required, most of these PCs are idle. In these periods it is possible to profit from the unused processing capacity to run offline jobs, such as Monte Carlo simulation. The LHCb offline computing environment is based on LHCbDIRAC (Distributed Infrastructure with Remote Agent Control). In LHCbDIRAC, job agents are started on Worker Nodes, pull waiting tasks from the central WMS (Workload Management System) and process them on the available resources. A Control System was developed which is able to launch, control and monitor the job agents for the offline data processing on the HLT Farm. This control system is based on the existing Online System Control infrastructure, the PVSS SCADA and the FSM toolkit. It has been used extensively, launching and monitoring 22,000+ agents simultaneously, and more than 850,000 jobs have already been processed in the HLT Farm. This paper describes the deployment of and experience with the Control System in the LHCb experiment.
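
    The real control system is built on the PVSS SCADA and the FSM toolkit rather than on Python, but the launch/drain cycle it applies to each farm node can be pictured with the toy finite-state machine below; the states, transitions and node name are simplified assumptions for illustration only.

    # Toy finite-state machine for one HLT node's offline job agent (illustrative only).
    from enum import Enum

    class AgentState(Enum):
        IDLE = "IDLE"          # node reserved for the HLT, no offline agent running
        RUNNING = "RUNNING"    # LHCbDIRAC job agent processing offline payloads
        STOPPING = "STOPPING"  # agent draining its current payload before release

    class NodeAgentFSM:
        def __init__(self, node):
            self.node = node
            self.state = AgentState.IDLE

        def start(self):
            # Allowed only while data taking does not need the node.
            if self.state is AgentState.IDLE:
                print(f"{self.node}: launching LHCbDIRAC job agent")
                self.state = AgentState.RUNNING

        def stop(self):
            # Graceful drain: finish the current payload, then hand the node back to the HLT.
            if self.state is AgentState.RUNNING:
                print(f"{self.node}: draining job agent")
                self.state = AgentState.STOPPING

        def on_agent_exit(self):
            if self.state is AgentState.STOPPING:
                self.state = AgentState.IDLE

    # Example: fsm = NodeAgentFSM("hltnode0042"); fsm.start(); fsm.stop(); fsm.on_agent_exit()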