
    A monitoring tool for a GRID operation center

    WorldGRID is an intercontinental testbed spanning Europe and the US, integrating architecturally different Grid implementations based on the Globus toolkit. The testbed was successfully demonstrated at SuperComputing 2002 (Baltimore) and IST2002 (Copenhagen), where real HEP application jobs were transparently submitted from the US and Europe using "native" mechanisms and ran wherever resources were available, independently of their location. To monitor the behavior and performance of such a testbed and to spot problems as soon as they arise, DataTAG has developed the EDT-Monitor tool, based on the Nagios package, which provides Virtual Organization-centric views of the Grid through dynamic geographical maps. The tool was used to spot several problems during the WorldGRID operations, such as malfunctioning Resource Brokers or Information Servers, incorrectly configured sites, and job dispatching problems. In this paper we give an overview of the package, its features and scalability solutions, and we report on the experience acquired and the benefit that a GRID operation center would gain from such a tool.
    Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics conference (CHEP03), La Jolla, CA, USA, March 2003, 3 pages, PDF. PSN MOET00
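
    The abstract does not detail the EDT-Monitor internals, but Nagios-based checks follow a simple plugin contract: print a one-line status and exit with a standard return code. A minimal sketch of such a probe, with the Resource Broker hostname and port as purely illustrative assumptions, might look like this in Python:

        #!/usr/bin/env python
        # Minimal Nagios-style probe: checks that a Grid service port answers.
        # Host, port, and service name are illustrative, not from the paper.
        import socket
        import sys

        OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios exit codes

        def check_tcp(host, port, timeout=5.0):
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    return OK, "RB OK - %s:%d is accepting connections" % (host, port)
            except socket.timeout:
                return WARNING, "RB WARNING - %s:%d timed out" % (host, port)
            except OSError as exc:
                return CRITICAL, "RB CRITICAL - %s:%d unreachable (%s)" % (host, port, exc)

        if __name__ == "__main__":
            status, message = check_tcp("rb.example.org", 7772)  # hypothetical RB endpoint
            print(message)   # the first output line is what the Nagios UI displays
            sys.exit(status)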

    Large-scale ATLAS simulated production on EGEE

    In preparation for first data at the LHC, a series of Data Challenges of increasing scale and complexity has been performed. Large quantities of simulated data have been produced on three different Grids, integrated into the ATLAS production system. During 2006 the emphasis moved towards providing stable continuous production, as is required in the immediate run-up to first data and thereafter. Here we discuss the experience of the production done on EGEE resources, using submission based on the gLite WMS, CondorG, and a system using Condor glide-ins. The overall walltime efficiency of around 90% is largely independent of the submission method, and the dominant source of wasted CPU is data handling issues. The efficiency of grid job submission is significantly worse than this, and the glide-in method benefits greatly from factorising this out.
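
    As a worked illustration of the efficiency figure quoted above (not the paper's actual accounting code), walltime efficiency can be computed as the fraction of total walltime spent in jobs that finished successfully; the job records below are invented:

        # Sketch of a walltime-efficiency calculation; the job records are
        # invented, not taken from the ATLAS production database.
        jobs = [
            {"walltime_s": 36000, "succeeded": True,  "failure": None},
            {"walltime_s": 42000, "succeeded": True,  "failure": None},
            {"walltime_s": 4000,  "succeeded": False, "failure": "data handling"},
            {"walltime_s": 1500,  "succeeded": False, "failure": "worker node crash"},
        ]

        total = sum(j["walltime_s"] for j in jobs)
        useful = sum(j["walltime_s"] for j in jobs if j["succeeded"])
        print("walltime efficiency: %.1f%%" % (100.0 * useful / total))

        # Break wasted CPU down by failure cause, as the paper does qualitatively.
        wasted = {}
        for j in jobs:
            if not j["succeeded"]:
                wasted[j["failure"]] = wasted.get(j["failure"], 0) + j["walltime_s"]
        for cause, seconds in sorted(wasted.items(), key=lambda kv: -kv[1]):
            print("wasted by %s: %d s" % (cause, seconds))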

    Improvements of LHC data analysis techniques at Italian WLCG sites. Case-study of the transfer of this technology to other research areas

    In 2012, 14 Italian institutions participating in the LHC experiments won a grant from the Italian Ministry of Research (MIUR), with the aim of optimising analysis activities and, more generally, the Tier2/Tier3 infrastructure. We report on the activities under investigation and on the considerable improvement in ease of access to resources for physicists, including those with no specific computing expertise. We focused on distributed storage federations, access to batch-like facilities, provisioning of user interfaces on demand, and cloud systems. R&D on next-generation databases, distributed analysis interfaces, and new computing architectures was also carried out. The project, ending in the first months of 2016, will produce a white paper with recommendations on best practices for data-analysis support by computing centers.

    Experience with the gLite Workload Management System in ATLAS Monte Carlo production on LCG

    The ATLAS experiment has been running continuous simulated event production for more than two years. A considerable fraction of the jobs is submitted daily and handled via the gLite Workload Management System, which overcomes several limitations of the previous LCG Resource Broker. The gLite WMS has been tested very intensively against the LHC experiments' use cases for more than six months, both in terms of performance and reliability. The tests were carried out by the LCG Experiment Integration Support team (in close contact with the experiments), together with the EGEE integration and certification team and the gLite middleware developers. A pragmatic, iterative, and interactive approach allowed a very quick rollout of fixes and their rapid deployment, together with new functionality, for the ATLAS production activities. The same approach is being adopted for other middleware components such as the gLite and CREAM Computing Elements. In this contribution we summarize the lessons learned from the gLite WMS testing activity, pointing out the most important achievements and the open issues. In addition, we present the current status of ATLAS simulated event production on the EGEE infrastructure based on the gLite WMS, showing the main improvements and benefits of the new middleware. Finally, we show some preliminary results on the usage of the new flavors of Computing Elements, trying to identify possible advantages not only in terms of robustness and performance, but also of functionality for the experiment activities.
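
    For context (this is not from the paper itself), production frameworks of this era typically drove the gLite WMS through its command-line client, passing a JDL job description. A minimal sketch, assuming a valid VOMS proxy and the standard glite-wms-job-submit tool on the PATH, and with purely illustrative JDL contents:

        # Sketch of submitting one job through the gLite WMS command-line client.
        # Assumes a valid VOMS proxy and glite-wms-job-submit on PATH; the JDL
        # below is an illustrative toy job, not an ATLAS production job.
        import subprocess
        import tempfile

        JDL = """\
        Executable    = "/bin/echo";
        Arguments     = "hello from the grid";
        StdOutput     = "stdout.log";
        StdError      = "stderr.log";
        OutputSandbox = {"stdout.log", "stderr.log"};
        """

        with tempfile.NamedTemporaryFile("w", suffix=".jdl", delete=False) as f:
            f.write(JDL)
            jdl_path = f.name

        # -a: delegate a proxy automatically for this single submission.
        result = subprocess.run(
            ["glite-wms-job-submit", "-a", jdl_path],
            capture_output=True, text=True,
        )
        print(result.stdout)  # on success, contains the job identifier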

    Experimental evaluation of job provenance in ATLAS environment

    Grid middleware stacks, including gLite, have matured to the point of being able to process up to millions of jobs per day. Logging and Bookkeeping, the gLite job-tracking service, keeps pace with this rate; however, it is not designed to provide a long-term archive of information on executed jobs.

    Analysis of the ATLAS Rome production experience on the LHC computing grid

    The Large Hadron Collider at CERN will start data acquisition in 2007. The ATLAS (A Toroidal LHC ApparatuS) experiment is preparing for data handling and analysis via a series of Data Challenges and production exercises, to validate its computing model and to provide useful samples of data for detector and physics studies. The last Data Challenge, which began in June 2004 and ended in early 2005, was the first performed completely in a Grid environment. Immediately afterwards, a new production activity was necessary to provide the event samples for the ATLAS physics workshop held in June 2005 in Rome. This exercise offered a unique opportunity to assess the improvements achieved and to continue the validation of the computing model. In this paper we discuss the experience of the "Rome production" on the LHC Computing Grid infrastructure, describing the achievements, the improvements with respect to the previous Data Challenge, and the problems observed, together with the lessons learned and future plans.