    LCG-1 Deployment and usage experience

    LCG-1 is the second release of the software framework for the LHC Computing Grid project. In our work we describe the installation process, the problems that arose and their solutions, and the configuration-tuning details of a complete LCG-1 site, including all LCG elements required for a self-sufficient site.
    Comment: To be published in the proceedings of the XIX International Symposium on Nuclear Electronics and Computing (NEC'2003), Varna, Bulgaria, 15-20 September 2003

    Grid Data Management in Action: Experience in Running and Supporting Data Management Services in the EU DataGrid Project

    In the first phase of the EU DataGrid (EDG) project, a Data Management System has been implemented and provided for deployment. The components of the current EDG Testbed are: a prototype of a Replica Manager Service built around the basic services provided by Globus, a centralised Replica Catalogue to store information about the physical locations of files, and the Grid Data Mirroring Package (GDMP), which is widely used in various HEP collaborations in Europe and the US for data mirroring. During this year these services have been refined and made more robust so that they are fit for use in a pre-production environment. Application users have been using this first release of the Data Management Services for more than a year. In this paper we present the components and their interaction, our implementation and experience, as well as the feedback received from our user communities. We have resolved not only issues regarding integration with other EDG service components but also many of the interoperability issues with components of our partner projects in Europe and the US. The paper concludes with the basic lessons learned during this operation. These conclusions provide the motivation for the architecture of the next generation of Data Management Services that will be deployed in EDG during 2003.
    Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics conference (CHEP03), La Jolla, CA, USA, March 2003, 9 pages, LaTeX, PSN: TUAT007; all figures are in the directory "figures"
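
    The core of such a system is the mapping maintained by the replica catalogue from logical file names (LFNs) to the physical locations (PFNs) of their replicas. A minimal sketch of that idea follows; the class and method names are illustrative and do not reflect the actual EDG Replica Catalogue or GDMP interfaces.

```python
from collections import defaultdict

class ReplicaCatalogue:
    """Toy replica catalogue: maps a logical file name (LFN) to the
    physical file names (PFNs) of its replicas at different sites.
    Illustrative only; the real EDG Replica Catalogue was a central
    service, not an in-memory dictionary."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register_replica(self, lfn: str, pfn: str) -> None:
        # Called after a copy (e.g. via mirroring) has completed.
        self._replicas[lfn].add(pfn)

    def unregister_replica(self, lfn: str, pfn: str) -> None:
        self._replicas[lfn].discard(pfn)

    def list_replicas(self, lfn: str) -> list[str]:
        # A replica manager would pick the "best" PFN from this list.
        return sorted(self._replicas[lfn])

# Example: one logical file mirrored to two sites (URLs are made up).
rc = ReplicaCatalogue()
rc.register_replica("lfn:/grid/hep/run42/hits.root",
                    "gsiftp://se.site-a.example/data/run42/hits.root")
rc.register_replica("lfn:/grid/hep/run42/hits.root",
                    "gsiftp://se.site-b.example/data/run42/hits.root")
print(rc.list_replicas("lfn:/grid/hep/run42/hits.root"))
```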

    Running a Production Grid Site at the London e-Science Centre

    This paper describes how the London e-Science Centre cluster MARS, a production 400+ Opteron CPU cluster, was integrated into the production Large Hadron Collider Computing Grid. It describes the practical issues that we encountered when deploying and maintaining this system, and details the techniques that were applied to resolve them. Finally, we provide a set of recommendations, based on our experience, for grid software development in general that we believe would make the technology more accessible. © 2006 IEEE

    Apollo experience report: Assessment of metabolic expenditures

    A significant effort was made to assess the metabolic expenditure for extravehicular activity on the lunar surface. After evaluation of the real-time data available to the flight controller during extravehicular activity, three independent methods of metabolic assessment were chosen, based on the relationship between heart rate and metabolic production, between oxygen consumption and metabolic production, and between the thermodynamics of the liquid-cooled garment and metabolic production. The metabolic assessment procedure is analyzed and discussed. Real-time use of this information by the Apollo flight surgeon is discussed. Results and analyses from the Apollo missions and comments concerning future applications are included.
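
    As a rough illustration of the three estimation methods named in the abstract, the sketch below computes a metabolic rate from oxygen consumption, from heart rate, and from the heat removed by the liquid-cooled garment. All conversion factors and calibration coefficients are illustrative textbook-style values, not the actual Apollo flight-surgeon procedure.

```python
# Three independent estimates of metabolic rate (kcal/hr).
# All constants are illustrative approximations, not Apollo flight values.

def rate_from_oxygen(vo2_l_per_min: float) -> float:
    """Oxygen consumption: roughly 4.8 kcal released per litre of O2."""
    return vo2_l_per_min * 4.8 * 60.0

def rate_from_heart_rate(hr_bpm: float, a: float = -200.0, b: float = 5.0) -> float:
    """Heart rate: a per-crewman linear calibration, rate = a + b * HR (illustrative)."""
    return a + b * hr_bpm

def rate_from_cooling_garment(flow_kg_per_s: float, t_in_c: float, t_out_c: float) -> float:
    """Liquid-cooled garment heat balance: Q = m_dot * c_p * (T_out - T_in)."""
    watts = flow_kg_per_s * 4186.0 * (t_out_c - t_in_c)   # c_p of water
    return watts * 0.86                                    # 1 W is about 0.86 kcal/hr

estimates = [
    rate_from_oxygen(1.0),                     # ~288 kcal/hr
    rate_from_heart_rate(100.0),               # ~300 kcal/hr with these coefficients
    rate_from_cooling_garment(0.03, 18.0, 21.0),  # ~324 kcal/hr
]
print(f"mean estimate: {sum(estimates) / len(estimates):.0f} kcal/hr")
```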

    HEP Applications Evaluation of the EDG Testbed and Middleware

    Workpackage 8 of the European DataGrid project was formed in January 2001 with representatives from the four LHC experiments, and with experiment-independent people from five of the six main EDG partners. In September 2002 WP8 was strengthened by the addition of effort from BaBar and D0. The original mandate of WP8 was, following the definition of short- and long-term requirements, to port experiment software to the EDG middleware and testbed environment. A major additional activity has been testing the basic functionality and performance of this environment. This paper reviews experiences and evaluations in the areas of job submission, data management, mass storage handling, information systems and monitoring. It also comments on the problems of remote debugging, the portability of code, and scaling problems with increasing numbers of jobs, sites and nodes. Reference is made to the pioneering work of ATLAS and CMS in integrating the use of the EDG Testbed into their data challenges. A forward look is made to essential software developments within EDG and to the necessary cooperation between EDG and LCG for the LCG prototype due in mid-2003.
    Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics Conference (CHEP03), La Jolla, CA, USA, March 2003, 7 pages. PSN THCT00

    Developing Resource Usage Service in WLCG

    According to the Memorandum of Understanding (MoU) of the Worldwide LHC Computing Grid (WLCG) project, participating sites are required to provide resource usage or accounting data to the Grid Operational Centre (GOC) to enrich the understanding of how shared resources are used, and to provide information for improving the effectiveness of resource allocation. As a multi-grid environment, the accounting process of WLCG is currently enabled by four accounting systems, each of which was developed independently by a constituent grid project. These accounting systems were designed and implemented based on project-specific local understanding of requirements, and therefore lack interoperability. In order to automate the accounting process in WLCG, three transport methods are being introduced for streaming accounting data metered by the heterogeneous accounting systems into the GOC at the Rutherford Appleton Laboratory (RAL) in the UK, where accounting data are aggregated and accumulated throughout the year. These transport methods, however, were introduced on a per-accounting-system basis, i.e. each targets a particular accounting system, which makes them hard to reuse and to adapt to new requirements. This paper presents the design of the WLCG-RUS system, a standards-compatible solution providing a consistent process for streaming resource usage data across the various accounting systems, while ensuring interoperability, portability and ease of customization.
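
    The essence of such a standards-compatible process is normalising records from heterogeneous accounting systems into one common usage-record structure before publication to the GOC. The sketch below illustrates the idea with two adapters; the field names are illustrative (loosely inspired by the OGF Usage Record) and are not the actual WLCG-RUS, APEL or DGAS schemas.

```python
from dataclasses import dataclass

@dataclass
class UsageRecord:
    """Common record published to the central repository; fields are illustrative."""
    site: str
    user_dn: str
    start_time: str       # ISO 8601 timestamp
    wall_seconds: int
    cpu_seconds: int

def from_system_a(raw: dict) -> UsageRecord:
    # One adapter per source accounting system hides its local field names.
    return UsageRecord(site=raw["ExecutingSite"], user_dn=raw["GlobalUserName"],
                       start_time=raw["StartTime"],
                       wall_seconds=int(raw["WallDuration"]),
                       cpu_seconds=int(raw["CpuDuration"]))

def from_system_b(raw: dict) -> UsageRecord:
    return UsageRecord(site=raw["site"], user_dn=raw["userDn"],
                       start_time=raw["start"],
                       wall_seconds=int(raw["wall"]),
                       cpu_seconds=int(raw["cpu"]))

ADAPTERS = {"system_a": from_system_a, "system_b": from_system_b}

def normalise(source: str, raw: dict) -> UsageRecord:
    """Normalise one raw record; a real service would then serialise it
    (e.g. to XML) and publish it to the central GOC repository."""
    return ADAPTERS[source](raw)
```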

    CMS Monte Carlo production in the WLCG computing Grid

    Monte Carlo production in CMS has received a major boost in performance and scale since the CHEP06 conference. The production system has been re-engineered to incorporate the experience gained in running the previous system and to integrate production with the new CMS event data model, data management system and data processing framework. The system is interfaced to the two major computing Grids used by CMS, the LHC Computing Grid (LCG) and the Open Science Grid (OSG). Operational experience and integration aspects of the new CMS Monte Carlo production system are presented, together with an analysis of production statistics. The new system automatically handles job submission, resource monitoring, job queuing, job distribution according to the available resources, data merging, and registration of data into the data bookkeeping, data location, data transfer and placement systems. Compared to the previous production system, automation, reliability and performance have been considerably improved. A more efficient use of computing resources and better handling of the inherent Grid unreliability have resulted in an increase of production scale by about an order of magnitude: the system is capable of running on the order of ten thousand jobs in parallel and yields more than two million events per day.
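
    One of the steps listed above, job distribution according to the available resources, can be illustrated with a toy proportional scheduler that spreads a batch of jobs over sites according to their free CPU slots. The real production system is considerably more involved; the function and the site figures below are made up. Note also the implied throughput: roughly two million events per day over ten thousand parallel jobs is on the order of 200 events per job per day.

```python
def distribute_jobs(n_jobs: int, free_slots: dict[str, int]) -> dict[str, int]:
    """Assign jobs to sites in proportion to their currently free slots."""
    total = sum(free_slots.values())
    assigned = {site: (n_jobs * slots) // total for site, slots in free_slots.items()}
    # Hand out any rounding remainder, one job each, to the sites with most free slots.
    leftover = n_jobs - sum(assigned.values())
    for site in sorted(free_slots, key=free_slots.get, reverse=True)[:leftover]:
        assigned[site] += 1
    return assigned

# Example: 10,000 jobs over three hypothetical sites.
print(distribute_jobs(10_000, {"site_a": 4000, "site_b": 2500, "site_c": 1500}))
```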

    Lessons Learnt from WLCG Service Deployment

    This paper summarises the main lessons learnt from deploying WLCG production services, with a focus on reliability, scalability and accountability, which together lead to both manageability and usability. Each topic is analysed in turn. Techniques for zero user-visible downtime during the main service interventions are described, together with pathological cases that need special treatment. The requirements in terms of scalability are analysed, calling for as much robustness and automation in the service as possible. The different aspects of accountability (measuring, tracking, logging and monitoring what is going on, and what has gone on) are examined, with the goal of attaining a manageable service. Finally, a simple analogy is drawn with the Web in terms of usability: what do we need to achieve to cross the chasm from small-scale adoption to ubiquity?
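
    One common way to achieve zero user-visible downtime during an intervention is a rolling procedure behind a load-balanced service alias: nodes are drained, upgraded and re-enabled one at a time, so the alias never loses all of its back-ends. The sketch below illustrates that general pattern only, not necessarily the specific techniques described in the paper; the helper functions passed in are hypothetical, not a WLCG tool.

```python
import time

def rolling_intervention(nodes, remove_from_alias, upgrade, health_check, add_to_alias):
    """Upgrade each node in turn while the remaining nodes keep serving the alias."""
    for node in nodes:
        remove_from_alias(node)       # stop new requests landing on this node
        time.sleep(30)                # let in-flight requests drain (illustrative delay)
        upgrade(node)                 # the actual intervention
        if not health_check(node):    # never re-add a broken node to the alias
            raise RuntimeError(f"{node} failed post-upgrade check, aborting rollout")
        add_to_alias(node)            # node serves again before the next one is touched
```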