
    Scalability tests of R-GMA-based grid job monitoring system for CMS Monte Carlo data production

    Copyright © 2004 IEEE. High-energy physics experiments, such as the Compact Muon Solenoid (CMS) at the Large Hadron Collider (LHC), have large-scale data processing computing requirements. The grid has been chosen as the solution. One important challenge when using the grid for large-scale data processing is the ability to monitor the large numbers of jobs that are being executed simultaneously at multiple remote sites. The relational grid monitoring architecture (R-GMA) is a monitoring and information management service for distributed resources based on the GMA of the Global Grid Forum. We report on the first measurements of R-GMA as part of a monitoring architecture to be used for batch submission of multiple Monte Carlo simulation jobs running on a CMS-specific LHC computing grid test bed. Monitoring information was transferred in real time from remote execution nodes back to the submitting host and stored in a database. In scalability tests, the job submission rates supported by successive releases of R-GMA improved significantly, approaching that expected in full-scale production.

    A study of publish/subscribe systems for real-time grid monitoring

    Monitoring and controlling a large number of geographically distributed scientific instruments is a challenging task. Some operations on these instruments require a real-time (or quasi-real-time) response, which makes it even more difficult. In this paper, we describe the requirements of distributed monitoring for a possible future electrical power grid based on real-time extensions to grid computing. We examine several standards and publish/subscribe middleware candidates, some of which were specially designed and developed for grid monitoring. We analyze their architecture and functionality, and discuss their advantages and disadvantages. We report on a series of tests to measure their real-time performance and scalability.
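    As a rough illustration of the publish/subscribe pattern this abstract refers to, the following minimal in-process sketch may help. It is not taken from the paper or from any of the middleware it evaluates; the `Broker` class, its methods, and the topic name are all hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Callback = Callable[[Any], None]


class Broker:
    """Hypothetical in-process broker: routes messages by exact topic name."""

    def __init__(self) -> None:
        # Maps a topic name to the callbacks subscribed to it.
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register a callback to be invoked for every message on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver `message` to all subscribers of `topic`.

        Returns the number of subscribers that received the message.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


# Usage: a monitor subscribes to readings from one (made-up) instrument topic.
broker = Broker()
received: List[Any] = []
broker.subscribe("site-a/voltage", received.append)
broker.publish("site-a/voltage", {"value": 230.1})
```

    Publishers and subscribers only share a topic name, so either side can be added or removed without the other knowing. A real grid monitoring deployment would add networking, persistence, and delivery guarantees that this sketch leaves out.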

    Performance of R-GMA for monitoring grid jobs for CMS data production

    High energy physics experiments, such as the Compact Muon Solenoid (CMS) at the CERN laboratory in Geneva, have large-scale data processing requirements, with data accumulating at a rate of 1 Gbyte/s. This load comfortably exceeds any previous processing requirements and we believe it may be most efficiently satisfied through grid computing. Furthermore, the production of large quantities of Monte Carlo simulated data provides an ideal test bed for grid technologies and will drive their development. One important challenge when using the grid for data analysis is the ability to monitor transparently the large number of jobs that are being executed simultaneously at multiple remote sites. R-GMA is a monitoring and information management service for distributed resources based on the grid monitoring architecture of the Global Grid Forum. We have previously developed a system allowing us to test its performance under a heavy load while using few real grid resources. We present the latest results on this system running on the LCG 2 grid test bed using the LCG 2.6.0 middleware release. For a sustained load equivalent to 7 generations of 1000 simultaneous jobs, R-GMA was able to transfer all published messages and store them in a database for 98% of the individual jobs. The failures experienced were at the remote sites, rather than at the archiver's MON box as had been expected.

    Sharing a conceptual model of grid resources and services

    Grid technologies aim to enable coordinated resource-sharing and problem-solving capabilities over local and wide area networks, spanning locations, organizations, machine architectures and software boundaries. The heterogeneity of the resources involved and the need for interoperability among different grid middlewares require the sharing of a common information model. Abstractions of different flavors of resources and services and conceptual schemas of domain-specific entities require a collaborative effort to enable coherent cooperation among information services. In this paper, we present the results of our experience in modelling grid resources and services within the Grid Laboratory Uniform Environment (GLUE) effort, a joint collaboration of US and EU High Energy Physics projects towards grid interoperability. The first implementation-neutral agreements on services, such as batch computing and storage manager, and on resources, such as the cluster, sub-cluster and host hierarchy and the storage library, are presented. Design guidelines and operational results are described together with open issues and future evolutions. Comment: 4 pages, 0 figures, CHEP 200

    Developing Resource Usage Service in WLCG

    According to the Memorandum of Understanding (MoU) of the World-wide LHC Computing Grid (WLCG) project, participating sites are required to provide resource usage or accounting data to the Grid Operational Centre (GOC) to enrich the understanding of how shared resources are used, and to provide information for improving the effectiveness of resource allocation. As a multi-grid environment, the accounting process of WLCG is currently enabled by four accounting systems, each of which was developed independently by constituent grid projects. These accounting systems were designed and implemented based on project-specific local understanding of requirements, and therefore lack interoperability. In order to automate the accounting process in WLCG, three transportation methods are being introduced for streaming accounting data metered by heterogeneous accounting systems into the GOC at Rutherford Appleton Laboratory (RAL) in the UK, where accounting data are aggregated and accumulated throughout the year. These transportation methods, however, were introduced on a per-accounting-system basis, i.e. each targets a particular accounting system, making them hard to reuse and customize to new requirements. This paper presents the design of the WLCG-RUS system, a standards-compatible solution providing a consistent process for streaming resource usage data across various accounting systems, while ensuring interoperability, portability, and customization.

    Job Monitoring in an Interactive Grid Analysis Environment

    The grid is emerging as a great computational resource, but its dynamic behavior makes the grid environment unpredictable. Systems and networks can fail, and the introduction of more users can result in resource starvation. Once a job has been submitted for execution on the grid, monitoring becomes essential for a user to see that the job is completed in an efficient way, and to detect any problems that occur while the job is running. In current environments, once a user submits a job he loses direct control over the job and the system behaves like a batch system: the user submits the job and later gets a result back. The only information a user can obtain about a job is whether it is scheduled, running, cancelled or finished. Today users are becoming increasingly interested in analysis grid environments in which they can check the progress of the job, obtain intermediate results, terminate the job based on its progress or intermediate results, steer the job to other nodes to achieve better performance and check the resources consumed by the job. In order to fulfill these requirements of interactivity, a mechanism is needed that can provide the user with real-time access to information about different attributes of a job. In this paper we present the design of a Job Monitoring Service, a web service that will provide interactive remote job monitoring by allowing users to access different attributes of a job once it has been submitted to the interactive Grid Analysis Environment.
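    The difference between batch-style status and attribute-level monitoring can be sketched as follows. This is only an illustrative in-memory model, not the Job Monitoring Service described in the paper: the four job states come from the abstract, while the `JobMonitor` class, its methods, and the attribute names `progress` and `cpu_seconds` are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class JobState(Enum):
    # The four coarse states named in the abstract.
    SCHEDULED = "scheduled"
    RUNNING = "running"
    CANCELLED = "cancelled"
    FINISHED = "finished"


@dataclass
class JobRecord:
    job_id: str
    state: JobState = JobState.SCHEDULED
    # Free-form attributes (hypothetical), e.g. progress or resources consumed.
    attributes: Dict[str, Any] = field(default_factory=dict)


class JobMonitor:
    """Hypothetical in-memory stand-in for a job monitoring service."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobRecord] = {}

    def register(self, job_id: str) -> JobRecord:
        """Start tracking a newly submitted job."""
        record = JobRecord(job_id)
        self._jobs[job_id] = record
        return record

    def update(self, job_id: str, state: Optional[JobState] = None,
               **attributes: Any) -> None:
        """Record a state change and/or new attribute values for a job."""
        record = self._jobs[job_id]
        if state is not None:
            record.state = state
        record.attributes.update(attributes)

    def query(self, job_id: str, name: str) -> Any:
        """Return one attribute of a job; 'state' gives the coarse status."""
        record = self._jobs[job_id]
        if name == "state":
            return record.state.value
        return record.attributes.get(name)


# Usage: the execution side reports progress, the user queries it mid-run.
monitor = JobMonitor()
monitor.register("job-1")
monitor.update("job-1", state=JobState.RUNNING, progress=0.4, cpu_seconds=120)
current_progress = monitor.query("job-1", "progress")
```

    A batch-style system would expose only the `state` value; the extra attributes are what would let a user decide to terminate or steer a running job.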

    Polish grid infrastructure for science and research

    The structure, functionality, parameters and organization of the computing Grid in Poland are described, mainly from the perspective of the high-energy particle physics community, currently its largest consumer and developer. It represents a distributed Tier-2 in the worldwide Grid infrastructure. It also provides services and resources for data-intensive applications in other sciences. Comment: Proceedings of IEEE Eurocon 2007, Warsaw, Poland, 9-12 Sep. 2007, p.44