    Data Grid tutorials with hands-on experience

    Grid technologies are increasingly used in scientific as well as industrial environments, but the documentation and the correct usage are often insufficient or poorly understood. Comprehensive training with hands-on experience helps people first to understand the technology and second to use it in a correct and efficient way. We have organised and run several training sessions in different locations all over the world and report on our experience here. The major factors of success are a solid base of theoretical lectures and, more importantly, a facility that allows for practical Grid exercises during and possibly after the tutorial sessions.

    Grid Data Management in Action: Experience in Running and Supporting Data Management Services in the EU DataGrid Project

    In the first phase of the EU DataGrid (EDG) project, a Data Management System has been implemented and provided for deployment. The components of the current EDG Testbed are: a prototype of a Replica Manager Service built around the basic services provided by Globus, a centralised Replica Catalogue to store information about physical locations of files, and the Grid Data Mirroring Package (GDMP) that is widely used in various HEP collaborations in Europe and the US for data mirroring. During this year these services have been refined and made more robust so that they are fit to be used in a pre-production environment. Application users have been using this first release of the Data Management Services for more than a year. In the paper we present the components and their interaction, our implementation and experience as well as the feedback received from our user communities. We have resolved not only issues regarding integration with other EDG service components but also many of the interoperability issues with components of our partner projects in Europe and the U.S. The paper concludes with the basic lessons learned during this operation. These conclusions provide the motivation for the architecture of the next generation of Data Management Services that will be deployed in EDG during 2003. (Talk from the 2003 Computing in High Energy and Nuclear Physics conference, CHEP03, La Jolla, CA, USA, March 2003; 9 pages, PSN TUAT007.)
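
    As an illustration of the centralised Replica Catalogue idea described in this abstract, the following Python sketch models a catalogue that maps logical file names (LFNs) to the physical locations (PFNs) of their replicas. All class, method, and file names are invented for illustration; this is a minimal sketch, not the actual EDG or Globus interface.

        # Toy model of a centralised replica catalogue: one logical file
        # name (LFN) may be backed by many physical replicas (PFNs).
        # Names and URLs below are hypothetical examples.

        class ReplicaCatalogue:
            def __init__(self):
                self._replicas: dict[str, set[str]] = {}

            def register(self, lfn: str, pfn: str) -> None:
                """Record that a physical copy of `lfn` exists at `pfn`."""
                self._replicas.setdefault(lfn, set()).add(pfn)

            def lookup(self, lfn: str) -> set[str]:
                """Return all known physical locations of a logical file."""
                return self._replicas.get(lfn, set())

            def unregister(self, lfn: str, pfn: str) -> None:
                """Remove a replica entry, e.g. after a copy is deleted."""
                self._replicas.get(lfn, set()).discard(pfn)

        catalogue = ReplicaCatalogue()
        catalogue.register("lfn:/hep/run42/events.dat",
                           "gsiftp://se1.cern.ch/data/run42/events.dat")
        catalogue.register("lfn:/hep/run42/events.dat",
                           "gsiftp://se2.fnal.gov/store/run42/events.dat")
        print(catalogue.lookup("lfn:/hep/run42/events.dat"))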

    Matchmaking, Datasets and Physics Analysis

    Grid-enabled physics analysis requires a Workload Management System (WMS) that takes care of finding suitable computing resources to execute data-intensive jobs. A typical example is the WMS available in the LCG2 (also referred to as EGEE-0) software system, used by several scientific experiments. Like many other current Grid systems, LCG2 provides a file-level granularity for accessing and analysing data. However, application scientists such as High Energy Physicists often require a higher abstraction level for accessing data, i.e., they prefer to use datasets rather than files in their physics analysis.
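
    To make the dataset abstraction concrete, here is a hedged Python sketch of dataset-level matchmaking: a dataset name is first resolved into its constituent files, and the job is matched to the site that already stores the largest fraction of them. The dataset, file, and site names are hypothetical; this is an illustrative sketch, not the LCG2 WMS interface.

        # Resolve a dataset to its files, then pick the site that already
        # holds the most of them. All names are invented for illustration.

        datasets = {
            "higgs-candidates-2004": ["f1.root", "f2.root", "f3.root"],
        }

        site_contents = {
            "ce.cern.ch":  {"f1.root", "f2.root"},
            "ce.fnal.gov": {"f3.root"},
        }

        def match_site(dataset: str) -> str:
            """Return the site storing the largest fraction of the dataset."""
            files = set(datasets[dataset])
            return max(site_contents,
                       key=lambda site: len(files & site_contents[site]))

        print(match_site("higgs-candidates-2004"))  # -> ce.cern.ch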

    Replica consistency in a Data Grid

    A Data Grid is a wide-area computing infrastructure that employs Grid technologies to provide storage capacity and processing power to applications that handle very large quantities of data. Data Grids rely on data replication to achieve better performance and reliability by storing copies of data sets on different Grid nodes. When a data set can be modified by applications, the problem of maintaining consistency among existing copies arises. The consistency problem also concerns metadata, i.e., additional information about application data sets such as indices, directories, or catalogues. This kind of metadata is used both by the applications and by the Grid middleware to manage the data. For instance, the Replica Management Service (the Grid middleware component that controls data replication) uses catalogues to find the replicas of each data set. Such catalogues can also be replicated, and their consistency is crucial to the correct operation of the Grid. Therefore, metadata consistency generally poses stricter requirements than data consistency. In this paper we report on the development of a Replica Consistency Service based on the middleware mainly developed by the European Data Grid Project. The paper summarises the main issues in the replica consistency problem and lays out a high-level architectural design for a Replica Consistency Service. Finally, results from simulations of different consistency models are presented.
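
    The version-based bookkeeping below is a toy Python model of the replica consistency problem this abstract describes: each replica carries a version number, and a replica counts as stale once its version lags behind the master copy. It is a sketch under assumed naming, not the design of the EDG Replica Consistency Service.

        # Toy consistency bookkeeping: a data set has a master version and
        # a list of replicas; replicas whose version lags are stale and
        # need synchronisation. Names are hypothetical.

        from dataclasses import dataclass

        @dataclass
        class Replica:
            location: str
            version: int

        class ReplicatedDataset:
            def __init__(self, name: str):
                self.name = name
                self.master_version = 0
                self.replicas: list[Replica] = []

            def update_master(self) -> None:
                """An application modifies the master copy."""
                self.master_version += 1

            def stale_replicas(self) -> list[Replica]:
                """Replicas that still need to be synchronised."""
                return [r for r in self.replicas
                        if r.version < self.master_version]

        ds = ReplicatedDataset("calibration.db")
        ds.replicas = [Replica("se1.cern.ch", 0), Replica("se2.infn.it", 0)]
        ds.update_master()
        print([r.location for r in ds.stale_replicas()])  # both are stale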

    Relaxed Data Consistency with CONStanza

    Data replication is an important aspect of a Data Grid for increasing fault tolerance and availability. Many Grid replication tools and middleware systems deal only with read-only files, which implies that replicated data items are always consistent. However, several applications do require updates to existing data and to the respective replicas. In this article we present a replica consistency service that allows for replica updates in a single-master scenario with lazy update synchronisation. The system supports updates of (heterogeneous) relational databases and is designed to support flat files as well. It keeps remote replicas synchronised and partially (“lazily”) consistent. We report on the design and implementation of this novel “relaxed” replica consistency service and show its usefulness in a typical application use case.
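
    The following Python sketch illustrates, under stated assumptions, the single-master lazy update scheme this abstract describes: writes go to one master replica and are recorded in a log, and secondary replicas are brought up to date asynchronously when a synchronisation step replays that log. Class and method names are hypothetical, not the CONStanza API.

        # Single-master, lazy ("relaxed") replication: the master records
        # updates in a log; a periodic synchronisation step replays the
        # log at the secondaries, so replicas lag between runs.

        class MasterReplica:
            def __init__(self):
                self.state: dict[str, str] = {}
                self.log: list[tuple[str, str]] = []  # pending updates

            def write(self, key: str, value: str) -> None:
                self.state[key] = value
                self.log.append((key, value))         # recorded, not pushed

        class SecondaryReplica:
            def __init__(self):
                self.state: dict[str, str] = {}

            def apply(self, updates: list[tuple[str, str]]) -> None:
                for key, value in updates:
                    self.state[key] = value

        def synchronise(master: MasterReplica,
                        secondaries: list[SecondaryReplica]) -> None:
            """Lazy propagation: run periodically, not on every write."""
            for s in secondaries:
                s.apply(master.log)
            master.log.clear()

        master, secondary = MasterReplica(), SecondaryReplica()
        master.write("run42.status", "calibrated")
        # secondary.state is still empty here: replicas are only lazily
        # consistent until the next synchronisation run.
        synchronise(master, [secondary])
        print(secondary.state)                        # now in sync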