
    DIRAC Data Management System

    The DIRAC Project develops software for building distributed computing systems for the needs of research communities. It provides a complete solution covering both the Workload Management and Data Management tasks of accessing computing and storage resources. The Data Management subsystem (DMS) of DIRAC includes all the components necessary to organize the distributed data of a given scientific community. The central component of the DMS is the File Catalog (DFC) service. It makes it possible to build a logical DIRAC file system that presents all the distributed storage elements to users as a single entity with transparent access to the data. The metadata functionality of the DFC service lets users classify data with user-defined tags, which can then be used to search efficiently for the data needed for a particular analysis. The DMS supports all the usual data management tasks: uploading, downloading, replication, file removal, etc. Special attention is paid to bulk data operations involving large numbers of files. Data operations can also be automated, driven by new data registrations. In this contribution we give an overview of the DIRAC Data Management System and examples of its usage by several research communities.

    Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

    This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research.
    Comment: 46 pages, 16 figures, Technical Report

    Enabling Inter-Repository Access Management between iRODS and Fedora

    4th International Conference on Open Repositories. This presentation was part of the session: Conference Presentations. Date: 2009-06-04, 08:30 AM – 10:00 AM.
    Many digital repositories have been built using different technologies such as Fedora and the integrated Rule-Oriented Data System (iRODS). This paper analyzes both the Fedora and iRODS technologies to understand how to integrate the two systems to enable cross-repository data sharing. The areas considered include the digital object model, services, management of distributed storage, external data resources, and policy enforcement.
    National Science Foundation