27 research outputs found

    Grid Data Management in Action: Experience in Running and Supporting Data Management Services in the EU DataGrid Project

    In the first phase of the EU DataGrid (EDG) project, a Data Management System has been implemented and provided for deployment. The components of the current EDG Testbed are: a prototype of a Replica Manager Service built around the basic services provided by Globus, a centralised Replica Catalogue to store information about physical locations of files, and the Grid Data Mirroring Package (GDMP) that is widely used in various HEP collaborations in Europe and the US for data mirroring. During this year these services have been refined and made more robust so that they are fit to be used in a pre-production environment. Application users have been using this first release of the Data Management Services for more than a year. In this paper we present the components and their interaction, our implementation and experience, as well as the feedback received from our user communities. We have resolved not only issues regarding integration with other EDG service components but also many of the interoperability issues with components of our partner projects in Europe and the U.S. The paper concludes with the basic lessons learned during this operation. These conclusions provide the motivation for the architecture of the next generation of Data Management Services that will be deployed in EDG during 2003. Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, CA, USA, March 2003, 9 pages, LaTeX, PSN: TUAT007

    Data Transfer Management In Grid-Based Mass Storage Environment.

    The drastic increase in the data requirements of scientific applications and collaborative research has resulted in large amounts of data being transferred among participating sites

    Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies

    A Grid is an infrastructure that involves the integrated and collaborative use of computers, networks, databases and scientific instruments owned and managed by multiple organizations. Grid applications often involve large amounts of data and/or computing resources that require secure resource sharing across organizational boundaries. This makes Grid application management and deployment a complex undertaking. Grid middlewares provide users with seamless computing ability and uniform access to resources in the heterogeneous Grid environment. Several software toolkits and systems have been developed all over the world, most of which are the results of academic research projects. This chapter focuses on four of these middlewares: UNICORE, Globus, Legion and Gridbus. It also presents our implementation of a resource broker for UNICORE, as this functionality was not supported in it. A comparison of these systems on the basis of architecture, implementation model and several other features is included. Comment: 19 pages, 10 figures

    Desarrollo de un grid interuniversitario: aplicaciones al estudio de problemas medioambientales [Development of an inter-university grid: applications to the study of environmental problems]

    This work presents a line of development among four national universities, integrated with efforts from other countries, to set up an inter-university Grid that allows supercomputing resources to be shared at relatively low cost. The applications that will be used as a testbed for Grid technology in this project are forest fire simulation algorithms and algorithms for studying flood models in lowland rivers. At a later stage, the following will be added: 3D reconstruction algorithms (in particular those applicable to medicine) and sequence recognition algorithms (in particular for DNA or genomic sequences). Within the framework of the CyTED project, a comparative analysis of different middlewares for building and deploying Grid systems was carried out. In particular, Globus Toolkit and GLite were compared and analysed. The analysis of GLite concluded that this middleware has high hardware and/or software requirements because its architecture is distributed across several machines (a characteristic not always supported in academic settings). Moreover, GLite is a centralised project (led by CERN), which makes it difficult for groups to contribute. For these reasons, the decision was made to install the Grid (initially) using the Globus Toolkit 4.0 middleware. Track: Academic networks. Distributed systems and networks. Facultad de Informática

    Replica Creation Algorithm for Data Grids

    A data grid system is a data management infrastructure that facilitates reliable access to and sharing of large amounts of data, storage resources, and data transfer services that can be scaled across distributed locations. This thesis presents a new replication algorithm that improves data access performance in data grids by distributing relevant data copies around the grid. The new Data Replica Creation Algorithm (DRCM) improves the performance of data grid systems by reducing job execution time and making the best use of data grid resources (network bandwidth and storage space). Current algorithms focus on the number of accesses when deciding which files to replicate and where to place them, which ignores the capabilities of the resources. DRCM differs by considering both user and resource perspectives, strategically placing replicas at the locations that provide the lowest transfer cost. The proposed algorithm uses three strategies: Replica Creation and Deletion Strategy (RCDS), Replica Placement Strategy (RPS), and Replica Replacement Strategy (RRS). DRCM was evaluated using network simulation (OptorSim) based on selected performance metrics (mean job execution time, efficient network usage, average storage usage, and computing element usage), scenarios, and topologies. Results revealed better job execution time with lower resource consumption than existing approaches. This research contributes replication strategies embodied in one algorithm that enhances data grid performance and is capable of deciding to create or delete more than one file during the same decision. Furthermore, a dependency-level-between-files criterion was utilized and integrated with the exponential growth/decay model to give an accurate file evaluation.

    A replicated file system for Grid computing

    To meet the rigorous demands of large-scale data sharing in global collaborations, we present a replication scheme for NFSv4 that supports mutable replication without sacrificing strong consistency guarantees. Experimental evaluation indicates a substantial performance advantage over a single-server system. With the introduction of a hierarchical replication control protocol, the overhead of replication is negligible even when applications mostly write and replication servers are widely distributed. Evaluation with the NAS Grid Benchmarks demonstrates that our system provides comparable and often better performance than GridFTP, the de facto standard for Grid data sharing. Copyright © 2008 John Wiley & Sons, Ltd. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/60228/1/1286_ftp.pd

    VLAM-G: Interactive Data Driven Workflow Engine for Grid-Enabled Resources
