Grid Data Management in Action: Experience in Running and Supporting Data Management Services in the EU DataGrid Project
In the first phase of the EU DataGrid (EDG) project, a Data Management System
has been implemented and provided for deployment. The components of the current
EDG Testbed are: a prototype of a Replica Manager Service built around the
basic services provided by Globus, a centralised Replica Catalogue to store
information about physical locations of files, and the Grid Data Mirroring
Package (GDMP) that is widely used in various HEP collaborations in Europe and
the US for data mirroring. During this year these services have been refined
and made more robust so that they are fit to be used in a pre-production
environment. Application users have been using this first release of the Data
Management Services for more than a year. In the paper we present the
components and their interaction, our implementation and experience as well as
the feedback received from our user communities. We have resolved not only
issues regarding integration with other EDG service components but also many of
the interoperability issues with components of our partner projects in Europe
and the U.S. The paper concludes with the basic lessons learned during this
operation. These conclusions provide the motivation for the architecture of the
next generation of Data Management Services that will be deployed in EDG during
2003.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, Ca, USA, March 2003, 9 pages, LaTeX, PSN: TUAT007 all
figures are in the directory "figures
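The centralised Replica Catalogue described above stores the physical locations of files. A minimal Python sketch of that idea follows. It is an illustration only, not the EDG implementation: the class name, method names, and the example file names and URLs are all hypothetical.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy centralised catalogue: logical file name -> physical locations.

    Hypothetical illustration; not the EDG or Globus API.
    """

    def __init__(self):
        # Each logical file name maps to the set of its physical copies.
        self._replicas = defaultdict(set)

    def register(self, logical_name, physical_location):
        """Record that a copy of the logical file exists at this location."""
        self._replicas[logical_name].add(physical_location)

    def unregister(self, logical_name, physical_location):
        """Forget one copy; drop the entry when no copies remain."""
        self._replicas[logical_name].discard(physical_location)
        if not self._replicas[logical_name]:
            del self._replicas[logical_name]

    def lookup(self, logical_name):
        """Return all known physical locations, sorted for stable output."""
        return sorted(self._replicas.get(logical_name, ()))


catalogue = ReplicaCatalogue()
catalogue.register("lfn:run42.dat", "gsiftp://site-b.example.org/data/run42.dat")
catalogue.register("lfn:run42.dat", "gsiftp://site-a.example.org/data/run42.dat")
print(catalogue.lookup("lfn:run42.dat"))
```

A single dictionary is enough to show the mapping; a real catalogue would also need persistence, access control, and consistency handling, none of which are sketched here.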
Data Transfer Management In Grid-Based Mass Storage Environment.
The drastic increase in the data requirements of scientific applications and collaborative research has resulted in the transfer of large amounts of data among participating sites.
Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies
Grid is an infrastructure that involves the integrated and collaborative use
of computers, networks, databases and scientific instruments owned and managed
by multiple organizations. Grid applications often involve large amounts of
data and/or computing resources that require secure resource sharing across
organizational boundaries. This makes Grid application management and
deployment a complex undertaking. Grid middlewares provide users with seamless
computing ability and uniform access to resources in the heterogeneous Grid
environment. Several software toolkits and systems have been developed, most of
which are results of academic research projects, all over the world. This
chapter will focus on four of these middlewares--UNICORE, Globus, Legion and
Gridbus. It also presents our implementation of a resource broker for UNICORE
as this functionality was not supported in it. A comparison of these systems on
the basis of the architecture, implementation model and several other features
is included.
Comment: 19 pages, 10 figures
Development of an inter-university grid: applications to the study of environmental problems
This work presents a line of development shared by four National Universities, integrated with efforts in other countries, to set up an inter-university Grid that allows supercomputing resources to be shared at relatively low cost.
The applications that will serve as test beds for Grid technology in this project are:
- Forest fire simulation algorithms.
- Algorithms for studying flood models in lowland rivers.
At a later stage the following will be added:
- 3D reconstruction algorithms (applicable in particular to medicine).
- Sequence recognition algorithms (in particular for DNA or genomic sequences).
Within the framework of the CyTED project, a comparative analysis of different middlewares for building and deploying GRID systems was carried out. In particular, Globus Toolkit and GLite were compared and analysed.
The analysis of GLite concluded that this middleware has high hardware and/or software requirements because its architecture is distributed across several machines (something not always supported in academic settings). In addition, GLite is a centralised project (led by CERN), which makes it difficult for groups to contribute. For these reasons, the decision was taken to (initially) install the GRID using the Globus Toolkit 4.0 middleware.
Track: Academic networks. Distributed systems and networks
Facultad de Informátic
Replica Creation Algorithm for Data Grids
A data grid system is a data management infrastructure that facilitates reliable access to and sharing of large amounts of data, storage resources, and data transfer services, and that can be scaled across distributed locations. This thesis presents a new replication algorithm that improves data access performance in data grids by distributing relevant data copies around the grid. The new Data Replica Creation Algorithm (DRCM) improves the performance of data grid systems by reducing job execution time and making the best use of data grid resources (network bandwidth and storage space). Current algorithms focus on the number of accesses when deciding which file to replicate and where to place it, and so ignore the capabilities of the resources. DRCM differs by considering both user and resource perspectives, strategically placing replicas at the locations that provide the lowest transfer cost. The proposed algorithm uses three strategies: the Replica Creation and Deletion Strategy (RCDS), the Replica Placement Strategy (RPS), and the Replica Replacement Strategy (RRS). DRCM was evaluated by network simulation (OptorSim) against selected performance metrics (mean job execution time, efficient network usage, average storage usage, and computing element usage), scenarios, and topologies. The results showed better job execution time with lower resource consumption than existing approaches. This research contributes replication strategies embodied in a single algorithm that enhances data grid performance and can decide to create or delete more than one file within the same decision. Furthermore, a dependency-level-between-files criterion was used and integrated with the exponential growth/decay model to give an accurate file evaluation.
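The two ideas named in this abstract, valuing a file with an exponential decay model and placing a replica where transfer cost is lowest, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the DRCM algorithm itself: the function names, the decay rate, the cost formula (file size divided by link bandwidth), and the site names are all hypothetical.

```python
import math


def file_value(access_times, now, decay_rate=0.1):
    """Score a file so that recent accesses count more than old ones.

    Each access contributes exp(-decay_rate * age); decay_rate is an
    assumed tuning parameter, not a value taken from the thesis.
    """
    return sum(math.exp(-decay_rate * (now - t)) for t in access_times)


def transfer_cost(file_size_mb, bandwidth_mbps):
    """Seconds to move a file: megabytes * 8 bits / megabits per second."""
    return file_size_mb * 8 / bandwidth_mbps


def best_site(file_size_mb, requests_per_site, bandwidth):
    """Pick the site that minimises total request-weighted transfer cost.

    bandwidth[(src, dst)] is the link speed in Mbit/s between two sites;
    requests served locally at the candidate site cost nothing.
    """
    def total_cost(candidate):
        return sum(
            count * transfer_cost(file_size_mb, bandwidth[(candidate, site)])
            for site, count in requests_per_site.items()
            if site != candidate
        )

    # Sorting first makes ties resolve to the alphabetically first site.
    return min(sorted(requests_per_site), key=total_cost)


sites = ["A", "B", "C"]
links = {(s, d): 100 for s in sites for d in sites if s != d}
print(best_site(500, {"A": 10, "B": 1, "C": 1}, links))
```

With equal link speeds, the site issuing the most requests wins because placing the replica there leaves the fewest remote transfers. Storage limits, replica deletion, and replacement are deliberately left out of this sketch.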
A replicated file system for Grid computing
To meet the rigorous demands of large-scale data sharing in global collaborations, we present a replication scheme for NFSv4 that supports mutable replication without sacrificing strong consistency guarantees. Experimental evaluation indicates a substantial performance advantage over a single-server system. With the introduction of a hierarchical replication control protocol, the overhead of replication is negligible even when applications mostly write and replication servers are widely distributed. Evaluation with the NAS Grid Benchmarks demonstrates that our system provides comparable and often better performance than GridFTP, the de facto standard for Grid data sharing.
Copyright © 2008 John Wiley & Sons, Ltd.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/60228/1/1286_ftp.pd