
    An enhanced dynamic replica creation and eviction mechanism in data grid federation environment

    A Data Grid Federation is an infrastructure that connects several grid systems and facilitates the sharing of large amounts of data as well as storage and computing resources. Existing data replication mechanisms compute file values from the number of file accesses when deciding which file to replicate, and place new replicas on locations that provide minimum read cost. This thesis presents an enhanced data replication strategy, the Dynamic Replica Creation and Eviction Mechanism (DRCEM), which improves the utilization of data grid resources by allocating replicas to appropriate sites around the federation. DRCEM instead computes file values from logical dependencies between files when deciding which file to replicate, and allocates new replicas to locations that provide minimum replica placement cost. The proposed mechanism uses three schemes: 1) Dynamic Replica Evaluation and Creation Scheme, 2) Replica Placement Scheme, and 3) Dynamic Replica Eviction Scheme. DRCEM was evaluated using the OptorSim network simulator on four performance metrics: 1) Jobs Completion Times, 2) Effective Network Usage, 3) Storage Element Usage, and 4) Computing Element Usage. DRCEM outperforms the ELALW and DRCM mechanisms by 30% and 26% respectively in terms of Jobs Completion Times, and consumes 42% and 40% less storage than ELALW and DRCM. However, DRCEM shows lower performance than existing mechanisms on Computing Element Usage, due to the additional computation of files' logical dependencies. Overall, the results revealed better job completion times with lower resource consumption than existing approaches. This research produces three replication schemes embodied in one mechanism that enhances the performance of the Data Grid Federation environment, contributing an enhanced mechanism capable of deciding to create or evict more than one file at a particular time. Furthermore, files' logical dependencies were integrated into the replica creation scheme to evaluate data files more accurately.
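
    The abstract does not give DRCEM's formulas, so the following is only a minimal sketch of the idea it describes: value files by their logical dependencies as well as their accesses, then place a new replica at the site with the lowest placement cost. The value and cost functions, weights, and site fields below are illustrative assumptions, not the thesis' definitions.

    ```python
    # Hypothetical sketch of DRCEM-style creation and placement. The value and
    # cost formulas below are illustrative assumptions, not the thesis' formulas.

    def file_value(accesses: int, dependents: int, dep_weight: float = 0.5) -> float:
        """Value a file by its accesses plus a bonus for logical dependencies."""
        return accesses + dep_weight * dependents

    def placement_cost(site: dict, file_size_mb: float) -> float:
        """Toy placement cost: transfer time plus a storage-pressure penalty."""
        transfer = file_size_mb / site["bandwidth_mbps"]
        pressure = file_size_mb / max(site["free_storage_mb"], 1)
        return transfer + pressure

    files = {"fileA": file_value(accesses=40, dependents=3),
             "fileB": file_value(accesses=55, dependents=0)}
    candidate = max(files, key=files.get)          # most valuable file to replicate

    sites = [{"name": "site-A", "bandwidth_mbps": 100, "free_storage_mb": 5000},
             {"name": "site-B", "bandwidth_mbps": 400, "free_storage_mb": 800}]
    target = min(sites, key=lambda s: placement_cost(s, file_size_mb=1200))
    print(candidate, "->", target["name"])
    ```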

    A Prediction-Based Replication Algorithm for Improving Data Availability in Grid Environment

    Data replication is a key optimization technique for reducing access latency and managing large data by storing replicas of data wisely. In this paper, we propose a data replication algorithm, called the Prediction-Based Dynamic Replication (PBDR) algorithm, that improves file access time. Because storage capacity is limited, it is essential to design an effective strategy for the replica replacement task. PBDR deletes files by considering four important factors: the predicted number of future requests for the replica, its availability, its size, and the last time the replica was requested. It can also minimize access latency by selecting the best replica when several sites hold replicas of a dataset. The algorithm is simulated using OptorSim, a data grid simulator developed by the European DataGrid project. The experimental results show that the PBDR strategy gives better performance than the other algorithms and prevents unnecessary replica creation, which leads to efficient storage usage.
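
    As a rough illustration of the replacement idea, the sketch below folds the four stated factors into a single eviction score; the actual PBDR weighting is not given in the abstract, so this particular combination is an assumption.

    ```python
    import time

    # Illustrative PBDR-style score; how the paper actually combines the four
    # factors is not stated in the abstract, so this formula is an assumption.

    def eviction_score(predicted_requests, availability, size_mb,
                       last_request_ts, now=None):
        """Lower score = better candidate for deletion.

        Few predicted future requests, many copies elsewhere (availability),
        large size, and a long-idle replica all make a file cheaper to delete.
        """
        now = now or time.time()
        idle_hours = (now - last_request_ts) / 3600
        return predicted_requests / (availability * size_mb * (1 + idle_hours))

    now = time.time()
    scores = {
        "fileA": eviction_score(50, availability=2, size_mb=100,
                                last_request_ts=now - 600, now=now),
        "fileB": eviction_score(2, availability=5, size_mb=2000,
                                last_request_ts=now - 86400, now=now),
    }
    print(min(scores, key=scores.get))  # 'fileB' is deleted first
    ```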

    Replica Creation Algorithm for Data Grids

    A data grid system is a data management infrastructure that facilitates reliable access and sharing of large amounts of data, storage resources, and data transfer services that can be scaled across distributed locations. This thesis presents a new replication algorithm that improves data access performance in data grids by distributing relevant data copies around the grid. The new Data Replica Creation Algorithm (DRCM) improves the performance of data grid systems by reducing job execution time and making the best use of data grid resources (network bandwidth and storage space). Current algorithms focus on the number of accesses when deciding which files to replicate and where to place them, which ignores resources' capabilities. DRCM differs by considering both user and resource perspectives, strategically placing replicas at locations that provide the lowest transfer cost. The proposed algorithm uses three strategies: Replica Creation and Deletion Strategy (RCDS), Replica Placement Strategy (RPS), and Replica Replacement Strategy (RRS). DRCM was evaluated using network simulation (OptorSim) based on selected performance metrics (mean job execution time, effective network usage, average storage usage, and computing element usage), scenarios, and topologies. Results revealed better job execution times with lower resource consumption than existing approaches. This research contributes replication strategies embodied in one algorithm that enhances data grid performance and is capable of deciding to create or delete more than one file within the same decision. Furthermore, a dependency-level-between-files criterion was utilized and integrated with the exponential growth/decay model to give an accurate file evaluation.
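
    A minimal sketch of how the exponential growth/decay model might be combined with a between-files dependency level when evaluating a file; the rate fitting, the dep_weight parameter, and the blending below are illustrative assumptions rather than DRCM's actual formula.

    ```python
    import math

    def predicted_accesses(prev, curr, horizon=1.0):
        """Project next-interval accesses from an exponential growth/decay rate
        fitted over two consecutive intervals: curr = prev * e^k."""
        k = math.log(max(curr, 1) / max(prev, 1))
        return curr * math.exp(k * horizon)

    def file_score(prev, curr, dependency_level, dep_weight=0.3):
        """Blend predicted popularity with how many files depend on this one."""
        return predicted_accesses(prev, curr) * (1 + dep_weight * dependency_level)

    # A growing file that other files depend on scores highest.
    print(file_score(prev=10, curr=15, dependency_level=2))
    ```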

    Reliability and Availability Improvement in Economic Data Grid Environment Based On Clustering Approach

    One of the important problems in grid environments is data replication across grid sites, and in some cases the reliability and availability of replicated data are low. Clustering can be used to separate sites with high reliability and availability from sites with low reliability and availability. In this study, the data grid dynamically evaluates and predicts the condition of its sites: the reliability and availability of each site are calculated and used to make replication decisions. With these calculations, the grid has information on users' locations together with a reliability, availability, or cost commensurate with the value of the work they perform, so data can be sent to users with suitable reliability and availability. Simulation results show that adding the two parameters, reliability and availability, improves the assessment criteria under certain access patterns.
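
    As a sketch of the clustering idea, the code below computes per-site reliability and availability from observed history and splits sites into high and low clusters; the metrics and thresholds are assumptions, since the paper's exact definitions are not given in the abstract.

    ```python
    # Minimal sketch: derive reliability and availability per site, then
    # cluster sites by threshold. Thresholds and metrics are illustrative.

    def reliability(successful_jobs: int, total_jobs: int) -> float:
        return successful_jobs / total_jobs if total_jobs else 0.0

    def availability(uptime_h: float, total_h: float) -> float:
        return uptime_h / total_h if total_h else 0.0

    def cluster_sites(sites, r_min=0.9, a_min=0.95):
        """Sites passing BOTH thresholds form the 'high' cluster; valuable
        data would be replicated there first."""
        high, low = [], []
        for s in sites:
            r = reliability(s["ok_jobs"], s["jobs"])
            a = availability(s["uptime_h"], s["hours"])
            (high if r >= r_min and a >= a_min else low).append(s["name"])
        return high, low

    sites = [
        {"name": "site-A", "ok_jobs": 98, "jobs": 100, "uptime_h": 719, "hours": 720},
        {"name": "site-B", "ok_jobs": 60, "jobs": 100, "uptime_h": 500, "hours": 720},
    ]
    print(cluster_sites(sites))  # (['site-A'], ['site-B'])
    ```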

    Simulation of consistency models for replicated files on Grid systems

    The aim of this thesis is to analyze the consistency problems of replicated files on Grid systems and to design, simulate, and compare some possible solutions. One of the characteristics of Grid systems is to allow simple, secure, and coordinated access to an enormous quantity of data (on the order of petabytes) distributed across the various nodes of the system, and to provide adequate computing power to process it. To improve data access, replication techniques are used which, while producing great advantages, also increase the amount of data to manage and introduce new management problems, such as consistency. If we allow a user to modify one replica of a piece of data, we also need mechanisms to synchronize the other replicas. After an in-depth study of the problem, we examine some possible solutions and run simulations to evaluate their impact on the system.
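
    As one concrete example of the kind of mechanism such a study might simulate, here is a toy versioned primary-copy scheme in which stale replicas are refreshed lazily on read; this is a generic illustration, not one of the specific models evaluated in the thesis.

    ```python
    # Toy primary-copy consistency: writes bump the master version, and stale
    # replicas pull the new content on their next read. Generic sketch only.

    class ReplicatedFile:
        def __init__(self, content: bytes):
            self.master_version = 1
            self.master_content = content
            self.replicas = {}  # site -> (version, content)

        def add_replica(self, site: str):
            self.replicas[site] = (self.master_version, self.master_content)

        def write(self, content: bytes):
            """Update the master copy; replicas become stale until refreshed."""
            self.master_version += 1
            self.master_content = content

        def read(self, site: str) -> bytes:
            """Lazy consistency: refresh a stale replica on access."""
            version, _ = self.replicas[site]
            if version < self.master_version:
                self.replicas[site] = (self.master_version, self.master_content)
            return self.replicas[site][1]

    f = ReplicatedFile(b"v1")
    f.add_replica("site-A")
    f.write(b"v2")
    print(f.read("site-A"))  # b'v2' after the lazy refresh
    ```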

    A dynamic replication strategy based on exponential growth/decay rate

    Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. To increase resource availability and to ease resource sharing in such an environment, there is a need for replication services. Data replication is one of the methods used to improve the performance of data access in distributed systems. In this paper, we discuss issues arising in the data replication domain and propose a dynamic replication strategy based on an exponential growth or decay rate. The purpose of the proposed strategy is to identify which files should be replicated. This is achieved by estimating the number of accesses of a file in the upcoming time interval; the greater the value, the more popular the file is, and it will therefore be selected for replication.
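
    The core estimate can be sketched directly: if a file's accesses went from n0 to n1 over the last interval, fit the rate k = ln(n1 / n0) (growth when k > 0, decay when k < 0) and project the next interval as n1 * e^k. The two-interval fit and the replication threshold below are illustrative assumptions.

    ```python
    import math

    def next_interval_accesses(n0: int, n1: int) -> float:
        """Predict the next interval's accesses from an exponential rate
        fitted over the last two observed intervals: n1 = n0 * e^k."""
        k = math.log(max(n1, 1) / max(n0, 1))
        return n1 * math.exp(k)

    # Files whose predicted access count exceeds a threshold get replicated.
    history = {"fileA": (8, 20), "fileB": (30, 12)}
    to_replicate = [f for f, (n0, n1) in history.items()
                    if next_interval_accesses(n0, n1) > 25]
    print(to_replicate)  # ['fileA'] — the growing file is selected
    ```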

    Implementation of Sub-Grid-Federation Model for Performance Improvement in Federated Data Grid

    In this work, a new model for a federated data grid system called Sub-Grid-Federation was designed to improve access latency by accessing data from the nearest possible sites. The strategy for optimising data access is based on searching within an area identified as the 'Network Core Area' (NCA). The access-latency performance of Sub-Grid-Federation was proven mathematically and simulated using the OptorSim simulator. Four case studies were carried out and tested in the Optimal Downloading Replication Strategy (ODRS) and Sub-Grid-Federation. The results show that Sub-Grid-Federation is 20% better in terms of access latency and 21% better in terms of reducing remote site accesses compared to ODRS. The results indicate that Sub-Grid-Federation is a better alternative for the implementation of collaboration and data sharing in data grid systems.

    Keywords: Data grid, replication, scheduling, access latency
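
    A minimal sketch of the lookup order the abstract describes: prefer replicas inside the requester's Network Core Area and fall back to remote sites only when no core-area replica exists. The catalogue structure and latency-based tie-breaking are assumptions for illustration, not the paper's definitions.

    ```python
    # Hypothetical Sub-Grid-Federation-style lookup: NCA sites first,
    # remote holders only as a fallback.

    def find_replica(file_id, requester_nca, catalogue, latency_ms):
        """Return the lowest-latency site holding file_id, preferring the NCA."""
        holders = catalogue.get(file_id, [])
        core = [s for s in holders if s in requester_nca]
        candidates = core or holders  # fall back to remote sites
        return min(candidates, key=lambda s: latency_ms[s], default=None)

    catalogue = {"fileX": ["site-B", "site-D"]}
    latency_ms = {"site-B": 40, "site-D": 5}
    print(find_replica("fileX", requester_nca={"site-A", "site-B"},
                       catalogue=catalogue, latency_ms=latency_ms))  # site-B (in NCA)
    ```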

    Relationship based replication algorithm for data grid

    Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed systems. To increase resource availability and to ease resource sharing in such an environment, there is a need for replication services. This research proposes a replication algorithm, termed Relationship-based Replication (RBR), that integrates the user, grid, and system perspectives. In particular, RBR uses information from three different relationships in identifying file(s) that require replication: file-to-user, file-to-file, and file-to-grid. Such an approach overcomes existing algorithms that are based on either user requests or resource capabilities alone. The Relationship-based Replication algorithm aims to improve Data Grid performance by reducing job execution time, bandwidth usage, and storage usage. RBR was realized using a network simulation (OptorSim), and experimental results revealed that it offers better performance than existing replication algorithms.
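
    As a rough sketch, the three relationship scores can be blended into a single replication value; the weights and the score definitions below are assumptions, since the abstract does not specify RBR's formula.

    ```python
    # Illustrative blend of the three RBR relationships; the weighting
    # scheme is an assumption, not the paper's actual formula.

    def replication_value(file_to_user: float, file_to_file: float,
                          file_to_grid: float, w=(0.4, 0.3, 0.3)) -> float:
        """Weighted blend of user demand, inter-file dependency, and how well
        grid resources (bandwidth/storage) can host another replica."""
        return w[0] * file_to_user + w[1] * file_to_file + w[2] * file_to_grid

    files = {
        "fileA": replication_value(0.9, 0.6, 0.7),   # heavily requested
        "fileB": replication_value(0.2, 0.1, 0.9),   # cheap to host, little demand
    }
    print(max(files, key=files.get))  # 'fileA' is replicated first
    ```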

    An Effective Weighted Data Replication Strategy for Data Grid

    Data Grid is a good solution to large-scale data management problems, including efficient file transfer and replication. Dynamic data replication in Data Grid aims to improve data access time and to utilize network and storage resources efficiently. Since the data files are very large and grid storage is limited, managing replicas in storage for more effective utilization requires attention. In this paper, a dynamic data replication strategy called Modified Latest Access Largest Weight (MLALW) is proposed. This strategy is an enhanced version of the Latest Access Largest Weight strategy. MLALW deletes files by considering three important factors: least frequently used replicas, least recently used replicas, and the size of the replica. MLALW stores each replica in an appropriate site, i.e., the site in the region that has the highest predicted number of future accesses for that particular replica. The algorithm is simulated using OptorSim, a Data Grid simulator developed by the European DataGrid project. The experimental results show that the MLALW strategy gives better performance compared to the other algorithms and prevents unnecessary creation of replicas, which leads to efficient storage usage.
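
    A minimal sketch of an MLALW-style replacement weight combining the three stated factors (frequency, recency, size); the exact MLALW formula is not in the abstract, so this combination is an assumption.

    ```python
    import time

    # Illustrative weight: higher = more worth keeping. The lowest-weight
    # replica is evicted first. Formula is an assumption, not MLALW's own.

    def replica_weight(access_count, last_access_ts, size_mb, now=None):
        now = now or time.time()
        recency = 1.0 / (1.0 + (now - last_access_ts) / 3600)  # decays per idle hour
        return (access_count * recency) / size_mb  # big, cold, unpopular files lose

    now = time.time()
    weights = {
        "fileA": replica_weight(120, now - 300, 500, now),
        "fileB": replica_weight(3, now - 48 * 3600, 1500, now),
    }
    print(min(weights, key=weights.get))  # 'fileB' is evicted
    ```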