5,334 research outputs found

    A dynamic replication strategy based on exponential growth/decay rate

    A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. To increase resource availability and to ease resource sharing in such an environment, there is a need for replication services. Data replication is one of the methods used to improve the performance of data access in distributed systems. In this paper, we discuss issues arising in the data replication domain and propose a dynamic replication strategy based on an exponential growth or decay rate. The purpose of the proposed strategy is to identify which files should be replicated. This is achieved by estimating the number of accesses of a file in the upcoming time interval. The greater the value, the more popular the file is, and therefore the more likely it is to be selected for replication.
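As an illustration of the idea in this abstract, the following is a minimal sketch of predicting next-interval accesses from an exponential growth or decay rate. It is not the authors' implementation: the compounding form `n_next = n_last * (1 + r)`, the use of the mean relative change as `r`, and all function names are assumptions made for this example.

```python
from typing import Dict, List


def average_growth_rate(counts: List[int]) -> float:
    """Mean relative change between consecutive intervals (assumed form of r)."""
    rates = [(b - a) / a for a, b in zip(counts, counts[1:]) if a > 0]
    return sum(rates) / len(rates) if rates else 0.0


def predict_next_accesses(counts: List[int]) -> float:
    """Estimate accesses in the upcoming interval as n_last * (1 + r)."""
    if not counts:
        return 0.0
    return counts[-1] * (1.0 + average_growth_rate(counts))


def select_files_to_replicate(history: Dict[str, List[int]], top_k: int = 1) -> List[str]:
    """Rank files by predicted accesses and return the top_k most popular."""
    ranked = sorted(history, key=lambda f: predict_next_accesses(history[f]), reverse=True)
    return ranked[:top_k]


# Hypothetical per-interval access counts for three files.
history = {"a": [10, 20, 40], "b": [40, 20, 10], "c": [5, 5, 5]}
candidates = select_files_to_replicate(history, top_k=1)
```

A positive `r` models growing popularity and a negative `r` models decay, so a file whose accesses are doubling outranks one whose accesses are halving even if the latter had more accesses in the past.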

    A dynamic replica creation: Which file to replicate?

    A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. To increase resource availability and to ease resource sharing in such an environment, there is a need for replication services. Data replication is one of the methods used to improve the performance of data access in distributed systems. In this paper, we propose a dynamic replication strategy (EXPM) that is based on an exponential growth or decay rate and on the dependency level of data files. Simulation results (via OptorSim) show that EXPM outperformed LALW in the measured metrics: mean job execution time, effective network usage, and average storage usage.

    A Prediction-Based Replication Algorithm for Improving Data Availability in Grid Environment

    Data replication is a key optimization technique for reducing access latency and managing large data by storing replicas of data wisely. In this paper, we propose a data replication algorithm, called the Prediction-Based Dynamic Replication (PBDR) algorithm, that improves file access time. Because storage capacity is limited, it is essential to design an effective strategy for the replica replacement task. PBDR deletes files by considering four important factors: the predicted number of future requests for the replica, its availability, the size of the replica, and the last time the replica was requested. It can also minimize access latency by selecting the best replica when several sites hold replicas of a dataset. The algorithm is simulated using a data grid simulator, OptorSim, developed by the European Data Grid project. The experimental results show that the PBDR strategy gives better performance than the other algorithms and prevents unnecessary creation of replicas, which leads to efficient storage usage.

    An enhanced dynamic replica creation and eviction mechanism in data grid federation environment

    A Data Grid Federation system is an infrastructure that connects several grid systems and facilitates the sharing of large amounts of data, as well as storage and computing resources. The existing mechanisms for data replication focus on finding file values based on the number of file accesses when deciding which file to replicate, and place new replicas at locations that provide minimum read cost. DRCEM finds file values based on logical dependencies when deciding which file to replicate, and allocates new replicas at locations that provide minimum replica placement cost. This thesis presents an enhanced data replication strategy known as the Dynamic Replica Creation and Eviction Mechanism (DRCEM), which makes effective use of data grid resources by allocating appropriate replica sites around the federation. The proposed mechanism uses three schemes: 1) Dynamic Replica Evaluation and Creation Scheme, 2) Replica Placement Scheme, and 3) Dynamic Replica Eviction Scheme. DRCEM was evaluated using the OptorSim network simulator based on four performance metrics: 1) Jobs Completion Times, 2) Effective Network Usage, 3) Storage Element Usage, and 4) Computing Element Usage. DRCEM outperforms the ELALW and DRCM mechanisms by 30% and 26%, respectively, in terms of Jobs Completion Times. In addition, DRCEM consumes 42% and 40% less storage than ELALW and DRCM, respectively. However, DRCEM shows lower performance than the existing mechanisms regarding Computing Element Usage, due to the additional computation of files' logical dependencies. Results revealed better jobs completion times with lower resource consumption than existing approaches. This research produces three replication schemes embodied in one mechanism that enhances the performance of the Data Grid Federation environment. This has contributed to the enhancement of the existing mechanism, which is capable of deciding to either create or evict more than one file during a particular time. Furthermore, files' logical dependencies were integrated into the replica creation scheme to evaluate data files more accurately.

    Replica Creation Algorithm for Data Grids

    A data grid system is a data management infrastructure that facilitates reliable access to and sharing of large amounts of data, storage resources, and data transfer services that can be scaled across distributed locations. This thesis presents a new replication algorithm that improves data access performance in data grids by distributing relevant data copies around the grid. The new Data Replica Creation Algorithm (DRCM) improves the performance of data grid systems by reducing job execution time and making the best use of data grid resources (network bandwidth and storage space). Current algorithms focus on the number of accesses when deciding which file to replicate and where to place it, which ignores resources' capabilities. DRCM differs by considering both user and resource perspectives, strategically placing replicas at locations that provide the lowest transfer cost. The proposed algorithm uses three strategies: Replica Creation and Deletion Strategy (RCDS), Replica Placement Strategy (RPS), and Replica Replacement Strategy (RRS). DRCM was evaluated using network simulation (OptorSim) based on selected performance metrics (mean job execution time, effective network usage, average storage usage, and computing element usage), scenarios, and topologies. Results revealed better job execution time with lower resource consumption than existing approaches. This research contributes replication strategies embodied in one algorithm that enhances data grid performance and is capable of deciding to create or delete more than one file during the same decision. Furthermore, a dependency-level-between-files criterion was integrated with the exponential growth/decay model to give an accurate file evaluation.
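The placement idea in this abstract, choosing the location with the lowest transfer cost, can be sketched as follows. This is only an illustrative sketch and not the DRCM Replica Placement Strategy itself: the cost model (file size divided by link bandwidth, summed over requesting sites), the data layout, and every name below are assumptions.

```python
from typing import Dict, List


def transfer_cost(file_size_mb: float, bandwidth_mbps: float) -> float:
    """Seconds to move a file over one link (megabytes -> megabits / Mbps)."""
    return file_size_mb * 8.0 / bandwidth_mbps


def best_replica_site(
    file_size_mb: float,
    requesters: List[str],
    bandwidth: Dict[str, Dict[str, float]],
) -> str:
    """Return the candidate site with the lowest total transfer cost to all requesters.

    bandwidth[a][b] is the assumed link bandwidth in Mbps between sites a and b.
    A requester that hosts the replica itself incurs no transfer cost.
    """
    def total_cost(site: str) -> float:
        return sum(
            transfer_cost(file_size_mb, bandwidth[site][r])
            for r in requesters
            if r != site
        )

    return min(bandwidth, key=total_cost)


# Hypothetical three-site topology: s2 has fast links to both other sites.
bandwidth = {
    "s1": {"s2": 100.0, "s3": 10.0},
    "s2": {"s1": 100.0, "s3": 100.0},
    "s3": {"s1": 10.0, "s2": 100.0},
}
site = best_replica_site(1000.0, ["s1", "s3"], bandwidth)
```

In this toy topology the well-connected site `s2` is chosen even though it does not request the file, which shows how a resource-aware cost can differ from placing the replica at the most frequent requester.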

    Network and Energy-Aware Resource Selection Model for Opportunistic Grids

    Due to increasing hardware capacity, computing grids have been handling and processing more data. This has led to grids consuming more energy, hence the need for strategies to reduce their energy consumption. Scheduling is the process that determines on which node of the grid each task will be executed. This process can significantly affect global system performance, including energy consumption. This paper focuses on a scheduling model for opportunistic grids that considers network traffic, the distance between input files and the execution node, and the execution node's status. The model was tested in a simulated environment created using GreenCloud. Compared with a usual approach, the simulation results of this model show a total power consumption saving of 7.10%.

    Geoprocessing Optimization in Grids

    Geoprocessing is commonly used to solve problems across disciplines that feature geospatial data and/or phenomena. Geoprocessing requires specialized algorithms and, more recently, due to large volumes of geospatial data and complex geoprocessing operations, it has become data- and/or compute-intensive. The conventional approach, which is predominantly based on centralized computing solutions, is unable to handle geoprocessing efficiently. There is therefore a need to develop distributed geoprocessing solutions that take advantage of existing and emerging techniques and of high-performance computing and communications resources. As an emerging computing paradigm, grid computing offers a novel approach for integrating distributed computing resources and supporting collaboration across networks, making it suitable for geoprocessing. Although there have been research efforts applying grid computing in the geospatial domain, there is currently a void in the literature regarding general geoprocessing optimization. In this research, a new optimization technique for geoprocessing in grid systems, Geoprocessing Optimization in Grids (GOG), is designed and developed. The objective of GOG is to reduce overall response time at a reasonable cost. To meet this objective, GOG contains a set of algorithms, including a resource selection algorithm and a parallelism processing algorithm, to speed up query execution. GOG is validated by comparing its optimization time and the estimated costs of its generated execution plans with those of two existing optimization techniques. A proof of concept based on an application in air quality control is developed to demonstrate the advantages of GOG.