
    Analysis and selection of the simulation environment

    This document provides the initial report of the Simulation work package (Work Package 4, WP4) of the CATNETS project. It contains an analysis of the requirements for a simulation tool to be used in CATNETS and an evaluation of a number of grid and general-purpose simulators against the selected requirements. A reasoned choice of a suitable simulator is made on the basis of this evaluation. -- This work analyses the requirements for a simulation environment for studying the Catallaxy. The choice of simulation environment is determined on the basis of key metrics.
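    The report's metric-driven selection can be pictured as a weighted decision matrix. The sketch below is a minimal illustration only: the criteria, weights, scores, and the two candidate names are assumptions, not the report's actual requirements list or result.

```python
# Weighted decision matrix for simulator selection (illustrative values only).
WEIGHTS = {"scalability": 0.4, "extensibility": 0.3, "documentation": 0.2, "performance": 0.1}

SCORES = {  # 1 (poor) to 5 (excellent), per candidate simulator (hypothetical)
    "OptorSim": {"scalability": 3, "extensibility": 4, "documentation": 4, "performance": 3},
    "GridSim":  {"scalability": 5, "extensibility": 3, "documentation": 3, "performance": 4},
}

def weighted_score(scores: dict) -> float:
    """Sum each criterion score weighted by its requirement importance."""
    return sum(WEIGHTS[criterion] * value for criterion, value in scores.items())

best = max(SCORES, key=lambda name: weighted_score(SCORES[name]))
print(best, {name: round(weighted_score(s), 2) for name, s in SCORES.items()})
```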

    An enhanced dynamic replica creation and eviction mechanism in data grid federation environment

    A Data Grid Federation system is an infrastructure that connects several grid systems, facilitating the sharing of large amounts of data as well as storage and computing resources. This thesis presents an enhanced data replication strategy known as the Dynamic Replica Creation and Eviction Mechanism (DRCEM), which improves the utilization of data grid resources by allocating replicas to appropriate sites around the federation. Existing replication mechanisms value a file by its number of accesses when deciding which file to replicate, and place new replicas at locations that provide minimum read cost; DRCEM instead values files by their logical dependencies and allocates new replicas to locations that provide minimum replica placement cost. The proposed mechanism uses three schemes: 1) a Dynamic Replica Evaluation and Creation Scheme, 2) a Replica Placement Scheme, and 3) a Dynamic Replica Eviction Scheme. DRCEM was evaluated using the OptorSim network simulator on four performance metrics: 1) job completion time, 2) effective network usage, 3) storage element usage, and 4) computing element usage. DRCEM outperforms the ELALW and DRCM mechanisms by 30% and 26% respectively in terms of job completion time, and consumes 42% and 40% less storage than ELALW and DRCM. However, DRCEM shows lower performance than existing mechanisms regarding computing element usage, due to the additional computation of files' logical dependencies. Overall, the results revealed better job completion times with lower resource consumption than existing approaches. This research produces three replication schemes embodied in one mechanism that enhances the performance of a Data Grid Federation environment. It enhances the existing mechanism by being able to decide to create or evict more than one file at a particular time, and it integrates files' logical dependencies into the replica creation scheme to evaluate data files more accurately.
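    As a rough illustration of the placement side of DRCEM, the sketch below picks the site with the lowest replica placement cost. The cost model (transfer time plus a storage penalty) and the site fields are assumptions made for illustration; the abstract states only that placement cost is minimized.

```python
# Hypothetical DRCEM-style placement: choose the storage site that minimizes
# a replica placement cost. The cost model below is an assumed example.
def placement_cost(site, file_size_mb):
    transfer = file_size_mb / site["bandwidth_mbps"]          # transfer-time component
    storage_penalty = file_size_mb / max(site["free_mb"], 1)  # prefer sites with spare storage
    return transfer + storage_penalty

sites = [
    {"name": "SE-A", "bandwidth_mbps": 100,   "free_mb": 5_000},
    {"name": "SE-B", "bandwidth_mbps": 1_000, "free_mb": 500},
    {"name": "SE-C", "bandwidth_mbps": 500,   "free_mb": 20_000},
]

best = min(sites, key=lambda s: placement_cost(s, file_size_mb=2_000))
print("place replica on", best["name"])
```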

    Replica Creation Algorithm for Data Grids

    A data grid system is a data management infrastructure that facilitates reliable access and sharing of large amounts of data, storage resources, and data transfer services that can be scaled across distributed locations. This thesis presents a new replication algorithm that improves data access performance in data grids by distributing relevant data copies around the grid. The new Data Replica Creation Algorithm (DRCM) improves the performance of data grid systems by reducing job execution time and making the best use of data grid resources (network bandwidth and storage space). Current algorithms focus on the number of accesses in deciding which file to replicate and where to place it, which ignores resources' capabilities. DRCM differs by considering both user and resource perspectives, strategically placing replicas at locations that provide the lowest transfer cost. The proposed algorithm uses three strategies: a Replica Creation and Deletion Strategy (RCDS), a Replica Placement Strategy (RPS), and a Replica Replacement Strategy (RRS). DRCM was evaluated using network simulation (OptorSim) based on selected performance metrics (mean job execution time, effective network usage, average storage usage, and computing element usage), scenarios, and topologies. Results revealed better job execution times with lower resource consumption than existing approaches. This research contributes replication strategies embodied in one algorithm that enhances data grid performance and is capable of deciding to create or delete more than one file in the same decision. Furthermore, a dependency-level-between-files criterion was integrated with the exponential growth/decay model to give an accurate file evaluation.
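    The closing claim, that a dependency criterion is integrated with an exponential growth/decay model, can be sketched as a single value function. The exact formula below is an assumption for illustration; the thesis states only that the two factors are combined.

```python
import math

# Assumed DRCM-flavoured file evaluation: recent, frequently accessed files
# with strong logical dependencies score highest and are replicated first.
def file_value(accesses, hours_since_last_access, dependency_level, decay_rate=0.1):
    recency = math.exp(-decay_rate * hours_since_last_access)  # exponential decay term
    return accesses * recency * (1 + dependency_level)         # dependency boosts the value

print(file_value(accesses=40, hours_since_last_access=2,  dependency_level=0.5))
print(file_value(accesses=40, hours_since_last_access=48, dependency_level=0.5))
```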

    Replica maintenance strategy for data grid

    A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. The performance of such a system can be increased by improving overall resource usage, which includes network and storage resources. Network resource usage is improved through good utilization of network bandwidth, an important factor affecting job execution time, while storage resource usage is improved through good utilization of storage space. Data replication is one of the methods used to improve data access performance in distributed systems by placing multiple copies of data files at distributed sites. Once replicas have been distributed to various locations, they need to be monitored, and as a result of dynamic changes in the data grid environment, some replicas need to be relocated. In this paper we propose a maintenance replica placement strategy, termed the Unwanted Replica Deletion Strategy (URDS), as part of a replica maintenance service. The main purpose of the proposed strategy is to find unwanted replicas to delete. OptorSim is used to evaluate the performance of the proposed strategy. The simulation results show that URDS requires less execution time, consumes less network usage, and utilizes storage space better than existing approaches.
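    A minimal sketch of the idea behind a URDS-style maintenance sweep, assuming a simple access-count threshold: replicas that saw little recent use are marked unwanted, while the busiest copy of each file is always retained. The threshold and record layout are assumptions.

```python
# Mark under-used replicas as unwanted, never deleting a file's last busy copy.
replicas = [  # (file, site, accesses in the last monitoring window) - hypothetical data
    ("f1", "SE-A", 120), ("f1", "SE-B", 0),
    ("f2", "SE-A", 3),   ("f2", "SE-C", 0),
]

def unwanted(replicas, threshold=1):
    copies = {}
    for f, site, hits in replicas:
        copies.setdefault(f, []).append((site, hits))
    doomed = []
    for f, sites in copies.items():
        keep = max(sites, key=lambda s: s[1])  # always keep the busiest copy
        doomed += [(f, site) for site, hits in sites
                   if hits < threshold and (site, hits) != keep]
    return doomed

print(unwanted(replicas))  # [('f1', 'SE-B'), ('f2', 'SE-C')]
```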

    Replication in data grid: Determining important resources

    Replication is an important activity in determining the availability of resources in a data grid. Nevertheless, due to high computational and storage costs, keeping replicas of all existing resources may not be an efficient practice. Existing approaches to data replication have focused on utilizing information about the resource itself or about network capability to determine which resources to replicate. In this paper, we present the integration of three types of relationships for this purpose. The undertaken approach combines the viewpoints of the user, the file system, and the grid itself in identifying important resources that require replication. Experimental work was done via OptorSim and evaluation was based on job execution time. Results suggest that the proposed strategy produces a better outcome than existing approaches.
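    A minimal sketch of how the three viewpoints might be folded into one importance score; the per-relationship measures, their normalization to [0, 1], and the weights are all assumptions made for illustration.

```python
# Combine user, file-system, and grid viewpoints into one replication score.
def importance(user_demand, co_access, site_capacity,
               w_user=0.5, w_file=0.3, w_grid=0.2):
    """Weighted combination of the three relationship scores (each in [0, 1])."""
    return w_user * user_demand + w_file * co_access + w_grid * site_capacity

files = {"f1": (0.9, 0.2, 0.7), "f2": (0.4, 0.8, 0.3), "f3": (0.1, 0.1, 0.9)}
ranked = sorted(files, key=lambda f: importance(*files[f]), reverse=True)
print(ranked)  # files most worth replicating come first
```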

    Relationship based replication algorithm for data grid

    A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed systems. To increase resource availability and to ease resource sharing in such an environment, replication services are needed. This research proposes a replication algorithm, termed Relationship-based Replication (RBR), that integrates the user, grid, and system perspectives. In particular, RBR uses information from three different relationships in identifying files that require replication: file-to-user, file-to-file, and file-to-grid. Such an approach overcomes existing algorithms that are based on either user requests or resource capabilities in isolation. The Relationship-based Replication algorithm aims to improve Data Grid performance by reducing job execution time, bandwidth usage, and storage usage. RBR was realized using a network simulator (OptorSim), and experimental results revealed that it offers better performance than existing replication algorithms.
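    Building on the same scores, a hypothetical RBR-style trigger could replicate a file once its combined relationship score crosses a threshold, rather than reacting to raw request counts alone; the equal weighting and the threshold value below are assumptions.

```python
# Threshold-triggered replication from the three RBR relationships (assumed form).
def rbr_score(file_to_user, file_to_file, file_to_grid):
    # each relationship score is assumed already normalized to [0, 1]
    return (file_to_user + file_to_file + file_to_grid) / 3

def should_replicate(scores, threshold=0.6):
    return rbr_score(*scores) >= threshold

print(should_replicate((0.9, 0.7, 0.5)))  # True: strong user and file ties
print(should_replicate((0.2, 0.3, 0.9)))  # False: grid capacity alone is not enough
```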

    An Effective Weighted Data Replication Strategy for Data Grid

    A Data Grid is a good solution to large-scale data management problems, including efficient file transfer and replication. Dynamic data replication in a Data Grid aims to improve data access time and to utilize network and storage resources efficiently. Since data files are very large and grid storage is limited, managing replicas in storage for more effective utilization requires attention. In this paper, a dynamic data replication strategy called Modified Latest Access Largest Weight (MLALW) is proposed. This strategy is an enhanced version of the Latest Access Largest Weight strategy. MLALW deletes files by considering three important factors: least frequently used replicas, least recently used replicas, and the size of the replica. MLALW stores each replica at an appropriate site, i.e., the site in the region expected to have the highest number of future accesses for that particular replica. The algorithm is simulated using a Data Grid simulator, OptorSim, developed by the European DataGrid project. The experimental results show that the MLALW strategy gives better performance than the other algorithms and prevents unnecessary creation of replicas, which leads to efficient storage usage.
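    The three stated deletion factors suggest an eviction weight along the following lines. The exact combination is an assumption (the paper's formula is not reproduced in the abstract), but the principle that the smallest weight is evicted first carries over.

```python
import time

# Assumed MLALW-style eviction weight mixing frequency (LFU), recency (LRU),
# and replica size: small, cold, rarely used replicas get the lowest weight.
def eviction_weight(frequency, last_access_ts, size_mb, now=None):
    now = now or time.time()
    recency = 1.0 / (1.0 + (now - last_access_ts) / 3600)  # decays per idle hour
    return frequency * recency / size_mb

now = time.time()
weights = {
    "f1": eviction_weight(50, now - 600,    size_mb=100, now=now),
    "f2": eviction_weight(2,  now - 86_400, size_mb=900, now=now),
}
print(min(weights, key=weights.get))  # 'f2' is evicted first
```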

    Grid Federation: Number of Jobs and File Size Effects on Jobs Time

    Grid federation is fast emerging as an alternative solution to the problems posed by the large data handling and computational needs of numerous existing worldwide scientific projects. Efficient access to such extensively distributed data sets has become a fundamental challenge in grid computing. Creating and placing replicas at suitable sites using data replication mechanisms can increase the system's performance. Data replication reduces data access time, ensures load balancing, and narrows bandwidth consumption. In this paper, an enhanced data replication mechanism called EDR is proposed. EDR applies the principle of exponential growth/decay to both file size and file access history, based on the Latest Access Largest Weight (LALW) mechanism. The mechanism selects a popular file and determines an appropriate number of replicas as well as suitable grid sites for replication. It establishes the popularity of each file by associating a different weight with each historical data access record: a recent record has a larger weight, signifying that it is more relevant to the current state of data access. The proposed EDR mechanism was simulated by varying the number of jobs as well as file sizes, using file size and job completion time as the variable metrics. The OptorSim simulator was used to evaluate the proposed mechanism alongside the existing Least Recently Used (LRU) and Least Frequently Used (LFU) mechanisms. The simulation results showed that job completion time increases with growth in both file size and number of jobs. EDR shows improved performance in mean job completion time compared to the LRU and LFU mechanisms.
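    The LALW-style popularity that EDR builds on can be sketched with half-life interval weights: each older access interval counts half as much as the next. Folding file size into the score, as below, is an assumption made to reflect EDR's stated use of exponential growth/decay on file size.

```python
# LALW-flavoured popularity: later access intervals get larger (doubled) weights.
def popularity(access_counts_per_interval, size_gb):
    """access_counts_per_interval lists counts oldest interval first."""
    n = len(access_counts_per_interval)
    weighted = sum(count * 2.0 ** (i - n + 1)   # latest access, largest weight
                   for i, count in enumerate(access_counts_per_interval))
    return weighted * size_gb                   # assumed size term for EDR

print(popularity([10, 5, 40], size_gb=2))  # recent burst dominates the score
print(popularity([40, 5, 10], size_gb=2))  # same total accesses, older ones count less
```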