    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Report

    Analyse the Performance of Mobile Peer to Peer Network using Ant Colony Optimization

    A mobile peer-to-peer computer network is one in which each computer can act as a client or a server for the other computers in the network. Communication among the nodes in a mobile peer-to-peer network requires a large number of messages. To reduce this message overhead, the paper proposes an interconnection structure called the Distributed Spanning Tree (DST), which improves the efficiency of the mobile peer-to-peer network. The proposed method improves data availability and consistency across the entire network, and also reduces data latency and the number of messages required by any specific application in the network. To further enhance the effectiveness of the proposed system, the DST network is optimized with the Ant Colony Optimization method, which gives the optimal solution for the DST method and increases the availability, consistency and scalability of the network. The simulation results show that the approach reduces the number of messages sent for any specific application and the average delay, and increases the packet delivery ratio in the network.

    A Demand Based Load Balanced Service Replication Model

    Cloud computing allows service users and providers to access applications, logical resources and files on any computer with ease. A cloud service has three distinct characteristics that differentiate it from traditional hosting. It is sold on demand, typically by the minute or the hour; it is elastic. It is a way to increase capacity or add capabilities on the fly without investing in new infrastructure, training new personnel, or licensing new software. It not only promises reliable services delivered through next-generation data centers built on compute and storage virtualization technologies, but also addresses key issues such as scalability, reliability, fault tolerance and file load balancing. One way to achieve this is through service replication across different machines, coupled with load balancing. Although replication potentially improves fault tolerance, it introduces the problem of keeping replicas consistent when a service is updated or modified. However, fewer replicas also decrease concurrency and the level of service availability. A balance between the replication mechanism and consistency not only ensures a highly reliable and fault-tolerant system but also improves system performance significantly. This paper presents a load-balancing-based service replication model that creates a replica on other servers on the basis of the number of service accesses. The simulation results indicate that the proposed model reduces the number of messages exchanged for service replication by 25-55%, thus improving overall system performance significantly. Also, in the case of CPU-load-based file replication, file access time is observed to fall by 5.56%-7.65%.
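The abstract describes creating replicas on other servers according to how often a service is accessed. A minimal Python sketch of that idea follows. The class name `ReplicationManager`, the per-replica `threshold`, and the least-loaded placement rule are all assumptions made for illustration; the abstract does not specify the paper's actual model.

```python
from collections import Counter


class ReplicationManager:
    """Hypothetical demand-based replica manager (illustrative names only)."""

    def __init__(self, servers, threshold):
        self.threshold = threshold            # assumed: accesses tolerated per replica
        self.load = {s: 0 for s in servers}   # requests served by each server
        self.replicas = {}                    # service -> servers hosting it
        self.accesses = Counter()             # service -> total access count

    def deploy(self, service, server):
        """Place the first copy of a service on one server."""
        self.replicas[service] = [server]

    def access(self, service):
        """Serve one request and replicate the service if demand is high."""
        hosts = self.replicas[service]
        # Load balancing: route the request to the least-loaded host.
        target = min(hosts, key=lambda s: self.load[s])
        self.load[target] += 1
        self.accesses[service] += 1
        # Demand check: add a replica once accesses per replica exceed the threshold.
        if self.accesses[service] > self.threshold * len(hosts):
            candidates = [s for s in self.load if s not in hosts]
            if candidates:
                hosts.append(min(candidates, key=lambda s: self.load[s]))
        return target


manager = ReplicationManager(["a", "b", "c"], threshold=3)
manager.deploy("svc", "a")
for _ in range(4):
    manager.access("svc")
print(manager.replicas["svc"])
```

In this sketch the fourth access pushes the count past the threshold, so a second replica is placed on an idle server and later requests are routed to it.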

    Maintaining Replica Consistency Over Large-Scale Data Grid Using Update Propagation Technique

    A Data Grid is an organized collection of nodes in a wide area network that contribute computation, data storage and application resources. In a Data Grid, large numbers of users are distributed across a wide-area environment that is dynamic and heterogeneous. Data management is one of the current issues, where data transparency, consistency, fault tolerance, automatic management and performance are the user parameters in a grid environment. Data management techniques must scale up while addressing the autonomy, dynamicity and heterogeneity of the data resources. Data replication is a well-known technique used to reduce access latency and to improve availability and performance in a distributed computing environment. Replication introduces the problem of maintaining consistency among the replicas when files are allowed to be updated. The update information should be propagated to all replicas to guarantee correct reads of the remote replicas. Asynchronous replication is a commonly agreed solution to the replica consistency problem. A few studies have addressed replica consistency in Data Grids. However, the techniques they introduce are neither efficient nor scalable. They cannot be used in a real Data Grid because they do not address the large number of replica sites, large-scale distribution, load balancing, or site autonomy (the ability of a grid site to join and leave the grid community at any time). This thesis proposes a new asynchronous replication protocol called Update Propagation Grid (UPG) to maintain replica consistency over a large-scale data grid. In UPG, updates reach all online secondary replicas through a propagation technique based on nodes organized into a logical network in the form of a two-dimensional grid. The proposed update propagation technique is a hybrid push-pull, dynamic technique that addresses the issues of site autonomy, efficiency, scalability, load balancing and fairness.
Two performance analysis studies were conducted to compare the proposed technique with other techniques. The first study involves mathematical and simulation analysis. The second is based on a Queuing Network Model. The results show that the proposed technique scales well with a high number of replica sites and with high request loads. They also show a reduction in the average update reach time of 5% to 97%. Moreover, the results show that the proposed technique achieves load balancing while providing update propagation fairness.
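The abstract describes hybrid push-pull update propagation over replicas arranged in a logical two-dimensional grid. The Python sketch below illustrates that general idea only. The forwarding rule (the primary pushes to one online "row head" per row, which forwards along its row) and the pull-on-rejoin rule are assumptions for illustration. They are not the UPG protocol itself, whose details the abstract does not give.

```python
class Replica:
    """A secondary replica site; `online` models site autonomy (join/leave)."""

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.online = True


def build_grid(names, cols):
    """Arrange replicas row by row into a logical grid with `cols` columns."""
    nodes = [Replica(n) for n in names]
    return [nodes[i:i + cols] for i in range(0, len(nodes), cols)]


def push_update(grid, version):
    """Push phase (assumed rule): primary -> row heads -> rest of each row.

    Returns (messages sent by the primary, messages sent by row heads).
    """
    primary_msgs = head_msgs = 0
    for row in grid:
        online = [n for n in row if n.online]
        if not online:
            continue
        head, rest = online[0], online[1:]
        head.version = version        # primary pushes to the row head
        primary_msgs += 1
        for node in rest:             # row head forwards along its row
            node.version = version
            head_msgs += 1
    return primary_msgs, head_msgs


def pull_on_rejoin(grid, node):
    """Pull phase (assumed rule): a rejoining replica fetches the newest
    version held by the online replicas in its own row."""
    node.online = True
    for row in grid:
        if node in row:
            node.version = max(n.version for n in row if n.online)
            return node.version
    return None


grid = build_grid(list("abcdef"), cols=3)
grid[1][2].online = False             # site "f" has left the grid
print(push_update(grid, version=1))
print(pull_on_rejoin(grid, grid[1][2]))
```

Under these assumptions the primary sends one message per row rather than one per replica, which shows how a grid layout can spread the propagation load across sites.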

    An enhanced dynamic replica creation and eviction mechanism in data grid federation environment

    A Data Grid Federation system is an infrastructure that connects several grid systems and facilitates the sharing of large amounts of data, as well as storage and computing resources. Existing data replication mechanisms compute file values from the number of file accesses when deciding which file to replicate, and place new replicas at locations that provide minimum read cost. This thesis presents an enhanced data replication strategy known as the Dynamic Replica Creation and Eviction Mechanism (DRCEM), which makes better use of data grid resources by allocating appropriate replica sites around the federation. DRCEM computes file values from logical dependencies when deciding which file to replicate, and allocates new replicas at locations that provide minimum replica placement cost. The proposed mechanism uses three schemes: 1) Dynamic Replica Evaluation and Creation Scheme, 2) Replica Placement Scheme, and 3) Dynamic Replica Eviction Scheme. DRCEM was evaluated using the OptorSim network simulator on four performance metrics: 1) Job Completion Times, 2) Effective Network Usage, 3) Storage Element Usage, and 4) Computing Element Usage. DRCEM outperforms the ELALW and DRCM mechanisms by 30% and 26%, respectively, in terms of Job Completion Times. In addition, DRCEM consumes 42% and 40% less storage than ELALW and DRCM, respectively. However, DRCEM performs worse than the existing mechanisms on Computing Element Usage, because of the additional computation of files' logical dependencies. Overall, the results show better job completion times with lower resource consumption than existing approaches. This research produces three replication schemes embodied in one mechanism that enhances the performance of the Data Grid Federation environment. It extends the existing mechanism so that it can decide to create or evict more than one file at a particular time.
Furthermore, files' logical dependencies were integrated into the replica creation scheme to evaluate data files more accurately.