12 research outputs found

    Incremento de prestaciones en el acceso en Grid de datos

    Get PDF
    Proceedings of the Sixteenth Jornadas de Paralelismo, held 13-16 September 2005 in Granada. The Grid computing model has evolved in recent years to provide a high-performance computing environment over wide-area networks. However, one of its biggest problems lies in applications that make intensive and massive use of data. Replication has been used as a solution to the problems of these applications. Classical replication, however, suffers from certain shortcomings in the new environment, such as poor adaptability and high latency. We therefore propose a new data replication and organisation algorithm that provides high-performance access in a Data Grid.

    Economy-based data replication broker

    Full text link
    Data replication is one of the key components in a data grid architecture as it enhances data access and reliability and minimises the cost of data transmission. In this paper, we address the problem of reducing the overheads of the replication mechanisms that drive the data management components of a data grid. We propose an approach that extends the resource broker with policies that factor in user quality of service as well as service costs when replicating and transferring data. A realistic model of the data grid was created to simulate and explore the performance of the proposed policy. The policy proved an effective means of improving the performance of grid network traffic, as indicated by the improvement in the speed and cost of transfers made by brokers.
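The broker policy described above weighs service cost against user quality of service when picking a replica. A minimal sketch of that idea follows; the site names, cost figures, QoS scores, and the linear weighting scheme are illustrative assumptions, not the paper's actual policy.

```python
def choose_replica(replicas, qos_weight=0.5):
    """Pick the replica with the best blend of low transfer cost and high QoS.

    Each replica is (site, transfer_cost, qos) with qos in [0, 1]; the cost is
    normalised against the most expensive candidate so both terms are
    comparable, and qos_weight trades cost against service quality.
    """
    max_cost = max(cost for _, cost, _ in replicas)

    def score(replica):
        _, cost, qos = replica
        # Lower score wins: cheap transfer and high QoS both pull it down.
        return (1 - qos_weight) * (cost / max_cost) + qos_weight * (1 - qos)

    return min(replicas, key=score)


replicas = [
    ("site-a", 10.0, 0.90),  # moderate cost, high QoS
    ("site-b", 4.0, 0.20),   # very cheap but poor QoS
    ("site-c", 25.0, 0.95),  # excellent QoS but expensive
]
best = choose_replica(replicas, qos_weight=0.5)  # balances both criteria
```

With equal weighting, the cheap-but-unreliable and expensive-but-excellent sites both lose to the balanced one; shifting `qos_weight` toward 0 or 1 recovers pure cost-based or pure QoS-based selection.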

    Replica Creation Algorithm for Data Grids

    Get PDF
    A data grid system is a data management infrastructure that facilitates reliable access to and sharing of large amounts of data, storage resources, and data transfer services that can be scaled across distributed locations. This thesis presents a new replication algorithm that improves data access performance in data grids by distributing relevant data copies around the grid. The new Data Replica Creation Algorithm (DRCM) improves the performance of data grid systems by reducing job execution time and making the best use of data grid resources (network bandwidth and storage space). Current algorithms focus on the number of accesses when deciding which files to replicate and where to place them, which ignores resources' capabilities. DRCM differs by considering both user and resource perspectives, strategically placing replicas at the locations that provide the lowest transfer cost. The proposed algorithm uses three strategies: a Replica Creation and Deletion Strategy (RCDS), a Replica Placement Strategy (RPS), and a Replica Replacement Strategy (RRS). DRCM was evaluated using network simulation (OptorSim) based on selected performance metrics (mean job execution time, effective network usage, average storage usage, and computing element usage), scenarios, and topologies. Results revealed better job execution times with lower resource consumption than existing approaches. This research contributes replication strategies embodied in one algorithm that enhances data grid performance and is capable of deciding to create or delete more than one file in a single decision. Furthermore, a dependency-level-between-files criterion was utilized and integrated with an exponential growth/decay model to give an accurate file evaluation.
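The core placement idea in the abstract, putting a replica where the transfer cost to its requesters is lowest, can be sketched as a small cost minimisation. The candidate sites, requester sites, and cost matrix below are made-up examples, not data from the thesis.

```python
def place_replica(candidates, requesters, cost):
    """Return the candidate site minimising total transfer cost to requesters.

    cost[(site, requester)] is the assumed cost of shipping the file from
    `site` to `requester`; placement picks the site with the lowest sum.
    """
    def total_cost(site):
        return sum(cost[(site, r)] for r in requesters)

    return min(candidates, key=total_cost)


# Hypothetical 3-site grid with two sites requesting the file.
cost = {
    ("s1", "r1"): 5, ("s1", "r2"): 9,
    ("s2", "r1"): 3, ("s2", "r2"): 4,
    ("s3", "r1"): 8, ("s3", "r2"): 2,
}
best_site = place_replica(["s1", "s2", "s3"], ["r1", "r2"], cost)
```

In this toy instance `s2` wins (total cost 7 versus 14 and 10), illustrating how a resource-aware placement can differ from simply replicating at the most-accessed site.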

    Binary vote assignment on grid quorum replication technique with association rule

    Get PDF
    One of the biggest challenges that data grid users face today relates to the improvement of data management. Organizations need to provide current data to users who may be geographically remote and to handle a volume of requests for data distributed around multiple sites in a distributed environment. Therefore, storage, availability, and consistency are important issues to be addressed to allow efficient and safe data access from many different sites. One way to effectively cope with these challenges is to rely on the replication technique. Replication is a useful technique for distributed database systems. Through this technique, data can be accessed from multiple locations. Thus, replication increases data availability and accessibility to users. When one site fails, users can still access the same data at another site. Techniques such as Read-One-Write-All (ROWA), the Hierarchical Replication Scheme (HRS) and the Branch Replication Scheme (BRS) are popular techniques for replication and data management. However, these techniques have their weaknesses in terms of communication cost, that is, the total number of replication servers needed to replicate the data. Furthermore, these techniques do not consider the correlation between data during the fragmentation process. Knowledge about data correlation can be extracted from historical data using techniques from the data mining field. Without proper strategies, replication increases job execution time. In this research, a some-data-to-some-sites scheme called Binary Vote Assignment on Grid Quorum with Association Rule (BVAGQ-AR) is proposed to manage replication for meaningfully fragmented data in a distributed database environment with low communication cost and processing time for a transaction. The main feature of BVAGQ-AR is that the technique integrates replication and data mining techniques, allowing meaningful extraction of knowledge from large data sets.
The BVAGQ-AR technique comprises the following steps. The first step mines the data using the Apriori algorithm for association rules; it is used to discover the correlations between data. In the second step, the database is fragmented based on the results of the data mining analysis. This step is executed to make sure data replication can be done effectively while saving cost. Then, the databases resulting from the fragmentation process are allocated to their assigned sites. Finally, after the allocation process, each site has a database file and is ready for any transaction and replication process. The results of the experiments show that BVAGQ-AR can preserve data consistency with the lowest communication cost and processing time for a transaction compared to BCSA, PRA, ROWA, HRS and BRS.
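The first mining step can be sketched as a single Apriori-style pass that counts co-accessed item pairs and keeps the frequent ones. The access history, file names, and support threshold below are invented for illustration and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Count co-accessed item pairs and keep those meeting min_support.

    This is only the pair-generation level of Apriori; a full run would
    iterate to larger itemsets, pruning candidates whose subsets are rare.
    """
    counts = Counter()
    for t in transactions:
        # Sort so each unordered pair is counted under one canonical key.
        for pair in combinations(sorted(set(t)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical per-transaction access history over three data files.
history = [
    {"f1", "f2", "f3"},
    {"f1", "f2"},
    {"f2", "f3"},
    {"f1", "f2"},
]
correlated = frequent_pairs(history, min_support=3)
```

Files that surface together here (`f1` and `f2` in this toy history) would be kept in the same fragment, so a replica carries the data a transaction is likely to need as a unit.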

    A consistency service for replicated heterogeneous databases in a Grid environment

    Get PDF
    The present work, written during a period of scientific collaboration with INFN (National Institute of Nuclear Physics), proposes a mechanism for maintaining consistency among replicated heterogeneous databases. This mechanism is integrated in a Replica Consistency Service for Data Grid environments. Nowadays, an increasing number of applications manage an enormous quantity of data distributed on a very large scale. In a Data Grid the data files are replicated to different sites. The replication of data among multiple databases is the only way to ensure that the data are available where and when they are needed. The catalogs used for maintaining information about the state of file replicas in a Data Grid use replication to improve availability, to reduce data access latency, and to improve the performance of distributed applications. Moreover, replication is useful for fault tolerance. A Data Grid may serve different user communities (Virtual Organizations) with different requirements, so different types of data may be stored in different formats and in storage devices from different vendors. A mechanism is therefore needed to maintain consistency among the replicas of a catalog that uses heterogeneous back-end databases. In a Data Grid, heterogeneous data storage raises the problem of a uniform interface for accessing heterogeneous information, which requires tools for data integration. Our goal in this work is to provide the ability to manage and execute replica synchronization; in particular, we have one replica on an Oracle server and another on a MySQL server with the same logical schema. By "synchronization" we mean that the target database is updated: only the changes that happen on the source database are propagated to the target database.
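The one-way synchronization described above, where only source-side changes are propagated to the target, can be sketched with plain dictionaries standing in for the two database replicas. The tables, keys, and change log are invented, and real Oracle/MySQL replicas would of course be driven through their own interfaces.

```python
def synchronise(source, target, change_log):
    """Apply source-side inserts/updates/deletes, in order, to the target.

    change_log is a sequence of (operation, key) pairs recorded on the
    source replica; the target is updated, never the source.
    """
    for op, key in change_log:
        if op == "delete":
            target.pop(key, None)
        else:  # "insert" or "update": copy the current source value across
            target[key] = source[key]
    return target


# Hypothetical replicas sharing one logical schema (key -> value).
source = {"a": 1, "b": 2, "c": 30}
target = {"a": 1, "b": 99, "d": 4}
log = [("update", "b"), ("insert", "c"), ("delete", "d")]
synchronise(source, target, log)  # target now mirrors the source
```

Replaying the ordered change log, rather than diffing full tables, is what lets such a service stay efficient when the replicas are large and only a few rows changed.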

    Role-Based Access Control for the Open Grid Services Architecture - Data Access and Integration (OGSA-DAI)

    Get PDF
    Grid has emerged recently as an integration infrastructure for the sharing and coordinated use of diverse resources in dynamic, distributed virtual organizations (VOs). A Data Grid is an architecture for the access, exchange, and sharing of data in the Grid environment. In this dissertation, role-based access control (RBAC) systems for heterogeneous data resources in Data Grid systems are proposed. The Open Grid Services Architecture - Data Access and Integration (OGSA-DAI) is a widely used framework for the integration of heterogeneous data resources in Grid systems. However, in the OGSA-DAI system, access control causes substantial administration overhead for resource providers in VOs because each of them has to manage the authorization information for individual Grid users. Its identity-based access control mechanisms are severely inefficient and too complicated to manage because the direct mapping between users and privileges is transitory. To solve this problem, (1) the Community Authorization Service (CAS), provided by the Globus Toolkit, and (2) Shibboleth, an attribute authorization service, are used to support RBAC in the OGSA-DAI system. The Globus Toolkit is widely used software for building Grid systems. Access control policies need to be specified and managed across multiple VOs. For this purpose, the Core and Hierarchical RBAC profile of the eXtensible Access Control Markup Language (XACML) is used; and for distributed administration of those policies, the Object, Metadata and Artifacts Registry (OMAR) is used. OMAR is based on the e-business eXtensible Markup Language (ebXML) registry specifications, developed to achieve interoperable registries and repositories. The RBAC systems allow quick and easy deployment, privacy protection, and the centralized and distributed management of privileges. They support scalable, interoperable and fine-grained access control services; dynamic delegation of rights; and user-role assignments.
They also reduce the administration overheads for resource providers because they need to maintain only the mapping information from VO roles to local database roles. Resource providers maintain the ultimate authority over their resources. Moreover, unnecessary mapping and connections can be avoided by denying invalid requests at the VO level. Performance analysis shows that our RBAC systems add only a small overhead to the existing security infrastructure of OGSA-DAI.
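The key administrative simplification, resource providers keeping only a VO-role-to-local-database-role table and rejecting requests whose VO role is unknown, can be sketched as follows. The role names and mapping table are hypothetical examples, not taken from the dissertation or from OGSA-DAI.

```python
# Hypothetical provider-side table: the only authorization state the
# resource provider has to maintain under the RBAC scheme.
VO_TO_DB_ROLE = {
    "vo:analyst": "db_reader",
    "vo:curator": "db_writer",
}


def resolve_role(vo_role):
    """Map a VO role to a local database role, refusing unknown roles.

    Denying unmapped roles here models rejection "at the VO level":
    no local database connection is ever opened for an invalid request.
    """
    try:
        return VO_TO_DB_ROLE[vo_role]
    except KeyError:
        raise PermissionError(f"no mapping for {vo_role!r}")
```

Because users are assigned to VO roles elsewhere (by the attribute authority), adding or removing an individual user never touches this provider-side table, which is where the administration saving comes from.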

    Simulatore per un servizio di consistenza su architetture Grid

    Get PDF
    Integration of CONStanza and OptorSim in order to obtain a simulator for the consistency service for data replication.

    Integrazione della Grid Security Infrastructure in un servizio di consistenza

    Get PDF
    This thesis, written during a period of scientific collaboration with the Pisa section of INFN (Istituto Nazionale di Fisica Nucleare), aims to integrate the GSI (Grid Security Infrastructure) into a Consistency Service for Grid environments. Today, an ever-growing number of applications must process and manage enormous quantities of heterogeneous data distributed on a worldwide scale. The Grid architecture is a valid solution for the needs of these applications. Managing large quantities of data requires a replication mechanism to reduce data access times and increase application performance. It is therefore necessary to have a service that guarantees the consistency of the data replicas. The consistency service on which this work was carried out is the Replica Consistency Service CONStanza, developed within the Grid.it project to become part of the INFN Production Grid. CONStanza is accessible as a Web Service. For applications based on Web Services, communication takes place over the SOAP protocol, so security aspects must be handled at the level of this protocol. The thesis therefore describes how the communication infrastructure of the service is built and how this infrastructure is secured following the GSI guidelines.

    GREEDY SINGLE USER AND FAIR MULTIPLE USERS REPLICA SELECTION DECISION IN DATA GRID

    Get PDF
    Replication in data grids increases data availability, accessibility and reliability. Replicas of datasets are usually distributed across different sites, and the choice of replica locations has a significant impact. Replica selection algorithms decide the best replica locations based on some criteria. To this end, a family of efficient replica selection systems (RsDGrid) has been proposed. The problem addressed in this thesis is how to select the best replica location so as to achieve lower access time, higher QoS, consistency with users' preferences, and near-equal user satisfaction. RsDGrid consists of three systems: the A-system, the D-system, and the M-system. Each of them has its own scope and specifications. RsDGrid switches among these systems according to the decision maker.
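The contrast in the title, a greedy choice for a single user versus a fair choice across multiple users, can be sketched with two small selection routines. The replica latencies and capacities below are invented numbers, and this is only an illustration of the general idea, not RsDGrid's actual algorithms.

```python
def greedy_select(latencies):
    """Single user: simply take the replica with the lowest latency."""
    return min(latencies, key=latencies.get)


def fair_select(users, latencies, capacity):
    """Many users: fill replicas by latency while respecting capacity,
    so later users are spread out instead of all piling onto one replica."""
    load = {r: 0 for r in latencies}
    assignment = {}
    for user in users:
        # Cheapest replica that still has room for another user.
        replica = min((r for r in latencies if load[r] < capacity[r]),
                      key=latencies.get)
        assignment[user] = replica
        load[replica] += 1
    return assignment


lat = {"ra": 5, "rb": 12}   # hypothetical access latencies
cap = {"ra": 2, "rb": 3}    # how many users each replica serves well
greedy_select(lat)                        # a lone user always picks "ra"
fair_select(["u1", "u2", "u3"], lat, cap) # the third user is diverted to "rb"
```

With every user acting greedily, all three would contend for `ra`; the capacity-aware variant sacrifices one user's first choice to keep overall satisfaction nearly equal, which is the trade-off the thesis title points at.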