22 research outputs found

    Testing the dependability and performance of group communication based database replication protocols

Database replication based on group communication systems has recently been proposed as an efficient and resilient solution for large-scale data management. However, its evaluation has been conducted either on simplistic simulation models, which fail to assess concrete implementations, or on complete system implementations, which are costly to test in realistic large-scale scenarios. This paper presents a tool that combines implementations of the replication and communication protocols under study with simulated network, database engine, and traffic generator models. Replication components can therefore be subjected to realistic large-scale loads in a variety of scenarios, including fault injection, while at the same time providing global observation and control. The paper first shows how the model is configured and validated to closely reproduce the behavior of a real system, and then how it is applied, allowing us to derive interesting conclusions both on replication and communication protocols and on their implementations.
    Fundação para a Ciência e a Tecnologia (FCT) - Project STRONGREP (POSI/CHS/41285/2001)
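    The combination the abstract describes, real protocol code running over simulated surroundings, can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the actual tool; every class and parameter name below is hypothetical:

```python
import heapq
import random

class EventLoop:
    """Minimal discrete-event simulator driving the whole experiment."""
    def __init__(self):
        self.now, self._seq, self._queue = 0.0, 0, []
    def schedule(self, delay, action):
        self._seq += 1  # tie-breaker so heapq never compares callables
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
    def run(self):
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()

class SimulatedNetwork:
    """Stands in for the real transport: messages reach every replica
    after a configurable random latency."""
    def __init__(self, loop, latency=0.005, jitter=0.002):
        self.loop, self.latency, self.jitter = loop, latency, jitter
        self.replicas = []
    def broadcast(self, sender, msg):
        for r in self.replicas:
            delay = self.latency + random.uniform(0, self.jitter)
            self.loop.schedule(delay, lambda r=r: r.on_deliver(sender, msg))

class Replica:
    """The replication logic under test runs unmodified; only its
    environment (network, database engine) is simulated."""
    def __init__(self, name, net):
        self.name, self.net, self.db = name, net, {}
        net.replicas.append(self)
    def execute(self, key, value):          # driven by a traffic generator
        self.net.broadcast(self.name, (key, value))
    def on_deliver(self, sender, msg):
        key, value = msg
        self.db[key] = value                # simulated storage access

loop = EventLoop()
net = SimulatedNetwork(loop)
replicas = [Replica(f"r{i}", net) for i in range(3)]
replicas[0].execute("x", 1)
loop.run()
print({r.name: r.db for r in replicas})    # all replicas converge on x=1
```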

    Middleware-based Database Replication: The Gaps between Theory and Practice

The need for high availability and performance in data management systems has been fueling a long-running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other.
    Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 2008

    Evaluating certification protocols in the partial database state machine

Partial replication is an alluring technique to ensure the reliability of very large and geographically distributed databases while, at the same time, offering good performance. By correctly exploiting access locality, most transactions become confined to a small subset of the database replicas, thus reducing the processing, storage access, and communication overhead associated with replication. The advantages of partial replication have, however, to be weighed against the added complexity that is required to manage it. In fact, if the chosen replica configuration prevents the local execution of transactions, or if the overhead of consistency protocols offsets the savings of locality, the potential gains cannot be realized. These issues are heavily dependent on the application used for evaluation and render simplistic benchmarks useless. In this paper, we present a detailed analysis of Partial Database State Machine (PDBSM) replication by comparing alternative partial replication protocols with full replication. This is done using a realistic scenario based on a detailed network simulator and access patterns from an industry-standard database benchmark. The results obtained allow us to identify the best configuration for typical on-line transaction processing applications.
    European Union - GORDA Project (FP6-IST/004758)
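    For readers unfamiliar with certification-based replication, the following sketch shows the core conflict check such protocols apply at commit time: a transaction aborts if any transaction that committed after its snapshot wrote an item it read. This is a simplification under assumed names, not the PDBSM protocols evaluated in the paper; in partial replication each replica would certify only the items it holds:

```python
class Certifier:
    """Certification for a state-machine-replicated database: every
    replica applies the same deterministic check to the same totally
    ordered stream of (readset, writeset) pairs."""
    def __init__(self):
        self.committed = []   # list of (commit_version, writeset)
        self.version = 0
    def certify(self, snapshot_version, readset, writeset):
        for v, ws in self.committed:
            if v > snapshot_version and ws & readset:
                return False  # read-write conflict: abort everywhere
        self.version += 1
        self.committed.append((self.version, frozenset(writeset)))
        return True           # commit everywhere

c = Certifier()
assert c.certify(0, {"a"}, {"a"})       # commits as version 1
assert not c.certify(0, {"a"}, {"b"})   # read a stale "a": aborted
assert c.certify(1, {"a"}, {"b"})       # read the committed "a": commits
```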

    AKARA: A flexible clustering protocol for demanding transactional workloads

Shared-nothing clusters are a well-known and cost-effective approach to database server scalability, in particular with the highly intensive read-only workloads typical of many 3-tier web-based applications. The common reliance on a centralized component and the simplistic propagation strategy employed by mainstream solutions, however, lead to poor scalability with traditional on-line transaction processing (OLTP), where the update ratio is high. Such approaches also introduce a single point of failure, posing an additional obstacle to high availability. More recently, database replication protocols based on group communication have been shown to overcome such limitations, expanding the applicability of shared-nothing clusters to more demanding transactional workloads. These take simultaneous advantage of total-order multicast and transactional semantics to improve on mainstream solutions. However, none has yet been widely deployed in a general-purpose database management system. In this paper, we argue that a major hurdle to their acceptance is that these proposals show disappointing performance with specific subsets of real-world workloads. Such limitations are deep-rooted, and working around them requires in-depth understanding of the protocols and changes to applications. We address this issue with a novel protocol that combines multiple transaction execution mechanisms and replication techniques, and we then show how it avoids the identified pitfalls. Experimental results are obtained with a workload based on the industry-standard TPC-C benchmark.
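    The key idea, combining several execution mechanisms in one protocol, might look roughly as follows. This is a hypothetical sketch, not AKARA's actual routing logic; the threshold heuristic and all names are assumptions:

```python
from collections import deque

class HybridReplicator:
    """Route each transaction to one of two replication mechanisms:
    conflict-prone ones are executed serially in total order, the rest
    run concurrently and are certified afterwards."""
    def __init__(self, conflict_threshold=0.1):
        self.threshold = conflict_threshold
        self.total_order = deque()   # conservative path: ordered execution
        self.certifying = []         # optimistic path: execute, then certify
    def submit(self, txn, observed_abort_rate):
        if observed_abort_rate > self.threshold:
            self.total_order.append(txn)   # serialize hot-spot transactions
        else:
            self.certifying.append(txn)    # run concurrently, certify later

r = HybridReplicator()
r.submit("new_order", observed_abort_rate=0.02)  # optimistic path
r.submit("delivery", observed_abort_rate=0.40)   # conservative path
print(len(r.total_order), len(r.certifying))     # 1 1
```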

    Two-tier replication based on Eager Group – Lazy Master model

A two-tier data replication scheme based on the Eager Group and Lazy Master models is presented. The lower (asynchronous) tier serves mobile clients and implements the Lazy Master model; the upper (synchronous) tier reconciles the servers holding the database replicas and implements the Eager Group model. Initial transactions allow remote nodes to read and update the database; an algorithm that optimises these initial transactions and commits them in the form of base transactions, realised by the initial-transaction manager of the master node, is described.
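    A minimal sketch of the two tiers in Python may help fix the idea. The names and the dictionary-based "databases" are illustrative assumptions, not the paper's design:

```python
class MasterTier:
    """Upper, synchronous tier: an update reaches every master replica
    before the call returns (Eager Group, sketched as a plain loop)."""
    def __init__(self, n=3):
        self.replicas = [{} for _ in range(n)]
    def eager_commit(self, key, value):
        for db in self.replicas:        # all masters updated together
            db[key] = value

class MobileClient:
    """Lower, asynchronous tier: tentative updates accumulate locally and
    are reconciled with the master tier later (Lazy Master)."""
    def __init__(self, master_tier):
        self.master_tier, self.tentative = master_tier, []
    def update(self, key, value):
        self.tentative.append((key, value))   # works while disconnected
    def reconnect(self):
        for key, value in self.tentative:     # replayed as base transactions
            self.master_tier.eager_commit(key, value)
        self.tentative.clear()

tier = MasterTier()
client = MobileClient(tier)
client.update("balance", 100)   # tentative, client-side only
client.reconnect()
print(tier.replicas)            # all three masters hold balance=100
```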

    Processing Transactions over Optimistic Atomic Broadcast Protocols

Atomic broadcast primitives allow fault-tolerant cooperation between sites in a distributed system. Unfortunately, the delay incurred before a message can be delivered makes it difficult to implement high-performance, scalable applications on top of atomic broadcast primitives. Recently, a new approach has been proposed which, based on optimistic assumptions about the communication system, reduces the average delay for message delivery. In this paper, we develop this idea further and present a replicated database architecture that employs the new atomic broadcast primitive in such a way that the coordination phase of the atomic broadcast is fully overlapped with the execution of the transactions.
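    The overlap the authors exploit can be sketched as follows: process each message speculatively when it first arrives, then keep or redo the work once the definitive total order is known. All names are hypothetical and the re-execution rule is deliberately crude:

```python
class OptimisticDelivery:
    """Speculatively execute on optimistic (spontaneous) delivery;
    confirm the work only when the final total order matches."""
    def __init__(self):
        self.opt_order, self.processed = [], {}
    def on_receive(self, msg):
        self.opt_order.append(msg)
        self.processed[msg] = f"result({msg})"   # speculative execution
    def on_final_order(self, final):
        results = []
        for opt, fin in zip(self.opt_order, final):
            if opt == fin:
                results.append(self.processed[fin])  # guess was right: keep
            else:
                results.append(f"result({fin})")     # re-execute in order
        return results

d = OptimisticDelivery()
for m in ("t1", "t3", "t2"):   # order in which messages happen to arrive
    d.on_receive(m)
print(d.on_final_order(["t1", "t2", "t3"]))  # t1 kept; t2, t3 redone
```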

    Distributed transaction processing in the Escada protocol

Database replication is an invaluable technique to implement fault-tolerant databases, and it is also frequently used to improve database performance. Unfortunately, when strong consistency among the replicas and the ability to update the database at any replica are required, existing replication protocols do not scale. The problem is related to the number of interactions among the replicas needed to guarantee consistency, and to the termination protocols used to ensure that all replicas agree on each transaction's outcome. Roughly, the number of aborts, deadlocks, and messages exchanged among the replicas grows drastically as the number of replicas increases; related work has shown database replication to be impractical in such a scenario. Several studies have been developed to overcome these problems. Initially, most of them relaxed the strong consistency or the update-anywhere requirement to achieve feasible solutions. Recently, replication protocols based on group communication were proposed in which the strong consistency and update-anywhere requirements are preserved and these problems are circumvented. This is the context of the Escada project. Briefly, it aims to study, design, and implement transaction replication mechanisms suited to large-scale distributed systems. In particular, the project exploits partial replication techniques to provide strong consistency criteria without introducing significant synchronization and performance overheads. In this thesis, we augment Escada with a distributed query processing model and mechanism, an inevitable requirement in a partially replicated environment. Moreover, exploiting characteristics of its protocols, we propose a semantic cache to reduce the overhead generated while accessing remote replicas. We also improve the certification process, attempting to reduce aborts by using the semantic information available in the transactions. Finally, to evaluate the Escada protocols, the semantic cache, and the certification process, we use a simulation model that combines simulated and real code, which allows us to evaluate our proposals under distinct scenarios and configurations. Furthermore, instead of using unrealistic workloads, we test our proposals with workloads based on the TPC-W and TPC-C benchmarks.
    Fundação para a Ciência e a Tecnologia - POSI/CHS/41285/2001
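    Of the mechanisms listed, the semantic cache lends itself to a compact sketch: remote query results are cached per predicate, and certified writesets invalidate the entries they may overlap. This is an assumed, simplified interface, not Escada's implementation; for simplicity the sketch invalidates per table rather than per predicate:

```python
class SemanticCache:
    """Results fetched from remote replicas are kept per (table, predicate)
    pair, so repeated queries over remotely held fragments stay local."""
    def __init__(self, fetch_remote):
        self.fetch_remote, self.cache = fetch_remote, {}
    def query(self, table, predicate):
        key = (table, predicate)
        if key not in self.cache:              # miss: ask a remote replica
            self.cache[key] = self.fetch_remote(table, predicate)
        return self.cache[key]
    def on_certified_writeset(self, table):
        # a committed remote update may overlap anything cached on the table
        self.cache = {k: v for k, v in self.cache.items() if k[0] != table}

calls = []
def fetch(table, predicate):
    calls.append((table, predicate))
    return [f"{table} rows where {predicate}"]

cache = SemanticCache(fetch)
cache.query("stock", "w_id = 1")
cache.query("stock", "w_id = 1")       # served locally, no remote access
cache.on_certified_writeset("stock")   # remote update certified on 'stock'
cache.query("stock", "w_id = 1")       # refetched after invalidation
print(len(calls))                      # 2 remote accesses instead of 3
```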

    Distributed Versioning: Consistent Replication for Scaling Back-end Databases of Dynamic Content Sites

Dynamic content Web sites consist of a front-end Web server, an application server, and a back-end database. In this paper we introduce distributed versioning, a new method for scaling the back-end database through replication. Distributed versioning provides both the consistency guarantees of eager replication and the scaling properties of lazy replication. It does so by combining a novel concurrency control method based on explicit versions with conflict-aware query scheduling that reduces the number of lock conflicts. We evaluate distributed versioning using three dynamic content applications: the TPC-W e-commerce benchmark with its three workload mixes, an auction site benchmark, and a bulletin board benchmark. We demonstrate that distributed versioning scales better than previous methods that provide consistency. Furthermore, we demonstrate that the benefits of relaxing consistency are limited, except for the conflict-heavy TPC-W ordering mix.
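    The two ingredients named in the abstract, explicit versions and conflict-aware scheduling, can be sketched as follows: a transaction declares the tables it will write at start time, receives a version number per table, and may proceed only once each table has reached that version. All names are hypothetical and the sketch is single-process:

```python
class VersionScheduler:
    """At transaction start, hand out the per-table version each writer
    must wait for; commits bump the table version, releasing successors."""
    def __init__(self):
        self.next_version = {}   # version to assign to the next writer
        self.committed = {}      # highest committed version per table
    def begin(self, write_tables):
        assigned = {}
        for t in write_tables:
            v = self.next_version.get(t, 0)
            assigned[t] = v              # wait until committed[t] reaches v
            self.next_version[t] = v + 1
        return assigned
    def ready(self, assigned):
        return all(self.committed.get(t, 0) == v for t, v in assigned.items())
    def commit(self, assigned):
        for t in assigned:
            self.committed[t] = self.committed.get(t, 0) + 1

s = VersionScheduler()
t1 = s.begin(["items"])            # assigned version 0
t2 = s.begin(["items"])            # assigned version 1: must follow t1
print(s.ready(t1), s.ready(t2))    # True False
s.commit(t1)
print(s.ready(t2))                 # True: t2 may now proceed
```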

    Sincronización de la información de tasación en entornos convergentes

Telecommunications services and systems are in the midst of a major transformation. Users are becoming more diverse in their needs and requirements, resulting in increased demand for sophisticated, individualized communication services that improve their experience. In this context, information management is an important aspect, and charging information is a key element when providing services in a converged environment. Currently, charging data are distributed among the different network elements involved in service provisioning, so implementing a mechanism that keeps this data consistent, and thereby enables real service convergence, is a challenge. This article proposes a mechanism to ensure data synchronization and consistency in IMS (IP Multimedia Subsystem) environments, adopting database synchronization techniques already proven in the information technology field.
