Middleware-based Database Replication: The Gaps between Theory and Practice

Author: Ailamaki Anastasia
Candea George
Cecchet Emmanuel
Publication date: 01/01/2008

The need for high availability and performance in data management systems has been fueling a long-running interest in database replication from both academia and industry. However, academic groups often attack replication problems in isolation, overlooking the need for completeness in their solutions, while commercial teams take a holistic approach that often misses opportunities for fundamental innovation. This has created over time a gap between academic research and industrial practice. This paper aims to characterize the gap along three axes: performance, availability, and administration. We build on our own experience developing and deploying replication systems in commercial and academic settings, as well as on a large body of prior related work. We sift through representative examples from the last decade of open-source, academic, and commercial database replication systems and combine this material with case studies from real systems deployed at Fortune 500 customers. We propose two agendas, one for academic research and one for industrial R&D, which we believe can bridge the gap within 5-10 years. This way, we hope to both motivate and help researchers in making the theory and practice of middleware-based database replication more relevant to each other. Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 200

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

pH1: a transactional middleware for NoSQL

Author: Coelho Fábio André Castanheira Luís
Cruz Francisco
Oliveira Rui Carlos Mendes de
Pereira José
Vilaça Ricardo Manuel Pereira
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

NoSQL databases opt not to offer important abstractions traditionally found in relational databases in order to achieve high levels of scalability and availability: transactional guarantees and strong data consistency. In this work we propose pH1, a generic middleware layer over NoSQL databases that offers transactional guarantees with Snapshot Isolation. This is achieved in a non-intrusive manner, requiring no modifications to servers and no native support for multiple versions. Instead, the transactional context is achieved by means of a multiversion distributed cache and an external transaction certifier, exposed by extending the client’s interface with transaction bracketing primitives. We validate and evaluate pH1 with Apache Cassandra and Hyperdex. First, using the YCSB benchmark, we show that the cost of providing ACID guarantees to these NoSQL databases amounts to an 11% decrease in throughput. Moreover, using the transaction-intensive TPC-C workload, pH1 incurred a 22% decrease in throughput. This contrasts with OMID, a previous proposal that takes advantage of HBase’s support for multiple versions, with a throughput penalty of 76% in the same conditions

Universidade do Minho: RepositoriUM
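
The pH1 abstract above mentions an external transaction certifier that provides Snapshot Isolation but does not describe its algorithm. The following is a minimal sketch of the textbook first-committer-wins rule, not pH1's actual implementation; the class name `Certifier` and its methods are hypothetical and chosen only for illustration.

```python
class Certifier:
    """Toy first-committer-wins certifier (hypothetical, not pH1's code)."""

    def __init__(self):
        self.commit_ts = 0      # timestamp of the latest committed transaction
        self.last_write = {}    # key -> commit timestamp of its latest write

    def begin(self):
        # A transaction's snapshot is the latest commit timestamp at start.
        return self.commit_ts

    def certify(self, start_ts, write_set):
        # Abort (return None) if any written key was committed by another
        # transaction after this transaction's snapshot was taken.
        for key in write_set:
            if self.last_write.get(key, 0) > start_ts:
                return None
        # Otherwise commit: assign a new timestamp and record the writes.
        self.commit_ts += 1
        for key in write_set:
            self.last_write[key] = self.commit_ts
        return self.commit_ts
```

Two transactions that start from the same snapshot and both write the same key conflict: the first to reach `certify` commits, and the second is rejected. Transactions with disjoint write sets both commit.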

Improvements Relating to Database Replication Protocols

Author: Popov P. T.
Stankovic V.
Publication date: 09/10/2013

The present invention concerns improvements relating to database replication. More specifically, aspects of the present invention relate to a fault-tolerant node and a method for avoiding non-deterministic behaviour in the management of synchronous database systems

City Research Online

Implementation and test of transactional primitives over Cassandra

Author: Coelho Fábio André Castanheira Luís
Publication date: 01/01/2013

Master's dissertation in Informatics Engineering. NoSQL databases opt not to offer important abstractions traditionally found in relational databases in order to achieve high levels of scalability and availability: transactional guarantees and strong data consistency. These limitations bring considerable complexity to the development of client applications and are therefore an obstacle to the broader adoption of the technology. In this work we propose a middleware layer over NoSQL databases that offers transactional guarantees with Snapshot Isolation. The proposed solution is achieved in a non-intrusive manner, providing to the clients the same interface as a NoSQL database, simply adding the transactional context. The transactional context is the focus of our contribution and is modularly based on a Non Persistent Version Store that holds several versions of elements and interacts with an external transaction certifier. In this work, we present an implementation of our system over Apache Cassandra and, by using two representative benchmarks, YCSB and TPC-C, we measure the cost of adding transactional support with ACID guarantees. Funding: Fundação para a Ciência e a Tecnologia (FCT) - Project Stratus/FCOMP-01-0124-FEDER-015020, within project Pest/FCOMP-01-0124-FEDER-022701. ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness). European Union Seventh Framework Programme (FP7) under grant agreement no 257993 (CumuloNimbo)

Universidade do Minho: RepositoriUM

Efficient middleware for database replication

Author: Ferreira André Abecasis Gomes
Publication venue: FCT-UNL
Publication date: 01/01/2008

Master's dissertation in Informatics Engineering. Database systems store data for a wide variety of applications, such as Web applications, enterprise applications, scientific research, and even personal applications. Because databases are widely used in systems that are fundamental to their users, they must be efficient and reliable. Additionally, to serve a large number of users, databases must be scalable so that they can process large numbers of transactions. Achieving this requires data replication. In a replicated system, all nodes contain a copy of the database. To guarantee that replicas converge, write operations must be executed on all replicas. The way updates are propagated leads to two different replication strategies. The first, known as asynchronous or optimistic replication, propagates updates asynchronously after an update transaction concludes. The second, known as synchronous or pessimistic replication, broadcasts updates synchronously during the transaction. In pessimistic replication, unlike optimistic replication, the replicas remain consistent. This simplifies application programming, since data replication is transparent to the applications. However, the approach has scalability issues, caused by the number of messages exchanged during synchronization, which delays the termination of the transaction. As a result, users experience much higher latency under the pessimistic approach. This work presents the design and implementation of a database replication system with snapshot isolation semantics, using a synchronous replication approach. The system is composed of a primary replica and a set of secondary replicas that fully replicate the database. The primary replica executes the read-write transactions, while the remaining replicas execute the read-only transactions. After a read-write transaction concludes on the primary replica, its updates are propagated to the remaining replicas. This approach suits workloads in which the fraction of read operations is considerably higher than that of write operations, allowing the read load to be distributed over the multiple replicas. To improve the performance of the system, the clients execute some operations speculatively, in order to avoid waiting during the execution of a database operation. Thus, the client may continue its execution while the operation is executed on the database. If the result returned to the client is found to be incorrect, the transaction is aborted, ensuring the correctness of the execution of the transactions

Repositório da Universidade Nova de Lisboa
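
The routing described in the abstract above (read-write transactions on the primary, read-only transactions spread over the secondaries) can be sketched as follows. This is an illustrative in-memory toy under stated assumptions, not the dissertation's system: the name `ReplicatedStore`, the round-robin read policy, and the use of plain dictionaries as replicas are all hypothetical, and speculative client execution is omitted.

```python
import itertools


class ReplicatedStore:
    """Toy primary/secondary store (hypothetical names, in-memory only)."""

    def __init__(self, n_secondaries):
        self.primary = {}
        self.secondaries = [{} for _ in range(n_secondaries)]
        # Assumed policy: rotate read-only requests over the secondaries.
        self._next = itertools.cycle(range(n_secondaries))

    def write(self, key, value):
        # Read-write work executes on the primary replica first.
        self.primary[key] = value
        # After it concludes, the update is propagated to every secondary.
        for replica in self.secondaries:
            replica[key] = value

    def read(self, key):
        # Read-only work is served by the next secondary in rotation.
        idx = next(self._next)
        return idx, self.secondaries[idx].get(key)
```

Because every write reaches all secondaries before `write` returns, any secondary can answer a subsequent read, which is what lets a read-heavy load be spread across replicas.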

Extending DBMSs with satellite databases

Author: Alonso Gustavo
Plattner Christian
Özsu M.
Publication date: 18/06/2018

In this paper, we propose an extensible architecture for database engines where satellite databases are used to scale out and implement additional functionality for a centralized database engine. The architecture uses a middleware layer that offers consistent views and a single system image over a cluster of machines with database engines. One of these engines acts as a master copy while the others are read-only snapshots which we call satellites. The satellites are lightweight DBMSs used for scalability and to provide functionality difficult or expensive to implement in the main engine. Our approach also supports the dynamic creation of satellites to be able to autonomously adapt to varying loads. The paper presents the architecture, discusses the research problems it raises, and validates its feasibility with extensive experimental results

RERO DOC Digital Library

PaRiS: Causally Consistent Transactions with Non-blocking Reads and Partial Replication

Author: Didona Diego
Spirovska Kristina
Zwaenepoel Willy
Publication date: 25/02/2019

Geo-replicated data platforms are at the backbone of several large-scale online services. Transactional Causal Consistency (TCC) is an attractive consistency level for building such platforms. TCC avoids many anomalies of eventual consistency, eschews the synchronization costs of strong consistency, and supports interactive read-write transactions. Partial replication is another attractive design choice for building geo-replicated platforms, as it increases the storage capacity and reduces update propagation costs. This paper presents PaRiS, the first TCC system that supports partial replication and implements non-blocking parallel read operations, whose latency is paramount for the performance of read-intensive applications. PaRiS relies on a novel protocol to track dependencies, called Universal Stable Time (UST). By means of a lightweight background gossip process, UST identifies a snapshot of the data that has been installed by every DC in the system. Hence, transactions can consistently read from such a snapshot on any server in any replication site without having to block. Moreover, PaRiS requires only one timestamp to track dependencies and define transactional snapshots, thereby achieving resource efficiency and scalability. We evaluate PaRiS on a large-scale AWS deployment composed of up to 10 replication sites. We show that PaRiS scales well with the number of DCs and partitions, while being able to handle larger data-sets than existing solutions that assume full replication. We also demonstrate a performance gain of non-blocking reads vs. a blocking alternative (up to 1.47x higher throughput with 5.91x lower latency for read-dominated workloads and up to 1.46x higher throughput with 20.56x lower latency for write-heavy workloads)

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne
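
The PaRiS abstract above says that UST identifies a snapshot already installed by every DC, so reads at that snapshot never block. The abstract does not give the protocol, so the sketch below only illustrates one plausible reading: take the minimum of the per-DC installed timestamps and serve the newest version at or below it. The function names and the data layout are hypothetical, and the gossip process that would collect the timestamps is not modeled.

```python
def universal_stable_time(installed_ts):
    """Return the largest timestamp installed by every DC.

    installed_ts maps a DC name to the highest timestamp t such that the
    DC has installed all updates with timestamp <= t (an assumption).
    """
    return min(installed_ts.values())


def read_at_snapshot(versions, ust):
    """Return the newest value whose timestamp is <= ust, or None.

    versions is a list of (timestamp, value) pairs for a single key.
    """
    visible = [v for v in versions if v[0] <= ust]
    if not visible:
        return None
    return max(visible, key=lambda v: v[0])[1]
```

With installed timestamps of 12, 9, and 15 at three DCs, the stable snapshot is 9, so a version written at timestamp 11 stays invisible until the slowest DC catches up.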

Supporting multiple isolation levels in replicated environments

Author: Bernabé-Gisbert Josep M.
Muñoz-Escoí Francesc D.
Publication venue: 'Elsevier BV'
Publication date: 01/10/2012

Replication is used by databases to implement reliability and provide scalability. However, achieving transparent replication is not an easy task. A replicated database is transparent if it can seamlessly replace a standard stand-alone database without requiring any changes to the components of the system. Database replication transparency can be achieved if: (a) replication protocols remain hidden for all other components of the system; and (b) the functionality of a stand-alone database is provided. The ability to simultaneously execute transactions under different isolation levels is a functionality offered by all stand-alone databases but not by their replicated counterparts. Allowing different isolation levels may improve overall system performance. For example, the TPC-C benchmark specification tolerates execution of some transactions at weaker isolation levels in order to increase throughput of committed transactions. In this paper, we show how replication protocols can be extended to enable transactions to be executed under different isolation levels. © 2012 Elsevier B.V. All rights reserved. This work has been supported by the Spanish Ministerio de Ciencia e Innovación (MICINN) and the European Regional Development Fund (ERDF/FEDER) under research grants TIN2009-14460-C03-01 and TIN2010-17193. The translation of this paper was funded by the Universitat Politècnica de València, Spain. Bernabé-Gisbert, J. M.; Muñoz-Escoí, F. D. (2012). Supporting multiple isolation levels in replicated environments. Data and Knowledge Engineering. 79-80:1-16. doi:10.1016/j.datak.2012.05.001

Crossref

RiuNet

Fault tolerance via diversity for off-the-shelf products: A study with SQL database servers

Author: Gashi I.
Popov P. T.
Strigini L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2007

If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, which is a scheme formerly reserved for few and highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic noncrash failures, which is a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present

City Research Online

Crossref