
    Ensuring Serializable Executions with Snapshot Isolation DBMS

    Snapshot Isolation (SI) is a multiversion concurrency control mechanism that has been implemented by open-source and commercial database systems such as PostgreSQL and Oracle. The main feature of SI is that a read operation does not block a write operation and vice versa, which allows a higher degree of concurrency than traditional two-phase locking. SI prevents many anomalies that appear at other isolation levels, but it can still result in non-serializable executions, in which database integrity constraints can be violated. Several techniques have been proposed to ensure serializable execution with engines running SI; these techniques are based on modifying the applications by introducing conflicting SQL statements. However, with each of these techniques the DBA has to make a difficult choice among the possible transactions to modify. This thesis helps DBAs choose between these different techniques and choices by understanding how the choices affect system performance. It also proposes a novel technique called 'External Lock Manager' (ELM), which introduces conflicts in a separate lock-manager object so that every execution is serializable. We build a prototype system for ELM and run experiments to demonstrate the robustness of the new technique compared to the previous techniques. Experiments show that modifying the application code for some transactions has a high impact on performance for some choices, which makes it very hard for DBAs to choose wisely. ELM, however, has peak performance similar to SI, no matter which transactions are chosen for modification. Thus ELM is a robust technique for ensuring serializable execution.
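The non-serializable executions the abstract refers to include the well-known "write skew" anomaly. A minimal in-memory sketch (the on-call doctors scenario is a standard textbook illustration, not taken from the thesis) shows how two SI transactions with disjoint write sets can both commit and jointly break an invariant:

```python
# Hypothetical sketch of SI's classic "write skew" anomaly.
# The on-call invariant and transaction names are illustrative.

# Shared database: two doctors; invariant = at least one is on call.
db = {"alice_on_call": True, "bob_on_call": True}

def run_under_si(txn1, txn2):
    """Run two transactions against the same snapshot, as SI permits
    when their write sets are disjoint (no write-write conflict)."""
    snapshot = dict(db)        # both transactions read this snapshot
    w1 = txn1(snapshot)        # each transaction returns its write set
    w2 = txn2(snapshot)
    if set(w1) & set(w2):      # SI's "first committer wins" rule
        raise RuntimeError("write-write conflict: one txn aborts")
    db.update(w1)              # disjoint writes: both commit
    db.update(w2)

def alice_goes_home(s):
    # Snapshot says Bob is still on call, so Alice may leave.
    return {"alice_on_call": False} if s["bob_on_call"] else {}

def bob_goes_home(s):
    return {"bob_on_call": False} if s["alice_on_call"] else {}

run_under_si(alice_goes_home, bob_goes_home)
# Both committed, yet "someone is on call" now fails:
print(db)  # {'alice_on_call': False, 'bob_on_call': False}
```

Run serially, either transaction would see the other's write and keep the invariant; under SI both read the same stale snapshot, which is exactly the kind of execution the techniques surveyed in the thesis aim to rule out.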

    Integrated approach to recovery and high availability in an updatable, distributed data warehouse

    Thesis (M. Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 99-105). Any highly available data warehouse will use some form of data replication to ensure that it can continue to service queries despite machine failures. In this thesis, I demonstrate that it is possible to leverage the data replication available in these environments to build a simple yet efficient crash recovery mechanism that revives a crashed site by querying remote replicas for missing updates. My new integrated approach to recovery and high availability, called HARBOR (High Availability and Replication-Based Online Recovery), targets updatable data warehouses and offers an attractive alternative to the widely used log-based crash recovery algorithms found in existing database systems. Aside from its simplicity over log-based approaches, HARBOR also avoids the runtime overhead of maintaining an on-disk log, accomplishes recovery without quiescing the system, allows replicated data to be stored in non-identical formats, and supports the parallel recovery of multiple sites and database objects. To evaluate HARBOR's feasibility, I compare HARBOR's runtime overhead and recovery performance with those of two-phase commit and ARIES, the gold standard for log-based recovery, on a four-node distributed database system that I have implemented. My experiments show that HARBOR incurs lower runtime overhead because it does not require log writes to be forced to disk during transaction commit. Furthermore, they indicate that HARBOR's recovery performance is comparable to ARIES's performance on many workloads and even surpasses it on characteristic warehouse workloads with few updates to historical data. The results are highly encouraging and suggest that my integrated approach is quite tenable. By Edmond Lau, M.Eng.
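The core recovery idea — revive a crashed site by querying a live replica for the updates it missed, rather than replaying a local log — can be sketched in a few lines. This is an illustrative toy, not HARBOR's actual interface; the dictionaries, keys, and timestamps are invented for the example:

```python
# Illustrative sketch of log-free recovery: a crashed site catches up
# by fetching from a live replica every committed update newer than
# the last commit it made durable before failing. All names here are
# hypothetical stand-ins for real catalog/replica structures.

def recover(crashed, live):
    """Apply to `crashed` all updates on `live` newer than the
    crashed site's last durable commit timestamp."""
    checkpoint = crashed["last_durable_ts"]
    missing = [u for u in live["updates"] if u["ts"] > checkpoint]
    for u in missing:                          # apply in timestamp order
        crashed["table"][u["key"]] = u["value"]
        crashed["last_durable_ts"] = u["ts"]
    return len(missing)

live = {"updates": [
    {"ts": 1, "key": "a", "value": 10},
    {"ts": 2, "key": "b", "value": 20},
    {"ts": 3, "key": "a", "value": 30},
]}
crashed = {"last_durable_ts": 1, "table": {"a": 10}}

n = recover(crashed, live)
print(n, crashed["table"])  # 2 {'a': 30, 'b': 20}
```

Because the replica query is an ordinary read, a real system built on this idea can run it concurrently with live traffic, which is what lets HARBOR recover without quiescing the system.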

    Serializable Isolation for Snapshot Databases

    Many popular database management systems implement a multiversion concurrency control algorithm called snapshot isolation rather than providing full serializability based on locking. There are well-known anomalies permitted by snapshot isolation that can lead to violations of data consistency by interleaving transactions that would maintain consistency if run serially. Until now, the only way to prevent these anomalies was to modify the applications by introducing explicit locking or artificial update conflicts, following careful analysis of conflicts between all pairs of transactions. This thesis describes a modification to the concurrency control algorithm of a database management system that automatically detects and prevents snapshot isolation anomalies at runtime for arbitrary applications, thus providing serializable isolation. The new algorithm preserves the properties that make snapshot isolation attractive, including that readers do not block writers and vice versa. An implementation of the algorithm in a relational database management system is described, along with a benchmark and performance study, showing that the throughput approaches that of snapshot isolation in most cases.
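The runtime detection described above rests on a known result: every snapshot isolation anomaly contains a transaction with both an incoming and an outgoing read-write antidependency. A deliberately minimal sketch of that check (the bookkeeping and abort policy are simplified; a real engine tracks versions and locks, not booleans):

```python
# Simplified sketch of runtime anomaly detection under SI: record
# rw-antidependency edges between concurrent transactions and abort
# any transaction that acquires both an incoming and an outgoing
# rw-edge (the "dangerous structure"). This over-approximates real
# cycles, so some aborts are false positives by design.

class Txn:
    def __init__(self, name):
        self.name = name
        self.in_rw = False    # someone overwrote a version we read... no:
                              # someone read a version we overwrote
        self.out_rw = False   # we read a version someone overwrote
        self.aborted = False

def record_rw_edge(reader, writer):
    """`reader` read a version that `writer` concurrently overwrote."""
    reader.out_rw = True
    writer.in_rw = True
    # A transaction with both edge kinds may be the pivot of a cycle:
    for t in (reader, writer):
        if t.in_rw and t.out_rw and not t.aborted:
            t.aborted = True

t1, t2, t3 = Txn("T1"), Txn("T2"), Txn("T3")
record_rw_edge(t1, t2)   # T1 -rw-> T2
record_rw_edge(t2, t3)   # T2 -rw-> T3: T2 now has both edges, aborts
print([t.aborted for t in (t1, t2, t3)])  # [False, True, False]
```

Aborting only the pivot is what lets the algorithm keep snapshot isolation's read/write non-blocking behaviour while still guaranteeing serializable executions.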

    Consistency Models in Distributed Systems with Physical Clocks

    Most existing distributed systems use logical clocks to order events in the implementation of various consistency models. Although logical clocks are straightforward to implement and maintain, they may affect the scalability, availability, and latency of the system when used to totally order events in strong consistency models. They can also incur considerable overhead when used to track and check the causal relationships among events in some weak consistency models. In this thesis we explore how to efficiently implement different consistency models using loosely synchronized physical clocks. Compared with logical clocks, physical clocks move forward at approximately the same speed and can be loosely synchronized with well-known standard protocols. Hence a group of physical clocks located at different servers can be used to order events in a distributed system at very low cost. We first describe Clock-SI, a fully distributed implementation of snapshot isolation for partitioned data stores. It uses the local physical clock at each partition to assign snapshot and commit timestamps to transactions. By avoiding a centralized service for timestamp management, Clock-SI improves the throughput, latency, and availability of the system. We then introduce Clock-RSM, a low-latency state machine replication protocol that provides linearizability. It totally orders state machine commands by assigning them physical timestamps obtained from the local replica. By eliminating the message step for command ordering in existing solutions, Clock-RSM reduces the latency of consistent geo-replication across multiple data centers. Finally, we present Orbe, which provides an efficient and scalable implementation of causal consistency for both partitioned and replicated data stores. Orbe builds an explicit total order, consistent with causality, among all operations using physical timestamps. It reduces the number of dependencies that have to be carried in update replication messages and checked on installation of replicated updates. As a result, Orbe improves the throughput of the system.
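A key subtlety when snapshot timestamps come from local physical clocks, as in Clock-SI, is clock skew: a partition whose clock lags a transaction's snapshot timestamp must delay the read until its clock catches up, or the snapshot could miss commits. A toy sketch of that delay rule (clock values are invented; real clocks tick in microseconds, not integers):

```python
# Toy sketch of the skew-handling rule in a Clock-SI-style design:
# reads at a partition whose local physical clock is behind the
# transaction's snapshot timestamp wait out the difference, keeping
# the snapshot consistent without any centralized timestamp service.

def read_delay(partition_clock, snapshot_ts):
    """Ticks a read must wait at this partition before it is safe."""
    if partition_clock >= snapshot_ts:
        return 0                          # clock already past snapshot
    return snapshot_ts - partition_clock  # wait out the clock skew

# Transaction takes its snapshot timestamp from its local clock: 105.
snapshot_ts = 105
print(read_delay(107, snapshot_ts))  # 0  (remote clock ahead: serve now)
print(read_delay(103, snapshot_ts))  # 2  (lagging clock: brief wait)
```

Because clocks are loosely synchronized, these waits are bounded by the skew, which is what makes the fully distributed design cheap compared with a centralized timestamp manager.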

    Detecting and Tolerating Byzantine Faults in Database Systems

    This thesis describes the design, implementation, and evaluation of a replication scheme to handle Byzantine faults in transaction processing database systems. The scheme compares answers from queries and updates on multiple replicas, which are off-the-shelf database systems, to provide a single database that is Byzantine fault tolerant. The scheme works when the replicas are homogeneous, but it also allows heterogeneous replication, in which replicas come from different vendors. Heterogeneous replicas reduce the impact of bugs and security compromises because they are implemented independently and are thus less likely to suffer correlated failures. A final component of the scheme is a repair mechanism that can correct the state of a faulty replica, ensuring the longevity of the scheme. The main challenge in designing a replication scheme for transaction processing systems is ensuring that the replicas' states do not diverge while allowing a high degree of concurrency. We have developed two novel concurrency control protocols, commit barrier scheduling (CBS) and snapshot epoch scheduling (SES), that provide strong consistency and good performance. The two protocols provide different types of consistency: CBS provides single-copy serializability and SES provides single-copy snapshot isolation. We have implemented both protocols in the context of a replicated SQL database. Our implementation has been tested with production versions of several commercial and open source databases as replicas. Our experiments show that a configuration that can tolerate one faulty replica has only a modest performance overhead (about 10-20% for the TPC-C benchmark). Our implementation successfully masks several Byzantine faults observed in practice, and we have used it to find a new bug in MySQL.
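The answer-comparison step at the heart of such a scheme can be sketched simply: send the same query to every replica and accept a result only when at least f+1 replicas agree, which masks up to f Byzantine replicas. This is a hedged illustration of the general voting idea, not the thesis's actual protocol; the dict "replicas" stand in for real database engines:

```python
# Sketch of Byzantine-fault-tolerant query voting: with 2f+1 replicas,
# an answer reported by >= f+1 of them cannot have been fabricated by
# the (at most f) faulty replicas. Replicas here are plain dicts.

from collections import Counter

def byzantine_query(replicas, key, f):
    """Return the answer agreed on by at least f+1 replicas, else None."""
    answers = [r.get(key) for r in replicas]
    value, votes = Counter(answers).most_common(1)[0]
    return value if votes >= f + 1 else None

good1 = {"balance": 100}
good2 = {"balance": 100}
faulty = {"balance": 999}       # Byzantine replica returns garbage

print(byzantine_query([good1, good2, faulty], "balance", f=1))  # 100
```

The hard part the thesis addresses, which this sketch omits, is making heterogeneous replicas produce comparable answers at all: CBS and SES exist to keep concurrent transactions ordered the same way on every replica so that honest replicas agree.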

    Dynamic Content Web Applications: Crash, Failover, And Recovery Analysis

    This work assesses how crashes and recoveries affect the performance of a replicated dynamic content web application. RobustStore is the result of retrofitting TPC-W's on-line bookstore with Treplica, a middleware for building dependable applications. Implementations of Paxos and Fast Paxos are at the core of Treplica's efficient and programmer-friendly support for replication and recovery. The TPC-W benchmark, augmented with faultloads and dependability measures, is used to evaluate the behaviour of RobustStore. Experiments apply faultloads that cause sequential and concurrent replica crashes. RobustStore's performance drops by less than 13% during the recovery from two simultaneous replica crashes. When subject to an identical faultload and a shopping workload, a five-replica RobustStore maintains an accuracy of 99.999%. Our results display not only good performance, total autonomy, and uninterrupted availability; they also show that it is simple to develop efficient recovery-oriented applications using Treplica. ©2009 IEEE.