118,011 research outputs found
Performance analysis of static locking in replicated distributed database systems
Data replication and transaction deadlocks can severely affect the performance of distributed database systems. Many current evaluation techniques ignore these aspects because they are difficult to evaluate through analysis and time-consuming to evaluate through simulation. A technique is used that combines simulation and analysis to closely capture the impact of deadlocks and to evaluate the performance of a replicated distributed database with both shared and exclusive locks.
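The shared/exclusive locking the abstract refers to can be illustrated with a minimal lock table. This is a sketch of the general technique, not code from the paper; all names are illustrative. Shared ('S') locks are mutually compatible, while an exclusive ('X') lock conflicts with everything:

```python
# Minimal sketch of a lock table with shared and exclusive locks.
# A False return models a request that must wait, which is where
# deadlocks can arise under static locking.

class LockTable:
    def __init__(self):
        # item -> (mode, set of holder transaction ids)
        self.locks = {}

    def acquire(self, txn, item, mode):
        """Try to acquire `mode` ('S' or 'X') on `item` for `txn`.
        Returns True on success, False if the request must wait."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == 'S' and held_mode == 'S':
            holders.add(txn)          # shared locks are compatible
            return True
        if holders == {txn}:
            # sole holder may upgrade/downgrade its own lock
            self.locks[item] = ('X' if 'X' in (mode, held_mode) else 'S', holders)
            return True
        return False                  # S/X or X/anything conflict: wait

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]
```

For example, two readers can share an item, but a writer is blocked until both release it; cycles of such waits are exactly the deadlocks whose performance impact the paper analyzes.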
Effects of distributed database modeling on evaluation of transaction rollbacks
Data distribution, degree of data replication, and transaction access patterns are key factors in determining the performance of distributed database systems. In order to simplify the evaluation of performance measures, database designers and researchers tend to make simplistic assumptions about the system. The effect of modeling assumptions on the evaluation of one such measure, the number of transaction rollbacks, is studied in a partitioned distributed database system. Six probabilistic models are developed, along with expressions for the number of rollbacks under each of these models. Essentially, the models differ in terms of the available system information. The analytical results so obtained are compared to results from simulation. It is concluded that most of the probabilistic models yield overly conservative estimates of the number of rollbacks. The effect of transaction commutativity on system throughput is also grossly underestimated when such models are employed.
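A rollback estimate of the kind the abstract compares against can be obtained by simulation. The following sketch is our own crude Monte Carlo model, not one of the paper's six: in each batch, a fixed number of concurrent transactions each lock a few random items, and a transaction that overlaps an earlier one in its batch is rolled back.

```python
import random

def simulate_rollbacks(n_items, txn_size, n_batches, concurrency, seed=0):
    """Crude Monte Carlo rollback count: per batch, `concurrency`
    transactions each access `txn_size` distinct items out of `n_items`;
    a transaction that conflicts with an earlier one is rolled back."""
    rng = random.Random(seed)
    rollbacks = 0
    for _ in range(n_batches):
        locked = set()
        for _ in range(concurrency):
            access = set(rng.sample(range(n_items), txn_size))
            if access & locked:
                rollbacks += 1      # conflict: roll back instead of waiting
            else:
                locked |= access
    return rollbacks
```

Varying `n_items` (data distribution) and `txn_size`/`concurrency` (access patterns) shows how strongly the estimated rollback count depends on the modeling assumptions, which is the paper's central point.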
Stochastic modeling for performance evaluation of database replication protocols
Performance is often the most important non-functional property for database systems and associated replication solutions. This is true at least in industrial contexts. Evaluating performance using real systems, however, is computationally demanding and costly. In many cases, choosing between several competing replication protocols poses a difficulty in ranking these protocols meaningfully: the ranking is determined not so much by the quality of the competing protocols but, instead, by the quality of the available implementations. Addressing this difficulty requires a level of abstraction in which the impact of the implementations on the comparison is reduced, or entirely eliminated. We propose a stochastic model for performance evaluation of database replication protocols, paying particular attention to: i) empirical validation of a number of assumptions used in the stochastic model, and ii) empirical validation of model accuracy for a chosen replication protocol. For the empirical validations we used the TPC-C benchmark. Our implementation of the model is based on Stochastic Activity Networks (SAN), extended by bespoke code. The model may reduce the cost of performance evaluation in comparison with empirical measurements, while keeping the accuracy of the assessment to an acceptable level.
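To illustrate the appeal of stochastic modeling over measurement, here is a minimal sketch (our own, far simpler than the paper's SAN model): a single M/M/1 server stands in for one stage of a replication protocol, and its simulated mean response time can be checked against the analytic value 1/(mu - lam).

```python
import random

def mm1_mean_response(lam, mu, n=200000, seed=1):
    """Discrete-event sketch of an M/M/1 queue (Lindley recursion):
    Poisson arrivals at rate `lam`, exponential service at rate `mu`.
    Returns the simulated mean response (queueing + service) time,
    which should approach the analytic 1 / (mu - lam)."""
    rng = random.Random(seed)
    t = 0.0          # arrival clock
    free = 0.0       # time at which the server next becomes free
    total = 0.0
    for _ in range(n):
        t += rng.expovariate(lam)
        start = max(t, free)
        free = start + rng.expovariate(mu)
        total += free - t
    return total / n
```

The same validation pattern, comparing model predictions against measured behavior, is what the paper does at much larger scale with TPC-C.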
Group communications and database replication: techniques, issues and performance
Databases are an important part of today's IT infrastructure: both companies and state institutions rely on database systems to store most of their important data. As we are more and more dependent on database systems, securing this key facility is now a priority. Because of this, research on fault-tolerant database systems is of increasing importance. One way to ensure the fault-tolerance of a system is by replicating it. Replication is a natural way to deal with failures: if one copy is not available, we use another one. However, implementing consistent replication is not easy. Database replication is hardly a new area of research: the first papers on the subject are more than twenty years old. Yet how to build an efficient, consistent replicated database is still an open research question. Recently, a new approach to solve this problem has been proposed. The idea is to rely on a communication infrastructure called group communications. This infrastructure offers high-level primitives that can help in the design and the implementation of a replicated database. While promising, this approach to database replication is still in its infancy. This thesis focuses on group communication-based database replication and strives to give an overall understanding of this topic. This thesis has three major contributions. In the structural domain, it introduces a classification of replication techniques. In the qualitative domain, an analysis of fault-tolerance semantics is proposed. Finally, in the quantitative domain, a performance evaluation of group communication-based database replication is presented. The classification gives an overview of the different means to implement database replication. Techniques described in the literature are sorted using this classification. The classification highlights structural similarities of techniques originating from different communities (the database community and the distributed systems community).
For each category of the classification, we also analyse the requirements imposed on the database component and the group communication primitives that are needed to enforce consistency. Group communication-based database replication implies building a system from two different components: a database system and a group communication system. Fault-tolerance is an end-to-end property: a system built from two components tends to be only as fault-tolerant as its weakest component. The analysis of fault-tolerance semantics shows what fault-tolerance guarantee is ensured by group communication-based replication techniques. Additionally, a new fault-tolerance guarantee, group-safety, is proposed; group-safety is better suited to group communication-based database replication. We also show that group-safe replication techniques can offer improved performance. Finally, the performance evaluation offers a quantitative view of group communication-based replication techniques. The performance of group communication techniques and classical database replication techniques is compared, and the way those different techniques react to different loads is explored. Some optimisations of group communication techniques are also described and their performance benefits evaluated.
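The key group communication primitive behind this line of work is total-order (atomic) broadcast: if every deterministic replica applies the same updates in the same order, all replicas converge to the same state. A toy sketch (names are illustrative, not from the thesis):

```python
# Sketch: total-order broadcast keeps deterministic replicas identical.

class Replica:
    def __init__(self):
        self.state = {}

    def apply(self, update):
        key, value = update
        self.state[key] = value

def total_order_broadcast(replicas, updates):
    """Stand-in for a group communication primitive: delivers the
    same updates in the same order to every replica."""
    for u in updates:
        for r in replicas:
            r.apply(u)

replicas = [Replica() for _ in range(3)]
total_order_broadcast(replicas, [("x", 1), ("y", 2), ("x", 3)])
# every replica now holds {"x": 3, "y": 2}
```

The hard part, which the thesis analyses, is providing this ordering guarantee efficiently and deciding what fault-tolerance guarantee (e.g. group-safety) the combined database-plus-group-communication system actually delivers.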
Self-organizing strategies for a column-store database
Column-store database systems open new vistas for improved maintenance through self-organization. Individual columns are the focal point, which simplifies balancing conflicting requirements. This work presents two workload-driven self-organizing techniques in a column-store: adaptive segmentation and adaptive replication. Adaptive segmentation splits a column into non-overlapping segments based on the actual query load. Likewise, adaptive replication creates segment replicas. The strategies can support different application requirements by trading off reorganization overhead against storage cost. Both techniques can significantly improve system performance, as demonstrated in an evaluation of different scenarios.
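Adaptive segmentation can be sketched as follows. This is our own illustrative heuristic, not the paper's algorithm: per-position access counts are scanned, and a new segment boundary is placed wherever the load changes sharply, so hot and cold ranges of the column end up in separate segments (which adaptive replication could then replicate selectively).

```python
# Workload-driven segmentation sketch: split a column where the
# access-count profile jumps by more than `threshold`.

def segment_column(access_counts, threshold):
    """Return (start, end) half-open ranges over the column. A new
    segment starts at position i when the access count changes by
    more than `threshold` relative to position i - 1."""
    boundaries = [0]
    for i in range(1, len(access_counts)):
        if abs(access_counts[i] - access_counts[i - 1]) > threshold:
            boundaries.append(i)
    boundaries.append(len(access_counts))
    return list(zip(boundaries, boundaries[1:]))
```

A smaller threshold yields more, finer segments (better load adaptation, more reorganization and bookkeeping overhead); a larger one yields fewer segments, matching the overhead-versus-storage trade-off the abstract describes.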
Database High Availability using SHADOW Systems
High Availability DataBase (HADB) systems are used to keep databases available despite failures. Pairing an active database system with a standby system is one commonly used HADB technique: the active system serves read/write workloads, while one or more standby systems replicate the active and serve read-only workloads. Though widely used, this technique has some significant drawbacks. The active system becomes the bottleneck under heavy write workloads. Replicating changes synchronously from the active to the standbys further reduces the performance of the active system, while asynchronous replication risks the loss of updates during failover. Finally, the shared-nothing architecture of active-standby systems is unnecessarily complex and cost-inefficient.
In this thesis, we present SHADOW systems, a new technique for database high availability. In a SHADOW system, the responsibility for database replication is pushed from the database systems into a shared, reliable storage system. The active and standby systems share access to a single logical copy of the database, which resides in shared storage. SHADOW introduces write offloading, which frees the active system from the need to update the persistent database, placing that responsibility on the underutilized standby system instead. By exploiting shared storage, SHADOW systems avoid the overhead of database-managed synchronized replication, while ensuring that no updates will be lost during a failover. We have implemented a SHADOW system using PostgreSQL, and we present the results of a performance evaluation showing that the SHADOW system can outperform both traditional synchronous replication and standalone PostgreSQL systems.
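The write-offloading idea can be sketched in a few lines. This is a loose illustration of the concept under our own simplifying assumptions (a list stands in for the shared write-ahead log, a dict for the shared database copy), not the SHADOW implementation:

```python
# Write offloading sketch: the active node persists updates only by
# appending to a shared log; the standby replays the log into the
# single shared copy of the database. Because the log lives in shared,
# reliable storage, a failover can replay the tail and lose nothing.

shared_log = []            # shared, reliable storage: the log
shared_db = {}             # single logical copy of the database

def active_write(key, value):
    """Active system: log the update only; updating the persistent
    database is offloaded to the standby."""
    shared_log.append((key, value))

def standby_replay(applied_upto):
    """Standby system: apply log records beyond `applied_upto` to the
    shared database. Returns the new applied position."""
    for key, value in shared_log[applied_upto:]:
        shared_db[key] = value
    return len(shared_log)

def failover_read(key, applied_upto):
    """On failover, replay the outstanding log tail first, so no
    committed update is lost; then serve the read."""
    standby_replay(applied_upto)
    return shared_db.get(key)
```

The point of the design choice is visible even in the sketch: `active_write` does a single log append, so the active node's write path stays short regardless of how far the standby's replay lags.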
Design and Analysis of a Logless Dynamic Reconfiguration Protocol
Distributed replication systems based on the replicated state machine model have become ubiquitous as the foundation of modern database systems. To ensure availability in the presence of faults, these systems must be able to dynamically replace failed nodes with healthy ones via dynamic reconfiguration. MongoDB is a document-oriented database with a distributed replication mechanism derived from the Raft protocol. In this paper, we present MongoRaftReconfig, a novel dynamic reconfiguration protocol for the MongoDB replication system. MongoRaftReconfig utilizes a logless approach to managing configuration state and decouples the processing of configuration changes from the main database operation log. The protocol's design was influenced by engineering constraints faced when attempting to redesign an unsafe, legacy reconfiguration mechanism that existed previously in MongoDB. We provide a safety proof of MongoRaftReconfig, along with a formal specification in TLA+. To our knowledge, this is the first published safety proof and formal specification of a reconfiguration protocol for a Raft-based system. We also present results from model checking its safety properties on finite protocol instances. Finally, we discuss the conceptual novelties of MongoRaftReconfig, how it can be understood as an optimized and generalized version of the single-server reconfiguration algorithm of Raft, and present an experimental evaluation of how its optimizations can provide performance benefits for reconfigurations.
Comment: 35 pages, 2 figures
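The "logless" idea can be sketched briefly: instead of appending configuration changes to the operation log, the configuration is kept as a single versioned value, and a node adopts a candidate configuration only if it is newer. The ordering and field names below are our own Raft-style illustration, not MongoDB's actual schema:

```python
# Logless configuration management sketch: config is a standalone
# versioned value, compared by (term, version), not a log entry.

def is_newer(candidate, current):
    """Raft-style ordering: higher term wins; within a term,
    higher version wins."""
    return (candidate["term"], candidate["version"]) > \
           (current["term"], current["version"])

def maybe_install(node, candidate):
    """Adopt `candidate` only if it is newer than the node's current
    config; stale candidates are ignored, so there is no config log
    to truncate or replay."""
    if is_newer(candidate, node["config"]):
        node["config"] = candidate
        return True
    return False
```

Because only the latest configuration matters, configuration changes need not flow through (or contend with) the main database operation log, which is the decoupling the abstract describes.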