Fast Algorithms for Maintaining Replica Consistency in Lazy Master Replicated Databases
Projet RODIN, Projet REFLECS
In a lazy master replicated database, a transaction can commit after updating one replica copy (the primary copy) at some master node. After the transaction commits, the updates are propagated towards the other replicas (secondary copies), which are updated in separate refresh transactions. A central problem is the design of algorithms that maintain replica consistency while minimizing the performance degradation due to the synchronization of refresh transactions. In this paper, we propose a simple and general refreshment algorithm that solves this problem, and we prove its correctness. The principle of the algorithm is to let refresh transactions wait for a certain "deliver time" before being executed at a node holding secondary copies. We then present two main optimizations to this algorithm. One is based on specific properties of the topology of replica distribution across nodes; in particular, we characterize the nodes for which the deliver time can be null. The other improves the refreshment algorithm by using an immediate update propagation strategy. Our performance evaluation demonstrates the effectiveness of this optimization.
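The waiting rule described in the abstract can be sketched in a few lines. This is a hedged illustration, not the paper's actual algorithm: the class name, the fixed `DELAY_BOUND`, and the timestamp representation are all assumptions made for the sketch.

```python
import heapq

# Illustrative sketch (names and the fixed delay bound are assumptions,
# not the paper's definitions): a refresh transaction arriving at a node
# with secondary copies is held until its "deliver time" -- its commit
# timestamp plus an assumed upper bound on propagation delay -- so that
# refreshes are applied in commit-timestamp order.

DELAY_BOUND = 0.5  # assumed bound on propagation delay (seconds)

class RefreshScheduler:
    def __init__(self):
        self._queue = []  # min-heap ordered by deliver time

    def receive(self, commit_ts, txn):
        # Deliver time = commit timestamp + delay bound.
        heapq.heappush(self._queue, (commit_ts + DELAY_BOUND, commit_ts, txn))

    def runnable(self, now):
        # Release every refresh transaction whose deliver time has
        # passed, in commit-timestamp order.
        ready = []
        while self._queue and self._queue[0][0] <= now:
            _, commit_ts, txn = heapq.heappop(self._queue)
            ready.append((commit_ts, txn))
        return ready
```

Because every refresh waits the same bound, a transaction that committed earlier but arrived later is still applied first, which is the ordering property the paper's correctness argument relies on.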
Updates in Highly Unreliable, Replicated Peer-to-Peer Systems
In this paper we study the problem of updates in truly decentralised and self-organising systems such as pure P2P systems. We assume low online probabilities of the peers.
RAIDb: Redundant Array of Inexpensive Databases
Clusters of workstations are becoming increasingly popular for powering data server applications such as large-scale Web sites or e-Commerce applications. There has been much research on scaling the front tiers (web servers and application servers) using clusters, but databases usually remain on large dedicated SMP machines. In this paper, we address database performance scalability and high availability using clusters of commodity hardware. Our approach consists of studying different replication and partitioning strategies to achieve various degrees of performance and fault tolerance. We propose the concept of Redundant Array of Inexpensive Databases (RAIDb). RAIDb is to databases what RAID is to disks. RAIDb aims at providing better performance and fault tolerance than a single database, at low cost, by combining multiple database instances into an array of databases. Like RAID, we define different RAIDb levels that provide various cost/performance/fault tolerance tradeoffs. RAIDb-0 features full partitioning, RAIDb-1 offers full replication, and RAIDb-2 introduces an intermediate solution called partial replication, in which the user can define the degree of replication of each database table. We present a Java implementation of RAIDb called Clustered JDBC or C-JDBC. C-JDBC achieves both database performance scalability and high availability at the middleware level without changing existing applications. We show, using the TPC-W benchmark, that RAIDb-2 can offer better performance scalability (up to 25%) than traditional approaches by allowing fine-grained control over replication. Distributing and restricting the replication of frequently written tables to a small set of backends reduces I/O usage and improves CPU utilization of each cluster node.
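The partial-replication idea behind RAIDb-2 can be illustrated with a small routing sketch. This is not C-JDBC's actual API; the class, placement map, and backend names are invented for illustration.

```python
import random

# Illustrative sketch of RAIDb-2 partial replication (names and the
# placement map are assumptions, not C-JDBC's interfaces): each table
# is replicated on a configurable subset of backends; a read can be
# served by any one replica, while a write must reach every replica.

class Raidb2Router:
    def __init__(self, placement):
        # placement: table name -> list of backend ids holding a copy
        self.placement = placement

    def route_read(self, table):
        # Any single backend holding the table can serve a read.
        return random.choice(self.placement[table])

    def route_write(self, table):
        # A write fans out to all backends holding a copy.
        return list(self.placement[table])

# Frequently written tables get a small replication degree to limit
# write fan-out; read-mostly tables are replicated widely for read
# scalability -- the tradeoff the abstract's TPC-W result exploits.
placement = {
    "orders":  ["b1", "b2"],             # write-heavy: 2 copies
    "catalog": ["b1", "b2", "b3", "b4"]  # read-mostly: 4 copies
}
router = Raidb2Router(placement)
```

Restricting `orders` to two backends means each write costs two backend operations instead of four, which is the I/O and CPU saving the abstract attributes to fine-grained replication control.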
Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems
Providing the ability to increase or decrease allocated resources on demand as the transactional load varies is essential for database management systems (DBMS) deployed on today's computing platforms, such as the cloud. The need to maintain consistency of the database, at very large scales, while providing high performance and reliability makes elasticity particularly challenging. In this thesis, we exploit data partitioning as a way to provide elastic DBMS scalability. We assert that the flexibility provided by a partitioned, shared-nothing parallel DBMS can be used to implement elasticity. Our idea is to start with a small number of servers that manage all the partitions, and to elastically scale out by dynamically adding new servers and redistributing database partitions among these servers as the load varies. Implementing this approach requires (a) efficient mechanisms for addition/removal of servers and migration of partitions, and (b) policies to efficiently determine the optimal placement of partitions on the given servers as well as plans for partition migration.
This thesis presents Elasca, a system that implements both these features in an existing shared-nothing DBMS (namely VoltDB) to provide automatic elastic scalability. Elasca consists of a mechanism for enabling elastic scalability, and a workload-aware optimizer for determining optimal partition placement and migration plans. Our optimizer minimizes computing resources required and balances load effectively without compromising system performance, even in the presence of variations in intensity and skew of the load. The results of our experiments show that Elasca is able to achieve performance close to a fully provisioned system while saving 35% of resources on average. Furthermore, Elasca's workload-aware optimizer performs up to 79% less data movement than a greedy approach to resource minimization, and also balances load much more effectively.
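The placement problem the thesis describes can be made concrete with a deliberately simplified stand-in. The greedy first-fit-decreasing rule below is exactly the kind of baseline the abstract says Elasca's workload-aware optimizer outperforms; it is not Elasca's algorithm, and the function name, load estimates, and capacity bound are assumptions for the sketch.

```python
# Simplified stand-in for the partition-placement problem (NOT
# Elasca's workload-aware optimizer -- a greedy baseline of the kind
# the abstract compares against): assign partitions to the fewest
# servers such that no server exceeds a capacity bound, placing the
# heaviest partitions first.

def place_partitions(loads, capacity):
    """First-fit-decreasing: returns a list of {partition: load} maps,
    one per server; a new server is added only when no existing one fits."""
    servers = []
    for pid, load in sorted(loads.items(), key=lambda kv: (-kv[1], kv[0])):
        for s in servers:
            if sum(s.values()) + load <= capacity:
                s[pid] = load  # fits on an existing server
                break
        else:
            servers.append({pid: load})  # scale out: add a server
    return servers
```

A workload-aware optimizer additionally accounts for migration cost when the load changes, which is why, per the abstract, it can move up to 79% less data than re-running a greedy rule like this one from scratch.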
EdgeX: Edge Replication for Web Applications
Global web applications face the problem of high network latency due to their need to communicate with distant data centers. Many applications use edge networks for caching images, CSS, JavaScript, and other static content in order to avoid some of this network latency. However, for updates and for anything other than static content, communication with the data center is still required, and can dominate application request latencies. One way to address this problem is to push more of the web application, as well as the database on which it depends, from the remote data center towards the edge of the network. This thesis presents preliminary work in this direction. Specifically, it presents an edge-aware dynamic data replication architecture for relational database systems supporting web applications. The objective is to allow dynamic content to be served from the edge of the network, with low latency.
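The read/write split implied by the abstract can be sketched as a routing rule. The class and its interface are illustrative assumptions, not EdgeX's actual design: reads of edge-replicated tables are served locally, while updates still travel to the origin data center.

```python
# Illustrative sketch (names are assumptions, not EdgeX's interfaces):
# reads against tables replicated at this edge site are answered by
# the local replica for low latency; writes, and reads of tables not
# replicated here, still go to the origin data center.

class EdgeReplicaRouter:
    def __init__(self, edge_tables):
        self.edge_tables = set(edge_tables)  # tables replicated at this edge

    def route(self, statement, table):
        is_read = statement.strip().upper().startswith("SELECT")
        if is_read and table in self.edge_tables:
            return "edge"    # served by the nearby replica
        return "origin"      # updates and non-replicated tables

router = EdgeReplicaRouter(["products"])
```

This captures why the approach helps: the common read path avoids the wide-area round trip, while the harder problem, which the thesis calls preliminary work, is keeping the edge replicas consistent under updates.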
Fast algorithms for maintaining replica consistency in lazy master replicated databases
Themes 3 and 1 - Human-machine interaction, images, data, knowledge - Networks and systems. Projets RODIN and REFLECS. SIGLE record. Available from INIST (FR), Document Supply Service, under shelf-number 14802 E, issue a.1999 n.3654 / INIST-CNRS - Institut de l'Information Scientifique et Technique, France
Performance metrics for distributed database administration in LAN and WAN networks
This work presents a study of the main characteristics that a data replication scheme must satisfy. Based on each of the characteristics involved, a simulation model was defined that makes it possible to evaluate the likely behaviour of the data model and its replication scheme. With the results obtained, it is possible to evaluate different candidate solutions and thereby converge on the scheme best suited to the problem under study. Facultad de Informática