A Comparative Study of Divergence Control Algorithms
This paper evaluates and compares the performance of two-phase locking divergence control (2PLDC) and optimistic divergence control (ODC) algorithms using a comprehensive centralized database simulation model. We examine a system with multiclass workloads in which on-line update transactions and long-duration queries progress based on epsilon serializability (ESR). Our results demonstrate that significant performance enhancements can be achieved with a non-zero tolerable inconsistency (ϵ-spec). With sufficient ϵ-spec and limited system resources, both algorithms achieve comparable performance. However, with low resource contention, ODC performs significantly better than 2PLDC. Moreover, given a small ϵ-spec, ODC returns more accurate results on the committed queries than 2PLDC.
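To make the role of the ϵ-spec concrete, here is a minimal Python sketch of a query that may read data touched by uncommitted updates as long as the inconsistency it accumulates stays within its bound. It is not the 2PLDC or ODC algorithm evaluated in the paper; the names `QueryET`, `import_limit`, `try_read` and `run_query`, and the use of the absolute size of a pending change as the inconsistency measure, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueryET:
    """A query epsilon-transaction with a bound on imported inconsistency."""
    import_limit: float      # the epsilon-spec for this query (hypothetical name)
    imported: float = 0.0    # inconsistency accumulated so far

    def try_read(self, pending_delta: float) -> bool:
        """Return True if a read that conflicts with an uncommitted update may proceed.

        pending_delta is the size of the uncommitted change to the item
        (an assumed inconsistency measure).
        """
        cost = abs(pending_delta)
        if self.imported + cost > self.import_limit:
            return False     # bound would be exceeded: the query must wait or restart
        self.imported += cost
        return True


def run_query(pending_deltas, import_limit):
    """Count how many conflicting reads are admitted under a given bound."""
    query = QueryET(import_limit)
    admitted = sum(1 for delta in pending_deltas if query.try_read(delta))
    return admitted, query.imported
```

With `import_limit` set to zero every conflicting read of a changed item is refused, which corresponds to ordinary serializable behaviour; a larger limit lets more reads through at the price of a bounded error in the query result.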
Execution Autonomy in Distributed Transaction Processing
We study the feasibility of execution autonomy in systems with asynchronous transaction processing based on epsilon-serializability (ESR). The abstract correctness criteria defined by ESR are implemented by techniques such as asynchronous divergence control and asynchronous consistency restoration. Concrete application examples in a distributed environment, such as banking, are described in order to illustrate the advantages of using ESR to support execution autonomy
A Formal Characterization of Epsilon Serializability
Epsilon Serializability (ESR) is a generalization of classic serializability (SR). ESR allows some limited amount of inconsistency in transaction processing (TP), through an interface called epsilon-transactions (ETs). For example, some query ETs may view inconsistent data due to non-SR interleaving with concurrent updates. In this paper, we restrict our attention to the situation where query-only ETs run concurrently with consistent update transactions that are SR without the ETs. This paper presents a formal characterization of ESR and ETs. Using the ACTA framework, the first part of this characterization formally expresses the inter-transaction conflicts that are recognized by ESR and, through that, defines ESR, analogous to the manner in which conflict-based serializability is defined. The second part of the paper is devoted to deriving expressions for: (1) the inconsistency in the values of data, arising from ongoing updates; (2) the inconsistency of the results of a query, arising from the inconsistency of the data read in order to process the query; and (3) the inconsistency exported by an update ET, arising from ongoing queries reading uncommitted data produced by the update ET. These expressions are used to determine the preconditions that ET operations have to satisfy in order to maintain the limits on the inconsistency in the data read by query ETs, the inconsistency exported by update ETs, and the inconsistency in the results of queries. This determination suggests possible mechanisms that can be used to realize ESR.
The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis
Datastores today rely on distribution and replication to achieve improved
performance and fault-tolerance. But correctness of many applications depends
on strong consistency properties - something that can impose substantial
overheads, since it requires coordinating the behavior of multiple nodes. This
paper describes a new approach to achieving strong consistency in distributed
systems while minimizing communication between nodes. The key insight is to
allow the state of the system to be inconsistent during execution, as long as
this inconsistency is bounded and does not affect transaction correctness. In
contrast to previous work, our approach uses program analysis to extract
semantic information about permissible levels of inconsistency and is fully
automated. We then employ a novel homeostasis protocol to allow sites to
operate independently, without communicating, as long as any inconsistency is
governed by appropriate treaties between the nodes. We discuss mechanisms for
optimizing treaties based on workload characteristics to minimize
communication, as well as a prototype implementation and experiments that
demonstrate the benefits of our approach on common transactional benchmarks
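The idea of sites working independently while a treaty holds can be illustrated with a small Python sketch. It assumes a single global invariant (stock never goes negative) and a hand-written treaty that gives each site a local sales allowance; the paper derives treaties automatically through program analysis, which this sketch does not attempt. The names `Site`, `Coordinator`, `allowance` and `sync_rounds` are hypothetical, and quantities are assumed to be positive integers.

```python
class Site:
    """One replica that may sell stock locally while its treaty holds."""

    def __init__(self, name, allowance):
        self.name = name
        self.allowance = allowance   # units this site may sell without coordination
        self.sold = 0                # units sold since the last synchronization

    def can_sell(self, qty):
        return self.sold + qty <= self.allowance


class Coordinator:
    """Holds the global stock and renegotiates treaties when one would be violated."""

    def __init__(self, stock, site_names):
        self.stock = stock
        self.sync_rounds = 0         # number of times sites had to communicate
        self.sites = {name: Site(name, 0) for name in site_names}
        self._renegotiate()

    def _renegotiate(self, favour=None, need=0):
        # Fold all local sales into the global stock.
        for site in self.sites.values():
            self.stock -= site.sold
            site.sold = 0
        # Reserve what the requesting site needs, if the stock allows it.
        reserve = need if favour is not None and need <= self.stock else 0
        share, extra = divmod(self.stock - reserve, len(self.sites))
        for i, site in enumerate(self.sites.values()):
            site.allowance = share + (1 if i < extra else 0)
        if reserve:
            self.sites[favour].allowance += reserve

    def sell(self, site_name, qty):
        """Sell locally if the treaty allows it; otherwise synchronize first."""
        site = self.sites[site_name]
        if not site.can_sell(qty):
            self.sync_rounds += 1
            self._renegotiate(favour=site_name, need=qty)
            if not site.can_sell(qty):
                return False         # not enough global stock left
        site.sold += qty
        return True
```

Because the allowances always sum to the remaining stock, the sites can sell without talking to each other and the global invariant still holds; communication happens only when a local allowance runs out.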
Performance assessment of real-time data management on wireless sensor networks
Technological advances in recent years have allowed the maturity of Wireless Sensor Networks
(WSNs), which aim at performing environmental monitoring and data collection. This sort of
network is composed of hundreds, thousands or probably even millions of tiny smart computers
known as wireless sensor nodes, which may be battery powered, equipped with sensors, a radio
transceiver, a Central Processing Unit (CPU) and some memory. However, due to the small size and
low-cost requirements of the nodes, sensor node resources such as processing power, storage
and especially energy are very limited.
Once the sensors have taken their measurements from the environment, the problem of
storing and querying the data arises. The sensors have restricted storage capacity, and the ongoing
interaction between sensors and environment results in huge amounts of data. Techniques for data
storage and query in WSNs can be based on either external storage or local storage. External
storage, called the warehousing approach, is a centralized system in which the data gathered by the
sensors are periodically sent to a central database server where user queries are processed.
Local storage, called the distributed approach, on the other hand exploits the computational
capabilities of the sensors, which act as local databases. The data is stored in a central database server
and in the devices themselves, enabling one to query both.
WSNs are used in a wide variety of applications, which may perform certain operations on
collected sensor data. For certain applications, such as real-time applications, the sensor
data must closely reflect the current state of the targeted environment. However, the environment
changes constantly and the data is collected at discrete moments in time. As such, the collected
data has a temporal validity: as time advances, it becomes less accurate, until it no longer
reflects the state of the environment. Thus, applications in areas such as industrial automation,
aviation, and sensor networks must query and analyze the data within a bounded time in order to
make decisions and react efficiently. In this context, the design of efficient real-time
data management solutions is necessary to deal with both time constraints and energy consumption.
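The notion of temporal validity can be sketched in a few lines of Python. This is only an illustration under assumed names (`Reading`, `validity`, `answer_query`) and an assumed policy (average the readings that are still fresh, and give no answer once the query deadline has passed); it is not the mechanism proposed in the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    value: float
    sampled_at: float   # time of the measurement, in seconds
    validity: float     # how long the sample is considered fresh, in seconds (assumed)

    def is_fresh(self, now: float) -> bool:
        """A reading is usable only within its validity interval."""
        return now - self.sampled_at <= self.validity


def answer_query(readings, now, deadline):
    """Average the still-valid readings, or return None if no timely answer exists."""
    if now > deadline:
        return None                      # the time constraint is already missed
    fresh = [r.value for r in readings if r.is_fresh(now)]
    return sum(fresh) / len(fresh) if fresh else None
```

A query evaluated late therefore sees fewer usable readings, and eventually none, which is why both the data and the query carry a time bound.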
This thesis studies real-time data management techniques for WSNs. In particular, it focuses
on the challenges in handling real-time data storage and query for WSNs and on
efficient real-time data management solutions for WSNs.
First, the main specifications of real-time data management are identified and the available
real-time data management solutions for WSNs in the literature are presented. Secondly, in order to
provide an energy-efficient real-time data management solution, the techniques used to manage
data and queries in WSNs based on the distributed paradigm are studied in depth. In fact, many
research works argue that the distributed approach is the most energy-efficient way of managing
data and queries in WSNs, compared with warehousing. In addition, this approach can
provide quasi real-time query processing because the most current data is retrieved from the
network.
Thirdly, based on these two studies and considering the complexity of developing, testing, and
debugging this kind of system, a model for a simulation framework for real-time
database management on WSNs that uses a distributed approach is proposed, together with its
implementation. This helps to explore various real-time database techniques on WSNs
before deployment, saving money and time. Moreover, one may improve the proposed
model by adding the simulation of protocols or by placing part of this simulator on another available
simulator. To validate the model, a case study considering real-time constraints as well as energy
constraints is discussed.
Fourth, a new architecture that combines statistical modeling techniques with the distributed
approach, and a query processing algorithm to optimize real-time user query processing, are
proposed. This combination enables a query processing algorithm based on admission
control that uses the error tolerance and the probabilistic confidence interval as admission
parameters. Experiments based on real-world data sets as well as synthetic data sets
demonstrate that the proposed solution optimizes real-time query processing to save more
energy while keeping latency low.
Fundação para a Ciência e Tecnologi
Optimistic replication
Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques are different from traditional “pessimistic” ones. Instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies key challenges facing optimistic replication systems (ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence) and provides a comprehensive survey of techniques developed for addressing these challenges.
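Discovering conflicts after they happen is commonly done with version vectors. The abstract does not name a specific technique, so the following Python sketch is offered only as one widely used example; the function names `compare` and `merge` and the dictionary representation (replica id mapped to an update counter) are assumptions.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors mapping replica id -> update counter."""
    keys = a.keys() | b.keys()
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "concurrent"   # neither copy has seen all of the other's updates
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(a: dict, b: dict) -> dict:
    """Version vector that covers both histories (element-wise maximum)."""
    keys = a.keys() | b.keys()
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in keys}
```

A result of `"concurrent"` signals a conflict that the application or a resolution policy has to settle; after resolution, the merged vector records that the new copy reflects both histories.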
Estimating data divergence in cloud computing storage systems
Dissertation for the degree of Master in Computer Engineering (Engenharia Informática).
Many internet services are provided through cloud computing infrastructures that
are composed of multiple data centers. To provide high availability and low latency, data is replicated in machines in different data centers, which introduces the complexity of guaranteeing that clients view data consistently. Data stores often opt for a relaxed approach to replication, guaranteeing only eventual consistency, since it improves latency of operations. However, this may lead to replicas having different values for the same data.
One solution to control the divergence of data in eventually consistent systems is
the usage of metrics that measure how stale data is for a replica. In the past, several
algorithms have been proposed to estimate the value of these metrics in a deterministic
way. An alternative solution is to rely on probabilistic metrics that estimate divergence with a certain degree of certainty. This relaxes the need to contact all replicas while still providing a relatively accurate measurement.
In this work we designed and implemented a solution to estimate the divergence of
data in eventually consistent data stores that scales to many replicas by allowing client-side caching. Measuring the divergence when there is a large number of clients calls for the development of new algorithms that provide probabilistic guarantees. Additionally, unlike previous works, we intend to focus on measuring the divergence relative to a state that can lead to the violation of application invariants.
Partially funded by project PTDC/EIA EIA/108963/2008 and by an ERC Starting Grant, Agreement Number 30773
Performance characteristics of semantics-based concurrency control protocols.
by Keith, Hang-kwong Mak.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 122-127).
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 2 --- Background
  2.1 --- Read/Write Model
  2.2 --- Abstract Data Type Model
  2.3 --- Overview of Semantics-Based Concurrency Control Protocols
  2.4 --- Concurrency Hierarchy
  2.5 --- Control Flow of the Strict Two Phase Locking Protocol
    2.5.1 --- Flow of an Operation
    2.5.2 --- Response Time of a Transaction
    2.5.3 --- Factors Affecting the Response Time of a Transaction
Chapter 3 --- Semantics-Based Concurrency Control Protocols
  3.1 --- Strict Two Phase Locking
  3.2 --- Conflict Relations
    3.2.1 --- Commutativity (COMM)
    3.2.2 --- Forward and Right Backward Commutativity
    3.2.3 --- Exploiting Context-Specific Information
    3.2.4 --- Relaxing Correctness Criterion by Allowing Bounded Inconsistency
Chapter 4 --- Related Work
  4.1 --- Exploiting Transaction Semantics
  4.2 --- Exploiting Object Semantics
  4.3 --- Sacrificing Consistency
  4.4 --- Other Approaches
Chapter 5 --- Performance Study (Testbed Approach)
  5.1 --- System Model
    5.1.1 --- Main Memory Database
    5.1.2 --- System Configuration
    5.1.3 --- Execution of Operations
    5.1.4 --- Recovery
  5.2 --- Parameter Settings and Performance Metrics
Chapter 6 --- Performance Results and Analysis (Testbed Approach)
  6.1 --- Read/Write Model vs. Abstract Data Type Model
  6.2 --- Using Context-Specific Information
  6.3 --- Role of Conflict Ratio
  6.4 --- Relaxing the Correctness Criterion
    6.4.1 --- Overhead and Performance Gain
    6.4.2 --- Range Queries using Bounded Inconsistency
Chapter 7 --- Performance Study (Simulation Approach)
  7.1 --- Simulation Model
    7.1.1 --- Logical Queueing Model
    7.1.2 --- Physical Queueing Model
  7.2 --- Experiment Information
    7.2.1 --- Parameter Settings
    7.2.2 --- Performance Metrics
Chapter 8 --- Performance Results and Analysis (Simulation Approach)
  8.1 --- Relaxing Correctness Criterion of Serial Executions
    8.1.1 --- Impact of Resource Contention
    8.1.2 --- Impact of Infinite Resources
    8.1.3 --- Impact of Limited Resources
    8.1.4 --- Impact of Multiple Resources
    8.1.5 --- Impact of Transaction Type
    8.1.6 --- Impact of Concurrency Control Overhead
  8.2 --- Exploiting Context-Specific Information
    8.2.1 --- Impact of Limited Resource
    8.2.2 --- Impact of Infinite and Multiple Resources
    8.2.3 --- Impact of Transaction Length
    8.2.4 --- Impact of Buffer Size
    8.2.5 --- Impact of Concurrency Control Overhead
  8.3 --- Summary and Discussion
    8.3.1 --- Summary of Results
    8.3.2 --- Relaxing Correctness Criterion vs. Exploiting Context-Specific Information
Chapter 9 --- Conclusions
Bibliography
Chapter A --- Commutativity Tables for Queue Objects
Chapter B --- Specification of a Queue Object
Chapter C --- Commutativity Tables with Bounded Inconsistency for Queue Objects
Chapter D --- Some Implementation Issues
  D.1 --- Important Data Structures
  D.2 --- Conflict Checking
  D.3 --- Deadlock Detection
Chapter E --- Simulation Results
  E.1 --- Impact of Infinite Resources (Bounded Inconsistency)
  E.2 --- Impact of Multiple Resource (Bounded Inconsistency)
  E.3 --- Impact of Transaction Type (Bounded Inconsistency)
  E.4 --- Impact of Concurrency Control Overhead (Bounded Inconsistency)
    E.4.1 --- Infinite Resources
    E.4.2 --- Limited Resource
  E.5 --- Impact of Resource Levels (Exploiting Context-Specific Information)
  E.6 --- Impact of Buffer Size (Exploiting Context-Specific Information)
  E.7 --- Impact of Concurrency Control Overhead (Exploiting Context-Specific Information)
    E.7.1 --- Impact of Infinite Resources
    E.7.2 --- Impact of Limited Resources
    E.7.3 --- Impact of Transaction Length
    E.7.4 --- Role of Conflict Ratio