20 research outputs found

    The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis

    Datastores today rely on distribution and replication to achieve improved performance and fault tolerance. But the correctness of many applications depends on strong consistency properties, which can impose substantial overheads because they require coordinating the behavior of multiple nodes. This paper describes a new approach to achieving strong consistency in distributed systems while minimizing communication between nodes. The key insight is to allow the state of the system to be inconsistent during execution, as long as this inconsistency is bounded and does not affect transaction correctness. In contrast to previous work, our approach uses program analysis to extract semantic information about permissible levels of inconsistency and is fully automated. We then employ a novel homeostasis protocol to allow sites to operate independently, without communicating, as long as any inconsistency is governed by appropriate treaties between the nodes. We discuss mechanisms for optimizing treaties based on workload characteristics to minimize communication, as well as a prototype implementation and experiments that demonstrate the benefits of our approach on common transactional benchmarks.
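
The idea of sites working independently while a treaty holds can be illustrated with a small sketch. This is not the paper's protocol or its program analysis: the `Site` class, the per-site `allowance`, and the even-split `renegotiate` step are all hypothetical, chosen only to show a global invariant (stock never goes negative) being enforced through local bounds.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One replica holding a share of a global stock under a local treaty (hypothetical)."""
    name: str
    allowance: int  # units this site may sell without contacting any other site

    def try_sell(self, qty: int) -> bool:
        """Sell locally if the sale keeps this site within its allowance."""
        if qty <= self.allowance:
            self.allowance -= qty
            return True
        return False  # the treaty would be violated, so coordination is needed


def renegotiate(sites: list[Site]) -> None:
    """Synchronize: pool the remaining allowances and split them evenly again."""
    total = sum(s.allowance for s in sites)
    share, extra = divmod(total, len(sites))
    for i, s in enumerate(sites):
        s.allowance = share + (1 if i < extra else 0)


sites = [Site("A", 5), Site("B", 5)]
assert sites[0].try_sell(4)      # handled locally, no communication
assert not sites[0].try_sell(3)  # exceeds A's remaining allowance
renegotiate(sites)               # A and B now hold 3 units each
assert sites[0].try_sell(3)
```

The sum of the allowances never exceeds the real stock, so no local sale can break the global invariant; communication is only needed when a site runs out of slack.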

    Performance assessment of real-time data management on wireless sensor networks

    Technological advances in recent years have brought Wireless Sensor Networks (WSNs), which aim at performing environmental monitoring and data collection, to maturity. This sort of network is composed of hundreds, thousands or possibly even millions of tiny smart computers known as wireless sensor nodes, which may be battery powered and equipped with sensors, a radio transceiver, a Central Processing Unit (CPU) and some memory. However, due to the small size and low-cost requirements of the nodes, sensor node resources such as processing power, storage and especially energy are very limited. Once the sensors perform their measurements from the environment, the problem of data storing and querying arises. In fact, the sensors have restricted storage capacity, and the ongoing interaction between sensors and environment results in huge amounts of data. Techniques for data storage and query in WSNs can be based on either external storage or local storage. External storage, called the warehousing approach, is a centralized system in which the data gathered by the sensors are periodically sent to a central database server where user queries are processed. Local storage, called the distributed approach, on the other hand exploits the computational capabilities of the sensors, which act as local databases. The data is stored in a central database server and in the devices themselves, enabling one to query both. WSNs are used in a wide variety of applications, which may perform certain operations on collected sensor data. For certain applications, such as real-time applications, the sensor data must closely reflect the current state of the targeted environment. However, the environment changes constantly and the data is collected at discrete moments in time. As such, the collected data has a temporal validity, and as time advances, it becomes less accurate, until it no longer reflects the state of the environment. 
Thus, such applications (industrial automation, aviation, sensor networks, and so on) must query and analyze the data within a bounded time in order to make decisions and react efficiently. In this context, the design of efficient real-time data management solutions is necessary to deal with both time constraints and energy consumption. This thesis studies real-time data management techniques for WSNs. In particular, it focuses on the challenges in handling real-time data storage and query for WSNs and on efficient real-time data management solutions for WSNs. First, the main specifications of real-time data management are identified and the real-time data management solutions for WSNs available in the literature are presented. Secondly, in order to provide an energy-efficient real-time data management solution, the techniques used to manage data and queries in WSNs based on the distributed paradigm are studied in depth. In fact, many research works argue that the distributed approach, rather than warehousing, is the most energy-efficient way of managing data and queries in WSNs. In addition, this approach can provide quasi real-time query processing because the most current data will be retrieved from the network. Thirdly, based on these two studies and considering the complexity of developing, testing, and debugging this kind of system, a model for a simulation framework of real-time database management on WSNs that uses a distributed approach, together with its implementation, is proposed. This will help to explore various real-time database techniques on WSNs before deployment, saving money and time. Moreover, one may improve the proposed model by adding the simulation of protocols or by integrating part of this simulator into another available simulator. To validate the model, a case study considering real-time constraints as well as energy constraints is discussed. 
Fourth, a new architecture that combines statistical modeling techniques with the distributed approach, and a query processing algorithm that optimizes real-time user query processing, are proposed. This combination allows a query processing algorithm based on admission control that uses the error tolerance and the probabilistic confidence interval as admission parameters. Experiments based on real-world data sets as well as synthetic data sets demonstrate that the proposed solution optimizes real-time query processing to save more energy while maintaining low latency. Fundação para a Ciência e Tecnologi
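
The abstract names error tolerance and a probabilistic confidence interval as admission parameters but does not say how they are combined. A minimal sketch under one plausible reading follows: a query is answered from a statistical model when the model's interval is tight and confident enough, and forwarded to the sensor otherwise. The `Prediction` type, the `admit` rule, and the `read_sensor` callback are assumptions for illustration, not the thesis's algorithm.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    value: float       # model's estimate of the sensor reading
    half_width: float  # half-width of the confidence interval around the estimate
    confidence: float  # probability that the interval contains the true value


def admit(pred: Prediction, error_tolerance: float, min_confidence: float) -> bool:
    """Accept the model's answer only if it is precise and confident enough."""
    return pred.half_width <= error_tolerance and pred.confidence >= min_confidence


def answer_query(pred: Prediction, error_tolerance: float, min_confidence: float,
                 read_sensor: Callable[[], float]) -> tuple[float, str]:
    """Return (value, source), falling back to the sensor when the model is rejected."""
    if admit(pred, error_tolerance, min_confidence):
        return pred.value, "model"   # no radio traffic, so energy is saved
    return read_sensor(), "sensor"   # pay the energy cost for a fresh reading
```

A looser tolerance lets more queries be served from the model, trading accuracy for energy and latency.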

    Optimistic replication

    Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques are different from traditional “pessimistic” ones. Instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies key challenges facing optimistic replication systems (ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence) and provides a comprehensive survey of techniques developed for addressing these challenges.
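
Discovering conflicts after they happen requires a way to tell whether two replica states are ordered or concurrent. A minimal sketch of one well-known technique, version vectors, is given below; the abstract does not say which mechanisms the survey emphasizes, and the function names here are illustrative.

```python
def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Compare two version vectors (replica id -> number of updates seen)."""
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a_before_b"   # b has seen everything a has: safe to overwrite a
    if b_le_a:
        return "b_before_a"
    return "concurrent"       # neither dominates: a conflict to resolve


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Version vector of the state obtained after reconciling a and b."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}
```

Two replicas that each accepted an update the other has not seen compare as `"concurrent"`, which is the signal to run conflict resolution before merging.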

    Estimating data divergence in cloud computing storage systems

    Dissertation submitted for the degree of Master in Computer Engineering (Engenharia Informática). Many internet services are provided through cloud computing infrastructures that are composed of multiple data centers. To provide high availability and low latency, data is replicated on machines in different data centers, which introduces the complexity of guaranteeing that clients view data consistently. Data stores often opt for a relaxed approach to replication, guaranteeing only eventual consistency, since it improves the latency of operations. However, this may lead to replicas having different values for the same data. One solution to control the divergence of data in eventually consistent systems is the use of metrics that measure how stale data is at a replica. In the past, several algorithms have been proposed to estimate the value of these metrics in a deterministic way. An alternative solution is to rely on probabilistic metrics that estimate divergence with a certain degree of certainty. This relaxes the need to contact all replicas while still providing a relatively accurate measurement. In this work we designed and implemented a solution to estimate the divergence of data in eventually consistent data stores that scales to many replicas by allowing client-side caching. Measuring the divergence when there is a large number of clients calls for the development of new algorithms that provide probabilistic guarantees. Additionally, unlike previous works, we intend to focus on measuring the divergence relative to a state that can lead to the violation of application invariants. Partially funded by project PTDC/EIA EIA/108963/2008 and by an ERC Starting Grant, Agreement Number 30773
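
To make the idea of a probabilistic divergence metric concrete, the sketch below estimates the fraction of stale replicas from a random sample instead of contacting all of them. It swaps in a standard Hoeffding bound to choose the sample size; the dissertation's own algorithms are not described in the abstract, and the version-number representation and function names are assumptions.

```python
import math
import random


def sample_size(epsilon: float, delta: float) -> int:
    """Samples needed so the estimate is within epsilon of the true fraction
    with probability at least 1 - delta (Hoeffding's inequality)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_stale_fraction(replica_versions: list[int], latest: int,
                            k: int, rng: random.Random) -> float:
    """Estimate the fraction of replicas behind `latest` from k random probes."""
    sample = [rng.choice(replica_versions) for _ in range(k)]  # with replacement
    return sum(v < latest for v in sample) / k


k = sample_size(epsilon=0.1, delta=0.05)  # 185 probes, however many replicas exist
versions = [7] * 900 + [6] * 100          # hypothetical store: 10% of replicas stale
estimate = estimate_stale_fraction(versions, latest=7, k=k, rng=random.Random(0))
```

The sample size depends only on the desired accuracy and confidence, not on the number of replicas, which is what makes this style of estimate attractive at scale.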

    Performance characteristics of semantics-based concurrency control protocols.

    by Keith, Hang-kwong Mak. Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. Includes bibliographical references (leaves 122-127).

    Contents:
    Abstract
    Acknowledgement
    Chapter 1: Introduction
    Chapter 2: Background
        2.1 Read/Write Model
        2.2 Abstract Data Type Model
        2.3 Overview of Semantics-Based Concurrency Control Protocols
        2.4 Concurrency Hierarchy
        2.5 Control Flow of the Strict Two Phase Locking Protocol
            2.5.1 Flow of an Operation
            2.5.2 Response Time of a Transaction
            2.5.3 Factors Affecting the Response Time of a Transaction
    Chapter 3: Semantics-Based Concurrency Control Protocols
        3.1 Strict Two Phase Locking
        3.2 Conflict Relations
            3.2.1 Commutativity (COMM)
            3.2.2 Forward and Right Backward Commutativity
            3.2.3 Exploiting Context-Specific Information
            3.2.4 Relaxing Correctness Criterion by Allowing Bounded Inconsistency
    Chapter 4: Related Work
        4.1 Exploiting Transaction Semantics
        4.2 Exploiting Object Semantics
        4.3 Sacrificing Consistency
        4.4 Other Approaches
    Chapter 5: Performance Study (Testbed Approach)
        5.1 System Model
            5.1.1 Main Memory Database
            5.1.2 System Configuration
            5.1.3 Execution of Operations
            5.1.4 Recovery
        5.2 Parameter Settings and Performance Metrics
    Chapter 6: Performance Results and Analysis (Testbed Approach)
        6.1 Read/Write Model vs. Abstract Data Type Model
        6.2 Using Context-Specific Information
        6.3 Role of Conflict Ratio
        6.4 Relaxing the Correctness Criterion
            6.4.1 Overhead and Performance Gain
            6.4.2 Range Queries using Bounded Inconsistency
    Chapter 7: Performance Study (Simulation Approach)
        7.1 Simulation Model
            7.1.1 Logical Queueing Model
            7.1.2 Physical Queueing Model
        7.2 Experiment Information
            7.2.1 Parameter Settings
            7.2.2 Performance Metrics
    Chapter 8: Performance Results and Analysis (Simulation Approach)
        8.1 Relaxing Correctness Criterion of Serial Executions
            8.1.1 Impact of Resource Contention
            8.1.2 Impact of Infinite Resources
            8.1.3 Impact of Limited Resources
            8.1.4 Impact of Multiple Resources
            8.1.5 Impact of Transaction Type
            8.1.6 Impact of Concurrency Control Overhead
        8.2 Exploiting Context-Specific Information
            8.2.1 Impact of Limited Resource
            8.2.2 Impact of Infinite and Multiple Resources
            8.2.3 Impact of Transaction Length
            8.2.4 Impact of Buffer Size
            8.2.5 Impact of Concurrency Control Overhead
        8.3 Summary and Discussion
            8.3.1 Summary of Results
            8.3.2 Relaxing Correctness Criterion vs. Exploiting Context-Specific Information
    Chapter 9: Conclusions
    Bibliography
    Chapter A: Commutativity Tables for Queue Objects
    Chapter B: Specification of a Queue Object
    Chapter C: Commutativity Tables with Bounded Inconsistency for Queue Objects
    Chapter D: Some Implementation Issues
        D.1 Important Data Structures
        D.2 Conflict Checking
        D.3 Deadlock Detection
    Chapter E: Simulation Results
        E.1 Impact of Infinite Resources (Bounded Inconsistency)
        E.2 Impact of Multiple Resource (Bounded Inconsistency)
        E.3 Impact of Transaction Type (Bounded Inconsistency)
        E.4 Impact of Concurrency Control Overhead (Bounded Inconsistency)
            E.4.1 Infinite Resources
            E.4.2 Limited Resource
        E.5 Impact of Resource Levels (Exploiting Context-Specific Information)
        E.6 Impact of Buffer Size (Exploiting Context-Specific Information)
        E.7 Impact of Concurrency Control Overhead (Exploiting Context-Specific Information)
            E.7.1 Impact of Infinite Resources
            E.7.2 Impact of Limited Resources
            E.7.3 Impact of Transaction Length
            E.7.4 Role of Conflict Ratio
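
The thesis contrasts the read/write model with the abstract data type model, where conflicts are derived from whether operations commute. A minimal sketch of that contrast follows, using a hypothetical unbounded counter rather than the queue objects the thesis studies; the operation names and tables are assumptions made for illustration.

```python
# Read/write model: every operation is classified as a read or a write,
# and any pair that involves a write is treated as conflicting.
RW_KIND = {"increment": "write", "decrement": "write", "read": "read"}


def conflicts_rw(op1: str, op2: str) -> bool:
    return "write" in (RW_KIND[op1], RW_KIND[op2])


# Abstract data type model: two operations conflict only if they do not commute.
# On an unbounded counter, increments and decrements commute with each other,
# while a read does not commute with either update.
COMMUTING_PAIRS = {
    frozenset(["increment"]),               # increment with increment
    frozenset(["decrement"]),               # decrement with decrement
    frozenset(["increment", "decrement"]),
    frozenset(["read"]),                    # read with read
}


def conflicts_adt(op1: str, op2: str) -> bool:
    return frozenset([op1, op2]) not in COMMUTING_PAIRS
```

Under the read/write table two concurrent increments must be serialized by locks, while the commutativity table lets them proceed together, which is the kind of extra concurrency such protocols aim to expose.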