5 research outputs found
Towards quality-of-service driven consistency for Big Data management
With the advent of Cloud Computing, Big Data management has become a fundamental challenge in the deployment and operation of distributed, highly available, and fault-tolerant storage systems such as the HBase extensible record store. These systems can provide geo-replication, which raises the issue of data consistency among distributed sites. To offer a best-in-class service to applications, one wants to maximise performance while minimising latency; in terms of data replication, that means incurring as little latency as possible when moving data between distant data centres. Traditional consistency models pose a significant problem for systems architects, which is especially important where large amounts of data must be replicated across wide-area networks. In such scenarios it may be suitable to use eventual consistency: although not always convenient, latency can be partly reduced and traded for consistency guarantees so that data transfers do not impact performance. In contrast, this work proposes a broader range of data semantics for consistency that prioritises critical data at the cost of a minimal latency overhead on the remaining non-critical updates. Finally, we show how these semantics can help in finding an optimal data replication strategy that achieves just the required level of data consistency with low latency and more efficient network bandwidth utilisation.
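The idea of prioritising critical data while deferring non-critical updates can be sketched minimally. This is an illustrative sketch only, assuming per-write consistency tags; the class and parameter names are hypothetical and not taken from HBase or the paper:

```python
import queue

# Hypothetical sketch: each update carries a consistency level, so critical
# data is replicated synchronously (paying the WAN round-trip on the write
# path) while non-critical updates are queued for asynchronous, eventual
# propagation.

CRITICAL = "critical"   # replicate to all sites before acknowledging
EVENTUAL = "eventual"   # acknowledge immediately, replicate in background

class PrioritisedReplicator:
    def __init__(self, remote_sites):
        self.remote_sites = remote_sites   # callables standing in for WAN links
        self.async_queue = queue.Queue()   # buffer of deferred updates

    def write(self, key, value, level=EVENTUAL):
        if level == CRITICAL:
            # latency cost is paid only for data that needs strong guarantees
            for site in self.remote_sites:
                site(key, value)
        else:
            # latency cost is shifted off the write path
            self.async_queue.put((key, value))
        return "ack"

    def flush(self):
        # background reconciliation: drain deferred updates to all sites
        while not self.async_queue.empty():
            key, value = self.async_queue.get()
            for site in self.remote_sites:
                site(key, value)
```

A critical write is visible at remote sites as soon as it is acknowledged, whereas an eventual write becomes visible only after `flush` runs.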
Combining Generality and Practicality in a Conit-Based Continuous Consistency Model for Wide-Area Replication
Replication is a key approach to scaling wide-area applications. However, the overhead associated with large-scale replication quickly becomes prohibitive across wide-area networks. One effective approach to addressing this limitation is to allow applications to dynamically trade reduced consistency for increased performance and availability. Although relaxed consistency models in traditional replicated databases have been studied extensively, none of those models can simultaneously satisfy the following two typically conflicting requirements imposed by wide-area applications: generality (capturing application-specific consistency semantics) and practicality (enabling efficient application-independent consistency protocols to be designed, and providing natural ways to express application semantics). In this paper, we propose a conit-based continuous consistency model designed to achieve both generality and practicality. Our conit theory provides generality, where applicati…
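In the continuous consistency literature, a conit (consistency unit) lets a replica bound how far it may diverge before it must synchronise. The following is a minimal sketch of one such bound, a numerical-error limit; it is an assumption-laden illustration, not the paper's implementation, and the names are hypothetical:

```python
# Illustrative sketch: a conit accumulates the weight of remote writes this
# replica has announced but not yet applied. Local reads are allowed only
# while that accumulated weight stays under the configured bound; otherwise
# the replica must synchronise first.

class Conit:
    def __init__(self, numerical_bound):
        self.numerical_bound = numerical_bound
        self.unseen_weight = 0.0   # weight of remote writes not yet applied

    def remote_write_announced(self, weight):
        self.unseen_weight += weight

    def read_allowed(self):
        # serve reads locally only while divergence is within the bound
        return self.unseen_weight < self.numerical_bound

    def synchronised(self):
        # after pulling remote writes, divergence resets to zero
        self.unseen_weight = 0.0
```

Tightening `numerical_bound` toward zero approaches strong consistency (more synchronisation, more latency); loosening it approaches eventual consistency.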
Contention management for distributed data replication
PhD Thesis: Optimistic replication schemes provide distributed applications with access to shared data at lower latencies and greater availability. This is achieved by allowing clients to replicate shared data and execute actions locally. A consequence of this scheme is that it raises issues regarding shared-data consistency. An action executed by a client may sometimes produce shared data that conflicts with data at other replicas and, as a consequence, with subsequent actions caused by the conflicting action. This requires a client to roll back to the action that produced the conflicting data and to execute some exception handling. This can be achieved by relying on the application layer either to ignore or to handle shared-data inconsistencies when they are discovered during the reconciliation phase of an optimistic protocol.
Inconsistency of shared data has an impact on the causality relationship across client actions. In protocol design, it is desirable to preserve causality between different actions occurring across a distributed application. Without application-level knowledge, we must assume that an action causes all subsequent actions at the same client. With application knowledge, we can significantly ease the protocol burden of providing causal ordering, because we can identify which actions do not cause other actions (even if they precede them). This, in turn, enables a client to roll back to past actions and change them without having to alter subsequent actions. Unfortunately, a larger number of application-level causal relations between actions leads to significant protocol overhead. Therefore, minimizing the rollback associated with conflicting actions, while preserving causality, is desirable for reducing exception handling in the application layer.
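The selective-rollback idea above can be illustrated with a small sketch, assuming the application supplies an explicit dependency set per action; the function and action names are hypothetical, not from the thesis:

```python
# Hypothetical sketch of application-supplied causality: each action names
# the actions it depends on, so rolling back a conflicting action only
# undoes its causal descendants, not every later action at the client.

def rollback_set(actions, bad):
    """actions: dict mapping action name -> set of dependency names.
    Returns every action that must be rolled back when `bad` is undone."""
    doomed = {bad}
    changed = True
    while changed:   # propagate until no new causal descendant is found
        changed = False
        for name, deps in actions.items():
            if name not in doomed and deps & doomed:
                doomed.add(name)
                changed = True
    return doomed
```

Without application knowledge, undoing an action would force rollback of every later action at that client; with the dependency sets, causally independent actions survive untouched.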
In this thesis, we present a framework that utilizes causality to create a scheduler that can inform a contention management scheme, reducing the rollback associated with conflicting access to shared data. Our framework uses a backoff contention management scheme to provide causality preservation for optimistic replication systems with high causality requirements, without the need for application-layer knowledge. We present experiments demonstrating that our framework reduces clients' rollback and, more importantly, that overall system throughput improves when contention management is used with applications that require causality to be preserved across all actions.
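The backoff scheme mentioned above can be sketched minimally. This is a generic randomised exponential backoff, shown only to illustrate the mechanism; the function name and parameters are illustrative and not taken from the thesis:

```python
import random

# Illustrative sketch of a backoff contention manager: when two clients'
# optimistic actions conflict on the same shared item, the retrying client
# waits an exponentially growing, randomised interval before trying again,
# which reduces repeated conflicts and hence rollback.

def backoff_delays(attempts, base=0.01, cap=1.0, rng=random.random):
    """Return the wait in seconds before each of `attempts` retries."""
    delays = []
    for attempt in range(attempts):
        window = min(cap, base * (2 ** attempt))   # exponential growth, capped
        delays.append(window * rng())              # jitter de-synchronises clients
    return delays
```

The randomised jitter matters: without it, clients that conflicted once retry in lockstep and conflict again.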