4 research outputs found

    Cost- and workload-driven data management in the cloud

    This thesis addresses the challenge of balancing consistency, availability, latency, and cost, as captured by the CAP/PACELC trade-offs, in distributed data management in the Cloud. At its core, it develops cost- and workload-driven data management protocols, called CCQ protocols. The first is C3, an adaptive consistency protocol that adjusts consistency at runtime by considering consistency and inconsistency costs. The second is Cumulus, an adaptive data partitioning protocol that adapts partitions to the application workload so that expensive distributed transactions are minimized or avoided. The third is QuAD, a quorum-based replication protocol that constructs quorums so that, given a set of constraints, the best possible performance is achieved. The behavior of each CCQ protocol is steered by a cost model that aims to reduce the costs and overhead of providing the desired data management guarantees. The CCQ protocols continuously assess their behavior and, if necessary, adapt it at runtime based on the application workload and the cost model. This property is crucial for applications deployed in the Cloud, which are characterized by highly dynamic workloads and high scalability and availability demands. Dynamic adaptation at runtime does not come for free: it may generate considerable overhead that outweighs the gain of adaptation. The CCQ cost models therefore incorporate a control mechanism that aims to avoid expensive and unnecessary adaptations that provide no benefit to applications. Adaptation is a distributed activity that requires coordination between the sites of a distributed database system. The CCQ protocols implement safe online adaptation approaches, which exploit the properties of two-phase commit (2PC) and two-phase locking (2PL) to ensure that all sites behave in accordance with the cost model, even in the presence of arbitrary failures. A globally consistent view of the behavior is crucial, as otherwise the effects of the cost models are nullified. The presented protocols are implemented as part of a prototypical database system. Their modular architecture allows for a seamless extension of the optimization capabilities at any level of their implementation. Finally, the protocols are quantitatively evaluated in a series of experiments executed in a real Cloud environment. The results show their feasibility and their ability to reduce application costs and to dynamically adjust their behavior at runtime without violating correctness.
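The idea of a control mechanism that suppresses unprofitable adaptations can be illustrated with a minimal sketch. The function name, its parameters, and the simple linear comparison are assumptions made for illustration only; the abstract does not specify the actual CCQ cost models.

```python
def should_adapt(current_cost: float, candidate_cost: float,
                 adaptation_cost: float, horizon: int) -> bool:
    """Hypothetical decision rule, not the thesis's cost model.

    Adapt only if the saving expected over `horizon` future periods
    exceeds the one-off cost of performing the adaptation.
    """
    expected_gain = (current_cost - candidate_cost) * horizon
    return expected_gain > adaptation_cost


# A saving of 2 per period over 3 periods (gain 6) justifies an
# adaptation that costs 5, but not one that costs 7.
cheap_switch = should_adapt(10.0, 8.0, adaptation_cost=5.0, horizon=3)
costly_switch = should_adapt(10.0, 8.0, adaptation_cost=7.0, horizon=3)
```

Under this rule, a configuration that is only marginally cheaper is never adopted when switching to it is expensive, which is the kind of behavior the abstract attributes to the control mechanism.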

    Analysis of Impact of Network Delay on Multiversion Conservative Timestamp Algorithms in DDBS

    In a distributed environment, users access databases at remote sites via a communication network. The randomness of the network delay may cause operations to arrive at the remote sites out of sequence. Multiversion conservative timestamp algorithms can be used to schedule the operations so that the consistency of the databases is maintained. We model the algorithms as queueing systems with partial-order resequencing constraints, where the resequencing constraints vary with the number of versions in the algorithms. Under the assumption of i.i.d. network delays, the distributions of the response time of operations and of the buffer occupancy in the resequencing buffer are derived analytically for algorithms with the number of versions ranging from one to infinity. The impact of the network delay on the performance of the Distributed DataBase System (DDBS) is investigated. It is found that the variance of the network delay has a significant effect on the system performance.
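The effect of delay variance on resequencing can be sketched with a small simulation. This is a simplification under stated assumptions, not the paper's analytical model: it enforces a total order (roughly the single-version case) instead of the partial-order constraints described above, and the function name, send interval, and delay distributions are chosen purely for illustration.

```python
import random
from itertools import accumulate


def mean_resequencing_wait(delays, send_interval=1.0):
    """Mean time operations wait in the resequencing buffer when they
    must be released in send (timestamp) order.

    Assumption: operation i is sent at time i * send_interval and
    arrives after delays[i].
    """
    arrivals = [i * send_interval + d for i, d in enumerate(delays)]
    # An operation is released once it and all earlier ones have arrived,
    # i.e. at the running maximum of the arrival times.
    releases = list(accumulate(arrivals, max))
    waits = [r - a for r, a in zip(releases, arrivals)]
    return sum(waits) / len(waits)


rng = random.Random(0)
n = 10_000
constant = [2.0] * n                                     # mean 2, variance 0
exponential = [rng.expovariate(0.5) for _ in range(n)]   # mean 2, variance 4

low_variance_wait = mean_resequencing_wait(constant)
high_variance_wait = mean_resequencing_wait(exponential)
```

Both delay sequences have the same mean, but with constant delays operations never overtake one another, so the resequencing wait is zero, whereas the exponentially distributed delays produce out-of-order arrivals and a positive wait.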
