116 research outputs found

    Eventual Consistent Databases: State of the Art

    One of the challenges of cloud programming is to achieve the right balance between availability and consistency in a distributed database. Cloud computing environments, particularly cloud databases, are rapidly increasing in importance, acceptance and usage in major applications, which need partition tolerance and availability for scalability but sacrifice consistency (CAP theorem). In these environments, the data accessed by users is stored in a highly available storage system, so paradigms such as eventual consistency have become more widespread. In this paper, we review the state-of-the-art database systems using eventual consistency from both industry and research. Based on this review, we discuss the advantages and disadvantages of eventual consistency and identify future research challenges for databases using eventual consistency.

    Megastore: structured storage for Big Data

    Megastore is one of the building blocks of Google's data infrastructure. It has enabled the storage and processing of huge volumes of data (Big Data) with high scalability, reliability and security. Companies and individuals using this technology benefit from a highly available and stable service. This paper analyses Google's data infrastructure, starting with a review of the core components developed in recent years up to the implementation of Megastore. It also analyses the most important technical aspects that have been implemented in this storage system and that have allowed it to meet the objectives for which it was created.

    Transactions and data management in NoSQL cloud databases

    NoSQL databases have become the preferred option for storing and processing data in cloud computing because they can provide high data availability, scalability and efficiency. To achieve these attributes, however, NoSQL databases make certain trade-offs. First, they cannot guarantee strong consistency of data; they guarantee only a weaker consistency based on the eventual consistency model. Second, they adopt a simple data model that makes it easy to scale data across multiple nodes. Third, they do not support table joins or referential integrity, which implies that they cannot implement complex queries. The combination of these factors implies that NoSQL databases cannot support transactions. Motivated by these issues, this thesis investigates transactions and data management in NoSQL databases. It presents a novel approach that implements transactional support for NoSQL databases in order to ensure stronger data consistency and provide an appropriate level of performance. The novelty lies in the design of a Multi-Key transaction model that guarantees the standard properties of transactions in order to ensure stronger consistency and integrity of data. The model is implemented in a novel loosely-coupled architecture that separates the implementation of transactional logic from the underlying data, thus ensuring transparency and abstraction in cloud and NoSQL databases. The proposed approach is validated through the development of a prototype system using a real MongoDB system. An extended version of the standard Yahoo! Cloud Serving Benchmark (YCSB) has been used to test and evaluate the proposed approach. Various experiments have been conducted and sets of results have been generated. The results show that the proposed approach meets the research objectives: it maintains stronger consistency of cloud data as well as an appropriate level of reliability and performance.

    Consistency issue and related trade-offs in distributed replicated systems and databases: a review

    However, achieving these qualities requires resolving a number of trade-offs between various properties during system design and operation. This paper reviews trade-offs in distributed replicated databases and provides a survey of recent research papers studying distributed data storage. The paper first discusses a compromise between consistency and latency that appears in distributed replicated data stores and follows directly from the CAP and PACELC theorems. Consistency refers to the guarantee that all clients in a distributed system observe the same data at the same time. To ensure strong consistency, distributed systems typically employ coordination mechanisms and synchronization protocols that involve communication and agreement among distributed replicas. These mechanisms introduce additional overhead and latency and can dramatically increase the time taken to complete operations when replicas are globally distributed across the Internet. In addition, we study trade-offs between other system properties, including availability, durability, cost, energy consumption, and read and write latency. In this paper we also provide a comprehensive review and classification of recent research works in distributed replicated databases. The reviewed papers showcase several major areas of research, ranging from performance evaluation and comparison of various NoSQL databases to new strategies for data replication and new consistency models. In particular, we observed a shift towards exploring hybrid consistency models of causal consistency and eventual consistency with causal ordering, due to their ability to strike a balance between operation ordering guarantees and high performance. Researchers have also proposed various consistency control algorithms and consensus quorum protocols to coordinate distributed replicas. Insights from this review can empower practitioners to make informed decisions in designing and managing distributed data storage systems, help identify existing gaps in the body of knowledge, and suggest further research directions.
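
The consistency/latency compromise and the quorum protocols mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the reviewed paper: it assumes the common N/R/W quorum scheme, in which a read waits for R replicas and a write for W replicas out of N. The function names and the latency figures are hypothetical.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Brute-force check: does every read quorum share a replica with every write quorum?"""
    replicas = range(n)
    return all(
        set(read_q) & set(write_q)
        for read_q in combinations(replicas, r)
        for write_q in combinations(replicas, w)
    )


def quorum_latency(replica_latencies_ms: list[float], k: int) -> float:
    """An operation that waits for k acknowledgements finishes with the k-th fastest replica."""
    return sorted(replica_latencies_ms)[k - 1]


# Hypothetical round-trip times (ms) to three geo-distributed replicas.
latencies = [5.0, 80.0, 150.0]

for r, w in [(1, 1), (2, 2), (1, 3)]:
    print(
        f"N=3 R={r} W={w}: overlap={quorums_always_overlap(3, r, w)}, "
        f"read waits {quorum_latency(latencies, r)} ms, "
        f"write waits {quorum_latency(latencies, w)} ms"
    )
```

Under these assumptions, read and write quorums are guaranteed to intersect exactly when R + W > N, so a read always contacts at least one replica holding the latest acknowledged write. The price is that an operation must wait for slower, more distant replicas, which is the latency cost the abstract describes.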

    Eventual Consistency: Origin and Support

    Eventual consistency is demanded nowadays in geo-replicated services that need to be highly scalable and available. According to the CAP constraints, when network partitions may arise, a distributed service should choose between being strongly consistent or being highly available. Since scalable services should be available, a relaxed consistency (while the network is partitioned) is the preferred choice. Eventual consistency is not a common data-centric consistency model, but only a state convergence condition to be added to a relaxed consistency model. There are still several aspects of eventual consistency that have not been analysed in depth in previous works: (1) which are the oldest replication proposals providing eventual consistency, (2) which replica consistency models provide the best basis for building eventually consistent services, (3) which mechanisms should be considered for implementing an eventually consistent service, and (4) which are the best combinations of those mechanisms for achieving different concrete goals. This paper provides some notes on these important topics.
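
The idea of eventual consistency as a state convergence condition can be illustrated with a small sketch. The example below is not drawn from the paper: it uses a state-based grow-only counter (a G-Counter CRDT) as one well-known mechanism whose replicas converge once they have exchanged state. The class and method names are hypothetical.

```python
class GCounter:
    """Grow-only counter: each replica increments its own slot; merge takes per-slot maxima."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of message order or duplication.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


# Two replicas accept updates independently, e.g. during a network partition.
a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)

# Once the partition heals, the replicas exchange state in any order.
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

While the replicas are disconnected they may report different values. The convergence condition only requires that, once updates stop and states have been exchanged, all replicas report the same value.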