
    Ginger: A Transactional Middleware with Data and Operation Centric Mixed Consistency

    Many modern digital services must offer high availability and low response times to meet user demand. To that end, many of them resort to geo-replicated distributed systems. These systems are deployed closer to users, splitting latency across multiple servers and allowing for faster access and communication. However, to accommodate these systems, the data stores are also split across multiple locations. Committing an operation in such systems requires coordination among the multiple replicas. These systems must allow data to be stored as fast as possible without breaking the safety constraints of the developers' systems. There are three main approaches to defining the level of consistency to be guaranteed when accessing the data: over data, over operations, or over transactions. The problem with approaches such as consistency over data or consistency over transactions is that they are very limited: operations that could be executed at lower consistency levels may end up being executed at higher ones. Our approach to this problem is to reconcile transaction execution with consistency expressed over both data and operations. We instantiate this proposition in a middleware system, called Ginger, that is deployed between the user and the data stores. Ginger benefits from all the other approaches, allowing the execution of transactions that include operations with different levels of consistency over data with different levels of consistency. This provides the isolation benefits of transactions while also providing the performance and control that consistency defined over operations and consistency defined over data provide. Our experimental results show that Ginger, compared to the previously mentioned approaches, such as consistency over data and consistency over transactions, provides faster transaction commits. Ginger serves as a proof of concept that using consistency defined over both data and operations while using transactions is possible and may be a viable approach. Further development of the system will provide more functionality, further evaluation, and a more in-depth comparison to other systems.
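
    The limitation described in the abstract above can be illustrated with a small sketch. It is not Ginger's API or algorithm: the `Level` enum, the rule that the stricter of the data-level and operation-level annotations wins, and the helper names are all assumptions made for illustration. It only counts how many operations of one transaction would need strong (coordinated) execution when consistency is fixed per transaction versus mixed per data item and operation.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical two-level consistency scale (not taken from the paper)."""
    WEAK = 0
    STRONG = 1


def effective_level(data_level, op_level):
    # Assumption: the stricter of the two declared levels applies.
    return max(data_level, op_level)


def coordinated_ops(txn, data_levels, mixed=True):
    """Count the operations in `txn` that must run at Level.STRONG.

    txn         -- list of (key, operation_level) pairs
    data_levels -- mapping from key to the level declared on that data item
    mixed       -- if False, emulate transaction-centric consistency, where
                   every operation runs at the strictest level in the txn
    """
    levels = [effective_level(data_levels[key], op) for key, op in txn]
    if not mixed and levels:
        strictest = max(levels)
        levels = [strictest] * len(levels)
    return sum(1 for lvl in levels if lvl == Level.STRONG)


data_levels = {"balance": Level.STRONG, "likes": Level.WEAK}
txn = [("likes", Level.WEAK), ("likes", Level.WEAK), ("balance", Level.WEAK)]

mixed_count = coordinated_ops(txn, data_levels, mixed=True)
txn_centric_count = coordinated_ops(txn, data_levels, mixed=False)
```

    In this toy transaction only the write to `balance` needs strong execution under the mixed model, whereas the transaction-centric emulation promotes all three operations.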

    syncope: Automatic Enforcement of Distributed Consistency Guarantees


    Performance evaluation of various deployment scenarios of the 3-replicated Cassandra NoSQL cluster on AWS

    Distributed replicated NoSQL data stores such as Cassandra, HBase, and MongoDB have been proposed to effectively manage Big Data sets whose volume, velocity, and variability are difficult to deal with using traditional Relational Database Management Systems. Tradeoffs between consistency, availability, partition tolerance, and latency are intrinsic to such systems. Although relations between these properties have previously been identified in qualitative terms by the well-known CAP and PACELC theorems, it is still necessary to quantify how different consistency settings, deployment patterns, and other properties affect system performance. This experience report analyses the performance of the Cassandra NoSQL database cluster and studies the tradeoff between data consistency guarantees and performance in distributed data stores. The primary focus is on investigating the quantitative interplay between Cassandra response time, throughput, and its consistency settings, considering different single- and multi-region deployment scenarios. The study uses the YCSB benchmarking framework and reports the results of read and write performance tests of a three-replicated Cassandra cluster deployed on Amazon AWS. In this paper, we also put forward a notation which can be used to formally describe the distributed deployment of a Cassandra cluster and its nodes relative to each other and to a client application. We present quantitative results showing how different consistency settings and deployment patterns affect Cassandra performance under different workloads. In particular, our experiments show that strong consistency costs up to 22% of performance in the case of the centralized Cassandra cluster deployment and can cause a 600% increase in the read/write requests if Cassandra replicas and their clients are globally distributed across different AWS Regions.
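
    As background for the consistency settings this abstract refers to, the sketch below shows the usual quorum rule for a replicated store: a read is guaranteed to overlap the latest acknowledged write when the number of replicas contacted on read plus the number contacted on write exceeds the replication factor. The level names ONE, QUORUM, and ALL follow Cassandra's terminology; the function names are hypothetical, and the snippet does not reproduce the paper's experiments or notation.

```python
def replicas_required(level, rf):
    """Replicas that must acknowledge a request at a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def is_strongly_consistent(read_level, write_level, rf=3):
    """True when read and write replica sets must intersect (R + W > RF)."""
    r = replicas_required(read_level, rf)
    w = replicas_required(write_level, rf)
    return r + w > rf


# Replication factor 3, as in the three-replicated cluster described above.
quorum_both = is_strongly_consistent("QUORUM", "QUORUM")
one_both = is_strongly_consistent("ONE", "ONE")
read_one_write_all = is_strongly_consistent("ONE", "ALL")
```

    With three replicas, QUORUM reads and writes each wait for two acknowledgements, which is why strong settings cost more when replicas sit in different regions.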

    Functional programming abstractions for weakly consistent systems

    In recent years, there has been widespread adoption of both multicore and cloud computing. Traditionally, concurrent programmers have relied on the underlying system providing strong memory consistency, where there is a semblance of concurrent tasks operating over a shared global address space. However, providing scalable strong consistency guarantees as the scale of the system grows is an increasingly difficult endeavor. In a multicore setting, the increasing complexity and the lack of scalability of hardware mechanisms such as cache coherence deter scalable strong consistency. In geo-distributed compute clouds, availability concerns in the presence of partial failures prohibit strong consistency. Hence, modern multicore and cloud computing platforms eschew strong consistency in favor of weakly consistent memory, where each task's memory view is incomparable with that of the other tasks. As a result, programmers on these platforms must tackle the full complexity of concurrent programming for an asynchronous distributed system. This dissertation argues that functional programming language abstractions can simplify scalable concurrent programming for weakly consistent systems. Functional programming espouses mutation-free programming, and rare mutations, when present, are explicit in their types. By controlling and explicitly reasoning about shared state mutations, functional abstractions simplify concurrent programming. Building upon this intuition, this dissertation presents three major contributions, each focused on addressing a particular challenge associated with weakly consistent, loosely coupled systems. First, it describes ANERIS, a concurrent functional programming language and runtime for the Intel Single-chip Cloud Computer, and shows how to provide an efficient cache-coherent virtual address space on top of a non-cache-coherent multicore architecture. Next, it describes RxCML, a distributed extension of MULTIMLTON, and shows that, with the help of speculative execution, synchronous communication can be utilized as an efficient abstraction for programming asynchronous distributed systems. Finally, it presents QUELEA, a programming system for eventually consistent distributed stores, and shows that the choice of the correct consistency level for replicated data type operations and transactions can be automated with the help of high-level declarative contracts.

    Distributed transactional reads: the strong, the quick, the fresh & the impossible

    This paper studies the costs and trade-offs of providing transactional consistent reads in a distributed storage system. We identify the following dimensions: read consistency, read delay (latency), and data freshness. We show that there is a three-way trade-off between them, which can be summarised as follows: (i) it is not possible to ensure at the same time order-preserving (e.g., causally-consistent) or atomic reads, Minimal Delay, and maximal freshness; thus, reading data that is the most fresh without delay is possible only in a weakly-isolated mode; (ii) ensuring atomic or order-preserving reads at Minimal Delay requires reading data from the past (not fresh); (iii) however, order-preserving minimal-delay reads can be fresher than atomic ones; (iv) reading atomic or order-preserving data at maximal freshness may block reads or writes indefinitely. Our impossibility results hold independently of other features of the database, such as update semantics (totally ordered or not) or data model (structured or unstructured). Guided by these results, we modify an existing protocol to ensure minimal-delay reads (at the cost of freshness) under atomic-visibility and causally-consistent semantics. Our experimental evaluation supports the theoretical results.

    Automated deductive verification of systems software

    Software has become an integral part of our everyday lives, and so has our reliance on its correct functioning. Systems software lies at the heart of computer systems; consequently, ensuring its reliability and security is of paramount importance. This thesis explores automated deductive verification for increasing the reliability and security of systems software. The thesis comprises three main threads. The first thread describes how state-of-the-art deductive verification techniques can help in developing a more secure operating system. We have developed a prototype of an Android-based operating system with strong assurance guarantees. Operating system code relies heavily on mutable data structures. In our experience, reasoning about such pointer-manipulating programs was the hardest aspect of the operating system verification effort because the correctness criteria describe intricate combinations of structure (shape), content (data), and separation. Thus, in the second thread, we explore the design and development of an automated verification system for assuring the correctness of pointer-manipulating programs using an extension of Hoare's logic for reasoning about programs that access and update heap-allocated data structures. We have developed a verification framework that allows reasoning about C programs using only domain-specific code annotations. The same thread contains a novel idea that enables efficient runtime checking of assertions that can express properties of dynamically manipulated linked-list data structures. Finally, we describe work that paves a new way for reasoning about distributed protocols. We propose certified program models, where an executable language (such as C) is used for modelling: an executable language enables testing, and emerging program verifiers for mainstream executable languages enable certification of such models. As an instance of this approach, concurrent C code is used for modelling, and a program verifier for concurrent C (VCC from Microsoft Research) is used for certification of a new class of systems software that serves as a backbone for efficient distributed data storage.

    Data management techniques

    Data storage and management is projected to become one of the key challenges in achieving ultrascale computing, for several reasons. First, data is expected to grow exponentially in the coming years, and this growth implies that disruptive technologies will be needed to store large amounts of data and, more importantly, to access it in a timely manner. Second, the improvement of computing elements and their scalability is shifting application execution from CPU bound to I/O bound. This creates additional challenges for significantly improving access to data so that it keeps up with computation time, and thus for preventing high-performance computing (HPC) systems from being underutilized due to long periods of I/O activity. Third, the boundary between the two initially separate worlds of HPC is blurring: on one hand, simulations that are CPU bound, and on the other hand, analytics that mainly perform huge data scans to discover information and are I/O bound. Now, simulations and analytics need to work cooperatively and share the same I/O infrastructure.

    Deducing Operation Commutativity from Replicated Data Declaration

    Distributed systems often resort to data replication not only to enhance their availability but also to reduce user-perceived latency by balancing the load between replicas and routing requests accordingly. The choice of which consistency level should be adopted by these replicated systems is critical for the fulfilment of their performance and correctness requirements. However, defining a strategy that strikes the right balance between these concerns in this type of environment is far from a trivial task, due to the related overheads that are amplified in distributed scenarios. Recognising the tension between latency and consistency, many systems allow multiple consistency levels to coexist. Nevertheless, the performance fine-tuning mechanisms supported by existing hybrid solutions place a high burden on programmers, since the necessary input can be rather complex, requiring them to understand the semantics of each operation of the service they are developing in order to correctly instruct the system on how to handle concurrent updates. Thus, specifying operation dependencies, orderings, and invariants to be preserved, or even picking the right consistency level to be assigned to a certain data item, is generally an error-prone task that hinders reasoning. To overcome this adversity, this work aims to reduce the effort spent by the programmer by only requiring a simple and intuitive input at data declaration. Following this approach, reasoning is centralised and all accesses to replicated data are identified automatically. With all data accesses identified, it is then possible to deduce the side effects of each operation and determine, for each one of them, those with which it conflicts. In this context, this thesis also presents a compile-time analysis for the Java language that evaluates pairwise operation commutativity from the input given at data declaration.
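
    To make "pairwise operation commutativity" concrete, here is a minimal sketch of the property itself. It is not the thesis's method: the thesis describes a compile-time analysis for Java, whereas this snippet is a runtime check over a handful of sample states, and the operations and helper names are hypothetical. Two operations commute when applying them in either order leaves the replicated state identical.

```python
def commute(op_a, op_b, sample_states):
    """Check that op_a and op_b commute on every state in sample_states.

    This is a test over samples, not a proof: passing only shows that no
    counterexample was found among the states provided.
    """
    return all(op_b(op_a(s)) == op_a(op_b(s)) for s in sample_states)


# Hypothetical operations on an integer counter replicated across sites.
def add_five(counter):
    return counter + 5


def add_three(counter):
    return counter + 3


def reset(counter):
    return 0


states = range(-3, 4)
increments_commute = commute(add_five, add_three, states)
increment_reset_commute = commute(add_five, reset, states)
```

    The two increments commute and could be applied at replicas in any order, while an increment and a reset conflict and would need to be ordered.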

    8th SC@RUG 2011 proceedings:Student Colloquium 2010-2011
