    Optimizing simulation on shared-memory platforms: The smart cities case

    Modern advancements in computing architectures have been accompanied by new paradigms for running Parallel Discrete Event Simulation models efficiently. Indeed, many paradigms for effectively exploiting the available underlying hardware have been proposed in the literature. Among these, the Share-Everything paradigm targets massively parallel shared-memory machines, supporting speculative simulation while taking into account the limits and benefits of this family of architectures. Previous results have shown that this paradigm outperforms traditional speculative strategies (such as data-separated Time Warp systems) whenever the granularity of executed events is small. In this paper, we show the performance implications of this simulation-engine organization when simulation models have variable granularity. To this end, we have selected a traffic model tailored for smart-city-oriented simulation. Our assessment illustrates the effects of the various tuning parameters of the approach, opening the way to a deeper understanding of this innovative paradigm.
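
    In the share-everything organization, all worker threads draw events from a single shared, timestamp-ordered pool instead of owning disjoint partitions of the model state. The following minimal sketch illustrates that structure; the names, the toy traffic event, and the simplified termination rule are our own assumptions, and a real engine adds speculative execution and rollback on top:

    import heapq
    import threading

    event_pool = []               # shared pool of (timestamp, seq, handler, args)
    pool_lock = threading.Lock()
    _seq = 0                      # unique tie-breaker so heap entries compare

    def schedule(timestamp, handler, *args):
        global _seq
        with pool_lock:
            heapq.heappush(event_pool, (timestamp, _seq, handler, args))
            _seq += 1

    def worker():
        while True:
            with pool_lock:
                if not event_pool:
                    return        # simplified termination; real engines idle or steal work
                timestamp, _, handler, args = heapq.heappop(event_pool)
            handler(timestamp, *args)    # event granularity = the cost of this call

    def car_arrival(now, intersection):
        # Toy smart-city event: forward the car to the next intersection
        # after a fixed crossing delay.
        if now < 100:
            schedule(now + 5, car_arrival, (intersection + 1) % 4)

    schedule(0, car_arrival, 0)
    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()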

    Total order in opportunistic networks

    Opportunistic network applications are usually assumed to work only with unordered immutable messages, like photos, videos, or music files, while applications that depend on ordered or mutable messages, like chat or shared content editing, are ignored. In this paper, we examine how total ordering can be achieved in an opportunistic network. By leveraging existing dissemination and causal order algorithms, we propose a commutative replicated data type algorithm based on Logoot for achieving total order without tombstones in opportunistic networks where message delivery is not guaranteed by the routing layer. Our algorithm is designed to exploit the nature of the opportunistic network to reduce the metadata size compared to the original Logoot, and in some cases even to achieve higher hit rates than the dissemination algorithms when no order is enforced. Finally, we present experimental results for the new algorithm obtained with an opportunistic network emulator, mobility traces, and Wikipedia pages.
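
    The key idea the algorithm builds on is Logoot's identifier scheme: every element receives a unique position identifier drawn from a dense total order, so replicas merge concurrent inserts deterministically and delete elements without leaving tombstones. A minimal sketch of that scheme follows; it is our simplified illustration, not the paper's network-aware variant, and real Logoot additionally includes a per-site counter in each pair to guarantee uniqueness:

    import bisect
    from random import randint

    BASE = 1 << 16

    def between(lo, hi, site):
        # Build an identifier strictly between lo and hi by descending
        # digit levels until a gap opens up.
        pos, depth, bounded = [], 0, True
        while True:
            lo_pair = lo[depth] if depth < len(lo) else (0, site)
            hi_d = hi[depth][0] if bounded and depth < len(hi) else BASE
            if hi_d - lo_pair[0] > 1:
                pos.append((randint(lo_pair[0] + 1, hi_d - 1), site))
                return pos
            pos.append(lo_pair)   # copy lo's pair and go one level deeper
            if lo_pair[0] < hi_d:
                bounded = False   # already strictly below hi
            depth += 1

    # Document: sorted list of (identifier, value) between two sentinels.
    doc = [([(0, 0)], None), ([(BASE, 0)], None)]

    def insert(index, value, site):
        ident = between(doc[index][0], doc[index + 1][0], site)
        bisect.insort(doc, (ident, value))
        return ident

    def delete(ident):
        # No tombstone needed: identifiers are globally unique, so a
        # removal can never be confused with a concurrent insert.
        doc[:] = [e for e in doc if e[0] != ident]

    i1 = insert(0, "alpha", site=1)
    insert(1, "beta", site=2)
    delete(i1)
    print([v for _, v in doc[1:-1]])    # ['beta']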

    Okapi: Causally Consistent Geo-Replication Made Faster, Cheaper and More Available

    Okapi is a new causally consistent geo-replicated key-value store. Okapi leverages two key design choices to achieve high performance. First, it relies on hybrid logical/physical clocks to achieve low latency even in the presence of clock skew. Second, Okapi achieves higher resource efficiency and better availability, at the expense of a slight increase in update visibility latency. To this end, Okapi implements a new stabilization protocol that uses a combination of vector and scalar clocks and makes a remote update visible when its delivery has been acknowledged by every data center. We evaluate Okapi with different workloads on Amazon AWS, using three geographically distributed regions and 96 nodes. We compare Okapi with two recent approaches to causal consistency, Cure and GentleRain. We show that Okapi delivers up to two orders of magnitude better performance than GentleRain and that Okapi achieves up to 3.5x lower latency and a 60% reduction of the meta-data overhead with respect to Cure.
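
    A hybrid logical/physical clock timestamps every event with a (physical, counter) pair that never falls behind any timestamp already observed, which is what keeps causal order intact under clock skew; Okapi's stabilization protocol then withholds a remote update until every data center has acknowledged its delivery. A minimal sketch of the standard HLC update rules (our illustration, not Okapi's code):

    import time

    class HLC:
        # l tracks the largest physical time observed so far; c breaks ties
        # when the local clock stalls or runs behind an incoming timestamp.
        def __init__(self):
            self.l, self.c = 0, 0

        def _wall(self):
            return int(time.time() * 1000)   # milliseconds

        def local_event(self):
            pt = self._wall()
            if pt > self.l:
                self.l, self.c = pt, 0
            else:
                self.c += 1
            return (self.l, self.c)

        def receive(self, ml, mc):
            # Merge the timestamp (ml, mc) carried by an incoming update.
            pt = self._wall()
            l_old = self.l
            self.l = max(l_old, ml, pt)
            if self.l == l_old and self.l == ml:
                self.c = max(self.c, mc) + 1
            elif self.l == l_old:
                self.c += 1
            elif self.l == ml:
                self.c = mc + 1
            else:
                self.c = 0                   # fresh physical time dominates
            return (self.l, self.c)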

    Multi-value distributed key-value stores

    Doctoral thesis in Informatics. Many large-scale distributed data stores rely on optimistic replication to scale and remain highly available in the face of network partitions. Managing data without strong coordination results in eventually consistent data stores that allow concurrent data updates. To allow writing applications in the absence of linearizability or transactions, the seminal Dynamo data store proposed a multi-value API in which a get returns the set of concurrently written values. In this scenario, it is important to be able to accurately and efficiently identify updates executed concurrently. Logical clocks are often used to track data causality, which is necessary to distinguish concurrent from causally related writes on the same key. However, traditional mechanisms incur a non-negligible metadata overhead per key, which keeps growing with time, proportionally to the node churn rate. Another challenge is deleting keys while respecting causality: while the values can be deleted, per-key metadata cannot be permanently removed in current data stores. These systems often use anti-entropy mechanisms (like Merkle Trees) to detect and repair divergent data versions across nodes. However, in practice hash-based data structures are not well suited to a store using consistent hashing and create too many false positives. Moreover, highly available systems usually provide only eventual consistency, the weakest form of consistency, which results in a programming model that is difficult to use and to reason about. It has been proved that causal consistency is the strongest consistency model achievable for highly available services, and it provides better programming semantics, such as session guarantees. However, classical causal consistency is a memory model that is problematic for concurrent updates in the absence of concurrency control primitives: used in eventually consistent data stores, it forces arbitration between concurrent updates, which leads to data loss. We propose three novel techniques in this thesis. The first is Dotted Version Vectors: a solution that combines a new logical clock mechanism and a request handling workflow that together support the traditional Dynamo key-value store API while capturing causality in an accurate and scalable way, avoiding false conflicts. It maintains concise information per version, linear only in the number of replicas, and includes a container data structure that allows sets of concurrent versions to be merged efficiently, with time complexity linear in the number of replicas plus versions. The second is DottedDB: a Dynamo-like key-value store that uses a novel node-wide logical clock framework, overcoming three fundamental limitations of the state of the art: (1) it minimizes the metadata per key necessary to track causality, avoiding its growth even in the face of node churn; (2) it correctly and durably deletes keys, with no need for tombstones; (3) it offers a lightweight anti-entropy mechanism to converge replicated data, avoiding the need for Merkle Trees. The third and final contribution is Causal Multi-Value Consistency: a novel consistency model that respects the causality of client operations while properly supporting concurrent updates without arbitration, by having the same Dynamo-like multi-value nature. In addition, we extend this model to provide the same semantics with read and write transactions. For both models, we define an efficient implementation on top of a distributed key-value store.
    Fundação para a Ciência e Tecnologia (FCT) - with the research grant SFRH/BD/86735/201
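
    The essence of the dotted-version-vector idea fits in a few lines: each stored version carries a single dot, a globally unique (replica, counter) event identifier, plus a causal context summarizing the dots its writer had seen, and a write obsoletes exactly those sibling versions whose dots its context covers. A minimal sketch (our simplification, not the thesis implementation):

    def covered(dot, context):
        # A dot (r, n) is covered by a context {replica: max_counter_seen}.
        r, n = dot
        return n <= context.get(r, 0)

    def write(siblings, new_dot, new_context, value):
        # Drop the siblings the writer had seen; keep truly concurrent ones.
        survivors = [s for s in siblings if not covered(s["dot"], new_context)]
        survivors.append({"dot": new_dot, "ctx": new_context, "value": value})
        return survivors

    # Two clients read an empty key, then write concurrently at replicas a, b:
    s = write([], ("a", 1), {}, "v1")
    s = write(s, ("b", 1), {}, "v2")
    assert {x["value"] for x in s} == {"v1", "v2"}   # true conflict: both kept

    # A later write whose context includes both siblings overwrites them:
    s = write(s, ("a", 2), {"a": 1, "b": 1}, "v3")
    assert [x["value"] for x in s] == ["v3"]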

    A novel causally consistent replication protocol with partial geo-replication

    Distributed storage systems are a fundamental component of large-scale Internet services. To keep up with the increasing expectations of users regarding availability and latency, the design of data storage systems has evolved to achieve these properties by exploiting techniques such as partial replication, geo-replication and weaker consistency models. While systems with these characteristics exist, they usually do not provide all of these properties, or do so inefficiently, without taking full advantage of them. Additionally, weak consistency models, such as eventual consistency, put an excessively high burden on application programmers for writing correct applications; hence, multiple systems have moved towards providing additional consistency guarantees, such as the causal (and causal+) consistency models. In this thesis we approach the existing challenges in designing a causally consistent replication protocol, with a focus on geo-replication and partial data replication. To this end, we present a novel replication protocol capable of enriching an existing geo- and partially replicated datastore with the causal+ consistency model. In addition, this thesis presents a concrete implementation of the proposed protocol over the popular Cassandra datastore system. This implementation is complemented with experimental results obtained in a realistic scenario, in which we compare our proposal with multiple configurations of the Cassandra datastore (without causal consistency guarantees) and with other existing alternatives. The results show that our proposed solution achieves balanced performance, with low data visibility delays and without significant performance penalties.
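
    At the heart of protocols like this one sits a visibility check: a replicated update carries the keys and versions it causally depends on, and a receiving data center delays making it visible until every dependency it names is already visible locally. A minimal sketch of that check (our generic simplification, not the exact protocol of the thesis):

    visible = {}   # key -> highest version visible at this data center
    pending = []   # remote updates whose dependencies are not yet satisfied

    def deps_satisfied(update):
        return all(visible.get(k, 0) >= v for k, v in update["deps"].items())

    def on_remote_update(update):
        pending.append(update)
        progress = True
        while progress:                  # drain everything newly unblocked
            progress = False
            for u in list(pending):
                if deps_satisfied(u):
                    visible[u["key"]] = max(visible.get(u["key"], 0), u["version"])
                    pending.remove(u)
                    progress = True

    # An update to "cart" depending on "user" v2 waits until "user" arrives:
    on_remote_update({"key": "cart", "version": 5, "deps": {"user": 2}})
    assert "cart" not in visible
    on_remote_update({"key": "user", "version": 2, "deps": {}})
    assert visible == {"user": 2, "cart": 5}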