DKVF: A Framework for Rapid Prototyping and Evaluating Distributed Key-value Stores
We present our framework DKVF that enables one to quickly prototype and
evaluate new protocols for key-value stores and compare them with existing
protocols based on selected benchmarks. Due to the limitations imposed by the CAP theorem, new
protocols must be developed that achieve the desired trade-off between
consistency and availability for the given application at hand. Hence, both
academic and industrial communities focus on developing new protocols that
identify a different (and hopefully better in one or more aspects) point on this
trade-off curve. While these protocols are often based on a simple intuition,
evaluating them to ensure that they indeed provide increased availability,
consistency, or performance is a tedious task. Our framework, DKVF, enables one
to quickly prototype a new protocol as well as identify how it performs
compared to existing protocols for pre-specified benchmarks. Our framework
relies on YCSB (the Yahoo! Cloud Serving Benchmark) for benchmarking. We
demonstrate DKVF by implementing four existing protocols --eventual
consistency, COPS, GentleRain, and CausalSpartan-- with it. We compare the
performance of these protocols under different loading conditions. We find
that their performance is similar to that of our from-scratch implementations
of these protocols, and that the comparison among them is consistent with what
has been reported in the literature. Moreover, implementing these protocols
was much more natural, as we only needed to translate the pseudocode into Java
(and add the necessary error handling). Hence, it was possible to achieve this
in just 1-2 days per protocol. Finally, our framework is extensible. It is
possible to replace individual components in the framework (e.g., the storage
component).
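The abstract above describes prototyping replication protocols by translating pseudocode directly into code. A minimal sketch of the simplest of the four protocols, eventual consistency, with hypothetical class and method names (this is an illustration of the protocol idea, not DKVF's actual API): writes are applied at the source replica first, propagated to peers, and merged with a last-writer-wins rule.

```python
import time

class Replica:
    """Minimal sketch of an eventually consistent key-value replica.

    Hypothetical illustration of the kind of protocol one might prototype
    in a framework like DKVF; names and structure are invented here.
    """

    def __init__(self, name):
        self.name = name
        self.store = {}   # key -> (timestamp, value)
        self.peers = []   # other replicas receiving our updates

    def put(self, key, value):
        # Apply locally first; propagation would be asynchronous in a
        # real system, which is what makes the protocol only eventual.
        ts = time.time()
        self.store[key] = (ts, value)
        for peer in self.peers:
            peer.replicate(key, ts, value)

    def replicate(self, key, ts, value):
        # Last-writer-wins merge: keep the update with the newest timestamp.
        current = self.store.get(key)
        if current is None or ts > current[0]:
            self.store[key] = (ts, value)

    def get(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

a, b = Replica("a"), Replica("b")
a.peers = [b]
a.put("ticket", "sold")
```

Stronger protocols such as COPS or CausalSpartan would additionally attach dependency metadata to each `replicate` call and delay its application until those dependencies are satisfied.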
A policy language definition for provenance in pervasive computing
Recent advances in computing technology have led to the paradigm of pervasive computing, which provides a means of simplifying daily life by integrating information processing into the everyday physical world. Pervasive computing draws its power from knowing the surroundings and creates an environment which combines computing and communication capabilities. Sensors that provide high-resolution spatial and instant measurement are most commonly used for forecasting, monitoring and real-time environmental modelling. Sensor data generated by a sensor network depends on several influences, such as the configuration and location of the sensors or the processing performed on the raw measurements. Storing sufficient metadata that gives meaning to the recorded observation is important in order to draw accurate conclusions or to enhance the reliability of the result dataset that uses this automatically collected data. This kind of metadata is called provenance data, as the origin of the data and the process by which it arrived from its origin are recorded. Provenance is still an exploratory field in pervasive computing and many open research questions are yet to emerge. The context information and the different characteristics of the pervasive environment call for different approaches to a provenance support system.
This work implements a policy language definition that specifies the collection model for provenance management systems and addresses the challenges that arise with stream data and sensor environments. The structure graph of the proposed model is mapped to the Open Provenance Model in order to facilitate the sharing of provenance data and interoperability with other systems. As provenance security has been recognized as one of the most important components in any provenance system, an access control language has been developed that is tailored to support the special requirements of provenance: fine-grained policies, privacy policies and preferences. Experimental evaluation findings show a reasonable overhead for provenance collection and reasonable provenance query times, while a numerical analysis was used to evaluate the storage overhead.
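The abstract mentions fine-grained access control over provenance records. A minimal sketch of that idea with entirely hypothetical structures (not the dissertation's actual policy language): each policy grants a role read access to specific fields of a provenance record, and queries are filtered accordingly.

```python
# Hypothetical field-level access control over a provenance record;
# illustrates the notion of fine-grained policies, not the actual language.
record = {
    "observation": 21.5,
    "sensor_id": "s-17",
    "location": "lab-3",        # sensitive: deployment site of the sensor
    "raw_trace": [21.4, 21.6],  # raw measurements behind the observation
}

# Each policy grants a role read access to a set of record fields.
policies = {
    "analyst": {"observation", "sensor_id"},
    "admin": {"observation", "sensor_id", "location", "raw_trace"},
}

def query(record, role):
    """Return only the fields the role's policy permits."""
    allowed = policies.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}
```

A privacy preference, in this framing, would simply be a further restriction intersected with the role's field set before filtering.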
A Semantic Consistency Model to Reduce Coordination in Replicated Systems
Large-scale distributed applications need to be available and responsive to satisfy millions
of users, which can be achieved by having data geo-replicated in multiple replicas.
However, a partitioned system cannot fully sustain both availability and consistency.
The usage of weak consistency models might lead to data integrity violations, triggered
by problematic concurrent updates, such as selling the last seat on a flight
twice. To overcome possible conflicts, programmers might opt to apply strong
consistency, which guarantees a total order between operations, while preserving data
integrity. Nevertheless, maintaining the illusion of a non-replicated system harms availability.
In contrast, weaker notions might be used, such as eventual consistency, which boosts
responsiveness, as operations are executed directly at the source replica and their effects
are propagated to remote replicas in the background. However, this approach might put
data integrity at risk. Current protocols that preserve invariants rely on, at least, causal
consistency, a consistency model that maintains causal dependencies between operations.
In this dissertation, we propose a protocol that includes a semantic consistency model.
This consistency model stands between eventual consistency and causal consistency. We
guarantee better performance compared with causal consistency, and ensure data integrity.
Through semantic analysis, relying on the static analysis tool CISE3, we manage
to limit the maximum number of dependencies that each operation will have. To support
the protocol, we developed a communication algorithm for a cluster of replicas. Additionally,
we present an architecture that uses Akka, an actor-based middleware in which actors
communicate by exchanging messages. This architecture adopts the publish/subscribe
pattern and includes data persistence. We also consider the stability of operations, as well
as a dynamic cluster environment, ensuring the convergence of the replicated state. Finally,
we perform an experimental evaluation regarding the performance of the algorithm
using standard case studies. The evaluation confirms that by relying on semantic analysis,
the system requires less coordination between the replicas than causal consistency,
ensuring data integrity.
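The key idea in the dissertation described above is that semantic analysis can shrink an operation's dependency set below what causal consistency requires. A minimal sketch with a hypothetical, hard-coded conflict relation (a real system would derive it with a static analysis tool such as CISE3): an operation waits only for semantically conflicting predecessors instead of its full causal history.

```python
# Hypothetical sketch: dependencies restricted to semantically conflicting
# operations, instead of the full causal history.

def conflicts(op_a, op_b):
    # Invented conflict relation for illustration: only withdrawals can
    # jointly violate a non-negative-balance invariant, so only they conflict.
    return op_a[0] == "withdraw" and op_b[0] == "withdraw"

def dependencies(op, history):
    """Under causal consistency an operation depends on every earlier
    operation in its history; under the semantic model it depends only
    on the conflicting ones."""
    return [h for h in history if conflicts(op, h)]

history = [("deposit", 10), ("deposit", 5), ("withdraw", 3)]
new_op = ("withdraw", 4)

causal_deps = history                          # all prior operations
semantic_deps = dependencies(new_op, history)  # conflicting ones only
```

Fewer dependencies means a remote replica can apply `new_op` after receiving only the conflicting predecessors, which is where the reduced coordination comes from.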
Activity Report 2012. Project-Team RMOD. Analyses and Languages Constructs for Object-Oriented Application Evolution
Big continuous data: dealing with velocity by composing event streams
The rate at which we produce data is growing steadily, thus creating ever larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to on-going events. Modern applications consuming these streams need to extract behaviour patterns that can be obtained by aggregating and mining, statically and dynamically, huge event histories. An event is the notification that a happening of interest has occurred. Event streams must be combined or aggregated to produce more meaningful information. By combining and aggregating them, either from multiple producers or from a single one during a given period of time, a limited set of events describing meaningful situations may be notified to consumers. Event streams, with their volume and continuous production, relate mainly to two of the characteristics attributed to Big Data by the 5V's model: volume and velocity. Techniques such as complex pattern detection, event correlation, event aggregation, event mining and stream processing have been used for composing events. Nevertheless, to the best of our knowledge, few approaches integrate different composition techniques (online and post-mortem) for dealing with Big Data velocity. This chapter gives an analytical overview of event stream processing and composition approaches: complex event languages, services and event querying systems on distributed logs. Our analysis underlines the challenges introduced by Big Data velocity and volume and uses them as a reference for identifying the scope and limitations of results stemming from different disciplines: networks, distributed systems, stream databases, event composition services, and data mining on traces.
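The chapter above argues that event streams must be aggregated so that a limited set of meaningful events reaches consumers. A minimal sketch, with hypothetical names, of one common composition technique: a tumbling-window aggregation that collapses a raw stream of timestamped events into per-window counts.

```python
from collections import Counter

def tumbling_window_counts(events, window):
    """Aggregate (timestamp, kind) events into counts per fixed-size window.

    Hypothetical illustration of stream aggregation for velocity: many raw
    events are reduced to a few summary notifications.
    """
    counts = Counter()
    for ts, kind in events:
        bucket = ts // window  # index of the window the event falls into
        counts[(bucket, kind)] += 1
    return dict(counts)

# Four raw events collapse into three (window, kind) summaries.
events = [(0, "click"), (1, "click"), (2, "query"), (5, "click")]
summary = tumbling_window_counts(events, window=3)
```

Online engines evaluate such windows incrementally as events arrive, whereas post-mortem analysis runs the same aggregation over a stored event history; integrating the two is the gap the chapter highlights.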
Metadata and provenance management
Scientists today collect, analyze, and generate terabytes and petabytes of
data. These data are often shared, and further processed and analyzed among
collaborators. In order to facilitate sharing and data interpretation, data
need to carry with them metadata about how they were collected or generated,
and provenance information about how they were processed. This chapter
describes metadata and provenance in the context of the data lifecycle. It
also gives an overview of approaches to metadata and provenance management,
followed by examples of how applications use metadata and provenance in their
scientific processes.