
    On the support of versioning in distributed key-value stores

    The ability to access and query data stored in multiple versions is an important asset for many applications, such as Web graph analysis, collaborative editing platforms, data forensics, or correlation mining. The storage and retrieval of versioned data requires a specific API and support from the storage layer. The choice of the data structures used to maintain versioned data has a fundamental impact on the performance of insertions and queries. The appropriate data structure also depends on the nature of the versioned data and the nature of the access patterns. In this paper we study the design and implementation space for providing versioning support on top of a distributed key-value store (KVS). We define an API for versioned data access supporting multiple writers and show that a plain KVS does not offer the necessary synchronization power for implementing this API. We leverage the support for listeners at the KVS level and propose a general construction for implementing arbitrary types of data structures for storing and querying versioned data. We explore the design space of versioned data storage, ranging from a flat data structure to a distributed sharded index. The resulting system is implemented on top of an industrial-grade open-source KVS, Infinispan. Our evaluation, based on real-world Wikipedia access logs, studies the performance of each versioning mechanism in terms of load balancing, latency, and storage overhead in the context of different access scenarios.

    FreSh: A Lock-Free Data Series Index

    We present FreSh, a lock-free data series index that exhibits good performance while remaining robust. FreSh is based on Refresh, a generic approach we have developed for supporting lock-freedom efficiently on top of any locality-aware data series index. We believe Refresh is of independent interest and can be used to obtain well-performing lock-free versions of other locality-aware blocking data structures. To develop FreSh, we first studied in depth the design decisions of current state-of-the-art data series indexes and the principles governing their performance. This led to a theoretical framework that enables the development and analysis of data series indexes in a modular way. The framework allowed us to apply Refresh repeatedly to obtain lock-free versions of the different phases of a family of data series indexes. Experiments with several synthetic and real datasets illustrate that FreSh achieves performance that is as good as that of the state-of-the-art blocking in-memory data series index. This shows that the helping mechanisms of FreSh are lightweight, respecting certain principles that are crucial for performance in locality-aware data structures. This paper was published in SRDS 2023. Comment: 12 pages, 18 figures, Conference: Symposium on Reliable Distributed Systems (SRDS 2023).

    Smart meter data processing: a showcase for simple and efficient textual processing

    The increase in the production and collection of data from devices is an ongoing trend due to the roll-out of more cyber-physical applications. Smart meters, because of their importance in power grids, are a class of such devices whose produced data requires meticulous processing. In this paper, we use Unicage, a data processing system based on classic Unix shell scripting that delivers excellent performance in a simple package. We use this methodology to process smart meter data in XML format, subject to the constraints posed by a real use case. We develop a solution that parses, validates, and performs a simple aggregation of 27 million XML files in less than 10 minutes. We present a study of the solution as well as the benefits of its adoption. Comment: 11 pages, 5 figures, 1 table, 9 listings. Accepted after review for the 1st Workshop on High-Performance and Reliable Big Data (HPBD 2021), which was held virtually on September 20th 2021, and was co-located with the 40th International Symposium on Reliable Distributed Systems (SRDS 2021).

    Impact of EU duty cycle and transmission power limitations for sub-GHz LPWAN SRDs : an overview and future challenges

    Long-range sub-GHz technologies such as LoRaWAN, SigFox, IEEE 802.15.4, and DASH7 are increasingly popular for academic research and daily life applications. However, especially in the European Union (EU), the use of their corresponding frequency bands is tightly regulated, since they must conform to the short-range device (SRD) regulations. Regulations and standards for SRDs exist on various levels, from global to national, but are often a source of confusion. Not only are multiple institutes responsible for drafting legislation and regulations, but depending on the type of document, the rules can be either informational or mandatory. Regulations also vary from region to region; for example, regulations in the United States of America (USA) rely on electrical field strength and harmonic strength, while EU regulations are based on duty cycle and maximum transmission power. A common misconception is the presence of a common 1% duty cycle, while in fact the duty cycle is frequency band-specific and can be loosened under certain circumstances. This paper clarifies the various regulations for the European region, the parties involved in drafting and enforcing regulation, and the impact on recent technologies such as SigFox, LoRaWAN, and DASH7. Furthermore, an overview is given of potential mitigation approaches to cope with the duty cycle constraints, as well as future research directions.
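
    To make the duty-cycle constraint concrete, the following is a minimal arithmetic sketch, not taken from the paper. It assumes the duty cycle is expressed as the fraction of a one-hour observation window during which a device may transmit. The function names and the example duty-cycle values (0.1%, 1%, 10%) are illustrative assumptions and are not tied to specific frequency bands here.

```python
def max_airtime_per_hour(duty_cycle: float) -> float:
    """Maximum cumulative transmit time, in seconds, within one hour.

    Assumption: duty_cycle is a fraction (0.01 means 1%) measured over
    a one-hour observation window.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be a fraction between 0 and 1")
    return duty_cycle * 3600.0


def min_off_time(time_on_air: float, duty_cycle: float) -> float:
    """Silent time, in seconds, to wait after one transmission so that the
    long-run ratio of on-air time to total time equals duty_cycle."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be a fraction in (0, 1]")
    if time_on_air < 0.0:
        raise ValueError("time_on_air must be non-negative")
    return time_on_air * (1.0 / duty_cycle - 1.0)


if __name__ == "__main__":
    # Illustrative duty-cycle values only (hypothetical, not band assignments).
    for dc in (0.001, 0.01, 0.1):
        budget = max_airtime_per_hour(dc)
        wait = min_off_time(1.0, dc)
        print(f"duty cycle {dc:.1%}: {budget:.1f} s per hour, "
              f"wait {wait:.1f} s after a 1 s transmission")
```

    Under these assumptions, a 1% duty cycle leaves a transmit budget of 36 seconds per hour, which is why a band-specific, looser limit changes the achievable message rate so markedly.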

    ALBUS: a Probabilistic Monitoring Algorithm to Counter Burst-Flood Attacks

    Modern DDoS defense systems rely on probabilistic monitoring algorithms to identify flows that exceed a volume threshold and should thus be penalized. Commonly, classic sketch algorithms are considered sufficiently accurate for usage in DDoS defense. However, as we show in this paper, these algorithms achieve poor detection accuracy under burst-flood attacks, i.e., volumetric DDoS attacks composed of a swarm of medium-rate sub-second traffic bursts. Under this challenging attack pattern, traditional sketch algorithms can only detect a high share of the attack bursts by incurring a large number of false positives. In this paper, we present ALBUS, a probabilistic monitoring algorithm that overcomes the inherent limitations of previous schemes: ALBUS is highly effective at detecting large bursts while reporting no legitimate flows, and therefore improves on prior work regarding both recall and precision. Besides improving accuracy, ALBUS scales to high traffic rates, which we demonstrate with an FPGA implementation, and is suitable for programmable switches, which we showcase with a P4 implementation. Comment: Accepted at the 42nd International Symposium on Reliable Distributed Systems (SRDS 2023).
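
    For readers unfamiliar with the "classic sketch algorithms" the abstract refers to, the following is a minimal Count-Min sketch used for threshold-based flow detection. It is a generic illustration and not the ALBUS algorithm; the class name, parameters, flow identifiers, and threshold are all hypothetical. It shows the property the abstract alludes to: hash collisions can only inflate an estimate, so over-threshold reports may include false positives.

```python
import hashlib


class CountMinSketch:
    """Generic Count-Min sketch (illustrative; not the ALBUS algorithm)."""

    def __init__(self, width: int = 1024, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row: int, key: str) -> int:
        # One salted hash per row approximates independent hash functions.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key: str, count: int = 1) -> None:
        """Add `count` units of volume (e.g. bytes) to flow `key`."""
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key: str) -> int:
        """Estimated volume of `key`; never below the true volume."""
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )

    def exceeds(self, key: str, threshold: int) -> bool:
        """True if the estimated volume reaches the threshold."""
        return self.estimate(key) >= threshold


if __name__ == "__main__":
    sketch = CountMinSketch()
    sketch.update("flow-a", 500)  # hypothetical heavy flow
    sketch.update("flow-b", 10)   # hypothetical light flow
    print(sketch.exceeds("flow-a", threshold=100))
```

    Because such a sketch accumulates volume over its whole measurement interval, a short sub-second burst is averaged with the quiet periods around it, which is one intuition for why burst-flood attacks are hard for this family of algorithms.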

    Consistency Management Among Replicas in Peer-to-Peer Mobile Ad Hoc Networks

    Recent advances in wireless communication, along with the peer-to-peer (P2P) paradigm, have led to increasing interest in P2P mobile ad hoc networks. In this paper, we assume an environment where each mobile peer accesses data items held by other peers that are connected by a mobile ad hoc network. Since peers' mobility causes frequent network partitions, replicas of a data item may become inconsistent due to write operations performed by mobile peers. In such an environment, global consistency of data items is not desirable for many applications. Thus, new consistency maintenance based on local conditions such as location and time needs to be investigated. This paper attempts to classify different consistency levels according to requirements from applications and provides protocols to realize them. We report simulation results to investigate the characteristics of these consistency protocols in a P2P wireless ad hoc network environment and their relationship with the quorum sizes.