On the support of versioning in distributed key-value stores
The ability to access and query data stored in multiple versions is an important asset for many applications, such as Web graph analysis, collaborative editing platforms, data forensics, or correlation mining. The storage and retrieval of versioned data requires a specific API and support from the storage layer. The choice of the data structures used to maintain versioned data has a fundamental impact on the performance of insertions and queries. The appropriate data structure also depends on the nature of the versioned data and the nature of the access patterns. In this paper we study the design and implementation space for providing versioning support on top of a distributed key-value store (KVS). We define an API for versioned data access supporting multiple writers and show that a plain KVS does not offer the necessary synchronization power for implementing this API. We leverage the support for listeners at the KVS level and propose a general construction for implementing arbitrary types of data structures for storing and querying versioned data. We explore the design space of versioned data storage ranging from a flat data structure to a distributed sharded index. The resulting system is implemented on top of an industrial-grade open-source KVS, Infinispan. Our evaluation, based on real-world Wikipedia access logs, studies the performance of each versioning mechanism in terms of load balancing, latency and storage overhead in the context of different access scenarios.
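To make the idea of versioned data access more concrete, here is a minimal single-process sketch in Python. The class `VersionedStore` and its methods `put`, `get_at`, and `get_latest` are hypothetical names chosen for illustration; they are not the paper's API or Infinispan's, and the sketch ignores distribution and the multi-writer synchronization that the abstract identifies as the hard part.

```python
import bisect


class VersionedStore:
    """Hypothetical in-memory store that keeps every version of each key."""

    def __init__(self):
        self._versions = {}  # key -> sorted list of version numbers
        self._values = {}    # key -> list of values aligned with _versions

    def put(self, key, version, value):
        """Store `value` under `key` at `version`, keeping versions sorted."""
        versions = self._versions.setdefault(key, [])
        values = self._values.setdefault(key, [])
        idx = bisect.bisect_left(versions, version)
        if idx < len(versions) and versions[idx] == version:
            values[idx] = value  # overwrite an existing version
        else:
            versions.insert(idx, version)
            values.insert(idx, value)

    def get_at(self, key, version):
        """Return the value with the largest version <= `version`, or None."""
        versions = self._versions.get(key, [])
        idx = bisect.bisect_right(versions, version)
        return self._values[key][idx - 1] if idx else None

    def get_latest(self, key):
        """Return the most recent value for `key`, or None if absent."""
        values = self._values.get(key, [])
        return values[-1] if values else None
```

This corresponds roughly to the "flat data structure" end of the design space: all versions of a key sit in one sorted list, so a point-in-time lookup is a binary search.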
FreSh: A Lock-Free Data Series Index
We present FreSh, a lock-free data series index that exhibits good
performance (while being robust). FreSh is based on Refresh, which is a generic
approach we have developed for supporting lock-freedom in an efficient way on
top of any locality-aware data series index. We believe Refresh is of
independent interest and can be used to get well-performing lock-free versions
of other locality-aware blocking data structures. For developing FreSh, we
first studied in depth the design decisions of current state-of-the-art data
series indexes, and the principles governing their performance. This led to a
theoretical framework, which enables the development and analysis of data
series indexes in a modular way. The framework allowed us to apply Refresh,
repeatedly, to get lock-free versions of the different phases of a family of
data series indexes. Experiments with several synthetic and real datasets
illustrate that FreSh achieves performance that is as good as that of the
state-of-the-art blocking in-memory data series index. This shows that the
helping mechanisms of FreSh are light-weight, respecting certain principles
that are crucial for performance in locality-aware data structures. This paper
was published in SRDS 2023. Comment: 12 pages, 18 figures, Conference: Symposium on Reliable Distributed
Systems (SRDS 2023)
Smart meter data processing: a showcase for simple and efficient textual processing
The increase in the production and collection of data from devices is an
ongoing trend due to the roll-out of more cyber-physical applications. Smart
meters, because of their importance in power grids, are a class of such devices
whose produced data requires meticulous processing. In this paper, we use
Unicage, a data processing system based on classic Unix shell scripting, that
delivers excellent performance in a simple package. We use this methodology to
process smart meter data in XML format, subjected to the constraints posed by a
real use case. We develop a solution that parses, validates and performs a
simple aggregation of 27 million XML files in less than 10 minutes. We present
a study of the solution as well as the benefits of its adoption. Comment: 11 pages, 5 figures, 1 table, 9 listings. Accepted after review for
the 1st Workshop on High-Performance and Reliable Big Data (HPBD 2021), which
was held virtually on September 20th 2021, and was co-located with the 40th
International Symposium on Reliable Distributed Systems (SRDS 2021)
Impact of EU duty cycle and transmission power limitations for sub-GHz LPWAN SRDs : an overview and future challenges
Long-range sub-GHz technologies such as LoRaWAN, SigFox, IEEE 802.15.4, and DASH7 are increasingly popular for academic research and daily life applications. However, especially in the European Union (EU), the use of their corresponding frequency bands is tightly regulated, since they must conform to the short-range device (SRD) regulations. Regulations and standards for SRDs exist on various levels, from global to national, but are often a source of confusion. Not only are multiple institutes responsible for drafting legislation and regulations, but these rules can also be informational or mandatory, depending on the type of document. Regulations also vary from region to region; for example, regulations in the United States of America (USA) rely on electrical field strength and harmonic strength, while EU regulations are based on duty cycle and maximum transmission power. A common misconception is that a single 1% duty cycle applies everywhere, while in fact the duty cycle is frequency band-specific and can be loosened under certain circumstances. This paper clarifies the various regulations for the European region, the parties involved in drafting and enforcing regulation, and the impact on recent technologies such as SigFox, LoRaWAN, and DASH7. Furthermore, an overview is given of potential mitigation approaches to cope with the duty cycle constraints, as well as future research directions.
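As a rough worked example of what a duty cycle limit means in practice, the Python sketch below converts a duty cycle into an airtime budget and into a per-transmission waiting time. The one-hour observation window and the helper names `airtime_budget_s` and `min_off_time_s` are assumptions made for illustration; the per-transmission back-off is one common way devices stay within the limit, not a rule quoted from the abstract.

```python
def airtime_budget_s(duty_cycle, window_s=3600.0):
    """Total transmit time allowed within one observation window.

    `window_s` defaults to one hour, which is an assumption of this sketch.
    """
    return duty_cycle * window_s


def min_off_time_s(time_on_air_s, duty_cycle):
    """Silent time needed after one transmission so that the ratio
    on / (on + off) does not exceed `duty_cycle`."""
    return time_on_air_s * (1.0 / duty_cycle - 1.0)
```

Under these assumptions, a 1% duty cycle allows 36 seconds of airtime per hour, and a 1-second transmission must be followed by 99 seconds of silence; a band with a 10% limit allows 360 seconds per hour.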
ALBUS: a Probabilistic Monitoring Algorithm to Counter Burst-Flood Attacks
Modern DDoS defense systems rely on probabilistic monitoring algorithms to
identify flows that exceed a volume threshold and should thus be penalized.
Commonly, classic sketch algorithms are considered sufficiently accurate for
usage in DDoS defense. However, as we show in this paper, these algorithms
achieve poor detection accuracy under burst-flood attacks, i.e., volumetric
DDoS attacks composed of a swarm of medium-rate sub-second traffic bursts.
Under this challenging attack pattern, traditional sketch algorithms can only
detect a high share of the attack bursts by incurring a large number of false
positives.
In this paper, we present ALBUS, a probabilistic monitoring algorithm that
overcomes the inherent limitations of previous schemes: ALBUS is highly
effective at detecting large bursts while reporting no legitimate flows, and
therefore improves on prior work regarding both recall and precision. Besides
improving accuracy, ALBUS scales to high traffic rates, which we demonstrate
with an FPGA implementation, and is suitable for programmable switches, which
we showcase with a P4 implementation. Comment: Accepted at the 42nd International Symposium on Reliable Distributed
Systems (SRDS 2023)
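For readers unfamiliar with sketch-based monitoring, the following Python sketch shows a Count-Min sketch, one well-known classic sketch algorithm, used to flag flows above a volume threshold. This is not ALBUS, and the abstract does not say which classic sketches were evaluated; the class name, parameters, and the `exceeds` helper are illustrative assumptions.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch for per-flow byte counting."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, flow_id):
        # Derive one hash function per row by prefixing the row number.
        digest = hashlib.sha256(f"{row}:{flow_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, flow_id, nbytes):
        """Add `nbytes` to the flow's counter in every row."""
        for row in range(self.depth):
            self.rows[row][self._index(row, flow_id)] += nbytes

    def estimate(self, flow_id):
        """Return the minimum counter over all rows (never an underestimate)."""
        return min(
            self.rows[row][self._index(row, flow_id)]
            for row in range(self.depth)
        )

    def exceeds(self, flow_id, threshold):
        """Flag the flow if its estimated volume reaches `threshold`."""
        return self.estimate(flow_id) >= threshold
```

Because counters are shared between flows, the estimate can only be too high, never too low. That one-sided error is why a small legitimate flow that collides with heavy traffic can be flagged as a false positive.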
Improving DBMS performance through diverse redundancy
Database replication is widely used to improve both fault tolerance and DBMS performance. Non-diverse database replication has a significant limitation: it is effective against crash failures only. Diverse redundancy is an effective mechanism for tolerating a wider range of failures, including many non-crash failures. However, it has not been adopted in practice because many see DBMS performance as the main concern. In this paper we show experimental evidence that diverse redundancy (diverse replication) can bring benefits in terms of DBMS performance, too. We report on experimental results with an optimistic architecture built with two diverse DBMSs under a load derived from the TPC-C benchmark, which show that a diverse pair performs faster not only than non-diverse pairs but also than the individual copies of the DBMSs used. This result is important because it shows potential for DBMS performance better than anything achievable with the available off-the-shelf servers.
Consistency Management Among Replicas in Peer-to-Peer Mobile Ad Hoc Networks
Recent advances in wireless communication along with the peer-to-peer (P2P) paradigm have led to increasing interest in P2P mobile ad hoc networks. In this paper, we assume an environment where each mobile peer accesses data items held by other peers which are connected by a mobile ad hoc network. Since peers' mobility causes frequent network partitions, replicas of a data item may be inconsistent due to write operations performed by mobile peers. In such an environment, the global consistency of data items is not desired by many applications. Thus, new consistency maintenance based on local conditions such as location and time needs to be investigated. This paper attempts to classify different consistency levels according to requirements from applications and provides protocols to realize them. We report simulation results to investigate the characteristics of these consistency protocols in a P2P wireless ad hoc network environment and their relationship with the quorum sizes.