
    Synchronization of data in heterogeneous decentralized systems

    Data synchronization is the problem of reconciling the differences between large data stores that differ in a small number of records. It is a common thread among disparate distributed systems, ranging from fleets of Internet of Things (IoT) devices to clusters of distributed databases in the cloud. Most recently, data synchronization has arisen in globally distributed public blockchains that form the basis for the envisioned decentralized Internet of the future. Moreover, the parallel development of edge computing has significantly increased the heterogeneity of networks and computing devices. The merger of highly heterogeneous system resources with the decentralized nature of future Internet applications calls for a new approach to data synchronization. In this dissertation, we look at the problem of data synchronization through the prism of set reconciliation and introduce novel tools and protocols that improve the performance of data synchronization in heterogeneous decentralized systems. First, we compare the analytical properties of state-of-the-art set reconciliation protocols and investigate the impact of theoretical assumptions and implementation decisions on synchronization performance. Second, we introduce GenSync, the first unified set reconciliation middleware. Using GenSync's distinctive benchmarking layer, we find that the best protocol choice is highly sensitive to system conditions, and that a poor protocol choice causes a severe performance hit. We showcase the evaluative power of GenSync in one of the world's largest wireless network emulators and demonstrate choosing the best GenSync protocol under high and low user mobility in an emulated cellular network. Finally, we introduce SREP (Set Reconciliation-Enhanced Propagation), a novel blockchain transaction pool synchronization protocol with quantifiable guarantees. Through simulations, we show that SREP incurs significantly smaller bandwidth overhead than a similar approach from the literature, especially in networks of realistic size (tens of thousands of participants).
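
    To make the set reconciliation framing concrete, the following minimal Python sketch (illustrative only; it is not GenSync's API, and the naive_reconcile helper is hypothetical) shows the problem these protocols address: two parties whose sets differ in a few records want to converge on the union while, ideally, exchanging data proportional to the symmetric difference rather than to the full set size.

        # Naive baseline, assuming both sets are visible locally; real protocols
        # run over a network and aim to transfer only about |local ^ remote| data.
        def naive_reconcile(local: set, remote: set) -> tuple[set, set]:
            to_send = local - remote      # records the remote side is missing
            to_request = remote - local   # records the local side is missing
            return to_send, to_request

        # Two transaction pools of six records that differ in only two.
        a = {"tx1", "tx2", "tx3", "tx4", "tx5", "tx6"}
        b = {"tx1", "tx2", "tx3", "tx4", "tx5", "tx7"}
        send, request = naive_reconcile(a, b)
        print(send, request)  # {'tx6'} {'tx7'} -> both sides reach the union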

    Range-Based Set Reconciliation and Authenticated Set Representations

    Range-based set reconciliation is a simple approach to efficiently computing the union of two sets over a network, based on recursively partitioning the sets and comparing fingerprints of the partitions to probabilistically detect whether a partition requires further work. Whereas prior presentations of this approach focus on specific fingerprinting schemes for specific use cases, we give a more generic description and analysis in the broader context of set reconciliation. Precisely capturing the design space of fingerprinting schemes allows us to survey it for cryptographically secure schemes. Furthermore, we reduce the time complexity of local computations by a logarithmic factor compared to previous publications. In investigating secure associative hash functions, we open up a new class of tree-based authenticated data structures that need not prescribe a deterministic balancing scheme.
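
    The recursive fingerprint-and-split idea can be illustrated with a short Python sketch. This is a toy, not the paper's construction: the XOR-of-hashes fingerprint below is associative but deliberately not one of the cryptographically secure schemes the paper surveys, and both sets are held locally rather than on two network peers.

        import hashlib

        def fp(items) -> int:
            # Toy fingerprint: XOR of per-item hashes. XOR lets a range's
            # fingerprint be combined from its sub-ranges, but it is not secure.
            acc = 0
            for x in items:
                acc ^= int.from_bytes(hashlib.sha256(str(x).encode()).digest()[:8], "big")
            return acc

        def reconcile(lo: int, hi: int, alice: set, bob: set, diff: set):
            # Compare fingerprints of the items in [lo, hi); if they differ,
            # split the range in half and recurse, exchanging items directly
            # once the range is small.
            a = {x for x in alice if lo <= x < hi}
            b = {x for x in bob if lo <= x < hi}
            if fp(a) == fp(b):
                return                    # probabilistically equal: stop here
            if hi - lo <= 2:              # small range: send the items themselves
                diff |= a ^ b
                return
            mid = (lo + hi) // 2
            reconcile(lo, mid, alice, bob, diff)
            reconcile(mid, hi, alice, bob, diff)

        alice = {3, 7, 12, 25, 40}
        bob   = {3, 7, 13, 25, 40}
        diff: set = set()
        reconcile(0, 64, alice, bob, diff)
        print(diff)  # {12, 13}: only the differing partition is resolved element-wise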

    Probabilistic Data Structures in Adversarial Environments

    Probabilistic data structures use space-efficient representations of data in order to (approximately) respond to queries about the data. Traditionally, these structures are accompanied by probabilistic bounds on query-response errors. These bounds implicitly assume benign attack models, in which the data and the queries are chosen non-adaptively and independently of the randomness used to construct the representation. Yet probabilistic data structures are increasingly used in settings where these assumptions may be violated. This work provides a provable-security treatment of probabilistic data structures in adversarial environments. We give a syntax that captures a wide variety of in-use structures, and our security notions support derivation of error bounds in the presence of powerful attacks. We use our formalisms to analyze Bloom filters, counting (Bloom) filters, and count-min sketch data structures. For the traditional versions of these, our security findings are largely negative; however, we show that simple embellishments (e.g., using salts or secret keys) yield structures that provide provable security with little overhead.
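
    As an illustration of the kind of embellishment discussed above, the Python sketch below keys a plain Bloom filter's hash functions with a secret value via HMAC. This is a hedged example: the class name KeyedBloomFilter and its parameters are ours, and the paper's actual constructions and proofs differ in the details.

        import hmac, hashlib, secrets

        class KeyedBloomFilter:
            # Plain Bloom filter whose hash positions are derived from a secret
            # key, so an adversary who does not know the key cannot precompute
            # inputs that collide into the same bit positions.
            def __init__(self, m: int, k: int, key: bytes | None = None):
                self.m, self.k = m, k
                self.key = key if key is not None else secrets.token_bytes(16)
                self.bits = bytearray(m)

            def _positions(self, item: bytes):
                for i in range(self.k):
                    digest = hmac.new(self.key, bytes([i]) + item, hashlib.sha256).digest()
                    yield int.from_bytes(digest[:8], "big") % self.m

            def add(self, item: bytes):
                for p in self._positions(item):
                    self.bits[p] = 1

            def query(self, item: bytes) -> bool:
                # False positives are still possible, but no longer adversarially
                # forceable without knowledge of the key.
                return all(self.bits[p] for p in self._positions(item))

        bf = KeyedBloomFilter(m=1024, k=4)
        bf.add(b"flow-1")
        print(bf.query(b"flow-1"), bf.query(b"flow-2"))  # True False (with high probability)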

    Avoiding Flow Size Overestimation in the Count-Min Sketch with Bloom Filter Constructions

    The Count-Min sketch is the most popular data structure for flow size estimation, a basic measurement task required in many networks. Typically, the number of potential flows is large, making it infeasible to maintain a counter per flow in memory with a high access rate. The Count-Min sketch is probabilistic and relies on mapping each flow to multiple counters through hashing. This implies a potential estimation error: the size of a flow is overestimated when all of its counters are shared with other flows that have observed traffic. Although the estimation error can be probabilistically bounded, many applications benefit from accurate flow size estimation and a guarantee that overestimation is completely avoided. We describe a design of the Count-Min sketch with accurate estimations whenever the number of flows with observed traffic is within a known bound, regardless of the identity of these particular flows. We make use of Bloom filters that avoid false positives and point out the limitations of existing Bloom filter designs for accurate size estimation. We suggest new Bloom filter constructions that scale to support a larger number of flows and explain how they imply the unique guarantee of accurate flow size estimation in the well-known Count-Min sketch. Ori Rottenstreich was partially supported by the German-Israeli Foundation for Scientific Research and Development (GIF), by the Gordon Fund for System Engineering, as well as by the Technion Hiroshi Fujiwara Cyber Security Research Center and the Israel National Cyber Directorate. Pedro Reviriego would like to acknowledge the support of the ACHILLES project PID2019-104207RB-I00 and the Go2Edge network RED2018-102585-T, funded by the Spanish Ministry of Science and Innovation, and of the Madrid Community research project TAPIR-CM, grant no. P2018/TCS-4496.
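
    For reference, the following Python sketch shows the textbook Count-Min sketch and why it can only overestimate: flows share counters, and the reported size is the minimum over a flow's counters. The sketch is illustrative only; it does not reproduce the paper's Bloom-filter-augmented design that rules out overestimation under a bound on the number of active flows.

        import hashlib

        class CountMinSketch:
            # Textbook Count-Min sketch: each flow maps to one counter per row,
            # and the estimate is the minimum over those counters. Shared
            # counters mean the estimate is always >= the true count.
            def __init__(self, width: int, depth: int):
                self.width, self.depth = width, depth
                self.table = [[0] * width for _ in range(depth)]

            def _index(self, row: int, flow: str) -> int:
                h = hashlib.sha256(f"{row}:{flow}".encode()).digest()
                return int.from_bytes(h[:8], "big") % self.width

            def update(self, flow: str, count: int = 1):
                for row in range(self.depth):
                    self.table[row][self._index(row, flow)] += count

            def estimate(self, flow: str) -> int:
                # Exact only if at least one of the flow's counters is unshared.
                return min(self.table[row][self._index(row, flow)]
                           for row in range(self.depth))

        cms = CountMinSketch(width=64, depth=4)
        for pkt in ["f1"] * 10 + ["f2"] * 3:
            cms.update(pkt)
        print(cms.estimate("f1"), cms.estimate("f2"))  # 10 3 (never underestimates)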

    Bloom Filter with a False Positive Free Zone
