
    Synchronization of data in heterogeneous decentralized systems

    Data synchronization is the problem of reconciling the differences between large data stores that differ in a small number of records. It is a common thread among disparate distributed systems ranging from fleets of Internet of Things (IoT) devices to clusters of distributed databases in the cloud. Most recently, data synchronization has arisen in globally distributed public blockchains that form the basis for the envisioned decentralized Internet of the future. Moreover, the parallel development of edge computing has significantly increased the heterogeneity of networks and computing devices. The merger of highly heterogeneous system resources and the decentralized nature of future Internet applications calls for a new approach to data synchronization. In this dissertation, we look at the problem of data synchronization through the prism of set reconciliation and introduce novel tools and protocols that improve the performance of data synchronization in heterogeneous decentralized systems. First, we compare the analytical properties of state-of-the-art set reconciliation protocols and investigate the impact of theoretical assumptions and implementation decisions on synchronization performance. Second, we introduce GenSync, the first unified set reconciliation middleware. Using GenSync's distinctive benchmarking layer, we find that the best protocol choice is highly sensitive to the system conditions, and that a poor protocol choice severely degrades performance. We showcase the evaluative power of GenSync in one of the world's largest wireless network emulators, and demonstrate how to choose the best GenSync protocol under both high and low user mobility in an emulated cellular network. Finally, we introduce SREP (Set Reconciliation-Enhanced Propagation), a novel blockchain transaction pool synchronization protocol with quantifiable guarantees.
Through simulations, we show that SREP incurs significantly smaller bandwidth overhead than a similar approach from the literature, especially in networks of realistic size (tens of thousands of participants).