Practical Target-Based Synchronization Strategies for Immutable Time-Series Data Tables
As the Internet of Things and industrial monitoring of utilities grow, efficiently synchronizing immutable time-series data streams between databases becomes a pressing issue. Extracting data from critical production databases demands careful consideration of the stress imposed on the machines, so synchronization strategies are required to minimize the transfer of duplicate data and the load imposed on remote sources.
Literature on the synchronization problem is generalized to arbitrary tables and does not consider the characteristics of time-series data streams, so research was required to investigate methods to quickly synchronize source and target time-series data tables. This thesis examines immutable time-series scenarios and synchronization strategies to answer the following question: given several scenarios, which target-based immutable time-series synchronization strategies best optimize run-time, bandwidth, and accuracy?
The strategies explored in this research are implemented in the Meerschaum system, a project intended to apply these time-series concepts in production deployments. As a practical demonstration, these strategies are used to continuously cache Clemson University’s utilities data.
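To make the idea of a target-based strategy concrete, here is a minimal Python sketch. It is not taken from the thesis or from Meerschaum: the function name `sync_append_only`, the `(timestamp, value)` row layout, and the `backtrack` parameter are all assumptions for illustration. It shows one simple approach in which the target asks only for rows at or after its own newest timestamp, which is safe when rows are immutable and arrive roughly in time order.

```python
from datetime import datetime, timedelta


def sync_append_only(source_rows, target_rows, backtrack=timedelta(0)):
    """Append to target_rows the source rows that the target is missing.

    Hypothetical helper: rows are (timestamp, value) tuples held in plain
    lists that stand in for the source and target tables.
    Returns (rows_fetched, rows_inserted).
    """
    if target_rows:
        # The target decides where to start: its newest timestamp, minus an
        # optional window that catches rows which arrived late at the source.
        begin = max(ts for ts, _ in target_rows) - backtrack
    else:
        begin = None  # empty target: fetch everything

    # Stand-in for a bounded query such as "WHERE timestamp >= begin".
    fetched = [row for row in source_rows if begin is None or row[0] >= begin]

    # Rows are immutable, so an exact duplicate can simply be skipped.
    existing = set(target_rows)
    new_rows = [row for row in fetched if row not in existing]
    target_rows.extend(new_rows)
    return len(fetched), len(new_rows)


start = datetime(2024, 1, 1)
source = [(start + timedelta(minutes=i), float(i)) for i in range(5)]
target = list(source[:3])
fetched_count, inserted_count = sync_append_only(source, target)
```

In this example the target already holds three of the five rows, so only the tail of the source is fetched. A larger `backtrack` window trades extra transferred rows for better accuracy when data arrives late.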
Reconciling Graphs and Sets of Sets
We explore a generalization of set reconciliation, where the goal is to
reconcile sets of sets. Alice and Bob each have a parent set consisting of s
child sets, each containing at most h elements from a universe of size u.
They want to reconcile their sets of sets in a scenario where the total number
of differences between all of their child sets (under the minimum difference
matching between their child sets) is d. We give several algorithms for this
problem, and discuss applications to reconciliation problems on graphs,
databases, and collections of documents. We specifically focus on graph
reconciliation, providing protocols based on set of sets reconciliation for
random graphs from G(n, p) and for forests of rooted trees.
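The quantity this abstract measures, the total number of differences under the minimum difference matching, can be illustrated with a small Python sketch. It is not one of the paper's protocols, which aim to reconcile with little communication. It is a brute-force computation that assumes both parties hold the same number of child sets, and the name `min_difference_matching` is hypothetical.

```python
from itertools import permutations


def min_difference_matching(alice, bob):
    """Return (total_differences, matching) for two equal-length lists of sets.

    matching[i] is the index of the child set in bob paired with alice[i].
    Brute force over all pairings, so only suitable for tiny examples.
    """
    if len(alice) != len(bob):
        raise ValueError("this sketch assumes equally many child sets")
    best_cost, best_perm = None, None
    for perm in permutations(range(len(bob))):
        # Cost of a pairing: sum of symmetric-difference sizes over all pairs.
        cost = sum(len(alice[i] ^ bob[j]) for i, j in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


alice_sets = [frozenset({1, 2, 3}), frozenset({4, 5}), frozenset({6})]
bob_sets = [frozenset({4, 5, 7}), frozenset({6}), frozenset({1, 2})]
total_differences, matching = min_difference_matching(alice_sets, bob_sets)
```

Here the best pairing matches {1, 2, 3} with {1, 2}, {4, 5} with {4, 5, 7}, and {6} with {6}, for a total of two differing elements.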
Key-value storage system synchronization in peer-to-peer environments
Data synchronization is the problem of bringing multiple versions of the same data on different remote devices to the most up-to-date version. This thesis looks into the particular problem of synchronizing key-value storage systems between mobile devices in a peer-to-peer environment. In this research, we describe, implement, and evaluate a new key-value storage system synchronization algorithm that uses a 2-phase approach, combining approximate synchronization in the first phase with exact synchronization in the second phase. The 2-phase architecture helps the algorithm achieve a considerable boost in performance in all three major criteria of a data synchronization algorithm, namely synchronization time, processing time, and communication cost, while still being suitable for operation in a peer-to-peer environment. The performance increase makes it feasible to employ database synchronization techniques in a wider range of mobile applications, especially those operating on a slow peer-to-peer network.
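The abstract does not describe how the two phases work, so the following Python sketch is only an assumed illustration of the general approximate-then-exact pattern, not the thesis's algorithm. It handles a one-way push from a local store to a remote store, ignores versioning and conflicts, and uses truncated SHA-256 fingerprints as the approximate step. The names `two_phase_push` and `_fingerprint` are hypothetical.

```python
import hashlib


def _fingerprint(key, value, nbytes):
    """First nbytes of a SHA-256 digest of the (key, value) pair."""
    return hashlib.sha256(repr((key, value)).encode()).digest()[:nbytes]


def two_phase_push(local, remote, short_bytes=1):
    """Copy entries of dict `local` that are missing from dict `remote`.

    Returns (entries_sent_in_phase_1, entries_sent_in_phase_2).
    """
    # Phase 1 (approximate): the remote summarizes itself with short
    # fingerprints. These are cheap to send, but collisions can hide
    # entries that are actually missing.
    remote_short = {_fingerprint(k, v, short_bytes) for k, v in remote.items()}
    phase1 = {k: v for k, v in local.items()
              if _fingerprint(k, v, short_bytes) not in remote_short}
    remote.update(phase1)

    # Phase 2 (exact): full-length digests catch whatever phase 1 missed.
    remote_full = {_fingerprint(k, v, 32) for k, v in remote.items()}
    phase2 = {k: v for k, v in local.items()
              if k not in phase1 and _fingerprint(k, v, 32) not in remote_full}
    remote.update(phase2)
    return len(phase1), len(phase2)


local_store = {f"k{i}": i for i in range(200)}
remote_store = {f"k{i}": i for i in range(100)}
sent_phase1, sent_phase2 = two_phase_push(local_store, remote_store)
```

Shorter fingerprints make the first phase cheaper but leave more work for the second. A real system would also need to decide which side holds the newer version of a key, which this sketch leaves out.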