    On sample-path staleness in lazy data replication

    We analyze synchronization issues arising between two stochastic point processes, one of which models data churn at an information source and the other periodic downloads by its replica (e.g., search engine, web cache, distributed database). Due to lazy (pull-based) synchronization, the replica experiences recurrent staleness, which translates into a penalty stemming from its reduced ability to perform consistent computation and/or provide up-to-date responses to customer requests. We model this system under non-Poisson update/refresh processes and obtain sample-path averages of various metrics of staleness cost, generalizing previous results and exposing novel problems in this field.

    Distributed Synchronization Under Data Churn

    A growing number of applications maintain local copies of remote data sources to provide services to their users. Because the sources are dynamic, an application must repeatedly synchronize its copies with them to provide reliable service. Rather than push-based synchronization, we focus on the pull-based strategy because it does not require source cooperation and has been widely adopted by existing systems. The scalability of pull-based synchronization comes at the expense of increased inconsistency of the copied content. We model this system under non-Poisson update/refresh processes and obtain sample-path averages of various metrics of staleness cost, generalizing previous results and studying their statistical properties. Computing staleness requires knowledge of the inter-update distribution at the source, which can only be estimated through blind sampling, that is, periodic downloads and comparison against previous copies. We show that all previous approaches are biased unless the observation rate tends to infinity or the update process is Poisson. To overcome these issues, we propose four new algorithms that achieve various levels of consistency, depending on the amount of temporal information revealed by the source and the capabilities of the download process. We then apply these freshness results to P2P replication systems, extending them to several more difficult algorithms: cascaded replication, cooperative caching, and redundant querying from the clients. Surprisingly, we find that optimal cooperation involves just a single peer and that redundant querying can hurt the ability of the system to handle load (i.e., may lead to lower scalability).
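
    As a rough illustration of the sample-path view of staleness described above, the following Python sketch computes the fraction of time a pull-based replica is stale, given one realization of source update times and replica refresh times. It is only a minimal sketch under stated assumptions, not the method of either work: the function name `stale_fraction`, the convention that the replica is fresh at time 0, and the convention that a refresh captures an update occurring at the same instant are all hypothetical choices made for this example.

```python
from bisect import bisect_right


def stale_fraction(update_times, refresh_times, horizon):
    """Fraction of [0, horizon] during which the replica is stale.

    Assumptions (hypothetical, for illustration only):
      * the replica holds a fresh copy at time 0;
      * a refresh at time r captures every update at or before r;
      * the replica becomes stale at the first update after its most
        recent refresh and stays stale until the next refresh.
    """
    updates = sorted(t for t in update_times if 0 <= t <= horizon)
    refreshes = sorted(t for t in refresh_times if 0 < t < horizon)

    # Boundaries of the intervals between consecutive synchronizations.
    boundaries = [0.0] + refreshes + [float(horizon)]

    stale_time = 0.0
    for start, end in zip(boundaries, boundaries[1:]):
        # Index of the first update strictly after the refresh at `start`.
        i = bisect_right(updates, start)
        if i < len(updates) and updates[i] < end:
            # Stale from that update until the next refresh (or horizon).
            stale_time += end - updates[i]

    return stale_time / horizon


if __name__ == "__main__":
    updates = [1.0, 2.5, 7.0]          # one sample path of source updates
    refreshes = [2.0, 4.0, 6.0, 8.0]   # periodic downloads every 2 time units
    print(stale_fraction(updates, refreshes, horizon=10.0))
```

    Because the function works on explicit event times rather than on rate parameters, it makes no Poisson assumption; any update or refresh pattern can be passed in.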