Residual-Based Estimation of Peer and Link Lifetimes in P2P Networks
Existing methods of measuring lifetimes in P2P systems usually rely on the so-called Create-Based Method (CBM), which divides a given observation window into two halves and samples users "created" in the first half every Δ time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that small window size or large Δ may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent tradeoff between overhead and accuracy, which does not allow any fundamental improvement to the method. Instead, we propose a completely different approach for sampling user dynamics that keeps track of only residual lifetimes of peers and uses a simple renewal-process model to recover the actual lifetimes from the observed residuals. Our analysis indicates that for reasonably large systems, the proposed method can reduce bandwidth consumption by several orders of magnitude compared to prior approaches while simultaneously achieving higher accuracy. We finish the paper by implementing a two-tier Gnutella network crawler equipped with the proposed sampling method and obtain the distribution of ultrapeer lifetimes in a network of 6.4 million users and 60 million links. Our experimental results show that ultrapeer lifetimes are Pareto with shape α ≈ 1.1; however, link lifetimes exhibit much lighter tails with α ≈ 1.8.
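The idea of recovering lifetimes from residuals can be illustrated with the standard renewal-theory relation h(x) = (1 − F(x)) / E[L], where F is the lifetime distribution and h is the density of residual lifetimes; it follows that 1 − F(x) = h(x) / h(0). The Python sketch below is only an illustration under stated assumptions, not the estimator from the paper: it assumes shifted-Pareto (Lomax) lifetimes, uses hypothetical parameters (alpha = 3, beta = 1) chosen so that a plain histogram behaves well, and all function names are made up for this example.

```python
import random


def sample_residuals(n, alpha, beta, rng):
    """Draw n residual lifetimes for Lomax(alpha, beta) lifetimes.

    Assumption: lifetimes have tail (1 + x/beta)^(-alpha) with alpha > 1,
    so the stationary residual is again Lomax with shape alpha - 1.
    Sampling uses inverse-transform on that residual distribution.
    """
    shape = alpha - 1.0
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [beta * ((1.0 - rng.random()) ** (-1.0 / shape) - 1.0) for _ in range(n)]


def recover_tail(residuals, bin_width, num_bins):
    """Estimate the lifetime tail from a histogram of residuals.

    Since h(x) is proportional to 1 - F(x), the ratio of the count in bin k
    to the count in bin 0 estimates (1 - F(x_k)) / (1 - F(x_0)), where x_k is
    the centre of bin k. Returns a list of (x_k, estimated ratio) pairs.
    """
    counts = [0] * num_bins
    for r in residuals:
        k = int(r / bin_width)
        if k < num_bins:
            counts[k] += 1
    if counts[0] == 0:
        raise ValueError("no residuals fell into the first bin")
    centers = [(k + 0.5) * bin_width for k in range(num_bins)]
    return [(x, c / counts[0]) for x, c in zip(centers, counts)]


# Hypothetical parameters, chosen only for illustration.
ALPHA, BETA = 3.0, 1.0
rng = random.Random(1)
residuals = sample_residuals(200_000, ALPHA, BETA, rng)
tail = recover_tail(residuals, bin_width=0.1, num_bins=20)

for x, estimate in tail[::5]:
    exact = ((1.0 + x / BETA) / (1.0 + tail[0][0] / BETA)) ** (-ALPHA)
    print(f"x={x:.2f}  estimated={estimate:.4f}  exact={exact:.4f}")
```

The estimate is expressed relative to the first bin centre rather than to x = 0, which sidesteps extrapolating the histogram to the origin. For very heavy tails such as the shape values reported in the abstract, a fixed-width histogram would likely be a poor choice and a different density estimator would be needed.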
Residual-Based Measurement of Peer and Link Lifetimes in Gnutella Networks
Existing methods of measuring lifetimes in P2P systems usually rely on the so-called create-based method (CBM), which divides a given observation window into two halves and samples users created in the first half every Δ time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that small window size or large Δ may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent tradeoff between overhead and accuracy, which does not allow any fundamental improvement to the method. Instead, we propose a completely different approach for sampling user dynamics that keeps track of only residual lifetimes of peers and uses a simple renewal-process model to recover the actual lifetimes from the observed residuals. Our analysis indicates that for reasonably large systems, the proposed method can reduce bandwidth consumption by several orders of magnitude compared to prior approaches while simultaneously achieving higher accuracy. We finish the paper by implementing a two-tier Gnutella network crawler equipped with the proposed sampling method and obtain the distribution of ultrapeer lifetimes in a network of 6.4 million users and 60 million links. Our experimental results show that ultrapeer lifetimes are Pareto with shape α ≈ 1.1; however, link lifetimes exhibit much lighter tails with α ≈ 1.9.
Node Isolation Model and Age-Based Neighbor Selection in Unstructured P2P Networks
Previous analytical studies of unstructured P2P resilience have assumed exponential user lifetimes and only considered age-independent neighbor replacement. In this paper, we overcome these limitations by introducing a general node-isolation model for heavy-tailed user lifetimes and arbitrary neighbor-selection algorithms. Using this model, we analyze two age-biased neighbor-selection strategies and show that they significantly improve the residual lifetimes of chosen users, which dramatically reduces the probability of user isolation and graph partitioning compared with uniform selection of neighbors. In fact, the second strategy based on random walks on age-proportional graphs demonstrates that, for lifetimes with infinite variance, the system monotonically increases its resilience as its age and size grow. Specifically, we show that the probability of isolation converges to zero as these two metrics tend to infinity. We finish the paper with simulations in finite-size graphs that demonstrate the effect of this result in practice.
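Why age-biased selection helps under heavy-tailed lifetimes can be seen from a small calculation. For shifted-Pareto (Lomax) lifetimes with shape alpha > 1 and scale beta, the mean residual lifetime of a user of age a is (beta + a) / (alpha − 1), so older users are expected to stay longer. The Python sketch below compares uniform selection with selection proportional to age. It is a simplified illustration, not either of the strategies analyzed in the paper: it picks neighbors directly by age weight instead of by a random walk, and the ages, parameters, and function names are hypothetical.

```python
def mean_residual_lomax(age, alpha, beta):
    """Mean residual lifetime E[L - age | L > age] for Lomax(alpha, beta).

    Assumption: lifetime tail is (1 + x/beta)^(-alpha) with alpha > 1,
    which gives the closed form (beta + age) / (alpha - 1).
    """
    if alpha <= 1.0:
        raise ValueError("mean residual lifetime is infinite for alpha <= 1")
    return (beta + age) / (alpha - 1.0)


def expected_residual(ages, weights, alpha, beta):
    """Expected residual lifetime of a neighbor picked with the given weights."""
    total = sum(weights)
    return sum(
        w * mean_residual_lomax(a, alpha, beta) for a, w in zip(ages, weights)
    ) / total


# Hypothetical ages (in hours) of the users currently alive.
ages = [0.5, 1.0, 2.0, 8.0, 24.0]
ALPHA, BETA = 3.0, 1.0  # illustrative parameters, not taken from the paper

uniform = expected_residual(ages, [1.0] * len(ages), ALPHA, BETA)
age_biased = expected_residual(ages, ages, ALPHA, BETA)

print(f"uniform selection:          {uniform:.2f} hours")
print(f"age-proportional selection: {age_biased:.2f} hours")
```

Because the mean residual lifetime grows with age in this model, weighting the choice by age can only raise the expected residual lifetime of the selected neighbor relative to uniform selection.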
On Node Isolation under Churn in Unstructured P2P Networks with Heavy-Tailed Lifetimes
Previous analytical studies [12], [18] of unstructured P2P resilience have assumed exponential user lifetimes and only considered age-independent neighbor replacement. In this paper, we overcome these limitations by introducing a general node-isolation model for heavy-tailed user lifetimes and arbitrary neighbor-selection algorithms. Using this model, we analyze two age-biased neighbor-selection strategies and show that they significantly improve the residual lifetimes of chosen users, which dramatically reduces the probability of user isolation and graph partitioning compared to uniform selection of neighbors. In fact, the second strategy based on random walks on age-weighted graphs demonstrates that for lifetimes with infinite variance, the system monotonically increases its resilience as its age and size grow. Specifically, we show that the probability of isolation converges to zero as these two metrics tend to infinity. We finish the paper with simulations in finite-size graphs that demonstrate the effect of this result in practice.
Understanding Churn in Decentralized Peer-to-Peer Networks
This dissertation presents a novel modeling framework for understanding the dynamics
of peer-to-peer (P2P) networks under churn (i.e., random user arrival/departure)
and designing systems more resilient against node failure. The proposed models are
applicable to general distributed systems under a variety of conditions on graph construction
and user lifetimes.
The foundation of this work is a new churn model that describes user arrival and
departure as a superposition of many periodic (renewal) processes. It not only allows
general (non-exponential) user lifetime distributions, but also captures heterogeneous
behavior of peers. We utilize this model to analyze link dynamics and the ability
of the system to stay connected under churn. Our results offer exact computation
of user-isolation and graph-partitioning probabilities for any monotone lifetime distribution,
including heavy-tailed cases found in real systems. We also propose an
age-proportional random-walk algorithm for creating links in unstructured P2P networks
that achieves zero isolation probability as system size becomes infinite. We
additionally obtain many insightful results on the transient distribution of in-degree,
edge arrival process, system size, and lifetimes of live users as simple functions of the
aggregate lifetime distribution.
The second half of this work studies churn in structured P2P networks that are
usually built upon distributed hash tables (DHTs). Users in DHTs maintain two
types of neighbor sets: routing tables and successor/leaf sets. The former tables determine
link lifetimes and routing performance of the system, while the latter are built for
ensuring DHT consistency and connectivity. Our first result in this area proves that
robustness of DHTs is mainly determined by zone size of selected neighbors, which
leads us to propose a min-zone algorithm that significantly reduces link churn in
DHTs. Our second result uses the Chen-Stein method to understand concurrent
failures among strongly dependent successor sets of many DHTs and finds an optimal
stabilization strategy for keeping Chord connected under churn.
On Static and Dynamic Partitioning Behavior of Large-Scale Networks
In this paper, we analyze the problem of network disconnection in the context of large-scale P2P networks and understand how both static and dynamic patterns of node failure affect the resilience of such graphs. We start by applying classical results from random graph theory to show that a large variety of deterministic and random P2P graphs almost surely (i.e., with probability 1 − o(1)) remain connected under random failure if and only if they have no isolated nodes. This simple, yet powerful, result subsequently allows us to derive in closed form the probability that a P2P network develops isolated nodes, and therefore partitions, under both types of node failure. We finish the paper by demonstrating that our models match simulations very well and that dynamic P2P systems are extremely resilient under node churn as long as the neighbor replacement delay is much smaller than the average user lifetime.