    Characterization of P2P Systems

    Understanding existing systems and devising new P2P techniques rely on having access to representative models derived from empirical observations of existing systems. However, the large and dynamic nature of P2P systems makes capturing accurate measurements challenging. Because there is no central repository, data must be …

    Residual-Based Measurement of Peer and Link Lifetimes in Gnutella Networks

    Existing methods of measuring lifetimes in P2P systems usually rely on the so-called create-based method (CBM), which divides a given observation window into two halves and samples users created in the first half every Δ time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that a small window size or a large Δ may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent tradeoff between overhead and accuracy, which does not allow any fundamental improvement to the method. Instead, we propose a completely different approach for sampling user dynamics that keeps track of only the residual lifetimes of peers and uses a simple renewal-process model to recover the actual lifetimes from the observed residuals. Our analysis indicates that for reasonably large systems, the proposed method can reduce bandwidth consumption by several orders of magnitude compared to prior approaches while simultaneously achieving higher accuracy. We finish the paper by implementing a two-tier Gnutella network crawler equipped with the proposed sampling method and obtain the distribution of ultrapeer lifetimes in a network of 6.4 million users and 60 million links. Our experimental results show that ultrapeer lifetimes are Pareto with shape α ≈ 1.1; however, link lifetimes exhibit much lighter tails with α ≈ 1.9.
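
    The recovery step at the heart of this residual approach follows from renewal theory: for a stationary renewal process with lifetime distribution F and mean E[T], the residual-lifetime density satisfies f_R(x) = (1 - F(x)) / E[T], which gives F(x) = 1 - f_R(x) / f_R(0). Below is a minimal Python sketch of that identity; the histogram-based density estimate and the exponential test data are illustrative assumptions, not the paper's crawler pipeline.

```python
import numpy as np

def lifetime_cdf_from_residuals(residuals, bins=50):
    """Recover the lifetime CDF F(x) from observed residual lifetimes.

    Stationary renewal identity: f_R(x) = (1 - F(x)) / E[T],
    hence F(x) = 1 - f_R(x) / f_R(0).
    """
    density, edges = np.histogram(residuals, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    cdf = 1.0 - density / density[0]   # density[0] estimates f_R(0)
    return centers, np.clip(cdf, 0.0, 1.0)

# Sanity check with exponential data: by memorylessness, residuals of
# Exp(lam) lifetimes are again Exp(lam), so the recovered CDF should
# approach 1 - exp(-lam * x).
rng = np.random.default_rng(0)
lam = 0.5
residuals = rng.exponential(1.0 / lam, 100_000)
x, F = lifetime_cdf_from_residuals(residuals)
```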

    Residual-Based Estimation of Peer and Link Lifetimes in P2P Networks

    Existing methods of measuring lifetimes in P2P systems usually rely on the so-called Create-Based Method (CBM), which divides a given observation window into two halves and samples users "created" in the first half every Δ time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that a small window size or a large Δ may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent tradeoff between overhead and accuracy, which does not allow any fundamental improvement to the method. Instead, we propose a completely different approach for sampling user dynamics that keeps track of only the residual lifetimes of peers and uses a simple renewal-process model to recover the actual lifetimes from the observed residuals. Our analysis indicates that for reasonably large systems, the proposed method can reduce bandwidth consumption by several orders of magnitude compared to prior approaches while simultaneously achieving higher accuracy. We finish the paper by implementing a two-tier Gnutella network crawler equipped with the proposed sampling method and obtain the distribution of ultrapeer lifetimes in a network of 6.4 million users and 60 million links. Our experimental results show that ultrapeer lifetimes are Pareto with shape α ≈ 1.1; however, link lifetimes exhibit much lighter tails with α ≈ 1.8.
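
    A tail claim like "Pareto with shape α ≈ 1.1" is commonly checked with a tail-index estimator; the Hill estimator below is one standard choice, shown purely as an illustration rather than the paper's own fitting procedure.

```python
import numpy as np

def hill_estimator(samples, k):
    """Hill estimator of the Pareto tail index alpha from the k+1
    largest order statistics X_(1) >= ... >= X_(k+1)."""
    top = np.sort(samples)[::-1][: k + 1]      # descending order statistics
    return k / np.sum(np.log(top[:k]) - np.log(top[k]))

rng = np.random.default_rng(1)
lifetimes = rng.pareto(1.1, 500_000) + 1.0     # classical Pareto, shape 1.1
print(hill_estimator(lifetimes, k=5_000))      # prints a value close to 1.1
```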

    Respondent-Driven Sampling for Characterizing Unstructured Overlays

    Resource location based on precomputed partial random walks in dynamic networks

    The problem of finding a resource residing in a network node (the resource location problem) is a challenge in complex networks due to aspects such as network size, unknown network topology, and network dynamics. The problem is especially difficult if no requirements on the resource placement strategy or the network structure are to be imposed, assuming of course that keeping centralized resource information is not feasible or appropriate. Under these conditions, random algorithms are useful for searching the network. A possible strategy for static networks, proposed in previous work, uses short random walks precomputed at each network node as partial walks to construct longer random walks with associated resource information. In this work, we adapt the previous mechanisms to dynamic networks, where resource instances may appear in, and disappear from, network nodes, and the nodes themselves may leave and join the network, resembling realistic scenarios. We analyze the resulting resource location mechanisms, providing expressions that accurately predict average search lengths, which are validated using simulation experiments. Reductions in average search length compared to simple random-walk searches are found to be very large, even in the face of high network volatility. We also study the cost of the mechanisms, focusing on the overhead implied by the periodic recomputation of partial walks to refresh the information on resources, concluding that the proposed mechanisms behave efficiently and robustly in dynamic networks.
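
    A compact sketch of the mechanism under simplifying assumptions (the graph is treated as static between recomputations, every node has at least one neighbor, and the Graph container and all function names are hypothetical, not the paper's API):

```python
import random
from collections import namedtuple

# neighbors: {node: [neighbor, ...]};  resources: {node: {resource, ...}}
Graph = namedtuple("Graph", ["neighbors", "resources"])

def partial_walk(g, start, length):
    """Precompute one short random walk plus the set of resources it saw."""
    node, walk, seen = start, [start], set(g.resources.get(start, ()))
    for _ in range(length):
        node = random.choice(g.neighbors[node])
        walk.append(node)
        seen |= set(g.resources.get(node, ()))
    return walk, seen

def precompute_partials(g, length):
    """One partial walk per node; in a dynamic network these would be
    recomputed periodically to refresh the resource information."""
    return {v: partial_walk(g, v, length) for v in g.neighbors}

def locate(g, partials, start, resource, max_hops):
    """Chain precomputed partial walks into one long random walk, stopping
    as soon as a partial walk is known to have seen the resource."""
    node, hops = start, 0
    while hops < max_hops:
        walk, seen = partials[node]
        if resource in seen:
            return walk                # the resource lies on this segment
        node = walk[-1]                # continue from the segment's endpoint
        hops += len(walk) - 1
    return None
```

    Chaining precomputed segments amortizes the per-hop cost of a long walk, which is why the abstract focuses its cost analysis on the periodic recomputation of the partial walks.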

    Robust and Scalable Sampling Algorithms for Network Measurement

    Recent growth of the Internet in both scale and complexity has imposed a number of difficult challenges on existing measurement techniques and approaches, which are essential for both network management and many ongoing research projects. For any measurement algorithm, achieving both accuracy and scalability is very challenging given hard resource constraints (e.g., bandwidth, delay, physical memory, and CPU speed). My dissertation research tackles this problem by first proposing a novel mechanism called residual sampling, which intentionally introduces a predetermined amount of bias into the measurement process. We show that such biased sampling can be extremely scalable; moreover, we develop residual estimation algorithms that can unbiasedly recover the original information from the sampled data. Utilizing these results, we further develop two versions of the residual sampling mechanism: a continuous version for characterizing the user lifetime distribution in large-scale peer-to-peer networks and a discrete version for monitoring flow statistics (including per-flow counts and the flow size distribution) in high-speed Internet routers. For the former application in P2P networks, this work presents two methods: the ResIDual-based Estimator (RIDE), which takes single-point snapshots of the system and assumes stationary arrivals, and Uniform RIDE (U-RIDE), which takes multiple snapshots and adapts to systems with arbitrary (including non-stationary) arrival processes. For the latter application in traffic monitoring, we introduce Discrete RIDE (D-RIDE), which allows one to sample each flow with a geometric random variable. Our numerous simulations and experiments with P2P networks and real Internet traces confirm that these algorithms produce accurate estimates of the monitored metrics while meeting hard resource constraints. These results show that residual sampling indeed provides an ideal solution to balancing accuracy and scalability.
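
    The discrete variant can be illustrated with a geometric-skip sketch: each packet of an as-yet-untracked flow starts an exact counter with probability p, so the number of packets skipped before counting begins is geometric, and adding back its mean, (1 - p)/p, yields a size estimate that is unbiased when unsampled flows count as zero. The code below illustrates that idea only; it is not the dissertation's exact D-RIDE algorithm.

```python
import random

def sample_flows(packet_stream, p):
    """Each packet of an untracked flow starts exact counting with
    probability p; once a flow is tracked, every packet is counted."""
    counts = {}
    for flow_id in packet_stream:        # stream of per-packet flow IDs
        if flow_id in counts:
            counts[flow_id] += 1
        elif random.random() < p:
            counts[flow_id] = 1          # the trigger packet is counted
    return counts

def estimate_sizes(counts, p):
    """Add the mean geometric number of skipped packets, (1 - p) / p,
    to each exact count; averaged over all flows (unsampled ones
    contributing zero), the estimate of a flow's size is unbiased."""
    return {f: c + (1.0 - p) / p for f, c in counts.items()}

random.seed(0)
stream = ["a"] * 1_000 + ["b"] * 10      # two toy flows
print(estimate_sizes(sample_flows(stream, p=0.05), p=0.05))
```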