
    Characterizing the query behavior in peer-to-peer file sharing systems

    This paper characterizes the query behavior of peers in a peer-to-peer (P2P) file sharing system. In contrast to previous work, which provides various aggregate workload statistics, we characterize peer behavior in a form that can be used for constructing representative synthetic workloads for evaluating new P2P system designs. In particular, the analysis exposes heterogeneous behavior that occurs on different days, in different geographical regions (i.e., Asia, Europe, and North America) or during different periods of the day. The workload measures include the fraction of connected sessions that are passive (i.e., issue no queries), the duration of such sessions, and for each active session, the number of queries issued, time until first query, query interarrival time, time after last query, and distribution of query popularity. Moreover, the key correlations in these workload measures are captured in the form of conditional distributions, such that the correlations can be accurately reproduced in a synthetic workload. The characterization is based on trace data gathered in the Gnutella P2P system over a period of 40 days. To characterize system-independent user behavior, we eliminate queries that are specific to the Gnutella system software, such as re-queries that are automatically issued by some client implementations to improve system responsiveness.
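The workload measures listed in this abstract can be read as a recipe for a synthetic session generator. The following is only a minimal sketch of that idea, not the paper's model: the exponential distributions, every numeric parameter, and the names `PASSIVE_FRACTION`, `Session`, and `sample_session` are hypothetical placeholders chosen for illustration. The one structural point it tries to show is a conditional distribution: the query interarrival time is drawn from a distribution that depends on the number of queries in the session.

```python
import random
from dataclasses import dataclass, field

# Hypothetical placeholder values, NOT measurements from the paper.
PASSIVE_FRACTION = {"Asia": 0.80, "Europe": 0.75, "North America": 0.85}


@dataclass
class Session:
    region: str
    passive: bool
    duration: float                      # seconds
    query_times: list = field(default_factory=list)


def sample_session(region, rng):
    """Draw one synthetic session for the given region."""
    # Passive sessions issue no queries and only have a duration.
    if rng.random() < PASSIVE_FRACTION[region]:
        return Session(region, True, rng.expovariate(1 / 600.0))

    # Active session: number of queries (at least one).
    n_queries = 1 + int(rng.expovariate(1 / 3.0))
    # Time until the first query.
    t = rng.expovariate(1 / 60.0)
    times = [t]
    # Conditional distribution: the mean interarrival time depends on
    # how many queries the session issues (assumed relationship).
    mean_gap = 120.0 if n_queries <= 3 else 45.0
    for _ in range(n_queries - 1):
        t += rng.expovariate(1 / mean_gap)
        times.append(t)
    # Time after the last query closes the session.
    duration = t + rng.expovariate(1 / 300.0)
    return Session(region, False, duration, times)


rng = random.Random(0)
workload = [sample_session("Europe", rng) for _ in range(1000)]
```

In a real generator the placeholder distributions would be replaced by the empirical (conditional) distributions fitted to the trace, with separate parameter sets per region and time of day.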

    Relating Query Popularity and File Replication in the Gnutella Peer-to-Peer Network

    In this paper, we characterize user behavior in a peer-to-peer (P2P) file sharing network. Our characterization is based on the results of an extensive passive measurement study of the messages exchanged in the Gnutella P2P file sharing system. Using the data recorded during this measurement study, we analyze which queries a user issues and which files a user shares. The investigation of users' queries leads to a characterization of query popularity. Furthermore, the analysis of the files shared by the users leads to a characterization of file replication. As our major contribution, we relate query popularity and file replication by an analytical formula characterizing the matching of files to queries. The analytical formula defines a matching probability for each pair of query and file, which depends on the rank of the query with respect to query popularity, but is independent of the rank of the file with respect to file replication. We validate this model by conducting a detailed simulation study of a Gnutella-style overlay network and comparing the simulation results to the results obtained from the measurement.

    An Improved Scheme for Interest Mining Based on a Reconfiguration of the Peer-to-Peer Overlay

    Tan et al. proposed a scheme to improve the quality of file search in unstructured peer-to-peer systems by exploiting the similarity of interests among the participating peers. Although it improves the cost/performance ratio of the simple flooding-based scheme used in conventional systems, Tan's method has a serious drawback: a query cannot reach a target peer unless the requesting peer is connected to it through a path of peers whose interests are similar to the given query. To overcome this drawback, we propose a scheme that reconfigures the underlying network so that a requesting peer has a neighbor interested in the given query before it transmits the query to its neighbors. The performance of the proposed scheme is evaluated by simulation. The simulation results indicate that it overcomes the drawback of Tan's method.

    Diffusive capture processes for information search

    We show how effectively diffusive capture processes (DCP) on complex networks can be applied to information search in those networks. Numerical simulations show that our method generates only 2% of the traffic of the most popular flooding-based query-packet-forwarding (FB) algorithm. We find that the average searching time of our model is more scalable than that of another well-known $n$-random-walker model and comparable to that of the FB algorithm, both on the real Gnutella network and on scale-free networks with $\gamma = 2.4$. We also discuss the possible relationship between the average searching time and the second moment of the degree distribution of the networks.
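To make the two baselines named in this abstract concrete, here is a minimal sketch of flooding-based search and of search by several independent random walkers on an undirected graph. It does not implement the DCP method itself, which the abstract does not specify. The test topology (`ring_with_shortcuts`), the function names, and the message-counting convention are assumptions made for illustration.

```python
import random


def ring_with_shortcuts(n, extra, rng):
    """Hypothetical test topology: a ring of n nodes plus random shortcuts."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        adj[i].add(j)
        adj[j].add(i)
    for _ in range(extra):
        a, b = rng.sample(range(n), 2)
        adj[a].add(b)
        adj[b].add(a)
    # Sorted lists keep runs reproducible for a fixed seed.
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def flood(adj, source, target):
    """Flood level by level; return (hops, messages) or (None, messages)."""
    seen = {source}
    frontier = [source]
    hops = messages = 0
    while frontier:
        if target in frontier:
            return hops, messages
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                # Simplification: every forward counts as one message,
                # including duplicates sent to already-visited nodes.
                messages += 1
                if v not in seen:
                    seen.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
        hops += 1
    return None, messages


def random_walkers(adj, source, target, n_walkers, rng, max_steps=100_000):
    """Move n_walkers independently; stop when any walker hits the target."""
    if source == target:
        return 0, 0
    positions = [source] * n_walkers
    messages = 0
    for step in range(1, max_steps + 1):
        for i, u in enumerate(positions):
            positions[i] = rng.choice(adj[u])
            messages += 1  # one message per walker per step
        if target in positions:
            return step, messages
    return None, messages


rng = random.Random(1)
graph = ring_with_shortcuts(50, 10, rng)
flood_hops, flood_msgs = flood(graph, 0, 25)
walk_steps, walk_msgs = random_walkers(graph, 0, 25, 4, rng)
```

Comparing `flood_msgs` with `walk_msgs` and `flood_hops` with `walk_steps` over many source/target pairs gives the usual trade-off: flooding tends to find the target in few hops at a high message cost, while random walkers send fewer messages per step but usually take longer.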

    Generalized probabilistic flooding in unstructured peer-to-peer networks

    Statistical Modelling of Information Sharing: Community, Membership and Content

    File-sharing systems, like many online and traditional information sharing communities (e.g. newsgroups, BBS, forums, interest clubs), are dynamical systems in nature. As peers get in and out of the system, the information content made available by the prevailing membership varies continually in amount as well as composition, which in turn affects all peers' join/leave decisions. As a result, the dynamics of membership and information content are strongly coupled, suggesting interesting issues about growth, sustenance and stability. In this paper, we propose to study such communities with a simple statistical model of an information sharing club. Carrying their private payloads of information goods as potential supply to the club, peers join or leave on the basis of whether the information they demand is currently available. Information goods are chunked and typed, as in a file sharing system where peers contribute different files, or a forum where messages are grouped by topics or threads. Peers' demand and supply are then characterized by statistical distributions over the type domain. This model reveals interesting critical behaviour with multiple equilibria. A sharp growth threshold is derived: the club may grow towards a sustainable equilibrium only if the value of an order parameter is above the threshold, or shrink to emptiness otherwise. The order parameter is composite and comprises the peer population size, the level of their contributed supply, the club's efficiency in information search, the spread of supply and demand over the type domain, as well as the goodness of match between them. Comment: accepted in International Symposium on Computer Performance, Modeling, Measurements and Evaluation, Juan-les-Pins, France, October-200

    Analyzing peer-to-peer traffic across large networks


    Distributed Reasoning in a Peer-to-Peer Setting: Application to the Semantic Web

    In a peer-to-peer inference system, each peer can reason locally but can also solicit some of its acquaintances, which are peers sharing part of its vocabulary. In this paper, we consider peer-to-peer inference systems in which the local theory of each peer is a set of propositional clauses defined upon a local vocabulary. An important characteristic of peer-to-peer inference systems is that the global theory (the union of all peer theories) is not known (as opposed to partition-based reasoning systems). The main contribution of this paper is to provide the first consequence finding algorithm in a peer-to-peer setting: DeCA. It is anytime and computes consequences gradually from the solicited peer to peers that are more and more distant. We exhibit a sufficient condition on the acquaintance graph of the peer-to-peer inference system for guaranteeing the completeness of this algorithm. Another important contribution is to apply this general distributed reasoning setting to the setting of the Semantic Web through the Somewhere semantic peer-to-peer data management system. The last contribution of this paper is to provide an experimental analysis of the scalability of the peer-to-peer infrastructure that we propose, on large networks of 1000 peers.