96 research outputs found

    Relating Query Popularity and File Replication in the Gnutella Peer-to-Peer Network

    In this paper, we characterize user behavior in a peer-to-peer (P2P) file sharing network. Our characterization is based on the results of an extensive passive measurement study of the messages exchanged in the Gnutella P2P file sharing system. Using the data recorded during this measurement study, we analyze which queries a user issues and which files a user shares. The investigation of users' queries leads to the characterization of query popularity. Furthermore, the analysis of the files shared by the users leads to a characterization of file replication. As a major contribution, we relate query popularity and file replication by an analytical formula characterizing the matching of files to queries. The analytical formula defines a matching probability for each pair of query and file, which depends on the rank of the query with respect to query popularity, but is independent of the rank of the file with respect to file replication. We validate this model by conducting a detailed simulation study of a Gnutella-style overlay network and comparing simulation results to the results obtained from the measurement

    A framework for the dynamic management of Peer-to-Peer overlays

    Peer-to-Peer (P2P) applications have been associated with inefficient operation, interference with other network services and large operational costs for network providers. This thesis presents a framework which can help ISPs address these issues by means of intelligent management of peer behaviour. The proposed approach involves limited control of P2P overlays without interfering with the fundamental characteristics of peer autonomy and decentralised operation. At the core of the management framework lies the Active Virtual Peer (AVP). Essentially intelligent peers operated by the network providers, the AVPs interact with the overlay from within, minimising redundant or inefficient traffic, enhancing overlay stability and facilitating the efficient and balanced use of available peer and network resources. They offer an “insider's” view of the overlay and permit the management of P2P functions in a compatible and non-intrusive manner. AVPs can support multiple P2P protocols and coordinate to perform functions collectively. To account for the multi-faceted nature of P2P applications and allow the incorporation of modern techniques and protocols as they appear, the framework is based on a modular architecture. Core modules for overlay control and transit traffic minimisation are presented. Towards the latter, a number of suitable P2P content caching strategies are proposed. Using a purpose-built P2P network simulator and small-scale experiments, it is demonstrated that the introduction of AVPs inside the network can significantly reduce inter-AS traffic, minimise costly multi-hop flows, increase overlay stability and load-balancing and offer improved peer transfer performance

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Repor

    Schema matching in a peer-to-peer database system

    Includes bibliographical references (p. 112-118). Peer-to-peer or P2P systems are applications that allow a network of peers to share resources in a scalable and efficient manner. My research is concerned with the use of P2P systems for sharing databases. To allow data mediation between peers' databases, schema mappings need to exist, which are mappings between semantically equivalent attributes in different peers' schemas. Mappings can either be defined manually or found semi-automatically using a technique called schema matching. However, schema matching has not been used much in dynamic environments, such as P2P networks. Therefore, this thesis investigates how to enable effective semi-automated schema matching within a P2P network

    Optimising Structured P2P Networks for Complex Queries

    With network-enabled consumer devices becoming increasingly popular, the number of connected devices and available services is growing considerably, with the number of connected devices estimated to surpass 15 billion by 2015. In this increasingly large and dynamic environment it is important that users have a comprehensive, yet efficient, mechanism to discover services. Many existing wide-area service discovery mechanisms are centralised and do not scale to large numbers of users. Additionally, centralised services suffer from issues such as a single point of failure, high maintenance costs, and difficulty of management. As such, this Thesis seeks a Peer to Peer (P2P) approach. Distributed Hash Tables (DHTs) are well known for their high scalability, financially low barrier of entry, and ability to self-manage. They can be used to provide not just a platform on which peers can offer and consume services, but also a means for users to discover such services. Traditionally DHTs provide a distributed key-value store, with no search functionality. In recent years many P2P systems have been proposed providing support for a subset of complex query types, such as keyword search, range queries, and semantic search. This Thesis presents a novel algorithm for performing any type of complex query, from keyword search, to complex regular expressions, to full-text search, over any structured P2P overlay. This is achieved by efficiently broadcasting the search query, allowing each peer to process the query locally, and then efficiently routing responses back to the originating peer. Through experimentation, this technique is shown to be successful when the network is stable; however, performance degrades under high levels of network churn. To address the issue of network churn, this Thesis proposes a number of enhancements which can be made to existing P2P overlays in order to improve the performance of both the existing DHT and the proposed algorithm. 
Through two case studies these enhancements are shown to improve not only the performance of the proposed algorithm under churn, but also the performance of traditional lookup operations in these networks
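The broadcast-then-filter idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's algorithm, which the abstract does not specify. It assumes a Chord-style ring with finger tables and uses the well-known technique of delegating disjoint ring segments to fingers so that each peer receives the query exactly once. All names (`M`, `fingers`, `broadcast`, `store`) and the sample node identifiers are hypothetical, and responses are simply appended to a shared list instead of being routed back over the overlay.

```python
import re

M = 6                # identifier bits (illustrative value)
RING = 2 ** M        # number of positions on the ring

def in_open_interval(x, a, b):
    """True if x lies strictly between a and b going clockwise on the ring."""
    if a < b:
        return a < x < b
    # The interval wraps past zero; when a == b it is the whole ring except a.
    return x > a or x < b

def successor(node_ids, key):
    """First live node at or after key, clockwise (node_ids must be sorted)."""
    for n in node_ids:
        if n >= key:
            return n
    return node_ids[0]

def fingers(node_ids, n):
    """Distinct finger-table entries of node n, in clockwise order."""
    result = []
    for k in range(M):
        f = successor(node_ids, (n + 2 ** k) % RING)
        if f != n and f not in result:
            result.append(f)
    return result

def broadcast(node_ids, node, limit, predicate, store, hits, visited):
    """Evaluate the query locally at node, then hand each finger inside
    (node, limit) a disjoint segment of the ring to cover."""
    visited.append(node)
    hits.extend(item for item in store.get(node, []) if predicate(item))
    table = [f for f in fingers(node_ids, node) if in_open_interval(f, node, limit)]
    for idx, f in enumerate(table):
        next_limit = table[idx + 1] if idx + 1 < len(table) else limit
        broadcast(node_ids, f, next_limit, predicate, store, hits, visited)

# Hypothetical ring and per-peer local data, for illustration only.
node_ids = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
store = {8: ["alpha.txt"], 42: ["beta.mp3", "alphabet.pdf"]}
hits, visited = [], []
# Node 8 starts the broadcast; using its own id as the limit covers the whole ring.
broadcast(node_ids, 8, 8, lambda item: re.search(r"^alpha", item) is not None,
          store, hits, visited)
```

Because every peer evaluates the predicate against its own data, the query can be any local test (here a regular expression), which is the property the abstract emphasises. The sketch assumes a stable ring; it does nothing to handle the churn the abstract identifies as the weak point.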

    Denial-of-service resilience in peer-to-peer file sharing systems

    Peer-to-peer (p2p) file sharing systems are characterized by highly replicated content distributed among nodes with enormous aggregate resources for storage and communication. These properties alone are not sufficient, however, to render p2p networks immune to denial-of-service (DoS) attacks. In this paper, we study, by means of analytical modeling and simulation, the resilience of p2p file sharing systems against DoS attacks, in which malicious nodes respond to queries with erroneous responses. We consider the file-targeted attacks in current use in the Internet, and we introduce a new class of p2p-network-targeted attacks. In file-targeted attacks, the attacker puts a large number of corrupted versions of a single file on the network. We demonstrate that the effectiveness of these attacks is highly dependent on the clients’ behavior. For the attacks to succeed over the long term, clients must be unwilling to share files, slow in removing corrupted files from their machines, and quick to give up downloading when the system is under attack. In network-targeted attacks, attackers respond to queries for any file with erroneous information. Our results indicate that these attacks are highly scalable: increasing the number of malicious nodes yields a hyperexponential decrease in system goodput, and a moderate number of attackers suffices to cause a near-collapse of the entire system. The key factors inducing this vulnerability are (i) hierarchical topologies with misbehaving “supernodes,” (ii) high path-length networks in which attackers have increased opportunity to falsify control information, and (iii) power-law networks in which attackers insert themselves into high-degree points in the graph. Finally, we consider the effects of client counter-strategies such as randomized reply selection, redundant and parallel download, and reputation systems. 
Some counter-strategies (e.g., randomized reply selection) provide considerable immunity to attack (reducing the scaling from hyperexponential to linear), yet significantly hurt performance in the absence of an attack. Other counter-strategies yield little benefit (or penalty). In particular, reputation systems show little impact unless they operate with near perfection

    Towards efficient distributed search in a peer-to-peer network.

    Cheng Chun Kong. Thesis submitted in: November 2006. Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 62-64). Abstracts in English and Chinese.
    Contents: Abstract; Abstract (in Chinese); Acknowledgement; Chapter 1. Introduction; Chapter 2. Literature Review; Chapter 3. Design (A. Overview; B. Basic idea; C. Follow-up design; D. Summary); Chapter 4. Experimental Findings (A. Goal; B. Analysis Methodology; C. Validation; D. Results); Chapter 5. Deployment (A. Limitations; B. Miscellaneous Design Issues); Chapter 6. Future Directions and Conclusions; Reference; Appendix

    Novel Analytical Modelling-based Simulation of Worm Propagation in Unstructured Peer-to-Peer Networks

    Millions of users world-wide are sharing content using Peer-to-Peer (P2P) networks, such as Skype and BitTorrent. While such new innovations undoubtedly bring benefits, there are nevertheless some associated threats. One of the main hazards is that P2P worms can penetrate the network, even from a single node, and then spread rapidly. Understanding the propagation process of such worms has always been a challenge for researchers. Different techniques, such as simulations and analytical models, have been adopted in the literature. While simulations provide results for specific input parameter values, analytical models are rather more general and potentially cover the whole spectrum of given parameter values. Many attempts have been made to model the worm propagation process in P2P networks. However, the analytical models reported to date have failed to cover the whole spectrum of all relevant parameters and have therefore resulted in high false-positives. This consequently affects the immunization and mitigation strategies that are adopted to cope with an outbreak of worms. The first key contribution of this thesis is the development of a Susceptible, Exposed, Infectious and Recovered (SEIR) analytical model for the worm propagation process in a P2P network, taking into account different factors such as the configuration diversity of nodes, user behaviour and the infection time-lag. These factors have not been considered in an integrated form previously and have been either ignored or partially addressed in state-of-the-art analytical models. Our proposed SEIR analytical model holistically integrates, for the first time, these key factors in order to capture a more realistic representation of the whole worm propagation process. 
The second key contribution is the extension of the proposed SEIR model to the mobile M-SEIR model by investigating and incorporating the role of node mobility, the size of the worm and the bandwidth of wireless links in the worm propagation process in mobile P2P networks. The model was designed to be flexible and applicable to both wired and wireless nodes. The third contribution is the exploitation of a promising modelling paradigm, Agent-based Modelling (ABM), in the P2P worm modelling context. Specifically, to exploit the synergies between ABM and P2P, an integrated ABM-Based worm propagation model has been built and trialled in this research for the first time. The introduced model combines the implementation of common, complex P2P protocols, such as Gnutella and GIA, along with the aforementioned analytical models. Moreover, a comparative evaluation between ABM and conventional modelling tools has been carried out, to demonstrate the key benefits of ease of real-time analysis and visualisation. As a fourth contribution, the research was further extended by utilizing the proposed SEIR model to examine and evaluate a real-world data set on one of the most recent worms, namely, the Conficker worm. Verification of the model was achieved using ABM and conventional tools and by then comparing the results on the same data set with those derived from developed benchmark models. Finally, the research concludes that the worm propagation process is to a great extent affected by different factors such as configuration diversity, user-behaviour, the infection time lag and the mobility of nodes. It was found that the infection propagation values derived from state-of-the-art mathematical models are hypothetical and do not actually reflect real-world values. 
In summary, our comparative research study has shown that infection propagation can be reduced due to the natural immunity against worms that can be provided by a holistic exploitation of the range of factors proposed in this work
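For readers unfamiliar with the SEIR compartment structure named in the abstract above, the following is a minimal sketch of the generic textbook SEIR equations integrated with forward Euler steps. It is not the thesis's model: configuration diversity, user behaviour, mobility and the other factors the abstract lists are not represented, and the function name, parameter names and values (`beta`, `sigma`, `gamma`, 1000 peers) are assumptions chosen only for illustration.

```python
def simulate_seir(n_peers, beta, sigma, gamma, initial_infected=1, steps=1000, dt=0.1):
    """Integrate the textbook SEIR equations with forward Euler steps.

    beta  : contact (infection) rate, an assumed illustrative parameter
    sigma : rate at which exposed peers become infectious (1 / time lag)
    gamma : rate at which infectious peers recover (e.g. are patched)
    """
    s = float(n_peers - initial_infected)  # susceptible peers
    e = 0.0                                # exposed, not yet infectious
    i = float(initial_infected)            # infectious peers
    r = 0.0                                # recovered peers
    history = [(s, e, i, r)]
    for _ in range(steps):
        new_exposed = beta * s * i / n_peers * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history

# Hypothetical parameter values, not taken from the thesis.
history = simulate_seir(n_peers=1000, beta=0.5, sigma=0.2, gamma=0.1)
```

Each tuple in `history` holds the four compartment sizes at one time step. The total stays equal to `n_peers` because every peer that leaves one compartment enters the next, which is the bookkeeping any extension of the model would have to preserve.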

    Providing Freshness for Cached Data in Unstructured Peer-to-Peer Systems

    Replication is a popular technique for increasing data availability and improving performance in peer-to-peer systems. Maintaining freshness of replicated data is challenging due to the high cost of update management. While updates have been studied in structured networks, they have been neglected in unstructured networks. We therefore confront the problem of maintaining fresh replicas of data in unstructured peer-to-peer networks. We propose techniques that leverage path replication to support efficient lazy updates and provide freshness for cached data in these systems using only local knowledge. In addition, we show that locally available information may be used to provide additional guarantees of freshness at an acceptable cost to performance. Through performance simulations based on both synthetic and real-world workloads from big data environments, we demonstrate the effectiveness of our approach

    Preliminary specification and design documentation for software components to achieve catallaxy in computational systems

    This report presents the preliminary specifications and design documentation for software components to achieve Catallaxy in computational systems. The work describes the specification and design of software components for implementing the concept of Catallaxy in Grid systems. An introduction places the concept of Catallaxy within existing Grid taxonomies and presents the basic components. These components are then examined for their applicability in existing application-layer networks. Keywords: Grid Computing