3,491 research outputs found

    Small-World File-Sharing Communities

    Full text link
    Web caches, content distribution networks, peer-to-peer file sharing networks, distributed file systems, and data grids all have in common that they involve a community of users who generate requests for shared data. In each case, overall system performance can be improved significantly if we can first identify and then exploit interesting structure within a community's access patterns. To this end, we propose a novel perspective on file sharing based on the study of the relationships that form among users based on the files in which they are interested. We propose a new structure that captures common user interests in data--the data-sharing graph-- and justify its utility with studies on three data-distribution systems: a high-energy physics collaboration, the Web, and the Kazaa peer-to-peer network. We find small-world patterns in the data-sharing graphs of all three communities. We analyze these graphs and propose some probable causes for these emergent small-world patterns. The significance of small-world patterns is twofold: it provides a rigorous support to intuition and, perhaps most importantly, it suggests ways to design mechanisms that exploit these naturally emerging patterns

    Relating Query Popularity and File Replication in the Gnutella Peer-to-Peer Network

    Get PDF
    In this paper, we characterize the user behavior in a peer-to-peer (P2P) file sharing network. Our characterization is based on the results of an extensive passive measurement study of the messages exchanged in the Gnutella P2P file sharing system. Using the data recorded during this measurement study, we analyze which queries a user issues and which files a user shares. The investigation of users queries leads to the characterization of query popularity. Furthermore, the analysis of the files shared by the users leads to a characterization of file replication. As major contribution, we relate query popularity and file replication by an analytical formula characterizing the matching of files to queries. The analytical formula defines a matching probability for each pair of query and file, which depends on the rank of the query with respect query popularity, but is independent of the rank of the file with respect to file replication. We validate this model by conducting a detailed simulation study of a Gnutella-style overlay network and comparing simulation results to the results obtained from the measurement

    X-Vine: Secure and Pseudonymous Routing Using Social Networks

    Full text link
    Distributed hash tables suffer from several security and privacy vulnerabilities, including the problem of Sybil attacks. Existing social network-based solutions to mitigate the Sybil attacks in DHT routing have a high state requirement and do not provide an adequate level of privacy. For instance, such techniques require a user to reveal their social network contacts. We design X-Vine, a protection mechanism for distributed hash tables that operates entirely by communicating over social network links. As with traditional peer-to-peer systems, X-Vine provides robustness, scalability, and a platform for innovation. The use of social network links for communication helps protect participant privacy and adds a new dimension of trust absent from previous designs. X-Vine is resilient to denial of service via Sybil attacks, and in fact is the first Sybil defense that requires only a logarithmic amount of state per node, making it suitable for large-scale and dynamic settings. X-Vine also helps protect the privacy of users social network contacts and keeps their IP addresses hidden from those outside of their social circle, providing a basis for pseudonymous communication. We first evaluate our design with analysis and simulations, using several real world large-scale social networking topologies. We show that the constraints of X-Vine allow the insertion of only a logarithmic number of Sybil identities per attack edge; we show this mitigates the impact of malicious attacks while not affecting the performance of honest nodes. Moreover, our algorithms are efficient, maintain low stretch, and avoid hot spots in the network. We validate our design with a PlanetLab implementation and a Facebook plugin.Comment: 15 page

    Storage systems for mobile-cloud applications

    Get PDF
    Mobile devices have become the major computing platform in todays world. However, some apps on mobile devices still suffer from insufficient computing and energy resources. A key solution is to offload resource-demanding computing tasks from mobile devices to the cloud. This leads to a scenario where computing tasks in the same application run concurrently on both the mobile device and the cloud. This dissertation aims to ensure that the tasks in a mobile app that employs offloading can access and share files concurrently on the mobile and the cloud in a manner that is efficient, consistent, and transparent to locations. Existing distributed file systems and network file systems do not satisfy these requirements. Furthermore, current offloading platforms either do not support efficient file access for offloaded tasks or do not offload tasks with file accesses. The first part of the dissertation addresses this issue by designing and implementing an application-level file system named Overlay File System (OFS). OFS assumes a cloud surrogate is paired with each mobile device for task and storage offloading. To achieve high efficiency, OFS maintains and buffers local copies of data sets on both the surrogate and the mobile device. OFS ensures consistency and guarantees that all the reads get the latest data. To effectively reduce the network traffic and the execution delay, OFS uses a delayed-update mechanism, which combines write-invalidate and write-update policies. To guarantee location transparency, OFS creates a unified view of file data. The research tests OFS on Android OS with a real mobile application and real mobile user traces. Extensive experiments show that OFS can effectively support consistent file accesses from computation tasks, no matter where they run. In addition, OFS can effectively reduce both file access latency and network traffic incurred by file accesses. While OFS allows offloaded tasks to access the required files in a consistent and transparent manner, file accesses by offloaded tasks can be further improved. Instead of retrieving the required files from its associated mobile device, a surrogate can discover and retrieve identical or similar file(s) from the surrogates belonging to other users to meet its needs. This is based on two observations: 1) multiple users have the same or similar files, e.g., shared files or images/videos of same object; 2) the need for a certain file content in mobile apps can usually be described by context features of the content, e.g., location, objects in an image, etc.; thus, any file with the required context features can be used to satisfy the need. Since files may be retrieved from surrogates, this solution improves latency and saves wireless bandwidth and power on mobile devices. The second part of the dissertation proposes and develops a Context-Aware File Discovery Service (CAFDS) that implements the idea described above. CAFDS uses a self-organizing map and k-means clustering to classify files into file groups based on file contexts. It then uses an enhanced decision tree to locate and retrieve files based on the file contexts defined by apps. To support diverse file discovery demands from various mobile apps, CAFDS allows apps to add new file contexts and to update existing file contexts dynamically, without affecting the discovery process. To evaluate the effectiveness of CAFDS, the research has implemented a prototype on Android and Linux. The performance of CAFDS was tested against Chord, a DHT based lookup scheme, and SPOON, a P2P file sharing system. The experiments show that CAFDS provides lower end-to-end latency for file search than Chord and SPOON, while providing similar scalability to Chord

    Versatile, Scalable, and Accurate Simulation of Distributed Applications and Platforms

    Get PDF
    International audienceThe study of parallel and distributed applications and platforms, whether in the cluster, grid, peer-to-peer, volunteer, or cloud computing domain, often mandates empirical evaluation of proposed algorithmic and system solutions via simulation. Unlike direct experimentation via an application deployment on a real-world testbed, simulation enables fully repeatable and configurable experiments for arbitrary hypothetical scenarios. Two key concerns are accuracy (so that simulation results are scientifically sound) and scalability (so that simulation experiments can be fast and memory-efficient). While the scalability of a simulator is easily measured, the accuracy of many state-of-the-art simulators is largely unknown because they have not been sufficiently validated. In this work we describe recent accuracy and scalability advances made in the context of the SimGrid simulation framework. A design goal of SimGrid is that it should be versatile, i.e., applicable across all aforementioned domains. We present quantitative results that show that SimGrid compares favorably to state-of-the-art domain-specific simulators in terms of scalability, accuracy, or the trade-off between the two. An important implication is that, contrary to popular wisdom, striving for versatility in a simulator is not an impediment but instead is conducive to improving both accuracy and scalability

    A framework for the dynamic management of Peer-to-Peer overlays

    Get PDF
    Peer-to-Peer (P2P) applications have been associated with inefficient operation, interference with other network services and large operational costs for network providers. This thesis presents a framework which can help ISPs address these issues by means of intelligent management of peer behaviour. The proposed approach involves limited control of P2P overlays without interfering with the fundamental characteristics of peer autonomy and decentralised operation. At the core of the management framework lays the Active Virtual Peer (AVP). Essentially intelligent peers operated by the network providers, the AVPs interact with the overlay from within, minimising redundant or inefficient traffic, enhancing overlay stability and facilitating the efficient and balanced use of available peer and network resources. They offer an “insider‟s” view of the overlay and permit the management of P2P functions in a compatible and non-intrusive manner. AVPs can support multiple P2P protocols and coordinate to perform functions collectively. To account for the multi-faceted nature of P2P applications and allow the incorporation of modern techniques and protocols as they appear, the framework is based on a modular architecture. Core modules for overlay control and transit traffic minimisation are presented. Towards the latter, a number of suitable P2P content caching strategies are proposed. Using a purpose-built P2P network simulator and small-scale experiments, it is demonstrated that the introduction of AVPs inside the network can significantly reduce inter-AS traffic, minimise costly multi-hop flows, increase overlay stability and load-balancing and offer improved peer transfer performance
    • 

    corecore