3,491 research outputs found
Small-World File-Sharing Communities
Web caches, content distribution networks, peer-to-peer file sharing
networks, distributed file systems, and data grids all have in common that they
involve a community of users who generate requests for shared data. In each
case, overall system performance can be improved significantly if we can first
identify and then exploit interesting structure within a community's access
patterns. To this end, we propose a novel perspective on file sharing based on
the study of the relationships that form among users based on the files in
which they are interested.
We propose a new structure that captures common user interests in data--the
data-sharing graph-- and justify its utility with studies on three
data-distribution systems: a high-energy physics collaboration, the Web, and
the Kazaa peer-to-peer network. We find small-world patterns in the
data-sharing graphs of all three communities. We analyze these graphs and
propose some probable causes for these emergent small-world patterns. The
significance of small-world patterns is twofold: it provides a rigorous support
to intuition and, perhaps most importantly, it suggests ways to design
mechanisms that exploit these naturally emerging patterns
Relating Query Popularity and File Replication in the Gnutella Peer-to-Peer Network
In this paper, we characterize the user behavior in a peer-to-peer (P2P) file sharing network. Our characterization is based on the results of an extensive passive measurement study of the messages exchanged in the Gnutella P2P file sharing system. Using the data recorded during this measurement study, we analyze which queries a user issues and which files a user shares. The investigation of users queries leads to the characterization of query popularity. Furthermore, the analysis of the files shared by the users leads to a characterization of file replication. As major contribution, we relate query popularity and file replication by an analytical formula characterizing the matching of files to queries. The analytical formula defines a matching probability for each pair of query and file, which depends on the rank of the query with respect query popularity, but is independent of the rank of the file with respect to file replication. We validate this model by conducting a detailed simulation study of a Gnutella-style overlay network and comparing simulation results to the results obtained from the measurement
X-Vine: Secure and Pseudonymous Routing Using Social Networks
Distributed hash tables suffer from several security and privacy
vulnerabilities, including the problem of Sybil attacks. Existing social
network-based solutions to mitigate the Sybil attacks in DHT routing have a
high state requirement and do not provide an adequate level of privacy. For
instance, such techniques require a user to reveal their social network
contacts. We design X-Vine, a protection mechanism for distributed hash tables
that operates entirely by communicating over social network links. As with
traditional peer-to-peer systems, X-Vine provides robustness, scalability, and
a platform for innovation. The use of social network links for communication
helps protect participant privacy and adds a new dimension of trust absent from
previous designs. X-Vine is resilient to denial of service via Sybil attacks,
and in fact is the first Sybil defense that requires only a logarithmic amount
of state per node, making it suitable for large-scale and dynamic settings.
X-Vine also helps protect the privacy of users social network contacts and
keeps their IP addresses hidden from those outside of their social circle,
providing a basis for pseudonymous communication. We first evaluate our design
with analysis and simulations, using several real world large-scale social
networking topologies. We show that the constraints of X-Vine allow the
insertion of only a logarithmic number of Sybil identities per attack edge; we
show this mitigates the impact of malicious attacks while not affecting the
performance of honest nodes. Moreover, our algorithms are efficient, maintain
low stretch, and avoid hot spots in the network. We validate our design with a
PlanetLab implementation and a Facebook plugin.Comment: 15 page
Storage systems for mobile-cloud applications
Mobile devices have become the major computing platform in todays world. However, some apps on mobile devices still suffer from insufficient computing and energy resources. A key solution is to offload resource-demanding computing tasks from mobile devices to the cloud. This leads to a scenario where computing tasks in the same application run concurrently on both the mobile device and the cloud.
This dissertation aims to ensure that the tasks in a mobile app that employs offloading can access and share files concurrently on the mobile and the cloud in a manner that is efficient, consistent, and transparent to locations. Existing distributed file systems and network file systems do not satisfy these requirements. Furthermore, current offloading platforms either do not support efficient file access for offloaded tasks or do not offload tasks with file accesses.
The first part of the dissertation addresses this issue by designing and implementing an application-level file system named Overlay File System (OFS). OFS assumes a cloud surrogate is paired with each mobile device for task and storage offloading. To achieve high efficiency, OFS maintains and buffers local copies of data sets on both the surrogate and the mobile device. OFS ensures consistency and guarantees that all the reads get the latest data. To effectively reduce the network traffic and the execution delay, OFS uses a delayed-update mechanism, which combines write-invalidate and write-update policies. To guarantee location transparency, OFS creates a unified view of file data.
The research tests OFS on Android OS with a real mobile application and real mobile user traces. Extensive experiments show that OFS can effectively support consistent file accesses from computation tasks, no matter where they run. In addition, OFS can effectively reduce both file access latency and network traffic incurred by file accesses.
While OFS allows offloaded tasks to access the required files in a consistent and transparent manner, file accesses by offloaded tasks can be further improved. Instead of retrieving the required files from its associated mobile device, a surrogate can discover and retrieve identical or similar file(s) from the surrogates belonging to other users to meet its needs. This is based on two observations: 1) multiple users have the same or similar files, e.g., shared files or images/videos of same object; 2) the need for a certain file content in mobile apps can usually be described by context features of the content, e.g., location, objects in an image, etc.; thus, any file with the required context features can be used to satisfy the need. Since files may be retrieved from surrogates, this solution improves latency and saves wireless bandwidth and power on mobile devices.
The second part of the dissertation proposes and develops a Context-Aware File Discovery Service (CAFDS) that implements the idea described above. CAFDS uses a self-organizing map and k-means clustering to classify files into file groups based on file contexts. It then uses an enhanced decision tree to locate and retrieve files based on the file contexts defined by apps. To support diverse file discovery demands from various mobile apps, CAFDS allows apps to add new file contexts and to update existing file contexts dynamically, without affecting the discovery process.
To evaluate the effectiveness of CAFDS, the research has implemented a prototype on Android and Linux. The performance of CAFDS was tested against Chord, a DHT based lookup scheme, and SPOON, a P2P file sharing system. The experiments show that CAFDS provides lower end-to-end latency for file search than Chord and SPOON, while providing similar scalability to Chord
Versatile, Scalable, and Accurate Simulation of Distributed Applications and Platforms
International audienceThe study of parallel and distributed applications and platforms, whether in the cluster, grid, peer-to-peer, volunteer, or cloud computing domain, often mandates empirical evaluation of proposed algorithmic and system solutions via simulation. Unlike direct experimentation via an application deployment on a real-world testbed, simulation enables fully repeatable and configurable experiments for arbitrary hypothetical scenarios. Two key concerns are accuracy (so that simulation results are scientifically sound) and scalability (so that simulation experiments can be fast and memory-efficient). While the scalability of a simulator is easily measured, the accuracy of many state-of-the-art simulators is largely unknown because they have not been sufficiently validated. In this work we describe recent accuracy and scalability advances made in the context of the SimGrid simulation framework. A design goal of SimGrid is that it should be versatile, i.e., applicable across all aforementioned domains. We present quantitative results that show that SimGrid compares favorably to state-of-the-art domain-specific simulators in terms of scalability, accuracy, or the trade-off between the two. An important implication is that, contrary to popular wisdom, striving for versatility in a simulator is not an impediment but instead is conducive to improving both accuracy and scalability
A framework for the dynamic management of Peer-to-Peer overlays
Peer-to-Peer (P2P) applications have been associated with inefficient operation, interference with other network services and large operational costs for network providers. This thesis presents a framework which can help ISPs address these issues by means of intelligent management of peer behaviour. The proposed approach involves limited control of P2P overlays without interfering with the fundamental characteristics of peer autonomy and decentralised operation.
At the core of the management framework lays the Active Virtual Peer (AVP). Essentially intelligent peers operated by the network providers, the AVPs interact with the overlay from within, minimising redundant or inefficient traffic, enhancing overlay stability and facilitating the efficient and balanced use of available peer and network resources. They offer an âinsiderâsâ view of the overlay and permit the management of P2P functions in a compatible and non-intrusive manner. AVPs can support multiple P2P protocols and coordinate to perform functions collectively.
To account for the multi-faceted nature of P2P applications and allow the incorporation of modern techniques and protocols as they appear, the framework is based on a modular architecture. Core modules for overlay control and transit traffic minimisation are presented. Towards the latter, a number of suitable P2P content caching strategies are proposed.
Using a purpose-built P2P network simulator and small-scale experiments, it is demonstrated that the introduction of AVPs inside the network can significantly reduce inter-AS traffic, minimise costly multi-hop flows, increase overlay stability and load-balancing and offer improved peer transfer performance
- âŠ