Statistical Modelling of Information Sharing: Community, Membership and Content
File-sharing systems, like many online and traditional information sharing
communities (e.g. newsgroups, BBS, forums, interest clubs), are dynamical
systems in nature. As peers get in and out of the system, the information
content made available by the prevailing membership varies continually in
amount as well as composition, which in turn affects all peers' join/leave
decisions. As a result, the dynamics of membership and information content are
strongly coupled, suggesting interesting issues about growth, sustenance and
stability.
In this paper, we propose to study such communities with a simple statistical
model of an information sharing club. Carrying their private payloads of
information goods as potential supply to the club, peers join or leave on the
basis of whether the information they demand is currently available.
Information goods are chunked and typed, as in a file sharing system where
peers contribute different files, or a forum where messages are grouped by
topics or threads. Peers' demand and supply are then characterized by
statistical distributions over the type domain.
This model reveals interesting critical behaviour with multiple equilibria. A
sharp growth threshold is derived: the club may grow towards a sustainable
equilibrium only if the value of an order parameter is above the threshold, or
shrink to emptiness otherwise. The order parameter is composite and comprises
the peer population size, the level of their contributed supply, the club's
efficiency in information search, the spread of supply and demand over the type
domain, as well as the goodness of match between them.
Comment: accepted in International Symposium on Computer Performance,
Modeling, Measurements and Evaluation, Juan-les-Pins, France, October-200
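The coupled membership/content dynamics described in this abstract can be illustrated with a toy simulation. All names, parameters and the decision rule below are illustrative stand-ins, not the paper's actual statistical model; the sketch only shows the qualitative threshold behaviour, where a high enough search efficiency sustains the club and a low one lets it shrink to emptiness:

```python
import random

def simulate_club(n_types=20, n_peers=200, supply_per_peer=3,
                  search_efficiency=1.0, rounds=50, seed=1):
    """Toy join/leave dynamics: each round, a peer stays only if the type it
    demands is supplied by some current member AND the (imperfect) search
    finds it; otherwise it leaves, taking its own supply with it."""
    rng = random.Random(seed)
    # each peer: (demanded type, set of supplied types)
    peers = [(rng.randrange(n_types),
              frozenset(rng.randrange(n_types) for _ in range(supply_per_peer)))
             for _ in range(n_peers)]
    members = set(range(n_peers))
    for _ in range(rounds):
        available = set()
        for i in members:
            available |= peers[i][1]          # supply of the current membership
        members = {i for i in members
                   if peers[i][0] in available
                   and rng.random() < search_efficiency}
    return len(members)
```

Running this with perfect search keeps the club near full size, while a mediocre search efficiency drives the membership to zero, a crude analogue of the sharp growth threshold the paper derives.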
EARM: An Efficient and Adaptive File Replication with Consistency Maintenance in P2P Systems
In P2P systems, file replication and replica consistency maintenance are the most widely used techniques for improving system performance. Most file replication methods place replicas on all nodes along a client-server query path, at its two ends, or close to the server, leading to low replica utilization and unnecessary replicas, and hence extra consistency maintenance overhead. Most consistency maintenance methods rely on either message spreading or structure-based propagation of update messages, without considering the dynamism of file replication, leading to inefficient file updates and outdated file responses. This paper presents Efficient and Adaptive file Replication with consistency Maintenance (EARM), which combines file replication and consistency maintenance mechanisms to achieve high query efficiency in file replication and consistency maintenance at low cost. Instead of passively accepting file replicas and updates, each node determines file replication and update polling by adapting to time-varying file query and update rates. Simulation results demonstrate the effectiveness of EARM in comparison with other approaches.
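The per-node adaptive decision described above, replicating and polling according to observed query and update rates, can be sketched as follows. The class name, the rate-ratio rule and the thresholds are illustrative assumptions for exposition, not EARM's actual mechanism:

```python
class ReplicaManager:
    """Sketch of rate-adaptive replication: replicate a file only when its
    observed query rate sufficiently outweighs its update rate, so a replica
    serves queries rather than mostly generating consistency traffic."""

    def __init__(self, replicate_ratio=2.0):
        self.replicate_ratio = replicate_ratio
        self.queries = {}   # file_id -> query count in the current window
        self.updates = {}   # file_id -> update count in the current window

    def record_query(self, file_id):
        self.queries[file_id] = self.queries.get(file_id, 0) + 1

    def record_update(self, file_id):
        self.updates[file_id] = self.updates.get(file_id, 0) + 1

    def should_replicate(self, file_id):
        q = self.queries.get(file_id, 0)
        u = self.updates.get(file_id, 0)
        # hot, rarely-updated files are worth replicating
        return q >= self.replicate_ratio * (u + 1)

    def poll_interval(self, file_id, base=60.0):
        # poll for updates more often when the file changes frequently
        return base / (self.updates.get(file_id, 0) + 1)
```

A node applying such a rule creates replicas only where query demand justifies the maintenance cost, which is the trade-off the abstract highlights.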
Performance models of access latency in cloud storage systems
Access latency is a key performance metric for cloud storage systems and has a great impact on user experience, but most papers focus on other performance metrics such as storage overhead, repair cost and so on. Only recently have some models argued that coding can reduce access latency. However, they are developed for special scenarios, which may not reflect reality. To fill the gap between existing work and practice, in this paper we propose a more practical model for measuring access latency. The model can also be used to compare the access latency of different codes used by different companies. To the best of our knowledge, this model is the first to provide a general method for comparing the access latencies of different erasure codes.
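One standard way such comparisons are made, not necessarily the model this paper proposes, is via order statistics: reading an erasure-coded object means waiting for the k fastest of n parallel chunk fetches. Under the common simplifying assumption of i.i.d. exponential chunk latencies, the expectation has a closed form:

```python
def expected_read_latency(n, k, mean_chunk_latency=1.0):
    """Expected time to collect the k fastest of n parallel chunk fetches,
    assuming i.i.d. exponential chunk latencies (a textbook simplification,
    not the paper's model). For exponentials with mean 1/mu, the k-th order
    statistic of n has expectation (1/mu) * sum_{i=n-k+1}^{n} 1/i."""
    return mean_chunk_latency * sum(1.0 / i for i in range(n - k + 1, n + 1))
```

Under this assumption, 3-way replication (read any 1 of 3 copies) gives an expected latency of 1/3 of the mean chunk latency, while a hypothetical (10, 4) Reed-Solomon layout (read 10 of 14 chunks) waits for a much deeper order statistic, which is one reason a careful latency model for codes is needed.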
Towards Peer-to-Peer-based Cryptanalysis
Modern cryptanalytic algorithms require a large amount of computational power. One approach to cope with this requirement is to distribute these algorithms among many computers and to perform the computation massively in parallel. However, existing approaches for distributing cryptanalytic algorithms are based on a client/server or a grid architecture. In this paper we propose the use of peer-to-peer (P2P) technology for distributed cryptanalytic computations. Our contribution is three-fold: we first identify the challenges resulting from this approach and provide a classification of algorithms suited for P2P-based computation. Secondly, we discuss and classify some specific cryptanalytic algorithms and their suitability for such an approach. Finally, we provide a new, fully decentralized approach for distributing such computationally intensive jobs. Our design takes special care of scalability and the possibly untrustworthy nature of the participating peers.
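The class of algorithms best suited to this setting is the embarrassingly parallel one, e.g. brute-force key search, where the keyspace splits into independent sub-ranges. The sketch below shows only that partitioning step; it is an illustration of the algorithm class, not the paper's job-distribution protocol:

```python
def partition_keyspace(keyspace_size, n_peers):
    """Split a brute-force search range [0, keyspace_size) into contiguous,
    non-overlapping sub-ranges, one per peer. Remainder keys are spread over
    the first few peers so range sizes differ by at most one."""
    base, extra = divmod(keyspace_size, n_peers)
    ranges, start = [], 0
    for i in range(n_peers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))   # half-open [start, end)
        start += size
    return ranges
```

In a P2P deployment the harder problems, which the paper addresses, are assigning these ranges without a central coordinator and coping with peers that disappear or return wrong results.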
APRE: A Replication Method for Unstructured P2P Networks
We present APRE, a replication method for structureless Peer-to-Peer overlays. The goal of our method
is to achieve real-time replication of even the most sparsely located content relative to demand. APRE
adaptively expands or contracts the replica set of an object in order to improve the sharing process and
achieve a balanced load distribution among the providers. To achieve this, it utilizes search knowledge to identify
possible replication targets inside query-intensive areas of the overlay. We present detailed simulation
results where APRE exhibits both efficiency and robustness relative to the number of requesters and the
respective request rates. The scheme proves particularly useful in the event of flash crowds, managing to
quickly adapt to sudden surges in load.
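The expand/contract behaviour described above can be sketched as a simple sizing rule. The watermark and the resizing formula are assumptions for illustration; the actual APRE scheme additionally uses search knowledge to choose where the replicas go:

```python
import math

def adapt_replicas(current_replicas, request_rate, capacity_per_replica,
                   low_watermark=0.3):
    """Illustrative replica-set sizing: expand when the replicas are
    overloaded, contract when they sit mostly idle, otherwise keep the
    current set (watermark value is an assumed parameter)."""
    utilization = request_rate / (current_replicas * capacity_per_replica)
    overloaded = utilization > 1.0
    underused = utilization < low_watermark and current_replicas > 1
    if overloaded or underused:
        # resize so projected utilization falls at or under capacity
        return max(1, math.ceil(request_rate / capacity_per_replica))
    return current_replicas
```

During a flash crowd the request rate spikes, the rule expands the replica set to match, and once the rush subsides the set contracts again, mirroring the behaviour the abstract reports.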
A Comparison of Optimistic Approaches to Collaborative Editing of Wiki Pages
Wikis, a popular tool for sharing knowledge, are basically collaborative editing systems. However, existing wiki systems offer limited support for co-operative authoring, and they do not scale well, because they are based on a centralised architecture. This paper compares the well-known centralised MediaWiki system with several peer-to-peer approaches to the editing of wiki pages: an operational transformation approach (MOT2), a commutativity-oriented approach (WOOTO) and a conflict-resolution approach (ACF). We evaluate and compare them according to a number of qualitative and quantitative metrics.
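To make the operational-transformation idea behind approaches like MOT2 concrete, here is a minimal toy transform for two concurrent character insertions. It is a textbook illustration of OT convergence, not MOT2's algorithm, and the tie-breaking rule is an assumption:

```python
def transform(op, other):
    """Adjust a concurrent insertion `op` = ('ins', pos, ch) so it can be
    applied after `other` has already been applied: positions at or past the
    other insertion shift right by one (character order breaks ties)."""
    kind, pos, ch = op
    _, opos, och = other
    if opos < pos or (opos == pos and och < ch):
        pos += 1
    return (kind, pos, ch)

def apply_op(s, op):
    """Apply an insertion operation to a string."""
    _, pos, ch = op
    return s[:pos] + ch + s[pos:]
```

The point of the transform is that two sites applying the same pair of concurrent edits in opposite orders still converge to the same page state, which is the property the compared systems achieve by different means.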
Geostry - a Peer-to-Peer System for Location-based Information
An interesting development is summarized by the notion of "Ubiquitous Computing": In this area, miniature systems are integrated into everyday objects making these objects "smart" and able to communicate. Thereby, everyday objects can gather information about their state and their environment. By embedding this information into a model of the real world, which nowadays can be modeled very realistically using sophisticated 3D modeling techniques, it is possible to generate powerful digital world models. Not only can existing objects of the real world and their state be mapped into these world models, but additional information can be linked to these objects as well. The result is a symbiosis of the real world and digital information spaces.
In this thesis, we present a system that allows for easy access to this information. In contrast to existing solutions, our approach is not based on a client-server architecture. Geostry is based on a peer-to-peer system and thus incorporates all its advantages, such as self-organization, fairness (in terms of costs), scalability and many more. Setting up the network is realized through a decentralized bootstrapping protocol based on an existing Internet service, to provide robustness and availability. To selectively find geographic-related information, Geostry supports spatial queries. They - among other things - enable the user to search for information, e.g. in a certain district only. Sometimes, a certain piece of information raises particular interest. To cope with the run on the single computer that provides this specific information, Geostry offers dynamic replication mechanisms. Thereby, the information is replicated for as long as the rush lasts. Thus, Geostry offers all aspects from setting up a network, providing access to geo-related information, and replication methods to provide accessibility in times of high load.
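The spatial-query feature mentioned above amounts to a range query over geographic coordinates. The snippet below shows only the selection predicate, with made-up object names; how Geostry evaluates such queries over the peer-to-peer overlay is the interesting part of the thesis and is not reproduced here:

```python
def spatial_query(objects, bbox):
    """Return the ids of objects whose (lat, lon) position lies inside the
    bounding box ((min_lat, min_lon), (max_lat, max_lon)) - the kind of
    'only this district' filter a spatial query expresses."""
    (min_lat, min_lon), (max_lat, max_lon) = bbox
    return [oid for oid, (lat, lon) in objects.items()
            if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon]
```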
Routing and caching on DHTs
The goal of this thesis is to analyse the main caching and routing mechanisms implemented today in the most widely used DHTs.
In particular, our analysis shows that these mechanisms are substantially ineffective at guaranteeing adequate load balancing among the peers; the main causes of this phenomenon are identified in the excessively rigid structure adopted by DHTs and in the lack of correlation between the routing and caching mechanisms.
We therefore propose a different overlay, organized as a hypercube structure, which allows a more flexible routing algorithm to be adopted and two tightly interconnected caching and routing mechanisms to be developed.
In particular, the resulting overlay guarantees that each node bears at most a constant load, with a constant cache size and polylogarithmic routing complexity in the worst case.
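On a hypercube overlay, the canonical routing rule is greedy bit-fixing: at each hop, flip one bit on which the current node id and the destination id differ. The sketch below shows that standard rule, consistent with the hypercube structure described above but not necessarily the thesis's exact algorithm:

```python
def hypercube_route(src, dst):
    """Greedy bit-fixing route between two node ids on a hypercube overlay:
    each hop flips the lowest differing bit, so the path length equals the
    Hamming distance between src and dst (at most the hypercube dimension)."""
    path, cur = [src], src
    while cur != dst:
        diff = cur ^ dst
        cur ^= diff & -diff        # flip the lowest set bit of the difference
        path.append(cur)
    return path
```

Because every hop fixes one bit, routes are at most d hops on a d-dimensional hypercube, which is how polylogarithmic routing complexity in the network size arises.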
Efficient data reliability management of cloud storage systems for big data applications
Cloud service providers are consistently striving to provide efficient and reliable service for their clients' Big Data storage needs. Replication is a simple and flexible method to ensure the reliability and availability of data. However, it is not an efficient solution for Big Data, since such data always scales to terabytes and petabytes. Hence erasure coding is gaining traction despite its shortcomings. Deploying erasure coding in cloud storage confronts several challenges, such as encoding/decoding complexity, load balancing, exponential resource consumption due to data repair, and read latency. This thesis addresses many of these challenges. Even though data durability and availability should not be compromised for any reason, clients' requirements on read performance (access latency) may vary with the nature of the data and its access-pattern behaviour. Access latency is one of the important metrics, and the acceptable latency range can be recorded in the client's SLA. Several proactive recovery methods for erasure codes are proposed in this research to reduce resource consumption due to recovery. Also, a novel cache-based solution is proposed to mitigate the access latency issue of erasure coding.
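The replication-versus-erasure-coding trade-off motivating this thesis can be made concrete with a quick comparison. The parameters below, 3-way replication and a (10, 4) Reed-Solomon style layout, are common illustrative defaults in the literature, not figures from the thesis:

```python
def overhead_and_tolerance(k, m, replicas=None):
    """Raw storage per byte of user data, and number of simultaneous chunk
    or copy failures tolerated. Pass replicas=r for r-way replication;
    otherwise (k, m) is a code with k data chunks and m parity chunks."""
    if replicas is not None:
        return replicas, replicas - 1      # r full copies, survives r-1 losses
    return (k + m) / k, m                  # k+m chunks per k data chunks
```

For example, 3-way replication stores 3.0 bytes per user byte and tolerates 2 failures, while a (10, 4) code stores only 1.4 bytes per user byte and tolerates 4, which is why erasure coding wins at petabyte scale despite its repair and latency costs.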