11 research outputs found

    A HOLISTIC REDUNDANCY- AND INCENTIVE-BASED FRAMEWORK TO IMPROVE CONTENT AVAILABILITY IN PEER-TO-PEER NETWORKS

    Peer-to-Peer (P2P) technology has emerged as an important alternative to the traditional client-server communication paradigm for building large-scale distributed systems. P2P enables the creation and dissemination of, and access to, information at low cost and without the need for dedicated coordinating entities. However, existing P2P systems fail to provide high levels of content availability, which limits their applicability and adoption. This dissertation takes a holistic approach to devising mechanisms that improve content availability in large-scale P2P systems. Content availability in P2P systems can be impaired by hardware failures and churn. Hardware failures, in the form of disk or node failures, render information inaccessible. Churn, an inherent property of P2P, is the collective effect of users’ uncoordinated behavior and occurs when a large percentage of nodes join and leave frequently. Such behavior reduces content availability significantly. Mitigating the combined effect of hardware failures and churn on content availability in P2P requires new solutions that go beyond those applied in existing distributed systems. To address this challenge, the thesis proposes two complementary, low-cost mechanisms whereby nodes self-organize to overcome failures and improve content availability. The first mechanism is a low-complexity and highly flexible hybrid redundancy scheme, referred to as Proactive Repair (PR). The second mechanism is an incentive-based scheme that promotes cooperation and enforces fair exchange of resources among peers. These mechanisms provide the basis for the development of distributed self-organizing algorithms that automate PR and, through incentives, maximize its effectiveness in realistic P2P environments. The proposed solution is evaluated using a combination of analytical and experimental methods. The analytical models are developed to determine the availability and repair cost properties of PR. The results indicate that PR outperforms other redundancy schemes in repair cost. The experimental analysis was carried out using simulation and a purpose-built testbed. The simulation results confirm that PR improves content availability in P2P. The proposed mechanisms are implemented and tested in a DHT-based P2P application environment. The experimental results indicate that the incentive-based mechanism can promote fair exchange of resources and limit the impact of uncooperative behaviors such as “free-riding”.
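
To make the notion of content availability under redundancy concrete, the following is a minimal sketch of two textbook availability formulas: plain replication and (n, k) erasure coding. It assumes each node is online independently with the same probability p. It is not the thesis's Proactive Repair model, and the function names are illustrative only.

```python
from math import comb


def replication_availability(p: float, n: int) -> float:
    """Probability that at least one of n independent replicas is online."""
    return 1.0 - (1.0 - p) ** n


def erasure_availability(p: float, n: int, k: int) -> float:
    """Probability that at least k of n coded fragments are online.

    Assumes any k fragments suffice to rebuild the object, as in an
    idealized (n, k) erasure code.
    """
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Compare 3 full replicas against 6 fragments of which any 2 suffice.
    # Both layouts store three times the original object size.
    for p in (0.3, 0.5, 0.8):
        print(p, replication_availability(p, 3), erasure_availability(p, 6, 2))
```

Under these assumptions, replication is the special case k = 1 of the erasure-coded formula, which is why hybrid schemes can be compared on one availability scale.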

    Contributions to security and privacy protection in recommendation systems

    A recommender system is an automatic system that, given a customer model and a set of available documents, is able to select and offer those documents that are most interesting to the customer. From the point of view of security, there are two main issues that recommender systems must face: protection of the users' privacy and protection of other participants in the recommendation process. Recommenders issue personalized recommendations taking into account not only the profile of the documents, but also the private information that customers send to the recommender. Hence, the users' profiles include personal and highly sensitive information, such as their likes and dislikes. In order to have a really useful recommender system and improve its efficiency, we believe that users shouldn't be afraid of stating their preferences. The second challenge from the point of view of security involves protection against a new kind of attack. Copyright holders have shifted their targets to attack the document providers and any other participant that aids in the process of distributing documents, even unknowingly. In addition, new legislation trends such as ACTA or the "Sinde-Wert law" in Spain show the interest of states all over the world in controlling and prosecuting these intermediate nodes. We propose the following contributions:
    1. A social model that captures users' interests in the users' profiles, and a metric function that calculates the similarity between users, queries and documents. This model represents profiles as vectors of a social space. Document profiles are created by inspecting the contents of the document. Then, user profiles are calculated as an aggregation of the profiles of the documents that the user owns. Finally, queries are a constrained view of a user profile. This way, all profiles are contained in the same social space, and the similarity metric can be used on any pair of them.
    2. Two mechanisms to protect the personal information that the user profiles contain. The first mechanism takes advantage of the Johnson-Lindenstrauss and undecomposability of random matrices theorems to project profiles into social spaces of fewer dimensions. Even if the information about the user is reduced in the projected social space, under certain circumstances the distances between the original profiles are maintained. The second approach uses a zero-knowledge protocol to answer the question of whether or not two profiles are similar without leaking any information if they are not.
    3. A distributed system on a cloud that protects merchants, customers and indexers against legal attacks by providing plausible deniability and oblivious routing to all the participants of the system. We use the term DocCloud to refer to this system. DocCloud organizes databases in a tree-shaped structure over a cloud system and provides a Private Information Retrieval protocol to prevent any participant or observer of the process from identifying the recommender. This way, customers, intermediate nodes and even databases are not aware of the specific database that answered the query.
    4. A social P2P network where users link together according to their similarity and provide recommendations to other users in their neighborhood. We defined an epidemic protocol where links are established based on the neighbors' similarity, clustering and randomness. Additionally, we proposed some mechanisms, such as the use of SoftDHT, to aid in the identification of similar users and speed up the creation of clusters of similar users.
    5. A document distribution system that provides the recommended documents at the end of the process. In our view of a recommender system, the recommendation is a complete process that ends when the customer receives the recommended document.
We proposed SCFS, a distributed and secure filesystem where merchants, documents and users are protected.
    This document explores how to locate documents of interest to the user in large distributed networks by means of recommender systems. A recommender system is defined as an automatic system that, given a customer model and a set of available documents, is able to select and offer the documents that are most interesting to the customer. The desirable characteristics of a recommender system are that it be (i) fast, (ii) distributed and (iii) secure. A fast recommender system improves the customer's shopping experience, since a recommendation is not useful if it arrives too late. A distributed recommender system avoids the creation of centralized databases holding sensitive information and improves the availability of documents. Finally, a secure recommender system protects all participants in the system: users, content providers, recommenders and intermediate nodes. From the point of view of security, there are two main problems that recommender systems must face: (i) protecting the privacy of users and (ii) protecting the other participants in the recommendation process. Recommenders can issue personalized recommendations by taking into account not only the profile of the documents but also the private information that customers send to the recommender. User profiles therefore include personal and highly sensitive information, such as their likes and dislikes. In order to develop a useful recommender system and improve its effectiveness, we believe that users should not be afraid to express their preferences. To that end, the personal information included in user profiles must be protected and the user's privacy guaranteed.
The second challenge from the point of view of security involves a new kind of attack. Since preventing the illegal distribution of copyrighted documents by technical means has not been effective, copyright holders shifted their targets to attack document providers and any other participant that helps in the process of distributing documents. In addition, treaties and laws such as ACTA, the SOPA act in the US or the "Sinde-Wert" law in Spain show the interest of states all over the world in controlling and prosecuting these intermediate nodes. Recent trials such as MegaUpload, PirateBay or the case against Mr. Pablo Soto in Spain show that these threats are a reality.
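
The profile model and the dimension-reducing projection described in this abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not say which aggregation or similarity metric is used, so the sketch takes a component-wise mean and cosine similarity, and it uses a plain Gaussian random projection as one common construction associated with the Johnson-Lindenstrauss lemma. The function names are hypothetical.

```python
import math
import random


def aggregate(profiles):
    """User profile as the component-wise mean of owned document profiles (assumed)."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def cosine(u, v):
    """Cosine similarity between two profile vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def random_projection_matrix(d, k, seed=0):
    """k x d matrix of N(0, 1/k) entries, used to map d dimensions down to k."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, v):
    """Apply the projection matrix to one profile vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]


if __name__ == "__main__":
    docs = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 2.0, 1.0]]
    user = aggregate(docs)
    matrix = random_projection_matrix(d=4, k=2)
    print(cosine(user, docs[0]), project(matrix, user))
```

Because users, documents and queries all share one vector space in this sketch, the same `cosine` function applies to any pair of them, and only the projected vectors would need to be disclosed.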

    Adaptive P2P platform for data sharing

    Ph.D., Doctor of Philosophy

    Motivational visualization for resources-sharing online communities

    As online applications such as newsgroups, internet game-rooms, chat-rooms, and peer-to-peer (P2P) resource-sharing systems have become popular, online community visualization has become a hot research topic. Different forms and metaphors of visualization, focused on various aspects of online communities, have been proposed. In this thesis, I propose a prototype online community visualization designed to motivate user contributions of various kinds and stimulate users to participate in the online community more actively. Uneven participation is a well-known problem in human society; according to the 80-20 rule, 20% of the people make 80% of the contributions: for example, 20% of the employees in a company do 80% of the work. This problem exists in all kinds of online communities, e.g. newsgroups and chat-rooms, but it is particularly critical for P2P online resource-sharing communities. Such communities do not have a central server and rely solely on the peers not just to provide contributions, but also to sustain the infrastructure. Large P2P file-sharing communities like KaZaA and Limewire can provide the redundancy of peers and resources needed to support the infrastructure and the availability of resources. However, when an online community is small, for example the students in a class, a research group, a department, or a school, the lack of users makes it hard to reach a “critical mass” of user participation, leading to poor service and resource availability, which in turn reduces users’ interest in participating in the system. To attract users and motivate them to contribute more to an online resource-sharing community, I propose to use motivational visualization of the community and the contributions of its members. The motivational effect of the visualization is grounded in two theories from social psychology that explain how individuals align their behaviour with each other and with their group (community). In this thesis, I discuss three stages in the design of the visualization and the subsequent redesigns following results from evaluation and user feedback.

    Analyzing and Modeling Real-World Phenomena with Complex Networks: A Survey of Applications

    The success of new scientific areas can be assessed by their potential for contributing to new theoretical approaches and to applications to real-world problems. Complex networks have fared extremely well in both of these aspects, with their sound theoretical basis developed over the years and with a variety of applications. In this survey, we analyze the applications of complex networks to real-world problems and data, with emphasis on representation, analysis and modeling, after an introduction to the main concepts and models. A diversity of phenomena are surveyed, which may be classified into no fewer than 22 areas, providing a clear indication of the impact of the field of complex networks.
    Comment: 103 pages, 3 figures and 7 tables. A working manuscript, suggestions are welcome

    On social and technical aspects of managing mobile Ad-hoc communities

    Social software describes a class of applications that let users communicate with friends and exchange information over the Internet. As mobile processors grow more powerful, mobile phones are turning into fully fledged computers and opening new possibilities for mobile use of social software. Because people often carry their mobile phones with them, comparable mobile applications can be tailored more closely to their immediate surroundings. Possible scenarios include support for real-world meetings and the member interactions associated with them. The client-server platforms often used for this purpose, however, were never designed for such highly flexible group situations. Mobile encounter networks (MENs) promise more flexibility here. A MEN is a mobile peer-to-peer platform operated over a short-range wireless network. In this network, contributions spread from one mobile device to the next through a spatial diffusion process. This has two key advantages. First, direct message exchange is better suited to spreading situation-specific information, since the relevance of information decreases with distance. At the same time, content intended for a broad audience can be carried to distant areas by members with outstanding mobility characteristics. One disadvantage, however, is the high resource consumption. To solve this problem we develop a framework for supporting mobile ad-hoc groups that lets us exploit group synergies in a targeted way. This framework offers services for managing group dynamics and for disseminating content. Using social network analysis, the technical infrastructure is continuously adapted to the real-world situation without requiring user intervention.
Possible relationships between neighboring people are analyzed on the basis of earlier encounters, spontaneous group formation is identified with clustering methods, and each group member is assigned a suitable role through position analysis. A basic prerequisite for successful cooperation is efficient knowledge exchange within a community. As small-world theory shows, people can spread knowledge efficiently even when their decisions are based only on local information. Various researchers have exploited this by constructing short dissemination paths that chain together highly connected members of a community. However, this method cannot simply be transferred to MENs, because transfer time is limited, in contrast to the wired Internet. Our approach is therefore based on the least effort transfer hypothesis introduced by Reagan et al. This hypothesis states that people pass on knowledge only if the effort of transferring the information stays within certain limits. Successful knowledge transfer in this case depends on the background knowledge of all participants, which in turn depends on various cognitive and social factors. Accordingly, we derive a diffusion method that can classify content into different complexity levels and adapt data transfers to the social situation encountered. With a prototype we evaluate the feasibility of the group and information management components of our framework. Because laboratory experiments cannot give sufficient insight into diffusion properties at larger scale, we simulate the diffusion of contributions. For this we use a traffic simulation in which agents are additionally extended with activity-related, social and territorial models.
To ensure a realistic simulation, these models are generated in accordance with various studies of city life. The technical transfer process is parameterized using the results of a preceding prototype study. During a simulation run, agents move on a city map and collect contact and contribution data. Analyzing the network topology for small-world properties afterwards reveals a network structure with a pronounced tendency toward clustering (friendship networks) and a shorter-than-average path length. Evidently, everyday mobility suffices to form enough links between community members. The subsequent diffusion analysis shows that the approach achieves a reach comparable to a flooding-based approach, albeit with initial delays. Because our method limits the number of information carriers to central group members when changing location, more bandwidth is available for data exchange. Ordinary members (without leading roles) exchange content mainly in situations that are not time-critical. This has the positive side effect that considerably fewer copies have to be evicted from the cache. Changing the contribution category during the simulation shows that time-dependent content spreads better through regular contacts and time-independent content through random contacts. A final precision-recall analysis shows that ordinary group members achieve better precision, and central members better recall, compared with traditional approaches.
One explanation is that the group-based cache approach we chose leads to fewer cleaning cycles across all group members and is therefore more sustainable.
    Social software encompasses a range of software systems that allow users to interact and share data. This computer-mediated communication has become very popular through social networking sites such as Facebook and Twitter. The evolution of smartphones toward mobile computers opens new possibilities for using social software in mobile scenarios as well. Since mobile phones are carried permanently by their owners, however, the focus of support shifts much more strongly toward promoting and augmenting real group gatherings. Traditional client-server platforms are not flexible enough to support complex and dynamic human encounter behavior. Mobile encounter networks (MENs), which represent a mobile peer-to-peer platform on top of a short-range wireless network, promise better flexibility. MENs diffuse content from neighbor to neighbor in a spatial diffusion process. For physical group gatherings this is advantageous for two reasons. Direct device-to-device interactions encourage sharing of situation-dependent content. Moreover, content is not necessarily locked within friend groups and may trigger networking effects by reaching larger audiences through user mobility. One disadvantage, however, is the high resource usage. We develop a social software framework for mobile ad-hoc groups that partly solves this problem. This framework supports services for managing group dynamics and content diffusion within and between groups. Social network analysis, an inherent part of the framework, is used to continuously align internal community states with real-world encounter situations. We qualify interpersonal relationships based on encounter and communication statistics, identify social groups through incremental clustering, and assign diffusion roles through position analysis.
To achieve efficient content dissemination we make use of social diffusion phenomena. Other researchers have experimented extensively with the small-world model, as it shows that people transfer knowledge based on local knowledge yet are still capable of diffusing it efficiently on a global scale. Their approach is often based on identifying short paths through member connectivity. However, this scenario is not applicable to MENs, as transfer time is limited in contrast to the wired Internet. Our approach is therefore based on the least effort transfer theory. Following Reagan et al., who first postulated this hypothesis, people transfer knowledge only if the transfer effort is within specific limits, which depend on different social and cognitive factors. We derive routing mechanisms that can distinguish between different content complexities and that apply information about peers' expertise and social networks to identify advantageous paths and content transfer options. We evaluate the feasibility of the group management and content transfer components with prototypes. Since laboratory settings do not yield information about large-scale diffusion, we also conduct a multi-agent simulation to evaluate the diffusion capabilities of the system. Experience from an earlier prototype implementation was used to parameterize the technical routing process. To emulate realistic community life, we assigned each agent an individual daily agenda, social contacts and territory preferences, specified according to the outcomes of different urban life surveys. During the simulation, agents move on a city map according to these models and collect contact- and content-specific data. Analyzing the network topology for small-world characteristics shows a structure with a high tendency toward clustering (friend networks) and a short average path length. Daily urban mobility creates enough opportunities to form shortcuts through the community.
Content diffusion analysis shows that our approach reaches a similar number of peers as network flooding, but with delays in the beginning. Since our approach artificially limits the number of intermediaries to central community peers, more bandwidth is available while traveling and more content can be transferred than with the flooding approach. Ordinary peers appear to hold significantly fewer content replicas when an unlimited cache is assumed, which suggests that our mechanism is more efficient. By varying the content type used during the simulation, we find that time-dependent content is better disseminated through frequent contacts and time-independent content through random contacts. A precision-recall analysis of peers' caches shows that ordinary peers gain better overall context precision, and central peers better community recall. One explanation is that the shared cache approach leads to fewer content replacements in the cache than, for instance, the least recently used cache strategy.
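
The two small-world indicators named in this abstract, clustering tendency and average path length, can be computed on a contact graph as in the sketch below. It assumes an undirected, unweighted graph stored as a dictionary mapping each node to the set of its neighbors; the thesis's own simulation data and tooling are not described in the abstract, so this only illustrates the standard definitions.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(graph):
    """Mean local clustering coefficient of an undirected graph."""
    total = 0.0
    for node, neighbors in graph.items():
        k = len(neighbors)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        # Count neighbor pairs that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Hypothetical encounter graph: a triangle of friends plus one acquaintance.
    contacts = {
        "a": {"b", "c"},
        "b": {"a", "c"},
        "c": {"a", "b", "d"},
        "d": {"c"},
    }
    print(clustering_coefficient(contacts), average_path_length(contacts))
```

A network is usually called small-world when its clustering coefficient is much higher than that of a random graph of the same size while its average path length stays comparably short.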

    Engineering a semantic web trust infrastructure

    The ability to judge the trustworthiness of information is an important and challenging problem in the field of Semantic Web research. In this thesis, we take an end-to-end look at the challenges posed by trust on the Semantic Web, and present contributions in three areas: a Semantic Web identity vocabulary, a system for bootstrapping trust environments, and a framework for trust-aware information management. Typically Semantic Web agents, which consume and produce information, are not described with sufficient information to permit those interacting with them to make good judgements of trustworthiness. A descriptive vocabulary for agent identity is required to enable effective inter-agent discourse and the growth of trust and reputation within the Semantic Web; we therefore present such a foundational identity ontology for describing web-based agents. It is anticipated that the Semantic Web will suffer from a trust network bootstrapping problem. In this thesis, we propose a novel approach which harnesses open data to bootstrap trust in new trust environments. This approach brings together public records published by a range of trusted institutions in order to encourage trust in identities within new environments. Information integrity and provenance are both critical prerequisites for well-founded judgements of information trustworthiness. We propose a modification to the RDF Named Graph data model in order to address serious representational limitations with the named graph proposal, which affect the ability to cleanly represent claims and provenance records. Next, we propose a novel graph-based approach for recording the provenance of derived information. This approach offers computational and memory savings while maintaining the ability to answer graph-level provenance questions. In addition, it allows new optimisations such as strategies to avoid needless repeat computation, and a delta-based storage strategy which avoids data duplication.