78 research outputs found

    Uniform and Ergodic Sampling in Unstructured Peer-to-Peer Systems with Malicious Nodes

    Get PDF
    ISBN: 978-3-642-17652-4International audienceWe consider the problem of uniform sampling in large scale open systems. Uniform sampling is a fundamental schema that guarantees that any individual in a population has the same probability to be selected as sample. An important issue that seriously hampers the feasibility of uniform sampling in open and large scale systems is the inevitable presence of malicious nodes. In this paper we show that restricting the number of requests that malicious nodes can issue and allowing for a full knowledge of the composition of the system is a necessary and sufficient condition to guarantee uniform and ergodic sampling. In a nutshell, a uniform and ergodic sampling guarantees that any node in the system is equally likely to appear as a sample at any non malicious node in the system and that infinitely often any nodes have a non null probability to appear as a sample at any honest nodes

    A correlation-aware data placement strategy for key-value stores

    Get PDF
    Key-value stores hold the unprecedented bulk of the data produced by applications such as social networks. Their scalability and availability requirements often outweigh sacri cing richer data and pro- cessing models, and even elementary data consistency. Moreover, existing key-value stores have only random or order based placement strategies. In this paper we exploit arbitrary data relations easily expressed by the application to foster data locality and improve the performance of com- plex queries common in social network read-intensive workloads. We present a novel data placement strategy, supporting dynamic tags, based on multidimensional locality-preserving mappings. We compare our data placement strategy with the ones used in existing key-value stores under the workload of a typical social network appli- cation and show that the proposed correlation-aware data placement strategy o ers a major improvement on the system's overall response time and network requirements

    On the Power of the Adversary to Solve the Node Sampling Problem

    Get PDF
    International audienceWe study the problem of achieving uniform and fresh peer sampling in large scale dynamic systems under adversarial behaviors. Briefly, uniform and fresh peer sampling guarantees that any node in the system is equally likely to appear as a sample at any non malicious node in the system and that infinitely often any node has a non-null probability to appear as a sample of honest nodes. This sample is built locally out of a stream of node identifiers received at each node. An important issue that seriously hampers the feasibility of node sampling in open and large scale systems is the unavoidable presence of malicious nodes. The objective of malicious nodes mainly consists in continuously and largely biasing the input data stream out of which samples are obtained, to prevent (honest) nodes from being selected as samples. First we demonstrate that restricting the number of requests that malicious nodes can issue and providing a full knowledge of the composition of the system is a necessary and sufficient condition to guarantee uniform and fresh sampling. We also define and study two types of adversary models: an omniscient adversary that has the capacity to eavesdrop on all the messages that are exchanged within the system, and a blind adversary that can only observe messages that have been sent or received by nodes it controls. The former model allows us to derive lower bounds on the impact that the adversary has on the sampling functionality while the latter one corresponds to a more realistic setting. Given any sampling strategy, we quantify the minimum effort exerted by both types of adversary on any input stream to prevent this sampling strategy from outputting a uniform and fresh sample

    Towards a Framework for DHT Distributed Computing

    Get PDF
    Distributed Hash Tables (DHTs) are protocols and frameworks used by peer-to-peer (P2P) systems. They are used as the organizational backbone for many P2P file-sharing systems due to their scalability, fault-tolerance, and load-balancing properties. These same properties are highly desirable in a distributed computing environment, especially one that wants to use heterogeneous components. We show that DHTs can be used not only as the framework to build a P2P file-sharing service, but as a P2P distributed computing platform. We propose creating a P2P distributed computing framework using distributed hash tables, based on our prototype system ChordReduce. This framework would make it simple and efficient for developers to create their own distributed computing applications. Unlike Hadoop and similar MapReduce frameworks, our framework can be used both in both the context of a datacenter or as part of a P2P computing platform. This opens up new possibilities for building platforms to distributed computing problems. One advantage our system will have is an autonomous load-balancing mechanism. Nodes will be able to independently acquire work from other nodes in the network, rather than sitting idle. More powerful nodes in the network will be able use the mechanism to acquire more work, exploiting the heterogeneity of the network. By utilizing the load-balancing algorithm, a datacenter could easily leverage additional P2P resources at runtime on an as needed basis. Our framework will allow MapReduce-like or distributed machine learning platforms to be easily deployed in a greater variety of contexts

    Scalable discovery of networked data : Algorithms, Infrastructure, Applications

    Get PDF
    Harmelen, F.A.H. van [Promotor]Siebes, R.M. [Copromotor

    Proof of Latency Using a Verifiable Delay Function

    Get PDF
    In this thesis I present an interactive public-coin protocol called Proof of Latency (PoL) that aims to improve connections in peer-to-peer networks by measuring latencies with logical clocks built from verifiable delay functions (VDF). PoL is a tuple of three algorithms, Setup(e, λ), VCOpen(c, e), and Measure(g, T, l_p, l_v). Setup creates a vector commitment (VC), from which a vector commitment opening corresponding to a collaborator's public key is taken in VCOpen, which then gets used to create a common reference string used in Measure. If no collusion gets detected by neither party, a signed proof is ready for advertising. PoL is agnostic in terms of the individual implementations of the VC or VDF used. This said, I present a proof of concept in the form of a state machine implemented in Rust that uses RSA-2048, Catalano-Fiore vector commitments and Wesolowski's VDF to demonstrate PoL. As VDFs themselves have been shown to be useful in timestamping, they seem to work as a measurement of time in this context as well, albeit requiring a public performance metric for each peer to compare to during the measurement. I have imagined many use cases for PoL, like proving a geographical location, working as a benchmark query, or using the proofs to calculate VDFs with the latencies between peers themselves. As it stands, PoL works as a distance bounding protocol between two participants, considering their computing performance is relatively similar. More work is needed to verify the soundness of PoL as a publicly verifiable proof that a third party can believe in.TÀssÀ tutkielmassa esitÀn interaktiivisen protokollan nimeltÀ Proof of latency (PoL), joka pyrkii parantamaan yhteyksiÀ vertaisverkoissa mittaamalla viivettÀ todennettavasta viivefunktiosta rakennetulla loogisella kellolla. Proof of latency koostuu kolmesta algoritmista, Setup(e, λ), VCOpen(c, e) ja Measure(g, T, l_p, l_v). Setup luo vektorisitoumuksen, josta luodaan avaus algoritmissa VCOpen avaamalla vektorisitoumus indeksistÀ, joka kuvautuu toisen mittaavan osapuolen julkiseen avaimeen. TÀtÀ avausta kÀytetÀÀn luomaan yleinen viitemerkkijono, jota kÀytetÀÀn algoritmissa Measure alkupisteenÀ molempien osapuolien todennettavissa viivefunktioissa mittaamaan viivettÀ. Jos kumpikin osapuoli ei huomaa virheitÀ mittauksessa, on heidÀn allekirjoittama todistus valmis mainostettavaksi vertaisverkossa. PoL ei ota kantaa sen kÀyttÀmien kryptografisten funktioiden implementaatioon. TÀstÀ huolimatta olen ohjelmoinut protokollasta prototyypin Rust-ohjelmointikielellÀ kÀyttÀen RSA-2048:tta, Catalano-Fiore--vektorisitoumuksia ja Wesolowskin todennettavaa viivefunktiota protokollan esittelyyn. Todistettavat viivefunktiot ovat osoittaneet hyödyllisiksi aikaleimauksessa, mikÀ nÀyttÀisi osoittavan niiden soveltumisen myös ajan mittaamiseen tÀssÀ konteksissa, huolimatta siitÀ ettÀ jokaisen osapuolen tulee ilmoittaa julkisesti teholukema, joka kuvaa niiden tehokkuutta viivefunktioiden laskemisessa. Toinen osapuoli kÀyttÀÀ tÀtÀ lukemaa arvioimaan valehteliko toinen viivemittauksessa. Olen kuvitellut monta kÀyttökohdetta PoL:lle, kuten maantieteellisen sijainnin todistaminen, suorituskykytestaus, tai itse viivetodistuksien kÀyttÀminen uusien viivetodistusten laskemisessa vertaisverkon osallistujien vÀlillÀ. TÀllÀ hetkellÀ PoL toimii etÀisyydenmittausprotokollana kahden osallistujan vÀlillÀ, jos niiden suorituskyvyt ovat tarpeeksi lÀhellÀ toisiaan. Protokolla tarvitsee lisÀtutkimusta sen suhteen, voiko se toimia uskottavana todistuksena kolmansille osapuolille kahden vertaisverkon osallistujan vÀlisestÀ viiveestÀ

    A peer-to-peer file search and download protocol for wireless ad-hoc networks

    Get PDF
    Deployment of traditional peer-to-peer file sharing systems on a wireless ad-hoc network introduces several challenges. Information and workload distribution as well as routing are major problems for members of a wireless ad-hoc network, which are only aware of their immediate neighborhood. In this paper, we propose a file sharing system that is able to answer location queries, and also discover and maintain the routing information that is used to transfer files from a source peer to another peer. We present a cross-layer design, where the lookup and routing functionality are unified. The system works according to peer-to-peer principles, distributes the location information of the shared files among the members of the network. The paper includes a sample scenario to make the operations of the system clearer. The performance of the system is evaluated using simulation results and analysis is provided for comparing our approach with a flooding-based, unstructured approach. © 2008 Elsevier B.V. All rights reserved

    RACED: Routing in Payment Channel Networks Using Distributed Hash Tables

    Full text link
    The Bitcoin scalability problem has led to the development of off-chain financial mechanisms such as payment channel networks (PCNs) which help users process transactions of varying amounts, including micro-payment transactions, without writing each transaction to the blockchain. Since PCNs only allow path-based transactions, effective, secure routing protocols that find a path between a sender and receiver are fundamental to PCN operations. In this paper, we propose RACED, a routing protocol that leverages the idea of Distributed Hash Tables (DHTs) to route transactions in PCNs in a fast and secure way. Our experiments on real-world transaction datasets show that RACED gives an average transaction success ratio of 98.74%, an average pathfinding time of 31.242 seconds, which is 1.65∗1031.65*10^3, 1.8∗1031.8*10^3, and 4∗1024*10^2 times faster than three other recent routing protocols that offer comparable security/privacy properties. We rigorously analyze and prove the security of RACED in the Universal Composability framework.Comment: A short version of this work has been accepted to the 19th ACM ASIA Conference on Computer and Communications Security (ACM ASIACCS 2024

    Self-management for large-scale distributed systems

    Get PDF
    Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture simplifying the design of distributed self-management. Niche provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time consuming and error prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store supporting multiple consistency levels that is based on a peer-to-peer network. The store enables the tradeoff between high availability and data consistency. Using majority allows avoiding potential drawbacks of a master-based consistency control, namely, a single-point of failure and a potential performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems with the focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a State-Space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using state-space model that enables to trade-off performance for cost. We describe the steps in designing an elasticity controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control

    Support Efficient, Scalable, and Online Social Spam Detection in System

    Get PDF
    The broad success of online social networks (OSNs) has created fertile soil for the emergence and fast spread of social spam. Fake news, malicious URL links, fraudulent advertisements, fake reviews, and biased propaganda are bringing serious consequences for both virtual social networks and human life in the real world. Effectively detecting social spam is a hot topic in both academia and industry. However, traditional social spam detection techniques are limited to centralized processing on top of one specific data source but ignore the social spam correlations of distributed data sources. Moreover, a few research efforts are conducting in integrating the stream system (e.g., Storm, Spark) with the large-scale social spam detection, but they typically ignore the specific details in managing and recovering interim states during the social stream data processing. We observed that social spammers who aim to advertise their products or post victim links are more frequently spreading malicious posts during a very short period of time. They are quite smart to adapt themselves to old models that were trained based on historical records. Therefore, these bring a question: how can we uncover and defend against these online spam activities in an online and scalable manner? In this dissertation, we present there systems that support scalable and online social spam detection from streaming social data: (1) the first part introduces Oases, a scalable system that can support large-scale online social spam detection, (2) the second part introduces a system named SpamHunter, a novel system that supports efficient online scalable spam detection in social networks. The system gives novel insights in guaranteeing the efficiency of the modern stream applications by leveraging the spam correlations at scale, and (3) the third part refers to the state recovery during social spam detection, it introduces a customizable state recovery framework that provides fast and scalable state recovery mechanisms for protecting large distributed states in social spam detection applications
    • 

    corecore