18 research outputs found

    Gozar: NAT-friendly Peer Sampling with One-Hop Distributed NAT Traversal

    Get PDF
    Gossip-based peer sampling protocols have been widely used as a building block for many large-scale distributed applications. However, Network Address Translation gateways (NATs) cause most existing gossiping protocols to break down, as nodes cannot establish direct connections to nodes behind NATs (private nodes). In addition, most existing NAT traversal algorithms for establishing connectivity to private nodes rely on third-party servers running at well-known, public IP addresses. In this paper, we present Gozar, a gossip-based peer sampling service that: (i) provides uniform random samples in the presence of NATs, and (ii) enables direct connectivity to sampled nodes using a fully distributed NAT traversal service, where connection messages require only a single hop to connect to private nodes. We show in simulation that Gozar preserves the randomness properties of a gossip-based peer sampling service. We show the robustness of Gozar when a large fraction of nodes reside behind NATs and also in catastrophic failure scenarios. For example, if 80% of nodes are behind NATs and 80% of the nodes fail, more than 92% of the remaining nodes stay connected. In addition, we compare Gozar with existing NAT-friendly gossip-based peer sampling services, Nylon and ARRG. We show that Gozar is the only system that supports one-hop NAT traversal, and that its overhead is roughly half of Nylon’s.
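The periodic view exchange that underlies such peer sampling services can be sketched as follows. This is a minimal, generic gossip shuffle, not Gozar's actual protocol; the `Node` class, `VIEW_SIZE`, and `SHUFFLE_LEN` are illustrative assumptions:

```python
import random

VIEW_SIZE = 5      # maximum entries in each node's partial view
SHUFFLE_LEN = 3    # entries exchanged per gossip round

class Node:
    def __init__(self, nid, seed_view):
        self.nid = nid
        self.view = list(seed_view)[:VIEW_SIZE]

    def shuffle_with(self, peer):
        # Each side sends a random subset of its view plus its own id.
        mine = random.sample(self.view, min(SHUFFLE_LEN, len(self.view))) + [self.nid]
        theirs = random.sample(peer.view, min(SHUFFLE_LEN, len(peer.view))) + [peer.nid]
        self._merge(theirs)
        peer._merge(mine)

    def _merge(self, received):
        for nid in received:
            if nid != self.nid and nid not in self.view:
                self.view.append(nid)
        # Trim oldest entries to keep the view bounded.
        while len(self.view) > VIEW_SIZE:
            self.view.pop(0)
```

Repeating such shuffles between random pairs keeps every node's view small, fresh, and close to a uniform sample of the live membership.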

    BitTorious volunteer: server-side extensions for centrally-managed volunteer storage in BitTorrent swarms

    Get PDF
    Background: Our publication of the BitTorious portal [1] demonstrated the ability to create a privatized distributed data warehouse of sufficient magnitude for real-world bioinformatics studies using minimal changes to the standard BitTorrent tracker protocol. In this second phase, we release a new server-side specification to accept anonymous philanthropic storage donations by the general public, wherein a small portion of each user’s local disk may be used for archival of scientific data. We have implemented the server-side announcement and control portions of this BitTorrent extension in v3.0.0 of the BitTorious portal, upon which compatible clients may be built. Results: Automated test cases for the BitTorious Volunteer extensions have been added to the portal’s v3.0.0 release, supporting validation of the “peer affinity” concept and announcement protocol introduced by this specification. Additionally, a separate reference implementation of affinity calculation has been provided in C++ for informaticians wishing to integrate it into libtorrent-based projects. Conclusions: The BitTorrent “affinity” extensions as provided in the BitTorious portal reference implementation allow data publishers to crowdsource the extreme storage prerequisites for research in “big data” fields. With sufficient awareness and adoption of BitTorious Volunteer-based clients by the general public, the BitTorious portal may be able to provide peta-scale storage resources to the scientific community at relatively insignificant financial cost. The electronic version of this article is the complete one and can be found online at: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0779-
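The abstract does not specify the affinity formula itself. Purely as an illustration, a peer-affinity score for volunteer storage might combine how well a volunteer honours its storage pledge with how much it seeds; the function below is a hypothetical sketch, not the BitTorious specification, and every name and the formula are assumptions:

```python
def peer_affinity(bytes_uploaded, bytes_downloaded, pledged_bytes, stored_bytes):
    """Hypothetical affinity in [0, 1]: reward volunteers who both honour
    their storage pledge and actively seed data to other peers."""
    if pledged_bytes == 0:
        return 0.0
    # Fraction of the pledged space actually holding project data.
    pledge_ratio = min(stored_bytes / pledged_bytes, 1.0)
    # Upload-to-download ratio, capped at 1 so seeding beyond parity
    # does not mask an unfulfilled pledge.
    seed_ratio = bytes_uploaded / max(bytes_downloaded, 1)
    return pledge_ratio * min(seed_ratio, 1.0)
```

A tracker could use such a score when deciding which volunteers to prefer for replica placement.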

    Mass Adoption of NATs: Survey and experiments on carrier-grade NATs

    Full text link
    In recent times, the prevalence of home NATs and the widespread implementation of Carrier-Grade NATs (CGNATs) have posed significant challenges to various applications, particularly those relying on peer-to-peer communication. This paper addresses these issues by conducting a thorough review of the related literature and exploring potential techniques to mitigate the problems. The literature review focuses on the disruptive effects of home NATs and CGNATs on application performance. Additionally, the study examines existing approaches used to alleviate these disruptions. Furthermore, this paper presents a comprehensive guide on how to puncture a NAT and facilitate direct communication between two peers behind any type of NAT. The techniques outlined in the guide are rigorously tested using a simple application running the IPv8 network overlay, along with its built-in NAT penetration procedures. To evaluate the effectiveness of the proposed techniques, 5G communication is established between two phones using four different Dutch telephone carriers. The results indicate successful cross-connectivity with three out of the four carriers tested, showcasing the practical applicability of the suggested methods. Comment: 12 pages, 9 figures
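Why hole punching succeeds against NATs with endpoint-independent mapping can be shown with a toy model. The `ConeNAT` class below is a deliberately simplified simulation for illustration only, not the paper's IPv8 procedure:

```python
class ConeNAT:
    """Toy endpoint-independent NAT: the first outbound packet from a
    private endpoint allocates a public port; once that mapping exists,
    inbound traffic to the port is accepted from any remote host."""
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.mappings = {}          # (private_ip, private_port) -> public port
        self.next_port = 40000

    def outbound(self, private_ep):
        # Outbound traffic creates (or reuses) the public mapping.
        if private_ep not in self.mappings:
            self.mappings[private_ep] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.mappings[private_ep])

    def inbound_allowed(self, public_port):
        return public_port in self.mappings.values()

# Hole punching: a rendezvous server tells each peer the other's mapped
# public endpoint; each peer then sends one outbound packet, which opens
# its own NAT, after which the other side's traffic is let through.
```

Symmetric NATs, which allocate a fresh mapping per destination, defeat this simple model; that distinction is exactly why carrier behaviour matters in the paper's experiments.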

    Mathematical analysis of scheduling policies in peer-to-peer video streaming networks

    Get PDF
    Peer-to-peer networks are self-managed virtual communities, built at the application layer on top of the Internet infrastructure, where users (called peers) share resources (bandwidth, memory, processing) to achieve a common goal. Video distribution is the most challenging application, given its bandwidth requirements. There are essentially three video services. The simplest is download, where a set of servers holds the original content and users must download it completely before playback. The second is video on demand, where peers join a virtual network whenever they request a video and start a progressive online download. The last is live video, where the content is generated, distributed and viewed simultaneously. This thesis studies design aspects of live and on-demand video distribution. It presents a mathematical analysis of the stability and capacity of hybrid, peer-assisted on-demand distribution architectures. Peers start concurrent downloads of multiple contents and may disconnect at will. The expected evolution of the system is predicted with a deterministic fluid model, assuming Poisson arrivals and exponential departures. A sub-model with sequential (non-simultaneous) downloads is globally and structurally stable, independently of the network parameters. Using Little’s Law, the mean residence time of users in a stationary sequential on-demand system is determined. It is proved theoretically that the hybrid peer-assisted approach always outperforms a pure client-server technology.
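The Little's Law step in such an analysis reduces to simple arithmetic: with mean arrival rate λ and mean population L, the mean residence time is W = L/λ. A minimal illustration, with invented figures rather than numbers from the thesis:

```python
def mean_residence_time(arrival_rate, mean_population):
    # Little's Law: L = lambda * W, hence W = L / lambda.
    return mean_population / arrival_rate

# With (invented) figures of 2 peer arrivals per second and 300 peers
# in the system on average, each peer stays 150 seconds on average.
```

The appeal of Little's Law here is that it needs only stationarity, not the details of the scheduling policy.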

    Distributed Optimization of P2P Media Delivery Overlays

    Get PDF
    Media streaming over the Internet is becoming increasingly popular. Currently, most media is delivered using global content-delivery networks, providing a scalable and robust client-server model. However, content delivery infrastructures are expensive. One approach to reduce the cost of media delivery is to use peer-to-peer (P2P) overlay networks, where nodes share responsibility for delivering the media to one another. The main challenges in P2P media streaming using overlay networks include: (i) nodes should receive the stream with respect to certain timing constraints, (ii) the overlay should adapt to changes in the network, e.g., varying bandwidth capacity and the join/failure of nodes, (iii) nodes should be incentivized to contribute and share their resources, and (iv) nodes should be able to establish connectivity to other nodes behind NATs. In this work, we meet these requirements by presenting P2P solutions for live media streaming, as well as proposing a distributed NAT traversal solution. First of all, we introduce a distributed market model to construct an approximately minimal-height multiple-tree streaming overlay for content delivery, in gradienTv. In this system, we assume all the nodes are cooperative and execute the protocol. However, in reality, there may exist some opportunistic nodes, free-riders, that take advantage of the system without contributing to content distribution. To overcome this problem, we extend our market model in Sepidar to be effective in deterring free-riders. However, gradienTv and Sepidar are tree-based solutions, which are fragile in high churn and failure scenarios. We present a solution to this problem in GLive that provides a more robust overlay by replacing the tree structure with a mesh. We show in simulation that the mesh-based overlay outperforms the multiple-tree overlay. 
Moreover, we compare the performance of all our systems with the state-of-the-art NewCoolstreaming, and observe that they provide better playback continuity and lower playback latency than NewCoolstreaming under a variety of experimental scenarios. Although our distributed market model can be run against a random sample of nodes, we improve its convergence time by executing it against a sample of nodes taken from the Gradient overlay. The Gradient overlay organizes nodes in a topology using a local utility value at each node, such that nodes are ordered in descending utility values away from a core of the highest-utility nodes. The evaluations show that the streaming overlays converge faster when our market model works on top of the Gradient overlay. We use a gossip-based peer sampling service in our streaming systems to provide each node with a small list of live nodes. However, in the Internet, where a high percentage of nodes are behind NATs, existing gossiping protocols break down. To solve this problem, we present Gozar, a NAT-friendly gossip-based peer sampling service that: (i) provides uniform random samples in the presence of NATs, and (ii) enables direct connectivity to sampled nodes using a fully distributed NAT traversal service. We compare Gozar with the state-of-the-art NAT-friendly gossip-based peer sampling service, Nylon, and show that only Gozar supports one-hop NAT traversal, and that its overhead is roughly half of Nylon’s.
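A neighbour-preference rule for a Gradient-style overlay can be sketched as follows. This is one common formulation (prefer the candidate whose utility is closest to, and preferably above, one's own); the function name and the `(node_id, utility)` tuple layout are illustrative assumptions, not the thesis's exact protocol:

```python
def gradient_preference(my_utility, candidates):
    """Pick a neighbour from `candidates`, a list of (node_id, utility)
    pairs: prefer the one whose utility is closest to our own from above,
    falling back to the overall closest if none is higher. Repeated
    locally at every node, this orders views into a gradient toward
    the high-utility core."""
    above = [c for c in candidates if c[1] >= my_utility]
    pool = above if above else candidates
    return min(pool, key=lambda c: abs(c[1] - my_utility))
```

Running a market model over such a topology lets nodes trade with partners of similar capacity, which is what speeds up convergence relative to random sampling.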

    Experimental analysis of the socio-economic phenomena in the BitTorrent ecosystem

    Get PDF
    BitTorrent is the most successful Peer-to-Peer (P2P) application and is responsible for a major portion of Internet traffic. It has been studied extensively using simulations, models and real measurements. Although simulations and modelling are easier to perform, they typically simplify the analysed problems, and in the case of BitTorrent they are likely to miss some of the effects which occur in real swarms. Thus, in this thesis we rely on real measurements. In the first part of the thesis we present a summary of the measurement techniques used so far and use it as a base to design our own tools, which allow us to perform different types of analysis at different resolution levels. Using these tools we collect several large-scale datasets to study different aspects of BitTorrent, with a special focus on socio-economic aspects. Using our datasets, we first investigate the topology of real BitTorrent swarms and how traffic is actually exchanged among peers. Our analysis shows that the resilience of BitTorrent swarms is lower than that of corresponding random graphs. We also observe that ISP policies, locality-aware clients and network events (e.g., network congestion) lead to a locality-biased composition of neighbourhoods in the swarms. This means that a peer has more neighbours from its local provider than expected from a purely random neighbour-selection process. These results are of interest to companies which use BitTorrent for daily operations as well as to ISPs which carry BitTorrent traffic. In the next part of the thesis we look at BitTorrent from the perspective of content and content publishers in major BitTorrent portals. We focus on the factors that seem to drive the popularity of BitTorrent and, as a result, could affect its associated traffic in the Internet. We show that a small fraction of publishers (around 100 users) is responsible for more than two-thirds of the published content. 
Those publishers can be divided into two groups: (i) profit-driven publishers and (ii) fake publishers. The former group leverages published copyrighted content (typically very popular) on BitTorrent portals to attract content consumers to their web sites for financial gain. Removing this group may have a significant impact on the popularity of BitTorrent portals and, as a result, may affect a big portion of the Internet traffic associated with BitTorrent. The latter group is responsible for fake content, which is mostly linked to malicious activity and creates a serious threat for the BitTorrent ecosystem and for the Internet in general. To mitigate this threat, in the last part of the thesis we present a new tool named TorrentGuard for the early detection of fake content, which could help to significantly reduce the number of computer infections and scams suffered by BitTorrent users. This tool is available through a web portal and as a plugin for Vuze, a popular BitTorrent client. Finally, we present MYPROBE, a web portal that allows users to query our database and gather different pieces of information regarding BitTorrent content publishers.
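The locality-bias observation in the abstract above amounts to comparing the observed fraction of same-ISP neighbours with the fraction expected under uniform random peer selection. A minimal sketch, with illustrative names and invented inputs:

```python
def locality_bias(neighbour_isps, my_isp, swarm_isp_counts, swarm_size):
    """Ratio of the observed same-ISP neighbour fraction to the fraction
    expected if neighbours were drawn uniformly from the swarm; values
    above 1.0 indicate locality bias."""
    observed = sum(1 for isp in neighbour_isps if isp == my_isp) / len(neighbour_isps)
    expected = swarm_isp_counts[my_isp] / swarm_size
    return observed / expected
```

For example, a peer in an ISP holding 25% of the swarm, but with 75% of its neighbours in that ISP, has a bias ratio of 3.0.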

    Data management in dynamic distributed computing environments

    Get PDF
    Data management in parallel computing systems is a broad and increasingly important research topic. As network speeds have surged, so too has the movement to transition storage and computation loads to wide-area network resources. The Grid, the Cloud, and Desktop Grids all represent different aspects of this movement towards highly-scalable, distributed, and utility computing. This dissertation contends that a peer-to-peer (P2P) networking paradigm is a natural match for data sharing within and between these heterogeneous network architectures. Peer-to-peer methods such as dynamic discovery, fault-tolerance, scalability, and ad-hoc security infrastructures provide excellent mappings for many of the requirements in today’s distributed computing environment. In recent years, volunteer Desktop Grids have seen a growth in data throughput as application areas expand and new problem sets emerge. These increasing data needs require storage networks that can scale to meet future demand while also facilitating expansion into new data-intensive research areas. Current practices are to mirror data from centralized locations, a technique that is not practical for growing data sets, dynamic projects, or data-intensive applications. The fusion of Desktop and Service Grids provides an ideal use-case to research peer-to-peer data distribution strategies in a hybrid environment. Desktop Grids have a data management gap, while integration with Service Grids raises new challenges with regard to cross-platform design. The work undertaken here is two-fold: first it explores how P2P techniques can be leveraged to meet the data management needs of Desktop Grids, and second, it shows how the same distribution paradigm can provide migration paths for Service Grid data. 
The result of this research is a Peer-to-Peer Architecture for Data-Intensive Cycle Sharing (ADICS) that is capable not only of distributing volunteer computing data, but also of providing a transitional platform and storage space for migrating Service Grid jobs to Desktop Grid environments

    Live Streaming with Gossip

    Get PDF
    Peer-to-peer (P2P) architectures have emerged as a popular paradigm to support the dynamic and scalable nature of distributed systems. This is particularly relevant today, given the tremendous increase in the intensity of information exchanged over the Internet. A P2P system is typically composed of participants that are willing to contribute resources, such as memory or bandwidth, to the execution of a collaborative task providing a benefit to all participants. File sharing is probably the most widely used collaborative task, where each participant wants to receive an individual copy of some file. Users collaborate by sending fragments of the file they have already downloaded to other participants. Sharing files containing multimedia content, files that typically reach the hundreds of megabytes to gigabytes, introduces a number of challenges. Given typical participant bandwidths of hundreds of kilobits per second to a couple of megabits per second, it is unacceptable to wait until completion of the download before actually being able to use the file, as the download takes a non-negligible time. From the point of view of the participant, getting the (entire) file as fast as possible is typically not good enough. As one example, Video on Demand (VoD) is a scenario where a participant would like to start previewing the multimedia content (the stream), offered by a source, even though only a fraction of it has been received, and then continue the viewing while the rest of the content is being received. Following the same line of reasoning, new applications have emerged that rely on live streaming: the source does not own a file that it wants to share with others, but shares content as soon as it is produced. In other words, the content to distribute is live, not pre-recorded and stored. Typical examples include the broadcasting of live sports events, conferences or interviews. 
The gossip paradigm is a type of data dissemination that relies on random communication between participants in a P2P system, sharing similarities with the epidemic dissemination of diseases. An epidemic starts to spread when the source randomly chooses a set of communication partners, of size fanout, and infects them, i.e., it shares a rumor with them. This set of participants, in turn, randomly picks fanout communication partners each and infects them, i.e., shares the same rumor with them. This paradigm has many advantages, including fast propagation of rumors, a probabilistic guarantee that each rumor reaches all participants, high resilience to churn (i.e., participants that join and leave) and high scalability. Gossip therefore constitutes a candidate of choice for live streaming in large-scale systems. These advantages, however, come at a price. While disseminating data, gossip creates many duplicates of the same rumor, and participants usually receive multiple copies of the same rumor. While this is obviously a feature when it comes to guaranteeing good dissemination of the rumor when churn is high, it is a clear disadvantage when spreading large amounts of multimedia data (i.e., ordered and time-critical) to participants with limited resources, namely upload bandwidth in the case of high-bandwidth content dissemination. This thesis therefore investigates if and how the gossip paradigm can be used as a highly efficient communication system for live streaming under the following specific scenarios: (i) where participants can only contribute limited resources, (ii) when these limited resources are heterogeneously distributed among nodes, and (iii) where only a fraction of participants are contributing their fair share of work while others are freeriding. 
To meet these challenges, this thesis proposes (i) gossip++: a gossip-based protocol especially tailored for live streaming that separates the dissemination of metadata, i.e., the location of the data, from the dissemination of the data itself. By first spreading the location of the content to interested participants, the protocol avoids wasting bandwidth in sending and receiving duplicates of the payload; (ii) HEAP: a fanout adaptation mechanism that enables gossip to adapt participants' contributions to their resources while still preserving its reliability; and (iii) LiFT: a protocol to secure high-bandwidth gossip-based dissemination protocols against freeriders.
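The epidemic process described above is easy to simulate. The sketch below infects `fanout` uniformly random targets per newly infected node and reports final coverage; the parameters are invented for illustration and this is not any specific protocol from the thesis:

```python
import random

def simulate_epidemic(n, fanout, seed=42):
    """Push-style epidemic over n nodes: every newly infected node
    contacts `fanout` uniformly random nodes exactly once. Returns the
    number of nodes eventually infected and the number of rounds."""
    rng = random.Random(seed)
    infected = {0}          # node 0 is the source of the rumor
    frontier = [0]
    rounds = 0
    while frontier:
        nxt = []
        for _node in frontier:
            for target in rng.sample(range(n), fanout):
                if target not in infected:
                    infected.add(target)
                    nxt.append(target)
        frontier = nxt
        rounds += 1
    return len(infected), rounds
```

Such a simulation makes the trade-off concrete: coverage approaches 100% as fanout grows, but so does the number of duplicate deliveries, which is exactly the bandwidth cost the thesis targets.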

    Network Coding Applications

    Get PDF
    Network coding is an elegant and novel technique introduced at the turn of the millennium to improve network throughput and performance. It is expected to be a critical technology for networks of the future. This tutorial deals with wireless and content distribution networks, considered to be the most likely applications of network coding, and it also reviews emerging applications of network coding such as network monitoring and management. Multiple unicasts, security, networks with unreliable links, and quantum networks are also addressed. The preceding companion deals with the theoretical foundations of network coding.
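The canonical butterfly-network example of network coding can be reproduced with a single XOR. In that example, sink 1 receives packet A directly and sink 2 receives packet B directly, while the shared bottleneck link carries A XOR B, letting each sink recover the packet it did not receive; the sketch below demonstrates only this coding step:

```python
def xor_bytes(a, b):
    # Bitwise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))

# The bottleneck link carries the coded packet A XOR B.
packet_a, packet_b = b"hello", b"world"
coded = xor_bytes(packet_a, packet_b)

# Each sink combines the coded packet with the packet it already has.
recovered_b = xor_bytes(coded, packet_a)   # at sink 1
recovered_a = xor_bytes(coded, packet_b)   # at sink 2
```

One coded transmission thus serves both sinks, which is the throughput gain that routing alone cannot achieve on the butterfly topology.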