90 research outputs found

    Reducing the cumulative file download time and variance in a P2P overlay via proximity based peer selection

    Get PDF
    The time it takes to download a file in a peer-to-peer (P2P) overlay network is dependent on several factors. These factors include the quality of the network between peers (e.g. packet loss, latency, and link failures), distance, peer selection technique, and packet loss due to Internet Service Providers (ISPs) engaging in traffic shaping. Recent research shows that P2P download time is adversely impacted by the presence of distant peers, particularly when traffic goes across an ISP that could be engaging in P2P traffic throttling activities. It has also been observed that additional delays are introduced when distant candidate nodes for exchanging data are included during the formation of a P2P network overlay. Researchers have shifted their attention to the mechanism for peer selection. They started questioning the random technique because it ignores the location of nodes in the topology of the underlying physical network. Therefore, selecting nodes for interaction in a distributed system based on their position in the network continues to be an active area of research. The goal of this work was to reduce the cumulative file download time and variance for the majority of participating peers in a P2P network by using a peer selection mechanism that favors nearby nodes. In this proposed proximity strategy, the Internet address space is separated by IP blocks that belong to different Autonomous Systems (AS). IP blocks are further broken up into subsets named zones. Each zone is given a landmark (a.k.a. beacon), for example routers or DNS servers, with a known geographical location. At the time peers joined the network, peers were grouped into zones based on their geographical distance to the selected beacons. Peers that end up in the same zone were put at the top of the list of available nodes for interactions during the formation of the overlay. Experiments were conducted to compare the proposed proximity based peer selection strategy to the random peer selection strategy. The results indicate that the proximity technique outperforms the random approach for peer selection in a network with low packet loss and latency and also in a more realistic network subject to packet loss, traffic shaping and long distances. However, this improved performance came at the cost of additional memory (230 megabytes) and to a lesser extent some additional CPU cycles to run the additional subroutines needed to group peers into zones. The framework and algorithms developed for this work made it possible to implement a fully functioning prototype that implements the proximity strategy. This prototype enabled high fidelity testing with a real client implementation in real networks including the Internet. This made it possible to test without having to rely exclusively on event-driven simulations to prove the hypothesis

    Mobile Data Repositories at the Edge.

    Get PDF
    In a future IoT-dominated environment the majority of data will be produced at the edge, which may be moved to the network core. We argue that this reverses today’s “core-to-edge” data flow to an “edge-to-core” model and puts severe stress on edge access/cellular links. In this paper, we propose a data-centric communication approach which treats storage and wire the same as far as their ability to supply the requested data is concerned. Given that storage is cheaper to provide and scales better than wires, we argue for enhancing network connectivity with local storage services (e.g., in WiFi Access Points, or similar) at the edge of the network. Such local storage services can be used to buffer IoT and user-generated data at the edge, prior to data-cloud synchronization

    Mathematical analysis of scheduling policies in peer-to-peer video streaming networks

    Get PDF
    Las redes de pares son comunidades virtuales autogestionadas, desarrolladas en la capa de aplicación sobre la infraestructura de Internet, donde los usuarios (denominados pares) comparten recursos (ancho de banda, memoria, procesamiento) para alcanzar un fin común. La distribución de video representa la aplicación más desafiante, dadas las limitaciones de ancho de banda. Existen básicamente tres servicios de video. El más simple es la descarga, donde un conjunto de servidores posee el contenido original, y los usuarios deben descargar completamente este contenido previo a su reproducción. Un segundo servicio se denomina video bajo demanda, donde los pares se unen a una red virtual siempre que inicien una solicitud de un contenido de video, e inician una descarga progresiva en línea. El último servicio es video en vivo, donde el contenido de video es generado, distribuido y visualizado simultáneamente. En esta tesis se estudian aspectos de diseño para la distribución de video en vivo y bajo demanda. Se presenta un análisis matemático de estabilidad y capacidad de arquitecturas de distribución bajo demanda híbridas, asistidas por pares. Los pares inician descargas concurrentes de múltiples contenidos, y se desconectan cuando lo desean. Se predice la evolución esperada del sistema asumiendo proceso Poisson de arribos y egresos exponenciales, mediante un modelo determinístico de fluidos. Un sub-modelo de descargas secuenciales (no simultáneas) es globalmente y estructuralmente estable, independientemente de los parámetros de la red. Mediante la Ley de Little se determina el tiempo medio de residencia de usuarios en un sistema bajo demanda secuencial estacionario. Se demuestra teóricamente que la filosofía híbrida de cooperación entre pares siempre desempeña mejor que la tecnología pura basada en cliente-servidor

    Performance Optimization and Dynamics Control for Large-scale Data Transfer in Wide-area Networks

    Get PDF
    Transport control plays an important role in the performance of large-scale scientific and media streaming applications involving transfer of large data sets, media streaming, online computational steering, interactive visualization, and remote instrument control. In general, these applications have two distinctive classes of transport requirements: large-scale scientific applications require high bandwidths to move bulk data across wide-area networks, while media streaming applications require stable bandwidths to ensure smooth media playback. Unfortunately, the widely deployed Transmission Control Protocol is inadequate for such tasks due to its performance limitations. The purpose of this dissertation is to conduct rigorous analytical study of the design and performance of transport solutions, and develop an integrated transport solution in a systematical way to overcome the limitations of current transport methods. One of the primary challenges is to explore and compose a set of feasible route options with multiple constraints. Another challenge essentially arises from the randomness inherent in wide-area networks, particularly the Internet. This randomness must be explicitly accounted for to achieve both goodput maximization and stabilization over the constructed routes by suitably adjusting the source rate in response to both network and host dynamics.The superior and robust performance of the proposed transport solution is extensively evaluated in a simulated environment and further verified through real-life implementations and deployments over both Internet and dedicated connections under disparate network conditions in comparison with existing transport methods

    Hierarchical network topographical routing

    Get PDF
    Within the last 10 years the content consumption model that underlies many of the assumptions about traffic aggregation within the Internet has changed; the previous short burst transfer followed by longer periods of inactivity that allowed for statistical aggregation of traffic has been increasingly replaced by continuous data transfer models. Approaching this issue from a clean slate perspective; this work looks at the design of a network routing structure and supporting protocols for assisting in the delivery of large scale content services. Rather than approaching a content support model through existing IP models the work takes a fresh look at Internet routing through a hierarchical model in order to highlight the benefits that can be gained with a new structural Internet or through similar modifications to the existing IP model. The work is divided into three major sections: investigating the existing UK based Internet structure as compared to the traditional Autonomous System (AS) Internet structural model; a localised hierarchical network topographical routing model; and intelligent distributed localised service models. The work begins by looking at the United Kingdom (UK) Internet structure as an example of a current generation technical and economic model with shared access to the last mile connectivity and a large scale wholesale network between Internet Service Providers (ISPs) and the end user. This model combined with the Internet Protocol (IP) address allocation and transparency of the wholesale network results in an enforced inefficiency within the overall network restricting the ability of ISPs to collaborate. From this model a core / edge separation hierarchical virtual tree based routing protocol based on the physical network topography (layers 2 and 3) is developed to remove this enforced inefficiency by allowing direct management and control at the lowest levels of the network. This model acts as the base layer for further distributed intelligent services such as management and content delivery to enable both ISPs and third parties to actively collaborate and provide content from the most efficient source

    Profiling Large-scale Live Video Streaming and Distributed Applications

    Get PDF
    PhDToday, distributed applications run at data centre and Internet scales, from intensive data analysis, such as MapReduce; to the dynamic demands of a worldwide audience, such as YouTube. The network is essential to these applications at both scales. To provide adequate support, we must understand the full requirements of the applications, which are revealed by the workloads. In this thesis, we study distributed system applications at different scales to enrich this understanding. Large-scale Internet applications have been studied for years, such as social networking service (SNS), video on demand (VoD), and content delivery networks (CDN). An emerging type of video broadcasting on the Internet featuring crowdsourced live video streaming has garnered attention allowing platforms such as Twitch to attract over 1 million concurrent users globally. To better understand Twitch, we collected real-time popularity data combined with metadata about the contents and found the broadcasters rather than the content drives its popularity. Unlike YouTube and Netflix where content can be cached, video streaming on Twitch is generated instantly and needs to be delivered to users immediately to enable real-time interaction. Thus, we performed a large-scale measurement of Twitchs content location revealing the global footprint of its infrastructure as well as discovering the dynamic stream hosting and client redirection strategies that helped Twitch serve millions of users at scale. We next consider applications that run inside the data centre. Distributed computing applications heavily rely on the network due to data transmission needs and the scheduling of resources and tasks. One successful application, called Hadoop, has been widely deployed for Big Data processing. However, little work has been devoted to understanding its network. We found the Hadoop behaviour is limited by hardware resources and processing jobs presented. Thus, after characterising the Hadoop traffic on our testbed with a set of benchmark jobs, we built a simulator to reproduce Hadoops job traffic With the simulator, users can investigate the connections between Hadoop traffic and network performance without additional hardware cost. Different network components can be added to investigate the performance, such as network topologies, queue policies, and transport layer protocols. In this thesis, we extended the knowledge of networking by investigated two widelyused applications in the data centre and at Internet scale. We (i)studied the most popular live video streaming platform Twitch as a new type of Internet-scale distributed application revealing that broadcaster factors drive the popularity of such platform, and we (ii)discovered the footprint of Twitch streaming infrastructure and the dynamic stream hosting and client redirection strategies to provide an in-depth example of video streaming delivery occurring at the Internet scale, also we (iii)investigated the traffic generated by a distributed application by characterising the traffic of Hadoop under various parameters, (iv)with such knowledge, we built a simulation tool so users can efficiently investigate the performance of different network components under distributed applicationQueen Mary University of Londo

    Flexible Application-Layer Multicast in Heterogeneous Networks

    Get PDF
    This work develops a set of peer-to-peer-based protocols and extensions in order to provide Internet-wide group communication. The focus is put to the question how different access technologies can be integrated in order to face the growing traffic load problem. Thereby, protocols are developed that allow autonomous adaptation to the current network situation on the one hand and the integration of WiFi domains where applicable on the other hand

    Incentive-driven QoS in peer-to-peer overlays

    Get PDF
    A well known problem in peer-to-peer overlays is that no single entity has control over the software, hardware and configuration of peers. Thus, each peer can selfishly adapt its behaviour to maximise its benefit from the overlay. This thesis is concerned with the modelling and design of incentive mechanisms for QoS-overlays: resource allocation protocols that provide strategic peers with participation incentives, while at the same time optimising the performance of the peer-to-peer distribution overlay. The contributions of this thesis are as follows. First, we present PledgeRoute, a novel contribution accounting system that can be used, along with a set of reciprocity policies, as an incentive mechanism to encourage peers to contribute resources even when users are not actively consuming overlay services. This mechanism uses a decentralised credit network, is resilient to sybil attacks, and allows peers to achieve time and space deferred contribution reciprocity. Then, we present a novel, QoS-aware resource allocation model based on Vickrey auctions that uses PledgeRoute as a substrate. It acts as an incentive mechanism by providing efficient overlay construction, while at the same time allocating increasing service quality to those peers that contribute more to the network. The model is then applied to lagsensitive chunk swarming, and some of its properties are explored for different peer delay distributions. When considering QoS overlays deployed over the best-effort Internet, the quality received by a client cannot be adjudicated completely to either its serving peer or the intervening network between them. By drawing parallels between this situation and well-known hidden action situations in microeconomics, we propose a novel scheme to ensure adherence to advertised QoS levels. We then apply it to delay-sensitive chunk distribution overlays and present the optimal contract payments required, along with a method for QoS contract enforcement through reciprocative strategies. We also present a probabilistic model for application-layer delay as a function of the prevailing network conditions. Finally, we address the incentives of managed overlays, and the prediction of their behaviour. We propose two novel models of multihoming managed overlay incentives in which overlays can freely allocate their traffic flows between different ISPs. One is obtained by optimising an overlay utility function with desired properties, while the other is designed for data-driven least-squares fitting of the cross elasticity of demand. This last model is then used to solve for ISP profit maximisation

    The Centripetal Network: How the Internet Holds Itself Together, and the Forces Tearing It Apart

    Get PDF
    Two forces are in tension as the Internet evolves. One pushes toward interconnected common platforms; the other pulls toward fragmentation and proprietary alternatives. Their interplay drives many of the contentious issues in cyberlaw, intellectual property, and telecommunications policy, including the fight over network neutrality for broadband providers, debates over global Internet governance, and battles over copyright online. These are more than just conflicts between incumbents and innovators, or between openness and deregulation. Their roots lie in the fundamental dynamics of interconnected networks. Fortunately, there is an interdisciplinary literature on network properties, albeit one virtually unknown to legal scholars. The emerging field of network formation theory explains the pressures threatening to pull the Internet apart, and suggests responses. The Internet as we know it is surprisingly fragile. To continue the extraordinary outpouring of creativity and innovation that the Internet fosters, policy-makers must protect its composite structure against both fragmentation and excessive concentration of power. This paper, the first to apply network formation models to Internet law, shows how the Internet pulls itself together as a coherent whole. This very process, however, creates and magnifies imbalances that encourage balkanization. By understanding how networks behave, governments and other legal decision-makers can avoid unintended consequences and target their actions appropriately. A network-theoretic perspective holds great promise to inform the law and policy of the information economy

    On the scalability of LISP and advanced overlaid services

    Get PDF
    In just four decades the Internet has gone from a lab experiment to a worldwide, business critical infrastructure that caters to the communication needs of almost a half of the Earth's population. With these figures on its side, arguing against the Internet's scalability would seem rather unwise. However, the Internet's organic growth is far from finished and, as billions of new devices are expected to be joined in the not so distant future, scalability, or lack thereof, is commonly believed to be the Internet's biggest problem. While consensus on the exact form of the solution is yet to be found, the need for a semantic decoupling of a node's location and identity, often called a location/identity separation, is generally accepted as a promising way forward. Typically, this requires the introduction of new network elements that provide the binding of the two names-paces and caches that avoid hampering router packet forwarding speeds. But due to this increased complexity the solution's scalability is itself questioned. This dissertation evaluates the suitability of using the Locator/ID Separation Protocol (LISP), one of the most successful proposals to follow the location/identity separation guideline, as a solution to the Internet's scalability problem. However, because the deployment of any new architecture depends not only on solving the incumbent's technical problems but also on the added value that it brings, our approach follows two lines. In the first part of the thesis, we develop the analytical tools to evaluate LISP's control plane scalability while in the second we show that the required control/data plane separation provides important benefits that could drive LISP's adoption. As a first step to evaluating LISP's scalability, we propose a methodology for an analytical analysis of cache performance that relies on the working-set theory to estimate traffic locality of reference. One of our main contribution is that we identify the conditions network traffic must comply with for the theory to be applicable and then use the result to develop a model that predicts average cache miss rates. Furthermore, we study the model's suitability for long term cache provisioning and assess the cache's vulnerability in front of malicious users through an extension that accounts for cache polluting traffic. As a last step, we investigate the main sources of locality and their impact on the asymptotic scalability of the LISP cache. An important finding here is that destination popularity distribution can accurately describe cache performance, independent of the much harder to model short term correlations. Under a small set of assumptions, this result finally enables us to characterize asymptotic scalability with respect to the amount of prefixes (Internet growth) and users (growth of the LISP site). We validate the models and discuss the accuracy of our assumptions using several one-day-long packet traces collected at the egress points of a campus and an academic network. To show the added benefits that could drive LISP's adoption, in the second part of the thesis we investigate the possibilities of performing inter-domain multicast and improving intra-domain routing. Although the idea of using overlaid services to improve underlay performance is not new, this dissertation argues that LISP offers the right tools to reliably and easily implement such services due to its reliance on network instead of application layer support. In particular, we present and extensively evaluate Lcast, a network-layer single-source multicast framework designed to merge the robustness and efficiency of IP multicast with the configurability and low deployment cost of application-layer overlays. Additionally, we describe and evaluate LISP-MPS, an architecture capable of exploiting LISP to minimize intra-domain routing tables and ensure, among other, support for multi protocol switching and virtual networks.En menos de cuatro décadas Internet ha evolucionado desde un experimento de laboratorio hasta una infraestructura de alcance mundial, de importancia crítica para negocios y que atiende a las necesidades de casi un tercio de los habitantes del planeta. Con estos números, es difícil tratar de negar la necesidad de escalabilidad de Internet. Sin embargo, el crecimiento orgánico de Internet está aún lejos de finalizar ya que se espera que mil millones de dispositivos nuevos se conecten en el futuro cercano. Así pues, la falta de escalabilidad es el mayor problema al que se enfrenta Internet hoy en día. Aunque la solución definitiva al problema está aún por definir, la necesidad de desacoplar semánticamente la localización e identidad de un nodo, a menudo llamada locator/identifier separation, es generalmente aceptada como un camino prometedor a seguir. Sin embargo, esto requiere la introducción de nuevos dispositivos en la red que unan los dos espacios de nombres disjuntos resultantes y de cachés que almacenen los enlaces temporales entre ellos con el fin de aumentar la velocidad de transmisión de los enrutadores. A raíz de esta complejidad añadida, la escalabilidad de la solución en si misma es también cuestionada. Este trabajo evalúa la idoneidad de utilizar Locator/ID Separation Protocol (LISP), una de las propuestas más exitosas que siguen la pauta locator/identity separation, como una solución para la escalabilidad de la Internet. Con tal fin, desarrollamos las herramientas analíticas para evaluar la escalabilidad del plano de control de LISP pero también para mostrar que la separación de los planos de control y datos proporciona un importante valor añadido que podría impulsar la adopción de LISP. Como primer paso para evaluar la escalabilidad de LISP, proponemos una metodología para un estudio analítico del rendimiento de la caché que se basa en la teoría del working-set para estimar la localidad de referencias. Identificamos las condiciones que el tráfico de red debe cumplir para que la teoría sea aplicable y luego desarrollamos un modelo que predice las tasas medias de fallos de caché con respecto a parámetros de tráfico fácilmente medibles. Por otra parte, para demostrar su versatilidad y para evaluar la vulnerabilidad de la caché frente a usuarios malintencionados, extendemos el modelo para considerar el rendimiento frente a tráfico generado por usuarios maliciosos. Como último paso, investigamos como usar la popularidad de los destinos para estimar el rendimiento de la caché, independientemente de las correlaciones a corto plazo. Bajo un pequeño conjunto de hipótesis conseguimos caracterizar la escalabilidad con respecto a la cantidad de prefijos (el crecimiento de Internet) y los usuarios (crecimiento del sitio LISP). Validamos los modelos y discutimos la exactitud de nuestras suposiciones utilizando varias trazas de paquetes reales. Para mostrar los beneficios adicionales que podrían impulsar la adopción de LISP, también investigamos las posibilidades de realizar multidifusión inter-dominio y la mejora del enrutamiento dentro del dominio. Aunque la idea de utilizar servicios superpuestos para mejorar el rendimiento de la capa subyacente no es nueva, esta tesis sostiene que LISP ofrece las herramientas adecuadas para poner en práctica de forma fiable y fácilmente este tipo de servicios debido a que LISP actúa en la capa de red y no en la capa de aplicación. En particular, presentamos y evaluamos extensamente Lcast, un marco de multidifusión con una sola fuente diseñado para combinar la robustez y eficiencia de la multidifusión IP con la capacidad de configuración y bajo coste de implementación de una capa superpuesta a nivel de aplicación. Además, describimos y evaluamos LISP-MPS, una arquitectura capaz de explotar LISP para minimizar las tablas de enrutamiento intra-dominio y garantizar, entre otras, soporte para conmutación multi-protocolo y redes virtuales
    corecore