
    Inferring persistent interdomain congestion

    There is significant interest in the technical and policy communities regarding the extent, scope, and consumer harm of persistent interdomain congestion. We provide empirical grounding for discussions of interdomain congestion by developing a system and method to measure congestion on thousands of interdomain links without direct access to them. We implement a system based on the Time Series Latency Probes (TSLP) technique that identifies links with evidence of recurring congestion suggestive of an under-provisioned link. We deploy our system at 86 vantage points worldwide and show that congestion inferred using our lightweight TSLP method correlates with other metrics of interconnection performance impairment. We use our method to study interdomain links of eight large U.S. broadband access providers from March 2016 to December 2017, and validate our inferences against ground-truth traffic statistics from two of the providers. For the period of time over which we gathered measurements, we did not find evidence of widespread endemic congestion on interdomain links between access ISPs and directly connected transit and content providers, although some such links exhibited recurring congestion patterns. We describe limitations, open challenges, and a path toward the use of this method for large-scale third-party monitoring of the Internet interconnection ecosystem.
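    The core of the TSLP inference is easy to sketch: probe the RTT to the near and far router interfaces of an interdomain link from a vantage point inside the access network, and flag links whose far-side latency shows recurring elevation that the near side does not. A minimal illustration in Python follows; the thresholds and the 50% recurrence cutoff are illustrative assumptions, not the parameters used in the paper.

```python
from statistics import median

def recurring_far_side_elevation(near_rtts, far_rtts, samples_per_day=24,
                                 elevation_ms=5.0, min_elevated_samples=4):
    """Flag a link whose far-side RTT shows a recurring daily elevation.

    near_rtts/far_rtts: equal-length lists of RTT samples (ms) toward the
    near and far router of one interdomain link, covering whole days.
    Elevation on the far side only is the congestion signal; elevation
    on both sides would implicate the path before the link.
    """
    near_base, far_base = median(near_rtts), median(far_rtts)
    days = len(far_rtts) // samples_per_day
    congested_days = 0
    for d in range(days):
        day = slice(d * samples_per_day, (d + 1) * samples_per_day)
        elevated = sum(
            1 for n, f in zip(near_rtts[day], far_rtts[day])
            if f - far_base > elevation_ms and n - near_base <= elevation_ms
        )
        if elevated >= min_elevated_samples:
            congested_days += 1
    # "Recurring": the elevation shows up on at least half of the days.
    return days > 0 and congested_days / days >= 0.5
```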

    BGP-Multipath Routing in the Internet

    BGP-Multipath, or BGP-M, is a routing technique for balancing traffic load in the Internet. It enables a Border Gateway Protocol (BGP) border router to install multiple ‘equally-good’ paths to a destination prefix. While other multipath routing techniques are deployed at internal routers, BGP-M is deployed at border routers, where traffic is shared across multiple border links between Autonomous Systems (ASes). Although there is a considerable body of research on multipath routing, there has so far been no dedicated measurement or study of BGP-M in the literature. This thesis presents the first systematic study of BGP-M. I proposed a novel approach to inferring the deployment of BGP-M by querying Looking Glass (LG) servers, conducted a detailed investigation of BGP-M deployment in the Internet, and analysed BGP-M’s routing properties based on traceroute measurements using RIPE Atlas probes. My research has revealed that BGP-M is already in use in the Internet. In particular, Hurricane Electric (AS6939), a Tier-1 network operator, has deployed BGP-M at border routers across its global network to hundreds of its neighbour ASes in both the IPv4 and IPv6 Internet. My research provides state-of-the-art knowledge of, and insights into, the deployment, configuration and operation of BGP-M. The data, methods and analysis introduced in this thesis can be immensely valuable to researchers, network operators and regulators interested in improving the performance and security of Internet routing. This work has raised awareness of BGP-M and may promote wider deployment in the future, because BGP-M not only provides all the benefits of multipath routing but also has distinct advantages in terms of flexibility, compatibility and transparency.
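    The LG-based inference rests on a simple observation: when BGP-M is enabled, an IOS-style `show ip bgp <prefix>` query returns several installed paths marked `multipath` for the same prefix. Below is a hedged sketch of the detection step; output formats vary by vendor, and the sample output is fabricated for illustration.

```python
import re

def count_multipath_routes(lg_output: str) -> int:
    """Count installed paths that a Cisco-style Looking Glass marks as
    members of a BGP multipath set for one destination prefix."""
    # Each installed path carries a status line such as:
    #   Origin IGP, localpref 100, valid, external, multipath
    return len(re.findall(r"\bmultipath\b", lg_output))

sample = """\
BGP routing table entry for 203.0.113.0/24
Paths: (2 available, best #1)
  64500 64496
    192.0.2.1 from 192.0.2.1
      Origin IGP, localpref 100, valid, external, multipath, best
  64501 64496
    198.51.100.1 from 198.51.100.1
      Origin IGP, localpref 100, valid, external, multipath
"""

# Two or more multipath-marked routes imply BGP-M toward this prefix.
print(count_multipath_routes(sample) >= 2)  # True
```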

    Traffic Re-engineering: Extending Resource Pooling Through the Application of Re-feedback

    Parallelism pervades the Internet, yet efficiently pooling this increasing path diversity has remained elusive. With no holistic solution for resource pooling, each layer of the Internet architecture attempts to balance traffic according to its own needs, potentially at the expense of others. From the edges, traffic is implicitly pooled over multiple paths by retrieving content from different sources. Within the network, traffic is explicitly balanced across multiple links through the use of traffic engineering. This work explores how the current architecture can be realigned to facilitate resource pooling at both the network and transport layers, where tension between stakeholders is strongest. The central theme of this thesis is that traffic engineering can be performed more efficiently, flexibly and robustly through the use of re-feedback. A cross-layer architecture is proposed for sharing the responsibility for resource pooling across both hosts and network. Building on this framework, two novel forms of traffic management are evaluated. Efficient pooling of traffic across paths is achieved through the development of an in-network congestion balancer, which can function in the absence of multipath transport. Network and transport mechanisms are then designed and implemented to facilitate path fail-over, greatly improving resilience without requiring receiver-side cooperation. These contributions are framed by a longitudinal measurement study which provides evidence for many of the design choices taken. A methodology for scalably recovering flow metrics from passive traces is developed and systematically applied to over five years of interdomain traffic data. The resulting findings challenge traditional assumptions about the preponderance of congestion control in resource sharing: over half of all traffic is constrained by limits other than network capacity. All of the above represent concerted attempts to rethink and reassert traffic engineering in an Internet where competing solutions for resource pooling proliferate. By delegating responsibilities currently overloading the routing architecture towards hosts, and by re-engineering traffic management around the core strengths of the network, the proposed architectural changes allow the tussle surrounding resource pooling to be drawn out without compromising the scalability and evolvability of the Internet.
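    The longitudinal study rests on recovering per-flow metrics from passive packet traces at scale. A minimal sketch of that kind of flow aggregation is below, assuming packets have already been decoded into tuples; the thesis's actual methodology is considerably richer.

```python
from collections import defaultdict

def flow_metrics(packets):
    """Aggregate a decoded packet stream into per-flow metrics.

    `packets` yields (timestamp, src, dst, sport, dport, proto, length)
    tuples, e.g. pre-extracted from a pcap. Flows are keyed by 5-tuple.
    """
    flows = defaultdict(lambda: {"bytes": 0, "pkts": 0,
                                 "first": None, "last": None})
    for ts, src, dst, sport, dport, proto, length in packets:
        f = flows[(src, dst, sport, dport, proto)]
        f["bytes"] += length
        f["pkts"] += 1
        if f["first"] is None:
            f["first"] = ts
        f["last"] = ts
    for f in flows.values():
        duration = f["last"] - f["first"]
        # Mean throughput in bit/s; single-packet flows have no duration.
        f["rate_bps"] = 8 * f["bytes"] / duration if duration > 0 else None
    return flows
```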

    Tackling mobile traffic critical path analysis with passive and active measurements

    Critical Path Analysis (CPA) studies the delivery of webpages to identify page resources, their interrelations, and their impact on page loading latency. Although CPA is a generic methodology, its mechanisms have so far been applied only to browsers and web traffic, and they do not directly apply to generic mobile apps. Moreover, web browsing represents only a small fraction of overall mobile traffic. In this paper, we take a first step towards filling this gap by exploring how CPA can be performed for generic mobile applications. We propose Mobile Critical Path Analysis (MCPA), a methodology based on passive and active network measurements that is applicable to a broad set of apps and exposes a fine-grained view of their traffic dynamics. We validate MCPA on popular apps across different categories and usage scenarios. We show that MCPA can identify user interactions with mobile apps based on traffic monitoring alone, along with the network activities that constitute bottlenecks. Overall, we observe that apps spend on average 60% of time and 84% of bytes on critical traffic, corresponding to +22% time and +13% bytes compared to what is observed for web browsing.
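    At its core, critical path extraction is a longest-path computation over the dependency graph of network activities. The toy sketch below illustrates that core on hypothetical activities and dependencies; MCPA itself builds the graph from passive and active measurements rather than taking it as input.

```python
def critical_path(activities, deps):
    """Return (total duration, chain) for the most expensive dependency
    chain, given `activities` as {name: duration_ms} and `deps` as
    {name: [prerequisite names]}. A toy rendering of the CPA core."""
    memo = {}

    def longest(a):
        if a not in memo:
            best = max(deps.get(a, []), key=lambda p: longest(p)[0],
                       default=None)
            if best is None:
                memo[a] = (activities[a], [a])
            else:
                t, chain = longest(best)
                memo[a] = (t + activities[a], chain + [a])
        return memo[a]

    return max((longest(a) for a in activities), key=lambda x: x[0])

# Hypothetical app launch: DNS -> TLS -> API call, with an image fetch
# also depending on the TLS handshake.
acts = {"dns": 40, "tls": 120, "api": 300, "img": 200}
deps = {"tls": ["dns"], "api": ["tls"], "img": ["tls"]}
print(critical_path(acts, deps))  # (460, ['dns', 'tls', 'api'])
```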

    Methods for revealing and reshaping the African Internet Ecosystem as a case study for developing regions: from isolated networks to a connected continent

    While connecting end-users worldwide, the Internet increasingly promotes local development by making challenges much simpler to overcome, regardless of the field in which it is used: governance, economy, education, health, etc. However, the service region of the African Network Information Centre (AfriNIC), the Regional Internet Registry (RIR) of Africa, is characterized by the lowest Internet penetration: 28.6% as of March 2017, compared to a worldwide average of 49.7%, according to International Telecommunication Union (ITU) estimates [139]. Moreover, end-users experience poor Quality of Service (QoS) at high costs. It is thus of interest to enlarge the Internet footprint in such under-connected regions and determine where the situation can be improved. Along these lines, this doctoral thesis thoroughly inspects, using both active and passive data analysis, the critical aspects of the African Internet ecosystem and outlines the milestones of a methodology that could be adopted for similar purposes in other developing regions.

    The thesis first presents our efforts to help build measurement infrastructures to alleviate the shortage of a diversified range of Vantage Points (VPs) in the region, as we cannot improve what we cannot measure. It then presents a timely and longitudinal inspection of African interdomain routing, using the enhanced RIPE Atlas measurement infrastructure to fill the gap in knowledge of both the IPv4 and IPv6 topologies interconnecting local Internet Service Providers (ISPs). It notably proposes reproducible data analysis techniques suitable for the treatment of any set of similar measurements to infer the behavior of ISPs in the region. The results show a large variety of transit habits, which depend on socio-economic factors such as the language, the currency area, or the geographic location of the country in which the ISP operates. They indicate the prevailing dominance of ISPs based outside Africa for the provision of intracontinental paths, but also shed light on the efforts of stakeholders towards traffic localization.

    Next, the thesis investigates the causes and impacts of congestion in the African IXP substrate, since the prevalence of this phenomenon in local Internet markets may hinder their growth. Towards this end, Ark monitors were deployed at six strategically selected local Internet eXchange Points (IXPs) and used to collect Time Series Latency Probes (TSLP) measurements over a full year. The analysis of these datasets reveals no evidence of widespread congestion: only 2.2% of the monitored links experienced noticeable indications of congestion, a finding that encourages peering. The causes of these events were identified through interviews with IXP operators, showing how essential collaboration with stakeholders is to understanding the causes of performance degradation.

    As part of the Internet Society (ISOC) strategy to allow the Internet community to profile the IXPs of a particular region and monitor their evolution, a route-collector data analyzer was then developed, deployed, and tested in the AfriNIC region. This open-source web platform, titled the “African” Route-collectors Data Analyzer (ARDA), provides metrics that picture, in real time, the status of interconnection at different levels, using public routing information available at local route collectors with a peering viewpoint of the Internet. The results highlight that only a small proportion (17%) of the Autonomous System Numbers (ASNs) assigned by AfriNIC are peering in the region, a fraction that remained static from April to September 2017 despite the significant growth of IXPs in some countries. They show how ARDA can help detect the impact of a policy on the IXP substrate and help ISPs worldwide identify new interconnection opportunities in Africa.

    Since broadening the underlying network is not useful without appropriately provisioned services to exploit it, the thesis then delves into the availability and utilization of the web infrastructure serving the continent. Towards this end, a comprehensive measurement methodology is applied to collect data from various sources. A focus on Google reveals that its content infrastructure in Africa is indeed expanding; nevertheless, much of its web content is still served from the United States (US) and Europe, even though Google is the most popular content source in many African countries. The same analysis, repeated across top global and regional websites, shows that even top African websites prefer to host their content abroad. Following that, the primary bottlenecks faced by Content Providers (CPs) in the region, such as the lack of peering between the networks hosting our probes and poorly configured DNS resolvers, are explored to outline proposals for further ISP and CP deployments.

    Considering the above, an option for enriching connectivity and incentivizing CPs to establish a presence in the region is to interconnect ISPs present at isolated IXPs by creating a distributed IXP layout spanning the continent. In this respect, the thesis finally provides a four-step interconnection scheme, which parameterizes socio-economic, geographical, and political factors using public datasets. It demonstrates that this constrained solution doubles the percentage of continental intra-African paths, reduces their length, and drastically decreases the median of their Round Trip Times (RTTs), as well as RTTs to ASes hosting the top 10 global and top 10 regional Alexa websites. We hope that quantitatively demonstrating the benefits of this framework will incentivize ISPs to intensify peering and CPs to increase their presence, enabling fast, affordable, and available access at the Internet frontier.
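    One of the headline ARDA metrics, the share of locally assigned ASNs visible at the region's route collectors, reduces to a simple set computation over collected AS paths. Below is a toy sketch under assumed inputs; a real pipeline must parse route-collector RIB dumps and handle AS-path artifacts.

```python
def peering_fraction(registered_asns, collector_as_paths):
    """Fraction of the ASNs registered in a region that appear in AS
    paths seen at the region's route collectors. `collector_as_paths`
    is an iterable of AS paths, each a list of ASNs."""
    seen = {asn for path in collector_as_paths for asn in path}
    active = seen & set(registered_asns)
    return len(active) / len(registered_asns)

# Toy data: 8 registered ASNs, two paths seen at a local IXP collector.
registered = [37001, 37002, 37003, 37004, 37005, 37006, 37007, 37008]
paths = [[37001, 37003], [37001, 37005]]
print(f"{peering_fraction(registered, paths):.0%}")  # 38%
```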

    Anatomy of Multipath BGP Deployment in a Large ISP Network

    Multipath routing is useful for networks to achieve load sharing among multiple routing paths. Multipath BGP (M-BGP) is a technique to realize inter-domain multipath routing by enabling a BGP router to install multiple equally-good routes to a destination prefix. Most previous work did not distinguish between intra-domain and inter-domain multipath routing. In this paper, we present a measurement study of the deployment of M-BGP in a large Internet service provider (ISP) network. Our method combines control-plane BGP measurements using Looking Glasses (LG) with data-plane traceroute measurements using RIPE Atlas. We focus on Hurricane Electric (AS6939) because it is a global ISP that connects with hundreds of major exchange points and exchanges IP traffic with thousands of different networks; more importantly, we find that this ISP has by far the largest number of M-BGP deployments among autonomous systems with LG servers. Specifically, Hurricane Electric has deployed M-BGP with 512 of its peering ASes at 58 PoPs around the world, including many top ASes and content providers. We also observe that most of its M-BGP deployments involve IXP interconnections. Our work provides insights into the latest deployment of M-BGP in a major ISP network and highlights the characteristics and effectiveness of M-BGP as a means to realize load sharing.
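    On the data plane, M-BGP shows up in traceroutes as flows toward the same prefix diverging onto different border links one hop past the shared border router. A simplified sketch of that divergence check follows; the hop IPs are documentation addresses, and real measurements must keep upstream hops stable (e.g., Paris traceroute) and resolve router aliases.

```python
from collections import defaultdict

def diverging_hops(traces):
    """Given traceroutes toward one prefix as {flow_id: [hop IPs]},
    return the hop indices at which different flows observe different
    routers -- the data-plane signature of load sharing across links."""
    by_hop = defaultdict(set)
    for hops in traces.values():
        for i, ip in enumerate(hops):
            by_hop[i].add(ip)
    return [i for i, ips in sorted(by_hop.items()) if len(ips) > 1]

# Two flows agree up to the border router, then split across two links.
traces = {
    1: ["192.0.2.1", "198.51.100.1", "203.0.113.9"],
    2: ["192.0.2.1", "198.51.100.5", "203.0.113.9"],
}
print(diverging_hops(traces))  # [1]
```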

    Provider and peer selection in the evolving internet ecosystem

    The Internet consists of thousands of autonomous networks connected together to provide end-to-end reachability. Networks of different sizes, and with different functions and business objectives, interact and co-exist in the evolving "Internet ecosystem". The Internet ecosystem is highly dynamic, experiencing growth (birth of new networks), rewiring (changes in the connectivity of existing networks), and deaths (of existing networks). Its dynamics are determined both by external "environmental" factors (such as the state of the global economy or the popularity of new Internet applications) and by the complex incentives and objectives of each network. These dynamics have major implications for what the future Internet will look like. How does the Internet evolve? Where is the Internet heading, in terms of topological, performance, and economic organization? How do given optimization strategies affect the profitability of different networks? How do these strategies affect the Internet in terms of topology, economics, and performance? In this thesis, we take steps towards answering these questions using a combination of measurement and modeling approaches. We first study the evolution of the Autonomous System (AS) topology over the last decade. In particular, we classify ASes and inter-AS links according to their business function, and study separately their evolution over the last 10 years. Next, we focus on enterprise customers and content providers at the edge of the Internet, and propose algorithms for a stub network to choose its upstream providers so as to maximize its utility (whether monetary cost, reliability, or performance). Third, we develop a model of interdomain network formation, incorporating the effects of economics, geography, and the provider/peer selection strategies of different types of networks. We use this model to examine the "outcome" of these strategies, in terms of the topology, economics, and performance of the resulting internetwork. We also investigate the effect of external factors, such as the nature of the interdomain traffic matrix, customer preferences in provider selection, and pricing/cost structures. Finally, we focus on a recent trend driven by the increasing amount of traffic flowing from content providers (who generate content) to access providers (who serve end users). This trend has led to a tussle between content providers and access providers, with the latter threatening to prioritize certain types of traffic or charge content providers directly -- strategies that are viewed as violations of "network neutrality". In our work, we evaluate various pricing and connection strategies that access providers can use to remain profitable without violating network neutrality.
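    The provider-selection problem for a stub network can be caricatured as picking the subset of upstreams that maximizes a utility trading off cost, performance, and reliability. The sketch below is a toy rendering: the provider names, scoring weights, and utility function are all assumptions for illustration, not the thesis's formulation.

```python
from itertools import combinations

def choose_providers(providers, k, w_cost=1.0, w_perf=1.0, w_rel=1000.0):
    """Pick k upstream providers maximizing a toy utility over monetary
    cost, performance, and reliability.

    `providers`: {name: (monthly_cost, avg_rtt_ms, availability)}.
    """
    def utility(names):
        cost = sum(providers[n][0] for n in names)
        rtt = min(providers[n][1] for n in names)  # best path wins
        # Assuming independent failures, multihoming compounds uptime.
        downtime = 1.0
        for n in names:
            downtime *= 1.0 - providers[n][2]
        return -w_cost * cost - w_perf * rtt + w_rel * (1.0 - downtime)

    return max(combinations(providers, k), key=utility)

providers = {"transitA": (900, 30, 0.999),
             "transitB": (600, 45, 0.995),
             "transitC": (400, 60, 0.990)}
print(choose_providers(providers, 2))  # ('transitB', 'transitC')
```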

    Machine Learning and Big Data Methodologies for Network Traffic Monitoring

    Over the past 20 years, the Internet has seen exponential growth in traffic, users, services and applications. Currently, it is estimated that the Internet is used every day by more than 3.6 billion users, who generate 20 TB of traffic per second. Such a huge amount of data challenges network managers and analysts to understand how the network is performing, how users are accessing resources, how to properly control and manage the infrastructure, and how to detect possible threats. Alongside mathematical, statistical, and set-theoretic methodologies, machine learning and big data approaches have emerged to build systems that automatically extract information from the raw data that network monitoring infrastructures offer. In this thesis I will address different network monitoring solutions, evaluating several methodologies and scenarios. I will show how, following a common workflow, it is possible to exploit mathematical, statistical, set-theoretic, and machine learning methodologies to extract meaningful information from the raw data. Particular attention will be given to machine learning and big data methodologies, such as the DBSCAN clustering algorithm and the Apache Spark big data framework. The results show that, while mathematical, statistical, and set-theoretic tools help characterize a problem, machine learning methodologies are very useful for discovering hidden information in the raw data. Using the DBSCAN clustering algorithm, I will show how YouLighter, an unsupervised methodology, groups the caches serving YouTube traffic into edge nodes and, later, how the notion of Pattern Dissimilarity identifies changes in their usage over time. Applying YouLighter to 10-month-long traces, I will pinpoint sudden changes in YouTube edge-node usage, changes that also impair the end users’ Quality of Experience. I will also apply DBSCAN in the deployment of SeLINA, a self-tuning tool implemented in the Apache Spark big data framework to autonomously extract knowledge from network traffic measurements, and show how it automatically detects the changes in the YouTube CDN previously highlighted by YouLighter. Alongside these machine learning studies, I will show how to use mathematical and set-theoretic methodologies to investigate the browsing habits of Internet users. Using a two-week dataset, I will show that over this period users keep discovering new websites, and that DNS information alone is not sufficient to build a reliable user profiler. Exploiting mathematical and statistical tools, I will then characterize Anycast-enabled CDNs (A-CDNs), showing that A-CDNs are widely used for both stateless and stateful services; that they are quite popular, with more than 50% of web users contacting an A-CDN every day; and that stateful services can benefit from A-CDNs, since their paths are very stable over time, as demonstrated by the presence of only a few anomalies in their Round Trip Times. Finally, I will conclude by showing how I used BGPStream, an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data. Using BGPStream in real-time mode, I will show how I detected a Multiple Origin AS (MOAS) event and how I studied the propagation of the BGP black-holing community, showing its effect on the network.
    Then, using BGPStream in historical mode together with the Apache Spark big data framework over 16 years of data, I will present further results, such as the continuous growth in the number of announced IPv4 prefixes and the increase of MOAS events over time. All these studies aim to show that monitoring is a fundamental task in many different scenarios, and in particular to highlight the importance of machine learning and big data methodologies.
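    Since DBSCAN is the workhorse clustering algorithm here, a minimal scikit-learn example may help fix ideas. The feature vectors below are fabricated for illustration; YouLighter's actual feature space, built from per-cache traffic patterns, is different.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy feature vectors for cache servers: two normalized activity values
# per cache. Caches with similar activity fall into the same edge node.
X = np.array([[0.90, 0.10], [0.85, 0.15], [0.88, 0.12],   # edge node A
              [0.10, 0.90], [0.15, 0.85], [0.12, 0.88],   # edge node B
              [0.50, 0.50]])                               # outlier

# eps bounds the neighborhood radius; min_samples sets cluster density.
labels = DBSCAN(eps=0.1, min_samples=2).fit_predict(X)
print(labels)  # [0 0 0 1 1 1 -1]: two clusters, -1 marks noise
```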

    Improving the Accuracy of the Internet Cartography

    As the global Internet expands to satisfy the demands of an ever-increasing connected population, profound changes are occurring in its interconnection structure. The pervasive growth of IXPs and CDNs, two initially independent but synergistic infrastructure sectors, has contributed to the gradual flattening of the Internet’s inter-domain hierarchy, with primary routing paths shifting from backbone networks to peripheral peering links. At the same time, IPv6 deployment has taken off due to the depletion of unallocated IPv4 addresses. These fundamental changes in Internet dynamics have obvious implications for network engineering and operations, which can benefit from accurate topology maps to understand the properties of this critical infrastructure. This thesis presents a set of new measurement techniques and inference algorithms to construct a new type of semantically rich Internet map, and to improve the state of the art in Internet cartography. The author first develops a methodology to extract large-scale validation data from the BGP Communities attribute, which encodes rich routing meta-data in BGP messages. Based on this better-informed dataset, the author proceeds to analyse popular assumptions about inter-domain routing policies and to devise a more accurate model of inter-AS business relationships. Accordingly, the thesis proposes a new relationship inference algorithm that accurately captures both simple and complex AS relationships across two dimensions: prefix type and geographic location. Validation against three sources of ground-truth data reveals that the proposed algorithm achieves near-perfect accuracy. However, any inference approach is constrained by the inability of the existing topology data sources to provide a complete view of the inter-domain topology. To limit the topology incompleteness problem, the author augments traditional BGP data with routing policy data obtained directly from IXPs to discover massive peering meshes which have thus far been largely invisible.
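    Communities-based validation works because many operators publish `ASN:value` tags that mark where and how a route was learned. A hedged sketch of the tagging step follows; the mappings below are hypothetical, and real ones must be harvested from each operator's published documentation.

```python
# Hypothetical operator dictionary: community -> the relationship over
# which AS 64496 learned the tagged route. Real values differ per
# operator and must be collected from their documentation.
COMMUNITY_TO_RELATIONSHIP = {
    "64496:1000": "customer",
    "64496:2000": "peer",
    "64496:3000": "provider",
}

def tag_relationship(communities):
    """Map the BGP Communities carried by one announcement to the
    business relationship implied by the tagging AS, or None if no
    known tag is present. `communities` is a list like ["64496:2000"]."""
    for community in communities:
        if community in COMMUNITY_TO_RELATIONSHIP:
            return COMMUNITY_TO_RELATIONSHIP[community]
    return None

print(tag_relationship(["64496:2000", "64496:51"]))  # peer
```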