    Mining Network Events using Traceroute Empathy

    In the never-ending quest for tools that let an ISP smooth troubleshooting and improve awareness of network behavior, much effort has been devoted to collecting data through active and passive measurement at both the data plane and the control plane. Exploitation of the collected data has mostly focused on anomaly detection and root-cause analysis. Our objective sits somewhere in the middle. We consider traceroutes collected by a network of probes and aim to introduce a practically applicable methodology to quickly spot measurements related to high-impact events that happened in the network. Such a filtering process eases further in-depth human analysis, for example with visual tools, which are effective only when handling a limited amount of data. We introduce the empathy relation between traceroutes as the cornerstone of our formal characterization of the traceroutes related to a network event. Based on this model, we describe an algorithm that finds traceroutes related to high-impact events in an arbitrary set of measurements. Evidence of the effectiveness of our approach is given by experimental results on real-world data. (8 pages, 7 figures; extended version of "Discovering High-Impact Routing Events using Traceroutes", in Proc. 20th International Symposium on Computers and Communications, ISCC 2015.)
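
    The paper's formal definition of empathy is not reproduced in this listing, but the filtering idea can be hinted at with a minimal, hypothetical Python sketch (the function names and input format below are invented for illustration): traceroutes whose paths changed through a common router around the same time are grouped, and hops whose disappearance is witnessed by many probes are surfaced as candidate high-impact events.

        from collections import defaultdict

        def changed_hops(old_path, new_path):
            """Routers present in the old traceroute path but gone from the new one."""
            return set(old_path) - set(new_path)

        def candidate_events(measurements, min_probes=10):
            """measurements: (probe_id, old_path, new_path) tuples, with paths as
            lists of router IPs observed close together in time. Returns hops whose
            disappearance is shared by many probes, i.e. likely event sites."""
            witnesses = defaultdict(set)
            for probe_id, old_path, new_path in measurements:
                for hop in changed_hops(old_path, new_path):
                    witnesses[hop].add(probe_id)
            return {hop: ids for hop, ids in witnesses.items()
                    if len(ids) >= min_probes}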

    The Impact of IXPs on the AS-level Topology Structure of the Internet

    The AS-level topology of the Internet has been a hot research topic in recent years. However, only a small number of studies give a structural interpretation of this graph. Such an interpretation is crucially important for testing protocols and optimal routing algorithms, designing efficient networks, and detecting failures. Moreover, most research does not highlight the role that IXPs play in the AS-level structure of the Internet, although that role is recognized as fundamental. The initial contribution of this study is an analysis of the most important AS-level topologies publicly available on the web, together with an analysis of the topology obtained by merging them. We extracted structural information from this merged topology, making considerable use of the k-core decomposition technique to delineate particular classes of nodes. Next, we associated node properties with a plausible modus operandi of the ASes on the Internet. The second contribution is a study of the impact that ASes connected to IXPs, and BGP connections crossing IXPs, have on the AS-level topology. To achieve this, we developed a procedure to gather reliable information about IXPs and their participants.
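
    The k-core decomposition the study relies on is a standard graph operation that networkx ships directly; a toy example on a small AS-level graph (the AS names are made up):

        import networkx as nx

        G = nx.Graph()
        G.add_edges_from([
            ("AS1", "AS2"), ("AS1", "AS3"), ("AS2", "AS3"),  # densely meshed core
            ("AS3", "AS4"), ("AS4", "AS5"),                  # stub-like periphery
        ])

        coreness = nx.core_number(G)   # max k such that the node survives in the k-core
        innermost = nx.k_core(G)       # subgraph induced by the highest k-core
        print(coreness)                # {'AS1': 2, 'AS2': 2, 'AS3': 2, 'AS4': 1, 'AS5': 1}
        print(sorted(innermost))       # ['AS1', 'AS2', 'AS3']: candidate core ASes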

    ISP Probing Reduction with Anaximander

    Since the early 2000s, Internet topology discovery has been an active research topic, providing data for various studies such as Internet modeling and network management, and supporting network protocol design. Within this research area, ISP mapping at the router level has attracted little interest despite its utility for intra-domain routing evaluation. Since Rocketfuel (and, to a smaller extent, mrinfo), no new tool or method has emerged for systematically mapping intra-domain topologies. In this paper, we introduce Anaximander, a new efficient approach for probing and discovering a targeted ISP. Given a set of vantage points, we implement and combine several predictive strategies to reduce the number of probes sent without sacrificing ISP coverage. To assess the ability of our method to efficiently retrieve an ISP map, we rely on a large dataset of ISPs of diverse natures and demonstrate how Anaximander can be tuned with a single parameter to control the trade-off between coverage and probing budget.
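
    Anaximander's actual predictive strategies are not detailed in this abstract; the sketch below only illustrates, under assumed inputs, the kind of single-parameter coverage/budget trade-off it describes: probe prefixes in order of expected yield and abandon a prefix once discoveries stall.

        def map_isp(prefixes, probe, patience=0.2):
            """prefixes: address blocks ordered by expected yield; probe(addr)
            returns a router id or None. patience in (0, 1]: larger values accept
            longer runs without discovery (more coverage, more probes)."""
            discovered, probes_sent = set(), 0
            for prefix in prefixes:
                addrs = list(prefix)
                stall_limit = max(1, int(patience * len(addrs)))
                stalled = 0
                for addr in addrs:
                    probes_sent += 1
                    router = probe(addr)
                    if router is not None and router not in discovered:
                        discovered.add(router)
                        stalled = 0        # a discovery resets the stall counter
                    else:
                        stalled += 1
                        if stalled >= stall_limit:
                            break          # give up on this prefix early
            return discovered, probes_sent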

    Rusty Clusters? Dusting an IPv6 Research Foundation

    The long-running IPv6 Hitlist service is an important foundation for IPv6 measurement studies. It helps overcome infeasible complete address-space scans by collecting valuable, unbiased IPv6 address candidates and regularly testing their responsiveness. However, the Internet itself is a quickly changing ecosystem that can affect long-running services, potentially introducing biases and obscurities into ongoing data collection. Frequent analyses, and also updates, are necessary for the service to remain valuable to the community. In this paper, we show that the existing hitlist is heavily impacted by the Great Firewall of China, and we offer a cleaned view of the development of responsive addresses. While the accumulated input shows an increasing bias towards some networks, the cleaned set of responsive addresses is well distributed and shows a steady increase. Although it is best practice to remove aliased prefixes from IPv6 hitlists, we show that this also removes major content delivery networks: more than 98% of all IPv6 addresses announced by Fastly were labeled as aliased, and Cloudflare prefixes hosting more than 10M domains were excluded. Depending on the hitlist's use, e.g., higher-layer protocol scans, including addresses from these providers can be valuable. Lastly, we evaluate new sources of address candidates, including target generation algorithms, to improve the coverage of the current IPv6 Hitlist. We show that a combination of different methodologies is able to identify 5.6M new, responsive addresses. This amounts to an increase of 174%, and combined with the current IPv6 Hitlist, we identify 8.8M responsive addresses.
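
    The aliased-prefix problem the paper discusses can be illustrated with a small, hedged sketch: if randomly chosen addresses inside a prefix all respond, the prefix is likely an alias for a single machine (e.g., a CDN front end). The prober `is_responsive` is a placeholder, not part of the hitlist tooling.

        import random
        import ipaddress

        def looks_aliased(prefix, is_responsive, samples=16):
            """Probe `samples` random addresses in `prefix`; a truly random IPv6
            address should almost never respond unless the prefix answers on
            every address (i.e. is aliased)."""
            net = ipaddress.IPv6Network(prefix)
            base = int(net.network_address)
            picks = {ipaddress.IPv6Address(base + random.randrange(net.num_addresses))
                     for _ in range(samples)}
            return all(is_responsive(addr) for addr in picks)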

    Advanced Network Inference Techniques Based on Network Protocol Stack Information Leaks

    Side channels are channels of implicit information flow that can be used to learn information that is not allowed to flow through explicit channels. This thesis focuses on network side channels, where information flow occurs in the TCP/IP network stack implementations of operating systems. I describe three new types of idle scan: a SYN backlog idle scan, a RST rate-limit idle scan, and a hybrid idle scan. Idle scans are special types of side channel designed to help someone performing a network measurement (typically an attacker or a researcher) infer something about the network that they cannot otherwise see from their vantage point. The thesis that this dissertation tests is this: because modern network stacks have shared resources, there is a wealth of information that can be inferred off-path by both attackers and Internet measurement researchers. With respect to attackers, no matter how carefully the security model is designed, the non-interference property is unlikely to hold, i.e., an attacker can easily find side channels of information flow to learn about the network from the perspective of a remote system. One suggestion is that trust relationships for using resources be made explicit all the way down to the IP layer, with the goal of dividing resources and removing sharedness to prevent advanced network reconnaissance. With respect to Internet measurement researchers, I show that the information flow is rich enough to test connectivity between two arbitrary hosts on the Internet and even to infer in which direction any blocking occurs. To explore this thesis, I present three research efforts.
    First, I modeled a typical TCP/IP network stack. This modeling effort led to the discovery of two new idle scans: the SYN backlog idle scan and the RST rate-limit idle scan. The SYN backlog scan is particularly interesting because it does not require whoever is performing the measurement (i.e., the attacker or researcher) to send any packets to the victim (or target) at all.
    Second, I developed a hybrid idle scan that combines elements of the SYN backlog idle scan with Antirez's original IPID-based idle scan. This scan enables researchers to test whether two arbitrary machines in the world are able to communicate via TCP/IP and, if not, in which direction the communication is being prevented. To test the efficacy of the hybrid idle scan, I tested three different kinds of servers (Tor bridges, Tor directory servers, and normal web servers) both inside and outside China. The results were congruent with published understandings of global Internet censorship, demonstrating that the hybrid idle scan is effective.
    Third, I applied the hybrid idle scan to the difficult problem of characterizing inconsistencies in the Great Firewall of China (GFW), the largest firewall in the world. This effort resolved many open questions about the GFW. The result of my dissertation work is an effective method for measuring Internet censorship around the world, without requiring any kind of distributed measurement platform or access to any of the machines that connectivity is tested to or from.
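
    As background for the scans described above, the classic IPID side channel (Antirez's idle scan, on which the hybrid scan builds) can be sketched in a few lines. The raw-socket primitives are placeholders that would need privileged code; the logic assumes a zombie host with a globally incrementing IPID counter.

        def idle_scan_port(zombie, target, port, get_ipid, send_spoofed_syn):
            """get_ipid(host) samples the zombie's IPID; send_spoofed_syn sends a
            SYN to target:port with the zombie's address as source (both are
            placeholders for raw-socket code)."""
            before = get_ipid(zombie)
            send_spoofed_syn(src=zombie, dst=target, port=port)
            # Open port: target SYN/ACKs the zombie, zombie replies RST (+1 IPID).
            # Closed port: target RSTs the zombie, which stays silent.
            after = get_ipid(zombie)           # our own probe adds +1 either way
            delta = (after - before) % 65536   # IPID is a 16-bit counter
            return "open" if delta >= 2 else "closed|filtered"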

    Leveraging Conventional Internet Routing Protocol Behavior to Defeat DDoS and Adverse Networking Conditions

    The Internet is a cornerstone of modern society. Yet increasingly devastating attacks against the Internet threaten to undermine its success at connecting the unconnected. Of all the adversarial campaigns waged against the Internet and the organizations that rely on it, distributed denial of service (DDoS) tops the list of the most volatile attacks. In recent years, DDoS attacks have been responsible for large swaths of the Internet blacking out, while other attacks have completely overwhelmed key Internet services and websites. Core to the Internet's functionality is the way traffic gets from one destination to another. The set of rules, or protocol, that defines the way traffic travels the Internet is the Border Gateway Protocol (BGP), the de facto routing protocol of the Internet. Advanced adversaries often target the most used portions of the Internet by flooding the routes benign traffic takes with malicious traffic designed to cause widespread traffic loss to targeted end users and regions. This dissertation examines the following thesis statement: rather than seek to redefine the way the Internet works to combat advanced DDoS attacks, we can leverage conventional Internet routing behavior to mitigate modern distributed denial of service attacks. The research breaks down into a single arc with three independent but connected thrusts, which demonstrate that this thesis is possible, practical, and useful. The first thrust demonstrates that the thesis is possible by building and evaluating Nyx, a system that can protect Internet networks from DDoS using BGP, without an Internet redesign and without cooperation from other networks. This work shows that, in simulation, Nyx is effective at protecting Internet networks and end users from the impact of devastating DDoS. The second thrust examines the real-world practicality of Nyx, as well as of other systems that rely on real-world BGP behavior. Through a comprehensive set of real-world Internet routing experiments, it confirms that Nyx works effectively in practice beyond simulation, and it reveals novel insights about the effectiveness of other defensive and offensive Internet security systems. These experiments are followed by a re-evaluation of Nyx under the real-world routing constraints discovered. The third thrust explores the usefulness of Nyx for mitigating DDoS against a crucial industry sector, power generation, by exposing the latent vulnerability of the U.S. power grid to DDoS and showing how a system such as Nyx can protect electric power utilities. This final thrust finds that the currently exposed U.S. power facilities are widely vulnerable to DDoS attacks that could induce blackouts, and that Nyx can be leveraged to reduce the impact of such targeted attacks.
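
    The abstract does not spell out Nyx's mechanism, but route-steering systems of this kind rely on standard BGP loop detection: an AS discards any announcement whose AS path already contains its own ASN, so inserting ("poisoning") an ASN steers that AS off the path. A toy model, not Nyx itself, with made-up ASNs:

        def accepted(paths, asn):
            """BGP loop detection: an AS discards paths already containing its ASN."""
            return [p for p in paths if asn not in p]

        def best_path(paths):
            """Toy decision process: prefer the shortest AS path."""
            return min(paths, key=len, default=None)

        # Routes towards the deflector's prefix, as seen by ASN 64502.
        normal = [64510, 64500]                  # ordinary announcement, origin 64500
        poisoned = [64510, 64500, 64502, 64500]  # deflector inserts 64502 into the path

        print(best_path(accepted([normal], 64502)))    # [64510, 64500]: 64502 routes there
        print(best_path(accepted([poisoned], 64502)))  # None: 64502 drops the route,
                                                       # pushing traffic onto other links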

    Machine Learning and Big Data Methodologies for Network Traffic Monitoring

    Over the past 20 years, the Internet has seen exponential growth in traffic, users, services, and applications. It is currently estimated that the Internet is used every day by more than 3.6 billion users, who generate 20 TB of traffic per second. Such a huge amount of data challenges network managers and analysts to understand how the network is performing, how users are accessing resources, how to properly control and manage the infrastructure, and how to detect possible threats. Alongside mathematical, statistical, and set-theoretic methodologies, machine learning and big data approaches have emerged to build systems that automatically extract information from the raw data that network monitoring infrastructures offer. In this thesis I address different network monitoring solutions, evaluating several methodologies and scenarios. I show how, following a common workflow, it is possible to exploit mathematical, statistical, set-theoretic, and machine learning methodologies to extract meaningful information from the raw data. Particular attention is given to machine learning and big data methodologies such as DBSCAN and the Apache Spark big data framework. The results show that, while mathematical, statistical, and set-theoretic tools help characterize a problem, machine learning methodologies are very useful for discovering hidden information in the raw data. Using the DBSCAN clustering algorithm, I show how YouLighter, an unsupervised methodology, groups caches serving YouTube traffic into edge nodes, and later, using the notion of Pattern Dissimilarity, how to identify changes in their usage over time. Applying YouLighter to 10-month-long traces, I pinpoint sudden changes in YouTube edge-node usage, changes that also impair the end users' Quality of Experience. I also apply DBSCAN in the deployment of SeLINA, a self-tuning tool implemented in the Apache Spark big data framework to autonomously extract knowledge from network traffic measurements. Using SeLINA, I show how to automatically detect the changes in the YouTube CDN previously highlighted by YouLighter. Alongside these machine learning studies, I show how to use mathematical and set-theoretic methodologies to investigate the browsing habits of Internet users. Using a two-week dataset, I show how, over this period, users keep discovering new websites, and that DNS information alone is not sufficient to build a reliable user profile. By exploiting mathematical and statistical tools, I then characterize anycast-enabled CDNs (A-CDNs). I show that A-CDNs are widely used for both stateless and stateful services, that they are quite popular (more than 50% of web users contact an A-CDN every day), and that stateful services can benefit from A-CDNs, since their paths are very stable over time, as demonstrated by the presence of only a few anomalies in their round-trip times. Finally, I conclude by showing how I used BGPStream, an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data. Using BGPStream in real-time mode, I show how I detected a Multiple Origin AS (MOAS) event, and how I studied the propagation of the black-holing community, showing its effect on the network. Then, using BGPStream in historical mode together with the Apache Spark big data framework over 16 years of data, I show results such as the continuous growth of IPv4 prefixes and the growth of MOAS events over time. All these studies aim to show that monitoring is a fundamental task in many scenarios, and in particular to highlight the importance of machine learning and big data methodologies.
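
    As a concrete illustration of the clustering step mentioned above, scikit-learn's DBSCAN can group servers by simple features; the feature matrix below is invented toy data in the spirit of YouLighter's cache grouping, not the thesis's actual dataset.

        import numpy as np
        from sklearn.cluster import DBSCAN

        # Each row describes a cache: two scaled address features and an RTT (ms).
        X = np.array([
            [10.0, 1.0, 12.0], [10.1, 1.1, 12.5], [10.2, 0.9, 11.8],  # edge node A
            [80.0, 7.0, 48.0], [80.2, 7.1, 47.5],                     # edge node B
            [55.0, 3.0, 90.0],                                        # isolated cache
        ])

        labels = DBSCAN(eps=2.0, min_samples=2).fit_predict(X)
        print(labels)   # [0 0 0 1 1 -1]; -1 marks noise (unclustered) points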

    Distributed Internet security and measurement

    The Internet has developed into an important economic, military, academic, and social resource. It is a complex network, composed of tens of thousands of independently operated networks called Autonomous Systems (ASes). A significant strength of the Internet's design, one that enabled its rapid growth in users and bandwidth, is that its underlying protocols (such as IP, TCP, and BGP) are distributed. Users and networks alike can attach to and detach from the Internet at will, without causing major disruptions to global Internet connectivity. This dissertation shows that the Internet's distributed, and often redundant, structure can be exploited to increase the security of its protocols, particularly BGP (the Internet's interdomain routing protocol). It introduces Pretty Good BGP, an anomaly detection protocol coupled with an automated response that can protect individual networks from BGP attacks. It also presents statistical measurements of the Internet's structure and uses them to create a model of Internet growth. This work could be used, for instance, to test upcoming routing protocols on ensembles of large, Internet-like graphs. Finally, this dissertation shows that while the Internet is designed to be agnostic to political influence, it is actually quite centralized at the country level. With the recent rise of country-level Internet policies, such as nation-wide censorship and warrantless wiretaps, this centralized control could have a significant impact on international reachability.
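
    Pretty Good BGP's core idea, distrusting routes whose origin AS is new for a prefix while trusted alternatives exist, can be sketched as follows. This is a simplification under assumed parameters (the real system also bounds its origin history window), not the protocol's reference implementation.

        from collections import defaultdict

        PROBATION = 24 * 3600   # assumed: distrust a new origin for 24 hours

        class OriginMonitor:
            def __init__(self):
                self.first_seen = defaultdict(dict)   # prefix -> {origin_as: timestamp}

            def is_suspicious(self, prefix, origin_as, now):
                seen = self.first_seen[prefix]
                if origin_as not in seen:
                    seen[origin_as] = now   # brand-new origin for this prefix
                # While the origin is fresh, routers de-prefer the route and keep
                # using trusted alternatives, blunting prefix hijacks.
                return now - seen[origin_as] < PROBATION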

    Methods for revealing and reshaping the African Internet Ecosystem as a case study for developing regions: from isolated networks to a connected continent

    While connecting end-users worldwide, the Internet increasingly promotes local development by making challenges much simpler to overcome, regardless of the field in which it is used: governance, economy, education, health, etc. However, the African Network Information Centre (AfriNIC), the Regional Internet Registry (RIR) of Africa, is characterized by the lowest Internet penetration: 28.6% as of March 2017, compared to a worldwide average of 49.7%, according to International Telecommunication Union (ITU) estimates [139]. Moreover, end-users experience poor Quality of Service (QoS) provided at high cost. It is thus of interest to enlarge the Internet footprint in such under-connected regions and determine where the situation can be improved. Along these lines, this doctoral thesis thoroughly inspects, using both active and passive data analysis, the critical aspects of the African Internet ecosystem and outlines the milestones of a methodology that could be adopted for similar purposes in other developing regions. The thesis first presents our efforts to help build measurement infrastructure to alleviate the shortage of a diversified range of Vantage Points (VPs) in the region, as we cannot improve what we cannot measure. It then presents our timely and longitudinal inspection of African interdomain routing, using the enhanced RIPE Atlas measurement infrastructure to fill the gap in knowledge of both the IPv4 and IPv6 topologies interconnecting local Internet Service Providers (ISPs). It notably proposes reproducible data analysis techniques suitable for any set of similar measurements, to infer the behavior of ISPs in the region. The results show a large variety of transit habits, which depend on socio-economic factors such as the language, the currency area, or the geographic location of the country in which the ISP operates. They indicate the prevailing dominance of ISPs based outside Africa for the provision of intracontinental paths, but also shed light on the efforts of stakeholders towards traffic localization. Next, the thesis investigates the causes and impacts of congestion in the African IXP substrate, as the prevalence of this endemic phenomenon in local Internet markets may hinder their growth. To this end, Ark monitors were deployed at six strategically selected local Internet eXchange Points (IXPs) and used to collect Time-Sequence Latency Probes (TSLP) measurements over a whole year. The analysis of these datasets reveals no evidence of widespread congestion: only 2.2% of the monitored links showed noticeable indications of congestion, a finding that encourages peering. The causes of these events were identified through interviews with IXP operators, showing how essential collaboration with stakeholders is to understanding the causes of performance degradation. As part of the Internet Society (ISOC) strategy to allow the Internet community to profile the IXPs of a particular region and monitor their evolution, a route-collector data analyzer was then developed, deployed, and tested in the AfriNIC region. This open-source web platform, titled the “African” Route-collectors Data Analyzer (ARDA), provides metrics that picture, in real time, the status of interconnection at different levels, using public routing information available at local route-collectors with a peering viewpoint of the Internet.
    The results highlight that only a small proportion of the Autonomous System Numbers (ASNs) assigned by AfriNIC (17%) are peering in the region, a fraction that remained static from April to September 2017 despite the significant growth of IXPs in some countries. They show how ARDA can help detect the impact of a policy on the IXP substrate and help ISPs worldwide identify new interconnection opportunities in Africa, the targeted region. Since broadening the underlying network is not useful without appropriately provisioned services to exploit it, the thesis then delves into the availability and utilization of the web infrastructure serving the continent. To this end, a comprehensive measurement methodology is applied to collect data from various sources. A focus on Google reveals that its content infrastructure in Africa is indeed expanding; nevertheless, much of its web content is still served from the United States (US) and Europe, although Google is the most popular content source in many African countries. The same analysis, repeated across top global and regional websites, shows that even top African websites prefer to host their content abroad. Following that, the primary bottlenecks faced by Content Providers (CPs) in the region, such as the lack of peering between the networks hosting our probes and poorly configured DNS resolvers, are explored in order to outline proposals for further ISP and CP deployments. Considering the above, one option to enrich connectivity and incentivize CPs to establish a presence in the region is to interconnect ISPs present at isolated IXPs by creating a distributed IXP layout spanning the continent. In this respect, the thesis finally provides a four-step interconnection scheme, which parameterizes socio-economic, geographical, and political factors using public datasets. It demonstrates that this constrained solution doubles the percentage of continental intra-African paths, reduces their length, and drastically decreases the median of their Round-Trip Times (RTTs), as well as RTTs to the ASes hosting the top 10 global and top 10 regional Alexa websites. We hope that quantitatively demonstrating the benefits of this framework will incentivize ISPs to intensify peering and CPs to increase their presence, enabling fast, affordable, and available access at the Internet frontier.
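
    The TSLP congestion analysis mentioned above can be hinted at with a hedged sketch (a simplification, not the actual measurement pipeline): queuing on an under-provisioned link lifts even the minimum RTT at peak hours, so sustained elevation of the per-hour minimum RTT over the off-peak baseline flags candidate congestion.

        def congestion_hours(hourly_min_rtt, threshold_ms=5.0):
            """hourly_min_rtt: 24 per-hour minimum RTTs (ms) for one IXP link.
            Returns the hours whose minimum RTT sits well above the off-peak
            baseline, the signature of queuing delay on a filling link."""
            baseline = min(hourly_min_rtt)
            return [hour for hour, rtt in enumerate(hourly_min_rtt)
                    if rtt - baseline > threshold_ms]

        day = [12.0] * 8 + [12.5, 14, 19, 21, 22, 21, 20, 18, 15, 13] + [12.0] * 6
        print(congestion_hours(day))   # [10, 11, 12, 13, 14, 15]: evening peak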