21 research outputs found

    Latency-Based Anycast Geolocation: Algorithms, Software, and Datasets

    Use of IP-layer anycast has increased in the last few years beyond the DNS realm. Yet, existing measurement techniques to identify and enumerate anycast replicas exploit specifics of the DNS protocol, which limits their applicability to this particular service. With this paper, we not only propose and thoroughly validate a protocol-agnostic technique for anycast replica discovery and geolocation, but also provide the community with open-source software and datasets to replicate our experimental results and to facilitate the development of new techniques such as ours. In particular, our proposed method achieves thorough enumeration and city-level geolocation of anycast instances from a set of known vantage points. The algorithm features an iterative workflow, pipelining enumeration (an optimization problem using latency as input) and geolocation (a classification problem using side-channel information such as city population) of anycast replicas. Results of a thorough validation campaign show our algorithm to be robust to measurement noise and very lightweight, as it requires only a handful of latency measurements.
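    The abstract describes but does not include the enumeration algorithm itself; the sketch below only illustrates the latency-based idea it names, under common assumptions: each vantage point's RTT bounds how far away the replica it reached can be (roughly 100 km per millisecond of one-way delay in fiber), and a maximal set of mutually disjoint latency discs gives a lower bound on the number of replicas. Function names and the propagation constant are illustrative, and the subsequent classification step (ranking candidate cities inside each disc, e.g. by population) is omitted.

```python
import math

# Rough propagation speed in fiber (~2/3 of c), expressed in km per ms.
KM_PER_MS = 100.0

def disc_radius_km(rtt_ms):
    """Upper bound on the distance from a vantage point to the replica it hit."""
    return (rtt_ms / 2.0) * KM_PER_MS

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def enumerate_replicas(measurements):
    """Greedy maximal set of pairwise-disjoint latency discs.

    measurements: list of (lat, lon, rtt_ms) tuples, one per vantage point.
    The length of the returned list lower-bounds the number of anycast replicas.
    """
    discs = sorted(
        ((lat, lon, disc_radius_km(rtt)) for lat, lon, rtt in measurements),
        key=lambda d: d[2],  # prefer small discs: tighter geolocation later
    )
    selected = []
    for lat, lon, r in discs:
        # Two discs are disjoint iff the distance between centers exceeds
        # the sum of their radii; disjoint discs must hold distinct replicas.
        if all(haversine_km(lat, lon, slat, slon) > r + sr
               for slat, slon, sr in selected):
            selected.append((lat, lon, r))
    return selected
```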

    A First Characterization of Anycast Traffic from Passive Traces

    IP anycast routes packets to the topologically nearest server according to BGP proximity. In recent years, new players have started adopting this technology to serve web content via anycast-enabled CDNs (A-CDNs). To the best of our knowledge, the literature contains studies that focus on specific A-CDN deployments, but little is known about the users and the services that A-CDNs serve in the Internet at large. This prompted us to perform a passive characterization study, bringing out the principal A-CDN actors in our monitored setup, the services they offer, their penetration, etc. Results show a very heterogeneous picture, with A-CDN-empowered services that are very popular (e.g., Twitter or Bing), serve a wide variety of content (e.g., Wordpress or adult content), and even include audio/video streaming (e.g., Soundcloud or Vine). Our measurements show that the A-CDN technology is quite mature and popular, with more than 50% of web users accessing content served by an A-CDN during peak time.
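    The abstract does not spell out how flows in the passive traces are attributed to A-CDNs. One simple, hypothetical building block for such a study is to match the server address of each flow record against a list of prefixes known (or measured, e.g. with an enumeration technique like the one above) to be anycast. The prefixes and field names below are placeholders, not the paper's actual pipeline.

```python
import ipaddress

# Hypothetical list of prefixes known to be anycast-announced;
# the entries below are documentation prefixes used only for illustration.
ANYCAST_PREFIXES = [ipaddress.ip_network(p) for p in (
    "192.0.2.0/24",
    "198.51.100.0/24",
)]

def is_anycast(server_ip):
    """True if the server address falls inside a known anycast prefix."""
    addr = ipaddress.ip_address(server_ip)
    return any(addr in net for net in ANYCAST_PREFIXES)

def label_flows(flows):
    """flows: iterable of dicts with at least a 'server_ip' key."""
    for flow in flows:
        flow["a_cdn"] = is_anycast(flow["server_ip"])
        yield flow
```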

    Anycast on the Move: A Look at Mobile Anycast Performance

    The appeal and clear operational and economic benefits of anycast to service providers have motivated a number of recent experimental studies on its potential performance impact for end users. For CDNs on mobile networks, in particular, anycast provides a simpler alternative to existing routing systems challenged by a growing, complex, and commonly opaque cellular infrastructure. This paper presents the first analysis of anycast performance for mobile users. In particular, our evaluation focuses on two distinct anycast services, both providing part of the DNS Root zone and together covering all major geographical regions. Our results show that mobile clients tend to be routed to suboptimal replicas in terms of geographical distance, more frequently while on a cellular connection than on WiFi, with a significant impact on latency. We find that this is not simply an issue of lacking better alternatives, and that the problem is not specific to particular geographic areas or autonomous systems. We close with a first analysis of the root causes of this phenomenon and describe some of the major classes of anycast anomalies revealed during our study, including a systematic approach to automatically detect such anomalies without any training or annotated measurements. We release our datasets to the networking community.
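    The paper's anomaly detector is not described in the abstract; the sketch below only illustrates the notion of geographic sub-optimality it measures: compare the distance to the replica a mobile client actually reached with the distance to the closest available replica, and flag large gaps. The threshold and helper names are illustrative assumptions.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def excess_distance_km(client, reached_replica, all_replicas):
    """How much farther the reached replica is than the closest known one."""
    d_reached = haversine_km(*client, *reached_replica)
    d_best = min(haversine_km(*client, *r) for r in all_replicas)
    return d_reached - d_best

def is_suboptimal(client, reached_replica, all_replicas, threshold_km=1000.0):
    # The threshold is an illustrative choice, not the paper's criterion.
    return excess_distance_km(client, reached_replica, all_replicas) > threshold_km
```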

    The Impact of DNSSEC on the Internet Landscape

    In this dissertation we investigate the security deficiencies of the Domain Name System (DNS) and assess the impact of the DNSSEC security extensions. DNS spoofing attacks divert an application to the wrong server, but are also used routinely for blocking access to websites. We provide evidence for systematic DNS spoofing in China and Iran with measurement-based analyses, which allow us to examine the DNS spoofing filters from vantage points outside of the affected networks. Third parties in other countries can be affected inadvertently by spoofing-based domain filtering, which could be averted with DNSSEC. The security goals of DNSSEC are data integrity and authenticity. A point solution called NSEC3 adds a privacy assertion to DNSSEC, which is supposed to prevent disclosure of the domain namespace as a whole. We present GPU-based attacks on the NSEC3 privacy assertion, which allow efficient recovery of the namespace contents. We demonstrate with active measurements that DNSSEC has found wide adoption after initial hesitation. On the server side, there are more than five million domains signed with DNSSEC; a portion of them is insecure due to insufficient cryptographic key lengths or broken due to maintenance failures. On the client side, we have observed a worldwide increase of DNSSEC validation over the last three years, though not necessarily on the last mile. Deployment of DNSSEC validation on end hosts is impaired by intermediate caching components, which degrade the availability of DNSSEC. However, intermediate caches contribute to the performance and scalability of the Domain Name System, as we show with trace-driven simulations. We suggest that validating end hosts utilize intermediate caches by default but fall back to autonomous name resolution in case of DNSSEC failures.
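    The abstract mentions GPU-based attacks on NSEC3 without detail. As background, the sketch below shows the core of a (CPU-bound) dictionary attack on NSEC3 hashes as defined in RFC 5155: hash candidate names with the zone's salt and iteration count and compare them against hashed owner names harvested from NSEC3 records. Function names and parameters are illustrative; an attack of the kind described in the dissertation would run the hashing on GPUs.

```python
import base64
import hashlib

def dns_wire_format(name):
    """Lowercased FQDN in DNS wire format (length-prefixed labels + root byte)."""
    out = b""
    for label in name.lower().rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"

def nsec3_hash(name, salt_hex, iterations):
    """NSEC3 hashed owner name per RFC 5155 (iterated SHA-1, base32hex)."""
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(dns_wire_format(name) + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    # Convert standard base32 output to the base32hex alphabet used by NSEC3.
    return base64.b32encode(digest).decode("ascii").translate(
        str.maketrans("ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
                      "0123456789ABCDEFGHIJKLMNOPQRSTUV"))

def dictionary_attack(hashed_owners, candidates, zone, salt_hex, iterations):
    """Map harvested NSEC3 hashes back to names from a candidate wordlist."""
    recovered = {}
    for word in candidates:
        fqdn = f"{word}.{zone}"
        h = nsec3_hash(fqdn, salt_hex, iterations)
        if h in hashed_owners:
            recovered[h] = fqdn
    return recovered
```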

    Machine Learning and Big Data Methodologies for Network Traffic Monitoring

    Over the past 20 years, the Internet has seen an exponential growth of traffic, users, services, and applications. Currently, it is estimated that the Internet is used every day by more than 3.6 billion users, who generate 20 TB of traffic per second. Such a huge amount of data challenges network managers and analysts to understand how the network is performing, how users are accessing resources, how to properly control and manage the infrastructure, and how to detect possible threats. Along with mathematical, statistical, and set theory methodologies, machine learning and big data approaches have emerged to build systems that aim at automatically extracting information from the raw data that network monitoring infrastructures offer. In this thesis I will address different network monitoring solutions, evaluating several methodologies and scenarios. I will show how, by following a common workflow, it is possible to exploit mathematical, statistical, set theory, and machine learning methodologies to extract meaningful information from the raw data. Particular attention will be given to machine learning and big data methodologies such as DBSCAN and the Apache Spark big data framework. The results show that, while mathematical, statistical, and set theory tools help characterize a problem, machine learning methodologies are very useful to discover hidden information in the raw data. Using the DBSCAN clustering algorithm, I will show how YouLighter, an unsupervised methodology, groups caches serving YouTube traffic into edge nodes, and later, using the notion of Pattern Dissimilarity, how to identify changes in their usage over time. By applying YouLighter to 10-month-long traces, I will pinpoint sudden changes in YouTube edge-node usage, changes that also impair the end users' Quality of Experience. I will also apply DBSCAN in the deployment of SeLINA, a self-tuning tool implemented in the Apache Spark big data framework to autonomously extract knowledge from network traffic measurements. Using SeLINA, I will show how to automatically detect the changes in the YouTube CDN previously highlighted by YouLighter. Along with these machine learning studies, I will show how to use mathematical and set theory methodologies to investigate the browsing habits of Internet users. Using a two-week dataset, I will show that, over this period, users keep discovering new websites, and that DNS information alone is not enough to build a reliable user profile. By exploiting mathematical and statistical tools, I will then characterize anycast-enabled CDNs (A-CDNs), showing that A-CDNs are widely used for both stateless and stateful services, that they are quite popular (more than 50% of web users contact an A-CDN every day), and that stateful services can benefit from A-CDNs, since their paths are very stable over time, as demonstrated by the presence of only a few anomalies in their round-trip times. Finally, I will conclude by showing how I used BGPStream, an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data. Using BGPStream in real-time mode, I will show how I detected a Multiple Origin AS (MOAS) event and how I studied the propagation of the blackholing community, showing its effect in the network. Then, using BGPStream in historical mode and the Apache Spark big data framework over 16 years of data, I will present results such as the continuous growth of IPv4 prefixes and the growth of MOAS events over time. All these studies aim to show how monitoring is a fundamental task in different scenarios, highlighting in particular the importance of machine learning and big data methodologies.
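    The abstract names DBSCAN but not the features or parameters used by YouLighter or SeLINA. The following is a minimal, generic sketch of density-based clustering of per-cache features with scikit-learn; the random feature matrix, eps, and min_samples are placeholders that would have to be replaced by the thesis' actual features and tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical per-cache feature matrix: one row per YouTube cache IP,
# e.g. traffic observed from each monitored vantage point. Random data here.
rng = np.random.default_rng(0)
features = rng.random((500, 8))

# eps and min_samples are illustrative; they must be tuned to the data.
labels = DBSCAN(eps=0.3, min_samples=10).fit_predict(features)

# Caches sharing a label form a candidate edge-node; -1 marks noise points.
for cluster_id in sorted(set(labels)):
    members = np.flatnonzero(labels == cluster_id)
    name = "noise" if cluster_id == -1 else f"edge-node {cluster_id}"
    print(f"{name}: {len(members)} caches")
```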

    Improving Anycast with Measurements

    Since the first Distributed Denial-of-Service (DDoS) attacks were launched, the strength of such attacks has been steadily increasing, from a few megabits per second to well into the terabit/s range. The damage that these attacks cause, mostly in terms of financial cost, has prompted researchers and operators alike to investigate and implement mitigation strategies. Examples of such strategies include local filtering appliances, Border Gateway Protocol (BGP)-based blackholing, and outsourced mitigation in the form of cloud-based DDoS protection providers. Some of these strategies are more suited towards high-bandwidth DDoS attacks than others. For example, using a local filtering appliance means that all the attack traffic still passes through the owner's network, which inherently limits the capacity of such a device to the bandwidth that is available. BGP blackholing does not have such limitations, but can, as a side effect, cause service disruptions to end users. A different strategy, which has not attracted much attention in academia, is based on anycast. Anycast is a technique that allows operators to replicate their service across different physical locations, while keeping that service addressable with just a single IP address. It relies on BGP to effectively load-balance users. In practice, it is combined with other mitigation strategies to allow those to scale up: operators can use anycast to scale their mitigation capacity horizontally. Because anycast relies on BGP, and therefore in essence on the Internet itself, it can be difficult for network engineers to fine-tune this balancing behavior. In this thesis, we show that this is indeed the case through two different case studies. In the first, we focus on an anycast service during normal operations, namely Google Public DNS, and show that the routing within this service is far from optimal, for example in terms of distance between the client and the server. In the second case study, we observe the root DNS while it is under attack, and show that even though in aggregate the bandwidth available to this service exceeded the attack we observed, clients still experienced service degradation. This degradation occurred because some sites of the anycast service received a much higher share of traffic than others. In order for operators to improve their anycast networks, and optimize them in terms of resilience against DDoS attacks, a method to assess the actual state of such a network is required. Existing methodologies typically rely on external vantage points, such as those provided by RIPE Atlas, and are therefore limited in scale and inherently biased in terms of distribution. We propose a new measurement methodology, named Verfploeter, to assess the characteristics of anycast networks in terms of client-to-Point-of-Presence (PoP) mapping, i.e. the anycast catchment. This method does not rely on external vantage points, is free of bias, and offers a much higher resolution than any previous method. We validated this methodology by deploying it on a locally developed testbed, as well as on the B root DNS. We showed that the increased resolution of this methodology improved our ability to assess the impact of changes in the network configuration, when compared to previous methodologies. As final validation we implemented Verfploeter on Cloudflare's global-scale anycast Content Delivery Network (CDN), which has almost 200 global Points-of-Presence and an aggregate bandwidth of 30 Tbit/s. Through three real-world use cases, we demonstrate the benefits of our methodology. Firstly, we show that the changes that occur when withdrawing routes from certain PoPs can be accurately mapped, and that in certain cases the effect of taking down a combination of PoPs can be calculated from individual measurements. Secondly, we show that Verfploeter largely reinstates the ping to its former glory, showing how it can be used to troubleshoot network connectivity issues in an anycast context. Thirdly, we demonstrate how accurate anycast catchment maps offer operators a new and highly accurate tool to identify and filter spoofed traffic. Where possible, we make datasets collected over the course of the research in this thesis available as open-access data; the two best (open) dataset awards that were awarded for these datasets confirm that they are a valued contribution. In summary, we have investigated two large anycast services and have shown that their deployments are not optimal. We developed a novel measurement methodology that is free of bias and is able to obtain highly accurate anycast catchment mappings. By implementing this methodology and deploying it on a global-scale anycast network, we show that our method adds significant value to the fast-growing anycast CDN industry and enables new ways of detecting, filtering and mitigating DDoS attacks.
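    The core idea behind Verfploeter, as described in the associated publications, is to probe from the anycast service itself: ICMP echo requests carrying the anycast address as source are sent to a large hitlist, so each echo reply is routed back to whichever PoP the responding network falls into, revealing the catchment. The sketch below illustrates that idea with Scapy; the addresses, hitlist, and capture logic are placeholders, and running it requires operating inside the anycast network with root privileges.

```python
from scapy.all import IP, ICMP, send, sniff  # requires root privileges

ANYCAST_SRC = "192.0.2.1"                    # placeholder anycast service address
HITLIST = ["198.51.100.7", "203.0.113.9"]    # placeholder target hitlist

def send_probes():
    """Sender component: probe the hitlist with the anycast source address."""
    for seq, target in enumerate(HITLIST):
        pkt = IP(src=ANYCAST_SRC, dst=target) / ICMP(id=0x5665, seq=seq)
        send(pkt, verbose=False)

def collect_replies(site_name, timeout=30):
    """Collector component, run at every PoP: echo replies that arrive here
    belong to this PoP's catchment."""
    replies = sniff(filter="icmp[icmptype] == icmp-echoreply", timeout=timeout)
    return {pkt[IP].src: site_name for pkt in replies if ICMP in pkt}
```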

    The Cooperative Defense Overlay Network: A Collaborative Automated Threat Information Sharing Framework for a Safer Internet

    With the ever-growing proliferation of hardware- and software-based computer security exploits and the increasing power and prominence of distributed attacks, network and system administrators are often forced to make a difficult decision: expend tremendous resources on defense against sophisticated and continually evolving attacks from an increasingly dangerous Internet, with varying levels of success; or expend fewer resources on defending against common attacks on "low-hanging fruit," hoping to avoid the less common but incredibly devastating zero-day worm or botnet attack. Home networks and small organizations are usually forced to choose the latter option and in so doing are left vulnerable to all but the simplest of attacks. While automated tools exist for sharing information about network-based attacks, this sharing is typically limited to administrators of large networks and dedicated security-conscious users, to the exclusion of smaller organizations and novice home users. In this thesis we propose a framework for a cooperative defense overlay network (CODON) in which participants with varying technical abilities and resources can contribute to the security and health of the Internet via automated crowdsourcing, rapid information sharing, and the principle of collateral defense.