6 research outputs found

    Machine Learning and Big Data Methodologies for Network Traffic Monitoring

    Over the past 20 years, the Internet has seen exponential growth in traffic, users, services and applications. Currently, it is estimated that the Internet is used every day by more than 3.6 billion users, who generate 20 TB of traffic per second. Such a huge amount of data challenges network managers and analysts to understand how the network is performing, how users are accessing resources, how to properly control and manage the infrastructure, and how to detect possible threats. Along with mathematical, statistical, and set theory methodologies, machine learning and big data approaches have emerged to build systems that aim at automatically extracting information from the raw data that network monitoring infrastructures offer. In this thesis I will address different network monitoring solutions, evaluating several methodologies and scenarios. I will show how, by following a common workflow, it is possible to exploit mathematical, statistical, set theory, and machine learning methodologies to extract meaningful information from the raw data. Particular attention will be given to machine learning and big data methodologies such as DBSCAN and the Apache Spark big data framework. The results show that, although mathematical, statistical, and set theory tools can be used to characterize a problem, machine learning methodologies are very useful for discovering hidden information in the raw data. Using the DBSCAN clustering algorithm, I will show how to use YouLighter, an unsupervised methodology to group caches serving YouTube traffic into edge-nodes, and later, by using the notion of Pattern Dissimilarity, how to identify changes in their usage over time. By using YouLighter over 10-month-long traces, I will pinpoint sudden changes in the YouTube edge-nodes usage, changes that also impair the end users’ Quality of Experience.
I will also apply DBSCAN in the deployment of SeLINA, a self-tuning tool implemented in the Apache Spark big data framework to autonomously extract knowledge from network traffic measurements. By using SeLINA, I will show how to automatically detect the changes of the YouTube CDN previously highlighted by YouLighter. Along with these machine learning studies, I will show how to use mathematical and set theory methodologies to investigate the browsing habits of Internauts. By using a two-week dataset, I will show how, over this period, the Internauts continue discovering new websites. Moreover, I will show that by using only DNS information to build a profile, it is hard to build a reliable profiler. Instead, by exploiting mathematical and statistical tools, I will show how to characterize Anycast-enabled CDNs (A-CDNs). I will show that A-CDNs are widely used for both stateless and stateful services; that A-CDNs are quite popular, as more than 50% of web users contact an A-CDN every day; and that stateful services can benefit from A-CDNs, since their paths are very stable over time, as demonstrated by the presence of only a few anomalies in their Round Trip Time. Finally, I will conclude by showing how I used BGPStream, an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data. By using BGPStream in real-time mode, I will show how I detected a Multiple Origin AS (MOAS) event, and how I studied the black-holing community propagation, showing the effect of this community in the network. Then, by using BGPStream in historical mode and the Apache Spark big data framework over 16 years of data, I will show different results such as the continuous growth of IPv4 prefixes and the growth of MOAS events over time. All these studies aim to show that monitoring is a fundamental task in different scenarios and, in particular, to highlight the importance of machine learning and big data methodologies.
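As a rough illustration of the MOAS notion mentioned in this abstract (a prefix announced by more than one origin AS), the sketch below flags such prefixes in a small list of announcements. It does not use BGPStream and is not the thesis's method: the function name `find_moas`, the `(prefix, as_path)` input shape, and the sample data (documentation prefixes and private-use AS numbers) are all assumptions made for the example.

```python
from collections import defaultdict


def find_moas(announcements):
    """Return {prefix: set of origin ASNs} for prefixes with several origins.

    `announcements` is an iterable of (prefix, as_path) pairs, where as_path
    is a space-separated string of AS numbers and the last hop is taken as
    the origin AS. This input shape is an assumption made for the example.
    """
    origins = defaultdict(set)
    for prefix, as_path in announcements:
        hops = as_path.split()
        if not hops:
            continue  # skip announcements with an empty AS path
        origins[prefix].add(hops[-1])
    # Keep only prefixes seen with more than one distinct origin AS.
    return {p: asns for p, asns in origins.items() if len(asns) > 1}


# Hypothetical sample data, not taken from any real BGP feed.
sample = [
    ("192.0.2.0/24", "64500 64501 64510"),
    ("192.0.2.0/24", "64502 64511"),
    ("198.51.100.0/24", "64500 64512"),
    ("198.51.100.0/24", "64503 64512"),
]
moas = find_moas(sample)
```

Here only `192.0.2.0/24` is reported, because its two announcements end in different origin ASes, while `198.51.100.0/24` is reached over two paths that share the same origin. A real-time detector would additionally need to handle withdrawals and AS sets, which this sketch ignores.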

    Mobile Oriented Future Internet (MOFI)

    This Special Issue consists of seven papers that discuss how to enhance mobility management and its associated performance in the mobile-oriented future Internet (MOFI) environment. The first two papers deal with the architectural design and experimentation of mobility management schemes, in which new schemes are proposed and real-world testbed experiments are performed. The subsequent three papers focus on the use of software-defined networks (SDN) for effective service provisioning in the MOFI environment, together with real-world practices and testbed experiments. The remaining two papers discuss network engineering issues in newly emerging mobile networks, such as flying ad-hoc networks (FANET) and connected vehicular networks.

    Distribution efficace des contenus dans les réseaux : partage de ressources sans fil, planification et sécurité (Efficient content distribution in networks: wireless resource sharing, planning and security)

    In recent years, the amount of traffic requests that Internet users generate on a daily basis has increased exponentially, mostly due to the worldwide success of video streaming services such as Netflix and YouTube. While Content-Delivery Networks (CDNs) are the de facto standard used nowadays to serve the ever-increasing users’ demands, the scientific community has formulated proposals known under the name of Content-Centric Networks (CCN) to change the network protocol stack in order to turn the network into a content distribution infrastructure. In this context, this Ph.D. thesis studies efficient techniques to foster content distribution, taking into account three complementary problems: 1) We consider the scenario of a wireless heterogeneous network, and we formulate a novel mechanism to motivate wireless access point owners to lease their unexploited bandwidth and cache storage in exchange for an economic incentive. 2) We study the centralized network planning problem in the presence of distributed caches and (I) we analyze the optimal migration to CCN; (II) we compare the performance bounds for a CDN with those of a CCN; and (III) we take into account a virtualized CDN and study the stochastic planning problem for one such architecture. 3) We investigate the security properties on access control and trackability and formulate ConfTrack-CCN: a CCN extension to enforce confidentiality, trackability and access policy evolution in the presence of distributed caches.

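    H3N - Analysewerkzeuge für hybride Wegewahl in heterogenen, unterbrechungstoleranten Ad-Hoc-Netzen für Rettungskräfte (H3N - Analysis tools for hybrid routing in heterogeneous, disruption-tolerant ad hoc networks for first responders)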

    Communication between participating first responders is essential for the efficient coordination of rescue missions and thus for saving human lives. Ideally, ad hoc-style communication networks are used for this, as first responders cannot rely on infrastructure-based communication for two reasons. First, the infrastructure could be damaged by the disastrous event or not be available for economic reasons. Second, even if public infrastructure is available and functional, it might be overloaded by users. To meet the robustness and reliability requirements of first responders, Mobile Ad Hoc Networks (MANETs) have to be combined with an approach to mitigate the intermittent connectivity that arises, for example, when larger distances have to be bridged.
Delay Tolerant Networks (DTNs) provide such functionality but introduce additional delay, which is problematic for rescue communication. Therefore, intelligent hybrid routing approaches are required to limit the delay introduced by DTN mechanisms. Besides that, the approach should be applicable to heterogeneous networks in terms of communication technologies and device capabilities. This is required for cross-agency and volunteer communication, and it also enables the opportunistic exploitation of any given communication option. To evaluate such systems and develop the corresponding communication protocols, various analysis tools are available. These include analytical models, simulations and real-world experiments on target hardware. In each category a wide set of tools is already available. However, each tool usually focuses on specific aspects and thus does not provide methods to analyze hybrid approaches out of the box. Even if the tools are modular and allow extension, there are often other tools that are better suited for partial aspects of hybrid systems. In addition, few tools exist to model the characteristics of first responder networks. In particular, the characteristic movement of nodes during missions and the generated data traffic are difficult to model and integrate into analyses. The focus of this project is therefore to develop selected extensions to existing analysis and simulation tools as well as additional tools and models to realistically capture the characteristics of first responder networks. The goal is to combine the advantages of existing specialized simulation tools to enable thorough evaluations of hybrid protocols for heterogeneous networks based on realistic assumptions. To achieve this, the existing tools are extended in a targeted way through existing interfaces, and new tools are developed that enable interaction between tools and complement the existing analysis capabilities.
First results obtained with the resulting toolbox clearly indicate further research directions as well as potential for protocol enhancements. Besides that, the toolbox was used to evaluate various methods to enhance the connectivity between nodes in first responder networks. First responder scenarios are used as an example here; the toolbox itself is, however, not limited to this use case. In addition to the analysis of existing approaches for hybrid and heterogeneous networks, the developed toolbox provides a base framework for the development and verification of newly developed protocols for such use cases.

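    ネットワークログの因果解析による障害の原因究明支援技術に関する研究 (A study on techniques to support root-cause investigation of failures through causal analysis of network logs)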

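    Degree type: Doctorate by coursework. Examination committee: (Chair) Professor Masayuki Inaba, University of Tokyo; Professor Shigeru Chiba, University of Tokyo; Professor Kenji Yamanishi, University of Tokyo; Professor Hiroshi Esaki, University of Tokyo; Associate Professor Masaya Nakayama, University of Tokyo; Lecturer Hideki Nakayama, University of Tokyo. University of Tokyo (東京大学)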