32 research outputs found

    Hypersparse Neural Network Analysis of Large-Scale Internet Traffic

    The Internet is transforming our society, necessitating a quantitative understanding of Internet traffic. Our team collects and curates the largest publicly available Internet traffic data containing 50 billion packets. Utilizing a novel hypersparse neural network analysis of "video" streams of this traffic using 10,000 processors in the MIT SuperCloud reveals a new phenomenon: the importance of otherwise unseen leaf nodes and isolated links in Internet traffic. Our neural network approach further shows that a two-parameter modified Zipf-Mandelbrot distribution accurately describes a wide variety of source/destination statistics on moving sample windows ranging from 100,000 to 100,000,000 packets over collections that span years and continents. The inferred model parameters distinguish different network streams, and the model leaf parameter strongly correlates with the fraction of the traffic in different underlying network topologies. The hypersparse neural network pipeline is highly adaptable, and different network statistics and training models can be incorporated with simple changes to the image filter functions. Comment: 11 pages, 10 figures, 3 tables, 60 citations; to appear in IEEE High Performance Extreme Computing (HPEC) 201
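
For readers unfamiliar with the distribution named in this abstract, the following is a minimal sketch. It assumes the commonly used two-parameter form p(d) proportional to 1/(d + delta)^alpha over degrees d = 1..d_max. The function name, parameter names, and example values are hypothetical and are not taken from the paper.

```python
def zipf_mandelbrot_pmf(alpha, delta, d_max):
    """Return normalized probabilities p(d) for d = 1..d_max.

    Assumed model form: p(d) is proportional to 1 / (d + delta) ** alpha.
    alpha sets the slope of the tail; delta shifts the weight at small d.
    """
    weights = [1.0 / (d + delta) ** alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameter values, chosen only for illustration.
pmf = zipf_mandelbrot_pmf(alpha=1.5, delta=0.5, d_max=1000)
```

Here `pmf[0]` is the modeled probability of degree 1, which is where leaf nodes and isolated links would appear in a degree distribution.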

    Going beyond diffServ in IP traffic classification

    Quality of Service (QoS) management in IP networks today relies on static configuration of class-of-service definitions and related forwarding priorities. Packets are currently classified according to the DiffServ architecture based on RFC 4594, typically through static configuration or filters matching packet features, at network access equipment. In this paper, we propose a dynamic classification procedure, referred to as Learning-powered DiffServ (L-DiffServ), able to detect the distinctive characteristics of traffic and to dynamically assign service classes to IP packets. The idea is to apply semi-unsupervised Machine Learning techniques, such as Linear Discriminant Analysis (LDA) and K-Means, with a proper customization to take into account the issues related to packet-level analysis, i.e. unbalanced distribution of traffic among classes and selection of proper IP header related features. The performance evaluation highlights that L-DiffServ is able to dynamically change the classification outcome, providing a higher number of classes than DiffServ. This last result represents the first step toward a more granular differentiation of IP traffic.
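
The clustering step mentioned in this abstract can be illustrated with plain K-Means. This is only a sketch: it does not include the paper's customization for unbalanced classes or the LDA stage, and the two features used (packet length and TTL) are assumptions chosen for the example, not the features selected by the authors.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-Means on tuples of numbers; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical per-packet features: (packet length in bytes, IP TTL).
packets = [(60, 64), (64, 64), (72, 63), (1400, 128), (1500, 128), (1480, 127)]
labels, centroids = kmeans(packets, k=2)
```

Each resulting cluster would then have to be mapped to a service class, a step this sketch leaves out.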

    Observations of IPv6 Addresses

    IPv6 addresses are longer than IPv4 addresses, and so are capable of greater expression. Given an IPv6 address, conventions and standards allow us to draw conclusions about how IPv6 is being used on the node with that address. We show a technique for analysing IPv6 addresses and apply it to a number of datasets. The datasets include addresses seen at a busy mirror server, at an IPv6-enabled TLD DNS server, and when running traceroute across the production IPv6 network. The technique quantifies differences in these datasets that we intuitively expect, and shows that IPv6 is being used in different ways by different groups.
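
The abstract does not spell out the technique, so the following is only a sketch of the general idea of reading usage hints from an address, using Python's standard `ipaddress` module. The function name, the trait labels, and the choice of checks (Teredo, 6to4, modified EUI-64, small interface identifiers) are assumptions made for illustration.

```python
import ipaddress


def describe_ipv6(text):
    """Return a list of usage hints inferred from one IPv6 address."""
    addr = ipaddress.IPv6Address(text)
    iid = int(addr) & ((1 << 64) - 1)  # low 64 bits: interface identifier
    traits = []
    if addr.teredo is not None:        # 2001:0000::/32 tunnelling prefix
        traits.append("teredo")
    if addr.sixtofour is not None:     # 2002::/16 transition prefix
        traits.append("6to4")
    # Modified EUI-64 identifiers carry 0xFFFE in the middle two bytes,
    # which suggests autoconfiguration from a MAC address.
    if (iid >> 24) & 0xFFFF == 0xFFFE:
        traits.append("eui-64")
    # Very small identifiers suggest manual assignment (e.g. ::1, ::53).
    if iid < (1 << 16):
        traits.append("low-numbered")
    return traits


print(describe_ipv6("2001:db8::211:22ff:fe33:4455"))
print(describe_ipv6("2001:db8::1"))
```

Counting how often each trait occurs in a dataset gives a simple profile that can be compared across sources such as a mirror server, a DNS server, or traceroute hops.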

    Multi-Temporal Analysis and Scaling Relations of 100,000,000,000 Network Packets

    Our society has never been more dependent on computer networks. Effective utilization of networks requires a detailed understanding of the normal background behaviors of network traffic. Large-scale measurements of networks are computationally challenging. Building on prior work in interactive supercomputing and GraphBLAS hypersparse hierarchical traffic matrices, we have developed an efficient method for computing a wide variety of streaming network quantities on diverse time scales. Applying these methods to 100,000,000,000 anonymized source-destination pairs collected at a network gateway reveals many previously unobserved scaling relationships. These observations provide new insights into normal network background traffic that could be used for anomaly detection, AI feature engineering, and testing theoretical models of streaming networks. Comment: 6 pages, 6 figures, 3 tables, 49 references, accepted to IEEE HPEC 202
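
To make "streaming network quantities" concrete, the sketch below computes a few per-window statistics from source-destination pairs. It uses a Python dictionary as a stand-in for a hypersparse traffic matrix rather than GraphBLAS, and the function name and the particular quantities are assumptions for illustration, not the paper's definitions.

```python
from collections import Counter, defaultdict


def window_statistics(pairs):
    """Summarize one window of (source, destination) packet records."""
    # Sparse traffic matrix: only observed (src, dst) links are stored.
    link_packets = Counter(pairs)
    source_packets = Counter()
    fan_out = defaultdict(set)
    for (src, dst), n in link_packets.items():
        source_packets[src] += n   # packets sent by each source
        fan_out[src].add(dst)      # distinct destinations per source
    return {
        "packets": sum(link_packets.values()),
        "unique_links": len(link_packets),
        "unique_sources": len(source_packets),
        "max_source_packets": max(source_packets.values(), default=0),
        "max_source_fan_out": max((len(d) for d in fan_out.values()), default=0),
    }


# Tiny hypothetical window of anonymized pairs.
window = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")]
stats = window_statistics(window)
```

Repeating this over windows of different sizes is one way to look at how such quantities scale with the number of packets.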

    GNOSIS: Global Network Operations Status Information System

    Monitoring the global state of a network is a continuing challenge for network operators and users. It has become still harder with increases in scale and heterogeneity. Monitoring requires collecting status information for each node and constructing the global picture at a monitoring point. GNOSIS, the Global Network Operations Status Information System, achieves a global view by careful extraction and presentation of locally available node data. The GNOSIS model improves on the traditional polling model of monitoring schemes by (1) collecting accurate data, (2) decreasing the granularity with which network applications can detect change in the network, and (3) displaying status information in near real-time. We define the Network Snapshot as the basic unit of information capture and display in GNOSIS. A Network Snapshot is a visualization of locally available state collected during a common time interval. A sequence of these Network Snapshots over time represents the evolution of network state. In this paper, we motivate the need for a network monitoring system that can detect global problems, in spite of both scale and heterogeneity. We present three design criteria, Accuracy, Continuity, and Timeliness, for a global monitoring system. Finally, we present the GNOSIS architecture and demonstrate how it better detects network problems which are currently of concern. The goal of GNOSIS is to present a stream of consistent, accurate local data in a timely manner.