4 research outputs found

    Assessing Internet-wide Cyber Situational Awareness of Critical Sectors

    Get PDF
    In this short paper, we take a first step towards empirically assessing Internet-wide malicious activities generated from and targeted towards Internet-scale business sectors (i.e., financial, health, education, etc.) and critical infrastructure (i.e., utilities, manufacturing, government, etc.). Facilitated by an innovative and a collaborative large-scale effort, we have conducted discussions with numerous Internet entities to obtain rare and private information related to allocated IP blocks pertaining to the aforementioned sectors and critical infrastructure. To this end, we employ such information to attribute Internet-scale maliciousness to such sectors and realms, in an attempt to provide an in-depth analysis of the global cyber situational posture. We draw upon close to 16.8 TB of darknet data to infer probing activities (typically generated by malicious/infected hosts) and DDoS backscatter, from which we distill IP addresses of victims. By executing week-long measurements, we observed an alarming number of more than 11,000 probing machines and 300 DDoS attack victims hosted by critical sectors. We also generate rare insights related to the maliciousness of various business sectors, including financial, which typically do not report their hosted and targeted illicit activities for reputation-preservation purposes. While we treat the obtained results with strict confidence due to obvious sensitivity reasons, we postulate that such generated cyber threat intelligence could be shared with sector/critical infrastructure operators, backbone networks and Internet service providers to contribute to the overall threat remediation objective

    Hypersparse Neural Network Analysis of Large-Scale Internet Traffic

    Full text link
    The Internet is transforming our society, necessitating a quantitative understanding of Internet traffic. Our team collects and curates the largest publicly available Internet traffic data containing 50 billion packets. Utilizing a novel hypersparse neural network analysis of "video" streams of this traffic using 10,000 processors in the MIT SuperCloud reveals a new phenomena: the importance of otherwise unseen leaf nodes and isolated links in Internet traffic. Our neural network approach further shows that a two-parameter modified Zipf-Mandelbrot distribution accurately describes a wide variety of source/destination statistics on moving sample windows ranging from 100,000 to 100,000,000 packets over collections that span years and continents. The inferred model parameters distinguish different network streams and the model leaf parameter strongly correlates with the fraction of the traffic in different underlying network topologies. The hypersparse neural network pipeline is highly adaptable and different network statistics and training models can be incorporated with simple changes to the image filter functions.Comment: 11 pages, 10 figures, 3 tables, 60 citations; to appear in IEEE High Performance Extreme Computing (HPEC) 201

    Data-driven curation, learning and analysis for inferring evolving IoT botnets in the wild

    Get PDF
    The insecurity of the Internet-of-Things (IoT) paradigm continues to wreak havoc in consumer and critical infrastructure realms. Several challenges impede addressing IoT security at large, including, the lack of IoT-centric data that can be collected, analyzed and correlated, due to the highly heterogeneous nature of such devices and their widespread deployments in Internet-wide environments. To this end, this paper explores macroscopic, passive empirical data to shed light on this evolving threat phenomena. This not only aims at classifying and inferring Internet-scale compromised IoT devices by solely observing such one-way network traffic, but also endeavors to uncover, track and report on orchestrated "in the wild" IoT botnets. Initially, to prepare the effective utilization of such data, a novel probabilistic model is designed and developed to cleanse such traffic from noise samples (i.e., misconfiguration traffic). Subsequently, several shallow and deep learning models are evaluated to ultimately design and develop a multi-window convolution neural network trained on active and passive measurements to accurately identify compromised IoT devices. Consequently, to infer orchestrated and unsolicited activities that have been generated by well-coordinated IoT botnets, hierarchical agglomerative clustering is deployed by scrutinizing a set of innovative and efficient network feature sets. By analyzing 3.6 TB of recent darknet traffic, the proposed approach uncovers a momentous 440,000 compromised IoT devices and generates evidence-based artifacts related to 350 IoT botnets. While some of these detected botnets refer to previously documented campaigns such as the Hide and Seek, Hajime and Fbot, other events illustrate evolving threats such as those with cryptojacking capabilities and those that are targeting industrial control system communication and control services
    corecore