225 research outputs found

    Predicting DDoS Attacks Preventively Using Darknet Time-Series Dataset

    Cybercrime has become a major concern for network administrators. The number of DDoS attacks has grown rapidly over the last few decades. Attackers target networks, small or large, with this common attack. The consequences are severe because the attack erodes customers' trust in the service provider. This article employs machine learning methods to estimate the short-term consequences of an attack in terms of the number and dimension of hosts that it may target. The KDD Cup 99, CIC IDS 2017 and CIC Darknet 2020 datasets are used to build a prediction model. Feature selection is based on the KDD Cup 99 and CIC IDS 2017 datasets; the CIC Darknet 2020 dataset is used to predict the impact of a DDoS attack with an LSTM (Long Short-Term Memory) model. This model can help network administrators identify and preventively predict attacks within five minutes of the commencement of a potential attack.

    Approaches and Techniques for Fingerprinting and Attributing Probing Activities by Observing Network Telescopes

    The explosive growth, complexity, adoption and dynamism of cyberspace over the last decade has radically altered the globe. A plethora of nations have been at the very forefront of this change, fully embracing the opportunities provided by the advancements in science and technology in order to fortify the economy and to increase the productivity of everyday life. However, the significant dependence on cyberspace has indeed brought new risks that often compromise, exploit and damage invaluable data and systems. Thus, the capability to proactively infer malicious activities is of paramount importance. In this context, generating cyber threat intelligence related to probing or scanning activities renders an effective tactic to achieve the latter. In this thesis, we investigate such malicious activities, which are typically the precursors of various amplified, debilitating and disrupting cyber attacks. To achieve this task, we analyze real Internet-scale traffic targeting network telescopes or darknets, which are defined by routable, allocated yet unused Internet Protocol addresses. First, we present a comprehensive survey of the entire probing topic. Specifically, we categorize this topic by elaborating on the nature, strategies and approaches of such probing activities. Additionally, we provide the reader with a classification and an exhaustive review of various techniques that could be employed in such malicious activities. Finally, we depict a taxonomy of the current literature by focusing on distributed probing detection methods. Second, we focus on the problem of fingerprinting probing activities. To this end, we design, develop and validate approaches that can identify such activities targeting enterprise networks as well as those targeting the Internet-space. On one hand, the corporate probing detection approach uniquely exploits the information that could be leaked to the scanner, inferred from the internal network topology, to perform the detection.
On the other hand, the more darknet-tailored probing fingerprinting approach adopts a statistical approach to not only detect the probing activities but also identify the exact technique that was employed in such activities. Third, for attribution purposes, we propose a correlation approach that fuses probing activities with malware samples. The approach aims at detecting whether Internet-scale machines are infected or not as well as pinpointing the exact malware type/family, if the machines were found to be compromised. To achieve the intended goals, the proposed approach initially devises a probabilistic model to filter out darknet misconfiguration traffic. Consequently, probing activities are correlated with malware samples by leveraging fuzzy hashing and entropy-based techniques. To this end, we also investigate and report a rare Internet-scale probing event by proposing a multifaceted approach that correlates darknet, malware and passive DNS traffic. Fourth, we focus on the problem of identifying and attributing large-scale probing campaigns, which render a new era of probing events. These are distinguished from previous probing incidents as (1) the population of the participating bots is several orders of magnitude larger, (2) the target scope is generally the entire Internet Protocol (IP) address space, and (3) the bots adopt well-orchestrated, often botmaster-coordinated, stealth scan strategies that maximize targets' coverage while minimizing redundancy and overlap. To this end, we propose and validate three approaches. On one hand, two of the approaches rely on a set of behavioral analytics that aim at scrutinizing the traffic generated by the probing sources. Subsequently, they employ data mining and graph theoretic techniques to systematically cluster the probing sources into well-defined campaigns exhibiting similar behavior.
The third approach, on the other hand, exploits time series interpolation and prediction to pinpoint orchestrated probing campaigns and to filter out non-coordinated probing flows. We conclude this thesis by highlighting some research gaps that pave the way for future work.

    Data-Driven Approaches for Detecting Malware-Infected IoT Devices and Characterizing Their Unsolicited Behaviors by Leveraging Passive Internet Measurements

    Despite the benefits of Internet of Things (IoT) devices, the insecurity of IoT and their deployment nature have turned them into attractive targets for adversaries, which contributed to the rise of IoT-tailored malware as a major threat to the Internet ecosystem. In this thesis, we address the threats associated with the emerging IoT malware, which utilize exploited devices to perform large-scale cyber attacks (e.g., DDoS). To mitigate such threats, there is a need to possess an Internet perspective of the deployed IoT devices while building a better understanding about the behavioral characteristics of malware-infected devices, which is challenging due to the lack of empirical data and knowledge about the deployed IoT devices and their behavioral characteristics. To address these challenges, in this thesis, we leverage passive Internet measurements and IoT device information to detect exploited IoT devices and investigate their generated traffic at the network telescope (darknet). We aim at proposing data-driven approaches for effective and near real-time IoT threat detection and characterization. Additionally, we leverage a specialized IoT Honeypot to analyze a large corpus of real IoT malware binary executables. We aim at building a better understanding about the current state of IoT malware while addressing the problems of IoT malware classification and family attribution. To this end, we perform the following to achieve our objectives: First, we address the lack of empirical data and knowledge about IoT devices and their activities. To this end, we leverage an online IoT search engine (e.g., Shodan.io) to obtain publicly available device information in the realms of consumer and cyber-physical system (CPS), while utilizing passive network measurements collected at a large-scale network telescope (CAIDA), to infer compromised devices and their unsolicited activities.
Indeed, we were among the first to report experimental results on detecting compromised IoT devices and their behavioral characteristics in the wild, while demonstrating their active involvement in large-scale malware-generated malicious activities such as Internet scanning. Additionally, we leverage the IoT-generated backscatter traffic towards the network telescope to shed light on IoT devices that were victims of intensive Denial of Service (DoS) attacks. Second, given the highly orchestrated nature of IoT-driven cyber-attacks, we focus on the analysis of IoT-generated scanning activities to detect and characterize scanning campaigns generated by IoT botnets. To this end, we aggregate IoT-generated traffic and perform association rule mining to infer campaigns through common scanning objectives represented by targeted destination ports. Further, we leverage behavioural characteristics and aggregated flow features to correlate IoT devices using the DBSCAN clustering algorithm. Indeed, our findings shed light on compromised IoT devices, which tend to operate within well-coordinated IoT botnets. Third, considering the huge number of IoT devices and the magnitude of their malicious scanning traffic, we focus on addressing the operational challenges to automate large-scale campaign detection and analysis while generating threat intelligence in a timely manner. To this end, we leverage big data analytics frameworks such as Apache Spark to develop a scalable system for automated detection of infected IoT devices and characterization of their scanning activities using our proposed approach. Our evaluation results with over 4TB of IoT traffic demonstrated the effectiveness of the system to infer scanning campaigns generated by IoT botnets.
Moreover, we demonstrate the feasibility of the implemented system/framework as a platform for implementing further supporting applications, which leverage passive Internet measurement for characterizing IoT traffic and generating IoT-related threat intelligence. Fourth, we take first steps towards mitigating threats associated with the rise of IoT malware by creating a better understanding about the characteristics and inter-relations of IoT malware. To this end, we analyze about 70,000 IoT malware binaries obtained by a specialized IoT honeypot in the past two years. We investigate the distribution of IoT malware across known families, while exploring their detection timeline and persistence. Moreover, while we shed light on the effectiveness of IoT honeypots in detecting new/unknown malware samples, we utilize static and dynamic malware analysis techniques to uncover adversarial infrastructure and investigate functional similarities. Indeed, our findings enable unknown malware labeling/attribution while identifying new IoT malware variants. Additionally, we collect malware-generated scanning traffic (whenever available) to explore behavioral characteristics and associated threats/vulnerabilities. We conclude this thesis by discussing research gaps that pave the way for future work.
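
The abstract above mentions association rule mining over targeted destination ports as a way to infer scanning campaigns. A minimal sketch of that general idea follows. It is not the thesis's implementation: the function name `port_pair_rules`, the thresholds, and the sample device-to-ports mapping are all hypothetical, and only rules between single ports are considered.

```python
from collections import Counter
from itertools import combinations


def port_pair_rules(device_ports, min_support=0.5, min_confidence=0.8):
    """Return sorted (antecedent, consequent, support, confidence) rules.

    device_ports maps a device identifier to the destination ports it was
    seen scanning. Only single-port antecedents and consequents are mined.
    """
    total = len(device_ports)
    single = Counter()  # number of devices that scanned each port
    pair = Counter()    # number of devices that scanned both ports of a pair
    for ports in device_ports.values():
        unique = sorted(set(ports))
        single.update(unique)
        pair.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / total
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / single[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Hypothetical observations (assumed data, not taken from the thesis).
observed = {
    "dev-a": [23, 2323, 80],
    "dev-b": [23, 2323],
    "dev-c": [23, 2323, 8080],
    "dev-d": [80, 8080],
}
rules = port_pair_rules(observed)
```

With this sample input, the only port pair that clears both thresholds is 23 and 2323, which three of the four devices scan together. Devices sharing such a rule could then be grouped as candidates for a common campaign.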

    Detection of Sparse Anomalies in High-Dimensional Network Telescope Signals

    Network operators and system administrators are increasingly overwhelmed with incessant cyber-security threats ranging from malicious network reconnaissance to attacks such as distributed denial of service and data breaches. A large number of these attacks could be prevented if the network operators were better equipped with threat intelligence information that would allow them to block or throttle nefarious scanning activities. Network telescopes or "darknets" offer a unique window into observing Internet-wide scanners and other malicious entities, and they could offer early warning signals to operators that would be critical for infrastructure protection and/or attack mitigation. A network telescope consists of unused or "dark" IP spaces that serve no users, and it passively observes any Internet traffic destined to the "telescope sensor" in an attempt to record ubiquitous network scanners, malware that forages for vulnerable devices, and other dubious activities. Hence, monitoring network telescopes for timely detection of coordinated and heavy scanning activities is an important, albeit challenging, task. The challenges mainly arise due to the non-stationarity and the dynamic nature of Internet traffic and, more importantly, the fact that one needs to monitor high-dimensional signals (e.g., all TCP/UDP ports) to search for "sparse" anomalies. We propose statistical methods to address both challenges in an efficient and "online" manner; our work is validated with both synthetic data and real-world data from a large network telescope.
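
To make the setting concrete, here is a minimal sketch of online monitoring of a high-dimensional signal (one packet count per port) in which only a few ports are anomalous at any time. It does not reproduce the statistical methods proposed in the paper. It swaps in a simple per-port exponentially weighted baseline, and the class name `PortAnomalyDetector`, the parameters `alpha`, `threshold` and `warmup`, and the sample counts are all assumptions.

```python
import math


class PortAnomalyDetector:
    """Per-port exponentially weighted baseline with a one-sided surge test."""

    def __init__(self, alpha=0.1, threshold=4.0, warmup=5):
        self.alpha = alpha          # smoothing factor of the baseline
        self.threshold = threshold  # number of standard deviations that counts as a surge
        self.warmup = warmup        # observations required before a port can be flagged
        self.mean = {}
        self.var = {}
        self.seen = {}

    def update(self, counts):
        """Consume one time step of {port: packet_count}; return flagged ports."""
        flagged = []
        for port, x in counts.items():
            if port not in self.mean:
                # First observation of this port initialises its baseline.
                self.mean[port], self.var[port], self.seen[port] = float(x), 0.0, 1
                continue
            mean, var = self.mean[port], self.var[port]
            deviation = x - mean
            # Floor the scale at 1.0 so that a perfectly flat history
            # does not turn every tiny increase into an alarm.
            scale = max(math.sqrt(var), 1.0)
            if self.seen[port] >= self.warmup and deviation > self.threshold * scale:
                # Upward surge: flag it and keep it out of the baseline.
                flagged.append(port)
            else:
                # Standard incremental update of an exponentially weighted
                # mean and variance.
                self.mean[port] = mean + self.alpha * deviation
                self.var[port] = (1 - self.alpha) * (var + self.alpha * deviation ** 2)
                self.seen[port] += 1
        return sorted(flagged)


# Hypothetical traffic: ten quiet steps, then a surge on port 80 only.
detector = PortAnomalyDetector()
history = [
    detector.update({22: 100 + step % 2, 80: 50, 443: 200 - step % 3})
    for step in range(10)
]
spike = detector.update({22: 100, 80: 500, 443: 200})
```

Each port keeps constant-size state, so the per-step cost grows linearly with the number of ports observed, which is one simple way to stay online. Adapting the baseline over time is a crude response to non-stationarity and is not a substitute for the paper's methods.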

    Monitoring Network Telescopes and Inferring Anomalous Traffic Through the Prediction of Probing Rates

    Network reconnaissance is the first step preceding a cyber-attack. Hence, monitoring the probing activities is imperative to help security practitioners enhance their awareness about Internet's large-scale events or peculiar events targeting their network. In this paper, we present a framework for an improved and efficient monitoring of the probing activities targeting network telescopes. Particularly, we model the probing rates, which are a good indicator for measuring the cyber-security risk targeting network services. The approach consists of first inferring groups of network ports sharing similar probing characteristics through a new affinity metric capturing both temporal and semantic similarities between ports. Then, sequences of probing rates targeting similar ports are used as inputs to stacked Long Short-Term Memory (LSTM) neural networks to predict probing rates 1 hour and 1 day in advance. Finally, we describe two monitoring indicators that use the prediction models to infer anomalous probing traffic and to raise early threat warnings. We show that LSTM networks can accurately predict probing rates, outperforming the non-stationary autoregressive model, and we demonstrate that the monitoring indicators are efficient in assessing the cyber-security risk related to vulnerability disclosure.

    ThreatPredict: From Global Social and Technical Big Data to Cyber Threat Forecast

    Predicting the next threats that may occur on the Internet is a multifaceted problem, as the predictions must be precise enough and issued as far in advance as possible to be exploited efficiently, for example to set up defensive measures. The ThreatPredict project aims at building predictive models by integrating exogenous sources of data using machine learning algorithms. This paper reports the most notable results using technical data from security sensors or contextual information about darkweb cyber-criminal markets and data breaches.

    Snowball-Miner: Integration of Deep Learning for Extraction of Cyber Threat Intelligence from Dark Web

    Cyber threat intelligence is a crucial component in defending against cybersecurity threats. The dark web, security blogs, hacker communities, news forums and Open-Source Intelligence (OSINT) sources are known to harbor illicit activities and serve as a breeding ground for cybercriminals. Extracting actionable intelligence from the dark web is challenging due to its anonymous and encrypted nature. State-of-the-art work has proposed machine learning and deep learning approaches to aggregate cyber threat intelligence from data present on the dark web. This paper proposes a novel approach utilizing Snowball-Miner for cyber threat intelligence discovery from the dark web. The model is trained on a diverse dataset consisting of dark web forums, hidden .onion-based marketplaces and other underground platforms collected using Snowball-crawler. We employ the hybrid convolutional models CNN-LSTM and CNN-GRU with doc2vec embeddings to classify documents into four domains, viz. Energy Sector, Finance, Illicit Activities and Illegal Services. Our experiments show that CNN-LSTM performs best, reaching 96.37% for classification of domain-specific threat documents. Furthermore, after data preparation we apply NLP techniques and extract the domain-specific Indicators of Compromise (IoCs) using a RegEx parser and Subject, Object and Verb (SOV) semantic dependency analysis. Finally, we integrate IoCs and threat keywords with their respective domains to generate domain-specific threat intelligence, which enhances the quality of the domain-specific CTI based on R-dimension (Relevance).

    i-DarkVec: Incremental Embeddings for Darknet Traffic Analysis

    Darknets are probes listening to traffic reaching IP addresses that host no services. Traffic reaching a darknet results from the actions of Internet scanners, botnets, and possibly misconfigured hosts. Such peculiar nature of the darknet traffic makes darknets a valuable instrument to discover malicious online activities, e.g., identifying coordinated actions performed by bots or scanners. However, the massive amount of packets and sources that darknets observe makes it hard to extract meaningful insights, calling for scalable tools to automatically identify and group sources that share similar behaviour. We here present i-DarkVec, a methodology to learn meaningful representations of darknet traffic. i-DarkVec leverages Natural Language Processing techniques (e.g., Word2Vec) to capture the co-occurrence patterns that emerge when scanners or bots launch coordinated actions. As in NLP problems, the embeddings learned with i-DarkVec enable several new machine learning tasks on the darknet traffic, such as identifying clusters of senders engaged in similar activities. We extensively test i-DarkVec and explore its design space in a case study using real darknets. We show that with a proper definition of services, the learned embeddings can be used to (i) solve the classification problem to associate unknown sources' IP addresses to the correct classes of coordinated actors and (ii) automatically identify clusters of previously unknown sources performing similar attacks and scans, easing the security analyst's job. i-DarkVec leverages a novel incremental embedding learning approach that is scalable and robust to traffic changes, making it applicable to dynamic and large-scale scenarios.
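
The co-occurrence idea in the abstract above can be illustrated with a short sketch. It stops short of training embeddings and only builds the kind of input a Word2Vec-style model consumes. The corpus construction is an assumption made for illustration, since the abstract does not specify it: source IPs are treated as "words" and each (service, time window) group as a "sentence". The function names, the window size and the sample packets are hypothetical.

```python
from collections import Counter, defaultdict


def build_sentences(packets, window_seconds=3600):
    """Group source IPs by (service, time window), in arrival order.

    packets is an iterable of (timestamp_seconds, source_ip, service).
    """
    groups = defaultdict(list)
    for timestamp, src_ip, service in sorted(packets):
        groups[(service, timestamp // window_seconds)].append(src_ip)
    return [groups[key] for key in sorted(groups)]


def cooccurrence(sentences, context=2):
    """Count unordered pairs of distinct senders within `context` positions."""
    counts = Counter()
    for sentence in sentences:
        for i, center in enumerate(sentence):
            for neighbor in sentence[i + 1 : i + 1 + context]:
                if neighbor != center:
                    counts[tuple(sorted((center, neighbor)))] += 1
    return counts


# Hypothetical darknet packets using documentation-reserved addresses.
packets = [
    (10, "198.51.100.1", "telnet"),
    (20, "198.51.100.2", "telnet"),
    (30, "198.51.100.1", "telnet"),
    (40, "203.0.113.9", "http"),
    (3700, "198.51.100.2", "telnet"),
    (3710, "198.51.100.1", "telnet"),
]
sentences = build_sentences(packets)
counts = cooccurrence(sentences)
```

In this sample, the two senders that repeatedly hit the telnet service close together in time accumulate a high co-occurrence count, while the lone HTTP sender has none. An embedding model trained on such sentences would be expected to place the two telnet senders near each other.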

    Scalable Automation of Online Network Attack Characterization

    Cyber attacks on enterprise networks and critical infrastructures are becoming more prevalent and diverse. Timely recognition of attack strategies and behaviors will assist analysts or resilient network defense systems in deploying effective means in anticipation of future threats. An attack can be characterized by the sequences of observed events that are relevant to critical assets. Earlier work has developed a semi-supervised learning framework to process large-scale events and extract attack behaviors. While the framework is designed to support online processing, the implementation requires extension and restructuring to support scalable automation of sustainable online network attack characterization. This work builds upon the semi-supervised Bayesian classification framework, and aims at providing a modular and scalable system that supports a variety of features to describe attacks, ranging from packet-level information to metadata produced by sensors such as Snort and Bro. The system will continuously process data streams, generating newly learned models and recording critical information about aged behavior models. These behavior models will reflect the attack strategies that are relevant to the critical assets, enhancing situational awareness and enabling predictive and resilient network defense. The accuracy of the models is demonstrated through comparisons to network topologies and scenarios provided by the source of the dataset utilized. These scenarios often encapsulate multiple complex network attack behaviors, allowing for more realistic representations of network traffic over time and better test cases for experimentation.