334 research outputs found

    Tensor Based Monitoring of Large-Scale Network Traffic

    Network monitoring systems are important because they let network operators easily analyze behavioral trends in flow data. As networks grow larger and more complex, the data grows with them, both in size and in the number of variables. This increase in dimensionality lends itself to tensor-based analysis of network data, since tensors are arbitrarily sized multi-dimensional objects. Tensor-based network monitoring has been explored in recent years at Carnegie Mellon University through the DenseAlert algorithm. DenseAlert identifies anomalous events by quickly detecting dense sub-tensors in positive-valued tensors. However, in our experiments, DenseAlert fails on larger datasets. Drawing from DenseAlert, we developed an algorithm called RED Alert that uses recursive filtering and expansion to handle anomaly detection in large tensors of positive- and negative-valued data. It does so by using network parameters that are structured hierarchically. That is, network traffic is first modeled at low granularity (e.g. host country), and events detected as anomalous at that level are tracked down to higher-granularity data (e.g. host IP). The tensors are built on the fly from streaming data, filtered to consider only the parameters deemed anomalous at previous granularity levels. RED Alert is showcased on two network monitoring examples, packet loss detection and botnet detection, with results compared to DenseAlert. In both cases, RED Alert detected suspicious events and traced the root cause of the behavior to a single IP. RED Alert was developed as part of a larger project, InSight2, which provides several network monitoring dashboards to aid network operators. This required developing a tensor library that works in the context of InSight2, as well as a dashboard that can run the algorithm and display the results in meaningful ways.
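
The coarse-to-fine filtering described in this abstract can be illustrated with a small sketch. It is not the RED Alert algorithm itself: it replaces tensors with simple per-key flow counts and uses a z-score threshold as a stand-in anomaly test. The function names, the `country` and `ip` fields, and the threshold value are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def anomalous_keys(counts, z_threshold=1.5):
    """Return the keys whose count lies more than z_threshold standard
    deviations above the mean (a stand-in anomaly test, assumed here)."""
    values = list(counts.values())
    if len(values) < 2:
        # Nothing to compare against, so pass the single key through.
        return set(counts)
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All keys behave identically; nothing stands out.
        return set()
    return {k for k, v in counts.items() if (v - mu) / sigma > z_threshold}


def drill_down(flows, levels, z_threshold=1.5):
    """Walk the hierarchy from coarse to fine.

    flows  -- list of dicts, one per flow record (hypothetical schema)
    levels -- field names ordered from low to high granularity
    """
    candidates = flows
    flagged = {}
    for level in levels:
        counts = Counter(f[level] for f in candidates)
        hot = anomalous_keys(counts, z_threshold)
        flagged[level] = hot
        # Keep only flows that belong to keys flagged at this level,
        # so the next (finer) level inspects a smaller data set.
        candidates = [f for f in candidates if f[level] in hot]
        if not candidates:
            break
    return flagged
```

Calling `drill_down(flows, ["country", "ip"])` first flags unusually busy countries, then looks for unusually busy IPs only within those countries, which is the filtering step that keeps the finer-grained pass small.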

    ViSNet: an equivariant geometry-enhanced graph neural network with vector-scalar interactive message passing for molecules

    Geometric deep learning has been revolutionizing the molecular modeling field. Although state-of-the-art neural network models are approaching ab initio accuracy for molecular property prediction, their applications, such as drug discovery and molecular dynamics (MD) simulation, have been hindered by insufficient utilization of geometric information and high computational costs. Here we propose an equivariant geometry-enhanced graph neural network called ViSNet, which extracts geometric features and efficiently models molecular structures at low computational cost. ViSNet outperforms state-of-the-art approaches on multiple MD benchmarks, including MD17, revised MD17 and MD22, and achieves excellent chemical property prediction on the QM9 and Molecule3D datasets. Additionally, ViSNet was among the top winners of the PCQM4Mv2 track in the OGB-LCS@NeurIPS2022 competition. Furthermore, a series of simulations and case studies shows that ViSNet can efficiently explore the conformational space and provide reasonable interpretability by mapping geometric representations to molecular structures.

    Evolving IoT honeypots

    The Internet of Things (IoT) is the emerging world where arbitrary objects from our everyday lives gain basic computational and networking capabilities to become part of the Internet. Researchers estimate that between 25 and 35 billion devices will be part of the Internet by 2022. Unlike conventional computers, where one hardware platform (Intel x86) and three operating systems (Windows, Linux and OS X) dominate the market, the IoT landscape is far more heterogeneous. To meet the growing demand, the number of System-on-Chip (SoC) manufacturers has seen a corresponding exponential growth, making embedded platforms based on ARM, MIPS or SH4 processors abundant. The pursuit of market share is further leading to a price war and cost-cutting, ultimately resulting in cheap systems with limited hardware resources and capabilities. The frugality of IoT hardware has a domino effect. Due to resource constraints, vendors package devices with custom, stripped-down Linux-based firmware optimized for performing the device’s primary function. Device management, monitoring and security features are by and large absent from IoT devices. This has created an asymmetry favouring attackers and disadvantaging defenders. This research sets out to reduce the opacity and identify a viable strategy, tactics and tooling for gaining insight into the IoT threat landscape by leveraging honeypots to build and deploy an evolving world-wide Observatory, based on cloud platforms, to help with studying attacker behaviour and collecting IoT malware samples. The research produces useful tools and techniques for identifying behavioural differences between Medium-Interaction honeypots and real devices by replaying interactive attacker sessions collected from the Honeypot Network. The behavioural delta is used to evolve the Honeypot Network and improve its collection capabilities. Positive results are obtained with respect to the effectiveness of this technique. Findings by other researchers in the field are also replicated. The complete dataset and source code used for this research are made publicly available on the Open Science Framework website at https://osf.io/vkcrn/. Thesis (MSc) -- Faculty of Science, Computer Science, 202

    A Survey on Explainability of Graph Neural Networks

    Graph neural networks (GNNs) are powerful graph-based deep-learning models that have gained significant attention and demonstrated remarkable performance in various domains, including natural language processing, drug discovery, and recommendation systems. However, combining feature information and combinatorial graph structures has led to complex non-linear GNN models. Consequently, it has become harder to understand how GNNs work and the underlying reasons behind their predictions. To address this, numerous explainability methods have been proposed to shed light on the inner mechanisms of GNNs. Explainability improves the security of GNNs and enhances trust in their recommendations. This survey aims to provide a comprehensive overview of the existing explainability techniques for GNNs. We create a novel taxonomy and hierarchy to categorize these methods based on their objective and methodology. We also discuss the strengths, limitations, and application scenarios of each category. Furthermore, we highlight the key evaluation metrics and datasets commonly used to assess the explainability of GNNs. This survey aims to assist researchers and practitioners in understanding the existing landscape of explainability methods, identifying gaps, and fostering further advancements in interpretable graph-based machine learning. Comment: submitted to Bulletin of the IEEE Computer Society Technical Committee on Data Engineering

    Darknet as a Source of Cyber Threat Intelligence: Investigating Distributed and Reflection Denial of Service Attacks

    Cyberspace has become a massive battlefield between computer criminals and computer security experts. Large-scale cyber attacks have matured enormously and are now capable of promptly causing significant interruption and damage to Internet resources and infrastructure. Denial of Service (DoS) attacks are perhaps the most prominent and severe type of such large-scale cyber attacks. Furthermore, widely available encryption and anonymity techniques greatly increase the difficulty of surveilling and investigating cyber attacks. In this context, the availability of relevant cyber monitoring is of paramount importance. An effective approach to gathering DoS cyber intelligence is to collect and analyze traffic destined to allocated, routable, yet unused Internet address space, known as darknet. In this thesis, we leverage big darknet data to generate insights on various DoS events, namely Distributed DoS (DDoS) and Distributed Reflection DoS (DRDoS) activities. First, we present a comprehensive survey of darknet. We define and characterize darknet and indicate its alternative names. We further list other trap-based monitoring systems and compare them to darknet. In addition, we provide a taxonomy of darknet technologies and identify research gaps related to three main darknet categories: deployment, traffic analysis, and visualization. Second, we characterize darknet data. Such information can generate indicators of cyber threat activity and provide an in-depth understanding of the nature of darknet traffic. In particular, we analyze the distribution of darknet packets and the transport, network and application layer protocols they use, and pinpoint the resolved domain names. Furthermore, we identify IP classes and destination ports and geo-locate source countries. We further investigate darknet-triggered threats, aiming to explore darknet-inferred threats and categorize their severities. Finally, we explore the inter-correlation of such threats by applying association rule mining techniques to build threat association rules. Specifically, we generate clusters of threats that co-occur when targeting a specific victim. Third, we propose a DDoS inference and forecasting model that aims to provide insights to organizations, security operators and emergency response teams during and after a DDoS attack. Specifically, this work strives to predict, within minutes, the attack's features, namely intensity/rate (packets/sec) and size (estimated number of compromised machines/bots). The goal is to understand the short-term trend of ongoing DDoS attacks in terms of those features, and thus to recognize current and similar future situations and respond to the threat appropriately. Further, our work investigates DDoS campaigns by proposing a clustering approach that infers the various victims targeted by the same campaign and predicts related features. To this end, the proposed approach leverages a number of time series and fluctuation analysis techniques, statistical methods and forecasting approaches. Fourth, we propose a novel approach to infer and characterize Internet-scale DRDoS attacks by leveraging the darknet space. Complementary to the pioneering work on inferring DDoS activities using darknet, this work shows that DoS activities can be extracted without relying on backscattered analysis. The aim is to extract cyber security intelligence related to DRDoS activities, such as intensity, rate and geographic location, in addition to various network-layer and flow-based insights. To achieve this, the proposed approach exploits certain DDoS parameters to detect the attacks, and uses the expectation maximization and k-means clustering techniques in an attempt to identify campaigns of DRDoS attacks. Finally, we conclude this work with some discussion and by pinpointing future work.
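
The abstract names k-means as one of the techniques used to group DRDoS attacks into campaigns. The following is a minimal, generic k-means sketch in plain Python, not the thesis's implementation. The feature vectors (for example, packet rate and duration per attack), the naive initialization from the first `k` points, and the fixed iteration count are assumptions chosen to keep the example short and deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster numeric feature vectors with Lloyd's algorithm.

    points -- list of equal-length tuples (hypothetical attack features)
    k      -- number of clusters, i.e. candidate campaigns
    Returns (assignment, centroids), where assignment[i] is the cluster
    index of points[i].
    """
    # Naive, deterministic initialization: use the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignment, centroids
```

In practice the features would need scaling before clustering, since k-means relies on Euclidean distance and a feature with a large numeric range would otherwise dominate the result.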

    On the Generation of Cyber Threat Intelligence: Malware and Network Traffic Analyses

    In recent years, malware authors have drastically changed course in threat design and implementation. Malware authors, namely hackers or cyber-terrorists, perpetrate new forms of cyber-crime involving more innovative hacking techniques. Motivated by financial or political reasons, attackers target computer systems ranging from personal computers to organizations’ networks in order to collect and steal sensitive data, blackmail or scam people, or sabotage IT infrastructures. Accordingly, IT security experts face new challenges, as they need to counter cyber-threats proactively. The challenge takes the form of a continuous fight, in which cyber-criminals are obsessed with the idea of outsmarting security defenses. As such, security experts have to devise an effective strategy to counter cyber-criminals. The generation of cyber-threat intelligence is of paramount importance, as stated in the following quote: “the field is owned by who owns the intelligence”. In this thesis, we address the problem of generating timely and relevant cyber-threat intelligence for the purpose of detection, prevention and mitigation of cyber-attacks. To do so, we initiate a research effort that falls into four parts. First, we analyze prominent cyber-crime toolkits to grasp the inner secrets and workings of advanced threats. We dissect prominent malware such as the Zeus and Mariposa botnets to uncover the underlying techniques used to build a networked army of infected machines. Second, we investigate cyber-crime infrastructures, where we elaborate on the generation of cyber-threat intelligence for situational awareness. We adopt a graph-theoretic approach to study the infrastructures used by malware to perpetrate malicious activities. We build a scoring mechanism based on a page ranking algorithm to measure the badness of infrastructure elements, i.e., domains, IPs, domain owners, etc. In addition, we use the min-hashing technique to evaluate the level of sharing among cyber-threat infrastructures over a period of one year. Third, we use machine learning techniques to fingerprint malicious IP traffic. By fingerprinting, we mean detecting malicious network flows and attributing them to malware families. This research effort relies on ground truth collected from the dynamic analysis of malware samples. Finally, we investigate the generation of cyber-threat intelligence from passive DNS streams. To this end, we design and implement a system that detects anomalies in passive DNS traffic. Due to the sheer volume of DNS data, we build the system on top of a cluster computing framework, namely Apache Spark [70]. The integrated analytic system is able to detect anomalies observed in DNS records, which are potentially generated by widespread cyber-threats.
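
The min-hashing technique mentioned in this abstract estimates how much two sets overlap without comparing them element by element. The sketch below is a generic illustration, not the thesis's implementation: the use of salted SHA-1 as the hash family, the signature length of 64, and the idea of representing an infrastructure as a set of domain or IP strings are assumptions.

```python
import hashlib


def _hash(seed, item):
    """One member of a hash family: SHA-1 salted with the seed,
    truncated to a 64-bit integer."""
    digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=64):
    """Return the min-hash signature of a non-empty set of strings:
    for each seeded hash function, the smallest hash over the set."""
    return [min(_hash(seed, item) for item in items)
            for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of signature
    positions on which the two signatures agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

Two infrastructures that share most of their elements produce signatures that agree in most positions, so `estimated_jaccard` approaches 1, while unrelated sets score near 0. Longer signatures reduce the variance of the estimate at the cost of more hashing.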

    Using Botnet Technologies to Counteract Network Traffic Analysis

    Botnets have been problematic for over a decade. They are used to launch malicious activities including DDoS (Distributed-Denial-of-Service), spamming, identity theft, unauthorized bitcoin mining and malware distribution. A recent nation-wide DDoS attack caused by the Mirai botnet on 10/21/2016, involving tens of millions of IP addresses, took down Twitter, Spotify, Reddit, The New York Times, Pinterest, PayPal and other major websites. In response to take-down campaigns by security personnel, botmasters have developed technologies to evade detection. The most widely used evasion technique is DNS fast-flux, where the botmaster frequently changes the mapping between domain names and IP addresses of the C&C server so that it is too late or too costly to trace the C&C server locations. Domain names generated with Domain Generation Algorithms (DGAs) are used as the 'rendezvous' points between botmasters and bots. This work focuses on how to apply botnet technologies (fast-flux and DGA) to counteract network traffic analysis, thereby protecting user privacy. A better understanding of botnet technologies also helps us be pro-active in defending against botnets. First, we proposed two new DGAs using hidden Markov models (HMMs) and Probabilistic Context-Free Grammars (PCFGs) which can evade current detection methods and systems. We also developed two HMM-based DGA detection methods that can detect botnet DGA-generated domain names with or without training sets. This helps security personnel understand the botnet phenomenon and develop pro-active tools to detect botnets. Second, we developed a distributed proxy system using fast-flux to evade national censorship and surveillance. The goal is to help journalists, human rights advocates and NGOs in West Africa have a secure and free Internet. Then we developed a covert data transport protocol that transforms arbitrary messages into real DNS traffic. We encode the message into benign-looking domain names generated by an HMM, which represents the statistical features of legitimate domain names. This can be used to evade Deep Packet Inspection (DPI) and protect user privacy in two-way communication. Both applications serve as examples of applying botnet technologies to legitimate use. Finally, we proposed a new protocol obfuscation technique that transforms an arbitrary network protocol into another (with the Network Time Protocol and the Minecraft video game protocol as examples) in terms of packet syntax and side-channel features (inter-packet delay and packet size). This research uses botnet technologies to help normal users have secure and private communications over the Internet. From our botnet research, we conclude that network traffic is a malleable and artificial construct. Although existing patterns are easy to detect and characterize, they are also subject to modification and mimicry. This means that we can construct transducers to make any communication pattern look like any other communication pattern. This is neither bad nor good for security. It is a fact that we need to accept and use as best we can.