12 research outputs found

    Illuminati: Towards Explaining Graph Neural Networks for Cybersecurity Analysis

    Graph neural networks (GNNs) have been used to build multi-layer graph models for a number of cybersecurity applications, from fraud detection to software vulnerability analysis. Unfortunately, like traditional neural networks, GNNs suffer from a lack of transparency: it is challenging to interpret the model's predictions. Prior works focused on explaining specific factors of a GNN model. In this work, we have designed and implemented Illuminati, a comprehensive and accurate explanation framework for cybersecurity applications using GNN models. Given a graph and a pre-trained GNN model, Illuminati identifies the important nodes, edges, and attributes that contribute to the prediction while requiring no prior knowledge of GNN models. We evaluate Illuminati in two cybersecurity applications: code vulnerability detection and smart contract vulnerability detection. The experiments show that Illuminati achieves more accurate explanation results than state-of-the-art methods. Specifically, 87.6% of the subgraphs identified by Illuminati retain their original prediction, an improvement of 10.3 percentage points over the 77.3% achieved by other methods. Furthermore, Illuminati's explanations can be easily understood by domain experts, suggesting that the framework is useful for the development of cybersecurity applications. Comment: EuroS&P 202

    Tango: rethinking quantization for graph neural network training on GPUs

    Graph Neural Networks (GNNs) are becoming increasingly popular due to their superior performance in critical graph-related tasks. While quantization is widely used to accelerate GNN computation, quantized training faces unprecedented challenges. Current quantized GNN training systems often have longer training times than their full-precision counterparts for two reasons: (i) addressing the accuracy challenge leads to excessive overhead, and (ii) the optimization potential exposed by quantization is not adequately leveraged. This paper introduces Tango, which rethinks quantization challenges and opportunities for graph neural network training on GPUs, with three contributions. First, we introduce efficient rules to maintain accuracy during quantized GNN training. Second, we design and implement quantization-aware primitives and inter-primitive optimizations that can speed up GNN training. Finally, we integrate Tango with the popular Deep Graph Library (DGL) system and demonstrate its superior performance over state-of-the-art approaches on various GNN models and datasets.

    BotInfer: A Bot Inference Approach by Correlating Host and Network Information

    Part 5: Session 5: Miscellaneous. Botnets are widely used in cyber-attacks and have become a serious threat to network security. Existing approaches can detect botnets effectively in certain environments; however, problems remain when host-based or network-based detection is used on its own, such as the limited robustness of detection tools, the difficulty of global deployment, and low precision. To solve these problems, a novel detection approach called BotInfer is proposed. In the BotInfer approach, host-based bot detection tools are deployed on some of the hosts, the network flows of all hosts are captured and analyzed, and the host detection results and flow information are correlated by the bot inference engine. Experiments show that BotInfer can effectively detect the infected hosts in the network. When the deployment rate of bot detection tools in the network reaches 80%, the precision for hosts with detection tools is about 99%, and the precision for hosts without detection tools is about 86%.

    2023 SMC Data Challenge

    The 2023 Smoky Mountains Conference-Data Challenge session is part of the Smoky Mountains Computational Sciences and Engineering Conference (SMC) hosted by Oak Ridge National Laboratory (ORNL). The 2023 SMC Data Challenge session provides an opportunity to tackle scientific data challenges that come from eminent datasets related to ORNL. These datasets come from scientific simulations and instruments in physical and chemical sciences, electron microscopy, bioinformatics, engineering, materials science, neutron sources, urban development, and other areas, and have open research questions and tasks for participants to solve.