    INSOMNIA: Towards Concept-Drift Robustness in Network Intrusion Detection

    Despite decades of research in network traffic analysis and incredible advances in artificial intelligence, network intrusion detection systems based on machine learning (ML) have yet to prove their worth. One core obstacle is the existence of concept drift, an issue for all adversary-facing security systems. Additionally, specific challenges set intrusion detection apart from other ML-based security tasks, such as malware detection. In this work, we offer a new perspective on these challenges. We propose INSOMNIA, a semi-supervised intrusion detector that continuously updates the underlying ML model as network traffic characteristics are affected by concept drift. We use active learning to reduce latency in the model updates, label estimation to reduce labeling overhead, and explainable AI to better interpret how the model reacts to the shifting distribution. To evaluate INSOMNIA, we extend TESSERACT - a framework originally proposed for performing sound time-aware evaluations of ML-based malware detectors - to the network intrusion domain. Our evaluation shows that accounting for drifting scenarios is vital for effective intrusion detection systems.

    Leveraging Sentinel-2 time series for bark beetle-induced forest dieback inventory

    The bark beetle is one of the most critical biotic disturbance agents causing tree dieback in several coniferous forest ecosystems around Europe. Forest dieback inventory plays a crucial role in studying the effect of this biotic forest disturbance and improving forest management strategies. In this study, we explore the performance of remote sensing methods used to perform the inventory mapping of bark beetle-induced forest dieback. Specifically, we analyse the performance of classification models trained with Random Forest and XGBoost, as well as semantic segmentation models trained with U-Net, by accounting for both spectral bands of Sentinel-2 images and some developed spectral vegetation indices. In addition, we investigate the effect of accounting for temporal knowledge on the performance of remote sensing methods. To this end, we consider a dataset of Sentinel-2 time series acquired from May to October 2018 in non-overlapping forest scenes from the Northeast of France. The selected scenes host bark beetle infestation hotspots of different sizes, which originate from the mass reproduction of the bark beetle in the 2018 infestation. The results of this study show that the Random Forest model trained by taking into account the temporal patterns in both spectral bands and vegetation indices achieves the highest accuracy in the inventory task under study. Finally, we use an eXplainable Artificial Intelligence technique to explain the effect of temporal knowledge on the Random Forest inventory decisions.
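
As a hedged illustration of how spectral bands, a vegetation index, and temporal information can be combined into per-pixel features, the following minimal sketch stacks the red band, the near-infrared band, and NDVI across acquisition dates. The abstract does not name the indices or the feature layout used, so NDVI, the function names, and the toy reflectance values are assumptions for illustration only.

```python
import numpy as np


def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    eps avoids division by zero on pixels where both bands are zero.
    """
    return (nir - red) / (nir + red + eps)


def temporal_features(red_series, nir_series):
    """Build one feature row per pixel from a time series of two bands.

    red_series, nir_series: arrays of shape (dates, height, width).
    Returns an array of shape (height * width, 3 * dates) whose columns are
    the red values, the NIR values, and the NDVI values for every date.
    """
    red = np.asarray(red_series, dtype=float)
    nir = np.asarray(nir_series, dtype=float)
    index = ndvi(nir, red)
    stacked = np.concatenate([red, nir, index], axis=0)  # (3 * dates, h, w)
    return stacked.reshape(stacked.shape[0], -1).T


# Hypothetical toy scene: 2 acquisition dates, 1 x 2 pixels.
red = np.array([[[0.1, 0.2]], [[0.1, 0.3]]])
nir = np.array([[[0.5, 0.4]], [[0.5, 0.3]]])
features = temporal_features(red, nir)
```

Each row of `features` could then be passed to a per-pixel classifier; dropping all but one date from the input arrays gives the single-date variant for comparison.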

    SILVIA: An eXplainable Framework to Map Bark Beetle Infestation in Sentinel-2 Images

    Recent long spells of high temperatures and drought-hit summers have fostered the conditions for an unprecedented outbreak of bark beetles in Europe. This phenomenon has ruined vast swathes of European conifer forests, creating a need among forest managers to find effective methods to gather information about the mapping of bark beetle infestation hotspots. Sentinel-2 data have been recently established as an alternative to field surveys for certain inventory tasks. Hence, this study explores the ability of machine learning to perform the inventory mapping of bark beetle infestation hotspots in Sentinel-2 images. In particular, we investigate the accuracy of a spectral classifier that is learned for the study task by leveraging spectral vegetation indices and performing self-training. We use a dataset of Sentinel-2 images acquired in October 2018 in non-overlapping forest scenes from the Northeast of France. The selected scenes host bark beetle infestation hotspots of different sizes, which originate from the mass reproduction of the bark beetle in the 2018 infestation. We perform a learning stage by accounting for the ground-truth bark beetle infestation masks of a subset of images in the study imagery dataset (training imagery set). The goal is to produce a prediction of the bark beetle infestation masks for the remaining images in the study imagery dataset (working imagery set). Moreover, we use an explainable artificial intelligence technique to study the relevance of spectral information and explain the effect of both self-training and spectral vegetation indices on the mapping decisions.

    SENECA: Change detection in optical imagery using Siamese networks with Active-Transfer Learning

    Change Detection (CD) aims to distinguish surface changes based on bi-temporal remote sensing images. In recent years, deep neural models have made a breakthrough in CD processes. However, training a deep neural model requires a large volume of labelled training samples that are time-consuming and labour-intensive to acquire. With the aim of learning an accurate CD model with limited labelled data, we propose SENECA: a method based on a CD Siamese network, which takes advantage of both Transfer Learning (TL) and Active Learning (AL) to handle the constraint of limited supervision. More precisely, we jointly use AL and TL to adapt a CD model trained on a labelled source domain to a (related) target domain characterised by restricted access to labelled data. We report results from an experimental evaluation involving five pairs of images acquired via Sentinel-2 satellites between 2015 and 2018 in various locations across Asia and the USA. The results show the beneficial effects of the proposed AL and TL strategies on the accuracy of the decisions made by the CD Siamese network and show the merit of the proposed approach over competing CD baselines.

    DIAMANTE: A data-centric semantic segmentation approach to map tree dieback induced by bark beetle infestations via satellite images

    Forest tree dieback inventory has a crucial role in improving forest management strategies. This inventory is traditionally performed by foresters through laborious and time-consuming human assessment of individual trees. On the other hand, the large amount of Earth satellite data that is publicly available through the Copernicus program and can be processed with advanced deep learning techniques has recently been established as an alternative to field surveys for forest tree dieback tasks. However, to realize its full potential, deep learning requires a deep understanding of satellite data, since the data collection and preparation steps are as essential as the model development step. In this study, we explore the performance of a data-centric semantic segmentation approach to detect forest tree dieback events due to bark beetle infestation in satellite images. The proposed approach prepares a multisensor dataset collected using both the SAR Sentinel-1 sensor and the optical Sentinel-2 sensor and uses this dataset to train a multisensor semantic segmentation model. The evaluation shows the effectiveness of the proposed approach in a real inventory case study that regards non-overlapping forest scenes from the Northeast of France acquired in October 2018. The selected scenes host bark beetle infestation hotspots of different sizes, which originate from the mass reproduction of the bark beetle in the 2018 infestation.

    PANACEA: A Neural Model Ensemble for Cyber-Threat Detection

    This study describes a new cyber-threat detection method, named PANACEA, that uses Ensemble Deep Learning coupled with Adversarial Training and XAI to gain accuracy with neural models trained on cybersecurity problems.

    VINCENT: Cyber-threat detection through vision transformers and knowledge distillation

    Vision Transformers (ViTs) denote a family of attention-based deep learning techniques that have recently achieved amazing results in various problems related to the field of computer vision. In this paper, we explore the use of ViTs in problems of cyber-threat detection related to malware and network intrusion detection. In particular, we propose VINCENT, a novel deep neural method that resorts to a color imagery representation of cyber-data by encoding related cyber-data features into neighboring color pixels. ViTs are trained on cyber-data images as teacher models to extract explainable imagery signatures of cyber-data classes. This knowledge is extracted by leveraging the self-attention mechanism to give attention values between pairs of imagery patches. The signature knowledge extracted through the ViT teacher is finally used to train a smaller neural student model according to the knowledge distillation theory. Experiments with various benchmark cybersecurity datasets assess the accuracy of the student model VINCENT, also compared to that of several state-of-the-art methods. In addition, they show that VINCENT can obtain insights from explanations recovered through the self-attention mechanism of the ViT teacher.

    XAI to Explore Robustness of Features in Adversarial Training for Cybersecurity

    Adversarial training is an effective learning approach to harden deep neural models against adversarial examples. In this paper, we explore the accuracy of adversarial training in cybersecurity. In addition, we use an XAI technique to analyze how certain input features may affect decisions yielded with adversarial training, giving the security analyst much better insight into the robustness of features. Finally, we start the investigation of how XAI can be used for robust feature selection within adversarial training in cybersecurity problems.

    Leveraging Grad-CAM to Improve the Accuracy of Network Intrusion Detection Systems

    As network cyber attacks continue to evolve, traditional intrusion detection systems are no longer able to detect new attacks with unexpected patterns. Deep learning is currently addressing this problem by enabling unprecedented breakthroughs to properly detect unexpected network cyber attacks. However, the lack of decomposability of deep neural networks into intuitive and understandable components makes deep learning decisions difficult to interpret. In this paper, we propose a method for leveraging the visual explanations of deep learning-based intrusion detection models by making them more transparent and accurate. In particular, we consider a CNN trained on a 2D representation of historical network traffic data to distinguish between attack and normal flows. Then, we use the Grad-CAM method to produce coarse localization maps that highlight the most important regions of the traffic data representation to predict the cyber attack. Since decisions made on samples belonging to the same class are expected to be explained with similar localization maps, we base the final classification of a new network flow on the class of the nearest-neighbour historical localization map. Experiments with various benchmark datasets demonstrate the effectiveness of the proposed method compared to several state-of-the-art methods.
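
The final classification step in this abstract can be sketched as a nearest-neighbour lookup over localization maps. This is a minimal illustration under stated assumptions, not the authors' implementation: the maps are taken as already computed (for example by Grad-CAM), Euclidean distance is assumed because the abstract does not name a distance, and the function name and toy arrays are hypothetical.

```python
import numpy as np


def nearest_map_label(new_map, historical_maps, historical_labels):
    """Return the label of the historical localization map closest to new_map.

    Assumption: similarity is measured with the Euclidean distance between
    flattened maps.
    """
    flat_hist = np.asarray(historical_maps, dtype=float)
    flat_hist = flat_hist.reshape(len(flat_hist), -1)
    flat_new = np.asarray(new_map, dtype=float).reshape(-1)
    distances = np.linalg.norm(flat_hist - flat_new, axis=1)
    return historical_labels[int(np.argmin(distances))]


# Hypothetical 2 x 2 localization maps for two historical flows.
historical_maps = np.array([
    [[1.0, 0.0], [0.0, 0.0]],  # map explaining a "normal" decision
    [[0.0, 0.0], [0.0, 1.0]],  # map explaining an "attack" decision
])
historical_labels = ["normal", "attack"]

# Map of a new flow, highlighting the same region as the attack example.
new_map = np.array([[0.1, 0.0], [0.0, 0.9]])
predicted = nearest_map_label(new_map, historical_maps, historical_labels)
```

The brute-force distance computation is enough for a toy example; with a large store of historical maps, an indexed nearest-neighbour search would be the natural replacement.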