6 research outputs found

    Generative Adversarial Networks for Mitigating Biases in Machine Learning Systems

    Full text link
    In this paper, we propose a new framework for mitigating biases in machine learning systems. The problem of the existing mitigation approaches is that they are model-oriented in the sense that they focus on tuning the training algorithms to produce fair results, while overlooking the fact that the training data can itself be the main reason for biased outcomes. Technically speaking, two essential limitations can be found in such model-based approaches: 1) the mitigation cannot be achieved without degrading the accuracy of the machine learning models, and 2) when the data used for training are largely biased, the training time automatically increases so as to find suitable learning parameters that help produce fair results. To address these shortcomings, we propose in this work a new framework that can largely mitigate the biases and discriminations in machine learning systems while at the same time enhancing the prediction accuracy of these systems. The proposed framework is based on conditional Generative Adversarial Networks (cGANs), which are used to generate new synthetic fair data with selective properties from the original data. We also propose a framework for analyzing data biases, which is important for understanding the amount and type of data that need to be synthetically sampled and labeled for each population group. Experimental results show that the proposed solution can efficiently mitigate different types of biases, while at the same time enhancing the prediction accuracy of the underlying machine learning model

    Performance Comparison Analysis of Classification Methodologies for Effective Detection of Intrusions

    Get PDF
    Intrusion detection systems (IDS) are critical in many applications, including cloud environments. The intrusion poses a security threat and extracts privacy data and information from the cloud. The user has an Internet function that allows him to store personal information in the cloud environment. The cloud can be affected by various issues such as data loss, data breaches, lower security and lack of privacy due to some intruders. A single intrusion incident can result in data within computer and network systems being quickly stolen or deleted. Additionally, intrusions can cause damage to system hardware, resulting in significant financial losses and exposing critical IT infrastructure to risk. To overcome these issues, the study employs the performance comparison analysis of Autoencoder Convolutional neural network (AE+CNN), Random K-means clustering assisted deep neural network (RF+K-means+DNN), Autoencoder K-means clustering assisted long short term memory (AE+K-means+LSTM), Alexnet+Bi-GRU, AE+Alexnet+Bi-GRU and Wild horse AlexNet assisted Bi-directional Gated Recurrent Unit (WABi-GRU) models to choose the best methodology for effective detection of intrusions. The data needed for the analysis is collected from CICIDS2018, UNSW-NB15 and NSL-KDD datasets. The collected data are pre-processed using data normalization and data cleaning. Finally, through this research, the best model suitable for effective intrusion detection can be identified and used for further processes. The proposed models, such as RF+K-means+DNN, AE+K-Means+LSTM, AlexNet Bi-GRU, AE+Alexnet+Bi-GRU and WABi-GRU can obtain an accuracy of 99.278%, 99.33%, 99.45%, 99.50%, 99.65% for the CICIDS dataset 2018 for binary classification. In multi-class classification, the AlexNet Bi-GRU, AE+Alexnet+Bi-GRU and WABi-GRU can attain accuracy of 99.819%, 99.852% and 99.890%. In NSL-KDD, the AlexNet Bi-GRU, AE+Alexnet+Bi-GRU and WABi-GRU achieve accuracy of 99.34%, 99.546% and 99.7%. In UNSW-NB 15 dataset, AlexNet Bi-GRU, AE+Alexnet+Bi-GRU and WABi-GRU achieve accuracy of 99.313%, 99.399% and 99.53%. AlexNet Bi-GRU-based models can obtain better performances than other existing models

    Towards privacy preserving cooperative cloud based intrusion detection systems

    Full text link
    Les systèmes infonuagiques deviennent de plus en plus complexes, dynamiques et vulnérables aux attaques. Par conséquent, il est de plus en plus difficile pour qu'un seul système de détection d'intrusion (IDS) basé sur le cloud puisse repérer toutes les menaces, en raison des lacunes de connaissances sur les attaques et leurs conséquences. Les études récentes dans le domaine de la cybersécurité ont démontré qu'une coopération entre les IDS d'un nuage pouvait apporter une plus grande efficacité de détection dans des systèmes informatiques aussi complexes. Grâce à cette coopération, les IDS d'un nuage peuvent se connecter et partager leurs connaissances afin d'améliorer l'exactitude de la détection et obtenir des bénéfices communs. L'anonymat des données échangées par les IDS constitue un élément crucial de l'IDS coopérative. Un IDS malveillant pourrait obtenir des informations confidentielles d'autres IDS en faisant des conclusions à partir des données observées. Pour résoudre ce problème, nous proposons un nouveau système de protection de la vie privée pour les IDS en nuage. Plus particulièrement, nous concevons un système uniforme qui intègre des techniques de protection de la vie privée dans des IDS basés sur l'apprentissage automatique pour obtenir des IDS qui respectent les informations personnelles. Ainsi, l'IDS permet de cacher des informations possédant des données confidentielles et sensibles dans les données partagées tout en améliorant ou en conservant la précision de la détection. Nous avons mis en œuvre un système basé sur plusieurs techniques d'apprentissage automatique et de protection de la vie privée. Les résultats indiquent que les IDS qui ont été étudiés peuvent détecter les intrusions sans utiliser nécessairement les données initiales. Les résultats (c'est-à-dire qu'aucune diminution significative de la précision n'a été enregistrée) peuvent être obtenus en se servant des nouvelles données générées, analogues aux données de départ sur le plan sémantique, mais pas sur le plan synthétique.Cloud systems are becoming more sophisticated, dynamic, and vulnerable to attacks. Therefore, it's becoming increasingly difficult for a single cloud-based Intrusion Detection System (IDS) to detect all attacks, because of limited and incomplete knowledge about attacks and their implications. The recent works on cybersecurity have shown that a co-operation among cloud-based IDSs can bring higher detection accuracy in such complex computer systems. Through collaboration, cloud-based IDSs can consult and share knowledge with other IDSs to enhance detection accuracy and achieve mutual benefits. One fundamental barrier within cooperative IDS is the anonymity of the data the IDS exchanges. Malicious IDS can obtain sensitive information from other IDSs by inferring from the observed data. To address this problem, we propose a new framework for achieving a privacy-preserving cooperative cloud-based IDS. Specifically, we design a unified framework that integrates privacy-preserving techniques into machine learning-based IDSs to obtain privacy-aware cooperative IDS. Therefore, this allows IDS to hide private and sensitive information in the shared data while improving or maintaining detection accuracy. The proposed framework has been implemented by considering several machine learning and privacy-preserving techniques. The results suggest that the consulted IDSs can detect intrusions without the need to use the original data. The results (i.e., no records of significant degradation in accuracy) can be achieved using the newly generated data, similar to the original data semantically but not synthetically

    Theoretical and Applied Foundations for Intrusion Detection in Single and Federated Clouds

    Get PDF
    Les systèmes infonuagiques deviennent de plus en plus complexes, plus dynamiques et hétérogènes. Un tel environnement produit souvent des données complexes et bruitées, empêchant les systèmes de détection d’intrusion (IDS) de détecter des variantes d’attaques connues. Une seule intrusion ou une attaque dans un tel système hétérogène peut se présenter sous des formes différentes, logiquement mais non synthétiquement similaires. Les IDS traditionnels sont incapables d’identifier ces attaques, car ils sont conçus pour des infrastructures spécifiques et limitées. Par conséquent, une détection précise dans le nuage ne sera absolument pas identifiée. Outre le problème de l’infonuagique, les cyber-attaques sont de plus en plus sophistiquées et difficiles à détecter. Il est donc extrêmement compliqué pour un unique IDS d’un nuage de détecter toutes les attaques, en raison de leurs implications, et leurs connaissances limitées et insuffisantes de celles-ci. Les solutions IDS actuelles de l’infonuagique résident dans le fait qu’elles ne tiennent pas compte des aspects dynamiques et hétérogènes de l’infonuagique. En outre, elles s’appuient fondamentalement sur les connaissances et l’expérience locales pour identifier les attaques et les modèles existants. Cela rend le nuage vulnérable aux attaques «Zero-Day». À cette fin, nous résolvons dans cette thèse deux défis associés à l’IDS de l’infonuagique : la détection des cyberattaques dans des environnements complexes, dynamiques et hétérogènes, et la détection des cyberattaques ayant des informations limitées et/ou incomplètes sur les intrusions et leurs conséquences. Dans cette thèse, nous sommes intéressés aux IDS génériques de l’infonuagique afin d’identifier les intrusions qui sont indépendantes de l’infrastructure utilisée. Par conséquent, à chaque fois qu’un pressentiment d’attaque est identifié, le système de détection d’intrusion doit être capable de reconnaître toutes les variantes d’une telle attaque, quelle que soit l’infrastructure utilisée. De plus, les IDS de l’infonuagique coopèrent et échangent des informations afin de faire bénéficier chacun des expertises des autres, pour identifier des modèles d’attaques inconnues.----------ABSTRACT: Cloud Computing systems are becoming more and more complex, dynamic and heterogeneous. Such an environment frequently produces complex and noisy data that make Intrusion Detection Systems (IDSs) unable to detect unknown variants of known attacks. A single intrusion or an attack in such a heterogeneous system could take various forms that are logically but not synthetically similar. This, in turn, makes traditional IDSs unable to identify these attacks, since they are designed for specific and limited infrastructures. Therefore, the accuracy of the detection in the cloud will be very negatively affected. In addition to the problem of the cloud computing environment, cyber attacks are getting more sophisticated and harder to detect. Thus, it is becoming increasingly difficult for a single cloud-based IDS to detect all attacks, because of limited and incomplete knowledge about attacks and implications. The problem of the existing cloud-based IDS solutions is that they overlook the dynamic and changing nature of the cloud. Moreover, they are fundamentally based on the local knowledge and experience to perform the classification of attacks and normal patterns. This renders the cloud vulnerable to “Zero-Day” attacks. To this end, we address throughout this thesis two challenges associated with the cloud-based IDS which are: the detection of cyber attacks under complex, dynamic and heterogeneous environments; and the detection of cyber attacks under limited and/or incomplete information about intrusions and implications. We are interested in this thesis in allowing cloud-based IDSs to be generic, in order to identify intrusions regardless of the infrastructure used. Therefore, whenever an intrusion has been identified, an IDS should be able to recognize all the different structures of such an attack, regardless of the infrastructure that is being used. Moreover, we are interested in allowing cloud-based IDSs to cooperate and share knowledge with each other, in order to make them benefit from each other’s expertise to cover unknown attack patterns. The originality of this thesis lies within two aspects: 1) the design of a generic cloud-based IDS that allows the detection under changing and heterogeneous environments and 2) the design of a multi-cloud cooperative IDS that ensures trustworthiness, fairness and sustainability. By trustworthiness, we mean that the cloud-based IDS should be able to ensure that it will consult, cooperate and share knowledge with trusted parties (i.e., cloud-based IDSs). By fairness, we mean that the cloud-based IDS should be able to guarantee that mutual benefits will be achieved through minimising the chance of cooperating with selfish IDSs. This is useful to give IDSs the motivation to participate in the community

    A deep learning approach for proactive multi-cloud cooperative intrusion detection system

    No full text
    ABSTRACT: The last few years have witnessed the ability of cooperative cloud-based Intrusion Detection Systems (IDS) in detecting sophisticated and unknown attacks associated with the complex architecture of the Cloud. In a cooperative setting, an IDS can consult other IDSs about suspicious intrusions and make a decision using an aggregation algorithm. However, undesired delays arise from applying aggregation algorithms and also from waiting to receive feedback from consulted IDSs. These limitations render the decisions generated by existing cooperative IDS approaches ineffective in real-time, hence making them unsustainable. To face these challenges, we propose a machine learning-based cooperative IDS that efficiently exploits the historical feedback data to provide the ability of proactive decision making. Specifically, the proposed model is based on a Denoising Autoencoder (DA), which is used as a building block to construct a deep neural network. The power of DA lies in its ability to learn how to reconstruct IDSs’ feedback from partial feedback. This allows us to proactively make decisions about suspicious intrusions even in the absence of complete feedback from the IDSs. The proposed model was implemented in GPU-enabled TensorFlow and evaluated using a real-life dataset. Experimental results show that our model can achieve detection accuracy up to 95%
    corecore