185 research outputs found

    Network Intrusion Detection System:A systematic study of Machine Learning and Deep Learning approaches

    Get PDF
    The rapid advances in the internet and communication fields have resulted in ahuge increase in the network size and the corresponding data. As a result, manynovel attacks are being generated and have posed challenges for network secu-rity to accurately detect intrusions. Furthermore, the presence of the intruderswiththeaimtolaunchvariousattackswithinthenetworkcannotbeignored.Anintrusion detection system (IDS) is one such tool that prevents the network frompossible intrusions by inspecting the network traffic, to ensure its confidential-ity, integrity, and availability. Despite enormous efforts by the researchers, IDSstillfaceschallengesinimprovingdetectionaccuracywhilereducingfalsealarmrates and in detecting novel intrusions. Recently, machine learning (ML) anddeep learning (DL)-based IDS systems are being deployed as potential solutionsto detect intrusions across the network in an efficient manner. This article firstclarifiestheconceptofIDSandthenprovidesthetaxonomybasedonthenotableML and DL techniques adopted in designing network-based IDS (NIDS) sys-tems. A comprehensive review of the recent NIDS-based articles is provided bydiscussing the strengths and limitations of the proposed solutions. Then, recenttrends and advancements of ML and DL-based NIDS are provided in terms ofthe proposed methodology, evaluation metrics, and dataset selection. Using theshortcomings of the proposed methods, we highlighted various research chal-lenges and provided the future scope for the research in improving ML andDL-based NIDS

    Generative Methods, Meta-learning, and Meta-heuristics for Robust Cyber Defense

    Get PDF
    Cyberspace is the digital communications network that supports the internet of battlefield things (IoBT), the model by which defense-centric sensors, computers, actuators and humans are digitally connected. A secure IoBT infrastructure facilitates real time implementation of the observe, orient, decide, act (OODA) loop across distributed subsystems. Successful hacking efforts by cyber criminals and strategic adversaries suggest that cyber systems such as the IoBT are not secure. Three lines of effort demonstrate a path towards a more robust IoBT. First, a baseline data set of enterprise cyber network traffic was collected and modelled with generative methods allowing the generation of realistic, synthetic cyber data. Next, adversarial examples of cyber packets were algorithmically crafted to fool network intrusion detection systems while maintaining packet functionality. Finally, a framework is presented that uses meta-learning to combine the predictive power of various weak models. This resulted in a meta-model that outperforms all baseline classifiers with respect to overall accuracy of packets, and adversarial example detection rate. The National Defense Strategy underscores cybersecurity as an imperative to defend the homeland and maintain a military advantage in the information age. This research provides both academic perspective and applied techniques to to further the cybersecurity posture of the Department of Defense into the information age

    Realistic adversarial machine learning to improve network intrusion detection

    Get PDF
    Modern organizations can significantly benefit from the use of Artificial Intelligence (AI), and more specifically Machine Learning (ML), to tackle the growing number and increasing sophistication of cyber-attacks targeting their business processes. However, there are several technological and ethical challenges that undermine the trustworthiness of AI. One of the main challenges is the lack of robustness, which is an essential property to ensure that ML is used in a secure way. Improving robustness is no easy task because ML is inherently susceptible to adversarial examples: data samples with subtle perturbations that cause unexpected behaviors in ML models. ML engineers and security practitioners still lack the knowledge and tools to prevent such disruptions, so adversarial examples pose a major threat to ML and to the intelligent Network Intrusion Detection (NID) systems that rely on it. This thesis presents a methodology for a trustworthy adversarial robustness analysis of multiple ML models, and an intelligent method for the generation of realistic adversarial examples in complex tabular data domains like the NID domain: Adaptative Perturbation Pattern Method (A2PM). It is demonstrated that a successful adversarial attack is not guaranteed to be a successful cyber-attack, and that adversarial data perturbations can only be realistic if they are simultaneously valid and coherent, complying with the domain constraints of a real communication network and the class-specific constraints of a certain cyber-attack class. A2PM can be used for adversarial attacks, to iteratively cause misclassifications, and adversarial training, to perform data augmentation with slightly perturbed data samples. Two case studies were conducted to evaluate its suitability for the NID domain. The first verified that the generated perturbations preserved both validity and coherence in Enterprise and Internet-of Things (IoT) network scenarios, achieving realism. The second verified that adversarial training with simple perturbations enables the models to retain a good generalization to regular IoT network traffic flows, in addition to being more robust to adversarial examples. The key takeaway of this thesis is: ML models can be incredibly valuable to improve a cybersecurity system, but their own vulnerabilities must not be disregarded. It is essential to continue the research efforts to improve the security and trustworthiness of ML and of the intelligent systems that rely on it.Organizações modernas podem beneficiar significativamente do uso de Inteligência Artificial (AI), e mais especificamente Aprendizagem Automática (ML), para enfrentar a crescente quantidade e sofisticação de ciberataques direcionados aos seus processos de negócio. No entanto, há vários desafios tecnológicos e éticos que comprometem a confiabilidade da AI. Um dos maiores desafios é a falta de robustez, que é uma propriedade essencial para garantir que se usa ML de forma segura. Melhorar a robustez não é uma tarefa fácil porque ML é inerentemente suscetível a exemplos adversos: amostras de dados com perturbações subtis que causam comportamentos inesperados em modelos ML. Engenheiros de ML e profissionais de segurança ainda não têm o conhecimento nem asferramentas necessárias para prevenir tais disrupções, por isso os exemplos adversos representam uma grande ameaça a ML e aos sistemas de Deteção de Intrusões de Rede (NID) que dependem de ML. Esta tese apresenta uma metodologia para uma análise da robustez de múltiplos modelos ML, e um método inteligente para a geração de exemplos adversos realistas em domínios de dados tabulares complexos como o domínio NID: Método de Perturbação com Padrões Adaptativos (A2PM). É demonstrado que um ataque adverso bem-sucedido não é garantidamente um ciberataque bem-sucedido, e que as perturbações adversas só são realistas se forem simultaneamente válidas e coerentes, cumprindo as restrições de domínio de uma rede de computadores real e as restrições específicas de uma certa classe de ciberataque. A2PM pode ser usado para ataques adversos, para iterativamente causar erros de classificação, e para treino adverso, para realizar aumento de dados com amostras ligeiramente perturbadas. Foram efetuados dois casos de estudo para avaliar a sua adequação ao domínio NID. O primeiro verificou que as perturbações preservaram tanto a validade como a coerência em cenários de redes Empresariais e Internet-das-Coisas (IoT), alcançando o realismo. O segundo verificou que o treino adverso com perturbações simples permitiu aos modelos reter uma boa generalização a fluxos de tráfego de rede IoT, para além de serem mais robustos contra exemplos adversos. A principal conclusão desta tese é: os modelos ML podem ser incrivelmente valiosos para melhorar um sistema de cibersegurança, mas as suas próprias vulnerabilidades não devem ser negligenciadas. É essencial continuar os esforços de investigação para melhorar a segurança e a confiabilidade de ML e dos sistemas inteligentes que dependem de ML

    Artificial Intelligence and Machine Learning in Cybersecurity: Applications, Challenges, and Opportunities for MIS Academics

    Get PDF
    The availability of massive amounts of data, fast computers, and superior machine learning (ML) algorithms has spurred interest in artificial intelligence (AI). It is no surprise, then, that we observe an increase in the application of AI in cybersecurity. Our survey of AI applications in cybersecurity shows most of the present applications are in the areas of malware identification and classification, intrusion detection, and cybercrime prevention. We should, however, be aware that AI-enabled cybersecurity is not without its drawbacks. Challenges to AI solutions include a shortage of good quality data to train machine learning models, the potential for exploits via adversarial AI/ML, and limited human expertise in AI. However, the rewards in terms of increased accuracy of cyberattack predictions, faster response to cyberattacks, and improved cybersecurity make it worthwhile to overcome these challenges. We present a summary of the current research on the application of AI and ML to improve cybersecurity, challenges that need to be overcome, and research opportunities for academics in management information systems

    Spatiotemporal anomaly detection: streaming architecture and algorithms

    Get PDF
    Includes bibliographical references.2020 Summer.Anomaly detection is the science of identifying one or more rare or unexplainable samples or events in a dataset or data stream. The field of anomaly detection has been extensively studied by mathematicians, statisticians, economists, engineers, and computer scientists. One open research question remains the design of distributed cloud-based architectures and algorithms that can accurately identify anomalies in previously unseen, unlabeled streaming, multivariate spatiotemporal data. With streaming data, time is of the essence, and insights are perishable. Real-world streaming spatiotemporal data originate from many sources, including mobile phones, supervisory control and data acquisition enabled (SCADA) devices, the internet-of-things (IoT), distributed sensor networks, and social media. Baseline experiments are performed on four (4) non-streaming, static anomaly detection multivariate datasets using unsupervised offline traditional machine learning (TML), and unsupervised neural network techniques. Multiple architectures, including autoencoders, generative adversarial networks, convolutional networks, and recurrent networks, are adapted for experimentation. Extensive experimentation demonstrates that neural networks produce superior detection accuracy over TML techniques. These same neural network architectures can be extended to process unlabeled spatiotemporal streaming using online learning. Space and time relationships are further exploited to provide additional insights and increased anomaly detection accuracy. A novel domain-independent architecture and set of algorithms called the Spatiotemporal Anomaly Detection Environment (STADE) is formulated. STADE is based on federated learning architecture. STADE streaming algorithms are based on a geographically unique, persistently executing neural networks using online stochastic gradient descent (SGD). STADE is designed to be pluggable, meaning that alternative algorithms may be substituted or combined to form an ensemble. STADE incorporates a Stream Anomaly Detector (SAD) and a Federated Anomaly Detector (FAD). The SAD executes at multiple locations on streaming data, while the FAD executes at a single server and identifies global patterns and relationships among the site anomalies. Each STADE site streams anomaly scores to the centralized FAD server for further spatiotemporal dependency analysis and logging. The FAD is based on recent advances in DNN-based federated learning. A STADE testbed is implemented to facilitate globally distributed experimentation using low-cost, commercial cloud infrastructure provided by Microsoft™. STADE testbed sites are situated in the cloud within each continent: Africa, Asia, Australia, Europe, North America, and South America. Communication occurs over the commercial internet. Three STADE case studies are investigated. The first case study processes commercial air traffic flows, the second case study processes global earthquake measurements, and the third case study processes social media (i.e., Twitter™) feeds. These case studies confirm that STADE is a viable architecture for the near real-time identification of anomalies in streaming data originating from (possibly) computationally disadvantaged, geographically dispersed sites. Moreover, the addition of the FAD provides enhanced anomaly detection capability. Since STADE is domain-independent, these findings can be easily extended to additional application domains and use cases

    Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research

    Get PDF
    This survey presents a comprehensive review of current literature on Explainable Artificial Intelligence (XAI) methods for cyber security applications. Due to the rapid development of Internet-connected systems and Artificial Intelligence in recent years, Artificial Intelligence including Machine Learning and Deep Learning has been widely utilized in the fields of cyber security including intrusion detection, malware detection, and spam filtering. However, although Artificial Intelligence-based approaches for the detection and defense of cyber attacks and threats are more advanced and efficient compared to the conventional signature-based and rule-based cyber security strategies, most Machine Learning-based techniques and Deep Learning-based techniques are deployed in the “black-box” manner, meaning that security experts and customers are unable to explain how such procedures reach particular conclusions. The deficiencies of transparencies and interpretability of existing Artificial Intelligence techniques would decrease human users’ confidence in the models utilized for the defense against cyber attacks, especially in current situations where cyber attacks become increasingly diverse and complicated. Therefore, it is essential to apply XAI in the establishment of cyber security models to create more explainable models while maintaining high accuracy and allowing human users to comprehend, trust, and manage the next generation of cyber defense mechanisms. Although there are papers reviewing Artificial Intelligence applications in cyber security areas and the vast literature on applying XAI in many fields including healthcare, financial services, and criminal justice, the surprising fact is that there are currently no survey research articles that concentrate on XAI applications in cyber security. Therefore, the motivation behind the survey is to bridge the research gap by presenting a detailed and up-to-date survey of XAI approaches applicable to issues in the cyber security field. Our work is the first to propose a clear roadmap for navigating the XAI literature in the context of applications in cyber security
    corecore