37 research outputs found
FPGA implementation of naive bayes classifier for network security
With the widespread use of the internet, cybercrime such as fraud, hacking, identity theft, network intrusion, software piracy and espionage is becoming more critical. Malware writers exploit this to create malware that can breach security and gain access to information. Malware detection systems are therefore increasingly important, as users need protection from malware threats. Most malware detection systems implement signature-based classification, in which only known malware can be detected. New malware can change its signature sequence regularly to avoid detection, and such polymorphic malware is a limitation of the signature-based detection approach. The aim of this project is to propose a signature-based detection approach that can detect polymorphic malware by using the Naïve Bayes algorithm. The classifier architecture is integrated onto an FPGA board in order to measure the performance of the system. Features are extracted from a subset of network traffic subjected to Snort signature detection, covering known malware and benign samples, using an overlapping n-gram string format. The data set is then used to train and test the classifier. The classifier uses the Naïve Bayes algorithm, which applies Bayes' theorem to the features in the data set to determine the type of each flow. The model is then implemented as a hardware FPGA architecture coded in RTL. The target FPGA used in the Vivado software is the Xilinx Virtex-7 VC709, which can support the system requirements. The hardware performance of the model was analyzed and compared with a Naïve Bayes software classifier. The proposed hardware NB malware detection classifier achieved 96.3% accuracy and an improved false positive rate (FPR) of 3.1%.
The hardware NB malware detection classifier on the FPGA architecture also achieved better resource utilization and an improved detection speed of 0.13 μs per flow
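As a rough illustration of the classification step this abstract describes, the following Python sketch extracts overlapping n-grams from a payload and scores them with a Naive Bayes model using Laplace smoothing. It is a software-only toy under stated assumptions, not the authors' RTL design: the n-gram length of 4, the `NaiveBayes` class, and the sample payloads are all hypothetical.

```python
import math
from collections import Counter


def ngrams(payload: bytes, n: int = 4):
    """Overlapping n-grams: a window of n bytes slid one byte at a time."""
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over byte n-grams (illustrative only)."""

    def __init__(self, n: int = 4):
        self.n = n
        self.counts = {}        # label -> Counter of n-gram occurrences
        self.docs = Counter()   # label -> number of training samples
        self.vocab = set()      # every n-gram seen during training

    def fit(self, samples):
        for payload, label in samples:
            self.docs[label] += 1
            grams = ngrams(payload, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.vocab.update(grams)
        return self

    def log_scores(self, payload: bytes):
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # log prior + sum of log likelihoods (Bayes' theorem in log space)
            score = math.log(self.docs[label] / total_docs)
            for gram in ngrams(payload, self.n):
                # Laplace (add-one) smoothing avoids log(0) for unseen n-grams
                score += math.log((counter[gram] + 1) / (total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, payload: bytes):
        scores = self.log_scores(payload)
        return max(scores, key=scores.get)


# Hypothetical toy payloads; real training data would come from labeled flows.
toy_samples = [
    (b"GET /index.html", "benign"),
    (b"GET /about.html", "benign"),
    (b"\x90\x90\x90\x90\xeb\xfe", "malware"),
    (b"\x90\x90\x90\xcc\xeb\xfe", "malware"),
]
model = NaiveBayes(n=4).fit(toy_samples)
```

Working in log space turns the product of per-feature probabilities into a sum, which is one reason this kind of classifier maps naturally onto adders and lookup tables in hardware.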
A Survey on Enterprise Network Security: Asset Behavioral Monitoring and Distributed Attack Detection
Enterprise networks that host valuable assets and services are popular and
frequent targets of distributed network attacks. In order to cope with the
ever-increasing threats, industrial and research communities develop systems
and methods to monitor the behaviors of their assets and protect them from
critical attacks. In this paper, we systematically survey related research
articles and industrial systems to highlight the current status of this arms
race in enterprise network security. First, we discuss the taxonomy of
distributed network attacks on enterprise assets, including distributed
denial-of-service (DDoS) and reconnaissance attacks. Second, we review existing
methods in monitoring and classifying network behavior of enterprise hosts to
verify their benign activities and isolate potential anomalies. Third,
state-of-the-art detection methods for distributed network attacks sourced from
external attackers are elaborated, highlighting their merits and bottlenecks.
Fourth, as programmable networks and machine learning (ML) techniques are
increasingly becoming adopted by the community, their current applications in
network security are discussed. Finally, we highlight several research gaps on
enterprise network security to inspire future research. Comment: Journal paper submitted to Elsevier
Security Technologies and Methods for Advanced Cyber Threat Intelligence, Detection and Mitigation
The rapid growth of Internet interconnectivity and the complexity of communication systems have led to a significant growth of cyberattacks globally, often with severe and disastrous consequences. The swift development of more innovative and effective (cyber)security solutions and approaches that can detect, mitigate and prevent these serious consequences is vital. Cybersecurity is gaining momentum and is scaling up in many areas. This book builds on the experience of the Cyber-Trust EU project's methods, use cases, technology development, testing and validation, and extends into a broader science, lead IT industry market and applied research with practical cases. It offers new perspectives on advanced (cyber) security innovation (eco) systems, covering different key perspectives. The book provides insights on new security technologies and methods for advanced cyber threat intelligence, detection and mitigation. We cover topics such as cyber-security and AI, cyber-threat intelligence, digital forensics, moving target defense, intrusion detection systems, post-quantum security, privacy and data protection, security visualization, smart contracts security, software security, blockchain, security architectures, system and data integrity, trust management systems, distributed systems security, dynamic risk management, privacy and ethics
System steganalysis with automatic fingerprint extraction
This paper tackles the modern challenge of practical steganalysis over large data by presenting a novel approach that aims to perform with perfect accuracy and in a completely automatic manner. The objective is to detect changes introduced by the steganographic process in those data objects, including signatures related to the tools being used. Our approach achieves this by first extracting reliable regularities from pairs of modified and unmodified data objects, and then combining these findings into general patterns present in the data used for training. Finally, we construct a Naive Bayes model that performs classification and operates on attributes extracted using the aforementioned patterns. This technique has been applied to different steganographic tools that operate on media files of several types. We are able to replicate or improve on a number of previously published results, but more importantly, we also present new steganalytic findings for a number of popular tools that had no previously known attacks
Deteção de propagação de ameaças e exfiltração de dados em redes empresariais
Modern corporations face multiple threats within their networks. In an era where companies are tightly dependent on information, these threats can seriously compromise the safety and integrity of sensitive data. Unauthorized access and illicit programs are a way of penetrating corporate networks, able to traverse and propagate to other terminals across the private network in search of confidential data and business secrets. The efficiency of traditional security defenses is being questioned given the number of data breaches occurring nowadays, making it essential to develop new active monitoring systems with artificial intelligence capable of achieving almost perfect detection in very short time frames. However, network monitoring and the storage of network activity records are restricted and limited by law and by privacy strategies, such as encryption, that aim to protect the confidentiality of private parties. This dissertation proposes methodologies to infer behavior patterns and disclose anomalies from network traffic analysis, detecting slight variations compared with the normal profile. Bounded by OSI network layers 1 to 4, raw data are modeled into features representing network observations and subsequently processed by machine learning algorithms to classify network activity. Assuming that a network terminal will inevitably be compromised, this work covers two scenarios: a self-spreading force that propagates over the internal network and a data exfiltration charge that dispatches confidential information to the public network. Although the features and modeling processes have been tested for these two cases, the operation is generic and can be used in more complex scenarios as well as in different domains. The last chapter describes the proof-of-concept scenario and how the data was generated, along with some evaluation metrics to assess the model's performance. The tests showed promising results, ranging from 96% to 99% for the propagation case and from 86% to 97% for data exfiltration.
Master's in Computer and Telematics Engineering
A monitoring and threat detection system using stream processing as a virtual function for big data
The late detection of security threats significantly increases the risk of irreparable damage and disables any defense attempt. As a consequence, fast real-time threat detection is mandatory for security guarantees. In addition, Network Function Virtualization (NFV) provides new opportunities for efficient and low-cost security solutions. We propose a fast and efficient threat detection system based on stream processing and machine learning algorithms. The main contributions of this work are i) a novel monitoring and threat detection system based on stream processing; ii) two datasets, the first a dataset of synthetic security data containing both legitimate and malicious traffic, and the second a week of real traffic from a telecommunications operator in Rio de Janeiro, Brazil; iii) a data pre-processing algorithm, a normalizing algorithm and an algorithm for fast feature selection based on the correlation between variables; iv) a virtualized network function in an open-source platform for providing a real-time threat detection service; v) near-optimal placement of sensors through a proposed heuristic for strategically positioning sensors in the network infrastructure, with a minimum number of sensors; and, finally, vi) a greedy algorithm that allocates on demand a sequence of virtual network functions.
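To make the pre-processing contribution in item iii) more concrete, the sketch below shows one common way to normalize features and then select among them by correlation: a feature is kept only if it is not strongly correlated with a feature already kept. This is an assumed, generic technique written in Python, not the algorithm from the work itself; the function names, the 0.9 threshold, and the flow values are hypothetical.

```python
import math


def min_max_normalize(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def select_features(columns, threshold=0.9):
    """Greedily keep features whose |correlation| with every
    previously kept feature stays below the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-flow measurements; "packets" is redundant with "bytes".
flows = {
    "bytes": [100, 200, 300, 400],
    "packets": [1, 2, 3, 4],
    "duration": [5, 1, 4, 2],
}
normalized = {name: min_max_normalize(col) for name, col in flows.items()}
kept = select_features(normalized)
```

Dropping redundant features in this way reduces the work done per record, which matters when the classifier sits in a stream-processing pipeline.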
Intrusion Detection from Heterogenous Sensors
Nowadays, protecting computer systems and networks against various distributed and multi-step attacks is a vital challenge for their owners. One of the essential threats to the security of such computer infrastructures is attacks by malicious individuals, inside and outside the system environment, who abuse available services or reveal confidential information. Consequently, managing and supervising computer systems is a considerable challenge, as new threats and attacks are discovered on a daily basis.
Intrusion Detection Systems (IDSs) play a key role in the surveillance and monitoring of computer network infrastructures. These systems inspect events that occur in computer systems and networks and, in case of any malicious behavior, generate alerts describing the attacks' details. However, a number of shortcomings need to be addressed to make them reliable enough in real-world situations. One of the fundamental challenges in real-world IDSs is the large number of redundant, non-relevant, and false positive alerts that they generate, making it difficult for security administrators to identify real and important alerts. Part of the problem is that most IDSs do not take into account contextual information (type of systems, applications, users, networks, etc.), and therefore a large portion of the alerts are non-relevant in that, even though they correctly recognize an intrusion, the intrusion fails to reach its objectives. Additionally, relying on only one detection sensor type is not adequate to detect newer and more complicated attacks, and as a result many current IDSs are unable to detect them. This is especially important with respect to targeted attacks that try to avoid detection by conventional IDSs and by other security products. While many system administrators are known to successfully incorporate context information and many different types of sensors and logs into their analysis, an important problem with this approach is the lack of automation in both storage and analysis. In order to address these problems in IDS applicability, various IDS types have been proposed in recent years, and commercial off-the-shelf (COTS) IDS products have found their way into the Security Operations Centers (SOCs) of many large organizations.
From a general perspective, these works can be categorized into: machine learning-based approaches including Bayesian networks, data mining methods, decision trees, neural networks, etc.; alert correlation and alert fusion-based approaches; context-aware intrusion detection systems; distributed intrusion detection systems; and ontology-based intrusion detection systems. To the best of our knowledge, since these works focus on only one or a few of the IDS challenges, the problem as a whole has not been resolved. Hence, there is no comprehensive work addressing all the mentioned challenges of modern intrusion detection systems. For example, works that use machine learning approaches only classify events based on features derived from the behavior observed with one type of event, and they do not take into account contextual information and event interrelationships. Most of the proposed alert correlation techniques consider correlation only across multiple sensors of the same type having common event and alert semantics (homogeneous correlation), leaving it to security administrators to perform correlation across heterogeneous types of sensors. Context-aware approaches employ only limited aspects of the underlying context. The lack of accurate evaluation based on data sets that encompass modern complex attack scenarios is another major shortcoming of most of the proposed approaches.
The goal of this thesis is to design an event correlation system that can correlate across several heterogeneous types of sensors and logs (e.g. IDS/IPS, firewall, database, operating system, anti-virus, web proxy, routers, etc.) in order to detect complex attacks that leave traces in various systems, and to incorporate context information into the analysis in order to reduce false positives. To this end, our contributions can be split into four main parts: 1) We propose Pasargadae, a comprehensive context-aware and ontology-based event correlation framework that automatically performs event correlation by reasoning on the information collected from various information resources. Pasargadae uses ontologies to represent and store information on events, context and vulnerability information, and attack scenarios, and uses simple ontology logic rules written in Semantic Query-Enhanced Web Rule Language (SQWRL) to correlate various information and filter out non-relevant alerts, duplicate alerts, and false positives. 2) We propose a meta-event-based, topological sort-based and semantic-based event correlation approach that employs Pasargadae to perform event correlation across events collected from several sensors distributed in a computer network. 3) We propose a semantic-based context-aware alert fusion approach that relies on some of the subcomponents of Pasargadae to fuse heterogeneous alerts collected from heterogeneous IDSs. 4) In order to show the level of flexibility of Pasargadae, we use it to implement some other proposed alert and event correlation approaches. The sum of these contributions represents a significant improvement in the applicability and reliability of IDSs in real-world situations.
In order to test the performance and flexibility of the proposed event correlation approach, we need to address the lack of experimental infrastructure suitable for network security. A study of the literature shows that current experimental approaches are not appropriate for generating high-fidelity network data. Consequently, in order to accomplish a comprehensive evaluation, we first conduct our experiments on two separate case study scenarios, inspired by the DARPA 2000 and UNB ISCX IDS evaluation data sets. Next, as a complete field study, we deploy Pasargadae in a real computer network for a two-week period to inspect its detection capabilities on ground-truth network traffic. The results obtained show that, compared to other existing IDS improvements, the proposed contributions significantly improve IDS performance (detection rate) while reducing false positives, non-relevant alerts and duplicate alerts
Managing Networked IoT Assets Using Practical and Scalable Traffic Inference
The Internet has recently witnessed unprecedented growth of a class of connected assets called the Internet of Things (IoT). Due to relatively immature manufacturing processes and limited computing resources, IoTs have inadequate device-level security measures, exposing the Internet to various cyber risks. Therefore, network-level security has been considered a practical and scalable approach for securing IoTs, but this cannot be employed without discovering the connected devices and characterizing their behavior. Prior research leveraged predictable patterns in IoT network traffic to develop inference models. However, they fall short of expectations in addressing practical challenges, preventing them from being deployed in production settings. This thesis identifies four practical challenges and develops techniques to address them which can help secure businesses and protect user privacy against growing cyber threats.
My first contribution balances prediction gains against computing costs of traffic features for IoT traffic classification and monitoring. I develop a method to find the best set of specialized models for multi-view classification that can reach an average accuracy of 99%, i.e., a similar accuracy compared to existing works but reducing the cost by a factor of 6. I develop a hierarchy of one-class models per asset class, each at certain granularity, to progressively monitor IoT traffic. My second contribution addresses the challenges of measurement costs and data quality. I develop an inference method that uses stochastic and deterministic modeling to predict IoT devices in home networks from opaque and coarse-grained IPFIX flow data. Evaluations show that false positive rates can be reduced by 75% compared to related work without significantly affecting true positives. My third contribution focuses on the challenge of concept drifts by analyzing over six million flow records collected from 12 real home networks. I develop several inference strategies and compare their performance under concept drift, particularly when labeled data is unavailable in the testing phase. Finally, my fourth contribution studies the resilience of machine learning models against adversarial attacks with a specific focus on decision tree-based models. I develop methods to quantify the vulnerability of a given decision tree-based model against data-driven adversarial attacks and refine vulnerable decision trees, making them robust against 92% of adversarial attacks
Designing and Deploying Internet of Things Applications in the Industry: An Empirical Investigation
RÉSUMÉ: The Internet of Things (IoT) aims to bring connectivity to almost every object found in physical space. It extends connectivity to everyday objects and makes it possible to monitor, track, connect to, and interact with industrial assets more efficiently. In industry today, connected sensor networks monitor logistics movements and manufacturing machines, and help organizations improve their efficiency and reduce costs. However, designing and implementing an IoT network remains a particularly difficult task. We observe a high level of fragmentation in the IoT landscape: developers regularly complain about the difficulty of integrating diverse technologies with the various objects found in IoT systems, and about the absence of clear guidelines and/or practices for developing and deploying safe and efficient IoT applications. Analyzing and understanding the problems related to IoT development and deployment is therefore essential for the industry to exploit its full potential.
In this thesis, we examine the interactions of IoT specialists on the popular websites Stack Overflow and Stack Exchange in order to understand the challenges and problems they face when developing and deploying different IoT applications. Next, we examine the lack of interoperability between the techniques developed for the IoT, study the challenges that their integration poses, and provide guidelines to practitioners interested in connecting IoT networks and devices to develop various services and applications. In addition, since security is essential to the success of this technology, we study the different security threats and challenges across the layers of the IoT system architecture and propose countermeasures.
Finally, we conduct a series of experiments aimed at understanding the advantages and disadvantages of "serverful" and "serverless" deployments of IoT applications, in order to provide practitioners with evidence-based guidelines and recommendations for such deployments. The results presented represent a very important step toward a deep understanding of these very promising technologies. We believe that our recommendations and suggestions will help practitioners and technology builders improve the quality of IoT software and systems. We hope that our results can help IoT communities and consortia establish standards and guidelines for the development, maintenance, and evolution of IoT software.
ABSTRACT: The Internet of Things (IoT) aims to bring connectivity to almost every object found in the physical space. It extends connectivity to everyday things and opens up the possibility to monitor, track, connect, and interact with industrial assets more efficiently. In industry today, connected sensor networks monitor logistics movements and manufacturing machines, and help organizations improve their efficiency and reduce costs. However, designing and implementing an IoT network is still a very challenging task. We are witnessing a high level of fragmentation in the IoT landscape, and developers regularly complain about the difficulty of integrating the diverse technologies of the various objects found in IoT systems and about the lack of clear guidelines and/or practices for developing and deploying safe and efficient IoT applications. Therefore, analyzing and understanding issues related to the development and deployment of the Internet of Things is critically important to allow the industry to realize its full potential.
In this thesis, we examine IoT practitioners' discussions on the popular Q&A websites Stack Overflow and Stack Exchange to understand the challenges and issues that they face when developing and deploying different IoT applications. Next, we examine the lack of interoperability among technologies developed for IoT, study the challenges that their integration poses, and provide guidelines for practitioners interested in connecting IoT networks and devices to develop various services and applications. Since security is central to the success of this technology, we also investigate different security threats and challenges across the layers of the architecture of IoT systems and propose countermeasures. Finally, we conduct a series of experiments to understand the advantages and trade-offs of serverful and serverless deployments of IoT applications in order to provide practitioners with evidence-based guidelines and recommendations on such deployments. The results presented in this thesis represent a first important step towards a deep understanding of these very promising technologies. We believe that our recommendations and suggestions will help practitioners and technology builders improve the quality of IoT software and systems. We also hope that our results can help IoT communities and consortia establish standards and guidelines for the development, maintenance, and evolution of IoT software and systems.