843 research outputs found

    Modélisation formelle des systèmes de détection d'intrusions

    The cybersecurity ecosystem is constantly evolving in the number, diversity, and complexity of attacks. As a result, detection tools become ineffective against certain attacks. Intrusion detection systems (IDS) generally fall into three types: anomaly-based detection, signature-based detection, and hybrid detection. Anomaly-based detection characterizes the usual behavior of the system, typically in a statistical manner. It can detect both known and unknown attacks, but it also generates a very large number of false positives. Signature-based detection detects known attacks by defining rules that describe an attacker's known behavior, which requires good knowledge of that behavior. Hybrid detection relies on several detection methods, including the two above, and has the advantage of being more precise during detection. Tools such as Snort and Zeek offer low-level languages for expressing attack-recognition rules. Because the number of potential attacks is very large, these rule bases quickly become hard to manage and maintain. Moreover, expressing stateful rules that recognize a sequence of events is particularly arduous. In this thesis, we propose a stateful approach based on algebraic state-transition diagrams (ASTDs) to identify complex attacks. ASTDs allow a specification to be represented graphically and modularly, which eases the maintenance and understanding of rules. We extend the ASTD notation with new features for representing complex attacks. We then specify several attacks in the extended notation and run the resulting specifications on event streams using an interpreter to identify attacks. We also evaluate the performance of the interpreter with industrial tools such as Snort and Zeek. Finally, we build a compiler that generates executable code from an ASTD specification, capable of efficiently identifying sequences of events.
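    To illustrate what a stateful rule over an event stream involves, here is a minimal sketch in plain Python. It is not the ASTD notation, nor the thesis's interpreter or compiler; the three-step pattern, the event names, and the SequenceDetector class are all hypothetical and chosen only for illustration. The detector keeps one counter per source address and raises an alert when that source has produced the pattern's events in order.

```python
from dataclasses import dataclass, field

# Hypothetical three-step attack pattern; the event names are illustrative only.
PATTERN = ("port_scan", "login_failure", "login_success")


@dataclass
class SequenceDetector:
    """Tracks, per source, how far an ordered event pattern has progressed."""

    pattern: tuple = PATTERN
    # Maps each source address to the index of the next expected step.
    progress: dict = field(default_factory=dict)

    def feed(self, source: str, event: str) -> bool:
        """Consume one event; return True when `source` completes the pattern."""
        step = self.progress.get(source, 0)
        if event == self.pattern[step]:
            step += 1
        # Events that do not match the next expected step are ignored,
        # so unrelated traffic does not reset the per-source state.
        if step == len(self.pattern):
            self.progress.pop(source, None)  # reset after raising an alert
            return True
        self.progress[source] = step
        return False


detector = SequenceDetector()
events = [
    ("10.0.0.5", "port_scan"),
    ("10.0.0.9", "login_failure"),
    ("10.0.0.5", "login_failure"),
    ("10.0.0.5", "login_success"),
    ("10.0.0.9", "login_success"),
]
alerts = [src for src, ev in events if detector.feed(src, ev)]
```

    The per-source dictionary is the "state" that a stateless signature rule lacks: each event is interpreted in light of what the same source did earlier. A real rule language would also need timeouts, guards on event attributes, and composition of patterns, none of which this sketch attempts.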

    Online Analysis of Dynamic Streaming Data

    The thesis "Online Analysis of Dynamic Streaming Data" addresses distance measurement for dynamic, semi-structured data in continuous data streams, so that these data structures can be analyzed at runtime. To this end, it introduces a formalization of distance computation for static and dynamic trees and complements it with an explicit treatment of the dynamics of the attributes of individual tree nodes. The real-time analysis based on this distance measurement is complemented by density-based clustering to demonstrate applications to clustering, classification, and anomaly detection. The results of this work rest on a theoretical analysis of the introduced formalization of distance measurements for dynamic trees. These analyses are backed by empirical measurements on monitoring data of batch jobs from the batch system of the GridKa data and computing center. The evaluation of the proposed formalization, and of the real-time analysis methods built on it, shows the efficiency and scalability of the approach. It also shows that considering attributes and attribute statistics is particularly important for the quality of analysis results on dynamic, semi-structured data, and that the quality of the results can be improved further by an independent combination of several distances. In particular, the results of this work enable the analysis of data that changes over time.

    TOWARDS A HOLISTIC EFFICIENT STACKING ENSEMBLE INTRUSION DETECTION SYSTEM USING NEWLY GENERATED HETEROGENEOUS DATASETS

    With the exponential growth of network-based applications globally, organizations' business models have been transformed. Furthermore, falling costs of both computational devices and internet access have made people more dependent on technology. Consequently, the heavy use of computer networks has given rise to new risks, and improving the speed and accuracy of security mechanisms has become crucial. Although many new security tools have been developed, the rapid growth of malicious activities remains a pressing issue, as ever-evolving attacks continue to pose severe threats to network security. Classical security techniques (for instance, firewalls) are used as a first line of defense but remain unable to detect internal intrusions or to provide adequate countermeasures. Thus, network administrators tend to rely predominantly on Intrusion Detection Systems to detect intrusive network activity. Machine Learning is one of the practical approaches to intrusion detection; it learns from data to differentiate between normal and malicious traffic. Although Machine Learning approaches are used frequently, in-depth analysis of Machine Learning algorithms in the context of intrusion detection has received less attention in the literature. Moreover, adequate datasets are necessary to train and evaluate anomaly-based network intrusion detection systems. A number of such datasets, including DARPA, KDDCUP, and NSL-KDD, have been widely adopted by researchers to train and evaluate their proposed intrusion detection approaches. According to several studies, many of these datasets are outdated and unreliable. Furthermore, some of them lack traffic diversity and volume, do not cover the variety of attacks, contain anonymized packet information and payloads that cannot reflect current trends, or lack a feature set and metadata. This thesis provides a comprehensive analysis of some of the existing Machine Learning approaches for identifying network intrusions. Specifically, it analyzes the algorithms along several dimensions that are inherent to intrusion detection, namely feature selection, sensitivity to hyper-parameter selection, and class imbalance. It also produces a new reliable dataset, labeled Game Theory and Cyber Security (GTCS), that matches real-world criteria, contains normal traffic and different classes of attacks, and reflects current network traffic trends. The GTCS dataset is used to evaluate the performance of the different approaches, and a detailed experimental evaluation summarizing the effectiveness of each approach is presented. Finally, the thesis proposes an ensemble classifier model composed of multiple classifiers with different learning paradigms to address the issue of detection accuracy and false alarm rate in intrusion detection systems.

    Network biology methods for functional characterization and integrative prioritization of disease genes and proteins

    High-throughput techniques now produce large amounts of experimental data intended to provide more insight into complex phenotypes and cellular processes. The development of a variety of computational and, in particular, network-based approaches to analyzing these data has already shed light on previously unknown mechanisms. However, we are still far from a comprehensive understanding of human diseases and their causes, or of appropriate preventive measures and successful therapies. This thesis describes the development of methods and user-friendly software tools for the integrative analysis and interactive visualization of biological networks, as well as their application to biomedical data for understanding diseases. We design an integrative phenotype-specific framework for prioritizing candidate disease genes and functionally characterizing similar phenotypes. It is applied to the identification of several disease-relevant genes and processes for inflammatory bowel diseases and primary sclerosing cholangitis as well as for Parkinson's disease. Since finding the causative disease genes often does not suffice to understand diseases, we also concentrate on the molecular characterization of sequence mutations and their effect on protein structure and function. We develop a software suite to support the interactive, multi-layered visual analysis of molecular interaction mechanisms such as protein binding, allostery, and drug resistance. To capture the dynamic nature of proteins, we also devise an approach to visualizing and analyzing ensembles of protein structures as generated, for example, by molecular dynamics simulations.

    Network Traffic Measurements, Applications to Internet Services and Security

    Over the years, the Internet has become a pervasive network interconnecting billions of users, and it now acts as a collector for a multitude of tasks, ranging from professional activities to personal interactions. From a technical standpoint, novel architectures (e.g., cloud-based services and content delivery networks), innovative devices (e.g., smartphones and connected wearables), and security threats (e.g., DDoS attacks) pose new challenges in understanding network dynamics. In such a complex scenario, network measurements play a central role in guiding traffic management, improving network design, and evaluating application requirements. In addition, increasing importance is given to the quality of experience provided to end users, which requires thorough investigation of both the transport network and the design of Internet services. In this thesis, we stress the centrality of users by focusing on the traffic they exchange with the network. To do so, we design methodologies that combine passive and active measurements with post-processing techniques from the machine learning and statistics domains. Traffic exchanged by Internet users can be classified into three macro-groups: (i) outbound, produced by users' devices and pushed to the network; (ii) unsolicited, part of malicious attacks threatening users' security; and (iii) inbound, directed to users' devices and retrieved from remote servers. For these categories, we address, respectively, the benchmarking of personal cloud storage services, the automatic identification of Internet threats, and the assessment of quality of experience in the Web domain. Results comprise several contributions within each research topic. In short, they shed light on (i) the interplay among design choices of cloud storage services, which severely impact the performance provided to end users; (ii) the feasibility of designing a general-purpose classifier to detect malicious attacks without chasing threat specificities; and (iii) the relevance of appropriate means to evaluate the perceived quality of Web page delivery, reinforcing the need for user feedback for a factual assessment.

    Selected Works in Bioinformatics

    This book consists of nine chapters covering a variety of bioinformatics subjects: database resources for protein allergens, unravelling genetic determinants of complex disorders, characterization and prediction of regulatory motifs, computational methods for identifying the best classifiers and key disease genes in large-scale transcriptomic and proteomic experiments, functional characterization of inherently unfolded proteins/regions, protein interaction networks, and flexible protein-protein docking. The computational algorithms are in general presented in a way that is accessible to advanced undergraduate students, graduate students, and researchers in molecular biology and genetics. The book should also serve as a stepping stone for mathematicians, biostatisticians, and computational scientists to cross their academic boundaries into the dynamic and ever-expanding field of bioinformatics.

    Cyber Security

    This open access book constitutes the refereed proceedings of the 16th International Annual Conference on Cyber Security, CNCERT 2020, held in Beijing, China, in August 2020. The 17 papers presented were carefully reviewed and selected from 58 submissions. The papers are organized according to the following topical sections: access control; cryptography; denial-of-service attacks; hardware security implementation; intrusion/anomaly detection and malware mitigation; social network security and privacy; systems security

    Fundamentals

    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters

    New approaches to protein docking

    In the first part of this work, we propose new methods for protein docking. First, we present two approaches to protein docking with flexible side chains. The first approach is a fast greedy heuristic, while the second is a branch-and-cut algorithm that yields optimal solutions. For a test set of protease-inhibitor complexes, both approaches correctly predict the true complex structure. Another, largely unsolved, problem in protein docking is the prediction of the binding free energy, which is the final step of many protein docking algorithms. We therefore propose a new approach that avoids the expensive and difficult calculation of the binding free energy and instead employs a scoring function based on the similarity of the proton nuclear magnetic resonance spectra of the tentative complexes to the experimental spectrum. Using this method, we could even predict the structure of a very difficult protein-peptide complex that could not be solved using any energy-based scoring function. The second part of this work presents BALL (Biochemical ALgorithms Library), a framework for Rapid Application Development in the field of Molecular Modeling. BALL provides an extensive set of data structures as well as classes for Molecular Mechanics, advanced solvation methods, comparison and analysis of protein structures, file import/export, NMR shift prediction, and visualization. BALL has been carefully designed to be robust, easy to use, and open to extensions. Its extensibility in particular, which results from an object-oriented and generic programming approach, distinguishes it from other software packages.