1,249 research outputs found

    End-to-end anomaly detection in stream data

    Get PDF
    Nowadays, huge volumes of data are generated with increasing velocity through various systems, applications, and activities. This increases the demand for stream and time series analysis to react to changing conditions in real-time for enhanced efficiency and quality of service delivery as well as upgraded safety and security in private and public sectors. Despite its very rich history, time series anomaly detection is still one of the vital topics in machine learning research and is receiving increasing attention. Identifying hidden patterns and selecting an appropriate model that fits the observed data well and also carries over to unobserved data is not a trivial task. Due to the increasing diversity of data sources and associated stochastic processes, this pivotal data analysis topic is loaded with various challenges like complex latent patterns, concept drift, and overfitting that may mislead the model and cause a high false alarm rate. Handling these challenges leads the advanced anomaly detection methods to develop sophisticated decision logic, which turns them into mysterious and inexplicable black-boxes. Contrary to this trend, end-users expect transparency and verifiability to trust a model and the outcomes it produces. Also, pointing the users to the most anomalous/malicious areas of time series and causal features could save them time, energy, and money. For the mentioned reasons, this thesis is addressing the crucial challenges in an end-to-end pipeline of stream-based anomaly detection through the three essential phases of behavior prediction, inference, and interpretation. The first step is focused on devising a time series model that leads to high average accuracy as well as small error deviation. On this basis, we propose higher-quality anomaly detection and scoring techniques that utilize the related contexts to reclassify the observations and post-pruning the unjustified events. Last but not least, we make the predictive process transparent and verifiable by providing meaningful reasoning behind its generated results based on the understandable concepts by a human. The provided insight can pinpoint the anomalous regions of time series and explain why the current status of a system has been flagged as anomalous. Stream-based anomaly detection research is a principal area of innovation to support our economy, security, and even the safety and health of societies worldwide. We believe our proposed analysis techniques can contribute to building a situational awareness platform and open new perspectives in a variety of domains like cybersecurity, and health

    Decision Support Elements and Enabling Techniques to Achieve a Cyber Defence Situational Awareness Capability

    Full text link
    [ES] La presente tesis doctoral realiza un análisis en detalle de los elementos de decisión necesarios para mejorar la comprensión de la situación en ciberdefensa con especial énfasis en la percepción y comprensión del analista de un centro de operaciones de ciberseguridad (SOC). Se proponen dos arquitecturas diferentes basadas en el análisis forense de flujos de datos (NF3). La primera arquitectura emplea técnicas de Ensemble Machine Learning mientras que la segunda es una variante de Machine Learning de mayor complejidad algorítmica (lambda-NF3) que ofrece un marco de defensa de mayor robustez frente a ataques adversarios. Ambas propuestas buscan automatizar de forma efectiva la detección de malware y su posterior gestión de incidentes mostrando unos resultados satisfactorios en aproximar lo que se ha denominado un SOC de próxima generación y de computación cognitiva (NGC2SOC). La supervisión y monitorización de eventos para la protección de las redes informáticas de una organización debe ir acompañada de técnicas de visualización. En este caso, la tesis aborda la generación de representaciones tridimensionales basadas en métricas orientadas a la misión y procedimientos que usan un sistema experto basado en lógica difusa. Precisamente, el estado del arte muestra serias deficiencias a la hora de implementar soluciones de ciberdefensa que reflejen la relevancia de la misión, los recursos y cometidos de una organización para una decisión mejor informada. El trabajo de investigación proporciona finalmente dos áreas claves para mejorar la toma de decisiones en ciberdefensa: un marco sólido y completo de verificación y validación para evaluar parámetros de soluciones y la elaboración de un conjunto de datos sintéticos que referencian unívocamente las fases de un ciberataque con los estándares Cyber Kill Chain y MITRE ATT & CK.[CA] La present tesi doctoral realitza una anàlisi detalladament dels elements de decisió necessaris per a millorar la comprensió de la situació en ciberdefensa amb especial èmfasi en la percepció i comprensió de l'analista d'un centre d'operacions de ciberseguretat (SOC). Es proposen dues arquitectures diferents basades en l'anàlisi forense de fluxos de dades (NF3). La primera arquitectura empra tècniques de Ensemble Machine Learning mentre que la segona és una variant de Machine Learning de major complexitat algorítmica (lambda-NF3) que ofereix un marc de defensa de major robustesa enfront d'atacs adversaris. Totes dues propostes busquen automatitzar de manera efectiva la detecció de malware i la seua posterior gestió d'incidents mostrant uns resultats satisfactoris a aproximar el que s'ha denominat un SOC de pròxima generació i de computació cognitiva (NGC2SOC). La supervisió i monitoratge d'esdeveniments per a la protecció de les xarxes informàtiques d'una organització ha d'anar acompanyada de tècniques de visualització. En aquest cas, la tesi aborda la generació de representacions tridimensionals basades en mètriques orientades a la missió i procediments que usen un sistema expert basat en lògica difusa. Precisament, l'estat de l'art mostra serioses deficiències a l'hora d'implementar solucions de ciberdefensa que reflectisquen la rellevància de la missió, els recursos i comeses d'una organització per a una decisió més ben informada. El treball de recerca proporciona finalment dues àrees claus per a millorar la presa de decisions en ciberdefensa: un marc sòlid i complet de verificació i validació per a avaluar paràmetres de solucions i l'elaboració d'un conjunt de dades sintètiques que referencien unívocament les fases d'un ciberatac amb els estàndards Cyber Kill Chain i MITRE ATT & CK.[EN] This doctoral thesis performs a detailed analysis of the decision elements necessary to improve the cyber defence situation awareness with a special emphasis on the perception and understanding of the analyst of a cybersecurity operations center (SOC). Two different architectures based on the network flow forensics of data streams (NF3) are proposed. The first architecture uses Ensemble Machine Learning techniques while the second is a variant of Machine Learning with greater algorithmic complexity (lambda-NF3) that offers a more robust defense framework against adversarial attacks. Both proposals seek to effectively automate the detection of malware and its subsequent incident management, showing satisfactory results in approximating what has been called a next generation cognitive computing SOC (NGC2SOC). The supervision and monitoring of events for the protection of an organisation's computer networks must be accompanied by visualisation techniques. In this case, the thesis addresses the representation of three-dimensional pictures based on mission oriented metrics and procedures that use an expert system based on fuzzy logic. Precisely, the state-of-the-art evidences serious deficiencies when it comes to implementing cyber defence solutions that consider the relevance of the mission, resources and tasks of an organisation for a better-informed decision. The research work finally provides two key areas to improve decision-making in cyber defence: a solid and complete verification and validation framework to evaluate solution parameters and the development of a synthetic dataset that univocally references the phases of a cyber-attack with the Cyber Kill Chain and MITRE ATT & CK standards.Llopis Sánchez, S. (2023). Decision Support Elements and Enabling Techniques to Achieve a Cyber Defence Situational Awareness Capability [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/19424

    Cybersecurity knowledge graphs

    Get PDF
    Cybersecurity knowledge graphs, which represent cyber-knowledge with a graph-based data model, provide holistic approaches for processing massive volumes of complex cybersecurity data derived from diverse sources. They can assist security analysts to obtain cyberthreat intelligence, achieve a high level of cyber-situational awareness, discover new cyber-knowledge, visualize networks, data flow, and attack paths, and understand data correlations by aggregating and fusing data. This paper reviews the most prominent graph-based data models used in this domain, along with knowledge organization systems that define concepts and properties utilized in formal cyber-knowledge representation for both background knowledge and specific expert knowledge about an actual system or attack. It is also discussed how cybersecurity knowledge graphs enable machine learning and facilitate automated reasoning over cyber-knowledge
    corecore