11 research outputs found

    On the Detection Capabilities of Signature-Based Intrusion Detection Systems in the Context of Web Attacks

    Get PDF
    This work has been partly funded by the research grant PID2020-115199RB-I00 provided by the Spanish Ministry of Industry under the contract MICIN/AEI/10.13039/501100011033, and also by FEDER/Junta de Andalucia-Consejeria de Transformacion Economica, Industria, Conocimiento y Universidades under project PYC20-RE-087-USE.

    Signature-based Intrusion Detection Systems (SIDS) play a crucial role within the arsenal of security components of most organizations. They can find traces of known attacks in network traffic or host events for which patterns or signatures have been pre-established. SIDS include standard packages of detection rulesets, but only those rules suited to the operational environment should be activated for optimal performance. However, some organizations might skip this tuning process and instead activate default off-the-shelf rulesets without understanding their implications and trade-offs. In this work, we help gain insight into the consequences of using predefined rulesets on the performance of SIDS. We experimentally explore the performance of three SIDS in the context of web attacks. In particular, we gauge the detection rate obtained with predefined subsets of rules for Snort, ModSecurity and Nemesida using seven attack datasets. We also determine the precision and alert rate of each detector in a real-life case using a large trace from a public web server. Results show that the maximum detection rate achieved by the SIDS under test is insufficient to protect systems effectively and is lower than expected for known attacks. Our results also indicate that the choice of predefined settings activated on each detector strongly influences its detection capability and false alarm rate. Snort and ModSecurity scored either a very poor detection rate (activating the least sensitive predefined ruleset) or a very poor precision (activating the full ruleset).
    We also found that using several SIDS for a cooperative decision can improve the precision or the detection rate, but not both. Consequently, it is necessary to reflect on the role of these open-source SIDS with default configurations as core elements of protection in the context of web attacks. Finally, we provide an efficient method for systematically determining which rules to deactivate from a ruleset to significantly reduce the false alarm rate for a target operational environment. We tested our approach using Snort's ruleset on our real-life trace, increasing the precision from 0.015 to 1 in less than 16 hours of work.
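The abstract does not detail the rule-deactivation method, so the following is only a minimal sketch of one plausible greedy variant, assuming labeled alerts from an evaluation trace are available; the function name, data layout, and precision target are all hypothetical:

```python
from collections import Counter

def rules_to_deactivate(alerts, labels, target_precision=0.99):
    """Greedy sketch: deactivate the rules whose alerts are dominated by
    false positives until the remaining alerts reach the target precision.

    alerts: list of (rule_id, event_id) pairs.
    labels: dict event_id -> bool (True means the event is a real attack).
    Returns the set of rule ids to deactivate.
    """
    tp, fp = Counter(), Counter()  # true/false positives per rule
    for rule, event in alerts:
        (tp if labels[event] else fp)[rule] += 1
    active = set(tp) | set(fp)
    deactivated = set()
    while active:
        total_tp = sum(tp[r] for r in active)
        total_fp = sum(fp[r] for r in active)
        if total_tp + total_fp == 0:
            break
        if total_tp / (total_tp + total_fp) >= target_precision:
            break  # precision goal reached with the remaining rules
        # Switch off the rule contributing the most false positives
        # relative to its true positives.
        worst = max(active, key=lambda r: fp[r] - tp[r])
        deactivated.add(worst)
        active.remove(worst)
    return deactivated
```

A ruleset tuned this way trades a possible small loss of detection (rules with some true positives may be dropped) for a large reduction in false alarms.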

    How Much Training Data Is Enough? A Case Study for HTTP Anomaly-Based Intrusion Detection

    Get PDF
    Most anomaly-based intrusion detectors rely on models that learn from training datasets, whose quality is crucial to their performance. Although the properties of suitable datasets have been formulated, the influence of dataset size on the performance of an anomaly-based detector has received scarce attention so far. In this work, we investigate the optimal size of a training dataset. This size should be large enough that the training data are representative of normal behavior; beyond that point, collecting more data may waste time and computational resources, not to mention increase the risk of overtraining. In this spirit, we provide a method to find out when the amount of data collected in the production environment is representative of normal behavior, in the context of a detector of HTTP URI attacks based on 1-grams. Our approach is founded on a set of indicators related to the statistical properties of the data. These indicators are periodically calculated during data collection, producing time series that stabilize when more training data is not expected to translate into better system performance, which indicates that data collection can be stopped. We present a case study with real-life datasets collected at the University of Seville (Spain) and a public dataset from the University of Saskatchewan. The application of our method to these datasets showed that more than 42% of one trace, and almost 20% of another, were unnecessarily collected, thereby showing that our proposed method can be an efficient approach for collecting training data in the production environment. This work was supported in part by the Corporación Tecnológica de Andalucía and the University of Seville through the Projects under Grant CTA 1669/22/2017, Grant PI-1786/22/2018, and Grant PI-1736/22/2017.
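As an illustration of the stopping idea, the sketch below tracks one hypothetical indicator, the growth of the set of distinct 1-grams (characters) seen in the collected URIs, and stops collection once that time series stabilizes; the window size and threshold are illustrative assumptions, not values from the paper:

```python
def enough_training_data(requests, window=1000, epsilon=0.01):
    """Stop-criterion sketch: collect HTTP URIs until the set of distinct
    1-grams (characters) stops growing, i.e. the data looks representative.

    requests: iterable of URI strings.
    Returns the number of requests after which collection could have
    stopped, or None if the indicator never stabilizes.
    """
    seen = set()
    last_size = 0
    for i, uri in enumerate(requests, start=1):
        seen.update(uri)               # 1-grams = individual characters
        if i % window == 0:            # sample the indicator periodically
            growth = (len(seen) - last_size) / max(len(seen), 1)
            if growth < epsilon:       # the time series has stabilized
                return i
            last_size = len(seen)
    return None
```

In a real deployment, several such indicators (e.g. on URI lengths or token frequencies) would be monitored jointly rather than a single one.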

    Fusing Information from Tickets and Alerts to Improve the Incident Resolution Process

    Get PDF
    In the context of network incident monitoring, alerts are useful notifications that provide IT management staff with information about incidents. They are usually triggered automatically by network equipment and monitoring systems, and thus contain only the technical information available to the systems generating them. Ticketing systems play a different role in this context: tickets represent the business point of view of incidents. They are usually generated by human intervention and contain enriched semantic information about ongoing and past incidents. In this article, our main hypothesis is that incorporating ticket information into the alert correlation process benefits the incident resolution life-cycle in terms of accuracy, timing, and overall incident description. We propose a methodology to validate this hypothesis and suggest a solution to the main challenges that appear. The proposed correlation approach is based on the time alignment of the events (alerts and tickets) that affect common elements in the network. For this, we use real alert and ticket datasets obtained from a large telecommunications network. The results show that using ticket information enhances the incident resolution process, mainly by reducing and aggregating a higher percentage of alerts compared with standard alert correlation systems that use only alerts as their main source of information. Finally, we also show the applicability and usability of this model by applying it to a case study in which we analyze the performance of the management staff.
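A minimal sketch of the time-alignment idea, assuming alerts and tickets carry timestamps and the identifier of the network element they affect; the data layout, function name, and correlation window are hypothetical, not taken from the paper:

```python
from datetime import datetime, timedelta

def correlate(alerts, tickets, window=timedelta(minutes=30)):
    """Attach each alert to a ticket about the same network element whose
    timestamp lies within `window` of the alert timestamp.

    alerts:  list of (timestamp, element_id).
    tickets: list of (timestamp, element_id, ticket_id).
    Returns a dict alert_index -> ticket_id; unmatched alerts are omitted.
    """
    matches = {}
    for i, (a_ts, a_el) in enumerate(alerts):
        for t_ts, t_el, t_id in tickets:
            # Correlate only events that affect a common element
            # and are close enough in time.
            if a_el == t_el and abs(a_ts - t_ts) <= window:
                matches[i] = t_id
                break
    return matches
```

Alerts grouped under the same ticket can then be aggregated into a single enriched incident instead of being handled individually.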

    Smart home anomaly-based IDS: Architecture proposal and case study

    No full text
    The complexity and diversity of the technologies involved in the Internet of Things (IoT) challenge the generalization of security solutions based on anomaly detection, which should fit the particularities of each context and deployment and allow for performance comparison. In this work, we provide a flexible architecture based on building blocks suited for detecting anomalies in the network traffic and the application-layer data exchanged by IoT devices in the context of the Smart Home. Following this architecture, we have defined a particular Intrusion Detection System (IDS) for a case study that uses a public dataset with the electrical consumption of 21 home devices over one year. In particular, we have defined ten Indicators of Compromise (IoC) to detect network attacks and two anomaly detectors to detect false command or data injection attacks. We have also included a signature-based IDS (Snort) to extend the detection range to known attacks. We have reproduced eight network attacks (e.g., DoS, scanning) and four false command or data injection attacks to test the performance of our IDS. The results show that all attacks were successfully detected by our IoCs and anomaly detectors with a false positive rate lower than 0.3%. Signature detection was able to detect only 4 out of 12 attacks. Our architecture and the IDS developed can serve as a reference for developing future IDS suited to different contexts or use cases. Given that we use a public dataset, our contribution can also serve as a baseline for comparison with new techniques that improve detection performance.
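As a toy illustration of an anomaly check over per-device consumption data, the sketch below flags devices whose readings deviate strongly from a learned baseline; the threshold rule and data layout are assumptions for illustration, not one of the paper's ten IoCs:

```python
def consumption_alerts(readings, baseline, k=3.0):
    """Flag devices whose electrical consumption deviates from a per-device
    baseline by more than k standard deviations.

    readings: dict device -> current consumption value.
    baseline: dict device -> (mean, std) learned from normal operation.
    Returns the list of devices that trigger the indicator.
    """
    flagged = []
    for device, value in readings.items():
        mean, std = baseline[device]
        if std > 0 and abs(value - mean) > k * std:
            flagged.append(device)  # deviation too large: possible injection
    return flagged
```

A false data injection that pushes a device's reported consumption far outside its normal range would be caught by such a check, while slow drifts would require the richer detectors described above.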

    Validación de un sistema de diálogo mediante el uso de diferentes umbrales de poda en el proceso de reconocimiento automático de voz

    Get PDF
    This paper presents a new methodology to validate the performance of dialogue systems, focusing on two measures: response time and percentage of sentence understanding. First, the paper describes the input interface of the dialogue system used in the experiments, including a classification of the recognition tasks considered. It then presents the fundamentals of the proposed technique and shows how it has been applied to validate the performance of the dialogue system. Next, it presents the experimental results, which indicate, on the one hand, that six of the nine recognition tasks designed can be considered validated, since the requirements imposed on recognition time and percentage of sentence understanding are met. On the other hand, the results indicate that, to validate the performance of the system, the strategies used in the remaining three tasks need to be changed. Finally, the paper outlines some lines of future work aimed at using new strategies to improve the performance of the dialogue system.
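The validation criterion can be illustrated with a trivial check: a recognition task is accepted only if both requirements hold. The threshold values below are placeholders, not the ones used in the paper:

```python
def task_valid(avg_response_time, understanding_rate,
               max_time=3.0, min_understanding=0.9):
    """A recognition task is considered validated when both the
    response-time limit (seconds) and the sentence-understanding
    requirement (fraction of sentences understood) are met."""
    return avg_response_time <= max_time and understanding_rate >= min_understanding
```

Applying such a check per task reproduces the paper's split between validated tasks and those whose recognition strategies must be redesigned.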

    Resultados preliminares sobre SLHMM

    No full text
    This work proposes a new hybrid system for continuous speech recognition that integrates HMM and ANN. The system is composed of three classes of blocks (LVQ, SLHMM and DP), all of them neural networks, although the blocks called SLHMM can be interpreted and trained as HMM. An SLHMM is, essentially, the expansion of a recurrent neural network with a suitable topology into a network with a fixed number of layers. Some preliminary experimental results are presented which, compared with those obtained from a system based solely on HMM, show an increase in system performance due simply to the topology used.

    Entrenamiento discriminativo para HMM utilizando redes neuronales recurrentes

    No full text
    This paper presents the results obtained from a network structure for Hidden Markov Models applied to speech recognition. The network topology is that of a Recurrent Neural Network in which each time step is identified with a layer. The network is trained using backpropagation techniques. Two types of error measures are used for training: maximum likelihood and discriminative training. Applying backpropagation to the re-estimation of the HMM-RNN in the maximum-likelihood case yields the same re-estimation equations as the Baum-Welch algorithm used to train HMM. Discriminative training is based on the probability of correct classification of the sequences, derived from the maximum-likelihood measure. The results show that the best procedure for training the RNN-HMM is to perform a first estimation using the maximum-likelihood measure and then retrain the models using the discriminative training algorithm.