498 research outputs found

    LSTM Networks for Detection and Classification of Anomalies in Raw Sensor Data

    To ensure the validity of sensor data, it must be thoroughly analyzed for various types of anomalies. Traditional machine learning methods of anomaly detection in sensor data are based on domain-specific feature engineering. A typical approach is to use domain knowledge to analyze sensor data and manually create statistics-based features, which are then used to train machine learning models to detect and classify the anomalies. Although this methodology is used in practice, it has a significant drawback: feature extraction is usually labor intensive and requires considerable effort from domain experts. An alternative approach is to use deep learning algorithms. Research has shown that modern deep neural networks are very effective at automatically extracting abstract features from raw data in classification tasks. Long short-term memory networks, or LSTMs for short, are a special kind of recurrent neural network capable of learning long-term dependencies. These networks have proved to be especially effective in the classification of raw time-series data in various domains. This dissertation systematically investigates the effectiveness of the LSTM model for anomaly detection and classification in raw time-series sensor data. As a proof of concept, this work used time-series data from sensors that measure blood glucose levels. A large number of time-series sequences were created based on a genuine medical diabetes dataset. Anomalous series were constructed by six methods that interspersed patterns of common anomaly types in the data. An LSTM network model was trained with k-fold cross-validation on both anomalous and valid series to classify raw time-series sequences into one of seven classes: non-anomalous, and classes corresponding to each of the six anomaly types.
As a control, the detection and classification accuracy of the LSTM was compared to that of four traditional machine learning classifiers: support vector machines, Random Forests, naive Bayes, and shallow neural networks. The performance of all the classifiers was evaluated on nine metrics: precision, recall, and F1-score, each measured with micro, macro, and weighted averaging. While the traditional models were trained on feature vectors derived from the raw data using knowledge of common sources of anomaly, the LSTM was trained on raw time-series data. Experimental results indicate that the performance of the LSTM was comparable to that of the best traditional classifiers, achieving 99% on all nine metrics. The model requires no labor-intensive feature engineering, and the fine-tuning of its architecture and hyper-parameters can be fully automated. This study, therefore, finds LSTM networks an effective solution to anomaly detection and classification in sensor data.
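
The nine metrics named in this abstract (precision, recall, and F1-score under micro, macro, and weighted averaging) can be illustrated with a minimal sketch. This is not the dissertation's code: the function name `averaged_scores` and the toy labels are assumptions made for illustration, and the definitions follow the commonly used conventions for multi-class averaging.

```python
from collections import Counter

def averaged_scores(y_true, y_pred):
    """Return {average: (precision, recall, f1)} for micro, macro, weighted.

    Hypothetical helper for illustration; single-label multi-class only.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p gains a false positive
            fn[t] += 1  # true class t gains a false negative
    support = Counter(y_true)

    def prf(tp_, fp_, fn_):
        prec = tp_ / (tp_ + fp_) if tp_ + fp_ else 0.0
        rec = tp_ / (tp_ + fn_) if tp_ + fn_ else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        return prec, rec, f1

    per_class = {c: prf(tp[c], fp[c], fn[c]) for c in labels}
    # Micro: pool the counts over all classes, then compute once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of the per-class scores.
    macro = tuple(
        sum(per_class[c][i] for c in labels) / len(labels) for i in range(3)
    )
    # Weighted: mean of per-class scores weighted by class support.
    n = len(y_true)
    weighted = tuple(
        sum(per_class[c][i] * support[c] for c in labels) / n for i in range(3)
    )
    return {"micro": micro, "macro": macro, "weighted": weighted}

# Toy example with three classes (illustrative labels, not data from the study).
scores = averaged_scores([0, 0, 0, 1, 1, 2], [0, 0, 1, 1, 1, 2])
```

With imbalanced classes the three averages can differ noticeably, which is one reason to report all of them rather than a single accuracy figure.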

    Multi-head CNN–RNN for multi-time series anomaly detection: An industrial case study

    Detecting anomalies in time series data is becoming mainstream in a wide variety of industrial applications in which sensors monitor expensive machinery. The complexity of this task increases when multiple heterogeneous sensors provide information of different nature, scales, and frequencies from the same machine. Traditionally, machine learning techniques require separate data pre-processing before training, which tends to be very time-consuming and often requires domain knowledge. Recent deep learning approaches have been shown to perform well on raw time series data, eliminating the need for pre-processing. In this work, we propose a deep learning based approach for supervised multi-time series anomaly detection that combines a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) in different ways. Unlike other approaches, we use independent CNNs, so-called convolutional heads, to deal with anomaly detection in multi-sensor systems. We address each sensor individually, avoiding the need for data pre-processing and allowing for a more tailored architecture for each type of sensor. We refer to this architecture as Multi-head CNN-RNN. The proposed architecture is assessed against a real industrial case study, provided by an industrial partner, in which a service elevator is monitored. Within this case study, three types of anomalies are considered: point, context-specific, and collective. The experimental results show that the proposed architecture is suitable for multi-time series anomaly detection, as it obtained promising results on the real industrial scenario.

    Comparison of LSTM and Transformer Neural Network on multiple approaches for weblogs attack detection

    This work discusses and compares different approaches and neural networks for sequence classification in the context of attack detection in web services. The first approach to detecting attacks through log classification is to build character-based classification models. The second approach builds language models that predict the probability of the next character in a sequence and, combined with a technique for computing thresholds on those probabilities, classify logs to detect attacks. Both approaches were implemented with LSTM neural networks, which are common in sequence processing, as well as with Transformer neural networks. Transformer networks have achieved very good results in machine translation systems and similar natural language processing problems, but their use in log-based attack detection has not been explored. To present the comparisons of approaches and neural networks, the state of the art and the approaches to be applied were analyzed, and multiple experiments were carried out. These experiments involved developing code for the analysis, transformation, and preparation of the data sets, as well as for training and evaluating the models and classifications. Finally, conclusions are drawn on the use of each approach and neural network, and future work is proposed that could improve on the project and answer questions that arose during it. Agencia Nacional de Investigación e Innovación

    Recurrent Neural Network Architectures Toward Intrusion Detection

    Recurrent Neural Networks (RNNs) show remarkable results in sequence learning, particularly in architectures with gated unit structures such as Long Short-term Memory (LSTM). In recent years, several variations of the LSTM architecture have been proposed, mainly to overcome the computational complexity of LSTM. This dissertation presents a novel study that empirically investigates and evaluates LSTM architecture variants such as Gated Recurrent Unit (GRU), Bi-Directional LSTM, and Dynamic-RNN for LSTM and GRU, specifically for detecting network intrusions. The investigation is designed to identify the learning time required by each architecture and to measure the intrusion prediction accuracy. Each architecture was evaluated on the DARPA/KDD Cup’99 intrusion detection dataset. Feature selection mechanisms, in this case Principal Component Analysis (PCA) and the RandomForest (RF) algorithm, were also implemented to help identify and remove nonessential variables that do not affect the accuracy of the prediction models. The results showed that RF captured more significant features than PCA: accuracy with RF was 97.86% for LSTM and 96.59% for GRU, whereas with PCA it was 64.34% for LSTM and 67.97% for GRU. In terms of RNN architectures, the prediction accuracy of each variant improved at specific parameter settings, yet with a large dataset and sufficient training time, the standard vanilla LSTM tended to lead all other RNN architectures, scoring 99.48%. Although Dynamic RNNs offered better accuracy (Dynamic-RNN GRU scored 99.34%), they tended to take longer to train at high numbers of training cycles: Dynamic-RNN LSTM needed 25284.03 seconds at 1000 training cycles.
The GRU architecture, a variant introduced to reduce LSTM complexity, has fewer parameters, resulting in a faster-trained model: it needed 1903.09 seconds where LSTM required 2354.93 seconds for the same number of training cycles. It also showed equivalent performance with respect to parameters such as hidden layers and time-steps. BLSTM offered an impressive training time of 190 seconds at 100 training cycles, though its accuracy, which did not exceed 90%, was below that of the other RNN architectures.
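
The claim that a GRU has fewer parameters than an LSTM can be checked with a short calculation. The sketch below assumes the textbook formulations (four gate-like blocks for LSTM, three for GRU, one bias vector per block); specific frameworks may add extra bias terms, so their counts can differ slightly. The function names and the layer sizes are illustrative assumptions, not values from the dissertation.

```python
def lstm_params(input_dim, hidden_dim):
    """Parameter count of one textbook LSTM layer.

    Four blocks (input, forget, and output gates plus the cell candidate),
    each with input weights, recurrent weights, and one bias vector.
    """
    per_block = input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim
    return 4 * per_block

def gru_params(input_dim, hidden_dim):
    """Parameter count of one textbook GRU layer.

    Three blocks (update gate, reset gate, candidate state).
    """
    per_block = input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim
    return 3 * per_block

# Illustrative sizes (assumed): 41 input features, 64 hidden units.
n_lstm = lstm_params(41, 64)
n_gru = gru_params(41, 64)
ratio = n_gru / n_lstm  # 3/4 under these formulations
```

Under these assumptions a GRU layer carries three quarters of the parameters of an LSTM layer of the same size, which is consistent in direction with the shorter GRU training time reported above.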

    Adversarially Reweighted Sequence Anomaly Detection With Limited Log Data

    In the realm of safeguarding digital systems, the ability to detect anomalies in log sequences is paramount, with applications spanning cybersecurity, network surveillance, and financial transaction monitoring. This thesis presents AdvSVDD, a deep learning model designed for sequence anomaly detection. Built upon Deep Support Vector Data Description (Deep SVDD), AdvSVDD incorporates Adversarial Reweighted Learning (ARL) to enhance its performance, particularly when confronted with limited training data. By using the Deep SVDD technique to map normal log sequences into a hypersphere and harnessing the amplification effects of Adversarial Reweighted Learning, AdvSVDD demonstrates remarkable efficacy in anomaly detection. Empirical evaluations on the BlueGene/L (BG/L) and Thunderbird supercomputer datasets show AdvSVDD’s superiority over conventional machine learning and deep learning approaches, including the foundational Deep SVDD framework. Performance is measured by Precision, Recall, F1-Score, ROC AUC, and PR AUC. Furthermore, the study emphasizes AdvSVDD’s effectiveness under constrained training data and offers valuable insights into the role the adversarial component plays in enhancing anomaly detection.
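
The hypersphere idea behind Deep SVDD can be sketched in a few lines. This is a simplified illustration, not the AdvSVDD model: it assumes the sequence embeddings have already been produced by some encoder, omits the neural network training and the adversarial reweighting entirely, and uses hypothetical function names and toy vectors.

```python
def fit_center(embeddings):
    """Center of the hypersphere: mean of the normal-sequence embeddings."""
    dim = len(embeddings[0])
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]

def anomaly_score(embedding, center):
    """Squared Euclidean distance to the center; larger means more anomalous."""
    return sum((x - c) ** 2 for x, c in zip(embedding, center))

def fit_radius_sq(embeddings, center, quantile=0.95):
    """Squared radius chosen as a quantile of the normal scores (an assumption)."""
    scores = sorted(anomaly_score(e, center) for e in embeddings)
    idx = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[idx]

# Toy 2-D embeddings standing in for encoded normal log sequences.
normal = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
center = fit_center(normal)
radius_sq = fit_radius_sq(normal, center)

# A sequence is flagged when its embedding falls outside the hypersphere.
is_anomalous = anomaly_score([5.0, 5.0], center) > radius_sq
```

In the full method the encoder is trained so that normal sequences land close to the center; the sketch only shows how a distance-based score and threshold would be applied once embeddings exist.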

    Optimizing Bayesian Recurrent Neural Networks on an FPGA-based Accelerator

    Neural networks have demonstrated outstanding performance in a wide range of tasks. Specifically, recurrent architectures based on long short-term memory (LSTM) cells have shown excellent capability to model time dependencies in real-world data. However, standard recurrent architectures cannot estimate their uncertainty, which is essential for safety-critical applications such as medicine. In contrast, Bayesian recurrent neural networks (RNNs) are able to provide uncertainty estimation with improved accuracy. Nonetheless, Bayesian RNNs are computationally and memory demanding, which limits their practicality despite their advantages. To address this issue, we propose an FPGA-based hardware design to accelerate Bayesian LSTM-based RNNs. To further improve the overall algorithmic-hardware performance, a co-design framework is proposed to explore the most fitting algorithmic-hardware configurations for Bayesian RNNs. We conduct extensive experiments on healthcare applications to demonstrate the improvement of our design and the effectiveness of our framework. Compared with a GPU implementation, our FPGA-based design can achieve up to 10 times speedup with nearly 106 times higher energy efficiency. To the best of our knowledge, this is the first work targeting acceleration of Bayesian RNNs on FPGAs.