
    ALGAN: Time Series Anomaly Detection with Adjusted-LSTM GAN

    Anomaly detection in time series data, which identifies points that deviate from normal behaviour, is a common problem in domains such as manufacturing, medical imaging, and cybersecurity. Recently, Generative Adversarial Networks (GANs) have been shown to be effective at detecting anomalies in time series data. The neural network architecture of a GAN (i.e. its Generator and Discriminator) can significantly improve anomaly detection accuracy. In this paper, we propose a new GAN model, named Adjusted-LSTM GAN (ALGAN), which adjusts the output of an LSTM network for improved anomaly detection in both univariate and multivariate time series data in an unsupervised setting. We evaluate the performance of ALGAN on 46 real-world univariate time series datasets and a large multivariate dataset that spans multiple domains. Our experiments demonstrate that ALGAN outperforms traditional, neural network-based, and other GAN-based methods for anomaly detection in time series data.

    MIM-GAN-based Anomaly Detection for Multivariate Time Series Data

    The loss function of a generative adversarial network (GAN) is an important factor affecting the quality and diversity of the samples generated for anomaly detection. In this paper, we propose an unsupervised anomaly detection algorithm for multivariate time series based on a GAN with a message importance measure (MIM-GAN). In particular, the time series data is divided into subsequences using a sliding window. A generator and a discriminator based on Long Short-Term Memory (LSTM) networks are then employed to capture the temporal correlations of the time series data. To avoid local optima of the loss function and model collapse, we introduce an exponential information measure into the loss function of the GAN. Additionally, a discriminant reconstruction score consisting of discrimination and reconstruction losses is taken into account. The global optimal solution of the loss function is derived, and model collapse is proved to be avoided in our proposed MIM-GAN-based anomaly detection algorithm. Experimental results show that the proposed MIM-GAN-based anomaly detection algorithm has superior performance in terms of precision, recall, and F1 score. Comment: 7 pages, 6 figures
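The windowing and scoring steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the stride, the weight `lam`, and the convex-combination form of the score are all assumptions, and the reconstructions and discriminator outputs are taken as given arrays rather than produced by trained LSTM networks.

```python
import numpy as np

def sliding_windows(series, window, stride=1):
    """Split a (T, d) series into overlapping subsequences of shape (window, d)."""
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]  # treat a univariate series as (T, 1)
    starts = range(0, series.shape[0] - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])

def dr_score(x, x_rec, disc_prob_real, lam=0.5):
    """Combine reconstruction and discrimination losses per window.

    x, x_rec:        arrays of shape (n_windows, window, d)
    disc_prob_real:  discriminator probability that each window is real
    lam:             hypothetical weight between the two terms (assumption)
    """
    rec_loss = np.abs(np.asarray(x) - np.asarray(x_rec)).mean(axis=(1, 2))
    disc_loss = 1.0 - np.asarray(disc_prob_real, dtype=float)
    return lam * rec_loss + (1.0 - lam) * disc_loss
```

Windows with a higher score are treated as more anomalous; in practice the threshold on the score would be tuned on held-out data.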

    GANs for anomaly detection in time series: a case study

    The general problem of anomaly detection arises in many fields and is closely related to numerous specific problems. The usual fully unsupervised formulation creates additional difficulties in obtaining representations relevant to the problem and restricts the applicable methods. In this context, the great recent success of GAN-based solutions in modeling arbitrary distributions and processes from unsupervised data raises strong interest in applying them to anomaly detection. To address this topic, the application of GAN-based solutions to unsupervised anomaly detection in time series was studied. Starting from a literature review of the general principles of GANs and anomaly detection, recent works applying GANs to time series were compiled and presented. Next, a specific method, TadGan (GEIGER et al., 2020), was selected for experimentation and in-depth study in the form of a case study. An implementation was obtained and verified, and a methodology for demonstrating the operation and general principles of the method, and of the application of GANs to time series, on data synthesized from analytical functions was developed and executed. Possible limitations of the method, drawn from the literature and proposed on the basis of the experiments performed, were then evaluated. Training instability was explored, as were the possible impacts of entropy and of characteristics of the process of interest on the ability to detect anomalies. Signals were then synthesized with specific types of anomalies added, in order to verify the generality of the method with respect to the nature of the anomalies, and a collection of real signals from diverse domains was compiled from the UCR Anomaly Benchmark so that the method could be applied to them.
Finally, changes to the method were proposed, with alternative ways of quantifying abnormality from the models obtained, and briefly evaluated. The results made it possible to verify and corroborate the broad applicability of GANs to anomaly detection in time series, as well as the usefulness of experimenting with analytical synthetic data for building understanding and validating modifications. Exploring the limitations allowed intuitions to be developed about their impact on the method and suggested that characteristics of the target process may influence performance; the proposed modifications showed potential for performance gains and pointed to the need for further in-depth studies.
The general problem of unsupervised anomaly detection in time series has applications in several different fields and is related to many specific problems. In the context of time series data, however, expert knowledge of the target application is often required in order to extract meaningful features of the process, which can be expensive and at times impossible. The field of Deep Learning provides techniques to tackle such problems through automatic feature extraction, and these show great potential in time series anomaly detection. The need for labeled data, however, restricts the direct application of several methods. GAN-based solutions have recently shown great performance in modeling arbitrary data distributions in unsupervised problems, giving them considerable conceptual potential in anomaly detection. In that context, with the goal of exploring the potential and applicability of GAN-based solutions for time series anomaly detection, the literature was reviewed for GAN and anomaly detection principles, and recent works specifically on GAN-based methods for time series anomaly detection were summarized and presented.
Next, a method, TadGan (GEIGER et al., 2020), was selected for detailed investigation and exploration because it embodies the main principles of applying GANs to anomaly detection and reports good performance on public benchmarks. An implementation of the method was obtained and verified through a partial reproduction of the original article's results. A series of experiments on data synthetically generated from analytical functions was then proposed and executed in order to verify the method's principles in a controlled environment, as well as to build intuitions about possible limitations. Limitations raised in the literature were then explored, and a new limitation, based on the influence of signal entropy on the method's performance, was informally formulated and investigated. Time series containing different types of anomalies were then synthesized in order to verify the method's generality with respect to the nature of the anomalies, and data from real applications, compiled from the UCR Anomaly Benchmark, were applied to the method. Finally, some modifications and suggestions for new scores derived from the method were presented, implemented, and superficially analyzed. The results confirmed the great potential of GAN-based techniques for unsupervised anomaly detection, as well as the benefits of exploring the method on synthetic data. The experiments showed evidence of the explored limitations, in particular the influence of the target process entropy, and the proposed metrics showed potential for improvement and the need for further investigation.

    Generative adversarial networks for detecting contamination events in water distribution systems using multi-parameter, multi-site water quality monitoring

    This is the final version. Available on open access from Elsevier via the DOI in this record. Contamination events in water distribution networks (WDNs) can have a huge impact on water supply and public health; increasingly, online water quality sensors are deployed for real-time detection of contamination events. Machine learning has been used to integrate multivariate time series water quality data at multiple stations for contamination detection; however, accurate extraction of spatial features in water quality signals remains challenging. This study proposed a contamination detection method based on generative adversarial networks (GANs). The GAN model was constructed to simultaneously consider the spatial correlation between sensor locations and the temporal information of water quality indicators. The model consists of two networks, a generator and a discriminator, whose outputs are used to measure the degree of abnormality of water quality data at each time step, referred to as the anomaly score. Bayesian sequential analysis is used to update the likelihood of event occurrence based on the anomaly scores. Alarms are then generated from the fusion of single-site and multi-site models. The proposed method was tested on a WDN for various contamination events with different characteristics. Results showed high detection performance by the proposed GAN method compared with the minimum volume ellipsoid benchmark method for various contamination amplitudes. Additionally, the GAN method achieved high accuracy for contamination events with different amplitudes and numbers of anomalous water quality parameters, and for water quality data from different sensor stations, highlighting its robustness and potential for practical application to real-time contamination events. Funding: National Natural Science Foundation of China; Fundamental Research Funds for the Central Universities; Royal Society
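The Bayesian sequential step mentioned in this abstract can be illustrated with a small sketch. It assumes a common formulation in which each anomaly score is thresholded into an outlier flag and the event probability is updated with assumed true-positive and false-positive rates; the threshold, `tpr`, `fpr`, the prior, and the clipping bounds are hypothetical values, not figures from the study.

```python
def update_event_probability(prior, is_outlier, tpr=0.9, fpr=0.1):
    """One Bayes update of P(event) given whether the current step is flagged."""
    if is_outlier:
        num = tpr * prior
        den = tpr * prior + fpr * (1.0 - prior)
    else:
        num = (1.0 - tpr) * prior
        den = (1.0 - tpr) * prior + (1.0 - fpr) * (1.0 - prior)
    return num / den

def event_probabilities(scores, threshold, prior=0.01, tpr=0.9, fpr=0.1,
                        floor=1e-3, ceil=0.999):
    """Run the update over a sequence of anomaly scores.

    The probability is clipped to [floor, ceil] (an assumption) so that a long
    run of normal readings cannot make the detector unresponsive.
    """
    probs = []
    p = prior
    for s in scores:
        p = update_event_probability(p, s > threshold, tpr, fpr)
        p = min(max(p, floor), ceil)
        probs.append(p)
    return probs
```

An alarm would then be raised when the running probability exceeds a chosen level; consecutive flagged steps push the probability up quickly, while isolated spikes decay.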

    A Novel GAN-based Fault Diagnosis Approach for Imbalanced Industrial Time Series

    This paper proposes a novel fault diagnosis approach based on generative adversarial networks (GANs) for imbalanced industrial time series, where normal samples greatly outnumber failure cases. We combine a well-designed feature extractor with a GAN to help train the whole network. To capture the data distribution and hidden patterns in both the original distinguishing features and the latent space, an encoder-decoder-encoder structure of three sub-networks is employed in the GAN. It is based on the Deep Convolutional Generative Adversarial Network (DCGAN), but without the Tanh activation layer, and is trained only on normal samples. To verify the validity and feasibility of our approach, we test it on rolling bearing data from Case Western Reserve University and further verify it on data collected in our laboratory. The results show that our proposed approach achieves excellent performance in detecting faults, which it does by outputting much larger evaluation scores for faulty samples.

    DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN

    Recently, the introduction of the generative adversarial network (GAN) and its variants has enabled the generation of realistic synthetic samples, which have been used to enlarge training sets. Previous work primarily focused on data augmentation for semi-supervised and supervised tasks. In this paper, we instead focus on unsupervised anomaly detection and propose a novel generative data augmentation framework optimized for this task. In particular, we propose to oversample infrequent normal samples, that is, normal samples that occur with small probability, e.g., rare normal events. We show that these samples are responsible for false positives in anomaly detection. However, oversampling infrequent normal samples is challenging for real-world high-dimensional data with multimodal distributions. To address this challenge, we propose to use a GAN variant known as the adversarial autoencoder (AAE) to transform the high-dimensional multimodal data distributions into low-dimensional unimodal latent distributions with well-defined tail probability. Then, we systematically oversample at the 'edge' of the latent distributions to increase the density of infrequent normal samples. We show that our oversampling pipeline is a unified one: it is generally applicable to datasets with different complex data distributions. To the best of our knowledge, our method is the first data augmentation technique focused on improving performance in unsupervised anomaly detection. We validate our method by demonstrating consistent improvements across several real-world datasets. Comment: Published as a conference paper at ICDM 2018 (IEEE International Conference on Data Mining)
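The idea of oversampling at the edge of a unimodal latent distribution can be sketched as below. This is only an illustration under stated assumptions: the latent codes are taken to be centred at the origin (as with a standard Gaussian AAE prior), the edge is defined by a norm quantile, and new codes are drawn by resampling edge codes with small Gaussian jitter. The function name and all parameters are hypothetical, the abstract does not specify the actual sampling scheme, and the decoder that would map the codes back to data space is omitted.

```python
import numpy as np

def oversample_latent_edge(z, quantile=0.9, n_new=100, jitter=0.05, seed=0):
    """Draw new latent codes near the low-density edge of the latent cloud.

    z:        array of shape (n, k) with latent codes of normal samples
    quantile: codes whose norm is at or above this quantile count as "edge"
    jitter:   standard deviation of the Gaussian noise added to resampled codes
    """
    z = np.asarray(z, dtype=float)
    rng = np.random.default_rng(seed)
    radius = np.linalg.norm(z, axis=1)
    cutoff = np.quantile(radius, quantile)
    edge = z[radius >= cutoff]  # never empty: the largest radius always qualifies
    picks = edge[rng.integers(0, len(edge), size=n_new)]
    return picks + jitter * rng.standard_normal(picks.shape)
```

Decoding the returned codes would yield synthetic infrequent normal samples to add to the training set.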