
    Deep Baseline Network for Time Series Modeling and Anomaly Detection

    Deep learning has seen increasing application to time series in recent years. In time series anomaly detection scenarios, such as finance, the Internet of Things, and data center operations, time series usually exhibit very flexible baselines that depend on various external factors. Anomalies reveal themselves by lying far away from the baseline. However, detection is not always easy due to challenges including baseline shifting, the lack of labels, noise interference, real-time detection in streaming data, and result interpretability. In this paper, we develop a novel deep architecture, the Deep Baseline Network (DBLN), to properly extract the baseline from a time series. Using this deep network, we can easily locate the baseline position and then provide reliable and interpretable anomaly detection results. Empirical evaluation on both synthetic and public real-world datasets shows that our purely unsupervised algorithm achieves superior performance compared with state-of-the-art methods and has good practical applications.
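
The core idea of flagging points that lie far from a shifting baseline can be sketched without the deep network. Below is a minimal illustrative stand-in: a rolling-median baseline with a robust deviation threshold. The window size, threshold `k`, and all function names are assumptions for illustration, not details of DBLN.

```python
import numpy as np

def rolling_baseline(x, window=25):
    """Estimate a shifting baseline with a centered rolling median.

    A crude stand-in for the learned baseline; the window size is
    an illustrative assumption, not a value from the paper.
    """
    pad = window // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([np.median(xp[i:i + window]) for i in range(len(x))])

def detect_anomalies(x, window=25, k=4.0):
    """Flag points lying far from the baseline (k robust sigmas)."""
    base = rolling_baseline(x, window)
    resid = x - base
    # Robust scale estimate via the median absolute deviation (MAD).
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid))) + 1e-12
    return np.abs(resid) > k * scale

# Synthetic series: slow baseline drift, small noise, one injected spike.
t = np.linspace(0, 10, 500)
x = np.sin(0.5 * t) + 0.05 * np.random.default_rng(0).standard_normal(500)
x[250] += 3.0
flags = detect_anomalies(x)
```

    Because both the baseline and the scale are median-based, a single spike barely perturbs them, so the spike stands out cleanly in the residual.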

    Prototypes as Explanation for Time Series Anomaly Detection

    Detecting abnormal patterns that deviate from a regular repeating pattern in time series is essential in many big data applications. However, the lack of labels, the dynamic nature of time series data, and unforeseeable abnormal behaviors make the detection process challenging. Despite the success of recent deep anomaly detection approaches, the opaque mechanisms of such black-box models have become a new challenge in safety-critical applications. The lack of model transparency and prediction reliability hinders further breakthroughs in such domains. This paper proposes ProtoAD, which uses prototypes as example-based explanations of the regular patterns learned during anomaly detection. Without significantly impacting detection performance, prototypes shed light on deep black-box models and provide intuitive understanding for domain experts and stakeholders. We extend prototype learning, widely used in classification problems, to anomaly detection. By visualizing prototypes in both the latent space and the input space, we intuitively demonstrate how regular data are modeled and why specific patterns are considered abnormal.
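
The prototype mechanism can be illustrated independently of the deep model. The sketch below picks latent-space prototypes with plain k-means (farthest-point initialization) and scores a point by its distance to the nearest prototype, which doubles as an example-based explanation. ProtoAD learns prototypes jointly with the network, so this is a simplified stand-in; all names are illustrative.

```python
import numpy as np

def fit_prototypes(z_normal, k=2, iters=20):
    """Pick k latent-space prototypes: farthest-point init, then
    plain k-means on fixed embeddings (a simplified stand-in for
    ProtoAD's joint prototype learning)."""
    protos = [z_normal[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(z_normal - p, axis=1) for p in protos], axis=0)
        protos.append(z_normal[d.argmax()])
    protos = np.array(protos)
    for _ in range(iters):
        assign = np.linalg.norm(z_normal[:, None] - protos[None], axis=-1).argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                protos[j] = z_normal[assign == j].mean(axis=0)
    return protos

def anomaly_score(z, protos):
    """Distance to the nearest prototype; the index of that prototype
    serves as the example-based explanation of the score."""
    d = np.linalg.norm(z[:, None] - protos[None], axis=-1)
    return d.min(axis=1), d.argmin(axis=1)

# Two tight clusters of "normal" embeddings, then score two test points.
rng = np.random.default_rng(1)
z_normal = np.concatenate([rng.normal(0, 0.1, (50, 2)),
                           rng.normal(3, 0.1, (50, 2))])
protos = fit_prototypes(z_normal, k=2)
scores, nearest = anomaly_score(np.array([[0.0, 0.0], [10.0, 10.0]]), protos)
```

    A point near either regular pattern gets a low score; a point far from every prototype gets a high score, and the nearest prototype tells the analyst which regular pattern it least resembles.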

    LSTM based Anomaly Detection in Time Series for United States exports and imports

    This survey aims to offer a thorough and organized overview of research on anomaly detection, a significant problem that has been studied in various fields and application areas. Some anomaly detection techniques have been tailored for specific domains, while others are more general. Anomaly detection involves identifying unusual patterns or events in a dataset, which is important for a wide range of applications including fraud detection and medical diagnosis. Little research on anomaly detection techniques has been conducted in the field of economics and international trade. Therefore, this study analyzes time-series data of United Nations exports and imports for the period 1992–2022 using an LSTM-based anomaly detection algorithm. Deep learning methods, particularly LSTM networks, are becoming increasingly popular in anomaly detection tasks due to their ability to learn complex patterns in sequential data. This paper presents a detailed explanation of the LSTM architecture, including the role of the input, forget, and output gates in processing input vectors and hidden states at each timestep. The LSTM-based anomaly detection approach yields promising results by modelling short-term as well as long-term temporal dependencies.
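
The gate structure described above can be made concrete with a single LSTM timestep in NumPy. The weight shapes and the toy driving signal below are illustrative assumptions; a real model would be trained, e.g., to predict the next value, with large prediction errors flagged as anomalies.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h, c, W, U, b):
    """One LSTM timestep. W, U, b stack the input, forget, output
    and candidate transforms; layer sizes are illustrative."""
    z = W @ x + U @ h + b                  # (4H,) pre-activations
    H = h.shape[0]
    i = sigmoid(z[0:H])                    # input gate: admit new info
    f = sigmoid(z[H:2 * H])                # forget gate: keep old cell state
    o = sigmoid(z[2 * H:3 * H])            # output gate: expose cell state
    g = np.tanh(z[3 * H:4 * H])            # candidate cell state
    c_new = f * c + i * g                  # gated cell update
    h_new = o * np.tanh(c_new)             # new hidden state
    return h_new, c_new

# Run the cell over a toy sinusoidal sequence with random small weights.
rng = np.random.default_rng(0)
D, H = 1, 8
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(20):
    h, c = lstm_step(np.array([np.sin(0.3 * t)]), h, c, W, U, b)
```

    The forget gate is what lets the cell state carry long-term dependencies across many timesteps, while the input and output gates control how short-term information enters and leaves it.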

    Neural Sequence Analysis Toolbox

    Time series have always been of great interest in the financial sector, and today, with the advent of sensors and the IoT, they have received new attention. Their analysis is no longer carried out only with the linear methods of classical statistics; deep learning is revealing a new paradigm, with interesting performance on tasks such as forecasting sequences over time or searching for anomalous patterns that could represent failures of industrial apparatus. This work investigates strategies for time series preprocessing with splines and wavelets. Error-based methods and GANs for anomaly detection are also studied, and models such as sequence-to-sequence learning and attention mechanisms for forecasting are taken into consideration. Experiments were carried out to compare all these methodologies using public data from NASA and an air-pollution dataset (links are provided in the experiments chapter). For anomaly detection, the most promising approach was GANs. The problem of finding the number of timestamps over which reliable predictions can be obtained was also investigated, and it was formulated so that the neural network itself learns, during training, the length of the time horizon over which to make predictions. A toolbox has been produced that allows the user to preprocess multivariate time series and implement outlier detection or forecasting applications with the above methodologies.
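
As a small, self-contained taste of wavelet preprocessing, the sketch below implements one level of the Haar wavelet transform. Haar is the simplest wavelet, chosen here for brevity; the toolbox itself may use other spline and wavelet families.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar wavelet transform (length must be even).
    Approximation coefficients smooth the series; detail coefficients
    expose sharp local changes, useful both for preprocessing and as
    a crude anomaly indicator."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse transform: perfectly reconstructs the input."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

# Decompose a smooth signal and verify perfect reconstruction.
x = np.sin(np.linspace(0, 2 * np.pi, 64))
a, d = haar_dwt(x)
x_rec = haar_idwt(a, d)
```

    Because the transform is orthogonal, no information is lost: one can denoise by shrinking the detail coefficients and inverting, or recurse on the approximation coefficients for a multi-level decomposition.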

    Deep knowledge transfer for generalization across tasks and domains under data scarcity

    Over the last decade, deep learning approaches have achieved tremendous performance in a wide variety of fields, e.g., computer vision and natural language understanding, and across several sectors such as healthcare, industrial manufacturing, and driverless mobility. Most deep learning successes were accomplished in learning scenarios fulfilling the following two requirements. First, large amounts of data are available for training the deep learning model, and there are no access restrictions to the data. Second, the data used for training and testing are independent and identically distributed (i.i.d.). However, many real-world applications violate at least one of these requirements, which results in challenging learning problems. The present thesis comprises four contributions addressing four such learning problems. In each contribution, we propose a novel method and empirically demonstrate its effectiveness for the corresponding problem setting. The first part addresses the underexplored intersection of the few-shot learning and one-class classification problems. In this learning scenario, the model has to learn a new task using only a few examples from only the majority class, without overfitting to the few examples or to the majority class. This scenario arises in real-world anomaly detection applications where data is scarce. We propose an episode sampling technique that adapts meta-learning algorithms designed for class-balanced few-shot classification to the addressed few-shot one-class classification problem, by optimizing for a model initialization tailored to this scenario. In addition, we provide theoretical and empirical analyses investigating the need for second-order derivatives to learn such parameter initializations. Our experiments on 8 image and time-series datasets, including a real-world dataset of industrial sensor readings, demonstrate the effectiveness of our method. 
The second part tackles the intersection of the continual learning and anomaly detection problems, which, to the best of our knowledge, we are the first to explore. In this learning scenario, the model is exposed to a stream of anomaly detection tasks, i.e., tasks in which only examples from the normal class are available, which it has to learn sequentially. Such problem settings are encountered in anomaly detection applications where the data distribution continuously changes. We propose a meta-learning approach that learns parameter-specific initializations and learning rates suitable for continual anomaly detection. Our empirical evaluations show that a model trained with our algorithm is able to learn up to 100 anomaly detection tasks sequentially with minimal catastrophic forgetting and overfitting to the majority class. In the third part, we address the domain generalization problem, in which a model trained on several source domains is expected to generalize well to data from a previously unseen target domain, without any modification or exposure to its data. This challenging learning scenario is present in applications involving domain shift, e.g., different clinical centers using different MRI scanners or data acquisition protocols. We assume that learning to extract a richer set of features improves the transfer to a wider set of unknown domains. Motivated by this, we propose an algorithm that identifies the already learned features and corrupts them, hence enforcing new feature discovery. We leverage methods from the explainable machine learning literature to identify the features, and apply the targeted corruption on multiple representation levels, including the input data and high-level embeddings. Our extensive empirical evaluation shows that our approach outperforms 18 domain generalization algorithms on multiple benchmark datasets. 
The last part of the thesis addresses the intersection of domain generalization and data-free learning methods, which, to the best of our knowledge, we are the first to explore. Here, we address the learning scenario where a model robust to domain shift is needed, but only models trained on the same task in different domains are available instead of the original datasets. This scenario is relevant for any domain generalization application where access to the data of the source domains is restricted, e.g., due to data privacy concerns or intellectual property infringement. We develop an approach that extracts and fuses domain-specific knowledge from the available teacher models into a student model robust to domain shift, by generating synthetic cross-domain data. Our empirical evaluation demonstrates the effectiveness of our method, which outperforms ensemble and data-free knowledge distillation baselines. Most importantly, the proposed approach substantially reduces the gap between the best data-free baseline and the upper-bound baseline that uses the original private data.
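
The episode sampling idea from the first part can be sketched at a high level: each episode's support set contains only normal-class examples, while the query set mixes both classes so that the adapted model can be evaluated. The function below is an illustrative reconstruction, not the thesis code; all names and split sizes are assumptions.

```python
import numpy as np

def sample_oc_episode(X, y, normal_class, k_support=5, k_query=10, seed=None):
    """Sample one few-shot one-class episode: the support set draws
    only from the normal class; the query set mixes normal and
    anomalous examples for evaluating the inner-loop adaptation."""
    rng = np.random.default_rng(seed)
    normal_idx = np.flatnonzero(y == normal_class)
    other_idx = np.flatnonzero(y != normal_class)
    support = rng.choice(normal_idx, k_support, replace=False)
    query = np.concatenate([
        rng.choice(normal_idx, k_query // 2, replace=False),
        rng.choice(other_idx, k_query - k_query // 2, replace=False),
    ])
    return X[support], y[support], X[query], y[query]

# Toy dataset: 12 normal (label 0) and 8 anomalous (label 1) points.
X = np.arange(40, dtype=float).reshape(20, 2)
y = np.array([0] * 12 + [1] * 8)
Xs, ys, Xq, yq = sample_oc_episode(X, y, normal_class=0, seed=0)
```

    Feeding such one-class episodes to a meta-learner is what steers the learned initialization toward the few-shot one-class setting rather than the class-balanced one.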

    Investigation Of Multi-Criteria Clustering Techniques For Smart Grid Datasets

    The processing of data arising from connected smart grid technology is an important area of research for the next-generation power system. The volume of data allows for increased awareness and efficiency of operation but poses challenges for analyzing the data and turning it into meaningful information. This thesis showcases the utility of clustering algorithms applied to three separate smart grid datasets and analyzes their ability to improve awareness and operational efficiency. Hierarchical clustering is identified as an appropriate method for fault and anomaly detection in phasor measurement unit (PMU) datasets. It showed an increase in anomaly detection efficiency according to the Dunn Index (DI) and improved computational characteristics compared to currently employed techniques such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The efficacy of betweenness-centrality (BC) based clustering in a novel clustering scheme for determining microgrids from large-scale bus systems is demonstrated and compared against a multitude of other graph clustering algorithms. BC-based clustering showed an overall decrease in economic dispatch cost when compared to other methods of graph clustering. Additionally, the utility of BC for identifying critical buses was showcased. Finally, this work demonstrates the utility of partitional dynamic time warping (DTW) and k-shape clustering methods for classifying power demand profiles of households with and without electric vehicles (EVs). DTW time-series clustering was compared against other methods of time-series clustering and tested through demand forecasting using traditional and deep-learning techniques. Additionally, a novel process for selecting an optimal time-series clustering scheme based upon a scaled sum of cluster validity indices (CVIs) was developed. 
Forecasting schemes based on DTW and k-shape demand profiles showed an overall increase in forecast accuracy. In summary, the use of clustering methods for three distinct types of smart grid datasets is demonstrated. The use of clustering algorithms as a means of processing data can lead to overall methods that improve forecasting, economic dispatch, event detection, and overall system operation. Ultimately, the techniques demonstrated in this thesis give analytical insights and foster data-driven management and automation for smart grid power systems of the future.
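
The DTW distance underlying the partitional time-series clustering can be written down in a few lines. The sketch below implements the classic dynamic-programming recurrence and compares a phase-shifted demand-like profile against a flat one; the clustering loop itself (k-medoids-style assignment under this distance) is omitted, and all names are illustrative.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW distance.
    Unlike the Euclidean distance, DTW aligns series that are shifted
    or locally stretched in time before accumulating costs."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of match, insertion, or deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A demand-like profile, a phase-shifted copy, and a flat profile.
grid = np.linspace(0, 2 * np.pi, 50)
base = np.sin(grid)
shifted = np.sin(grid + 0.4)
flat = np.zeros(50)
```

    DTW rates the shifted profile as far closer to the base profile than the flat one, which is exactly the property that makes it suitable for grouping household demand curves whose daily peaks occur at slightly different times.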

    Anomaly Detection in Noisy Images

    Finding rare events in multidimensional data is an important detection problem with applications in many fields, such as risk estimation in the insurance industry, finance, flood prediction, medical diagnosis, quality assurance, security, and safety in transportation. The occurrence of such anomalies is so infrequent that there is usually not enough training data to learn an accurate statistical model of the anomaly class. In some cases, such events may never have been observed, so the only available information is a set of normal samples and an assumed pairwise similarity function. Such a metric may only be known up to a certain number of unspecified parameters, which either need to be learned from training data or fixed by a domain expert. Sometimes the anomalous condition can be formulated algebraically, such as a measure exceeding a predefined threshold, but nuisance variables may complicate the estimation of that measure. Change detection methods used in time series analysis are not easily extendable to the multidimensional case, where discontinuities are not localized to a single point. On the other hand, in higher dimensions data exhibits more complex interdependencies, and there is redundancy that could be exploited to adaptively model the normal data. In the first part of this dissertation, we review the theoretical framework for anomaly detection in images and previous anomaly detection work done in the context of crack detection and detection of anomalous components in railway tracks. In the second part, we propose new anomaly detection algorithms. The fact that curvilinear discontinuities in images are sparse with respect to the frame of shearlets allows us to pose this anomaly detection problem as basis pursuit optimization. 
Therefore, we pose the problem of detecting curvilinear anomalies in noisy textured images as a blind source separation problem under sparsity constraints, and propose an iterative shrinkage algorithm to solve it. Taking advantage of the parallel nature of this algorithm, we describe how it can be accelerated using graphics processing units (GPUs). Then, we propose a new method for finding defective components on railway tracks using cameras mounted on a train, describing how to extract features and use a combination of classifiers to solve this problem. Next, we scale anomaly detection to bigger datasets with complex interdependencies, showing that the anomaly detection problem fits naturally into the multitask learning framework: the first task learns a compact representation of the good samples, while the second task learns the anomaly detector. Using deep convolutional neural networks, we show that it is possible to train a deep model with a limited number of anomalous examples. In sequential detection problems, the presence of time-variant nuisance parameters affects detection performance. In the last part of this dissertation, we present a method for adaptively estimating the threshold of sequential detectors using Extreme Value Theory in a Bayesian framework. Finally, conclusions on the results obtained are provided, followed by a discussion of possible future work.
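
The iterative shrinkage approach to basis-pursuit-style problems can be illustrated with ISTA on a generic sparse recovery instance. Here a random matrix stands in for the shearlet synthesis operator, and all sizes and the regularization weight are illustrative assumptions, not values from the dissertation.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise shrinkage operator, the workhorse of iterative
    shrinkage-thresholding algorithms."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam=0.01, iters=1000):
    """Minimise 0.5*||Ax - y||^2 + lam*||x||_1 by ISTA: a gradient
    step on the quadratic term followed by soft thresholding.  Each
    iteration is elementwise and matrix-vector based, which is why
    the method parallelizes well on GPUs."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x

# Noiseless measurements of a 3-sparse signal through a random operator.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [2.0, -1.5, 1.0]
y = A @ x_true
x_hat = ista(A, y)
```

    The sparsity penalty drives all but a few coefficients to exactly zero, so the recovered support identifies where the curvilinear (sparse-in-shearlets) component lives.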