DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN

Author: Cheung Ngai-Man
Elovici Yuval
Lim Swee Kiat
Loo Yi
Roig Gemma
Tran Ngoc-Trung
Publication date: 23/08/2018

Recently, the introduction of the generative adversarial network (GAN) and its variants has enabled the generation of realistic synthetic samples, which have been used for enlarging training sets. Previous work primarily focused on data augmentation for semi-supervised and supervised tasks. In this paper, we instead focus on unsupervised anomaly detection and propose a novel generative data augmentation framework optimized for this task. In particular, we propose to oversample infrequent normal samples - normal samples that occur with small probability, e.g., rare normal events. We show that these samples are responsible for false positives in anomaly detection. However, oversampling of infrequent normal samples is challenging for real-world high-dimensional data with multimodal distributions. To address this challenge, we propose to use a GAN variant known as the adversarial autoencoder (AAE) to transform the high-dimensional multimodal data distributions into low-dimensional unimodal latent distributions with well-defined tail probability. Then, we systematically oversample at the 'edge' of the latent distributions to increase the density of infrequent normal samples. We show that our oversampling pipeline is a unified one: it is generally applicable to datasets with different complex data distributions. To the best of our knowledge, our method is the first data augmentation technique focused on improving performance in unsupervised anomaly detection. We validate our method by demonstrating consistent improvements across several real-world datasets. Comment: Published as a conference paper at ICDM 2018 (IEEE International Conference on Data Mining).
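
The 'edge' oversampling step in the abstract above can be illustrated with a minimal sketch. It assumes a standard Gaussian latent prior and takes the 'edge' to be a sphere of fixed radius; the function name, the `radius` parameter, and the sphere interpretation are assumptions made for illustration, not details given in the abstract. Turning the sampled latent vectors into synthetic data would additionally require a trained AAE decoder, which is not shown.

```python
import math
import random

def sample_latent_edge(n_samples, latent_dim, radius, seed=0):
    """Draw latent vectors that all lie at distance `radius` from the origin.

    Hypothetical helper: a random direction is drawn from an isotropic
    Gaussian, normalised to unit length, then scaled to the chosen radius.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        norm = math.sqrt(sum(x * x for x in g))
        samples.append([radius * x / norm for x in g])
    return samples

# Five 2-D latent vectors, three standard deviations from the prior mean.
edge_latents = sample_latent_edge(n_samples=5, latent_dim=2, radius=3.0)
```

A larger radius moves the samples further into the tail of the assumed prior, which corresponds to rarer normal samples after decoding.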

arXiv.org e-Print Archive

Crossref

Anomaly Detection Approaches for Semiconductor Manufacturing

Author: Beghi Alessandro
Susto Gian Antonio
Terzi Matteo
Publication date: 01/01/2017

Smart production monitoring is a crucial activity in advanced manufacturing for quality, control and maintenance purposes. Advanced Monitoring Systems aim to detect anomalies and trends; anomalies are data patterns that have different data characteristics from normal instances, while trends are tendencies of production to move in a particular direction over time. In this work, we compare state-of-the-art ML approaches (ABOD, LOF, onlinePCA and osPCA) to detect outliers and events in high-dimensional monitoring problems. The compared anomaly detection strategies have been tested on a real industrial dataset related to a Semiconductor Manufacturing Etching process.

Crossref

Open Access Repository

Archivio istituzionale della ricerca - Università di Padova

Isolation Mondrian Forest for Batch and Online Anomaly Detection

Author: Crowley Mark
Ghojogh Benyamin
Ma Haoran
Samad Maria N.
Zheng Dongyu
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 26/08/2020

We propose a new method, named isolation Mondrian forest (iMondrian forest), for batch and online anomaly detection. The proposed method is a novel hybrid of isolation forest and Mondrian forest, which are existing methods for batch anomaly detection and online random forest, respectively. iMondrian forest takes the idea of isolation, using the depth of a node in a tree, and implements it in the Mondrian forest structure. The result is a new data structure which can accept streaming data in an online manner while being used for anomaly detection. Our experiments show that iMondrian forest mostly performs better than isolation forest in batch settings and has better or comparable performance against other batch and online anomaly detection methods. Comment: Accepted for presentation at the IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2020. The first three authors contributed equally to this work.

arXiv.org e-Print Archive

Crossref

Effective And Efficient Approach for Detecting Outliers

Author: M.Sowmya, Tanuja A.Krishna Mohan
Publication venue: Auricle Technologies, Pvt., Ltd.
Publication date: 31/10/2014

Anomaly detection, the process of identifying unusual behavior, is currently a major topic in machine learning research. It is widely used in data mining applications such as medical informatics, computer vision, computer security, and sensor networks. Statistical approaches assume that the data follow some distribution and aim to find the outliers that deviate from it. Most distribution models are assumed to be univariate, and thus lack robustness for multidimensional data. We propose an online and conditional anomaly detection method based on oversampling PCA (osPCA); combined with a leave-one-out (LOO) strategy, oversampling amplifies the effect of outliers. We can then use the variation of the dominant principal direction to identify the presence of rare but abnormal data. For conditional anomaly detection, expectation-maximization algorithms are used to learn the model. Our approach reduces computational costs and memory requirements.
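
The idea of scoring a point by how much it changes the dominant principal direction can be sketched as follows. This is a simplified batch illustration rather than the authors' online method: each target instance is duplicated (oversampled), the dominant direction is recomputed, and the score is one minus the absolute cosine similarity between the old and new directions. The function names, the `ratio` parameter, and the use of plain power iteration are assumptions made for illustration.

```python
import random

def dominant_direction(rows, n_iter=200):
    """Dominant principal direction via power iteration on the covariance.

    Assumes the covariance is non-zero and the start vector is not
    orthogonal to the dominant eigenvector.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def oversampling_pca_scores(rows, ratio=0.1):
    """Score each row by how far duplicating it rotates the dominant direction."""
    n_dup = max(1, round(ratio * len(rows)))
    u = dominant_direction(rows)
    scores = []
    for target in rows:
        u_t = dominant_direction(rows + [target] * n_dup)
        cosine = abs(sum(a * b for a, b in zip(u, u_t)))
        scores.append(1.0 - cosine)
    return scores

# Hypothetical data: 100 points spread along the x-axis plus one outlier on the y-axis.
rng = random.Random(0)
data = [[rng.gauss(0.0, 3.0), rng.gauss(0.0, 0.3)] for _ in range(100)]
data.append([0.0, 15.0])
scores = oversampling_pca_scores(data)
most_anomalous = scores.index(max(scores))
```

Recomputing the direction from scratch for every instance is what makes this sketch expensive; avoiding that recomputation is the kind of cost the abstract says the proposed approach reduces.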

International Journal on Recent and Innovation Trends in Computing and Communication

Incremental Principal Component Analysis Based Outliers Detection Methods for Spatiotemporal Data Streams

Author: Bhushan Alka
Karimi Hassan A.
Sharker Monir
Publication venue: Copernicus GmbH
Publication date: 01/07/2015

In this paper, we address outliers in spatiotemporal data streams obtained from sensors placed across geographically distributed locations. Outliers may appear in such sensor data for various reasons, such as instrumental error and environmental change. Real-time detection of these outliers is essential to prevent propagation of errors in subsequent analyses and results. Incremental Principal Component Analysis (IPCA) is one possible approach for detecting outliers in this type of spatiotemporal data stream. IPCA has been widely used in many real-time applications such as credit card fraud detection, pattern recognition, and image analysis. However, the suitability of applying IPCA for outlier detection in spatiotemporal data streams is unknown and needs to be investigated. To fill this research gap, this paper contributes by presenting two new IPCA-based outlier detection methods and performing a comparative analysis with the existing IPCA-based outlier detection methods to assess their suitability for spatiotemporal sensor data streams.

Crossref

Directory of Open Access Journals

D-Scholarship@Pitt

An Enhanced Detection of Outlier using Independent Component Analysis among Multiple Data Instances via Oversampling

Author: R Karthik
R Krithigarani
Publication date: 24/04/2020

CiteSeerX

Undersampling GA-SVM for network intrusion detection

Author: He Zhenyu
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2017

Network intrusion detection is one of the hottest issues in the world. An increasing number of researchers and engineers deal with this problem by using machine learning methods. However, how to improve the identification accuracy of all the attack classes remains unsolved, since the dataset is highly imbalanced. This thesis work intends to build a classifier that achieves high classification accuracy. It proposes an undersampling Genetic Algorithm-Support Vector Machine (GA-SVM) method to handle this problem. To solve the multiclassification problem with a binary classifier, this work proposes to utilize the undersampling GA-SVM with several classic structures. After adjusting the parameters of the genetic algorithm and the undersampling ratio of each support vector machine, this work concludes that the proposed undersampling GA-SVM improves the performance of an intrusion detection system. Among its variants, the decision tree-based undersampling GA-SVM offers the best performance.
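
The undersampling step mentioned in the abstract above can be illustrated with a minimal sketch of plain random undersampling. It covers only the rebalancing of the training set, not the genetic algorithm or the SVM; the function name, the `ratio` parameter (majority samples kept per non-majority sample), and the toy labels are assumptions made for illustration.

```python
import random

def undersample_majority(samples, labels, majority_label, ratio=1.0, seed=0):
    """Randomly drop majority-class samples until at most
    `ratio` majority samples remain per non-majority sample."""
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    others = [i for i, y in enumerate(labels) if y != majority_label]
    n_keep = min(len(majority), int(round(ratio * len(others))))
    # Keep the original ordering of the retained samples.
    kept = sorted(rng.sample(majority, n_keep) + others)
    return [samples[i] for i in kept], [labels[i] for i in kept]

# Hypothetical imbalanced set: 45 "normal" records and 5 "attack" records.
samples = list(range(50))
labels = ["normal"] * 45 + ["attack"] * 5
X_bal, y_bal = undersample_majority(samples, labels, "normal", ratio=2.0)
```

In a setting like the one the abstract describes, a value such as `ratio` would be tuned separately for each binary classifier.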

Digital Commons @ New Jersey Institute of Technology (NJIT)

Enhancing Smart City Functions through the Mitigation of Electricity Theft in Smart Grids: A Stacked Ensemble Method

Author: Hashim Muhammad
Javaid Nadeem
Khan Laiq
Shaheen Ifra
Ullah Zahid
Publication date: 01/01/2024

Archivio istituzionale della ricerca - Politecnico di Milano