12,026 research outputs found

    Decomposable Principal Component Analysis

    Full text link
    We consider principal component analysis (PCA) in decomposable Gaussian graphical models. We exploit the prior information in these models in order to distribute its computation. For this purpose, we reformulate the problem in the sparse inverse covariance (concentration) domain and solve the global eigenvalue problem using a sequence of local eigenvalue problems in each of the cliques of the decomposable graph. We demonstrate the application of our methodology in the context of decentralized anomaly detection in the Abilene backbone network. Based on the topology of the network, we propose an approximate statistical graphical model and distribute the computation of PCA

    Applying PCA for Traffic Anomaly Detection: Problems and Solutions

    Get PDF
    International audienceSpatial Principal Component Analysis (PCA) has been proposed for network-wide anomaly detection. A recent work has shown that PCA is very sensitive to calibration settings. Unfortunately, the authors did not provide further explanations for this observation. In this paper, we fill this gap and provide the reasoning behind the found discrepancies. We revisit PCA for anomaly detection and evaluate its performance on our data. We develop a slightly modified version of PCA that uses only data from a single router. Instead of correlating data across different spatial measurement points, we correlate the data across different metrics. With the help of the analyzed data, we explain the pitfalls of PCA and underline our argumentation with measurement results. We show that the main problem is that PCA fails to capture temporal correlation. We propose a solution to deal with this problem by replacing PCA with the Karhunen-Loeve transform. We find that when we consider temporal correlation, anomaly detection results are significantly improved

    A Family of Joint Sparse PCA Algorithms for Anomaly Localization in Network Data Streams

    Get PDF
    Determining anomalies in data streams that are collected and transformed from various types of networks has recently attracted significant research interest. Principal Component Analysis (PCA) is arguably the most widely applied unsupervised anomaly detection technique for networked data streams due to its simplicity and efficiency. However, none of existing PCA based approaches addresses the problem of identifying the sources that contribute most to the observed anomaly, or anomaly localization. In this paper, we first proposed a novel joint sparse PCA method to perform anomaly detection and localization for network data streams. Our key observation is that we can detect anomalies and localize anomalous sources by identifying a low dimensional abnormal subspace that captures the abnormal behavior of data. To better capture the sources of anomalies, we incorporated the structure of the network stream data in our anomaly localization framework. Also, an extended version of PCA, multidimensional KLE, was introduced to stabilize the localization performance. We performed comprehensive experimental studies on four real-world data sets from different application domains and compared our proposed techniques with several state-of-the-arts. Our experimental studies demonstrate the utility of the proposed methods

    Volume anomaly detection in data networks : an optimal detection algorithm vs. the PCA approach

    Get PDF
    The crucial future role of Internet in society makes of network monitoring a critical issue for network operators in future network scenarios. The Future Internet will have to cope with new and different anomalies, motivating the development of accurate detection algorithms. This paper presents a novel approach to detect unexpected and large traffic variations in data networks. We introduce an optimal volume anomaly detection algorithm in which the anomaly-free traffic is treated as a nuisance parameter. The algorithm relies on an original parsimonious model for traffic demands which allows detecting anomalies from link traffic measurements, reducing the overhead of data collection. The performance of the method is compared to that obtained with the Principal Components Analysis (PCA) approach. We choose this method as benchmark given its relevance in the anomaly detection literature. Our proposal is validated using data from an operational network, showing how the method outperforms the PCA approac

    Implementation of Anomaly Based Network Intrusion Detection by Using Q-learning Technique

    Get PDF
    Network Intrusion detection System (NIDS) is an intrusion detection system that tries to discover malicious activity such as service attacks, port scans or even attempts to break into computers by monitoring network traffic. Data mining techniques make it possible to search large amounts of data for characteristic rules and patterns. If applied to network monitoring data recorded on a host or in a network, they can be used to detect intrusions, attacks or anomalies. We proposed “machine learning method”, cascading Principal Component Analysis (PCA) and the Q-learning methods to classifying anomalous and normal activities in a computer network. This paper investigates the use of PCA to reduce high dimensional data and to improve the predictive performance. On the reduced data, representing a density region of normal or anomaly instances, Q-learning strategies are applied for the creation of agents that can adapt to unknown, complex environments. We attempted to create an agent that would learn to explore an environment and collect the malicious within it. We obtained interesting results where agents were able to re-adapt their learning quickly to the new traffic and network information as compare to the other machine learning method such as supervised learning and unsupervised learning. Keywords: Intrusion, Anomaly Detection, Data Mining, KDD Cup’99, PCA, Q-learning

    Intrusion Detection System Using Multivariate Control Chart Hotelling's T2 Based on PCA

    Get PDF
    Statistical Process Control (SPC) has been widely used in industry and services. The SPC can be applied not only to monitor manufacture processes but also can be applied to the Intrusion Detection System (IDS). In network monitoring and intrusion detection, SPC can be a powerful tool to ensure system security and stability in a network. Theoretically, Hotelling’s T2 chart can be used in intrusion detection. However, there are two reasons why the chart is not suitable to be used. First, the intrusion detection data involves large volumes of high-dimensional process data. Second, intrusion detection requires a fast computational process so an intrusion can be detected as soon as possible. To overcome the problems caused by a large number of quality characteristics, Principal Component Analysis (PCA) can be used. The PCA can reduce not only the dimension leading a faster computational, but also can eliminate the multicollinearity (among characteristic variables) problem. This paper is focused on the usage of multivariate control chart T2 based on PCA for IDS. The KDD99 dataset is used to evaluate the performance of the proposed method. Furthermore, the performance of T2 based PCA will be compared with conventional T2 control chart. The empirical results of this research show that the multivariate control chart using Hotelling’s T2 based on PCA has excellent performance to detect an anomaly in the network. Compared to conventional T2 control chart, the T2 based on PCA has similar performance with 97 percent hit rate. It also requires shorter computation time.
    • …
    corecore