809 research outputs found
Scalable network-wide anomaly detection using compressed data
Detecting network traffic volume anomalies in real time is a key problem, as it enables measures to be taken to prevent network congestion, which severely affects end users. Several techniques based on principal component analysis (PCA) have been proposed that detect volume anomalies as outliers in the residual subspace. However, these methods do not scale to networks with a large number of links. We address this scalability issue with a new approach inspired by the recently developed compressed sensing (CS) theory, which induces a universal information-sampling scheme right at the network sensory level to reduce the data overhead. Specifically, we exploit the compressibility of the network data and describe a framework for anomaly detection in the compressed domain. Our main contribution is a detailed theoretical analysis of the new approach, which establishes probabilistic bounds on the principal eigenvalues of the compressed data. We then prove that volume anomaly detection on compressed data can achieve performance equivalent to detection on the original uncompressed data while significantly reducing the computational cost. Experimental results on both the Abilene and synthetic datasets support our theoretical findings and demonstrate the advantages of the new approach over existing methods.
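The residual-subspace detection idea carried over to compressed measurements can be sketched as follows. All sizes, the projection dimension, and the injected anomaly magnitude are illustrative assumptions, not the paper's actual setup; the random Gaussian projection stands in for the CS measurement step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic traffic matrix: t time bins x n links with low intrinsic dimension.
# Sizes and anomaly magnitude are illustrative assumptions.
t, n, k = 500, 200, 5
X = rng.standard_normal((t, k)) @ rng.standard_normal((k, n))
X += 0.1 * rng.standard_normal((t, n))
X[250] += 8.0                               # inject a volume anomaly at bin 250

# CS-style measurement: random Gaussian projection of each time bin
m = 40                                      # m << n measurements per bin
Phi = rng.standard_normal((n, m)) / np.sqrt(m)
Y = X @ Phi

# PCA on the compressed data: split into principal and residual subspaces
Yc = Y - Y.mean(axis=0)
_, _, Vt = np.linalg.svd(Yc, full_matrices=False)
P = Vt[:k].T                                # top-k principal directions
residual = Yc - (Yc @ P) @ P.T              # residual-subspace projection
spe = (residual ** 2).sum(axis=1)           # squared prediction error per bin

print(int(spe.argmax()))                    # bin with the largest residual energy
```

Note that the SVD runs on the t x m compressed matrix rather than the t x n original, which is where the computational saving comes from when m << n.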
Effective anomaly detection in sensor networks data streams
This paper addresses a major challenge in data mining applications where full information about the underlying processes, such as sensor networks or large online databases, cannot practically be obtained due to physical limitations such as limited bandwidth, memory, storage, or computing power. Motivated by the recent theory of direct information sampling known as compressed sensing (CS), we propose a framework for detecting anomalies in these large-scale data mining applications. Exploiting the facts that the intrinsic dimension of the data in these applications is typically small relative to the raw dimension and that compressed sensing can capture most of the information with few measurements, we show that spectral methods used for volume anomaly detection can be applied directly to the CS data with performance guarantees. Our theoretical contributions are supported by extensive experimental results on large datasets, which show satisfactory performance.
Structural Analysis of Network Traffic Matrix via Relaxed Principal Component Pursuit
The network traffic matrix is widely used in network operation and
management. It is therefore of crucial importance to analyze the components and
the structure of the network traffic matrix, for which several mathematical
approaches such as Principal Component Analysis (PCA) were proposed. In this
paper, we first argue that PCA performs poorly for analyzing a traffic matrix
that is polluted by large volume anomalies, and then propose a new
decomposition model for the network traffic matrix. According to this model, we
carry out the structural analysis by decomposing the network traffic matrix
into three sub-matrices, namely, the deterministic traffic, the anomaly traffic
and the noise traffic matrix, which is similar to the Robust Principal
Component Analysis (RPCA) problem previously studied in [13]. Based on the
Relaxed Principal Component Pursuit (Relaxed PCP) method and the Accelerated
Proximal Gradient (APG) algorithm, we present an iterative approach for
decomposing a traffic matrix, and demonstrate its efficiency and flexibility by
experimental results. Finally, we further discuss several features of the
deterministic and noise traffic. Our study develops a novel method for the
problem of structural analysis of the traffic matrix, which is robust against
pollution by large volume anomalies.

Comment: Accepted to Elsevier Computer Networks
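The decomposition described above can be sketched with a simplified proximal scheme: alternating singular-value thresholding (for the low-rank deterministic part) and entrywise soft-thresholding (for the sparse anomaly part), leaving the residual as noise. This is a minimal sketch of the Relaxed PCP objective, not the paper's APG algorithm; the matrix sizes, regularization weights, and anomaly magnitudes are assumptions for illustration.

```python
import numpy as np

def relaxed_pcp(M, lam, mu, n_iter=200):
    """Decompose M into low-rank A (deterministic traffic) plus sparse E
    (anomaly traffic), leaving M - A - E as noise, by alternating exact
    minimisation of  ||A||_* + lam*||E||_1 + (1/(2*mu))*||M - A - E||_F^2."""
    A = np.zeros_like(M)
    E = np.zeros_like(M)
    for _ in range(n_iter):
        # Low-rank step: singular-value thresholding of M - E at level mu
        U, s, Vt = np.linalg.svd(M - E, full_matrices=False)
        A = (U * np.maximum(s - mu, 0.0)) @ Vt
        # Sparse step: entrywise soft-thresholding of M - A at level lam*mu
        R = M - A
        E = np.sign(R) * np.maximum(np.abs(R) - lam * mu, 0.0)
    return A, E

rng = np.random.default_rng(1)
t, n, r = 100, 60, 3                         # illustrative sizes
A_true = rng.standard_normal((t, r)) @ rng.standard_normal((r, n))
E_true = np.zeros((t, n))
spikes = rng.choice(t * n, 60, replace=False)
E_true.flat[spikes] = 10.0                   # large sparse volume anomalies
M = A_true + E_true + 0.01 * rng.standard_normal((t, n))

A_hat, E_hat = relaxed_pcp(M, lam=1.0 / np.sqrt(t), mu=1.0)
detected = np.flatnonzero(np.abs(E_hat) > 5.0)
print(len(set(detected) & set(spikes)) / len(spikes))
```

The choice lam = 1/sqrt(max dimension) follows the standard RPCA weighting; in practice the APG algorithm used in the paper adds a momentum step to accelerate convergence over this plain alternation.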
SENATUS: An Approach to Joint Traffic Anomaly Detection and Root Cause Analysis
In this paper, we propose a novel approach, called SENATUS, for joint traffic
anomaly detection and root-cause analysis. Inspired by the concept of a
senate, the proposed approach proceeds in three stages:
election, voting and decision. At the election stage, a small number of
traffic flows (termed senator flows) are chosen to represent approximately
the total (usually huge) set of traffic flows. In the voting stage, anomaly
detection is applied on the senator
traffic flows. In the voting stage, anomaly detection is applied on the senator
flows and the detected anomalies are correlated to identify the most likely
anomalous time bins. Finally, in the decision stage, a machine learning
technique is applied to the senator flows of each anomalous time bin to find
the root cause of the anomalies. We evaluate SENATUS using traffic traces
collected from the Pan European network, GEANT, and compare against another
approach which detects anomalies using lossless compression of traffic
histograms. We show the effectiveness of SENATUS in diagnosing anomaly types:
network scans and DoS/DDoS attacks.
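The abstract gives only the three-stage outline, so the sketch below is a toy illustration of the election and voting stages. The election criterion used here (highest-variance flows), the z-score detector, the voting threshold, and the data are all assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
T, F = 300, 1000                 # time bins x traffic flows (hypothetical sizes)
X = rng.poisson(20, size=(T, F)).astype(float)
X[120, :50] *= 5                 # anomaly hitting a subset of flows at bin 120

# Election: pick a small set of "senator" flows to stand in for all flows.
# (Selecting the highest-variance flows is an assumption made for this sketch.)
n_senators = 30
senators = np.argsort(X.var(axis=0))[-n_senators:]

# Voting: per-senator z-score detection, then correlate votes across senators.
S = X[:, senators]
Z = (S - S.mean(axis=0)) / S.std(axis=0)
votes = (np.abs(Z) > 3).sum(axis=1)          # votes per time bin
anomalous_bins = np.flatnonzero(votes > n_senators // 2)
print(anomalous_bins)
```

Requiring a majority of senators to agree is what suppresses isolated per-flow noise; the decision stage would then inspect only the flagged bins for root-cause classification.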