Detecting Clusters of Anomalies on Low-Dimensional Feature Subsets with Application to Network Traffic Flow Data
In a variety of applications, one desires to detect groups of anomalous data
samples, with a group potentially manifesting its atypicality (relative to a
reference model) on a low-dimensional subset of the full measured set of
features. Samples may only be weakly atypical individually, whereas they may be
strongly atypical when considered jointly. What makes this group anomaly
detection problem quite challenging is that it is a priori unknown which subset
of features jointly manifests a particular group of anomalies. Moreover, it is
unknown how many anomalous groups are present in a given data batch. In this
work, we develop a group anomaly detection (GAD) scheme to identify the subset
of samples and subset of features that jointly specify an anomalous cluster. We
apply our approach to network intrusion detection, specifically the detection of
BotNet and peer-to-peer flow clusters. Unlike previous studies, our approach captures and
exploits statistical dependencies that may exist between the measured features.
Experiments on real-world network traffic data demonstrate the advantage of our
proposed system and highlight the importance of exploiting feature dependency
structure, compared to the feature (or test) independence assumption made in
previous studies.