Setting the threshold for high throughput detectors: A mathematical approach for ensembles of dynamic, heterogeneous, probabilistic anomaly detectors
Anomaly detection (AD) has garnered ample attention in security research, as
such algorithms complement existing signature-based methods while promising
detection of never-before-seen attacks. Cyber operations manage a high volume
of heterogeneous log data; hence, AD in such operations involves multiple
(e.g., per-IP, per-data-type) ensembles of detectors modeling heterogeneous
characteristics (e.g., rate, size, type), often with adaptive online models
producing alerts in near real time. Because of the high data volume, setting
the threshold for each detector in such a system is an essential yet
underdeveloped configuration step: a slightly mistuned threshold can leave the
system useless, either producing a flood of alerts that overwhelms downstream
systems or producing none at all. In this work, we build on the foundations of
Ferragut et al. to provide a set of rigorous results for understanding the
relationship between threshold values and alert quantities, and we propose an
algorithm for setting the threshold in practice. Specifically, we give an
algorithm for setting the thresholds of multiple, heterogeneous, possibly
dynamic detectors entirely a priori: if the underlying distribution of the
incoming data is known (or closely estimated), the algorithm provides provably
manageable thresholds. If the distribution is unknown (e.g., has drifted over
time), our analysis reveals how the model distribution differs from the actual
distribution, indicating that a period of model refitting is necessary. We
provide empirical experiments showing the efficacy of the capability by
regulating the alert rate of a system with 2,500 adaptive detectors scoring
over 1.5M events in 5 hours. Further, using the real network data and
detection framework of Harshaw et al., we demonstrate the alternative case,
showing how the inability to regulate alerts indicates that the detection
model is a poor fit to the data.
Comment: 11 pages, 5 figures. Proceedings of IEEE Big Data Conference, 201
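To make the threshold-setting idea concrete, here is a minimal sketch of
p-value-based alerting in the spirit of Ferragut et al.: score each event by
the model probability mass of points no more likely than it, and alert when
that p-value falls below a target rate alpha. The gamma model, sample sizes,
and alpha below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from scipy import stats

alpha = 1e-3                              # target expected alert rate
model = stats.gamma(a=2.0, scale=3.0)     # stand-in fitted model of an event feature

# Monte Carlo reference: sorted densities of samples drawn from the model itself.
reference = np.sort(model.pdf(model.rvs(size=100_000, random_state=0)))

def p_values(x):
    """Fraction of the model's own probability mass that is no more likely than x."""
    return np.searchsorted(reference, model.pdf(x), side="right") / reference.size

# When events truly follow the model, the realized alert rate concentrates near alpha,
# regardless of which distribution the model is -- so one alpha can regulate an
# ensemble of heterogeneous detectors.
events = model.rvs(size=50_000, random_state=1)
print("alert rate (model fits):   ", (p_values(events) < alpha).mean())

# Under drift, the realized alert rate deviates from alpha -- the signal that
# the model distribution no longer matches the data and needs refitting.
drifted = stats.gamma(a=2.0, scale=6.0).rvs(size=50_000, random_state=2)
print("alert rate (model drifted):", (p_values(drifted) < alpha).mean())
```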
Towards Malware Detection via CPU Power Consumption: Data Collection Design and Analytics (Extended Version)
This paper presents an experimental design and data analytics approach aimed
at power-based malware detection on general-purpose computers. Leveraging the
fact that malware executions must consume power, we explore the postulate that
malware can be accurately detected via power data analytics. Our experimental
design and implementation allow for programmatic collection of CPU power
profiles for fixed tasks during uninfected and infected states using five
different rootkits. To characterize the power consumption profiles, we use both
simple statistical features and novel, more sophisticated ones. We test a one-class
anomaly detection ensemble (that baselines non-infected power profiles) and
several kernel-based SVM classifiers (that train on both uninfected and
infected profiles) in detecting previously unseen malware and clean profiles.
The anomaly detection system exhibits perfect detection when using all features
and tasks, with a smaller false detection rate than the supervised classifiers.
The primary contribution is the proof of concept that baselining power of fixed
tasks can provide accurate detection of rootkits. Moreover, our treatment
details the engineering hurdles encountered in such experimentation and allows
analysis of each statistical feature individually. This work appears to be the
first step
towards a viable power-based detection capability for general-purpose
computers, and presents next steps toward this goal.
Comment: Published version appearing in IEEE TrustCom-18. This version
contains more details on mathematics and data collection.
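As a rough illustration of the baselining approach described above, the
following sketch trains a one-class detector on statistical features of clean
power profiles and flags deviations. The feature set, data shapes, and
synthetic traces are assumptions for illustration, not the paper's pipeline,
which uses an anomaly detection ensemble and SVM classifiers on real CPU power
data.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def simple_features(trace):
    """Simple statistical features of one power trace (watts over time)."""
    return [trace.mean(), trace.std(), trace.min(), trace.max(),
            np.percentile(trace, 75) - np.percentile(trace, 25)]

# Synthetic stand-ins for measured power profiles of a fixed task:
# infected runs draw slightly more (and more variable) power.
rng = np.random.default_rng(0)
clean = np.array([simple_features(rng.normal(30, 2, 1000)) for _ in range(200)])
infected = np.array([simple_features(rng.normal(33, 4, 1000)) for _ in range(20)])

# Train only on clean profiles (the "baseline"); nu bounds the fraction of
# training profiles the detector may treat as outliers.
detector = make_pipeline(StandardScaler(), OneClassSVM(nu=0.05, gamma="scale"))
detector.fit(clean)

print("clean flagged:   ", (detector.predict(clean) == -1).mean())
print("infected flagged:", (detector.predict(infected) == -1).mean())
```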
Automatic Construction of Anomaly Detectors from Graphical Models
Detection of rare or previously unseen attacks in cyber security presents a
central challenge: how does one search for a sufficiently wide variety of types
of anomalies and yet allow the process to scale to increasingly complex data?
In particular, creating each anomaly detector manually and training each one
separately places untenable strains on both human and computer resources. In
this paper we propose a systematic method for constructing a potentially very
large number of complementary anomaly detectors from a single probabilistic
model of the data. Only one model needs to be trained, but numerous detectors
can then be implemented. This approach promises to scale better than manual
methods to the complex heterogeneity of real-life data. As an example, we
develop a Latent Dirichlet Allocation probability model of TCP connections
entering Oak Ridge National Laboratory. We show that several detectors can be
automatically constructed from the model and will provide anomaly detection at
flow, sub-flow, and host (both server and client) levels. This demonstrates
how the fundamental connection between anomaly detection and probabilistic
modeling can be exploited to develop more robust operational solutions.
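The "one model, many detectors" construction can be illustrated with a small
sketch. For brevity this swaps the paper's Latent Dirichlet Allocation model
for a Gaussian mixture, and the flow features, host assignments, and quantile
thresholds are invented for illustration; the point is that a single trained
model yields complementary detectors at different aggregation levels.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Hypothetical flow features: (log bytes, log duration, log packets).
flows = rng.normal([8.0, 1.0, 4.0], [1.5, 0.8, 1.0], size=(5000, 3))
hosts = rng.integers(0, 50, size=5000)   # which host produced each flow

# Train the single probabilistic model once.
model = GaussianMixture(n_components=8, random_state=0).fit(flows)

# Detector 1: flow level -- flag individual low-likelihood flows.
flow_scores = -model.score_samples(flows)            # higher = more anomalous
flow_alerts = flow_scores > np.quantile(flow_scores, 0.999)

# Detector 2: host level -- flag hosts whose flows are collectively unlikely,
# catching hosts that look odd in aggregate even if no single flow does.
host_scores = np.array([flow_scores[hosts == h].mean() for h in range(50)])
host_alerts = host_scores > np.quantile(host_scores, 0.95)

print(f"flow alerts: {flow_alerts.sum()}, host alerts: {host_alerts.sum()}")
```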