Querying Streaming System Monitoring Data for Enterprise System Anomaly Detection
The need to counter Advanced Persistent Threat (APT) attacks has led to
solutions that ubiquitously monitor system activities on each enterprise
host and perform timely detection of abnormal system behavior over the
stream of monitoring data. However, existing stream-based solutions lack
explicit language constructs for expressing anomaly models that capture
abnormal system behaviors, and thus face challenges in incorporating
expert knowledge to perform timely anomaly detection over large-scale
monitoring data. To address these limitations, we build SAQL, a novel
stream-based query system that takes as input a real-time event feed
aggregated from multiple hosts in an enterprise, and provides an anomaly
query engine that queries the event feed to identify abnormal behaviors
based on the specified anomaly models. SAQL provides a domain-specific
query language, the Stream-based Anomaly Query Language (SAQL), that
uniquely integrates critical primitives for expressing the major types of
anomaly models. In the demo, we aim to show the complete usage scenario of
SAQL by (1) performing an APT attack in a controlled environment, and (2)
using SAQL to detect the abnormal behaviors in real time by querying the
collected stream of system monitoring data that contains the attack
traces. The audience will have the option to interact with the system and
detect the attack footprints in real time by issuing queries and checking
the query results through a command-line UI.
Comment: Accepted paper at ICDE 2020 demonstrations track. arXiv admin
note: text overlap with arXiv:1806.0933
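To make the idea of an anomaly query over a host event feed concrete, here is a minimal Python sketch of the kind of sliding-window rule such an engine evaluates. The event fields (`host`, `ts`, `op`), the window length, and the threshold are all invented for illustration; this is not SAQL's actual syntax or API.

```python
# Sketch of a sliding-window anomaly rule over a per-host event stream.
# Event format and thresholds are assumed, not taken from SAQL.
from collections import defaultdict, deque

WINDOW = 60      # seconds of events kept per host (assumed)
THRESHOLD = 100  # max "connect" events per window before alerting (assumed)

windows = defaultdict(deque)  # host -> deque of (timestamp, event)

def process(event):
    """Consume one monitoring event; return an alert string or None."""
    host, ts = event["host"], event["ts"]
    win = windows[host]
    win.append((ts, event))
    # Evict events that have fallen out of the time window.
    while win and ts - win[0][0] > WINDOW:
        win.popleft()
    conns = sum(1 for _, e in win if e["op"] == "connect")
    if conns > THRESHOLD:
        return f"anomaly: {host} made {conns} connections in {WINDOW}s"
    return None
```

A real anomaly query language would let an analyst state the window, the aggregation, and the threshold declaratively instead of hand-coding the eviction and counting logic as above.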
Process Flow Features as a Host-based Event Knowledge Representation
The detection of malware is of great importance, but even non-malicious software can be used for malicious purposes. Monitoring processes and their associated information can characterize normal behavior and help identify malicious processes, or malicious use of normal processes, by measuring deviations from the learned baseline. This exploratory research describes a novel host feature generation process that calculates statistics of an executing process during a window of time called a process flow. Process flows are calculated from key process data structures extracted from computer memory using virtual machine introspection. Each cluster produced by applying k-means to the flow features represents a behavior: all members of a cluster exhibit similar activity. Testing explores associations between behavior and process flows that in the future may be useful for detecting unauthorized behavior or behavioral trends on a host. Analysis of two data collections demonstrates that this novel way of thinking of process behavior as process flows can produce baseline models, in the form of clusters, that do represent specific behaviors.
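The clustering step described above can be sketched in a few lines: compute a feature vector per process-flow window, then group the vectors so each cluster stands for one behavior baseline. The feature names and data below are invented stand-ins (the paper extracts its features from memory via virtual machine introspection); the k-means here is a deliberately minimal pure-Python version.

```python
# Minimal k-means over per-process "flow" feature vectors; illustrative only.
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Cluster tuples of floats; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # keep the old center if a cluster empties out
                centers[i] = tuple(sum(x) / len(cl) for x in zip(*cl))
    return centers, clusters

# Each vector: (syscalls/sec, bytes written/sec) over one flow window (invented).
flows = [(2, 10), (3, 12), (2, 11),        # quiet background behavior
         (40, 500), (42, 480), (39, 510)]  # bursty I/O behavior
centers, clusters = kmeans(flows, k=2)
```

Once the baseline clusters exist, a new flow whose distance to every center is large would be the candidate signal for unauthorized behavior.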
Edge-Detect: Edge-centric Network Intrusion Detection using Deep Neural Network
Edge nodes are crucial for detecting the multitude of cyber attacks on
Internet-of-Things endpoints and are set to become part of a multi-billion-dollar
industry. The resource constraints in this novel network infrastructure
tier constrict the deployment of existing Network Intrusion Detection
Systems built on Deep Learning models (DLMs). We address this issue by
developing a novel light, fast and accurate 'Edge-Detect' model, which
detects Distributed Denial of Service attacks on edge nodes using DLM
techniques. Our model can work within resource restrictions, i.e., low
power, memory and processing capabilities, to produce accurate results at
a meaningful pace. It is built by creating layers of Long Short-Term
Memory or Gated Recurrent Unit based cells, which are known for their
excellent representation of sequential data. We designed a practical data
science pipeline with Recurrent Neural Networks to learn from the network
packet behavior in order to identify whether it is normal or
attack-oriented. The model is evaluated by deployment on an actual edge
node, represented by a Raspberry Pi, using a current cybersecurity dataset
(UNSW2015). Our results demonstrate that, in comparison to conventional
DLM techniques, our model maintains a high testing accuracy of 99% even
with lower resource utilization in terms of CPU and memory. In addition,
it is nearly 3 times smaller in size than the state-of-the-art model and
yet requires a much lower testing time.
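The Gated Recurrent Unit cells the abstract mentions can be illustrated with a single scalar-state update written in plain Python. This is only a sketch of the gate arithmetic that lets a recurrent state summarize a packet sequence: the weights are random stand-ins, not a trained Edge-Detect model, and a real deployment would use a framework implementation with vector states.

```python
# One-unit GRU forward pass; weights are random placeholders, not trained.
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, w):
    """One GRU update for scalar input x and scalar hidden state h."""
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])       # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])       # reset gate
    h_tilde = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])
    return (1 - z) * h + z * h_tilde                        # blend old/new state

rng = random.Random(1)
weights = {k: rng.uniform(-1, 1) for k in
           ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}

# Run the cell over a toy sequence of per-packet features (e.g. normalized
# packet sizes); the final hidden state would feed a small classifier head
# that labels the window as normal or attack traffic.
h = 0.0
for x in [0.1, 0.3, 0.9, 0.8]:
    h = gru_step(x, h, weights)
```

The update gate `z` decides how much of the old state to keep per step, which is why GRU (and LSTM) layers cope well with the sequential structure of packet streams while staying small enough for constrained edge hardware.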
- …