2 research outputs found

On Preempting Advanced Persistent Threats Using Probabilistic Graphical Models

Author: Cao Phuong
Publication date: 21/03/2019

This paper presents PULSAR, a framework for preempting Advanced Persistent Threats (APTs). PULSAR employs a probabilistic graphical model (specifically a Factor Graph, FG) to infer the time evolution of an attack from security events observed at runtime. PULSAR (i) learns the statistical significance of patterns of events from past attacks; (ii) composes these patterns into FGs to capture the progression of the attack; and (iii) decides on preemptive actions. PULSAR's accuracy and performance are evaluated in three experiments at SystemX: (i) a study with a dataset containing 120 successful APTs over the past 10 years (PULSAR accurately identifies 91.7%); (ii) a replay of a set of ten unseen APTs (PULSAR stops 8 out of 10 replayed attacks before system integrity violation, and all ten before data exfiltration); and (iii) a production deployment of PULSAR (during a month-long deployment, PULSAR took an average of one second to make a decision).

arXiv.org e-Print Archive

Neurlux: Dynamic Malware Analysis Without Feature Engineering

Authors: Aghakhani Hojjat; Jindal Chani; Kruegel Christopher; Long Keith; Salls Christopher; Vigna Giovanni
Publication date: 24/10/2019

Malware detection plays a vital role in computer security. Modern machine learning approaches have been centered around domain knowledge for extracting malicious features. However, many potential features can be used, and it is time-consuming and difficult to manually identify the best features, especially given the diverse nature of malware. In this paper, we propose Neurlux, a neural network for malware detection. Neurlux does not rely on any feature engineering; rather, it learns automatically from dynamic analysis reports that detail behavioral information. Our model borrows ideas from the field of document classification, using word sequences present in the reports to predict whether a report is from a malicious binary or not. We investigate the learned features of our model and show which components of the reports it tends to give the highest importance. Then, we evaluate our approach on two different datasets and report formats, showing that Neurlux improves on the state of the art and can effectively learn from the dynamic analysis reports. Furthermore, we show that our approach is portable to other malware analysis environments and generalizes to different datasets.

arXiv.org e-Print Archive