Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection
In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology
and framework for efficient and effective real-time malware detection,
leveraging the best of conventional machine learning (ML) and deep learning
(DL) algorithms. In PROPEDEUTICA, all software processes in the system start
execution subjected to a conventional ML detector for fast classification. If a
piece of software receives a borderline classification, it is subjected to
further analysis by a more computationally expensive but more accurate DL
method, our newly proposed algorithm DEEPMALWARE. Further, we introduce delays
to the execution of software subjected to deep learning analysis as a way to
"buy time" for DL analysis and to rate-limit the impact of possible malware in
the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and
877 commonly used benign software samples from various categories for the
Windows OS. Our results show that the false positive rate for conventional ML
methods can reach 20%, and for modern DL methods it is usually below 6%.
However, the classification time for DL can be 100X longer than conventional ML
methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional
ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the
percentage of software subjected to DL analysis was approximately 40% on
average. Further, the application of delays in software subjected to ML reduced
the detection time by approximately 10%. Finally, we found and discussed a
discrepancy between the detection accuracy offline (analysis after all traces
are collected) and on-the-fly (analysis in tandem with trace collection). Our
insights show that conventional ML and modern DL-based malware detectors in
isolation cannot meet the needs of efficient and effective malware detection:
high accuracy, low false positive rate, and short classification time.
Comment: 17 pages, 7 figure
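The two-stage flow described in this abstract (a fast conventional ML pass, with borderline cases escalated to a slower DL model) can be sketched roughly as follows. This is only an illustration and not the PROPEDEUTICA implementation: the function names, the toy scoring models, and the 0.3/0.7 borderline thresholds are all assumptions made for the example, and the execution delay applied during DL analysis is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Verdict:
    label: str    # "benign" or "malware"
    stage: str    # "ml" if decided by the fast model, "dl" if escalated
    score: float  # malware probability from the deciding model


def cascade_classify(
    trace: Sequence[str],
    fast_model: Callable[[Sequence[str]], float],
    deep_model: Callable[[Sequence[str]], float],
    low: float = 0.3,   # hypothetical threshold: below this is clearly benign
    high: float = 0.7,  # hypothetical threshold: above this is clearly malware
) -> Verdict:
    """Classify with the fast model; escalate borderline scores to the deep model."""
    p = fast_model(trace)
    if p < low:
        return Verdict("benign", "ml", p)
    if p > high:
        return Verdict("malware", "ml", p)
    # Borderline score: pay for the slower, more accurate model.
    q = deep_model(trace)
    return Verdict("malware" if q >= 0.5 else "benign", "dl", q)


# Toy stand-ins for the two detectors (assumptions, not real detectors).
SUSPICIOUS = {"WriteProcessMemory", "CreateRemoteThread"}


def toy_fast(trace: Sequence[str]) -> float:
    """Fraction of calls in the trace that appear in a small suspicious set."""
    return sum(c in SUSPICIOUS for c in trace) / len(trace) if trace else 0.0


def toy_deep(trace: Sequence[str]) -> float:
    """Flag a trace only if the two suspicious calls occur back to back."""
    pairs = list(zip(trace, trace[1:]))
    return 1.0 if ("WriteProcessMemory", "CreateRemoteThread") in pairs else 0.0


if __name__ == "__main__":
    borderline = ["ReadFile", "WriteProcessMemory", "CreateRemoteThread", "CloseHandle"]
    print(cascade_classify(borderline, toy_fast, toy_deep))
```

Only traces whose fast score falls between the two thresholds reach the deep model, which is how a cascade of this kind keeps the expensive analysis to a fraction of the workload.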
Statistical analysis driven optimized deep learning system for intrusion detection
Attackers have developed ever more sophisticated and intelligent ways to hack
information and communication technology systems. The extent of damage an
individual hacker can carry out upon infiltrating a system is well understood.
A potentially catastrophic scenario can be envisaged where a nation-state
intercepting encrypted financial data gets hacked. Thus, intelligent
cybersecurity systems have become inevitably important for improved protection
against malicious threats. However, as malware attacks continue to dramatically
increase in volume and complexity, it has become ever more challenging for
traditional analytic tools to detect and mitigate threats. Furthermore, a huge
amount of data produced by large networks has made the recognition task even
more complicated and challenging. In this work, we propose an innovative
statistical analysis driven optimized deep learning system for intrusion
detection. The proposed intrusion detection system (IDS) extracts optimized and
more correlated features using big data visualization and statistical analysis
methods (human-in-the-loop), followed by a deep autoencoder for potential
threat detection. Specifically, a pre-processing module eliminates the outliers
and converts categorical variables into one-hot-encoded vectors. The feature
extraction module discards features with null values and selects the most
significant features as input to the deep autoencoder model (trained in a
greedy layer-wise manner). The NSL-KDD dataset from the Canadian Institute for
Cybersecurity is used as a benchmark to evaluate the feasibility and
effectiveness of the proposed architecture. Simulation results demonstrate the
potential of our proposed system, which outperforms existing state-of-the-art
methods and recently published approaches. Ongoing work
includes further optimization and real-time evaluation of our proposed IDS.
Comment: To appear in the 9th International Conference on Brain Inspired
Cognitive Systems (BICS 2018)
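The pre-processing steps mentioned in this abstract (discarding features with null values and one-hot encoding categorical variables) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names and the `extra` column are hypothetical, null features are dropped before encoding for simplicity, and outlier removal, feature selection, and the deep autoencoder itself are omitted. The field names `duration`, `protocol_type`, and `src_bytes` follow the NSL-KDD schema.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def drop_null_features(rows: Sequence[Row]) -> List[Row]:
    """Keep only the features that are non-null in every record."""
    keep = [k for k in rows[0] if all(r.get(k) is not None for r in rows)]
    return [{k: r[k] for k in keep} for r in rows]


def one_hot_encode(rows: Sequence[Row], categorical: Sequence[str]) -> List[List[float]]:
    """Expand each categorical feature into 0/1 indicator columns."""
    # Sorted category lists give every record the same column order.
    categories = {c: sorted({r[c] for r in rows}) for c in categorical}
    encoded = []
    for r in rows:
        vec: List[float] = []
        for key, value in r.items():
            if key in categories:
                vec.extend(1.0 if value == cat else 0.0 for cat in categories[key])
            else:
                vec.append(float(value))
        encoded.append(vec)
    return encoded


if __name__ == "__main__":
    records = [
        {"duration": 0, "protocol_type": "tcp", "src_bytes": 181, "extra": None},
        {"duration": 2, "protocol_type": "udp", "src_bytes": 239, "extra": 1},
    ]
    cleaned = drop_null_features(records)
    print(one_hot_encode(cleaned, categorical=["protocol_type"]))
```

The resulting fixed-length numeric vectors are the kind of input an autoencoder would typically be trained on.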