
3 research outputs found

Towards Accurate Run-Time Hardware-Assisted Stealthy Malware Detection: A Lightweight, yet Effective Time Series CNN-Based Approach

Authors: Costa Paulo Cesar; Gao Yifeng; Homayoun Houman; Lin Jessica; Makrani Hosein Mohammadi; Rafatirad Setareh; Sayadi Hossein
Publication venue: ScholarWorks @ UTRGV
Publication date: 01/10/2021

According to recent security analysis reports, malicious software (a.k.a. malware) is growing at an alarming rate in number, complexity, and harmful intent, compromising the security of modern computer systems. Recently, malware detection based on low-level hardware features (e.g., Hardware Performance Counter (HPC) information) has emerged as an effective alternative that addresses the complexity and performance overheads of traditional software-based detection methods. Hardware-assisted Malware Detection (HMD) techniques depend on standard Machine Learning (ML) classifiers to detect signatures of malicious applications by monitoring built-in HPC registers at run-time. Prior HMD methods, though effective, have limited their study to detecting malicious applications that are spawned as a separate thread during application execution; hence, detecting stealthy malware patterns at run-time remains a critical challenge. Stealthy malware refers to harmful cyber attacks in which malicious code is hidden within benign applications and remains undetected by traditional malware detection approaches. In this paper, we first present a comprehensive review of recent advances in hardware-assisted malware detection studies that have used standard ML techniques to detect malware signatures. Next, to address the challenge of stealthy malware detection at the processor’s hardware level, we propose StealthMiner, a novel specialized time series machine learning-based approach that accurately detects stealthy malware traces at run-time using branch instructions, the most prominent HPC feature. StealthMiner is based on a lightweight time series Fully Convolutional Neural Network (FCN) model that automatically identifies potentially contaminated samples in HPC-based time series data and uses them to accurately recognize the trace of stealthy malware.
Our analysis demonstrates that state-of-the-art ML-based malware detection methods are not effective in detecting stealthy malware samples, since the captured HPC data not only represents malware but also carries benign applications’ microarchitectural data. The experimental results demonstrate that, with the aid of our novel intelligent approach, stealthy malware can be detected at run-time with 94% detection performance on average using only one HPC feature, outperforming the detection performance of state-of-the-art HMD and general time series classification methods by up to 42% and 36%, respectively.

Directory of Open Access Journals

Dynamic resource provisioning for data center workloads with data constraints

Author: Li Shen
Publication date: 01/05/2016

Dynamic resource provisioning, an important data center software building block, helps to achieve high resource usage efficiency, leading to enormous monetary benefits. Most existing work on data center dynamic provisioning targets stateless servers, where any request can be routed to any server. However, the assumption of stateless behavior no longer holds for subsystems that are subject to data constraints, as a request may depend on a certain dataset stored on a small subset of servers. Routing a request to a server without the required dataset violates data locality or data availability properties, which may negatively impact response times. To solve this problem, this thesis provides a unified framework consisting of two main steps: 1) determining the proper amount of resources to serve the workload by analyzing the schedulability utilization bound; 2) avoiding transition penalties during cluster resizing operations by deliberately designing data distribution policies. We apply this framework to both storage and computing subsystems, where the former includes distributed file systems, databases, and memory caches, and the latter refers to systems such as Hadoop, Spark, and Storm. The proposed solutions are implemented in MemCached, HBase/HDFS, and Spark, and evaluated using various datasets, including Wikipedia, NYC taxi traces, Twitter traces, etc.