12 research outputs found

    Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers

    Malware classifiers are subject to training-time exploitation due to the need to regularly retrain using samples collected from the wild. Recent work has demonstrated the feasibility of backdoor attacks against malware classifiers, and yet the stealthiness of such attacks is not well understood. In this paper, we focus on Android malware classifiers and investigate backdoor attacks under the clean-label setting (i.e., attackers do not have complete control over the training process or the labeling of poisoned data). Empirically, we show that existing backdoor attacks against malware classifiers are still detectable by recent defenses such as MNTD. To improve stealthiness, we propose a new attack, Jigsaw Puzzle (JP), based on the key observation that malware authors have little to no incentive to protect any other authors' malware but their own. As such, Jigsaw Puzzle learns a trigger to complement the latent patterns of the malware author's samples, and activates the backdoor only when the trigger and the latent pattern are pieced together in a sample. We further focus on realizable triggers in the problem space (e.g., software code) using bytecode gadgets broadly harvested from benign software. Our evaluation confirms that Jigsaw Puzzle is effective as a backdoor, remains stealthy against state-of-the-art defenses, and is a threat in realistic settings that depart from reasoning about feature-space-only attacks. We conclude by exploring promising approaches to improve backdoor defenses.

    Is It Overkill? Analyzing Feature-Space Concept Drift in Malware Detectors

    Concept drift is a major challenge faced by machine learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect concept drift, the main causes behind the drift are not yet well understood. In this paper, we design experiments to empirically analyze the impact of feature-space drift (new features introduced by new samples) and compare it with data-space drift (data distribution shift over existing features). Surprisingly, we find that data-space drift is the dominant contributor to model degradation over time, while feature-space drift has little to no impact. This is consistently observed over both Android and PE malware detectors, with different feature types and feature engineering methods, across different settings. We further validate this observation with recent online-learning-based malware detectors that incrementally update the feature space. Our results indicate the possibility of handling concept drift without frequent feature updating, and we further discuss the open questions for future research.
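    The distinction between the two kinds of drift in the abstract above can be illustrated with a minimal sketch. It is not the paper's experimental method: the function names, the set-of-features sample representation, and both summary measures are assumptions chosen only for illustration.

```python
from collections import Counter


def feature_frequencies(samples):
    """Fraction of samples in which each feature appears."""
    counts = Counter(f for sample in samples for f in set(sample))
    total = len(samples)
    return {f: c / total for f, c in counts.items()}


def drift_summary(train, test):
    """Hypothetical summary of the two drift types for set-valued samples."""
    train_freq = feature_frequencies(train)
    test_freq = feature_frequencies(test)
    known = set(train_freq)
    new_features = set(test_freq) - known

    # Feature-space drift: share of test-time features never seen in training.
    feature_space = len(new_features) / len(test_freq) if test_freq else 0.0

    # Data-space drift: mean absolute change in frequency over the features
    # that already existed at training time (an assumed, simplistic measure).
    data_space = (
        sum(abs(train_freq[f] - test_freq.get(f, 0.0)) for f in known) / len(known)
        if known
        else 0.0
    )
    return {
        "feature_space": feature_space,
        "data_space": data_space,
        "new_features": new_features,
    }


if __name__ == "__main__":
    train = [{"a", "b"}, {"a"}]
    test = [{"a", "c"}, {"b", "c"}]
    print(drift_summary(train, test))
```

    Here the feature `c` is new at test time (feature-space drift), while the frequency of the existing feature `a` changes between the two sets (data-space drift).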

    INSOMNIA: Towards Concept-Drift Robustness in Network Intrusion Detection

    Despite decades of research in network traffic analysis and incredible advances in artificial intelligence, network intrusion detection systems based on machine learning (ML) have yet to prove their worth. One core obstacle is the existence of concept drift, an issue for all adversary-facing security systems. Additionally, specific challenges set intrusion detection apart from other ML-based security tasks, such as malware detection. In this work, we offer a new perspective on these challenges. We propose INSOMNIA, a semi-supervised intrusion detector which continuously updates the underlying ML model as network traffic characteristics are affected by concept drift. We use active learning to reduce latency in the model updates, label estimation to reduce labeling overhead, and apply explainable AI to better interpret how the model reacts to the shifting distribution. To evaluate INSOMNIA, we extend TESSERACT (a framework originally proposed for performing sound time-aware evaluations of ML-based malware detectors) to the network intrusion domain. Our evaluation shows that accounting for drifting scenarios is vital for effective intrusion detection systems.

    Glyph: Efficient ML-Based Detection of Heap Spraying Attacks

    Heap spraying is probably the simplest and most effective memory corruption attack: it fills the memory with malicious payloads and then jumps to a random location in the hope of starting the attacker's routines. To counter this threat, GRAFFITI has been recently proposed as the first OS-agnostic framework for monitoring memory allocations of arbitrary applications at runtime; however, the main contributions of GRAFFITI are on the monitoring system, and its detection engine only considers simple heuristics which are tailored to certain attack vectors and are easily evaded. In this article, we aim to overcome this limitation and propose GLYPH as the first ML-based heap spraying detection system, which is designed to be effective, efficient, and resilient to evasive attackers. GLYPH relies on the information monitored by GRAFFITI, and we investigate the effectiveness of different feature spaces based on information entropy and memory n-grams, and discuss the engineering challenges we faced to make GLYPH efficient with an overhead compatible with that of GRAFFITI. To evaluate GLYPH, we build a representative dataset with several variants of heap spraying attacks, and assess GLYPH's resilience against evasive attackers through selective hold-out experiments. Results show that GLYPH achieves high accuracy in detecting spraying and is able to generalize well, outperforming the state-of-the-art approach for heap spraying detection, NOZZLE. Finally, we thoroughly discuss the trade-offs between detection performance and runtime overhead of GLYPH's different configurations.
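    The abstract above mentions feature spaces based on information entropy and memory n-grams. A minimal sketch of what such features could look like for a raw memory buffer follows; it is not GLYPH's implementation, and the function names and the choice of byte-level 2-grams are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ngram_counts(data: bytes, n: int = 2) -> Counter:
    """Counts of overlapping byte n-grams in the buffer."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


if __name__ == "__main__":
    # A buffer filled with one repeated byte (as in a NOP-style fill)
    # versus a buffer in which every byte value occurs once.
    repetitive = b"\x90" * 64
    varied = bytes(range(256))
    print(shannon_entropy(repetitive), shannon_entropy(varied))
    print(ngram_counts(repetitive, 2).most_common(1))
```

    A highly repetitive buffer yields an entropy near zero and a single dominant n-gram, whereas varied content yields higher entropy; how a detector would threshold or combine such values is left open here.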

    DBank: Predictive Behavioral Analysis of Recent Android Banking Trojans

    Using a novel dataset of Android banking trojans (ABTs), other Android malware, and goodware, we develop the DBank system to predict whether a given Android APK is a banking trojan or not. We introduce the novel concept of a Triadic Suspicion Graph (TSG for short) which contains three kinds of nodes: goodware, banking trojans, and API packages. We develop a novel feature space based on two classes of scores derived from TSGs: suspicion scores (SUS) and suspicion ranks (SR); the latter yields a family of features that generalize PageRank. While TSG features (based on SUS/SR scores) provide very high predictive accuracy on their own in predicting recent (2016-2017) ABTs, we show that the combination of TSG features with previously studied lightweight static and dynamic features in the literature yields the highest accuracy in distinguishing ABTs from goodware, while preserving the same accuracy of prior feature combinations in distinguishing ABTs from other Android malware. In particular, DBank's overall accuracy in predicting whether an APK is a banking trojan or not is up to 99.9% AUC with 0.3% false positive rate. Moreover, we have already reported two unlabeled APKs from VirusTotal (which DBank has detected as ABTs) to the Google Android Security Team. In one case, we discovered it before any of the 63 anti-virus products on VirusTotal did, and in the other case, we beat 62 of the 63 anti-virus products on VirusTotal. This suggests that DBank is capable of making new discoveries in the wild before other established vendors. We also show that our novel TSG features have some interesting defensive properties as they are robust to knowledge of the training set by an adversary: even if the adversary uses 90% of our training set and uses the exact TSG features that we use, it is difficult for them to infer DBank's predictions on APKs. We additionally identify the features that best separate and characterize ABTs from goodware as well as from other Android malware. Finally, we develop a detailed data-driven analysis of five major recent ABT families: FakeToken, Svpeng, Asacub, BankBot, and Marcher, and identify the features that best separate them from goodware and other malware.
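    The abstract above states that suspicion ranks generalize PageRank but does not define them. As background only, here is a minimal sketch of standard PageRank by power iteration on a toy graph; it is not DBank's SR computation, and the graph, node names, and parameter values are hypothetical.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Standard PageRank by power iteration.

    `graph` maps each node to the list of nodes it links to; every linked
    node must also appear as a key.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v, targets in graph.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for u in targets:
                    new_rank[u] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[v] / n
                for u in nodes:
                    new_rank[u] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical toy graph: two apps calling into two API packages.
    toy = {
        "app_a": ["pkg_x", "pkg_y"],
        "app_b": ["pkg_x"],
        "pkg_x": [],
        "pkg_y": [],
    }
    for node, score in sorted(pagerank(toy).items()):
        print(node, round(score, 4))
```

    In this toy graph the package called by both apps ends up with a higher rank than the package called by only one of them.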

    The energy amplifier: an analysis and a research proposal

    Consiglio Nazionale delle Ricerche (CNR), Biblioteca Centrale, P.le Aldo Moro 7, Rome / SIGLE, IT, Ital