2 research outputs found

    Why an Android App is Classified as Malware? Towards Malware Classification Interpretation

    Full text link
    Machine learning (ML) based approach is considered as one of the most promising techniques for Android malware detection and has achieved high accuracy by leveraging commonly-used features. In practice, most of the ML classifications only provide a binary label to mobile users and app security analysts. However, stakeholders are more interested in the reason why apps are classified as malicious in both academia and industry. This belongs to the research area of interpretable ML but in a specific research domain (i.e., mobile malware detection). Although several interpretable ML methods have been exhibited to explain the final classification results in many cutting-edge Artificial Intelligent (AI) based research fields, till now, there is no study interpreting why an app is classified as malware or unveiling the domain-specific challenges. In this paper, to fill this gap, we propose a novel and interpretable ML-based approach (named XMal) to classify malware with high accuracy and explain the classification result meanwhile. (1) The first classification phase of XMal hinges multi-layer perceptron (MLP) and attention mechanism, and also pinpoints the key features most related to the classification result. (2) The second interpreting phase aims at automatically producing neural language descriptions to interpret the core malicious behaviors within apps. We evaluate the behavior description results by comparing with the existing interpretable ML-based methods (i.e., Drebin and LIME) to demonstrate the effectiveness of XMal. We find that XMal is able to reveal the malicious behaviors more accurately. Additionally, our experiments show that XMal can also interpret the reason why some samples are misclassified by ML classifiers. Our study peeks into the interpretable ML through the research of Android malware detection and analysis

    Towards Interpretable Ensemble Learning for Image-based Malware Detection

    Full text link
    Deep learning (DL) models for image-based malware detection have exhibited their capability in producing high prediction accuracy. But model interpretability is posing challenges to their widespread application in security and safety-critical application domains. This paper aims for designing an Interpretable Ensemble learning approach for image-based Malware Detection (IEMD). We first propose a Selective Deep Ensemble Learning-based (SDEL) detector and then design an Ensemble Deep Taylor Decomposition (EDTD) approach, which can give the pixel-level explanation to SDEL detector outputs. Furthermore, we develop formulas for calculating fidelity, robustness and expressiveness on pixel-level heatmaps in order to assess the quality of EDTD explanation. With EDTD explanation, we develop a novel Interpretable Dropout approach (IDrop), which establishes IEMD by training SDEL detector. Experiment results exhibit the better explanation of our EDTD than the previous explanation methods for image-based malware detection. Besides, experiment results indicate that IEMD achieves a higher detection accuracy up to 99.87% while exhibiting interpretability with high quality of prediction results. Moreover, experiment results indicate that IEMD interpretability increases with the increasing detection accuracy during the construction of IEMD. This consistency suggests that IDrop can mitigate the tradeoff between model interpretability and detection accuracy
    corecore