Why an Android App is Classified as Malware? Towards Malware Classification Interpretation
Machine learning (ML)-based approaches are considered among the most
promising techniques for Android malware detection and have achieved high
accuracy by leveraging commonly used features. In practice, most ML
classifiers provide only a binary label to mobile users and app security
analysts. However, stakeholders in both academia and industry are more
interested in why an app is classified as malicious. This question belongs
to the research area of interpretable ML, but within a specific domain
(i.e., mobile malware detection). Although several interpretable ML methods
have been proposed to explain final classification results in many
cutting-edge artificial intelligence (AI) research fields, to date no study
has interpreted why an app is classified as malware or unveiled the
domain-specific challenges.
In this paper, to fill this gap, we propose a novel and interpretable
ML-based approach (named XMal) that classifies malware with high accuracy
and explains the classification results at the same time. (1) The first,
classification phase of XMal hinges on a multi-layer perceptron (MLP) and
an attention mechanism, and pinpoints the key features most relevant to the
classification result. (2) The second, interpreting phase aims to
automatically produce natural language descriptions of the core malicious
behaviors within apps. We evaluate the behavior descriptions against those
of existing interpretable ML-based methods (i.e., Drebin and LIME) to
demonstrate the effectiveness of XMal, and find that XMal reveals malicious
behaviors more accurately. Additionally, our experiments show that XMal can
interpret why some samples are misclassified by ML classifiers. Our study
thus peeks into interpretable ML through the lens of Android malware
detection and analysis.
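To make the classification-phase idea concrete, here is a minimal NumPy sketch of an attention-weighted MLP over binary app features, in the spirit of the abstract. It is not the authors' implementation: the feature dimension, weight matrices, and top-k cutoff are all hypothetical, and a real system would learn these weights from labeled apps.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical binary feature vector: presence of permissions / API calls.
n_features = 8
x = rng.integers(0, 2, size=n_features).astype(float)

# Attention scoring: a weight matrix produces one score per feature;
# softmax turns the scores into weights that sum to 1.
W_att = rng.normal(size=(n_features, n_features))
att_weights = softmax(W_att @ x)

# Re-weight the input before the MLP so the classifier focuses on the
# features the attention layer deems most relevant.
x_weighted = att_weights * x

# Two-layer MLP producing a malware probability.
W1, b1 = rng.normal(size=(16, n_features)), np.zeros(16)
W2, b2 = rng.normal(size=16), 0.0
h = np.maximum(0.0, W1 @ x_weighted + b1)  # ReLU hidden layer
p_malware = sigmoid(W2 @ h + b2)

# The attention weights double as an explanation: the top-k features are
# candidates for "why" the app was flagged as malicious.
top_k = np.argsort(att_weights)[::-1][:3]
print(f"p(malware) = {p_malware:.3f}, key feature indices: {top_k}")
```

In this scheme, the same attention weights that drive the prediction also rank the features, which is what allows a second phase to turn the top-ranked features into behavior descriptions.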
Towards Interpretable Ensemble Learning for Image-based Malware Detection
Deep learning (DL) models for image-based malware detection have exhibited
their capability to produce high prediction accuracy, but their limited
interpretability poses challenges to widespread deployment in security- and
safety-critical application domains. This paper designs an Interpretable
Ensemble learning approach for image-based Malware Detection (IEMD). We
first propose a Selective Deep Ensemble Learning-based (SDEL) detector and
then design an Ensemble Deep Taylor Decomposition (EDTD) approach, which
gives pixel-level explanations of the SDEL detector's outputs. Furthermore,
we develop formulas for calculating fidelity, robustness, and expressiveness
on pixel-level heatmaps in order to assess the quality of EDTD explanations.
Building on EDTD explanations, we develop a novel Interpretable Dropout
approach (IDrop), which establishes IEMD by training the SDEL detector.
Experimental results show that EDTD provides better explanations than
previous explanation methods for image-based malware detection. They also
show that IEMD achieves a detection accuracy of up to 99.87% while producing
high-quality interpretations of its predictions. Moreover, IEMD's
interpretability increases alongside its detection accuracy during
construction; this consistency suggests that IDrop can mitigate the tradeoff
between model interpretability and detection accuracy.
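The following NumPy sketch illustrates the ensemble-heatmap idea and a simple deletion-style fidelity check under stated assumptions: each ensemble member is stood in for by a linear scorer (a real SDEL member would be a CNN), the heatmaps are combined by a plain average, and the fidelity measure is a generic occlusion test, not the paper's EDTD propagation rules or formulas.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_detector(img, w):
    """Stand-in for one ensemble member: a linear scorer over pixels."""
    return float(np.tanh((w * img).sum()))

# A toy 16x16 grayscale "malware image" and three hypothetical members.
img = rng.random((16, 16))
members = [rng.normal(size=(16, 16)) for _ in range(3)]

# Per-member relevance maps. For a linear scorer, input * weight is exactly
# each pixel's contribution; Deep Taylor Decomposition generalizes this to
# deep networks via layer-wise propagation rules.
heatmaps = [w * img for w in members]

# Ensemble explanation: combine the members' heatmaps (here a plain average;
# EDTD defines a principled combination not reproduced in this sketch).
ens_heatmap = np.mean(heatmaps, axis=0)

def ensemble_score(x):
    return np.mean([fake_detector(x, w) for w in members])

# Deletion-style fidelity check: occlude the top-k most relevant pixels and
# measure how much the ensemble score drops. A faithful heatmap should cause
# a larger drop than occluding random pixels.
k = 25
top = np.argsort(ens_heatmap.ravel())[::-1][:k]
occluded = img.copy().ravel()
occluded[top] = 0.0
drop = ensemble_score(img) - ensemble_score(occluded.reshape(16, 16))
print(f"score drop after occluding top-{k} pixels: {drop:.3f}")
```

The occlusion test captures the intuition behind fidelity: if the pixels a heatmap marks as important really drive the prediction, removing them should noticeably change the detector's score.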