A Malware Detection System For Android

Abstract

Android security is built upon a permission-based mechanism, which restricts the access of third-party Android applications to critical resources on an Android device. The user must accept the set of permissions that an application requires before the installation proceeds. This is meant to inform the user about the risks of installing and using an application. The mechanism has two problems. The first is that users are not sufficiently aware of existing threats; they trust either the application store or the popularity of the application and accept the installation without analysing the intentions of the developer. The second is that Android does not display the specific resources needed by the application and the corresponding permissions during installation. Instead, it presents categories, each representing a set of resources, with a description. These categories implicitly include the permissions necessary to access some resources. Probably confused by the management of permissions, the user grants more authorisations than necessary, which increases the difficulty of detecting malicious applications and constitutes the basis for many attacks. This thesis defines a system for detecting Android malware based only on requested permissions. It focuses on 222 permissions, including some exclusively for third-party applications. It is a static analysis technique that combines two reliable strategies. The first one focuses on a discriminating metric based on the frequency of permissions and the proportion of requests by malicious applications within the whole sample. The second one relies on the security risks related to granting permissions. A comparison has shown that the four protection levels of permissions defined by Google are coarse-grained, hiding the real sense of permissions. The first strategy is fine-grained and more precise in terms of permission semantics. We collected a dataset of 6783 malicious and 1993 normal applications, which have been tested and validated.
Profiles for each sample have been generated according to both strategies and used as input for the training and learning processes. Seven classifiers have been applied to the models to produce performance results. We select the best-performing ones to define our classifier, which provides outstanding performance in detection and prediction. Our work also releases a dataset associating permissions with weights, which can be reused in other research. Evaluations indicate that our model is one of the best tools that use only requested permissions as a feature. It is able to detect around 99.20% of the 1260 malware samples released by the Genome project, which represent the behaviour of present-day malware. This work provides a scheme for weighting permissions that is possibly applicable to a dataset of unknown samples while keeping good classification performance. The model is good at detecting Android malware, with a True Positive Rate of around 97%, and at predicting Android malware, with a True Positive Rate of around 95%. This means that it is capable of discriminating almost all cases of malware in detection and prediction. The Area Under Curve (AUC) metric is between 97% and 99%, which qualifies the system as an outstanding detector of malware. We additionally propose a system that can be embedded into an Android hand-held device for real-time detection. The results of a comparison with three renowned antivirus products reveal that our framework clearly outperforms two of them.
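
The first strategy can be illustrated with a minimal sketch. The abstract does not give the exact formula of the discriminating metric, so the rule below is an assumption: each permission is weighted by the share of its requests that come from malicious applications, and an application is scored by the mean weight of the permissions it requests. The function names `permission_weights` and `app_score` and the tiny sample data are hypothetical and serve only as illustration.

```python
from collections import Counter


def permission_weights(malicious_apps, benign_apps):
    """Map each permission to the share of its requests made by malicious apps.

    Assumed rule (not taken from the thesis): weight = m / (m + b), where m and b
    count the malicious and benign apps requesting the permission.
    """
    mal_counts = Counter(p for app in malicious_apps for p in set(app))
    ben_counts = Counter(p for app in benign_apps for p in set(app))
    weights = {}
    for perm in mal_counts.keys() | ben_counts.keys():
        total = mal_counts[perm] + ben_counts[perm]
        weights[perm] = mal_counts[perm] / total
    return weights


def app_score(requested, weights):
    """Mean weight of the requested permissions that appear in the weight table."""
    known = [weights[p] for p in set(requested) if p in weights]
    return sum(known) / len(known) if known else 0.0


# Hypothetical toy sample: each app is the set of permissions it requests.
malicious = [{"SEND_SMS", "INTERNET"}, {"SEND_SMS", "READ_CONTACTS"}]
benign = [{"INTERNET"}, {"INTERNET", "CAMERA"}]

weights = permission_weights(malicious, benign)
score = app_score({"SEND_SMS", "INTERNET"}, weights)
```

With a sample as unbalanced as the one described (6783 malicious against 1993 normal applications), a variant that divides each count by its class size before taking the ratio may be preferable; that choice is likewise an assumption and not something the abstract states.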
