Obfuscation-resilient Android Malware Analysis Based on Contrastive Learning
Due to its open-source nature, the Android operating system has been the main
target for attackers to exploit. Malware creators always perform different code
obfuscations on their apps to hide malicious activities. Features extracted
from these obfuscated samples through program analysis contain many useless and
disguised features, which leads to many false negatives. To address the issue,
in this paper, we demonstrate that obfuscation-resilient malware analysis can
be achieved through contrastive learning. We take the Android malware
classification as an example to demonstrate our analysis. The key insight
behind our analysis is that contrastive learning can be used to reduce the
difference introduced by obfuscation while amplifying the difference between
malware and benign apps (or other types of malware).
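
The key insight above can be illustrated with a small contrastive loss. The abstract does not state which loss is used, so the InfoNCE-style formulation below, the function names, and the temperature value are all assumptions chosen for illustration. The anchor is an embedding of a sample, the positive is an embedding of its obfuscated variant, and the negatives are embeddings of samples from other classes.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss (an assumed stand-in, not the paper's stated loss).

    The loss shrinks as the anchor moves closer to its positive
    (e.g. an obfuscated variant of the same app) and farther from
    the negatives (e.g. benign apps or other malware families).
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical 2-D embeddings, for illustration only.
original = [1.0, 0.0]
obfuscated = [0.9, 0.1]   # same app after obfuscation
other_family = [0.0, 1.0]

aligned = contrastive_loss(original, obfuscated, [other_family])
misaligned = contrastive_loss(original, other_family, [obfuscated])
print(aligned < misaligned)
```

Minimizing such a loss over an encoder's outputs pushes obfuscated variants toward their originals while pushing different classes apart, which is the behavior the abstract describes.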
Based on the proposed analysis, we design a system that can achieve robust
and interpretable classification of Android malware. To achieve robust
classification, we perform contrastive learning on malware samples to learn an
encoder that can automatically extract robust features from malware samples. To
achieve interpretable classification, we transform the function call graph of a
sample into an image by centrality analysis. Then the corresponding heatmaps
are obtained by visualization techniques. These heatmaps can help users
understand why a malware sample is assigned to a given family. We implement the
system, IFDroid, and perform extensive evaluations on two widely used datasets. Experimental
results show that IFDroid is superior to state-of-the-art Android malware
familial classification systems. Moreover, IFDroid maintains a 98.2% true
positive rate when classifying 8,112 obfuscated malware samples.
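
The transformation of a function call graph into an image by centrality analysis can be sketched as follows. The abstract does not say which centrality measure, which set of functions, or which image layout is used, so degree centrality, the tracked API list, and the row-major grid below are assumptions made for illustration.

```python
def degree_centrality(edges):
    """Degree centrality of each node in a call graph.

    `edges` is an iterable of (caller, callee) pairs; duplicates are
    ignored. Each degree is divided by (number of nodes - 1).
    """
    unique_edges = set(edges)
    nodes = {n for edge in unique_edges for n in edge}
    degree = {n: 0 for n in nodes}
    for caller, callee in unique_edges:
        degree[caller] += 1
        degree[callee] += 1
    scale = max(len(nodes) - 1, 1)
    return {n: d / scale for n, d in degree.items()}

def to_image(centrality, tracked_apis, width):
    """Lay out centrality values of tracked APIs as a 2-D grid.

    APIs absent from the graph get 0.0, and the last row is
    zero-padded so every row has `width` pixels.
    """
    values = [centrality.get(api, 0.0) for api in tracked_apis]
    values += [0.0] * (-len(values) % width)
    return [values[i:i + width] for i in range(0, len(values), width)]

# Hypothetical call graph and tracked API list, for illustration only.
edges = [
    ("onCreate", "sendTextMessage"),
    ("onCreate", "getDeviceId"),
    ("run", "sendTextMessage"),
]
tracked = ["sendTextMessage", "getDeviceId", "openConnection"]
image = to_image(degree_centrality(edges), tracked, width=2)
print(image)
```

Because every sample is mapped onto the same fixed grid, an image classifier and standard heatmap visualizations can then be applied to the result.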
Leveraging the Use of API Call Traces for Mobile Security
The growing popularity of Android applications has generated increased concerns over the danger of piracy and the spread of malware. A popular way to distribute malware in the mobile world is through the repackaging of legitimate apps. This process consists of downloading, unpacking, manipulating, and recompiling an application, then publishing it again in an app store. In this thesis, we conduct an empirical study of over 15,000 apps to gain insights into the factors that drive the spread of repackaged apps. We also examine the motivations of developers who publish repackaged apps and those of users who download them, as well as the factors that determine which apps are chosen for repackaging and the ways in which the apps are modified during the repackaging process. We have also studied the structure of Android applications to investigate the locations where malicious code is most likely to be embedded into legitimate applications. We observed that service components have key characteristics that entice attackers to misuse them. Therefore, we focused on studying the behavior of malicious and benign services. Whereas benign services tend to inform the user of their background operations, malicious services tend to perform long-running operations and have a loose connection with the rest of the code. These findings led us to propose an approach to detect malware by studying the services' behavior. To model the services' behavior, we used API calls as feature sets. We proposed a hybrid approach using static and dynamic analysis to extract the API calls made throughout the service lifecycle. Finally, we used the list of API calls preponderantly present in both malware and benign services as the feature set. We applied machine learning algorithms to this feature set to classify malicious and benign services.
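
The use of API calls as a feature set can be sketched as follows. The abstract does not name the selection rule or the learning algorithms, so the frequency-based selection, the binary encoding, and the 1-nearest-neighbor classifier below are assumed stand-ins, and the traces are hypothetical.

```python
from collections import Counter

def select_features(traces, k):
    """Pick the k API calls that occur in the most service traces."""
    counts = Counter(api for trace in traces for api in set(trace))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [api for api, _ in ranked[:k]]

def encode(trace, features):
    """Binary vector: 1 if the service called the API, else 0."""
    present = set(trace)
    return [1 if api in present else 0 for api in features]

def predict_1nn(vector, labeled):
    """Label of the closest training vector by Hamming distance.

    A placeholder for the unspecified machine learning algorithms.
    """
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(labeled, key=lambda item: hamming(vector, item[0]))[1]

# Hypothetical API call traces of two services, for illustration only.
malicious = ["startService", "openConnection", "sendTextMessage"]
benign = ["startService", "startForeground", "notify"]

features = select_features([malicious, benign], k=5)
training = [
    (encode(malicious, features), "malicious"),
    (encode(benign, features), "benign"),
]
unknown = encode(["startService", "openConnection"], features)
print(predict_1nn(unknown, training))
```

In practice the traces would come from the static and dynamic analysis of the service lifecycle, and any standard classifier could replace the nearest-neighbor placeholder.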