6 research outputs found

A Survey and Evaluation of Android-Based Malware Evasion Techniques and Detection Frameworks

Author: Bhan Rhati; Bhatia Sajal; El Madhoun Nour; Faruki Parvez; Jain Vinesh; Pamula Rajendra
Publication venue: DigitalCommons@SHU
Publication date: 01/01/2023

Android platform security is an active area of research in which malware detection techniques continuously evolve to identify novel malware and to improve the timely and accurate detection of existing malware. Adversaries constantly employ innovative techniques to evade or delay malware detection. Past studies have shown that malware detection systems are susceptible to evasion attacks, in which adversaries bypass existing security defenses and deliver malware to the target system undetected. The development of evasion-resistant systems remains an open research problem. This paper presents a detailed taxonomy and evaluation of Android-based malware evasion techniques deployed to circumvent malware detection. The study groups such evasion techniques into two broad categories, polymorphism and metamorphism, and analyses techniques used for stealth malware detection based on the malware’s unique characteristics. The article also presents a qualitative and systematic comparison of evasion detection frameworks and their detection methodologies for Android-based malware. Finally, the survey discusses open questions and potential future directions for continued research in mobile malware detection.

Sacred Heart University: DigitalCommons@SHU

Rapid Permissions-Based Detection and Analysis of Mobile Malware Using Random Decision Forests

Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Feature Selection on Permissions, Intents and APIs for Android Malware Detection

Author: Guyton Fred
Publication venue: NSUWorks
Publication date: 01/01/2021

Malicious applications pose an enormous security threat to mobile computing devices. Currently 85% of all smartphones run Android, Google’s open-source operating system, making that platform the primary threat vector for malware attacks. Android hosts roughly 99% of known malware to date and, because of its open-source nature, is the focus of most research efforts in mobile malware detection. One of the main tools used in this effort is supervised machine learning. While a decade of work has produced considerable progress in detection accuracy, every stream of research must overcome the same obstacle: feature selection, i.e., determining which attributes of Android are most effective as inputs to machine learning models. This dissertation aims to address that problem by providing the community with an exhaustive analysis of the three primary types of Android features used by researchers: Permissions, Intents and API Calls. The intent of the report is not to describe a best-performing feature set or a best-performing machine learning model, nor to explain why certain Permissions, Intents or API Calls are selected above others, but rather to provide a holistic methodology to help guide feature selection for Android malware detection. The experiments used eleven different feature selection techniques covering filter methods, wrapper methods and embedded methods. Each feature selection technique was applied to seven different datasets, corresponding to the seven possible combinations of Permissions, Intents and API Calls. Each of those seven datasets is drawn from a base set of 119k Android apps. All of the result sets were then validated against three different machine learning models (Random Forest, SVM and a Neural Net) to test applicability across algorithm types.
The experiments show that using a combination of Permissions, Intents and API Calls produced higher accuracy than using any of those alone or in any other combination, and that feature selection should be performed on the combined dataset rather than per feature type with the results combined afterwards. The data also show that, in general, a feature set size of 200 or more attributes is required for optimal results. Finally, the feature selection methods Relief, Correlation-based Feature Selection (CFS) and Recursive Feature Elimination (RFE) using a Neural Net are not satisfactory approaches for Android malware detection work. Based on the proposed methodology and experiments, this research provides insights into feature selection, a significant but often overlooked issue in Android malware detection. We believe the results reported herein are an important step towards effective feature evaluation and selection in support of malware detection, especially for datasets with a large number of features. The methodology also has the potential to be applied to similar malware detection tasks or even in broader domains such as pattern recognition.
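
The recommendation to select features on the combined dataset can be illustrated with a minimal sketch. It is not the dissertation's code: the toy apps, the feature names, and the helper functions `mutual_information` and `select_top_k` are all hypothetical, and mutual information stands in here as one example of a filter method (the abstract does not list the eleven techniques it used).

```python
import math

# Hypothetical toy corpus: each app is a set of feature names plus a label
# (1 = malware, 0 = benign). Prefixes mark the feature type, so Permissions,
# Intents and API Calls share one combined feature space.
APPS = [
    ({"perm:SEND_SMS", "intent:BOOT_COMPLETED", "api:sendTextMessage"}, 1),
    ({"perm:SEND_SMS", "perm:INTERNET", "api:sendTextMessage"}, 1),
    ({"perm:INTERNET", "intent:BOOT_COMPLETED", "api:sendTextMessage"}, 1),
    ({"perm:INTERNET", "api:openConnection"}, 0),
    ({"perm:INTERNET", "perm:CAMERA", "api:openConnection"}, 0),
    ({"perm:CAMERA", "intent:MAIN"}, 0),
]


def mutual_information(apps, feature):
    """Mutual information (in bits) between a binary feature and the label."""
    n = len(apps)
    counts = {}
    for features, label in apps:
        key = (feature in features, label)
        counts[key] = counts.get(key, 0) + 1
    mi = 0.0
    for (present, label), joint in counts.items():
        p_xy = joint / n
        p_x = sum(c for (x, _), c in counts.items() if x == present) / n
        p_y = sum(c for (_, y), c in counts.items() if y == label) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def select_top_k(apps, k):
    """Rank every feature in the combined space and keep the k best."""
    vocabulary = sorted(set().union(*(features for features, _ in apps)))
    ranked = sorted(vocabulary, key=lambda f: (-mutual_information(apps, f), f))
    return ranked[:k]


if __name__ == "__main__":
    print(select_top_k(APPS, 1))
```

Because all three feature types compete in a single ranking, a strongly predictive API call can displace a weak permission, which is the behaviour that per-type selection followed by concatenation would not allow.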

ProQuest OAI Repository

NSU Works

Investigating Android permissions and intents for malware detection

Author: Abro F. I.

Today’s smartphones are used for a wider range of activities. This extended range of functionality has also brought new security threats. Android has been the favorite target of cyber criminals. Malicious parties use highly stealthy techniques to perform their targeted operations, which are hard to detect with conventional signature-based and behaviour-based approaches. Additionally, the limited resources of mobile devices are inadequate for extensive malware detection tasks. Rapidly emerging Android malware merits a robust and effective detection solution. In this thesis, we present PIndroid, a novel Permissions and Intents based framework for identifying Android malware apps. To the best of the author’s knowledge, PIndroid is the first solution that uses a combination of permissions and intents supplemented with ensemble methods for malware detection. It overcomes the drawbacks of some existing malware detection methods. Our goal is to provide mobile users with an effective malware detection and prevention solution, keeping in view the limited resources of mobile devices and the versatility of malware behavior. Our detection engine classifies apps against certain distinguishing combinations of permissions and intents. We conducted a comparative study of different machine learning algorithms against several performance measures to demonstrate their relative advantages. The proposed approach, when applied to 1,745 real-world applications, provides more than 99% accuracy (the best reported to date). Empirical results suggest that the proposed framework is effective in detecting malware apps, including obfuscated ones. In this thesis, we also present AndroPIn, an Android-based malware detection algorithm using Permissions and Intents. It is designed with the methodology proposed in PIndroid.
AndroPIn overcomes the limitation of stealthy techniques used by malware by exploiting the usage pattern of permissions and intents. These features, which play a major role in sharing user data and device resources, cannot be obfuscated or altered. They are also well suited to resource-constrained smartphones. Experimental evaluation on a corpus of real-world malware and benign apps demonstrates that the proposed algorithm can effectively detect malicious apps and is resilient to common obfuscation methods. Besides PIndroid and AndroPIn, this thesis consists of three additional studies, which supplement the proposed methodology. The first study investigates whether there is any correlation between permissions and intents that can be exploited to detect malware apps. For this, a statistical significance test is applied to the relationship between permissions and intents. We found statistical evidence of a strong correlation between permissions and intents which could be exploited to detect malware applications. The second study investigates whether the performance of classifiers can be further improved with ensemble learning methods. We applied different ensemble methods such as bagging, boosting and stacking. The experiments with ensemble methods yielded much improved results. The third study investigates whether a permissions and intents based system can be used to detect the ever-challenging colluding apps. Application collusion is an emerging threat to Android-based devices. We discuss the current state of research on app collusion and open challenges to the detection of colluding apps. We compare existing approaches and present an integrated approach that can be used to detect malicious app collusion.
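
One way to picture the first study's question, whether a permission and an intent tend to occur together, is the phi coefficient of a 2x2 contingency table. The sketch below is an assumption for illustration only: the abstract does not say which statistical test the thesis used, and the toy manifests and the function name `phi_coefficient` are hypothetical.

```python
import math

# Hypothetical toy data: the permissions and intents declared by six apps.
MANIFESTS = [
    {"perm:RECEIVE_BOOT_COMPLETED", "intent:BOOT_COMPLETED"},
    {"perm:RECEIVE_BOOT_COMPLETED", "intent:BOOT_COMPLETED"},
    {"perm:RECEIVE_BOOT_COMPLETED"},
    {"perm:INTERNET"},
    {"perm:INTERNET"},
    {"intent:MAIN"},
]


def phi_coefficient(manifests, feature_a, feature_b):
    """Phi coefficient (-1..1) between two binary features across apps."""
    n11 = n10 = n01 = n00 = 0
    for features in manifests:
        has_a = feature_a in features
        has_b = feature_b in features
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        # One of the features is constant, so no association can be measured.
        return 0.0
    return (n11 * n00 - n10 * n01) / denominator


if __name__ == "__main__":
    print(phi_coefficient(
        MANIFESTS, "perm:RECEIVE_BOOT_COMPLETED", "intent:BOOT_COMPLETED"
    ))
```

A value near 1 means the two features are usually declared together; for a 2x2 table, multiplying the squared coefficient by the number of apps gives the chi-square statistic that a significance test would use.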

City Research Online

Data Science for Software Maintenance

Author: Inozemtseva Laura Michelle McLean
Publication venue: University of Waterloo
Publication date: 01/01/2017

Maintaining and evolving modern software systems is a difficult task: their scope and complexity mean that seemingly inconsequential changes can have far-reaching consequences. Most software development companies attempt to reduce the number of faults introduced by adopting maintenance processes. These processes can be developed in various ways. In this thesis, we argue that data science techniques can be used to support process development. Specifically, we claim that robust development processes are necessary to minimize the number of faults introduced when evolving complex software systems. These processes should be based on empirical research findings. Data science techniques allow software engineering researchers to develop research insights that may be difficult or impossible to obtain with other research methodologies. These research insights support the creation of development processes. Thus, data science techniques support the creation of empirically-based development processes. We support this argument with three examples. First, we present insights into automated malicious Android application (app) detection. Many of the prior studies done on this topic used small corpora that may provide insufficient variety to create a robust app classifier. Currently, no empirically established guidelines for corpus size exist, meaning that previous studies have used anywhere from tens of apps to hundreds of thousands of apps to draw their conclusions. This variability makes it difficult to judge if the findings of any one study generalize. We attempted to establish such guidelines and found that 1,000 apps may be sufficient for studies that are concerned with what the majority of apps do, while more than a million apps may be required in studies that want to identify outliers. 
Moreover, many prior studies of malicious app detection used outdated malware corpora in their experiments that, combined with the rapid evolution of the Android API, may have influenced the accuracy of the studies. We investigated this problem by studying 1.3 million apps and showed that the evolution of the API does affect classifier accuracy, but not in the way we originally predicted. We also used our API usage data to identify the most infrequently used API methods. The use of data science techniques allowed us to study an order of magnitude more apps than previous work in the area; additionally, our insights into infrequently used methods illustrate how data science can be used to guide API deprecation. Second, we present insights into the costs and benefits of regression testing. Regression test suites grow over time, and while a comprehensive suite can detect faults that are introduced into the system, such a suite can be expensive to write, maintain, and execute. These costs may or may not be justified, depending on the number and severity of faults the suite can detect. By studying 61 projects that use Travis CI, a continuous integration system, we were able to characterize the cost/benefit tradeoff of their test suites. For example, we found that only 74% of non-flaky test failures are caused by defects in the system under test; the other 26% were caused by incorrect or obsolete tests and thus represent a maintenance cost rather than a benefit of the suite. Data about the costs and benefits of testing can help system maintainers understand whether their test suite is a good investment, shaping their subsequent maintenance decisions. The use of data science techniques allowed us to study a large number of projects, increasing the external generalizability of the study and making the insights gained more useful. Third, we present insights into the use of mutants to replace real faulty programs in testing research. 
Mutants are programs that contain deliberately injected faults, where the faults are generated by applying mutation operators. Applying an operator means making a small change to the program source code, such as replacing a constant with another constant. The use of mutants is appealing because large numbers of mutants can be automatically generated and used when known faults are unavailable or insufficient in number. However, prior to this work, there was little experimental evidence to support the use of mutants as a replacement for real faults. We studied this problem and found that, in general, mutants are an adequate substitute for faults when conducting testing research. That is, a test suite’s ability to detect mutants is correlated with its ability to detect real faults that developers have fixed, for both developer-written and automatically-generated test suites. However, we also found that additional mutation operators should be developed and some classes of faults cannot be generated via mutation. The use of data science techniques was an essential part of generating the set of real faults used in the study. Taken together, the results of these three studies provide evidence that data science techniques allow software engineering researchers to develop insights that are difficult or impossible to obtain using other research methodologies.
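
The constant-replacement operator described in the abstract can be sketched with Python's standard `ast` module. This is a toy illustration, not the tooling used in the thesis: the program `is_adult`, the class `ReplaceConstant`, the helpers `load` and `suite_passes`, and both test suites are hypothetical.

```python
import ast

# Hypothetical program under test.
SOURCE = """
def is_adult(age):
    return age >= 18
"""


class ReplaceConstant(ast.NodeTransformer):
    """Mutation operator: replace one integer constant with another."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Constant(self, node):
        # Match plain ints only (bool is a subclass of int, so exclude it).
        if type(node.value) is int and node.value == self.old:
            return ast.copy_location(ast.Constant(value=self.new), node)
        return node


def load(tree):
    """Compile a module AST and return its is_adult function."""
    namespace = {}
    exec(compile(tree, "<program>", "exec"), namespace)
    return namespace["is_adult"]


def suite_passes(func, cases):
    """A suite passes when every (input, expected) pair matches."""
    return all(func(arg) == expected for arg, expected in cases)


original = load(ast.parse(SOURCE))
mutant_tree = ReplaceConstant(18, 21).visit(ast.parse(SOURCE))
mutant = load(ast.fix_missing_locations(mutant_tree))

WEAK_SUITE = [(10, False), (30, True)]     # no test at the boundary
STRONG_SUITE = WEAK_SUITE + [(18, True)]   # adds a boundary test

if __name__ == "__main__":
    print("weak suite detects mutant:", not suite_passes(mutant, WEAK_SUITE))
    print("strong suite detects mutant:", not suite_passes(mutant, STRONG_SUITE))
```

The weak suite lets the mutant survive because neither of its inputs falls between the two constants, while the boundary case in the strong suite detects it. That difference is the kind of signal a mutant detection rate is meant to capture.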

University of Waterloo's Institutional Repository