14 research outputs found

    Ensemble Method for Mobile Malware Detection using N-Gram Sequences of System Calls

    Get PDF
    Mobile device has become an essential tool among the community across the globe and has turned into a necessity in daily life. An extensive usage of mobile devices for everyday life tasks such as online banking, online shopping and exchanging e-mails has enable mobile devices to become data storage for users. The data stored in these mobile devices can contain sensitive and critical information to the users. Hence, making mobile devices as the prime target for cybercriminal. To date, Android based mobile devices is one of the mobile devices that are dominating the phone market. Moreover, the ease of use and open-source feature has made Android based mobile devices popular. However, the widely used Android mobile devices has encourage malware author to write malicious application. In a short duration of time mobile malware has rapidly evolve and have the capability to bypass signature detection approach which requires a constant signature update to detect mobile malware. To overcome this drawback an anomaly detection approach can be used to mitigate this issue. Yet, using a single classifier in an anomaly detection approach will not improve the classification detection performance. Based on this reason, this research formulates an ensemble classification method of different n-gram system call sequence features to improve the accuracy of mobile malware detection. This research proposes n-number of classifier models for each different n-gram sequence call feature. The probability output of each classifier is then combined to produce a better classification performance which is better compared to a single classifier

    apk2vec: Semi-supervised multi-view representation learning for profiling Android applications

    Full text link
    Building behavior profiles of Android applications (apps) with holistic, rich and multi-view information (e.g., incorporating several semantic views of an app such as API sequences, system calls, etc.) would help catering downstream analytics tasks such as app categorization, recommendation and malware analysis significantly better. Towards this goal, we design a semi-supervised Representation Learning (RL) framework named apk2vec to automatically generate a compact representation (aka profile/embedding) for a given app. More specifically, apk2vec has the three following unique characteristics which make it an excellent choice for largescale app profiling: (1) it encompasses information from multiple semantic views such as API sequences, permissions, etc., (2) being a semi-supervised embedding technique, it can make use of labels associated with apps (e.g., malware family or app category labels) to build high quality app profiles, and (3) it combines RL and feature hashing which allows it to efficiently build profiles of apps that stream over time (i.e., online learning). The resulting semi-supervised multi-view hash embeddings of apps could then be used for a wide variety of downstream tasks such as the ones mentioned above. Our extensive evaluations with more than 42,000 apps demonstrate that apk2vec's app profiles could significantly outperform state-of-the-art techniques in four app analytics tasks namely, malware detection, familial clustering, app clone detection and app recommendation.Comment: International Conference on Data Mining, 201

    Detecting Repackaged Android Applications Using Perceptual Hashing

    Get PDF
    The last decade has shown a steady rate of Android device dominance in market share and the emergence of hundreds of thousands of apps available to the public. Because of the ease of reverse engineering Android applications, repackaged malicious apps that clone existing code have become a severe problem in the marketplace. This research proposes a novel repackaged detection system based on perceptual hashes of vetted Android apps and their associated dynamic user interface (UI) behavior. Results show that an average hash approach produces 88% accuracy (indicating low false negative and false positive rates) in a sample set of 4878 Android apps, including 2151 repackaged apps. The approach is the first dynamic method proposed in the research community using image-based hashing techniques with reasonable performance to other known dynamic approaches and the possibility for practical implementation at scale for new applications entering the Android market

    Explainable Malware Detection System Using Transformers-Based Transfer Learning and Multi-Model Visual Representation

    Get PDF
    Android has become the leading mobile ecosystem because of its accessibility and adaptability. It has also become the primary target of widespread malicious apps. This situation needs the immediate implementation of an effective malware detection system. In this study, an explainable malware detection system was proposed using transfer learning and malware visual features. For effective malware detection, our technique leverages both textual and visual features. First, a pre-trained model called the Bidirectional Encoder Representations from Transformers (BERT) model was designed to extract the trained textual features. Second, the malware-to-image conversion algorithm was proposed to transform the network byte streams into a visual representation. In addition, the FAST (Features from Accelerated Segment Test) extractor and BRIEF (Binary Robust Independent Elementary Features) descriptor were used to efficiently extract and mark important features. Third, the trained and texture features were combined and balanced using the Synthetic Minority Over-Sampling (SMOTE) method; then, the CNN network was used to mine the deep features. The balanced features were then input into the ensemble model for efficient malware classification and detection. The proposed method was analyzed extensively using two public datasets, CICMalDroid 2020 and CIC-InvesAndMal2019. To explain and validate the proposed methodology, an interpretable artificial intelligence (AI) experiment was conducted

    An Efficient Multistage Fusion Approach for Smartphone Security Analysis

    Get PDF
    Android smartphone ecosystem is inundated with innumerable applications mainly developed by third party contenders leading to high vulnerability of these devices. In addition, proliferation of smartphone usage along with their potential applications in diverse field entice malware community to develop new malwares to attack these devices. In order to overcome these issues, an android malware detection framework is proposed wherein an efficient multistage fusion approach is introduced. For this, a robust unified feature vector is created by fusion of transformed feature matrices corresponding to multi-cue using non-linear graph based cross-diffusion. Unified feature is further subjected to multiple classifiers to obtain their classification scores. Classifier scores are further optimally fused employing Dezert-Smarandache Theory (DSmT). Strength of suggested model is assessed both qualitatively and quantitatively by ten-fold cross-validation on the benchmarked datasets. On an average of outcome, we achieved detection accuracy of 98.97% and F-measure of 0.9936.&nbsp

    Android Malware Family Classification and Analysis: Current Status and Future Directions

    Get PDF
    Android receives major attention from security practitioners and researchers due to the influx number of malicious applications. For the past twelve years, Android malicious applications have been grouped into families. In the research community, detecting new malware families is a challenge. As we investigate, most of the literature reviews focus on surveying malware detection. Characterizing the malware families can improve the detection process and understand the malware patterns. For this reason, we conduct a comprehensive survey on the state-of-the-art Android malware familial detection, identification, and categorization techniques. We categorize the literature based on three dimensions: type of analysis, features, and methodologies and techniques. Furthermore, we report the datasets that are commonly used. Finally, we highlight the limitations that we identify in the literature, challenges, and future research directions regarding the Android malware family.https://doi.org/10.3390/electronics906094

    A Framework for Hybrid Intrusion Detection Systems

    Get PDF
    Web application security is a definite threat to the world’s information technology infrastructure. The Open Web Application Security Project (OWASP), generally defines web application security violations as unauthorized or unintentional exposure, disclosure, or loss of personal information. These breaches occur without the company’s knowledge and it often takes a while before the web application attack is revealed to the public, specifically because the security violations are fixed. Due to the need to protect their reputation, organizations have begun researching solutions to these problems. The most widely accepted solution is the use of an Intrusion Detection System (IDS). Such systems currently rely on either signatures of the attack used for the data breach or changes in the behavior patterns of the system to identify an intruder. These systems, either signature-based or anomaly-based, are readily understood by attackers. Issues arise when attacks are not noticed by an existing IDS because the attack does not fit the pre-defined attack signatures the IDS is implemented to discover. Despite current IDSs capabilities, little research has identified a method to detect all potential attacks on a system. This thesis intends to address this problem. A particular emphasis will be placed on detecting advanced attacks, such as those that take place at the application layer. These types of attacks are able to bypass existing IDSs, increase the potential for a web application security breach to occur and not be detected. In particular, the attacks under study are all web application layer attacks. Those included in this thesis are SQL injection, cross-site scripting, directory traversal and remote file inclusion. This work identifies common and existing data breach detection methods as well as the necessary improvements for IDS models. Ultimately, the proposed approach combines an anomaly detection technique measured by cross entropy and a signature-based attack detection framework utilizing genetic algorithm. The proposed hybrid model for data breach detection benefits organizations by increasing security measures and allowing attacks to be identified in less time and more efficiently
    corecore