239 research outputs found

N-opcode Analysis for Android Malware Classification and Categorization

Author: Kang B.
McLaughlin K.
Sezer Sakir
Yerima Suleiman
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link. Malware detection is a growing problem, particularly on the Android mobile platform, due to its increasing popularity and the accessibility of numerous third-party app markets. The problem has been made worse by the increasingly sophisticated detection-avoidance techniques employed by emerging malware families. This calls for more effective techniques for detecting and classifying Android malware. Hence, in this paper we present an n-opcode analysis based approach that utilizes machine learning to classify and categorize Android malware. This approach enables automated feature discovery, which eliminates the need for expert or domain knowledge to define the needed features. Our experiments on 2520 samples, performed using up to 10-gram opcode features, showed that an f-measure of 98% is achievable using this approach.
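As a rough illustration of what n-opcode features look like, the sketch below counts n-grams over a sequence of opcodes. It is not the authors' implementation: the function name `opcode_ngrams` and the sample opcode sequence are assumptions made for this example, and a real pipeline would first disassemble the app to obtain its opcodes.

```python
from collections import Counter


def opcode_ngrams(opcodes, n):
    """Count every contiguous run of n opcodes in the sequence.

    Hypothetical helper for illustration; returns an empty Counter
    when the sequence is shorter than n.
    """
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


# Assumed toy opcode sequence (Dalvik-style mnemonics), not taken from the paper.
sample = [
    "const/4", "invoke-virtual", "move-result",
    "const/4", "invoke-virtual", "return-void",
]

bigrams = opcode_ngrams(sample, 2)
for gram, count in bigrams.most_common():
    print(gram, count)
```

The resulting counts (or their presence/absence) can then be used as a feature vector for any standard classifier, which is what makes the feature discovery automatic.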

arXiv.org e-Print Archive

De Montfort University Open Research Archive

Machine Learning Aided Static Malware Analysis: A Survey and Tutorial

Author: Andrii Shalaginov
D Krishna Sandeep Reddy
Farid Daryabar
Igor Santos
Reinaldo Jose Mangialardo
Smita Naval
Steve Watson
Teuvo Kohonen
Yanfang Ye
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/08/2018

Malware analysis and detection techniques have been evolving during the last decade in response to the development of different malware techniques for evading network-based and host-based security protections. The fast growth in the variety and number of malware species has made it very difficult for forensics investigators to provide a timely response. Therefore, Machine Learning (ML) aided malware analysis has become a necessity to automate different aspects of static and dynamic malware investigation. We believe that machine learning aided static analysis can be used as a methodological approach in technical Cyber Threat Intelligence (CTI), rather than resource-consuming dynamic malware analysis, which has been thoroughly studied before. In this paper, we address this research gap by conducting an in-depth survey of different machine learning methods for classification of static characteristics of 32-bit malicious Portable Executable (PE32) Windows files and develop a taxonomy for better understanding of these techniques. Afterwards, we offer a tutorial on how different machine learning techniques can be utilized in the extraction and analysis of a variety of static characteristics of PE binaries, and evaluate the accuracy and practical generalization of these techniques. Finally, the results of an experimental study of all the methods using common data are given to demonstrate their accuracy and complexity. This paper may serve as a stepping stone for future researchers in the cross-disciplinary field of machine learning aided malware forensics. Comment: 37 pages

arXiv.org e-Print Archive

Op2Vec: An Opcode Embedding Technique and Dataset Design for End-to-End Detection of Android Malware

Author: Ali Sikandar
Ghani Anwar
Khan Kaleem Nawaz
Khan Muhammad Salman
Nauman Mohammad
Ullah Najeeb
Publication venue: 'Hindawi Limited'
Publication date: 01/03/2022

Android is one of the leading operating systems for smartphones in terms of market share and usage. Unfortunately, it is also an appealing target for attackers, who compromise its security through malicious applications. To tackle this issue, domain experts and researchers are trying different techniques to stop such attacks. Attempts to secure the Android platform have been somewhat successful. However, existing detection techniques have severe shortcomings, including the cumbersome process of feature engineering. Designing representative features requires expert domain knowledge. There is a need to minimize human experts' intervention by circumventing handcrafted feature engineering. Deep learning could be exploited to extract deep features automatically. Previous work has shown that the operational codes (opcodes) of executables provide key information that can be used with deep learning models to detect malicious applications. The only challenge is to feed opcode information to deep learning models. Existing techniques use one-hot encoding to tackle this challenge. However, the one-hot encoding scheme has severe limitations. In this paper, we introduce (1) a novel technique for opcode embedding, which we name Op2Vec, and (2) a dataset, developed from the learned Op2Vec, for end-to-end detection of Android malware. Introducing the end-to-end Android malware detection technique avoids expert-intensive handcrafted feature extraction and ensures automation. Some of the recent deep learning-based techniques showed significantly improved results when tested with the proposed approach and achieved an average detection accuracy of 97.47%, precision of 0.976 and F1 score of 0.979.
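To make the contrast between one-hot encoding and a dense opcode embedding concrete, here is a minimal sketch. It does not reproduce Op2Vec: the embedding table below is filled with random placeholder values rather than learned ones, and the helper names (`build_vocab`, `one_hot`, `make_embedding_table`, `embed`) and the toy opcode list are assumptions for illustration only.

```python
import random


def build_vocab(opcodes):
    """Map each distinct opcode to an integer index (sorted for determinism)."""
    return {op: i for i, op in enumerate(sorted(set(opcodes)))}


def one_hot(op, vocab):
    """Sparse encoding: vector length equals the vocabulary size."""
    vec = [0] * len(vocab)
    vec[vocab[op]] = 1
    return vec


def make_embedding_table(vocab, dim, seed=0):
    """Placeholder table with random values; a real embedding would be learned."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in vocab]


def embed(op, vocab, table):
    """Dense encoding: vector length is the chosen dimension, not the vocabulary size."""
    return table[vocab[op]]


# Assumed toy opcode sequence, not taken from the paper.
sequence = ["const/4", "invoke-virtual", "move-result", "return-void"]
vocab = build_vocab(sequence)
table = make_embedding_table(vocab, dim=3)

print(one_hot("move-result", vocab))
print(embed("move-result", vocab, table))
```

With a realistic opcode vocabulary the one-hot vector grows with the number of distinct opcodes and carries no notion of similarity between them, whereas the dense vector keeps a fixed size chosen by the modeller.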

arXiv.org e-Print Archive

Android Malware Clustering through Malicious Payload Mining

Author: I Santos
J Crussell
J Kim
J Leskovec
K Rieck
M Sebastián
S Hanna
U Bayer
Publication date: 15/07/2017

Clustering has been well studied for desktop malware analysis as an effective triage method. Conventional similarity-based clustering techniques, however, cannot be immediately applied to Android malware analysis due to the excessive use of third-party libraries in Android application development and the widespread use of repackaging in malware development. We design and implement an Android malware clustering system through iterative mining of malicious payloads and checking whether malware samples share the same version of a malicious payload. Our system utilizes a hierarchical clustering technique and an efficient bit-vector format to represent Android apps. Experimental results demonstrate that our clustering approach achieves precision of 0.90 and recall of 0.75 on the Android Genome malware dataset, and average precision of 0.98 and recall of 0.96 with respect to manually verified ground truth. Comment: Proceedings of the 20th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2017)
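The idea of representing apps as bit-vectors and comparing them can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which features the bits encode or which similarity measure is used, so the integer feature ids, the Jaccard similarity, and the helper names here are all hypothetical.

```python
def to_bitvector(feature_ids):
    """Pack a collection of non-negative feature ids into one Python int."""
    bits = 0
    for fid in feature_ids:
        bits |= 1 << fid
    return bits


def popcount(bits):
    """Number of set bits, i.e. number of features present."""
    return bin(bits).count("1")


def jaccard(a, b):
    """Jaccard similarity of two bit-vectors; two empty vectors count as identical."""
    union = popcount(a | b)
    if union == 0:
        return 1.0
    return popcount(a & b) / union


def shared_features(a, b):
    """Bitwise AND keeps only the features present in both apps."""
    return a & b


# Assumed toy feature sets for two apps.
app_a = to_bitvector([0, 1, 2, 5])
app_b = to_bitvector([1, 2, 5, 7])

print(jaccard(app_a, app_b))
print(bin(shared_features(app_a, app_b)))
```

Because intersection and union reduce to single bitwise operations, pairwise comparison stays cheap even when many apps are compared, which is the usual motivation for a bit-vector format in clustering.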

arXiv.org e-Print Archive

Malware Detection Approaches based on Operational Codes (OpCodes) of Executable Programs: A Review

Author: Saleh Mohammed A.
Publication venue: IAES Indonesia Section
Publication date: 30/06/2023

Malicious software, or malware for short, poses a threat to computer systems and needs to be analyzed, detected, and eliminated. Generally, malware is analyzed in two ways: dynamic malware analysis and static malware analysis. The former collects a feature dataset while the malware is running, and involves features based on malware APIs, registry activities, file activities, process activities, and network activities. The latter collects a feature dataset before, and without, running the malware, and involves Operational Code (OpCode) and text-based (Bytecode) features. Several previous researchers have reviewed malware detection approaches from various aspects, but none has reviewed approaches based solely on malware OpCodes. Therefore, this paper aims to review malware detection approaches based on OpCodes. The review explores, demonstrates, and compares the existing approaches for detecting malware according to their OpCodes only, and finally presents a comprehensive comparative view of them.

Robust Malware Detection for Internet Of (Battlefield) Things Devices Using Deep Eigenspace Learning

Author: Azmoodeh A.
Choo K.-K.R.
Dehghantanha A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

The Internet of Things (IoT) in a military setting generally consists of a diverse range of Internet-connected devices and nodes (e.g. medical devices to wearable combat uniforms), which are a valuable target for cyber criminals, particularly state-sponsored or nation-state actors. A common attack vector is the use of malware. In this paper, we present a deep learning based method to detect Internet of Battlefield Things (IoBT) malware via the device's Operational Code (OpCode) sequence. We transmute OpCodes into a vector space and apply a deep Eigenspace learning approach to classify malicious and benign applications. We also demonstrate the robustness of our proposed approach in malware detection and its sustainability against junk code insertion attacks. Lastly, we make available our malware sample on GitHub, which hopefully will benefit future research efforts (e.g. for evaluation of proposed malware detection approaches).

SaaS: A situational awareness and analysis system for massive android malware detection

Author: Ren Wei
Ren Yi
Zhang Yaocheng
Zhu Tianqing
Publication venue: 'Elsevier BV'
Publication date: 01/06/2019

A large number of mobile applications (Apps) are uploaded, distributed and updated every day in various Android markets, e.g., Google Play and Huawei AppGallery. One of the ongoing challenges in the daily security management of Android App markets is to detect malicious Apps (also known as malware) among those massive numbers of newcomers accurately and efficiently. Customers rely on the detection results when selecting Apps to download, and undetected malware may result in great damage. In this paper, we propose a cloud-based malware detection system called SaaS by leveraging and marrying multiple approaches from diverse domains such as natural language processing (n-gram), image processing (GLCM), cryptography (fuzzy hash), machine learning (random forest) and complex networks. We first extract n-gram features and GLCM features from an App's smali code and DEX file, respectively. We next feed those features into a training data set to create a machine learning detection model. The model is further enhanced by a fuzzy hash to detect whether the inspected App is repackaged. Extensive experiments (involving 1495 samples) demonstrate that the detection accuracy is more than 98.5% and that the system supports large-scale detection and monitoring. In addition, our proposed system can be deployed as a service in clouds, and customers can access the cloud services on demand.
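For readers unfamiliar with GLCM (gray-level co-occurrence matrix) features, the sketch below computes one from raw bytes treated as a small grayscale image. It is a generic illustration, not the SaaS system's code: the image width, the number of gray levels, the pixel offset, and the helper names `bytes_to_image` and `glcm` are assumptions, and how the paper maps a DEX file to an image is not stated in the abstract.

```python
def bytes_to_image(data, width, levels):
    """Quantize bytes (0..255) to `levels` gray levels and split into rows."""
    quantized = [b * levels // 256 for b in data]
    return [quantized[i:i + width] for i in range(0, len(quantized), width)]


def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows = len(image)
    for y in range(rows):
        for x in range(len(image[y])):
            ny, nx = y + dy, x + dx
            # Skip neighbours that fall outside the image.
            if 0 <= ny < rows and 0 <= nx < len(image[ny]):
                matrix[image[y][x]][image[ny][nx]] += 1
    return matrix


# Assumed toy input standing in for file contents.
image = bytes_to_image(bytes([0, 10, 100, 120, 200, 250]), width=3, levels=3)
cooccurrence = glcm(image, levels=3)
for row in cooccurrence:
    print(row)
```

Texture statistics such as contrast or energy are typically derived from this matrix and then passed to the classifier alongside other features.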

OPUS - University of Technology Sydney