Search CORE

341 research outputs found

Machine Learning Aided Static Malware Analysis: A Survey and Tutorial

Author: Andrii Shalaginov
D Krishna Sandeep Reddy
Farid Daryabar
Igor Santos
Reinaldo Jose Mangialardo
Smita Naval
Steve Watson
Teuvo Kohonen
Yanfang Ye
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/08/2018
Field of study

Malware analysis and detection techniques have been evolving during the last decade as a reflection to development of different malware techniques to evade network-based and host-based security protections. The fast growth in variety and number of malware species made it very difficult for forensics investigators to provide an on time response. Therefore, Machine Learning (ML) aided malware analysis became a necessity to automate different aspects of static and dynamic malware investigation. We believe that machine learning aided static analysis can be used as a methodological approach in technical Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware analysis that has been thoroughly studied before. In this paper, we address this research gap by conducting an in-depth survey of different machine learning methods for classification of static characteristics of 32-bit malicious Portable Executable (PE32) Windows files and develop taxonomy for better understanding of these techniques. Afterwards, we offer a tutorial on how different machine learning techniques can be utilized in extraction and analysis of a variety of static characteristic of PE binaries and evaluate accuracy and practical generalization of these techniques. Finally, the results of experimental study of all the method using common data was given to demonstrate the accuracy and complexity. This paper may serve as a stepping stone for future researchers in cross-disciplinary field of machine learning aided malware forensics.Comment: 37 Page

arXiv.org e-Print Archive

Crossref

Op2Vec: An Opcode Embedding Technique and Dataset Design for End-to-End Detection of Android Malware

Author: Ali Sikandar
Ghani Anwar
Khan Kaleem Nawaz
Khan Muhammad Salman
Nauman Mohammad
Ullah Najeeb
Publication venue: 'Hindawi Limited'
Publication date: 01/03/2022
Field of study

Android is one of the leading operating systems for smart phones in terms of market share and usage. Unfortunately, it is also an appealing target for attackers to compromise its security through malicious applications. To tackle this issue, domain experts and researchers are trying different techniques to stop such attacks. All the attempts of securing Android platform are somewhat successful. However, existing detection techniques have severe shortcomings, including the cumbersome process of feature engineering. Designing representative features require expert domain knowledge. There is a need for minimizing human experts' intervention by circumventing handcrafted feature engineering. Deep learning could be exploited by extracting deep features automatically. Previous work has shown that operational codes (opcodes) of executables provide key information to be used with deep learning models for detection process of malicious applications. The only challenge is to feed opcodes information to deep learning models. Existing techniques use one-hot encoding to tackle the challenge. However, the one-hot encoding scheme has severe limitations. In this paper, we introduce; (1) a novel technique for opcodes embedding, which we name Op2Vec, (2) based on the learned Op2Vec we have developed a dataset for end-to-end detection of android malware. Introducing the end-to-end Android malware detection technique avoids expert-intensive handcrafted features extraction, and ensures automation. Some of the recent deep learning-based techniques showed significantly improved results when tested with the proposed approach and achieved an average detection accuracy of 97.47%, precision of 0.976 and F1 score of 0.979

arXiv.org e-Print Archive

Mobile Malware Behaviour through Opcode Analysis

Author: Anuar Noor Azleen
Bahamanan Nazrulazhar
Mas’ud Mohd Zaki
Mat Ariff Nor Azman
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 16/04/2022
Field of study

As the popularity of mobile devices are on the rise, millions of users are now exposed to mobile malware threats. Malware is known for its ability in causing damage to mobile devices. Attackers often use it as a way to use the resources available and for other cybercriminal benefits such stealing users’ data, credentials and credit card number. Various detection techniques have been introduced in mitigating mobile malware, yet the malware author has its own method to overcome the detection method. This paper presents mobile malware analysis approaches through opcode analysis. Opcode analysis on mobile malware reveals the behaviour of malicious application in the binary level. The comparison made between the numbers of opcode occurrence from a malicious application and benign shows a significance traits. These differences can be used in classifying the malicious and benign mobile application

International Journal of Communication Networks and Information Security (IJCNIS)

Effectiveness of Opcode ngrams for Detection of Multi Family Android Malware

Author: Canfora Gerardo
De Lorenzo Andrea
Medvet Eric
Mercaldo Francesco
Visaggio Corrado Aaron
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

With the wide diffusion of smartphones and their usage in a plethora of processes and activities, these devices have been handling an increasing variety of sensitive resources. Attackers are hence producing a large number of malware applications for Android (the most spread mobile platform), often by slightly modifying existing applications, which results in malware being organized in families. Some works in the literature showed that opcodes are informative for detecting malware, not only in the Android platform. In this paper, we investigate if frequencies of ngrams of opcodes are effective in detecting Android malware and if there is some significant malware family for which they are more or less effective. To this end, we designed a method based on state-of-the-art classifiers applied to frequencies of opcodes ngrams. Then, we experimentally evaluated it on a recent dataset composed of 11120 applications, 5560 of which are malware belonging to several different families. Results show that an accuracy of 97% can be obtained on the average, whereas perfect detection rate is achieved for more than one malware family

Archivio istituzionale della ricerca - Università di Trieste

Crossref

Exploring the Usage of Topic Modeling for Android Malware Static Analysis

Author: Medvet Eric
Mercaldo Francesco
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

The rapid growth in smartphone and tablet usage over the last years has led to the inevitable rise in targeting of these devices by cyber-criminals. The exponential growth of Android devices, and the buoyant and largely unregulated Android app market, produced a sharp rise in malware targeting that platform. Furthermore, malware writers have been developing detection-evasion techniques which rapidly make anti-malware technologies ineffective. It is hence advisable that security expert are provided with tools which can aid them in the analysis of existing and new Android malware. In this paper, we explore the use of topic modeling as a technique which can assist experts to analyse malware applications in order to discover their characteristic. We apply Latend Dirichlet Allocation (LDA) to mobile applications represented as opcode sequences, hence considering a topic as a discrete distribution of opcode. Our experiments on a dataset of 900 malware applications of different families show that the information provided by topic modeling may help in better understanding malware characteristics and similarities

Archivio istituzionale della ricerca - Università di Trieste

Università degli Studi del Molise: IRIS

Crossref

Malicious code detection in android : the role of sequence characteristics and disassembling methods

Author: Acartürk Cengiz
Balikcioglu Pinar G.
Kucuk Ozge A.
Sirlanci Melih
Turkmen Ramazan K.
Ulukapi Bulut
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2023
Field of study

The acceptance and widespread use of the Android operating system drew the attention of both legitimate developers and malware authors, which resulted in a significant number of benign and malicious applications available on various online markets. Since the signature-based methods fall short for detecting malicious software effectively considering the vast number of applications, machine learning techniques in this field have also become widespread. In this context, stating the acquired accuracy values in the contingency tables in malware detection studies has become a popular and efficient method and enabled researchers to evaluate their methodologies comparatively. In this study, we wanted to investigate and emphasize the factors that may affect the accuracy values of the models managed by researchers, particularly the disassembly method and the input data characteristics. Firstly, we developed a model that tackles the malware detection problem from a Natural Language Processing (NLP) perspective using Long Short-Term Memory (LSTM). Then, we experimented with different base units (instruction, basic block, method, and class) and representations of source code obtained from three commonly used disassembling tools (JEB, IDA, and Apktool) and examined the results. Our findings exhibit that the disassembly method and different input representations affect the model results. More specifically, the datasets collected by the Apktool achieved better results compared to the other two disassemblers

Jagiellonian Univeristy Repository