2,116 research outputs found
R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections
The influence of Deep Learning on image identification and natural language
processing has attracted enormous attention globally. The convolution neural
network that can learn without prior extraction of features fits well in
response to the rapid iteration of Android malware. The traditional solution
for detecting Android malware requires continuous learning through
pre-extracted features to maintain high performance of identifying the malware.
In order to reduce the manpower of feature engineering prior to the condition
of not to extract pre-selected features, we have developed a coloR-inspired
convolutional neuRal networks (CNN)-based AndroiD malware Detection (R2-D2)
system. The system can convert the bytecode of classes.dex from Android archive
file to rgb color code and store it as a color image with fixed size. The color
image is input to the convolutional neural network for automatic feature
extraction and training. The data was collected from Jan. 2017 to Aug 2017.
During the period of time, we have collected approximately 2 million of benign
and malicious Android apps for our experiments with the help from our research
partner Leopard Mobile Inc. Our experiment results demonstrate that the
proposed system has accurate security analysis on contracts. Furthermore, we
keep our research results and experiment materials on http://R2D2.TWMAN.ORG.Comment: Verison 2018/11/15, IEEE BigData 2018, Seattle, WA, USA, Dec 10-13,
2018. (Accepted
Dynamic Analysis of Executables to Detect and Characterize Malware
It is needed to ensure the integrity of systems that process sensitive
information and control many aspects of everyday life. We examine the use of
machine learning algorithms to detect malware using the system calls generated
by executables-alleviating attempts at obfuscation as the behavior is monitored
rather than the bytes of an executable. We examine several machine learning
techniques for detecting malware including random forests, deep learning
techniques, and liquid state machines. The experiments examine the effects of
concept drift on each algorithm to understand how well the algorithms
generalize to novel malware samples by testing them on data that was collected
after the training data. The results suggest that each of the examined machine
learning algorithms is a viable solution to detect malware-achieving between
90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the
performance evaluation on an operational network may not match the performance
achieved in training. Namely, the CAA may be about the same, but the values for
precision and recall over the malware can change significantly. We structure
experiments to highlight these caveats and offer insights into expected
performance in operational environments. In addition, we use the induced models
to gain a better understanding about what differentiates the malware samples
from the goodware, which can further be used as a forensics tool to understand
what the malware (or goodware) was doing to provide directions for
investigation and remediation.Comment: 9 pages, 6 Tables, 4 Figure
From Malware Samples to Fractal Images: A New Paradigm for Classification. (Version 2.0, Previous version paper name: Have you ever seen malware?)
To date, a large number of research papers have been written on the
classification of malware, its identification, classification into different
families and the distinction between malware and goodware. These works have
been based on captured malware samples and have attempted to analyse malware
and goodware using various techniques, including techniques from the field of
artificial intelligence. For example, neural networks have played a significant
role in these classification methods. Some of this work also deals with
analysing malware using its visualisation. These works usually convert malware
samples capturing the structure of malware into image structures, which are
then the object of image processing. In this paper, we propose a very
unconventional and novel approach to malware visualisation based on dynamic
behaviour analysis, with the idea that the images, which are visually very
interesting, are then used to classify malware concerning goodware. Our
approach opens an extensive topic for future discussion and provides many new
directions for research in malware analysis and classification, as discussed in
conclusion. The results of the presented experiments are based on a database of
6 589 997 goodware, 827 853 potentially unwanted applications and 4 174 203
malware samples provided by ESET and selected experimental data (images,
generating polynomial formulas and software generating images) are available on
GitHub for interested readers. Thus, this paper is not a comprehensive compact
study that reports the results obtained from comparative experiments but rather
attempts to show a new direction in the field of visualisation with possible
applications in malware analysis.Comment: This paper is under review; the section describing conversion from
malware structure to fractal figure is temporarily erased here to protect our
idea. It will be replaced by a full version when accepte
GRASE: Granulometry Analysis with Semi Eager Classifier to Detect Malware
Technological advancement in communication leading to 5G, motivates everyone to get connected to the internet including ‘Devices’, a technology named Web of Things (WoT). The community benefits from this large-scale network which allows monitoring and controlling of physical devices. But many times, it costs the security as MALicious softWARE (MalWare) developers try to invade the network, as for them, these devices are like a ‘backdoor’ providing them easy ‘entry’. To stop invaders from entering the network, identifying malware and its variants is of great significance for cyberspace. Traditional methods of malware detection like static and dynamic ones, detect the malware but lack against new techniques used by malware developers like obfuscation, polymorphism and encryption. A machine learning approach to detect malware, where the classifier is trained with handcrafted features, is not potent against these techniques and asks for efforts to put in for the feature engineering. The paper proposes a malware classification using a visualization methodology wherein the disassembled malware code is transformed into grey images. It presents the efficacy of Granulometry texture analysis technique for improving malware classification. Furthermore, a Semi Eager (SemiE) classifier, which is a combination of eager learning and lazy learning technique, is used to get robust classification of malware families. The outcome of the experiment is promising since the proposed technique requires less training time to learn the semantics of higher-level malicious behaviours. Identifying the malware (testing phase) is also done faster. A benchmark database like malimg and Microsoft Malware Classification challenge (BIG-2015) has been utilized to analyse the performance of the system. An overall average classification accuracy of 99.03 and 99.11% is achieved, respectively
MDFRCNN: Malware Detection using Faster Region Proposals Convolution Neural Network
Technological advancement of smart devices has opened up a new trend: Internet of Everything (IoE), where all devices are connected to the web. Large scale networking benefits the community by increasing connectivity and giving control of physical devices. On the other hand, there exists an increased ‘Threat’ of an ‘Attack’. Attackers are targeting these devices, as it may provide an easier ‘backdoor entry to the users’ network’.MALicious softWARE (MalWare) is a major threat to user security. Fast and accurate detection of malware attacks are the sine qua non of IoE, where large scale networking is involved. The paper proposes use of a visualization technique where the disassembled malware code is converted into gray images, as well as use of Image Similarity based Statistical Parameters (ISSP) such as Normalized Cross correlation (NCC), Average difference (AD), Maximum difference (MaxD), Singular Structural Similarity Index Module (SSIM), Laplacian Mean Square Error (LMSE), MSE and PSNR. A vector consisting of gray image with statistical parameters is trained using a Faster Region proposals Convolution Neural Network (F-RCNN) classifier. The experiment results are promising as the proposed method includes ISSP with F-RCNN training. Overall training time of learning the semantics of higher-level malicious behaviors is less. Identification of malware (testing phase) is also performed in less time. The fusion of image and statistical parameter enhances system performance with greater accuracy. The benchmark database from Microsoft Malware Classification challenge has been used to analyze system performance, which is available on the Kaggle website. An overall average classification accuracy of 98.12% is achieved by the proposed method
Classification and Analysis of Android Malware Images Using Feature Fusion Technique
The super packed functionalities and artificial intelligence (AI)-powered applications have made the Android operating system a big player in the market. Android smartphones have become an integral part of life and users are reliant on their smart devices for making calls, sending text messages, navigation, games, and financial transactions to name a few. This evolution of the smartphone community has opened new horizons for malware developers. As malware variants are growing at a tremendous rate every year, there is an urgent need to combat against stealth malware techniques. This paper proposes a visualization and machine learning-based framework for classifying Android malware. Android malware applications from the DREBIN dataset were converted into grayscale images. In the first phase of the experiment, the proposed framework transforms Android malware into fifteen different image sections and identifies malware files by exploiting handcrafted features associated with Android malware images. The algorithms such as Gray Level Co-occurrence Matrix-based (GLCM), Global Image deScripTors (GIST), and Local Binary Pattern (LBP) are used to extract the handcrafted features from the image sections. The extracted features were further classified using machine learning algorithms like K-Nearest Neighbors, Support Vector Machines, and Random Forests. In the second phase of the experiment, handcrafted features were fused with CNN features to form the feature fusion strategy. The classification performance was evaluated against every malware image file section. The results obtained using the Feature Fusion strategy are compared with handcrafted features results. The experiment results conclude to the fact that Feature Fusion-SVM model is most suited for the identification and classification of Android malware using the certificate and Android Manifest (CR + AM) malware images. It attained an high accuracy of 93.24%
- …