2,116 research outputs found

    R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections

    The influence of Deep Learning on image identification and natural language processing has attracted enormous attention globally. The convolutional neural network, which can learn without prior extraction of features, fits well in response to the rapid iteration of Android malware. The traditional solution for detecting Android malware requires continuous learning through pre-extracted features to maintain high performance in identifying the malware. To reduce the manual effort of feature engineering by avoiding the extraction of pre-selected features, we have developed a coloR-inspired convolutional neuRal networks (CNN)-based AndroiD malware Detection (R2-D2) system. The system can convert the bytecode of classes.dex from an Android archive file to RGB color code and store it as a color image with fixed size. The color image is input to the convolutional neural network for automatic feature extraction and training. The data was collected from Jan. 2017 to Aug. 2017. During this period, we collected approximately 2 million benign and malicious Android apps for our experiments with help from our research partner Leopard Mobile Inc. Our experiment results demonstrate that the proposed system has accurate security analysis on contracts. Furthermore, we keep our research results and experiment materials on http://R2D2.TWMAN.ORG. Comment: Version 2018/11/15, IEEE BigData 2018, Seattle, WA, USA, Dec 10-13, 2018. (Accepted
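The bytecode-to-image step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the grouping of three consecutive bytes into one (R, G, B) pixel, and the truncate-or-zero-pad rule used to reach a fixed size are all assumptions made for illustration.

```python
import zipfile


def read_classes_dex(apk_path: str) -> bytes:
    """Read classes.dex from an APK (an APK is a ZIP archive). Hypothetical helper."""
    with zipfile.ZipFile(apk_path) as apk:
        return apk.read("classes.dex")


def bytes_to_rgb_image(data: bytes, side: int = 4):
    """Map raw bytes to a side x side grid of (R, G, B) pixels.

    Assumption: three consecutive bytes form one pixel; long inputs are
    truncated and short inputs are zero-padded so the size is fixed.
    """
    needed = side * side * 3
    buf = data[:needed].ljust(needed, b"\x00")
    pixels = [tuple(buf[i:i + 3]) for i in range(0, needed, 3)]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Usage on synthetic bytes instead of a real classes.dex.
image = bytes_to_rgb_image(bytes(range(10)), side=2)
print(image)
```

The resulting grid could then be saved as an image file and fed to a CNN; the image size and the network used in the paper are not specified here.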

    Dynamic Analysis of Executables to Detect and Characterize Malware

    It is necessary to ensure the integrity of systems that process sensitive information and control many aspects of everyday life. We examine the use of machine learning algorithms to detect malware using the system calls generated by executables, which alleviates attempts at obfuscation because the behavior is monitored rather than the bytes of an executable. We examine several machine learning techniques for detecting malware, including random forests, deep learning techniques, and liquid state machines. The experiments examine the effects of concept drift on each algorithm to understand how well the algorithms generalize to novel malware samples by testing them on data that was collected after the training data. The results suggest that each of the examined machine learning algorithms is a viable solution for detecting malware, achieving between 90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the performance on an operational network may not match the performance achieved in training. Namely, the CAA may be about the same, but the values for precision and recall over the malware can change significantly. We structure experiments to highlight these caveats and offer insights into expected performance in operational environments. In addition, we use the induced models to gain a better understanding of what differentiates the malware samples from the goodware, which can further be used as a forensics tool to understand what the malware (or goodware) was doing and to provide directions for investigation and remediation. Comment: 9 pages, 6 Tables, 4 Figure
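The caveat about CAA staying stable while precision shifts can be made concrete with a small sketch. It assumes that class-averaged accuracy means the unweighted mean of per-class recall, which is a common reading but is not confirmed by the abstract; the confusion-matrix counts below are invented for illustration.

```python
def caa_and_precision(tp: int, fn: int, tn: int, fp: int):
    """Return (class-averaged accuracy, malware precision) from confusion counts.

    Assumption: CAA is the unweighted mean of the recall on each class.
    """
    recall_malware = tp / (tp + fn)
    recall_goodware = tn / (tn + fp)
    caa = (recall_malware + recall_goodware) / 2
    precision = tp / (tp + fp)
    return caa, precision


# Balanced evaluation set: 100 malware, 100 goodware, 90% recall on each class.
balanced = caa_and_precision(tp=90, fn=10, tn=90, fp=10)

# Hypothetical operational mix: 10 malware, 1000 goodware, same per-class recall.
operational = caa_and_precision(tp=9, fn=1, tn=900, fp=100)

print(balanced, operational)
```

With identical per-class recall, the CAA is the same in both settings, but malware precision drops sharply once goodware dominates, because the false positives then outnumber the true positives.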

    From Malware Samples to Fractal Images: A New Paradigm for Classification. (Version 2.0, Previous version paper name: Have you ever seen malware?)

    To date, a large number of research papers have been written on malware: its identification, its classification into different families, and the distinction between malware and goodware. These works have been based on captured malware samples and have attempted to analyse malware and goodware using various techniques, including techniques from the field of artificial intelligence. For example, neural networks have played a significant role in these classification methods. Some of this work also deals with analysing malware using its visualisation. These works usually convert malware samples, capturing the structure of the malware, into image structures, which are then the object of image processing. In this paper, we propose a very unconventional and novel approach to malware visualisation based on dynamic behaviour analysis, with the idea that the images, which are visually very interesting, are then used to distinguish malware from goodware. Our approach opens an extensive topic for future discussion and provides many new directions for research in malware analysis and classification, as discussed in the conclusion. The results of the presented experiments are based on a database of 6 589 997 goodware, 827 853 potentially unwanted applications and 4 174 203 malware samples provided by ESET; selected experimental data (images, generating polynomial formulas and software generating images) are available on GitHub for interested readers. Thus, this paper is not a comprehensive compact study that reports the results obtained from comparative experiments but rather attempts to show a new direction in the field of visualisation with possible applications in malware analysis. Comment: This paper is under review; the section describing conversion from malware structure to fractal figure is temporarily erased here to protect our idea. It will be replaced by a full version when accepte

    GRASE: Granulometry Analysis with Semi Eager Classifier to Detect Malware

    Technological advancement in communication, leading to 5G, motivates everyone to get connected to the internet, including ‘Devices’, a technology named Web of Things (WoT). The community benefits from this large-scale network, which allows monitoring and controlling of physical devices. But this often comes at the cost of security, as MALicious softWARE (MalWare) developers try to invade the network; for them, these devices are like a ‘backdoor’ providing easy ‘entry’. To stop invaders from entering the network, identifying malware and its variants is of great significance for cyberspace. Traditional methods of malware detection, such as static and dynamic analysis, detect malware but are weak against newer techniques used by malware developers, such as obfuscation, polymorphism and encryption. A machine learning approach to detecting malware, where the classifier is trained with handcrafted features, is not potent against these techniques and requires considerable feature-engineering effort. The paper proposes a malware classification using a visualization methodology wherein the disassembled malware code is transformed into grey images. It presents the efficacy of the Granulometry texture analysis technique for improving malware classification. Furthermore, a Semi Eager (SemiE) classifier, which is a combination of eager learning and lazy learning techniques, is used to get robust classification of malware families. The outcome of the experiment is promising since the proposed technique requires less training time to learn the semantics of higher-level malicious behaviours. Identifying the malware (testing phase) is also done faster. The benchmark databases Malimg and Microsoft Malware Classification Challenge (BIG-2015) have been used to analyse the performance of the system. Overall average classification accuracies of 99.03% and 99.11% are achieved, respectively.

    MDFRCNN: Malware Detection using Faster Region Proposals Convolution Neural Network

    Technological advancement of smart devices has opened up a new trend: Internet of Everything (IoE), where all devices are connected to the web. Large scale networking benefits the community by increasing connectivity and giving control of physical devices. On the other hand, there exists an increased ‘Threat’ of an ‘Attack’. Attackers are targeting these devices, as they may provide an easier ‘backdoor entry to the users’ network’. MALicious softWARE (MalWare) is a major threat to user security. Fast and accurate detection of malware attacks is the sine qua non of IoE, where large scale networking is involved. The paper proposes use of a visualization technique where the disassembled malware code is converted into gray images, as well as use of Image Similarity based Statistical Parameters (ISSP) such as Normalized Cross correlation (NCC), Average difference (AD), Maximum difference (MaxD), Singular Structural Similarity Index Module (SSIM), Laplacian Mean Square Error (LMSE), MSE and PSNR. A vector consisting of the gray image with statistical parameters is trained using a Faster Region proposals Convolution Neural Network (F-RCNN) classifier. The experiment results are promising as the proposed method includes ISSP with F-RCNN training. Overall training time for learning the semantics of higher-level malicious behaviors is lower. Identification of malware (testing phase) is also performed in less time. The fusion of image and statistical parameters enhances system performance with greater accuracy. The benchmark database from the Microsoft Malware Classification challenge, which is available on the Kaggle website, has been used to analyze system performance. An overall average classification accuracy of 98.12% is achieved by the proposed method.
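Several of the image-similarity parameters named in this abstract have simple closed forms. The sketch below uses common textbook definitions for MSE, PSNR, NCC, AD and MaxD on two equally sized 8-bit grayscale images given as flat lists; the paper may define or normalize them differently, and SSIM and LMSE are omitted. The function name and sample pixel values are made up for illustration.

```python
import math


def similarity_parameters(x, y, peak: int = 255):
    """Compute common image-similarity statistics for two flat pixel lists.

    Assumed definitions (the paper's exact formulas may differ):
      MSE  = mean((x - y)^2)
      PSNR = 10 * log10(peak^2 / MSE), infinite when MSE is 0
      NCC  = sum(x * y) / sum(x^2)
      AD   = mean(x - y)
      MaxD = max(|x - y|)
    """
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    diffs = [a - b for a, b in zip(x, y)]
    mse = sum(d * d for d in diffs) / len(diffs)
    psnr = math.inf if mse == 0 else 10 * math.log10(peak * peak / mse)
    ncc = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return {
        "MSE": mse,
        "PSNR": psnr,
        "NCC": ncc,
        "AD": sum(diffs) / len(diffs),
        "MaxD": max(abs(d) for d in diffs),
    }


# Usage with two tiny made-up "images" of four pixels each.
stats = similarity_parameters([10, 20, 30, 40], [12, 18, 30, 44])
print(stats)
```

How such statistics are combined with the gray image into the vector given to the F-RCNN classifier is not described in the abstract, so that step is not sketched here.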

    Classification and Analysis of Android Malware Images Using Feature Fusion Technique

    Feature-packed functionality and artificial intelligence (AI)-powered applications have made the Android operating system a big player in the market. Android smartphones have become an integral part of life and users are reliant on their smart devices for making calls, sending text messages, navigation, games, and financial transactions, to name a few. This evolution of the smartphone community has opened new horizons for malware developers. As malware variants are growing at a tremendous rate every year, there is an urgent need to combat stealth malware techniques. This paper proposes a visualization and machine learning-based framework for classifying Android malware. Android malware applications from the DREBIN dataset were converted into grayscale images. In the first phase of the experiment, the proposed framework transforms Android malware into fifteen different image sections and identifies malware files by exploiting handcrafted features associated with Android malware images. Algorithms such as Gray Level Co-occurrence Matrix-based (GLCM), Global Image deScripTors (GIST), and Local Binary Pattern (LBP) are used to extract the handcrafted features from the image sections. The extracted features were further classified using machine learning algorithms like K-Nearest Neighbors, Support Vector Machines, and Random Forests. In the second phase of the experiment, handcrafted features were fused with CNN features to form the feature fusion strategy. The classification performance was evaluated against every malware image file section. The results obtained using the Feature Fusion strategy are compared with the handcrafted feature results. The experiment results indicate that the Feature Fusion-SVM model is most suited for the identification and classification of Android malware using the certificate and Android Manifest (CR + AM) malware images. It attained a high accuracy of 93.24%.
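As an illustration of one handcrafted descriptor mentioned above, the following sketch computes a basic 3x3 Local Binary Pattern histogram and fuses it with another feature vector. The neighbor ordering, the `>=` comparison, and fusion by plain concatenation are assumptions; the abstract does not say which LBP variant or fusion rule the authors used, and the CNN feature values are placeholders.

```python
# Clockwise neighbor offsets starting at the top-left pixel (an assumed convention).
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row: int, col: int) -> int:
    """Return the 8-bit LBP code of an interior pixel of a 2D grayscale image."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Count LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


def fuse_features(handcrafted, cnn_features):
    """Fuse two feature vectors by concatenation (an assumed fusion rule)."""
    return list(handcrafted) + list(cnn_features)


# Usage on a tiny made-up image; the CNN features are placeholder numbers.
image = [[6, 4, 7],
         [3, 5, 8],
         [9, 2, 5]]
hist = lbp_histogram(image)
fused = fuse_features(hist, [0.25, 0.75])
print(lbp_code(image, 1, 1), len(fused))
```

The fused vector would then be passed to a classifier such as an SVM; the classifier settings are not given in the abstract.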