11 research outputs found

    IoT Botnet Malware Classification Using Weka Tool and Scikit-learn Machine Learning

    Botnets are one of the threats to internet network security: a botmaster carries out attacks on the network by relying on communication over network traffic. Internet of Things (IoT) network infrastructure consists of devices that are inexpensive, low-power, always on, always connected to the network, inconspicuous, and ubiquitous, and these characteristics make IoT devices an attractive target for botnet malware attacks. To identify whether packet traffic is a malware attack or not, one can use machine learning classification methods. Using the Weka and Scikit-learn machine learning analysis tools, this paper implements four machine learning algorithms: AdaBoost, Decision Tree, Random Forest, and Naïve Bayes. Experiments are then conducted to measure the performance of the four algorithms in terms of accuracy, execution time, and false positive rate (FPR). Experimental results show that the Weka tool provides more accurate and efficient classification methods. However, in terms of false positive rate, Scikit-learn provides better results.
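
    As an illustration of two of the metrics named in this abstract, the following minimal sketch computes accuracy and false positive rate from binary labels. It is not the paper's code: the function name `accuracy_and_fpr`, the label convention (1 = malware, 0 = benign), and the sample labels are assumptions made for this example.

```python
def accuracy_and_fpr(y_true, y_pred, positive=1):
    """Return (accuracy, false positive rate) for binary labels.

    Assumption for this sketch: `positive` marks malicious traffic and
    every other label value is treated as benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # malware correctly flagged
            else:
                fp += 1  # benign traffic wrongly flagged
        else:
            if truth == positive:
                fn += 1  # malware missed
            else:
                tn += 1  # benign traffic correctly passed

    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # FPR = FP / (FP + TN): share of benign samples flagged as malware.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Hypothetical labels, invented for illustration only.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
acc, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={acc:.2f} fpr={fpr:.2f}")
```

    A classifier can score well on accuracy and still have a poor false positive rate when benign traffic dominates, which is why the two are reported separately.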

    Malware Detection and Analysis Tools

    The huge amounts of data and information that need to be analyzed for possible malicious intent are one of the most significant challenges that the Web faces today. Malicious software, also referred to as malware, is developed by attackers and is polymorphic and metamorphic in nature, so it can modify its code as it spreads. In addition, the diversity and volume of malware variants severely undermine the effectiveness of traditional defenses, which typically use signature-based techniques and are unable to detect previously unknown malicious executables. Malware family variants share typical patterns of behavior that indicate their origin and purpose. The behavioral trends observed either statically or dynamically can be exploited using machine learning techniques to identify and classify unknown malware into their established families. This survey paper gives an overview of malware detection and analysis techniques and tools.

    GRASE: Granulometry Analysis with Semi Eager Classifier to Detect Malware

    Technological advancement in communication leading to 5G motivates everyone to get connected to the internet, including ‘Devices’, a technology named Web of Things (WoT). The community benefits from this large-scale network, which allows monitoring and controlling of physical devices. But this often comes at the cost of security, as MALicious softWARE (MalWare) developers try to invade the network; for them, these devices are like a ‘backdoor’ providing easy ‘entry’. To stop invaders from entering the network, identifying malware and its variants is of great significance for cyberspace. Traditional methods of malware detection, such as static and dynamic ones, detect the malware but fall short against new techniques used by malware developers, such as obfuscation, polymorphism and encryption. A machine learning approach to detecting malware, where the classifier is trained with handcrafted features, is not potent against these techniques and demands considerable feature engineering effort. The paper proposes malware classification using a visualization methodology wherein the disassembled malware code is transformed into grey images. It presents the efficacy of the Granulometry texture analysis technique for improving malware classification. Furthermore, a Semi Eager (SemiE) classifier, which is a combination of eager learning and lazy learning techniques, is used to obtain robust classification of malware families. The outcome of the experiment is promising since the proposed technique requires less training time to learn the semantics of higher-level malicious behaviours. Identifying the malware (testing phase) is also faster. The benchmark databases Malimg and Microsoft Malware Classification Challenge (BIG-2015) have been used to analyse the performance of the system. Overall average classification accuracies of 99.03% and 99.11% are achieved, respectively.
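
    The visualization step mentioned in this abstract can be pictured with a minimal sketch that maps a byte sequence to a grey image, one byte per pixel. This is an assumption-laden illustration, not the GRASE pipeline: the fixed row width, the zero padding, and the name `bytes_to_grey_rows` are choices made for this example, and the abstract does not specify how the disassembled code is laid out.

```python
def bytes_to_grey_rows(data: bytes, width: int = 4) -> list[list[int]]:
    """Lay out bytes as rows of grey pixel intensities (0-255).

    Assumptions for this sketch: each byte becomes one pixel, rows have a
    fixed `width`, and the last row is padded with zeros (black pixels).
    """
    if width <= 0:
        raise ValueError("width must be positive")

    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])   # bytes -> ints in 0..255
        row.extend([0] * (width - len(row)))    # pad an incomplete last row
        rows.append(row)
    return rows


# Hypothetical six-byte input, invented for illustration only.
image = bytes_to_grey_rows(bytes([0x4D, 0x5A, 0x90, 0x00, 0x03, 0xFF]), width=4)
for row in image:
    print(row)
```

    Texture descriptors such as granulometry would then operate on an image of this kind; that step is not shown here.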

    MDFRCNN: Malware Detection using Faster Region Proposals Convolution Neural Network

    Technological advancement of smart devices has opened up a new trend: Internet of Everything (IoE), where all devices are connected to the web. Large-scale networking benefits the community by increasing connectivity and giving control of physical devices. On the other hand, there exists an increased ‘Threat’ of an ‘Attack’. Attackers are targeting these devices, as they may provide an easier ‘backdoor entry to the users’ network’. MALicious softWARE (MalWare) is a major threat to user security. Fast and accurate detection of malware attacks is the sine qua non of IoE, where large-scale networking is involved. The paper proposes use of a visualization technique where the disassembled malware code is converted into gray images, as well as use of Image Similarity based Statistical Parameters (ISSP) such as Normalized Cross correlation (NCC), Average difference (AD), Maximum difference (MaxD), Singular Structural Similarity Index Module (SSIM), Laplacian Mean Square Error (LMSE), MSE and PSNR. A vector consisting of the gray image with statistical parameters is used to train a Faster Region proposals Convolution Neural Network (F-RCNN) classifier. The experimental results are promising, as the proposed method includes ISSP with F-RCNN training. The overall training time for learning the semantics of higher-level malicious behaviors is reduced. Identification of malware (testing phase) is also performed in less time. The fusion of image and statistical parameters enhances system performance with greater accuracy. The benchmark database from the Microsoft Malware Classification Challenge, which is available on the Kaggle website, has been used to analyze system performance. An overall average classification accuracy of 98.12% is achieved by the proposed method.
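
    Two of the image similarity parameters listed in this abstract, MSE and PSNR, have standard definitions that can be sketched briefly. The code below is illustrative only and is not taken from the paper: the function names, the nested-list image representation, and the 8-bit peak value of 255 are assumptions for this example.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two equally sized grey images."""
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same shape")

    pixel_count = sum(len(row) for row in image_a)
    if pixel_count == 0:
        raise ValueError("images must not be empty")

    squared_error = sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for pa, pb in zip(row_a, row_b)
    )
    return squared_error / pixel_count


def psnr(image_a, image_b, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(image_a, image_b)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Hypothetical 2x2 grey images, invented for illustration only.
a = [[0, 0], [255, 255]]
b = [[0, 10], [255, 245]]
print(mse(a, b), round(psnr(a, b), 2))
```

    Lower MSE and higher PSNR both indicate that two images are more alike, so the pair carries overlapping information when used together as features.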

    Survey of Machine Learning Techniques for Malware Analysis

    Coping with malware is getting more and more challenging, given its relentless growth in complexity and volume. One of the most common approaches in the literature is to use machine learning techniques to automatically learn models and patterns behind such complexity, and to develop technologies for keeping pace with the speed of development of novel malware. This survey aims to provide an overview of the way machine learning has been used so far in the context of malware analysis. We systematize surveyed papers according to their objectives (i.e., the expected output, what the analysis aims to achieve), what information about malware they specifically use (i.e., the features), and what machine learning techniques they employ (i.e., what algorithm is used to process the input and produce the output). We also outline a number of problems concerning the datasets used in the considered works, and finally introduce the novel concept of malware analysis economics, regarding the study of existing tradeoffs among key metrics, such as analysis accuracy and economic costs.

    Tree-Based Classifier Ensembles for PE Malware Analysis: A Performance Revisit

    Given their escalating number and variety, combating malware is becoming increasingly strenuous. Machine learning techniques are often used in the literature to automatically discover the models and patterns behind such challenges and create solutions that can maintain the rapid pace at which malware evolves. This article compares various tree-based ensemble learning methods that have been proposed in the analysis of PE malware. A tree-based ensemble is an unconventional learning paradigm that constructs and combines a collection of base learners (e.g., decision trees), as opposed to the conventional learning paradigm, which aims to construct individual learners from training data. Several tree-based ensemble techniques, such as random forest, XGBoost, CatBoost, GBM, and LightGBM, are taken into consideration and are appraised using different performance measures, such as accuracy, MCC, precision, recall, AUC, and F1. In addition, the experiment includes many public datasets, such as BODMAS, Kaggle, and CIC-MalMem-2022, to demonstrate the generalizability of the classifiers in a variety of contexts. Based on the test findings, all tree-based ensembles performed well, and performance differences between algorithms are not statistically significant, particularly when their respective hyperparameters are appropriately configured. The proposed tree-based ensemble techniques also outperformed other, similar PE malware detectors that have been published in recent years
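
    Several of the performance measures named in this abstract (precision, recall, F1, and MCC) follow directly from confusion-matrix counts. The sketch below shows the standard formulas; the function name `classification_scores` and the counts are assumptions for this example and do not come from the article's experiments.

```python
import math


def classification_scores(tp, fp, tn, fn):
    """Precision, recall, F1 and MCC from binary confusion-matrix counts.

    Undefined ratios (zero denominators) are reported as 0.0, which is a
    convention chosen for this sketch.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient uses all four cells of the matrix.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical counts, invented for illustration only.
scores = classification_scores(tp=40, fp=10, tn=45, fn=5)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

    MCC is often reported alongside F1 because it also accounts for true negatives, which matters when benign and malicious samples are unevenly represented.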

    Analyzing and Detecting Internet of Things Malware Using Residual Static Graph- and String-Based Artifacts

    Recently, the Internet of Things (IoT) has grown wider, adopted many features from social networks, and come to rely mainly on sensing device technologies, causing a rapid increase in production and adoption. However, security and privacy threats are serious, and users usually take precautions to protect their devices and information. Thus, understanding the security shortcomings at an early stage will educate IoT users to protect their connected things. Understanding IoT software through analysis, comparison (with other types of malware), and detection (from benign IoT) is an essential problem in mitigating security threats. We focus on two central perspectives, the graph and string representations of the software, typically extracted from the software binaries. First, we conduct a comparative study of Android and IoT malware through the lens of graph measurements. We construct the abstract structures of the malware, using the Control Flow Graph (CFG) to represent malware binaries, and use them to conduct an in-depth analysis of malicious graphs. Machine Learning (ML) algorithms are actively used in the process of detecting and classifying malicious software. Toward detection, we use the different CFG-based features mentioned above, augment them with CFGs of the benign dataset, and build a detection system. Furthermore, we classify the IoT malware into their corresponding families. However, adversarial ML attacks on malware detectors have been proposed in the literature. For example, Adversarial Examples (AEs) on the CFG can be generated by applying small perturbations to the graph features that force the model to misclassify. Thus, we propose Soteria, a CFG-based AE detector utilizing deep learning with random walks to construct in-depth features. Moreover, we detect malicious shell commands by extracting and analyzing the malicious commands of IoT malware. We utilize Natural Language Processing (NLP) for feature generation, followed by a deep learning model to detect malicious commands, hence detecting malware samples.