100 research outputs found

    Malware Resistant Data Protection in Hyper-connected Networks: A survey

    Full text link
    Data protection is the process of securing sensitive information from being corrupted, compromised, or lost. A hyperconnected network, on the other hand, is a computer networking trend in which communication occurs over a network. However, what about malware. Malware is malicious software meant to penetrate private data, threaten a computer system, or gain unauthorised network access without the users consent. Due to the increasing applications of computers and dependency on electronically saved private data, malware attacks on sensitive information have become a dangerous issue for individuals and organizations across the world. Hence, malware defense is critical for keeping our computer systems and data protected. Many recent survey articles have focused on either malware detection systems or single attacking strategies variously. To the best of our knowledge, no survey paper demonstrates malware attack patterns and defense strategies combinedly. Through this survey, this paper aims to address this issue by merging diverse malicious attack patterns and machine learning (ML) based detection models for modern and sophisticated malware. In doing so, we focus on the taxonomy of malware attack patterns based on four fundamental dimensions the primary goal of the attack, method of attack, targeted exposure and execution process, and types of malware that perform each attack. Detailed information on malware analysis approaches is also investigated. In addition, existing malware detection techniques employing feature extraction and ML algorithms are discussed extensively. Finally, it discusses research difficulties and unsolved problems, including future research directions.Comment: 30 pages, 9 figures, 7 tables, no where submitted ye

    Machine-Learning Classifiers for Malware Detection Using Data Features

    Get PDF
    The spread of ransomware has risen exponentially over the past decade, causing huge financial damage to multiple organizations. Various anti-ransomware firms have suggested methods for preventing malware threats. The growing pace, scale and sophistication of malware provide the anti-malware industry with more challenges. Recent literature indicates that academics and anti-virus organizations have begun to use artificial learning as well as fundamental modeling techniques for the research and identification of malware. Orthodox signature-based anti-virus programs struggle to identify unfamiliar malware and track new forms of malware. In this study, a malware evaluation framework focused on machine learning was adopted that consists of several modules: dataset compiling in two separate classes (malicious and benign software), file disassembly, data processing, decision making, and updated malware identification. The data processing module uses grey images, functions for importing and Opcode n-gram to remove malware functionality. The decision making module detects malware and recognizes suspected malware. Different classifiers were considered in the research methodology for the detection and classification of malware. Its effectiveness was validated on the basis of the accuracy of the complete process

    Clustering versus SVM for Malware Detection

    Get PDF
    Previous work has shown that we can effectively cluster certain classes of mal- ware into their respective families. In this research, we extend this previous work to the problem of developing an automated malware detection system. We first compute clusters for a collection of malware families. Then we analyze the effectiveness of clas- sifying new samples based on these existing clusters. We compare results obtained using �-means and Expectation Maximization (EM) clustering to those obtained us- ing Support Vector Machines (SVM). Using clustering, we are able to detect some malware families with an accuracy comparable to that of SVMs. One advantage of the clustering approach is that there is no need to retrain for new malware families

    A Hybrid Model for Android Malware Detection using Decision Tree and KNN

    Get PDF
    Malwares are becoming a major problem nowadays all around the world in android operating systems. The malware is a piece of software developed for harming or exploiting certain other hardware as well as software. The term Malware is also known as malicious software which is utilized to define Trojans, viruses, as well as other kinds of spyware. There have been developed many kinds of techniques for protecting the android operating systems from malware during the last decade. However, the existing techniques have numerous drawbacks such as accuracy to detect the type of malware in real-time in a quick manner for protecting the android operating systems. In this article, the authors developed a hybrid model for android malware detection using a decision tree and KNN (k-nearest neighbours) technique. First, Dalvik opcode, as well as real opcode, was pulled out by using the reverse procedure of the android software. Secondly, eigenvectors of sampling were produced by utilizing the n-gram model. Our suggested hybrid model efficiently combines KNN along with the decision tree for effective detection of the android malware in real-time. The outcome of the proposed scheme illustrates that the proposed hybrid model is better in terms of the accurate detection of any kind of malware from the Android operating system in a fast and accurate manner. In this experiment, 815 sample size was selected for the normal samples and the 3268-sample size was selected for the malicious samples. Our proposed hybrid model provides pragmatic values of the parameters namely precision, ACC along with the Recall, and F1 such as 0.93, 0.98, 0.96, and 0.99 along with 0.94, 0.99, 0.93, and 0.99 respectively. In the future, there are vital possibilities to carry out more research in this field to develop new methods for Android malware detection

    Malware Detection and Analysis

    Get PDF
    Malicious software poses a serious threat to the cybersecurity of network infrastructures and is a global pandemic in the form of computer viruses, Trojan horses, and Internet worms. Studies imply that the effects of malware are deteriorating. The main defense against malware is malware detectors. The methods that such a detector employ define its level of quality. Therefore, it is crucial that we research malware detection methods and comprehend their advantages and disadvantages. Attackers are creating malware that is polymorphic and metamorphic and has the capacity to modify their source code as they spread. Furthermore, existing defenses, which often utilize signature-based approaches and are unable to identify the previously undiscovered harmful executables, are significantly undermined by the diversity and volume of their variations. Malware families\u27 variations exhibit common behavioral characteristics that reveal their origin and function. Machine learning techniques may be used to detect and categorize novel viruses into their recognized families utilizing the behavioral patterns discovered via static or dynamic analysis. In this paper, we\u27ll talk about malware, its various forms, malware concealment strategies, and malware attack mechanisms. Additionally, many detection methods and classification models are presented in this study. The method of malware analysis is demonstrated by conducting an analysis of a malware program in a contained environment

    A Survey on Malware Analysis Techniques: Static, Dynamic, Hybrid and Memory Analysis

    Get PDF
    Now a day the threat of malware is increasing rapidly. A software that sneaks to your computer system without your knowledge with a harmful intent to disrupt your computer operations. Due to the vast number of malware, it is impossible to handle malware by human engineers. Therefore, security researchers are taking great efforts to develop accurate and effective techniques to detect malware. This paper presents a semantic and detailed survey of methods used for malware detection like signature-based and heuristic-based. The Signature-based technique is largely used today by anti-virus software to detect malware, is fast and capable to detect known malware. However, it is not effective in detecting zero-day malware and it is easily defeated by malware that use obfuscation techniques. Likewise, a considerable false positive rate and high amount of scanning time are the main limitations of heuristic-based techniques. Alternatively, memory analysis is a promising technique that gives a comprehensive view of malware and it is expected to become more popular in malware analysis. The main contributions of this paper are: (1) providing an overview of malware types and malware detection approaches, (2) discussing the current malware analysis techniques, their findings and limitations, (3) studying the malware obfuscation, attacking and anti-analysis techniques, and (4) exploring the structure of memory-based analysis in malware detection. The detection approaches have been compared with each other according to their techniques, selected features, accuracy rates, and their advantages and disadvantages. This paper aims to help the researchers to have a general view of malware detection field and to discuss the importance of memory-based analysis in malware detection

    Static malware detection Using Stacked BiLSTM and GPT-2

    Get PDF
    In recent years, cyber threats and malicious software attacks have been escalated on various platforms. Therefore, it has become essential to develop automated machine learning methods for defending against malware. In the present study, we propose stacked bidirectional long short-term memory (Stacked BiLSTM) and generative pre-trained transformer based (GPT-2) deep learning language models for detecting malicious code. We developed language models using assembly instructions extracted from .text sections of malicious and benign Portable Executable (PE) files. We treated each instruction as a sentence and each .text section as a document. We also labeled each sentence and document as benign or malicious, according to the file source. We created three datasets from those sentences and documents. The first dataset, composed of documents, was fed into a Document Level Analysis Model (DLAM) based on Stacked BiLSTM. The second dataset, composed of sentences, was used in Sentence Level Analysis Models (SLAMs) based on Stacked BiLSTM and DistilBERT, Domain Specific Language Model GPT-2 (DSLM-GPT2), and General Language Model GPT-2 (GLM-GPT2). Lastly, we merged all assembly instructions without labels for creating the third dataset; then we fed a custom pre-trained model with it. We then compared malware detection performances. The results showed that the pre-trained model improved the DSLM-GPT2 and GLM-GPT2 detection performance. The experiments showed that the DLAM, the SLAM based on DistilBERT, the DSLM-GPT2, and the GLM-GPT2 achieved 98.3%, 70.4%, 86.0%, and 76.2% F1 scores, respectively
    corecore