452 research outputs found

    GBJOF: Gradient Boosting Integrated with Jaya Algorithm to Optimize the Features in Malware Analysis

    Get PDF
    Malware analysis is used to identify suspicious file transferring in the network. It can be identified efficiently by using the reverse engineering hybrid approach. Implementing a hybrid approach depends on the feature selection because the dataset contains static and dynamic parameters. The given dataset contains 85 attributes with 10 different class labels. Since it has high dimensional and multi-classification data, existing approaches of ML could be more efficient in reducing the features. The model combines the enhanced JAYA genetic algorithm with a gradient boosting technique to identify the efficiency and a smaller number of features. Many existing approaches for feature selection either implement correlation analysis or wrapper techniques. The major disadvantages of these issues are that they are facing fitting problems with a very small number of features. With the Usage of the genetic approach, this paper has achieved 95% accuracy with 12 features, approximately 7% greater than ML approaches

    Performance of Malware Classification on Machine Learning using Feature Selection

    Get PDF
    The exponential growth of malware has created a significant threat in our daily lives, which heavily rely on computers running all kinds of software. Malware writers create malicious software by creating new variants, new innovations, new infections and more obfuscated malware by using techniques such as packing and encrypting techniques. Malicious software classification and detection play an important role and a big challenge for cyber security research. Due to the increasing rate of false alarm, the accurate classification and detection of malware is a big necessity issue to be solved. In this research, eight malware family have been classifying according to their family the research provides four feature selection algorithms to select best feature for multiclass classification problem. Comparing. Then find these algorithms top 100 features are selected to performance evaluations. Five machine learning algorithms is compared to find best models. Then frequency distribution of features are find by feature ranking of best model. At last it is said that frequency distribution of every character of API call sequence can be used to classify malware family

    Network Traffic Based Botnet Detection Using Machine Learning

    Get PDF
    The field of information and computer security is rapidly developing in today’s world as the number of security risks is continuously being explored every day. The moment a new software or a product is launched in the market, a new exploit or vulnerability is exposed and exploited by the attackers or malicious users for different motives. Many attacks are distributed in nature and carried out by botnets that cause widespread disruption of network activity by carrying out DDoS (Distributed Denial of Service) attacks, email spamming, click fraud, information and identity theft, virtual deceit and distributed resource usage for cryptocurrency mining. Botnet detection is still an active area of research as no single technique is available that can detect the entire ecosystem of a botnet like Neris, Rbot, and Virut. They tend to have different configurations and heavily armored by malware writers to evade detection systems by employing sophisticated evasion techniques. This report provides a detailed overview of a botnet and its characteristics and the existing work that is done in the domain of botnet detection. The study aims to evaluate the preprocessing techniques like variance thresholding and one-hot encoding to clean the botnet dataset and feature selection technique like filter, wrapper and embedded method to boost the machine learning model performance. This study addresses the dataset imbalance issues through techniques like undersampling, oversampling, ensemble learning and gradient boosting by using random forest, decision tree, AdaBoost and XGBoost. Lastly, the optimal model is then trained and tested on the dataset of different attacks to study its performance

    Detection of Android Malware using Feature Selection with a Hybrid Genetic Algorithm and Simulated Annealing (SVM and DBN)

    Get PDF
    Because of the widespread use of the Android operating system and the simplicity with which applications can be created on the Android platform, anyone can easily create malware using pre-made tools. Due to the spread of malware among many helpful applications, Android users are experiencing issues. In this study, we showed how to use permissions gleaned from static analysis to identify Android malware. Utilising support vector machines and deep belief networks, we choose the pertinent features from the set of permissions based on this methodology. The suggested technique increases the effectiveness of Android malware detection

    Intellectual Feature Ranking Model with Correlated Feature Set based Malware Detection in Cloud environment using Machine Learning

    Get PDF
    Malware detection for cloud systems has been studied extensively, and many different approaches have been developed and implemented in an effort to stay ahead of this ever-evolving threat. Malware refers to any programme or defect that is designed to duplicate itself or cause damage to the system's hardware or software. These attacks are designed specifically to cause harm to operational systems, but they are invisible to the human eye. One of the most exciting developments in data storage and service delivery today is cloud computing. There are significant benefits to be gained over more conventional protection methods by making use of this fast evolving technology to protect computer-based systems from cyber-related threats. Assets to be secured may reside in any networked computing environment, including but not limited to Cyber Physical Systems (CPS), critical systems, fixed and portable computers, mobile devices, and the Internet of Things (IoT). Malicious software or malware refers to any programme that intentionally compromises a computer system in order to compromise its security, privacy, or availability. A cloud-based intelligent behavior analysis model for malware detection system using feature set is proposed to identify the ever-increasing malware attacks. The suggested system begins by collecting malware samples from several virtual machines, from which unique characteristics can be extracted easily. Then, the malicious and safe samples are separated using the features provided to the learning-based and rule-based detection agents. To generate a relevant feature set for accurate malware detection, this research proposes an Intellectual Feature Ranking Model with Correlated Feature Set (IFR-CFS) model using enhanced logistic regression model for accurate detection of malware in the cloud environment. The proposed model when compared to the traditional feature selection model, performs better in generation of feature set for accurate detection of malware

    Enhancing Efficiency and Privacy in Memory-Based Malware Classification through Feature Selection

    Full text link
    Malware poses a significant security risk to individuals, organizations, and critical infrastructure by compromising systems and data. Leveraging memory dumps that offer snapshots of computer memory can aid the analysis and detection of malicious content, including malware. To improve the efficacy and address privacy concerns in malware classification systems, feature selection can play a critical role as it is capable of identifying the most relevant features, thus, minimizing the amount of data fed to classifiers. In this study, we employ three feature selection approaches to identify significant features from memory content and use them with a diverse set of classifiers to enhance the performance and privacy of the classification task. Comprehensive experiments are conducted across three levels of malware classification tasks: i) binary-level benign or malware classification, ii) malware type classification (including Trojan horse, ransomware, and spyware), and iii) malware family classification within each family (with varying numbers of classes). Results demonstrate that the feature selection strategy, incorporating mutual information and other methods, enhances classifier performance for all tasks. Notably, selecting only 25\% and 50\% of input features using Mutual Information and then employing the Random Forest classifier yields the best results. Our findings reinforce the importance of feature selection for malware classification and provide valuable insights for identifying appropriate approaches. By advancing the effectiveness and privacy of malware classification systems, this research contributes to safeguarding against security threats posed by malicious software.Comment: Accepted in IEEE ICMLA-2023 Conferenc

    MalDetConv: Automated Behaviour-based Malware Detection Framework Based on Natural Language Processing and Deep Learning Techniques

    Full text link
    The popularity of Windows attracts the attention of hackers/cyber-attackers, making Windows devices the primary target of malware attacks in recent years. Several sophisticated malware variants and anti-detection methods have been significantly enhanced and as a result, traditional malware detection techniques have become less effective. This work presents MalBehavD-V1, a new behavioural dataset of Windows Application Programming Interface (API) calls extracted from benign and malware executable files using the dynamic analysis approach. In addition, we present MalDetConV, a new automated behaviour-based framework for detecting both existing and zero-day malware attacks. MalDetConv uses a text processing-based encoder to transform features of API calls into a suitable format supported by deep learning models. It then uses a hybrid of convolutional neural network (CNN) and bidirectional gated recurrent unit (CNN-BiGRU) automatic feature extractor to select high-level features of the API Calls which are then fed to a fully connected neural network module for malware classification. MalDetConv also uses an explainable component that reveals features that contributed to the final classification outcome, helping the decision-making process for security analysts. The performance of the proposed framework is evaluated using our MalBehavD-V1 dataset and other benchmark datasets. The detection results demonstrate the effectiveness of MalDetConv over the state-of-the-art techniques with detection accuracy of 96.10%, 95.73%, 98.18%, and 99.93% achieved while detecting unseen malware from MalBehavD-V1, Allan and John, Brazilian, and Ki-D datasets, respectively. The experimental results show that MalDetConv is highly accurate in detecting both known and zero-day malware attacks on Windows devices
    corecore