650 research outputs found

    Multi-level analysis of Malware using Machine Learning

    Get PDF
    Multi-level analysis of Malware using Machine Learnin

    Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection

    Full text link
    In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology and framework for efficient and effective real-time malware detection, leveraging the best of conventional machine learning (ML) and deep learning (DL) algorithms. In PROPEDEUTICA, all software processes in the system start execution subjected to a conventional ML detector for fast classification. If a piece of software receives a borderline classification, it is subjected to further analysis via more performance expensive and more accurate DL methods, via our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays to the execution of software subjected to deep learning analysis as a way to "buy time" for DL analysis and to rate-limit the impact of possible malware in the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and 877 commonly used benign software samples from various categories for the Windows OS. Our results show that the false positive rate for conventional ML methods can reach 20%, and for modern DL methods it is usually below 6%. However, the classification time for DL can be 100X longer than conventional ML methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the percentage of software subjected to DL analysis was approximately 40% on average. Further, the application of delays in software subjected to ML reduced the detection time by approximately 10%. Finally, we found and discussed a discrepancy between the detection accuracy offline (analysis after all traces are collected) and on-the-fly (analysis in tandem with trace collection). Our insights show that conventional ML and modern DL-based malware detectors in isolation cannot meet the needs of efficient and effective malware detection: high accuracy, low false positive rate, and short classification time.Comment: 17 pages, 7 figure

    Developing resilient cyber-physical systems: A review of state-of-the-art malware detection approaches, gaps, and future directions

    Get PDF
    Cyber-physical systems (CPSes) are rapidly evolving in critical infrastructure (CI) domains such as smart grid, healthcare, the military, and telecommunication. These systems are continually threatened by malicious software (malware) attacks by adversaries due to their improvised tactics and attack methods. A minor configuration change in a CPS through malware has devastating effects, which the world has seen in Stuxnet, BlackEnergy, Industroyer, and Triton. This paper is a comprehensive review of malware analysis practices currently being used and their limitations and efficacy in securing CPSes. Using well-known real-world incidents, we have covered the significant impacts when a CPS is compromised. In particular, we have prepared exhaustive hypothetical scenarios to discuss the implications of false positives on CPSes. To improve the security of critical systems, we believe that nature-inspired metaheuristic algorithms can effectively counter the overwhelming malware threats geared toward CPSes. However, our detailed review shows that these algorithms have not been adapted to their full potential to counter malicious software. Finally, the gaps identified through this research have led us to propose future research directions using nature-inspired algorithms that would help in bringing optimization by reducing false positives, thereby increasing the security of such systems

    Comparing a Hybrid Multi-layered Machine Learning Intrusion Detection System to Single-layered and Deep Learning Models

    Get PDF
    Advancements in computing technology have created additional network attack surface, allowed the development of new attack types, and increased the impact caused by an attack. Researchers agree, current intrusion detection systems (IDSs) are not able to adapt to detect these new attack forms, so alternative IDS methods have been proposed. Among these methods are machine learning-based intrusion detection systems. This research explores the current relevant studies related to intrusion detection systems and machine learning models and proposes a new hybrid machine learning IDS model consisting of the Principal Component Analysis (PCA) and Support Vector Machine (SVM) learning algorithms. The NSL-KDD Dataset, benchmark dataset for IDSs, is used for comparing the models’ performance. The performance accuracy and false-positive rate of the hybrid model are compared to the results of the model’s individual algorithmic components to determine which components most impact attack prediction performance. The performance metrics of the hybrid model are also compared to two deep learning Autoencoder Neuro Network models and the results found that the complexity of the model does not add to the performance accuracy. The research showed that pre-processing and feature selection impact the predictive accuracy across models. Future research recommendations were to implement the proposed hybrid IDS model into a live network for testing and analysis, and to focus research into the pre-processing algorithms that improve performance accuracy, and lower false-positive rate. This research indicated that pre-processing and feature selection/feature extraction can increase model performance accuracy and decrease false-positive rate helping businesses to improve network security

    Research on Network Data Algorithm Based on Association Rules

    Get PDF
    The network data algorithm on account of association can effectively describe the development process of historical data and predict the development trend of data. Draw support from the corresponding data algorithm to ameliorate the mining efficiency and execution efficiency of association, more users pay more attention to the rules, so it has important research and utilization value. On account of this, this paper first analyses the concept and mining process of data association, then studies the mining algorithm of data association, and finally gives the structure and utilization effect of cyber data algorithm on account of association

    On the Sequential Pattern and Rule Mining in the Analysis of Cyber Security Alerts

    Get PDF
    Data mining is well-known for its ability to extract concealed and indistinct patterns in the data, which is a common task in the field of cyber security. However, data mining is not always used to its full potential among cyber security community. In this paper, we discuss usability of sequential pattern and rule mining, a subset of data mining methods, in an analysis of cyber security alerts. First, we survey the use case of data mining, namely alert correlation and attack prediction. Subsequently, we evaluate sequential pattern and rule mining methods to find the one that is both fast and provides valuable results while dealing with the peculiarities of security alerts. An experiment was performed using the dataset of real alerts from an alert sharing platform. Finally, we present lessons learned from the experiment and a comparison of the selected methods based on their performance and soundness of the results

    From Intrusion Detection to Attacker Attribution: A Comprehensive Survey of Unsupervised Methods

    Get PDF
    Over the last five years there has been an increase in the frequency and diversity of network attacks. This holds true, as more and more organisations admit compromises on a daily basis. Many misuse and anomaly based Intrusion Detection Systems (IDSs) that rely on either signatures, supervised or statistical methods have been proposed in the literature, but their trustworthiness is debatable. Moreover, as this work uncovers, the current IDSs are based on obsolete attack classes that do not reflect the current attack trends. For these reasons, this paper provides a comprehensive overview of unsupervised and hybrid methods for intrusion detection, discussing their potential in the domain. We also present and highlight the importance of feature engineering techniques that have been proposed for intrusion detection. Furthermore, we discuss that current IDSs should evolve from simple detection to correlation and attribution. We descant how IDS data could be used to reconstruct and correlate attacks to identify attackers, with the use of advanced data analytics techniques. Finally, we argue how the present IDS attack classes can be extended to match the modern attacks and propose three new classes regarding the outgoing network communicatio

    Malware Resistant Data Protection in Hyper-connected Networks: A survey

    Full text link
    Data protection is the process of securing sensitive information from being corrupted, compromised, or lost. A hyperconnected network, on the other hand, is a computer networking trend in which communication occurs over a network. However, what about malware. Malware is malicious software meant to penetrate private data, threaten a computer system, or gain unauthorised network access without the users consent. Due to the increasing applications of computers and dependency on electronically saved private data, malware attacks on sensitive information have become a dangerous issue for individuals and organizations across the world. Hence, malware defense is critical for keeping our computer systems and data protected. Many recent survey articles have focused on either malware detection systems or single attacking strategies variously. To the best of our knowledge, no survey paper demonstrates malware attack patterns and defense strategies combinedly. Through this survey, this paper aims to address this issue by merging diverse malicious attack patterns and machine learning (ML) based detection models for modern and sophisticated malware. In doing so, we focus on the taxonomy of malware attack patterns based on four fundamental dimensions the primary goal of the attack, method of attack, targeted exposure and execution process, and types of malware that perform each attack. Detailed information on malware analysis approaches is also investigated. In addition, existing malware detection techniques employing feature extraction and ML algorithms are discussed extensively. Finally, it discusses research difficulties and unsolved problems, including future research directions.Comment: 30 pages, 9 figures, 7 tables, no where submitted ye
    • …
    corecore