175 research outputs found

    Machine Learning Aided Static Malware Analysis: A Survey and Tutorial

    Malware analysis and detection techniques have evolved over the last decade in response to the development of malware techniques that evade network-based and host-based security protections. The fast growth in the variety and number of malware species has made it very difficult for forensic investigators to provide a timely response. Therefore, Machine Learning (ML) aided malware analysis has become a necessity for automating different aspects of static and dynamic malware investigation. We believe that machine learning aided static analysis can be used as a methodological approach in technical Cyber Threats Intelligence (CTI), rather than resource-consuming dynamic malware analysis, which has been thoroughly studied before. In this paper, we address this research gap by conducting an in-depth survey of different machine learning methods for classifying static characteristics of 32-bit malicious Portable Executable (PE32) Windows files, and we develop a taxonomy for better understanding of these techniques. Afterwards, we offer a tutorial on how different machine learning techniques can be used to extract and analyze a variety of static characteristics of PE binaries, and we evaluate the accuracy and practical generalization of these techniques. Finally, we give the results of an experimental study of all the methods on common data to demonstrate their accuracy and complexity. This paper may serve as a stepping stone for future researchers in the cross-disciplinary field of machine learning aided malware forensics. Comment: 37 pages

    An investigation of a deep learning based malware detection system

    We investigate a Deep Learning based system for malware detection. In the investigation, we experiment with different combinations of Deep Learning architectures, including Auto-Encoders and Deep Neural Networks with varying layers, over the Malicia malware dataset, on which earlier studies have obtained an accuracy of 98% with an acceptable False Positive Rate (1.07%). But these results were obtained using extensive hand-crafted custom domain features, with a corresponding investment in feature engineering and design effort. Our proposed approach, besides improving on the previous best results (99.21% accuracy and a False Positive Rate of 0.19%), indicates that Deep Learning based systems could deliver an effective defense against malware. Since Deep Learning is good at automatically extracting higher conceptual features from the data, such systems could provide an effective, general and scalable mechanism for detecting existing and unknown malware. Comment: 13 pages, 4 figures

    Applications in security and evasions in machine learning : a survey

    In recent years, machine learning (ML) has become an important means of providing security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessment and many more. ML extensively supports the demanding requirements of current security and privacy scenarios across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency and error-free processing. Therefore, in this paper, we review state-of-the-art approaches in which ML can be applied more effectively to fulfill current real-world security requirements. We examine different security applications in which ML models play an essential role and compare their accuracy results along different possible dimensions. This analysis of ML algorithms in security applications provides a blueprint for an interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade ML models by mounting adversarial attacks. Therefore, the need arises to assess the vulnerability of ML models to adversarial attacks at development time. Accordingly, as a supplement to this point, we also analyze the different types of adversarial attacks on ML models. To visualize the security properties properly, we present the threat model and defense strategies against adversarial attack methods. Moreover, we illustrate adversarial attacks based on the attacker's knowledge of the model and identify the points in the model at which attacks may be mounted. Finally, we also investigate different types of properties of the adversarial attacks

    Intelligent Agents for Active Malware Analysis

    The main contribution of this thesis is a novel perspective on Active Malware Analysis, modeled as a decision-making process between intelligent agents. We propose solutions aimed at extracting the behaviors of malware agents with advanced Artificial Intelligence techniques. In particular, we devise novel action selection strategies for the analyzer agents that make it possible to analyze malware by selecting sequences of triggering actions aimed at maximizing the information acquired. The goal is to create informative models representing the behaviors of the malware agents observed while interacting with them during the analysis process. Such models can then be used to effectively compare one malware sample against others and to correctly identify the malware family

    Detection and Classification of Malicious Processes Using System Call Analysis

    Despite efforts to mitigate the malware threat, the proliferation of malware continues, with record-setting numbers of malware samples being discovered each quarter. Malware is any intentionally malicious software, including software designed for extortion, sabotage, and espionage. Traditional malware defenses are primarily signature-based and heuristic-based, and include firewalls, intrusion detection systems, and antivirus software. Such defenses are reactive, performing well against known threats but struggling against new malware variants and zero-day threats. Together, the reactive nature of traditional defenses and the continuing spread of malware motivate the development of new techniques to detect such threats. One promising set of techniques uses features extracted from system call traces to infer malicious behaviors. This thesis studies the problem of detecting and classifying malicious processes using system call trace analysis. The goal of this study is to identify techniques that are 'lightweight' enough and exhibit a low enough false positive rate to be deployed in production environments. The major contributions of this work are (1) a study of the effects of feature extraction strategy on malware detection performance; (2) a comparison of signature-based and statistical analysis techniques for malware detection and classification; (3) the use of sequential detection techniques to identify malicious behaviors as quickly as possible; (4) a study of malware detection performance at very low false positive rates; and (5) an extensive empirical evaluation, wherein the performance of the malware detection and classification systems is evaluated against data collected from production hosts and from the execution of recently discovered malware samples.
    The outcome of this study is a proof-of-concept system that detects the execution of malicious processes in production environments and classifies them according to their similarity to known malware. Ph.D., Electrical Engineering -- Drexel University, 201

    AndroDialysis: Analysis of Android Intent Effectiveness in Malware Detection

    © 2016 Elsevier Ltd. The wide popularity of Android systems has been accompanied by an increase in the number of malware samples targeting these systems. This is largely due to the open nature of the Android framework, which facilitates the incorporation of third-party applications running on top of any Android device. Inter-process communication is one of the most notable features of the Android framework, as it allows the reuse of components across process boundaries. This mechanism is used as a gateway to access different sensitive services in the Android framework. In the Android platform, this communication system is usually driven by a late runtime binding messaging object known as an Intent. In this paper, we evaluate the effectiveness of Android Intents (explicit and implicit) as a distinguishing feature for identifying malicious applications. We show that Intents are semantically rich features that are able to encode the intentions of malware when compared to other well-studied features such as permissions. We also argue that this type of feature is not the ultimate solution; it should be used in conjunction with other known features. We conducted experiments using a dataset of 7406 applications, comprising 1846 clean and 5560 infected applications. The results show a detection rate of 91% using Android Intents against 83% using Android permissions. Additionally, an experiment on the combination of both features results in a detection rate of 95.5%.

    The Dark Side(-Channel) of Mobile Devices: A Survey on Network Traffic Analysis

    In recent years, mobile devices (e.g., smartphones and tablets) have met with increasing commercial success and have become a fundamental element of everyday life for billions of people all around the world. Mobile devices are used not only for traditional communication activities (e.g., voice calls and messages) but also for more advanced tasks made possible by an enormous number of multi-purpose applications (e.g., finance, gaming, and shopping). As a result, those devices generate significant network traffic (a substantial part of overall Internet traffic). For this reason, the research community has been investigating security and privacy issues related to the network traffic generated by mobile devices, which could be analyzed to obtain information useful for a variety of goals (ranging from device security and network optimization to fine-grained user profiling). In this paper, we review the works that contributed to the state of the art of network traffic analysis targeting mobile devices. In particular, we present a systematic classification of the works in the literature according to three criteria: (i) the goal of the analysis; (ii) the point where the network traffic is captured; and (iii) the targeted mobile platforms. In this survey, we consider capture points such as Wi-Fi Access Points, software simulation, and inside real mobile devices or emulators. For the surveyed works, we review and compare analysis techniques, validation methods, and achieved results. We also discuss possible countermeasures, challenges and possible directions for future research on mobile traffic analysis and other emerging domains (e.g., Internet of Things). We believe our survey will be a reference work for researchers and practitioners in this research field. Comment: 55 pages