956 research outputs found

    Machine Learning Aided Static Malware Analysis: A Survey and Tutorial

    Full text link
    Malware analysis and detection techniques have been evolving during the last decade as a reflection to development of different malware techniques to evade network-based and host-based security protections. The fast growth in variety and number of malware species made it very difficult for forensics investigators to provide an on time response. Therefore, Machine Learning (ML) aided malware analysis became a necessity to automate different aspects of static and dynamic malware investigation. We believe that machine learning aided static analysis can be used as a methodological approach in technical Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware analysis that has been thoroughly studied before. In this paper, we address this research gap by conducting an in-depth survey of different machine learning methods for classification of static characteristics of 32-bit malicious Portable Executable (PE32) Windows files and develop taxonomy for better understanding of these techniques. Afterwards, we offer a tutorial on how different machine learning techniques can be utilized in extraction and analysis of a variety of static characteristic of PE binaries and evaluate accuracy and practical generalization of these techniques. Finally, the results of experimental study of all the method using common data was given to demonstrate the accuracy and complexity. This paper may serve as a stepping stone for future researchers in cross-disciplinary field of machine learning aided malware forensics.Comment: 37 Page

    Automatically combining static malware detection techniques

    Get PDF
    Malware detection techniques come in many different flavors, and cover different effectiveness and efficiency trade-offs. This paper evaluates a number of machine learning techniques to combine multiple static Android malware detection techniques using automatically constructed decision trees. We identify the best methods to construct the trees. We demonstrate that those trees classify sample apps better and faster than individual techniques alone

    A Study of Android Malware Detection Techniques and Machine Learning

    Get PDF
    Android OS is one of the widely used mobile Operating Systems. The number of malicious applications and adwares are increasing constantly on par with the number of mobile devices. A great number of commercial signature based tools are available on the market which prevent to an extent the penetration and distribution of malicious applications. Numerous researches have been conducted which claims that traditional signature based detection system work well up to certain level and malware authors use numerous techniques to evade these tools. So given this state of affairs, there is an increasing need for an alternative, really tough malware detection system to complement and rectify the signature based system. Recent substantial research focused on machine learning algorithms that analyze features from malicious application and use those features to classify and detect unknown malicious applications. This study summarizes the evolution of malware detection techniques based on machine learning algorithms focused on the Android OS

    Artificial intelligence in the cyber domain: Offense and defense

    Get PDF
    Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.Web of Science123art. no. 41

    The Dark Side(-Channel) of Mobile Devices: A Survey on Network Traffic Analysis

    Full text link
    In recent years, mobile devices (e.g., smartphones and tablets) have met an increasing commercial success and have become a fundamental element of the everyday life for billions of people all around the world. Mobile devices are used not only for traditional communication activities (e.g., voice calls and messages) but also for more advanced tasks made possible by an enormous amount of multi-purpose applications (e.g., finance, gaming, and shopping). As a result, those devices generate a significant network traffic (a consistent part of the overall Internet traffic). For this reason, the research community has been investigating security and privacy issues that are related to the network traffic generated by mobile devices, which could be analyzed to obtain information useful for a variety of goals (ranging from device security and network optimization, to fine-grained user profiling). In this paper, we review the works that contributed to the state of the art of network traffic analysis targeting mobile devices. In particular, we present a systematic classification of the works in the literature according to three criteria: (i) the goal of the analysis; (ii) the point where the network traffic is captured; and (iii) the targeted mobile platforms. In this survey, we consider points of capturing such as Wi-Fi Access Points, software simulation, and inside real mobile devices or emulators. For the surveyed works, we review and compare analysis techniques, validation methods, and achieved results. We also discuss possible countermeasures, challenges and possible directions for future research on mobile traffic analysis and other emerging domains (e.g., Internet of Things). We believe our survey will be a reference work for researchers and practitioners in this research field.Comment: 55 page

    Signal processing for malware analysis

    Get PDF
    This Project is an experimental analysis of Android malware through images. The analysis is based on classifying the malware into families or differentiating between goodware and malware. This analysis has been done considering two approaches. These two approaches have a common starting point, which is the transformation of Android applications into PNG images. After this conversion, the first approach was subtracting each image from the testing set with the images of the training set, in order to establish which unknown malware belongs to a specific family or to distinguish between goodware and malware. Although the accuracy was higher than the one defined in the requirements, this approach was a time consuming task, so we consider another approach to reduce the time and get the same or better accuracy. The second approach was extracting features from all the images and then using a machine learning classifier to get a precise differentiation. After this second approach, the resulting time for 100,000 samples was less than 4 hours and the accuracy 83.04%, which fulfill the requirements specified. To perform the analysis, we have used two heterogeneous datasets. The Malgenome dataset which contains 49 kinds of malware Android applications (49 malware families). It was used to perform the measurements and the different tests. The M0droid dataset, which contains goodware and malware Android applications. It was used to corroborate the previous analysis.Este proyecto es un análisis experimental de aplicaciones de Android mediante imágenes. Este análisis se basa en clasificar las imágenes en familias o en diferenciarlas entre goodware o malware. Para ello, se han considerado dos enfoques. Estas dos aproximaciones tienen como punto en común la transformación de las aplicaciones de Android en imágenes de tipo PNG. Después de este proceso de transformación a imágenes, la primera aproximación se basó en restar cada imagen perteneciente al grupo de pruebas con las imágenes del grupo de entrenamiento, de esta forma se pudo saber la familia a la que pertenecía cada malware desconocido o distinguir entre aplicaciones goodware y malware. Sin embargo, a pesar de que la precisión de acierto era más alta que la definida en los requisitos, este enfoque era una tarea que consumía mucho tiempo, así que consideramos otra aproximación para reducir el tiempo y conseguir una precisión parecida o mejor que la anterior. Este segundo enfoque fue extraer las características de las imágenes para después usar un clasificador y así obtener una diferenciación precisa. Con esta segunda aproximación, conseguimos un tiempo total menor a las 4 horas para 100000 muestras con una precisión del 83.04%, cumpliendo y superando de esta forma los requisitos que habían sido especificados. Este análisis se ha llevado a cabo usando dos sets de datos heterogéneos. Uno de ellos fue el perteneciente a un proyecto llamado Malgenome, éste contiene 49 tipos de familias de malware en Android. El set de datos de Malgenome se usó para realizar los diferentes ensayos o pruebas y sobre el que se realizaron las medidas de tiempo y precisión. El set de datos de M0droid se usó para corroborar el análisis previo y así establecer una clasificación final.Ingeniería Informátic
    corecore