5 research outputs found

    A Review on Malware Analysis by using an Approach of Machine Learning Techniques

    Get PDF
    In the Internet age, malware (such as viruses, trojans, ransomware, and bots) has posed serious andevolving security threats to Internet users. To protect legitimate users from these threats, anti-malware softwareproducts from different companies, including Comodo, Kaspersky, Kingsoft, and Symantec, provide the majordefense against malware. Unfortunately, driven by the economic benefits, the number of new malware sampleshas explosively increased: anti-malware vendors are now confronted with millions of potential malware samplesper year. In order to keep on combating the increase in malware samples, there is an urgent need to developintelligent methods for effective and efficient malware detection from the real and large daily sample collection.One of the most common approaches in literature is using machine learning techniques, to automatically learnmodels and patterns behind such complexity, and to develop technologies to keep pace with malware evolution.This survey aims at providing an overview on the way machine learning has been used so far in the context ofmalware analysis in Windows environments. This paper gives an survey on the features related to malware filesor documents and what machine learning techniques they employ (i.e., what algorithm is used to process the inputand produce the output). Different issues and challenges are also discussed

    Malware Analysis with Machine Learning

    Get PDF
    Tese de mestrado, Segurança Informática, Universidade de Lisboa, Faculdade de Ciências, 2022Malware attacks have been one of the most serious cyber risks in recent years. Almost every week, the number of vulnerability reports is increasing in the security communities. One of the key causes for the exponential growth is the fact that malware authors started introducing mutations to avoid detection. This means that malicious files from the same malware family, with the same malicious behaviour, are constantly modified or obfuscated using a variety of technics to make them appear to be different. Characteristics retrieved from raw binary files or disassembled code are used in existing machine learning-based malware categorization algorithms. The variety of such attributes has made it difficult to develop generic malware categorization methods that operate well in a variety of operating scenarios. To be effective in evaluating and categorizing such enormous volumes of data, it is necessary to divide them into groups and identify their respective families based on their behaviour. Malicious software is converted to a greyscale image representation, due to the possibility to capture subtle changes while keeping the global structure helps to detect variations. Motivated by the Machine Learning results achieved in the ImageNet challenge, this dissertation proposes an agnostic deep learning solution, for efficiently classifying malware into families based on a collection of discriminant patterns retrieved from its visualization as images. In this thesis, we present Malwizard, an adaptable Python solution suited for companies or end users, that allows them to automatically obtain a fast malware analysis. The solution was implemented as an Outlook add-in and an API service for the SOAR platforms, as emails are the first vector for this type of attack, with companies being the most attractive targets. The Microsoft Classification Challenge dataset was used in the evaluation of the noble approach. Therefore, its image representation was ciphered and generated the correspondent ciphered image to evaluate if the same patterns could be identified using traditional machine learning techniques. Thus, allowing the privacy concerns to be addressed, maintaining the data analysed by neural networks secure to unauthorized parties. Experimental comparison demonstrates the noble approach performed close to the best analysed model on a plain text dataset, completing the task in one-third of the time. Regarding the encrypted dataset, classical techniques need to be adapted in order to be efficient

    Convolutional neural networks for malware classification

    Get PDF
    According to AV vendors malicious software has been growing exponentially last years. One of the main reasons for these high volumes is that in order to evade detection, malware authors started using polymorphic and metamorphic techniques. As a result, traditional signature-based approaches to detect malware are being insufficient against new malware and the categorization of malware samples had become essential to know the basis of the behavior of malware and to fight back cybercriminals. During the last decade, solutions that fight against malicious software had begun using machine learning approaches. Unfortunately, there are few opensource datasets available for the academic community. One of the biggest datasets available was released last year in a competition hosted on Kaggle with data provided by Microsoft for the Big Data Innovators Gathering (BIG 2015). This thesis presents two novel and scalable approaches using Convolutional Neural Networks (CNNs) to assign malware to its corresponding family. On one hand, the first approach makes use of CNNs to learn a feature hierarchy to discriminate among samples of malware represented as gray-scale images. On the other hand, the second approach uses the CNN architecture introduced by Yoon Kim [12] to classify malware samples according their x86 instructions. The proposed methods achieved an improvement of 93.86% and 98,56% with respect to the equal probability benchmark

    Pattern-Based Vulnerability Discovery

    Get PDF
    corecore