956 research outputs found
Machine Learning Aided Static Malware Analysis: A Survey and Tutorial
Malware analysis and detection techniques have evolved over the last
decade in response to the development of malware techniques that evade
network-based and host-based security protections. The rapid growth in the
variety and number of malware species has made it very difficult for forensic
investigators to provide a timely response. Therefore, Machine Learning (ML)
aided malware analysis has become a necessity to automate different aspects of
static and dynamic malware investigation. We believe that machine learning
aided static analysis can be used as a methodological approach in technical
Cyber Threat Intelligence (CTI) rather than resource-consuming dynamic malware
analysis, which has been thoroughly studied before. In this paper, we address
this research gap by conducting an in-depth survey of different machine
learning methods for classifying static characteristics of 32-bit
malicious Portable Executable (PE32) Windows files and develop a taxonomy for
better understanding of these techniques. Afterwards, we offer a tutorial on
how different machine learning techniques can be used in the extraction and
analysis of a variety of static characteristics of PE binaries and evaluate
the accuracy and practical generalization of these techniques. Finally, the
results of an experimental study of all the methods on common data are given to
demonstrate their accuracy and complexity. This paper may serve as a stepping
stone for future researchers in the cross-disciplinary field of machine learning
aided malware forensics.
Comment: 37 pages
Automatically combining static malware detection techniques
Malware detection techniques come in many different flavors and cover different trade-offs between effectiveness and efficiency. This paper evaluates a number of machine learning techniques for combining multiple static Android malware detection techniques using automatically constructed decision trees. We identify the best methods to construct the trees. We demonstrate that those trees classify sample apps better and faster than the individual techniques alone.
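The idea of combining detectors with an automatically constructed decision tree can be sketched as follows. This is only an illustration under stated assumptions, not the method evaluated in the paper: the detector names (`permissions`, `api_calls`), the toy verdicts, and the greedy misclassification-count criterion are all hypothetical choices made for this example.

```python
from typing import Dict, List, Sequence, Union

Verdicts = Dict[str, bool]   # detector name -> "flagged as malware?"
Tree = Union[bool, tuple]    # leaf label, or (detector, subtree_if_clean, subtree_if_flagged)


def majority(labels: Sequence[bool]) -> bool:
    """Most common label in the group; ties are resolved as malware (True)."""
    return sum(labels) * 2 >= len(labels)


def errors(labels: Sequence[bool]) -> int:
    """Number of misclassifications if the whole group gets its majority label."""
    positives = sum(labels)
    return min(positives, len(labels) - positives)


def build_tree(samples: List[Verdicts], labels: List[bool],
               detectors: List[str], depth: int = 2) -> Tree:
    """Greedily pick the detector whose verdict splits the samples with the fewest errors."""
    if depth == 0 or not detectors or errors(labels) == 0:
        return majority(labels)

    def split_cost(name: str) -> int:
        clean = [l for s, l in zip(samples, labels) if not s[name]]
        flagged = [l for s, l in zip(samples, labels) if s[name]]
        return errors(clean) + errors(flagged)

    best = min(detectors, key=split_cost)
    if split_cost(best) >= errors(labels):
        return majority(labels)  # no detector improves on a plain majority vote

    rest = [d for d in detectors if d != best]
    branches = []
    for outcome in (False, True):
        idx = [i for i, s in enumerate(samples) if s[best] == outcome]
        branches.append(build_tree([samples[i] for i in idx],
                                   [labels[i] for i in idx], rest, depth - 1))
    return (best, branches[0], branches[1])


def predict(tree: Tree, verdicts: Verdicts) -> bool:
    """Walk the tree, consulting only the detectors on the path to a leaf."""
    while not isinstance(tree, bool):
        name, if_clean, if_flagged = tree
        tree = if_flagged if verdicts[name] else if_clean
    return tree


# Hypothetical training data: verdicts of two static detectors plus ground truth.
samples = [
    {"permissions": True, "api_calls": True},
    {"permissions": True, "api_calls": True},
    {"permissions": True, "api_calls": False},
    {"permissions": False, "api_calls": False},
    {"permissions": False, "api_calls": True},
    {"permissions": False, "api_calls": False},
]
labels = [True, True, False, False, False, False]
tree = build_tree(samples, labels, ["permissions", "api_calls"])
```

In this toy tree, an app that the first detector considers clean is classified without consulting the second detector at all. That is one plausible way a combined tree could be faster than running every technique on every app.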
A Study of Android Malware Detection Techniques and Machine Learning
Android OS is one of the most widely used mobile operating systems. The number of malicious applications and adware is increasing constantly, on par with the number of mobile devices. A great number of commercial signature-based tools are available on the market which prevent, to an extent, the penetration and distribution of malicious applications. Numerous studies claim that traditional signature-based detection systems work well only up to a certain level and that malware authors use numerous techniques to evade these tools. Given this state of affairs, there is an increasing need for an alternative, robust malware detection system to complement and rectify the signature-based system. Substantial recent research has focused on machine learning algorithms that analyze features from malicious applications and use those features to classify and detect unknown malicious applications. This study summarizes the evolution of malware detection techniques based on machine learning algorithms, focused on the Android OS.
Artificial intelligence in the cyber domain: Offense and defense
Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools, but they can also help adversaries improve their methods of attack: malicious actors are aware of the new prospects and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.
Web of Science123art. no. 41
The Dark Side(-Channel) of Mobile Devices: A Survey on Network Traffic Analysis
In recent years, mobile devices (e.g., smartphones and tablets) have enjoyed
increasing commercial success and have become a fundamental element of
everyday life for billions of people all around the world. Mobile devices are
used not only for traditional communication activities (e.g., voice calls and
messages) but also for more advanced tasks made possible by an enormous number
of multi-purpose applications (e.g., finance, gaming, and shopping). As a
result, those devices generate significant network traffic (a considerable
share of overall Internet traffic). For this reason, the research community has
been investigating security and privacy issues related to the network
traffic generated by mobile devices, which can be analyzed to obtain
information useful for a variety of goals (ranging from device security and
network optimization to fine-grained user profiling).
In this paper, we review the works that contributed to the state of the art
of network traffic analysis targeting mobile devices. In particular, we present
a systematic classification of the works in the literature according to three
criteria: (i) the goal of the analysis; (ii) the point where the network
traffic is captured; and (iii) the targeted mobile platforms. In this survey,
we consider capture points such as Wi-Fi access points, software
simulation, and the inside of real mobile devices or emulators. For the surveyed
works, we review and compare analysis techniques, validation methods, and
achieved results. We also discuss possible countermeasures, challenges, and
possible directions for future research on mobile traffic analysis and other
emerging domains (e.g., Internet of Things). We believe our survey will be a
reference work for researchers and practitioners in this research field.
Comment: 55 pages
Signal processing for malware analysis
This project is an experimental analysis of Android malware through images. The analysis
is based on classifying the malware into families or differentiating between goodware and
malware. Two approaches were considered. Both share a common starting point, which is
the transformation of Android applications into PNG images. After this conversion, the
first approach subtracted each image in the testing set from the images of the training
set, in order to establish which family an unknown malware sample belongs to or to
distinguish between goodware and malware. Although the accuracy was higher than the one
defined in the requirements, this approach was time-consuming, so we considered a second
approach to reduce the running time while achieving the same or better accuracy. The
second approach extracted features from all the images and then used a machine learning
classifier to obtain a precise differentiation. With this second approach, the total time
for 100,000 samples was less than 4 hours and the accuracy was 83.04%, which fulfills and
exceeds the specified requirements.
To perform the analysis, we used two heterogeneous datasets. The Malgenome dataset,
which contains 49 Android malware families, was used to perform the different tests and
the time and accuracy measurements. The M0droid dataset, which contains goodware and
malware Android applications, was used to corroborate the previous analysis and
establish a final classification.
Computer Engineering (Ingeniería Informática)
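The first approach in the abstract above (turning an application into an image, then comparing it with training images by subtraction) can be sketched roughly as below. This is a minimal illustration and not the project's implementation: the abstract does not say how bytes are mapped to pixels, so the row-by-row grayscale layout, the fixed `width`, the sum-of-absolute-differences distance, and the nearest-image rule are all assumptions. The images are also kept as in-memory pixel matrices rather than written out as PNG files.

```python
from typing import Dict, List

Image = List[List[int]]  # rows of 0-255 grayscale pixel values


def bytes_to_image(data: bytes, width: int = 8) -> Image:
    """Lay the raw bytes out row by row as pixels, zero-padding the last row."""
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def image_distance(a: Image, b: Image) -> int:
    """Sum of absolute pixel differences; rows missing from the shorter image count as black."""
    height = max(len(a), len(b))
    if height == 0:
        return 0
    width = len(a[0]) if a else len(b[0])
    blank = [0] * width
    total = 0
    for r in range(height):
        row_a = a[r] if r < len(a) else blank
        row_b = b[r] if r < len(b) else blank
        total += sum(abs(x - y) for x, y in zip(row_a, row_b))
    return total


def classify(sample: Image, training: Dict[str, List[Image]]) -> str:
    """Return the label of the training image closest to the sample."""
    best_label, best_dist = "", None
    for label, images in training.items():
        for img in images:
            dist = image_distance(sample, img)
            if best_dist is None or dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


# Hypothetical toy data standing in for application binaries of two families.
training = {
    "family_a": [bytes_to_image(b"\x10" * 16)],
    "family_b": [bytes_to_image(b"\xf0" * 16)],
}
predicted = classify(bytes_to_image(b"\x12" * 16), training)
```

Every test sample is compared against every training image, so the cost grows with the product of the two set sizes. That is consistent with the abstract's remark that this approach was time-consuming and motivated the feature-based second approach.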