Experimental Toolkit for Manipulating Executable Packing
Executable packing is a well-known problem, especially in the field of malware analysis. It typically consists in applying compression or encryption to a binary file and embedding a stub that reverses these transformations at runtime. The packed executable is thus obfuscated and harder to reverse-engineer, which is effective for evading static detection techniques. Many detection approaches, including machine learning, have been proposed in the literature so far, but most studies rely on questionable ground truths and do not provide any open implementation, making the comparison of state-of-the-art solutions tedious. We therefore argue that first solving the issue of repeatability will help to compare existing static detection techniques for executable packing. Given this challenge, we propose an experimental toolkit, named Packing Box, that leverages automation and containerization in an open-source platform, bringing a unified solution to the research community. We present our engineering approach for designing and implementing our solution. We then showcase it with a few basic experiments, including a performance evaluation of open source static packing detectors and the training of a model with machine learning pipeline automation. This introduces the toolset that will be used in further studies.
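As a rough illustration of the kind of machine learning pipeline automation mentioned above, the sketch below trains a generic supervised classifier on pre-extracted static features. It is not the Packing Box implementation: the feature set (entropy-like values), the synthetic labels, and the choice of a random forest are assumptions made purely for demonstration.

```python
# Illustrative sketch only: a generic supervised pipeline for static packing
# detection on pre-extracted features. The features and labels below are
# synthetic placeholders; the actual Packing Box pipeline may differ.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Placeholder static features (e.g., section entropy, import count, ...) and
# labels (1 = packed, 0 = not packed); real experiments would load a dataset.
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=0))
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```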
Highlighting the Impact of Packed Executable Alterations with Unsupervised Learning
For many years, executable packing has been used for a variety of applications, including software protection but also malware obfuscation. Even today, this evasion technique remains an open issue, particularly in malware analysis. Numerous studies have proposed static detection techniques based on various algorithms and features, taking advantage of machine learning to build increasingly powerful models. These studies have focused in particular on supervised learning, while unsupervised learning remains relatively unexplored. Furthermore, most studies related to adversarial learning focus on attacks in the feature space, while those targeting features identified as significant in supervised models are still rather limited. Such features may still be manipulated from the problem space to cause misclassification. The objective of this study is to apply alterations to packed samples based on realistic modifications and to visualize their effect using unsupervised learning. To this end, the Packing Box experimental toolkit is used to build a dataset, train models, apply alterations, retrain the models, and then highlight the consequences of these alterations on the trained models.
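One way to picture the effect of such alterations with unsupervised learning is to project the static features of packed samples before and after modification onto a low-dimensional space. The sketch below does this with a plain PCA projection on synthetic data; the perturbation applied, the feature dimensionality, and the use of PCA are assumptions for illustration only, not the alterations or models of the study itself.

```python
# Illustrative sketch only: visualize how an alteration shifts packed samples
# in feature space using a 2-D PCA projection. The "alteration" is a
# hypothetical additive perturbation of synthetic features, not one of the
# study's actual problem-space modifications.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
original = rng.normal(size=(200, 8))                                  # packed samples (synthetic)
altered = original + rng.normal(loc=0.6, scale=0.2, size=(200, 8))    # same samples after alteration

pca = PCA(n_components=2).fit(original)
for data, label in ((original, "original"), (altered, "altered")):
    proj = pca.transform(data)
    plt.scatter(proj[:, 0], proj[:, 1], s=10, alpha=0.6, label=label)
plt.legend()
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.title("Feature-space shift of packed samples after alteration (illustrative)")
plt.show()
```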
