6 research outputs found
Simplification of General Mixed Boolean-Arithmetic Expressions: GAMBA
Malware code often resorts to various self-protection techniques to
complicate analysis. One such technique is applying Mixed-Boolean Arithmetic
(MBA) expressions as a way to create opaque predicates and diversify and
obfuscate the data flow.
In this work we aim to provide tools for the simplification of nonlinear MBA
expressions in a very practical context to compete in the arms race between the
generation of hard, diverse MBAs and their analysis. The proposed algorithm
GAMBA employs algebraic rewriting at its core and extends SiMBA. It achieves
efficient deobfuscation of MBA expressions from the most widely tested public
datasets and simplifies expressions to their ground truths in most cases,
surpassing peer tools
Defeating Opaque Predicates Statically through Machine Learning and Binary Analysis
International audienceWe present a new approach that bridges binary analysis techniques with machine learning classification for the purpose of providing a static and generic evaluation technique for opaque predicates, regardless of their constructions. We use this technique as a static automated deobfuscation tool to remove the opaque predicates introduced by obfuscation mechanisms. According to our experimental results, our models have up to 98% accuracy at detecting and deob-fuscating state-of-the-art opaque predicates patterns. By contrast, the leading edge deobfuscation methods based on symbolic execution show less accuracy mostly due to the SMT solvers constraints and the lack of scalability of dynamic symbolic analyses. Our approach underlines the efficiency of hybrid symbolic analysis and machine learning techniques for a static and generic deobfuscation methodology
Evaluation Methodologies in Software Protection Research
Man-at-the-end (MATE) attackers have full control over the system on which
the attacked software runs, and try to break the confidentiality or integrity
of assets embedded in the software. Both companies and malware authors want to
prevent such attacks. This has driven an arms race between attackers and
defenders, resulting in a plethora of different protection and analysis
methods. However, it remains difficult to measure the strength of protections
because MATE attackers can reach their goals in many different ways and a
universally accepted evaluation methodology does not exist. This survey
systematically reviews the evaluation methodologies of papers on obfuscation, a
major class of protections against MATE attacks. For 572 papers, we collected
113 aspects of their evaluation methodologies, ranging from sample set types
and sizes, over sample treatment, to performed measurements. We provide
detailed insights into how the academic state of the art evaluates both the
protections and analyses thereon. In summary, there is a clear need for better
evaluation methodologies. We identify nine challenges for software protection
evaluations, which represent threats to the validity, reproducibility, and
interpretation of research results in the context of MATE attacks
Defeating MBA-based Obfuscation
International audienceMixed Boolean-Arithmetic expressions are presented as a strong protection in the context of data flow obfuscation. As there is very little literature on the analysis of such obfus-cated expressions, two important subjects of interest are: to define what simplifying those expressions means, and how to design a simplification solution. We focus on evaluating the resilience of this technique, by giving theoretical elements to justify its efficiency and proposing a simplification algorithm using a pattern matching approach. The implementation of this solution is capable of simplifying the public examples of MBA-obfuscated expressions, demonstrating that at least a subset of MBA obfuscation lacks resilience against pattern matching analysis