Search CORE

85 research outputs found

Automated forensic extraction of encryption keys using behavioural analysis

Author: Owen Gareth
Publication venue
Publication date: 01/06/2012
Field of study

Portsmouth University Research Portal (Pure)

SEEAD:A Semantic-based Approach for Automatic Binary Code De-obfuscation

Author: Tang Zhanyong
Wang Lei
Kuang Kaiyuan
Xue Chao
Gong Xiaoqing
Chen Xiaojiang
Fang Dingyi
Wang Zheng
Publication venue: IEEE
Publication date: 11/09/2017
Field of study

Increasingly sophisticated code obfuscation techniques are quickly adopted by malware developers to escape from malware detection and to thwart the reverse engineering effort of security analysts. State-of-the-art de-obfuscation approaches rely on dynamic analysis, but face the challenge of low code coverage as not all software execution paths and behavior will be exposed at specific profiling runs. As a result, these approaches often fail to discover hidden malicious patterns. This paper introduces SEEAD, a novel and generic semantic-based de-obfuscation system. When building SEEAD, we try to rely on as few assumptions about the structure of the obfuscation tool as possible, so that the system can keep pace with the fast evolving code obfuscation techniques. To increase the code coverage, SEEAD dynamically directs the target program to execute different paths across different runs. This dynamic profiling scheme is rife with taint and control dependence analysis to reduce the search overhead, and a carefully designed protection scheme to bring the program to an error free status should any error happens during dynamic profile runs. As a result, the increased code coverage enables us to uncover hidden malicious behaviors that are not detected by traditional dynamic analysis based de-obfuscation approaches. We evaluate SEEAD on a range of benign and malicious obfuscated programs. Our experimental results show that SEEAD is able to successfully recover the original logic from obfuscated binaries

Biblioteca Digital de la Comunidad de Madrid

Lancaster E-Prints

Binary Disassembly Block Coverage by Symbolic Execution vs. Recursive Descent

Author: Miller Jonathan D.
Publication venue: AFIT Scholar
Publication date: 22/03/2012
Field of study

This research determines how appropriate symbolic execution is (given its current implementation) for binary analysis by measuring how much of an executable symbolic execution allows an analyst to reason about. Using the S2E Selective Symbolic Execution Engine with a built-in constraint solver (KLEE), this research measures the effectiveness of S2E on a sample of 27 Debian Linux binaries as compared to a traditional static disassembly tool, IDA Pro. Disassembly code coverage and path exploration is used as a metric for determining success. This research also explores the effectiveness of symbolic execution on packed or obfuscated samples of the same binaries to generate a model-based evaluation of success for techniques commonly employed by malware. Obfuscated results were much higher than expected, which lead to the discovery that S2E was not actually handling the multiple executable memory regions present in unpacker runtime code. Three recommendations are made to address the shortcomings of S2E and allow it to process obfuscated samples correctly

AFTI Scholar (Air Force Institute of Technology)

SEEAD:A Semantic-based Approach for Automatic Binary Code De-obfuscation

Author: Chen Xiaojiang
Fang Dingyi
Gong Xiaoqing
Kuang Kaiyuan
Tang Zhanyong
Wang Lei
Wang Zheng
Xue Chao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/09/2017
Field of study

Crossref

Lancaster E-Prints

Recommended from our members

Uncovering Features in Behaviorally Similar Programs

Author: Su Fang-Hsiang
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018
Field of study

The detection of similar code can support many so ware engineering tasks such as program understanding and program classification. Many excellent approaches have been proposed to detect programs having similar syntactic features. However, these approaches are unable to identify programs dynamically or statistically close to each other, which we call behaviorally similar programs. We believe the detection of behaviorally similar programs can enhance or even automate the tasks relevant to program classification. In this thesis, we will discuss our current approaches to identify programs having similar behavioral features in multiple perspectives. We first discuss how to detect programs having similar functionality. While the definition of a program’s functionality is undecidable, we use inputs and outputs (I/Os) of programs as the proxy of their functionality. We then use I/Os of programs as a behavioral feature to detect which programs are functionally similar: two programs are functionally similar if they share similar inputs and outputs. This approach has been studied and developed in the C language to detect functionally equivalent programs having equivalent I/Os. Nevertheless, some natural problems in Object Oriented languages, such as input generation and comparisons between application-specific data types, hinder the development of this approach. We propose a new technique, in-vivo detection, which uses existing and meaningful inputs to drive applications systematically and then applies a novel similarity model considering both inputs and outputs of programs, to detect functionally similar programs. We develop the tool, HitoshiIO, based on our in-vivo detection. In the subjects that we study, HitoshiIO correctly detect 68.4% of functionally similar programs, where its false positive rate is only 16.6%. In addition to functional I/Os of programs, we attempt to discover programs having similar execution behavior. Again, the execution behavior of a program can be undecidable, so we use instructions executed at run-time as a behavioral feature of a program. We create DyCLINK, which observes program executions and encodes them in dynamic instruction graphs. A vertex in a dynamic instruction graph is an instruction and an edge is a type of dependency between two instructions. The problem to detect which programs have similar executions can then be reduced to a problem of solving inexact graph isomorphism. We propose a link analysis based algorithm, LinkSub, which vectorizes each dynamic instruction graph by the importance of every instruction, to solve this graph isomorphism problem efficiently. In a K Nearest Neighbor (KNN) based program classification experiment, DyCLINK achieves 90 + % precision. Because HitoshiIO and DyCLINK both rely on dynamic analysis to expose program behavior, they have better capability to locate and search for behaviorally similar programs than traditional static analysis tools. However, they suffer from some common problems of dynamic analysis, such as input generation and run-time overhead. These problems may make our approaches challenging to scale. Thus, we create the system, Macneto, which integrates static analysis with machine topic modeling and deep learning to approximate program behaviors from their binaries without truly executing programs. In our deobfuscation experiments considering two commercial obfuscators that alter lexical information and syntax in programs, Macneto achieves 90 + % precision, where the groundtruth is that the behavior of a program before and after obfuscation should be the same. In this thesis, we offer a more extensive view of similar programs than the traditional definitions. While the traditional definitions of similar programs mostly use static features, such as syntax and lexical information, we propose to leverage the power of dynamic analysis and machine learning models to trace/collect behavioral features of pro- grams. These behavioral features of programs can then apply to detect behaviorally similar programs. We believe the techniques we invented in this thesis to detect behaviorally similar programs can improve the development of software engineering and security applications, such as code search and deobfuscation

Columbia University Academic Commons

VirtSC: Combining Virtualization Obfuscation with Self-Checksumming

Author: Ahmadvand Mohsen
Banescu Sebastian
Below Daniel
Pretschner Alexander
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Self-checksumming (SC) is a tamper-proofing technique that ensures certain program segments (code) in memory hash to known values at runtime. SC has few restrictions on application and hence can protect a vast majority of programs. The code verification in SC requires computation of the expected hashes after compilation, as the machine-code is not known before. This means the expected hash values need to be adjusted in the binary executable, hence combining SC with other protections is limited due to this adjustment step. However, obfuscation protections are often necessary, as SC protections can be otherwise easily detected and disabled via pattern matching. In this paper, we present a layered protection using virtualization obfuscation, yielding an architecture-agnostic SC protection that requires no post-compilation adjustment. We evaluate the performance of our scheme using a dataset of 25 real-world programs (MiBench and 3 CLI games). Our results show that the SC scheme induces an average overhead of 43% for a complete protection (100% coverage). The overhead is tolerable for less CPU-intensive programs (e.g. games) and when only parts of programs (e.g. license checking) are protected. However, large overheads stemming from the virtualization obfuscation were encountered

arXiv.org e-Print Archive

Crossref

Program variation for software security

Author: Coppens Bart
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2013
Field of study

Ghent University Academic Bibliography