Binary diffing aims to align portions of control
flow graphs corresponding to the same source code snippets
between two binaries for software security analyses, such as
vulnerability and plagiarism detection. Previous work has
limited effectiveness and inflexible support for cross-compilation
environment scenarios, mainly because it performs matching by
comparing the similarity of basic blocks.
We propose a novel diffing approach, BINALIGNER,
to alleviate the above limitations at the binary level. To reduce
the likelihood of false and missed matches corresponding to the
same source code snippets, we present conditional relaxation
strategies to find candidate subgraph pairs. To support more
flexible binary diffing in cross-compilation environment scenarios,
we use instruction-independent basic block features to generate
subgraph embeddings. We implement BINALIGNER and
conduct experiments across four cross-compilation environment
scenarios (i.e., cross-version, cross-compiler, cross-optimization
level, and cross-architecture) to evaluate its effectiveness and
how well it supports different scenarios. Experimental results show
that BINALIGNER significantly outperforms state-of-the-art
methods in most scenarios. In particular, in the cross-architecture
scenario and in multiple combinations of cross-compilation
environment scenarios, BINALIGNER achieves F1-scores that are on
average 65% higher than the baselines. Two case studies using
real-world vulnerabilities and patches further demonstrate the
utility of BINALIGNER.
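
As a concrete reading of the F1-score reported above, the following minimal sketch scores a diffing result as a set of matched basic block pairs against a ground-truth alignment. The pair representation, the function name `alignment_f1`, and the block labels are illustrative assumptions, not BINALIGNER's actual evaluation code.

```python
def alignment_f1(predicted_pairs, ground_truth_pairs):
    """F1-score of a predicted block alignment against a ground-truth alignment.

    Each alignment is an iterable of (block_in_binary_a, block_in_binary_b) pairs.
    This pair-based representation is an assumption made for illustration.
    """
    predicted = set(predicted_pairs)
    truth = set(ground_truth_pairs)

    # Pairs that appear in both sets are correct matches (true positives).
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(predicted)  # penalizes false matches
    recall = true_positives / len(truth)         # penalizes missed matches
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two correct matches, one false match, two missed matches.
predicted = [("a1", "b1"), ("a2", "b2"), ("a3", "b9")]
truth = [("a1", "b1"), ("a2", "b2"), ("a4", "b4"), ("a5", "b5")]
score = alignment_f1(predicted, truth)  # precision 2/3, recall 1/2, F1 = 4/7
```

Under this definition, false matches lower precision and missed matches lower recall, so the F1-score reflects both kinds of error that the abstract aims to reduce.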