Targeted Greybox Fuzzing with Static Lookahead Analysis
Automatic test generation typically aims to generate inputs that explore new
paths in the program under test in order to find bugs. Existing work has,
therefore, focused on guiding the exploration toward program parts that are
more likely to contain bugs by using an offline static analysis.
In this paper, we introduce a novel technique for targeted greybox fuzzing
using an online static analysis that guides the fuzzer toward a set of target
locations, for instance, located in recently modified parts of the program.
This is achieved by first semantically analyzing each program path that is
explored by an input in the fuzzer's test suite. The results of this analysis
are then used to control the fuzzer's specialized power schedule, which
determines how often to fuzz inputs from the test suite. We implemented our
technique by extending a state-of-the-art, industrial fuzzer for Ethereum smart
contracts and evaluate its effectiveness on 27 real-world benchmarks. Using an
online analysis is particularly suitable for the domain of smart contracts
since it does not require any code instrumentation; instrumenting contracts
changes their semantics. Our experiments show that targeted fuzzing
significantly outperforms standard greybox fuzzing in reaching 83% of the
challenging target locations (with a median speed-up of up to 14x).
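
The abstract above says that the results of the per-path analysis control a power schedule, which decides how often each input is fuzzed. A minimal sketch of that idea follows. It is not the paper's actual schedule: the `AnalyzedInput` type, the `promising_points` field, and the `base` and `cap` parameters are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalyzedInput:
    """An input from the test suite plus a hypothetical lookahead verdict."""
    data: bytes
    # Assumed meaning: number of points along the explored path from which
    # the analysis could NOT rule out reaching a target location.
    # Zero means no continuation of this path can lead to a target.
    promising_points: int


def assign_energy(inp: AnalyzedInput, base: int = 4, cap: int = 64) -> int:
    """Return how many mutations to spend on this input in one round."""
    if inp.promising_points == 0:
        # Keep a trickle of energy so the input is never starved entirely.
        return 1
    # More promising points means more energy, bounded by the cap.
    return min(cap, base * inp.promising_points)


def schedule(suite: list[AnalyzedInput]) -> list[tuple[AnalyzedInput, int]]:
    """Pair each input with its energy, most promising input first."""
    ranked = sorted(suite, key=assign_energy, reverse=True)
    return [(inp, assign_energy(inp)) for inp in ranked]
```

Under these assumptions, inputs whose paths cannot lead to a target still get fuzzed occasionally, while inputs close to a target receive most of the mutation budget.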
The Progress, Challenges, and Perspectives of Directed Greybox Fuzzing
Most greybox fuzzing tools are coverage-guided as code coverage is strongly
correlated with bug coverage. However, since most covered code may not contain
bugs, blindly extending code coverage is less efficient, especially for corner
cases. Unlike coverage-guided greybox fuzzers, which extend code coverage in an
undirected manner, a directed greybox fuzzer spends most of its time budget
on reaching specific targets (e.g., the bug-prone zone) without wasting
resources stressing unrelated parts. Thus, directed greybox fuzzing (DGF) is
particularly suitable for scenarios such as patch testing, bug reproduction,
and specialist bug hunting. This paper studies DGF from a broader view, which
takes into account not only the location-directed type that targets specific
code parts, but also the behaviour-directed type that aims to expose abnormal
program behaviours. Herein, the first in-depth study of DGF is made based on
the investigation of 32 state-of-the-art fuzzers (78% were published after
2019) that are closely related to DGF. A thorough assessment of the collected
tools is conducted so as to systemise recent progress in this field. Finally,
it summarises the challenges and provides perspectives for future research.
Comment: 16 pages, 4 figures
DAppSCAN: Building Large-Scale Datasets for Smart Contract Weaknesses in DApp Projects
The Smart Contract Weakness Classification Registry (SWC Registry) is a
widely recognized list of smart contract weaknesses specific to the Ethereum
platform. Despite the SWC Registry not being updated with new entries since
2020, the sustained development of smart contract analysis tools for detecting
SWC-listed weaknesses highlights their ongoing significance in the field.
However, evaluating these tools has proven challenging due to the absence of a
large, unbiased, real-world dataset. To address this problem, we aim to build a
large-scale SWC weakness dataset from real-world DApp projects. We recruited 22
participants and spent 44 person-months analyzing 1,199 open source audit
reports from 29 security teams. In total, we identified 9,154 weaknesses and
developed two distinct datasets, i.e., DAPPSCAN-SOURCE and DAPPSCAN-BYTECODE.
The DAPPSCAN-SOURCE dataset comprises 39,904 Solidity files, featuring 1,618
SWC weaknesses sourced from 682 real-world DApp projects. However, the Solidity
files in this dataset may not be directly compilable for further analysis. To
facilitate automated analysis, we developed a tool capable of automatically
identifying dependency relationships within DApp projects and completing
missing public libraries. Using this tool, we created the DAPPSCAN-BYTECODE
dataset, which consists of 6,665 compiled smart contracts with 888 SWC
weaknesses. Based on DAPPSCAN-BYTECODE, we conducted an empirical study to
evaluate the performance of state-of-the-art smart contract weakness detection
tools. The evaluation results revealed sub-par performance for these tools in
terms of both effectiveness and success detection rate, indicating that future
development should prioritize real-world datasets over simplistic toy
contracts.
Comment: Dataset available at https://github.com/InPlusLab/DAppSCA
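
The abstract above describes evaluating detection tools against a labeled weakness dataset. As a rough sketch of what such a comparison can look like, the snippet below scores a tool's findings against ground-truth labels using precision and recall. These two metrics are a stand-in chosen for illustration, not necessarily the measures used in the study, and the `(contract_id, swc_id)` pair format is an assumption rather than the dataset's actual schema.

```python
Finding = tuple[str, str]  # hypothetical (contract_id, swc_id) pair


def precision_recall(reported: set[Finding],
                     labeled: set[Finding]) -> tuple[float, float]:
    """Score a tool's reported findings against the labeled weaknesses."""
    true_positives = len(reported & labeled)
    # Guard against empty sets so an idle tool scores 0.0 instead of crashing.
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(labeled) if labeled else 0.0
    return precision, recall


# Illustrative data only, not taken from the dataset.
labeled = {("A", "SWC-107"), ("B", "SWC-101")}
reported = {("A", "SWC-107"), ("C", "SWC-104")}
precision, recall = precision_recall(reported, labeled)
```

With this made-up data the tool finds one of the two labeled weaknesses and raises one false alarm, so both scores are one half.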