Understanding Large Language Model Based Fuzz Driver Generation


Authors: Mingqiang Bai, Yeting Li, Yuekang Li, Yang Liu, Wei Ma, Limin Sun, Xiaofei Xie, Cen Zhang, Yaowen Zheng
Publication date: 23 July 2023
Abstract

Fuzz drivers are a necessary component of API fuzzing. However, automatically generating correct and robust fuzz drivers is a difficult task. Compared to existing approaches, Large Language Model (LLM) based generation is a promising direction because it places low requirements on consumer programs, leverages multiple dimensions of API usage information, and generates human-friendly output code. Nonetheless, the challenges and effectiveness of LLM-based fuzz driver generation remain unclear. To address this, we conducted a study on the effects, challenges, and techniques of LLM-based fuzz driver generation. Our study involved building a quiz with 86 fuzz driver generation questions from 30 popular C projects, constructing precise effectiveness validation criteria for each question, and developing a framework for semi-automated evaluation. We designed five query strategies and evaluated 36,506 generated fuzz drivers. Furthermore, the drivers were compared with manually written ones to obtain practical insights. Our evaluation revealed three findings. First, while the overall performance was promising (passing 91% of questions), there were still practical challenges in filtering out ineffective fuzz drivers for large-scale application. Second, basic strategies achieved a decent correctness rate (53%) but struggled with questions involving complex API-specific usage; in such cases, example code snippets and iterative queries proved helpful. Third, while LLM-generated drivers showed competent fuzzing outcomes compared to manually written ones, there was still significant room for improvement, such as incorporating semantic oracles for detecting logical bugs.

Comment: 17 pages, 14 figures
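For readers unfamiliar with the term, the sketch below illustrates what a fuzz driver for a C API can look like. It is not taken from the paper: it assumes a libFuzzer-style harness (the `LLVMFuzzerTestOneInput` entry point), and the API under test, `toy_count_pairs`, is a hypothetical stand-in defined inline so that the example is self-contained. The assertion plays the role of a very simple semantic oracle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical API under test (an assumption for illustration only):
 * counts the '=' characters in a NUL-terminated string. */
static int toy_count_pairs(const char *text) {
    int count = 0;
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '=') {
            ++count;
        }
    }
    return count;
}

/* libFuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    /* The API expects a NUL-terminated string, but fuzzer input is a raw
     * byte buffer, so the driver copies it and appends a terminator. */
    char *text = malloc(size + 1);
    if (text == NULL) {
        return 0; /* Out of memory: skip this input. */
    }
    if (size > 0) {
        memcpy(text, data, size);
    }
    text[size] = '\0';

    int pairs = toy_count_pairs(text);

    /* Simple oracle: the count can never be negative or exceed the
     * number of input bytes. */
    assert(pairs >= 0 && (size_t)pairs <= size);

    free(text);
    return 0; /* libFuzzer expects 0 from the entry point. */
}
```

With Clang, a driver of this shape is typically built with `clang -g -fsanitize=fuzzer,address driver.c`, which links in the libFuzzer runtime that supplies `main`. Even in this toy case the driver has to encode API-specific usage (the terminator and the cleanup), which is the kind of detail the study's validation criteria are meant to check.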

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2307.12469

Last time updated on 28/07/2023