Despite their ability to aid developers in detecting potential defects early
in the software development life cycle, static analysis tools often suffer from
low precision (i.e., a high false-positive rate among reported alarms). To
improve the usability of these tools, many automated warning identification
techniques have been proposed to help developers identify false-positive
alarms. However, existing approaches mainly rely on
hand-engineered features or statement-level abstract syntax tree token
sequences to represent the defective code, and thus fail to capture the semantics of
the reported alarms. To overcome the limitations of these traditional approaches,
this paper leverages the feature extraction and representation abilities of
deep neural networks to learn code semantics from control flow graph
paths for warning identification. A control flow graph abstractly represents
the execution process of a given program, so the path sequences generated from
the control flow graph can guide the deep neural networks to learn semantic
information about the potential defect more accurately. In this paper, we
fine-tune a pre-trained language model to encode the path sequences and
capture the semantic representations for model building. Finally, this paper
conducts extensive experiments on eight open-source projects to verify the
effectiveness of the proposed approach by comparing it with the
state-of-the-art baselines.

Comment: 17 pages, in Chinese, 9 figures
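
As a rough illustration of the path-sequence idea described in the abstract, the sketch below enumerates entry-to-exit paths in a toy control flow graph and flattens each path into a statement sequence of the kind that could be passed to a language-model tokenizer. The graph, the statement strings, and the function name `enumerate_paths` are all hypothetical; the abstract does not specify the actual path-extraction procedure or which pre-trained model is used.

```python
from typing import Dict, List

# Hypothetical toy control flow graph: node id -> successor node ids.
CFG: Dict[str, List[str]] = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

# Hypothetical statement text attached to each node (illustration only).
STMTS: Dict[str, str] = {
    "entry": "p = get()",
    "cond": "if p is None",
    "then": "p = default()",
    "else": "log(p)",
    "exit": "use(p)",
}


def enumerate_paths(cfg: Dict[str, List[str]], start: str, end: str,
                    max_visits: int = 1) -> List[List[str]]:
    """Depth-first enumeration of start-to-end paths.

    Each node may appear at most `max_visits` times on a path, which
    bounds the number of times a loop is unrolled (an assumption made
    here to keep the path set finite).
    """
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if path.count(node) >= max_visits:
            return  # visit bound reached; abandon this branch
        path = path + [node]
        if node == end:
            paths.append(path)
            return
        for succ in cfg[node]:
            dfs(succ, path)

    dfs(start, [])
    return paths


paths = enumerate_paths(CFG, "entry", "exit")
# Flatten each path into one text sequence, one statement per node.
sequences = [" ; ".join(STMTS[n] for n in p) for p in paths]
for seq in sequences:
    print(seq)
```

Each resulting sequence follows one possible execution order through the code around the alarm, which is the property the abstract relies on when it says that paths carry more semantic information than isolated statement-level tokens.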