Adversarial examples are important for testing and enhancing the robustness of deep
code models. Since source code is discrete and must strictly adhere to complex
grammar and semantic constraints, adversarial example generation
techniques from other domains are hardly applicable. Moreover, the adversarial
example generation techniques specific to deep code models still suffer from
unsatisfactory effectiveness due to the enormous ingredient search space. In
this work, we propose a novel adversarial example generation technique (i.e.,
CODA) for testing deep code models. Its key idea is to use the code differences
between the target input (i.e., a given code snippet used as the model input) and
reference inputs (i.e., inputs that have small code differences from, but
different prediction results than, the target input) to guide the generation of
adversarial examples. It considers both structure differences and identifier
differences in order to preserve the original semantics. Hence, the ingredient search
space is largely reduced to the one constituted by these two kinds of code
differences, and the testing process can be improved by designing and
guiding the corresponding equivalent structure transformations and identifier
renaming transformations. Our experiments on 15 deep code models demonstrate
the effectiveness and efficiency of CODA, the naturalness of its generated
examples, and its capability of enhancing model robustness after adversarial
fine-tuning. For example, CODA reveals 88.05% and 72.51% more faults in models
than the state-of-the-art techniques (i.e., CARROT and ALERT) on average,
respectively.

Comment: Accepted by ASE 202
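The two kinds of semantics-preserving transformations named above can be illustrated with a minimal Python sketch. This is not CODA's actual implementation; the snippet, the name mapping, and the hand-written loop variant are hypothetical examples used only to show that identifier renaming and equivalent structure transformation change the code's surface form while leaving its behavior identical.

```python
import ast

class RenameIdentifiers(ast.NodeTransformer):
    """Identifier renaming transformation: rewrite variable and
    parameter names according to a mapping, preserving semantics."""
    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Rename variable reads/writes (e.g., drawn from a reference
        # input's identifier vocabulary -- a hypothetical mapping here).
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Rename function parameters consistently with their uses.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

# Hypothetical target input: a code snippet fed to a deep code model.
target = """
def total(nums):
    s = 0
    for n in nums:
        s += n
    return s
"""

# 1) Identifier renaming transformation.
tree = ast.parse(target)
tree = RenameIdentifiers({"nums": "values", "s": "acc", "n": "item"}).visit(tree)
renamed = ast.unparse(tree)

# 2) Equivalent structure transformation: the for-loop rewritten as a
#    while-loop (hand-written here for illustration).
restructured = """
def total(nums):
    s = 0
    i = 0
    while i < len(nums):
        s += nums[i]
        i += 1
    return s
"""

# All three variants must behave identically on every input.
env_a, env_b, env_c = {}, {}, {}
exec(target, env_a)
exec(renamed, env_b)
exec(restructured, env_c)
sample = [1, 2, 3]
assert env_a["total"](sample) == env_b["total"](sample) == env_c["total"](sample)
```

A testing loop in the spirit of the abstract would apply such transformations, restricted to the differences against nearby reference inputs, and check whether the model's prediction flips while the program's behavior does not.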