27 research outputs found
Evaluation of Generalizability of Neural Program Analyzers under Semantic-Preserving Transformations
The abundance of publicly available source code repositories, in conjunction
with the advances in neural networks, has enabled data-driven approaches to
program analysis. These approaches, called neural program analyzers, use neural
networks to extract patterns in the programs for tasks ranging from development
productivity to program reasoning. Despite the growing popularity of neural
program analyzers, the extent to which their results are generalizable is
unknown.
In this paper, we perform a large-scale evaluation of the generalizability of
two popular neural program analyzers using seven semantically-equivalent
transformations of programs. Our results caution that in many cases the neural
program analyzers fail to generalize well, sometimes to programs with
negligible textual differences. The results provide the initial stepping stones
for quantifying robustness in neural program analyzers.Comment: for related work, see arXiv:2008.0156
Unsupervised Classifying of Software Source Code Using Graph Neural Networks
Usually automated programming systems consist of two parts: source code analysis and source code generation. This paper is focused on the first part. Automated source code analysis can be useful for errors and vulnerabilities searching and for representing source code snippets for further investigating. Also gotten representations can be used for synthesizing source code snippets of certain types. A machine learning approach is used in this work. The training set is formed by augmented abstract syntax trees of Java classes. A graph autoencoder is trained and a latent representation of Java classes graphs is inspected. Experiments showed that the proposed model can split Java classes graphs to common classes with some business logic implementation and interfaces and utility classes. The results are good enough be used for more accurate software source code generation