Exploiting Program Dependence Graph for Source Code Classification by Functionality

Author: 延原宙斗
Publication venue: Graduate School of Computer and Information Sciences, Hosei University (法政大学大学院情報科学研究科)
Publication date: 31/03/2017

In many software engineering problems, such as code clone detection, fault prediction, and source code classification, approaches based on software metrics are not suitable because they cannot capture the precise semantic information of source code. Researchers have therefore used abstract syntax trees (ASTs) and program dependence graphs (PDGs) to address such problems. Previous studies show that the tree-based convolutional neural network (TBCNN) outperforms other methods on the source code classification problem. TBCNN uses ASTs to extract the underlying meaning of source code. This paper aims to solve the source code classification problem using PDGs in addition to ASTs. We present a novel neural network model implemented by extending TBCNN. Our model exploits ASTs and PDGs to obtain both structural and semantic information from source code. We evaluate our model on the task of classifying source code by functionality. The dataset contains 104 programming problems, each with 500 programs. Our model achieves over 95% accuracy, which is higher than that of TBCNN. We also examine the importance of each kind of dependence, and our experiment suggests that control dependence is the most valuable for extracting semantic features from source code.
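
The abstract distinguishes control dependence from data dependence without showing either. As a rough illustration only, and not the model or tooling used in the paper, the sketch below derives both kinds of PDG edges for a tiny Python snippet using the standard `ast` module. The names `build_pdg` and `SOURCE` are hypothetical, statements are identified by line number, and the sketch assumes only plain assignments, expression statements, and `if`/`else` (no loops, functions, or augmented assignments).

```python
import ast

# Hypothetical five-line program; line numbers serve as statement ids.
SOURCE = (
    "x = int(input())\n"   # line 1
    "y = 0\n"              # line 2
    "if x > 0:\n"          # line 3
    "    y = x * 2\n"      # line 4
    "print(y)\n"           # line 5
)

def _names(node, ctx):
    """Variable names under `node` that are read (ast.Load) or written (ast.Store)."""
    return {n.id for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ctx)}

def build_pdg(source):
    """Return (control_edges, data_edges) as sets of (from_line, to_line).

    Assumption: only assignments, expression statements and if/else occur.
    """
    control, data = set(), set()

    def add_uses(node, defs, line):
        # Data dependence: each definition that reaches a use of the variable.
        for name in _names(node, ast.Load):
            for def_line in defs.get(name, ()):
                data.add((def_line, line))

    def visit(stmts, defs, guard):
        for stmt in stmts:
            if guard is not None:
                # Control dependence: the branch decides whether stmt runs.
                control.add((guard, stmt.lineno))
            if isinstance(stmt, ast.If):
                add_uses(stmt.test, defs, stmt.lineno)
                then_defs = visit(stmt.body, {k: set(v) for k, v in defs.items()}, stmt.lineno)
                else_defs = visit(stmt.orelse, {k: set(v) for k, v in defs.items()}, stmt.lineno)
                # After the branch, definitions from either path may reach later uses.
                defs = {k: then_defs.get(k, set()) | else_defs.get(k, set())
                        for k in then_defs.keys() | else_defs.keys()}
            else:
                add_uses(stmt, defs, stmt.lineno)
                for name in _names(stmt, ast.Store):
                    defs[name] = {stmt.lineno}  # a new definition kills older ones
        return defs

    visit(ast.parse(source).body, {}, None)
    return control, data

control, data = build_pdg(SOURCE)
print(sorted(control))  # [(3, 4)]
print(sorted(data))     # [(1, 3), (1, 4), (2, 5), (4, 5)]
```

Here the single control edge says that line 4 executes only if the test on line 3 holds, while the data edges record which assignments can supply the values read on lines 3, 4, and 5. Neither relation is stated explicitly by the nesting of the AST alone, which is the kind of extra information a PDG makes available.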

Hosei University Repository