
    Exploiting Program Dependence Graph for Source Code Classification by Functionality

    In many software engineering problems, such as code clone detection, fault prediction, and source code classification, approaches based on software metrics are unsuitable because they cannot capture the precise semantic information of source code. Researchers have therefore used abstract syntax trees (ASTs) and program dependence graphs (PDGs) to solve such problems. Previous studies show that a tree-based convolutional neural network (TBCNN) outperforms other methods on the source code classification problem. TBCNN uses ASTs to extract the underlying meaning of source code. This paper aims to solve the source code classification problem using PDGs in addition to ASTs. We present a novel neural network model implemented by extending TBCNN. Our model exploits ASTs and PDGs to obtain structural and semantic information about source code. We evaluate our model on classifying source code by functionality. The dataset contains 104 programming problems, each with 500 programs. Our model achieves over 95% accuracy, which is higher than that of TBCNN. We also examine the importance of each type of dependence, and our experiment suggests that control dependence is the most valuable for extracting semantic features from source code.
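To make the abstract's terminology concrete, the following is a minimal sketch of what data and control dependence edges look like for a tiny program. It is not the paper's model or its PDG construction: it uses Python's standard `ast` module, the helper names (`names_read`, `names_written`, `dependence_edges`) are hypothetical, and it assumes a toy setting with only assignments and `if` statements, treating the most recent textual definition of a variable as the one that reaches a use.

```python
import ast

# Toy input: lines 1-4 of a small program.
SOURCE = "x = 1\ny = x + 2\nif y > 2:\n    z = y * x\n"


def names_read(node):
    """Variable names loaded (used) anywhere inside the given AST node."""
    return {n.id for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)}


def names_written(node):
    """Variable names stored (defined) anywhere inside the given AST node."""
    return {n.id for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}


def dependence_edges(source):
    """Return (data_edges, control_edges) keyed by source line numbers.

    Simplifying assumptions (illustration only): no loops, no functions,
    and the latest textual definition is taken as the reaching definition.
    """
    tree = ast.parse(source)
    data_edges = []     # (defining line, using line, variable name)
    control_edges = []  # (line of controlling `if`, controlled line)
    last_def = {}       # variable name -> line of most recent definition

    def add_data_edges(reads, lineno):
        for name in sorted(reads):
            if name in last_def:
                data_edges.append((last_def[name], lineno, name))

    def visit(stmts, controller):
        for stmt in stmts:
            if controller is not None:
                control_edges.append((controller, stmt.lineno))
            if isinstance(stmt, ast.If):
                # The condition uses variables; the branches depend on it.
                add_data_edges(names_read(stmt.test), stmt.lineno)
                visit(stmt.body, stmt.lineno)
                visit(stmt.orelse, stmt.lineno)
            else:
                add_data_edges(names_read(stmt), stmt.lineno)
                for name in names_written(stmt):
                    last_def[name] = stmt.lineno

    visit(tree.body, None)
    return data_edges, control_edges


if __name__ == "__main__":
    data, control = dependence_edges(SOURCE)
    print("data dependence:", data)
    print("control dependence:", control)
```

In this sketch the statement on line 4 is control dependent on the `if` on line 3 and data dependent on the definitions of `x` and `y` on lines 1 and 2. Edges of this kind are what a PDG adds to the purely syntactic structure of an AST.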