627 research outputs found

    A Novel Approach for Code Clone Detection Using Hybrid Technique

    Code clones have long been studied, and there is strong evidence that they are a major source of software faults. The copying of code has been studied within software engineering mostly in the area of clone analysis. Software clones are regions of source code that are highly similar; these regions of similarity are called clones, clone classes, or clone pairs. In this paper, a hybrid approach that combines a metric-based technique with a text-based technique for detecting and reporting clones is proposed. The proposed work is divided into two stages: selection of potential clones, and comparison of potential clones using textual comparison. The proposed technique detects exact clones first by metric match and then by text match.
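
The two stages described in this abstract (metric match to select candidates, then text match to confirm them) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the metric vector (line count, token count, branch-keyword count) and the function names `metrics`, `normalize`, and `detect_exact_clones` are hypothetical choices made for the example.

```python
import re
from collections import defaultdict
from itertools import combinations


def metrics(fragment: str) -> tuple:
    """Hypothetical metric vector: (non-blank lines, tokens, branch keywords)."""
    lines = [ln for ln in fragment.splitlines() if ln.strip()]
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", fragment)
    branches = sum(tok in {"if", "for", "while", "case"} for tok in tokens)
    return (len(lines), len(tokens), branches)


def normalize(fragment: str) -> str:
    """Collapse whitespace so layout changes do not hide an exact clone."""
    return "\n".join(
        " ".join(ln.split()) for ln in fragment.splitlines() if ln.strip()
    )


def detect_exact_clones(fragments: dict) -> list:
    # Stage 1: bucket fragments by metric vector; only fragments with
    # equal vectors become potential clones.
    buckets = defaultdict(list)
    for name, code in fragments.items():
        buckets[metrics(code)].append(name)

    # Stage 2: confirm each potential clone pair with a textual comparison.
    clones = []
    for names in buckets.values():
        for a, b in combinations(sorted(names), 2):
            if normalize(fragments[a]) == normalize(fragments[b]):
                clones.append((a, b))
    return sorted(clones)
```

The metric stage is cheap and keeps the more expensive pairwise text comparison confined to fragments that already agree on their metrics; a fragment with matching metrics but different text is filtered out in the second stage.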

    An Empirical Study of a Hybrid Code Clone Detection Approach on Java Byte Code

    Code clones increase the complexity of the system and therefore the software maintenance costs. Code clone detection techniques have been proposed and evaluated based on metric values and runtime evaluations. But with the existing methods, many false positive clones are detected. In this paper, we suggest a hybrid approach combining a Program Dependence Graph-based technique with a metric-based technique to improve the precision of clone detection. We conduct a case study on two open-source Java projects, Eclipse-ant and Eclipse-JDT core, to show the effectiveness of our tool. The application of this hybrid technique is then compared with an existing clone detection technique, CloneDR. The results show that our tool improves on CloneDR in precision, recall, false positives, and false negatives.

    The Survey of the Code Clone Detection Techniques and Process with Types (I, II, III and IV)

    Code clones are regularly used when upgrading software, so the study of clone detection strategies goes beyond the initial code. In the state of the art of software clone research, we perceived the absence of a systematic survey. We explain the earlier research on the basis of a systematic and extensive database search, and identify the research gaps for further study. Software maintenance cost is higher than design cost. Code cloning is useful in several areas, such as detecting library contents, understanding programs, and detecting malicious programs; apart from these benefits, code cloning has several serious impacts on the quality, reusability, and continuity of a software framework. In this paper, we discuss code clones, their evolution, and their classification. Code clones are classified into four types, namely Type I, Type II, Type III, and Type IV. The exact code as well as the copied code is depicted in detail for each type of code clone. Several clone detection techniques, such as text-based, token-based, metric-based, and hybrid techniques, are studied comparatively. A comparison of detection tools such as CloneDR, Covet, Duploc, and CLAN, based on the different techniques they use, is highlighted, and the cloning process is also explained. Code clones are identical segments of source code that might be inserted intentionally or unintentionally. Reusing code snippets by copying and pasting, with or without minor alterations, is a common task in software development. However, the existence of code clones may degrade the design structure and quality attributes of software, such as changeability, readability, and maintainability, and hence increase maintenance costs.
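
The abstract names the four clone types without defining them. Under the definitions commonly used in the clone literature (Type I: identical apart from whitespace and comments; Type II: identical apart from identifiers and literals; Type III: copied with added, removed, or changed statements; Type IV: same behavior, different syntax), the first two types can be told apart with a simple token comparison. The sketch below is an assumption-laden illustration, not a method from the survey: the keyword set is deliberately tiny, the function names are hypothetical, and the Type II check does not verify that identifiers are renamed consistently.

```python
import re

KEYWORDS = {"int", "for", "if", "else", "while", "return"}  # tiny illustrative set
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def strip_comments(code: str) -> str:
    # Remove /* block */ and // line comments (string literals are not handled).
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.S)
    return re.sub(r"//[^\n]*", "", code)


def tokens(code: str) -> list:
    # Tokenizing discards whitespace, so layout differences disappear here.
    return TOKEN.findall(strip_comments(code))


def abstract_tokens(code: str) -> list:
    # Replace identifiers and numeric literals with placeholders.
    out = []
    for tok in tokens(code):
        if tok.isdigit():
            out.append("LIT")
        elif re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            out.append("ID")
        else:
            out.append(tok)
    return out


def classify_pair(a: str, b: str) -> str:
    if tokens(a) == tokens(b):
        return "Type I"
    if abstract_tokens(a) == abstract_tokens(b):
        return "Type II"
    return "Type III/IV or not a clone"
```

Separating Type III from Type IV, or from unrelated code, needs similarity thresholds or semantic analysis, which is why the sketch groups them in a single fallback answer.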

    CroLSSim: Cross‐language software similarity detector using hybrid approach of LSA‐based AST‐MDrep features and CNN‐LSTM model

    Software similarity across different programming languages is a rapidly evolving field because of its numerous applications in software development, software cloning, software plagiarism, and software forensics. Currently, software researchers and developers search cross-language open-source repositories for similar applications for a variety of reasons, such as reusing programming code, analyzing different implementations, and looking for a better application. However, it is a challenging task because each programming language has a unique syntax and semantic structure. In this paper, a novel tool called Cross-Language Software Similarity (CroLSSim) is designed to detect similar software applications written in different programming languages. First, the Abstract Syntax Tree (AST) features are collected from the different programs. These are high-quality features that can show the abstract view of each program. Then, Methods Description (MDrep) in combination with AST is used to examine the relationship among different method calls. Second, the Term Frequency Inverse Document Frequency approach is used to retrieve the local and global weights from AST-MDrep features. Third, a Latent Semantic Analysis-based feature extraction and selection method is proposed to extract the semantic anchors in reduced dimensional space. Fourth, a Convolutional Neural Network (CNN)-based feature extraction method is proposed to mine the deep features. Finally, a hybrid deep learning model of CNN and Long Short-Term Memory is designed to detect semantically similar software applications from these latent variables. The data set contains approximately 9.5K Java, 8.8K C#, and 7.4K C++ software applications obtained from GitHub. The proposed approach outperforms the state-of-the-art methods.
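
The "local and global weights" step mentioned above can be illustrated with a plain TF-IDF computation, where term frequency is the local weight and inverse document frequency is the global weight. This is a minimal sketch: the abstract does not say which TF-IDF variant CroLSSim uses, so the formula below (relative frequency times the natural log of N/df) is an assumption, and the token lists merely stand in for AST-MDrep features.

```python
import math
from collections import Counter


def tfidf(docs: list) -> list:
    """docs: list of token lists (assumed stand-ins for AST-MDrep features)."""
    n = len(docs)
    # Global weight input: number of documents each token appears in.
    df = Counter(tok for doc in docs for tok in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weighted.append({
            # local weight (relative frequency) * global weight (idf)
            tok: (c / len(doc)) * math.log(n / df[tok])
            for tok, c in counts.items()
        })
    return weighted
```

A feature that occurs in every program, such as a method declaration node, receives weight zero under this variant, so only features that discriminate between programs carry into the later dimensionality-reduction stage.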

    Cloneless: Code Clone Detection via Program Dependence Graphs with Relaxed Constraints

    Code clones are pieces of code that have the same functionality. While some clones may structurally match one another, others may look drastically different. The inclusion of code clones clutters a code base, leading to increased maintenance costs. Duplicate code is introduced through a variety of means, such as copy-pasting, code generated by tools, or developers unintentionally writing similar pieces of code. While manual clone identification may be more accurate than automated detection, it is infeasible due to the extensive size of many code bases. Software code clone detection methods have differing degrees of success depending on the analysis performed. This thesis outlines a method of detecting clones using a program dependence graph and subgraph isomorphism to identify similar subgraphs, ultimately illuminating clones. The project imposes few constraints when comparing code segments, so as to potentially reveal more clones.

    Measuring Code Similarity in Large-scaled Code Corpora

    Source code similarity measurement is a fundamental technique in software engineering research. Techniques to measure code similarity have been invented and applied to various research areas such as code clone detection, finding bug fixes, and software plagiarism detection. We perform an evaluation of 30 similarity analysers for source code. The results show that specialised tools including clone and plagiarism detectors, with proper parameter tuning, outperform general techniques such as string matching. Although these specialised tools can handle code similarity in local code bases, they fail to locate similar code artefacts from large-scaled corpora. This is increasingly important considering the rising amount of online code artefacts. We propose a scalable search system specifically designed for source code. It lays a foundation to discovering online code reuse, large-scale code clone detection, finding usage examples, detecting software plagiarism, and finding software licensing conflicts. Our proposed code search framework is a hybrid of information retrieval and code clone detection techniques. This framework will be able to locate similar code artefacts instantly. The search is not only based on textual similarity, but also syntactic and structural similarity. It is resilient to incomplete code fragments that are normally found on the Internet
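
One way to picture a hybrid of information retrieval and clone detection is an inverted index keyed by token n-grams: the index gives fast lookup, and the n-grams capture local code structure while tolerating incomplete fragments. The sketch below is an assumption for illustration only and is not the framework proposed in the abstract; the class `CodeIndex`, the trigram size, and the ranking by shared n-gram count are all hypothetical choices.

```python
import re
from collections import Counter, defaultdict


def ngrams(code: str, n: int = 3) -> set:
    """Set of token n-grams; empty when the code has fewer than n tokens."""
    toks = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}


class CodeIndex:
    """Hypothetical inverted index from token n-grams to document ids."""

    def __init__(self, n: int = 3):
        self.n = n
        self.postings = defaultdict(set)

    def add(self, doc_id: str, code: str) -> None:
        for gram in ngrams(code, self.n):
            self.postings[gram].add(doc_id)

    def search(self, query: str, top_k: int = 5) -> list:
        # Rank documents by how many query n-grams they share.
        hits = Counter()
        for gram in ngrams(query, self.n):
            for doc_id in self.postings.get(gram, ()):
                hits[doc_id] += 1
        return [doc_id for doc_id, _ in hits.most_common(top_k)]
```

Because the query is broken into n-grams, it does not need to parse or compile, so a partial snippet can still retrieve the file it was copied from.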