36 research outputs found

    Improving the Unification of Software Clones using Tree and Graph Matching Algorithms

    Get PDF
    Code duplication is common in all kind of software systems and is one of the most troublesome hurdles in software maintenance and evolution activities. Even though these code clones are created for the reuse of some functionality, they usually go through several modifications after their initial introduction. This has a serious negative impact on the maintainability, comprehensibility, and evolution of software systems. Existing code duplication can be eliminated by extracting the common functionality into a single module. In the past, several techniques have been developed for the detection and management of software clones. However, the unification and refactoring of software clones is still a challenging problem, since the existing tools are mostly focused on clone detection and there is no tool to find particularly refactoring-oriented clones. The programmers need to manually understand the clones returned by the clone detection tools, decide whether they should be refactored, and finally perform their refactoring. This obvious gap between the clone detection tools and the clone analysis tools, makes the refactoring tedious and the programmers reluctant towards refactoring duplicate codes. In this thesis, an approach for the unification and refactoring of software clones that overcomes the limitations of previous approaches is presented. More specifically, the proposed technique is able to detect and parameterize non-trivial differences between the clones. Moreover, it can find a mapping between the statements of the clones that minimizes the number of differences. We have also defined preconditions in order to determine whether the duplicated code can be safely refactored to preserve the behavior of the existing code. We compared the proposed technique with a competitive clone refactoring tool and concluded that our approach is able to find a significantly larger number of refactorable clones

    Code clone detection in obfuscated Android apps

    Get PDF
    The Android operating system has long become one of the main global smartphone operating systems. Both developers and malware authors often reuse code to expedite the process of creating new apps and malware samples. Code cloning is the most common way of reusing code in the process of developing Android apps. Finding code clones through the analysis of Android binary code is a challenging task that becomes more sophisticated when instances of code reuse are non-contiguous, reordered, or intertwined with other code. We introduce an approach for detecting cloned methods as well as small and non-contiguous code clones in obfuscated Android applications by simulating the execution of Android apps and then analyzing the subsequent execution traces. We first validate our approach’s ability on finding different types of code clones on 20 injected clones. Next we validate the resistance of our approach against obfuscation by comparing its results on a set of 1085 apps before and after code obfuscation. We obtain 78-87% similarity between the finding from non-obfuscated applications and four sets of obfuscated applications. We also investigated the presence of code clones among 1603 Android applications. We were able to find 44,776 code clones where 34% of code clones were seen from different applications and the rest are among different versions of an application. We also performed a comparative analysis between the clones found by our approach and the clones detected by Nicad on the source code of applications. Finally, we show a practical application of our approach for detecting variants of Android banking malware. Among 60,057 code clone clusters that are found among a dataset of banking malware, 92.9% of them were unique to one malware family or benign applications

    Multimodal Code Search

    Get PDF

    Enhancement of generic code clone detection model for python application

    Get PDF
    Identical code fragments in different locations are recognized as code clones. There are four native terminologies of code clones concluded as Type-1, Type-2, Type-3 and Type-4. Code clones can be identified using various approaches and models. Generic Code Clone Detection (GCCD) model was created to detect all four terminologies of code clones through five processes. A prototype has been developed to detect code clones in Java programming language that starts with Pre-processing Transformation, Parameterization, Categorization and ends with the Match Detection process. Hence, this work targeted to enhance the prototype using a GCCD model to identify all clone types in Python language. Enhancements are done in the Pre-processing process and parameterization process of the GCCD model to fit the Python language criteria. Results are improved by finding the best constant value and suitable weightage according to Python language. Proposed enhancement results of the Python language clone detection in GCCD model imply that Public as the weightage indicator and def as the best constant value

    FORMALIZATION AND DETECTION OF COLLABORATIVE PATTERNS IN SOFTWARE

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH
    corecore