26 research outputs found

    Structured Review of the Evidence for Effects of Code Duplication on Software Quality

    This report presents the detailed steps and results of a structured review of the code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. The report contains only those details of the review that could not be included, for lack of space, in the companion conference paper (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence).

    Program Compression

    The talk focused on a grammar-based technique for identifying redundancy in program code and taking advantage of that redundancy to reduce the memory required to store and execute the program. The idea is to start with a simple context-free grammar that represents all valid basic blocks of any program. We represent a program by the parse trees (i.e. derivations) of its basic blocks using the grammar. We then modify the grammar, by considering sample programs, so that idioms of the language have shorter derivations in the modified grammar. Since each derivation represents a basic block, we can interpret the resulting set of derivations much as we would interpret the original program: we need only expand the grammar rules indicated by the derivation to produce a sequence of original program instructions to execute. The result is a program representation that is approximately 40% of the original program size and is interpretable by a very modest-sized interpreter.
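
The idea of storing a basic block as a derivation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy grammar, the instruction names, and the encoding of a derivation as a list of rule indices are hypothetical and are not taken from the talk. The last `instr` rule plays the role of an idiom rule added after studying sample programs, so the same block gets a shorter derivation.

```python
# Hypothetical toy grammar: each nonterminal maps to its alternative right-hand sides.
# Symbols that are not keys of GRAMMAR are terminals (instructions).
GRAMMAR = {
    "block": [["instr", "block"], ["instr"]],
    "instr": [
        ["LOAD"],
        ["ADD"],
        ["STORE"],
        ["LOAD", "ADD", "STORE"],  # assumed idiom rule added to shorten derivations
    ],
}


def expand(symbol, choices):
    """Expand `symbol` leftmost-first, consuming one rule index per nonterminal."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit the instruction itself
    rule = GRAMMAR[symbol][next(choices)]
    instructions = []
    for child in rule:
        instructions.extend(expand(child, choices))
    return instructions


def decode(derivation):
    """Turn a derivation (list of rule indices) back into an instruction sequence."""
    return expand("block", iter(derivation))


long_form = [0, 0, 0, 1, 1, 2]  # derivation using only the single-instruction rules
short_form = [1, 3]             # derivation of the same block using the idiom rule
```

Both derivations decode to the same three instructions, but the second needs two rule indices instead of six, which is the source of the size reduction in this toy setting.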

    Structured Review of Code Clone Literature

    This report presents the results of a structured review of the code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts that helps us reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other.

    Web system for detecting copying in programming assignments (Sistema web de detección de copias en prácticas de programación)

    When the time comes to grade the programming assignments for a course, we may find that some students have copied. Comparing complete text files is simple and sufficient for detecting complete, exact copies; detecting partial copies, however, is a more complex task. This work presents a web system that simplifies use of the MOSS service, so that the instructor can use it through an easy, intuitive interface, without needing to know Perl or type commands. The proposed system simplifies management of the files to be compared, sends the request to MOSS to carry out the comparison, and finally displays the comparison results in a browser window. The system was first used in the assembly-language programming assignments of Estructura de los Computadores I during the 2005/2006 academic year, where it proved very effective at detecting copies.

    Software plagiarism detection based on Java bytecode (Softwareplagiatserkennung auf Java-Bytecodebasis)

    This work presents Plagiarism Finder, a purpose-built plagiarism detection tool for Java programs. Its plagiarism detection process is based on Java bytecode. The fundamentals of plagiarism detection and of Java bytecode are explained in depth. The work also describes the operation, design, user interface, and evaluation of Plagiarism Finder. In doing so, it addresses the following aspects, which research has not previously considered: how the bytecode is normalized before comparison; how to ensure that moving methods has no effect on the results obtained; and how templates can be filtered out during plagiarism detection. The work concludes that the results of Plagiarism Finder are stable with respect to changes in wording, changes in text layout, and the relocation of methods. Changes to control structures (e.g. for loops instead of while loops), to access modifiers, and to the number of methods make the results unstable. All in all, Plagiarism Finder can keep up with established plagiarism detection software such as JPlag [MP00]. Based on the data examined, Plagiarism Finder is slightly worse than JPlag at detecting plagiarism. It is, however, considerably better at recognizing non-plagiarized software. For verbatim copies and for changes to text layout and identifiers, the results of the two programs are nearly identical.
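
One way a comparison can be made insensitive to method order and to renamed identifiers is sketched below. This is not Plagiarism Finder's algorithm, which the abstract does not describe in detail; the input format (each method as a list of disassembled instruction strings), the normalization rule (keep only the opcode mnemonic), and the similarity measure (multiset overlap) are all assumptions chosen for illustration.

```python
from collections import Counter


def normalize(method):
    """Keep only the opcode mnemonic of each instruction, dropping its operands."""
    return tuple(instruction.split()[0] for instruction in method)


def similarity(program_a, program_b):
    """Multiset overlap of normalized methods; the order of methods is ignored."""
    a = Counter(normalize(method) for method in program_a)
    b = Counter(normalize(method) for method in program_b)
    shared = sum((a & b).values())  # methods present in both programs
    total = sum((a | b).values())   # methods present in either program
    return shared / total if total else 1.0


original = [
    ["aload_0", "getfield #2", "ireturn"],
    ["iconst_1", "istore_1", "return"],
]
# Same methods in a different order, with a different constant-pool index.
reordered = [
    ["iconst_1", "istore_1", "return"],
    ["aload_0", "getfield #7", "ireturn"],
]
partial = [["iconst_1", "istore_1", "return"]]
```

Because the methods are compared as a multiset, `similarity(original, reordered)` is unaffected by the swap, while `similarity(original, partial)` drops because one method is missing.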

    On the side-effects of code abstraction


    Plagiarism detection in source programs using structural similarities

    The paper presents a plagiarism detection framework whose goal is to determine whether two programs are similar to each other and, if so, to what extent. Plagiarism detection has been considered earlier for written material, such as student essays, and text-based algorithms have been published for that purpose. We argue that for comparing program code, structure-based techniques may be much more suitable. The main idea is to transform the source code into mathematical objects, apply appropriate reduction and comparison methods to these, and interpret the results appropriately. We have designed a generic program structure comparison framework and implemented it for the Prolog and SML programming languages. We have been using the implementation at BUTE to successfully detect plagiarism in homework assignments for years.

    Declassification: transforming java programs to remove intermediate classes

    Computer applications are increasingly being written in object-oriented languages like Java and C++. Object-oriented programming encourages the use of small methods and classes. However, this style of programming introduces much overhead, as each method call results in a dynamic dispatch and each field access becomes a pointer dereference to the heap-allocated object. Many of the classes in these programs are included to provide structure rather than to act as reusable code, and can therefore be regarded as intermediate. We have therefore developed an optimisation technique, called declassification, which transforms Java programs into equivalent programs from which these intermediate classes have been removed. The technique involves two phases, analysis and transformation. The analysis identifies intermediate classes for removal; a suitable class is defined to be a class which is used exactly once within a program. The subsequent transformation eliminates these intermediate classes from the program by inlining the fields and methods of each intermediate class within the enclosing class which uses it. In theory, declassification reduces the number of classes which are instantiated and used in a program during its execution. This should reduce the overhead of object creation and maintenance, as child objects are no longer created, and it should also reduce the number of field accesses and dynamic dispatches required by a program to execute. An important feature of the declassification technique, as opposed to other similar techniques, is that it guarantees there will be no increase in code size. An empirical study was conducted on a number of reasonable-sized Java programs, and it was found that very few suitable classes were identified for inlining. The results showed that the declassification technique had a small influence on the memory consumption and a negligible influence on the run-time performance of these programs. It is therefore concluded that the declassification technique was not successful in optimizing the test programs, but that further extensions to the technique, combined with an intrinsically object-oriented set of test programs, could greatly improve its success.
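
The analysis phase, as the abstract defines it, selects classes that are used exactly once within a program. A minimal sketch of that selection step follows. The program model is an assumption made for illustration: a hypothetical mapping from each class name to the list of classes it uses, rather than real Java source or bytecode, and the sketch does not attempt the inlining transformation itself.

```python
from collections import Counter


def find_intermediate_classes(uses):
    """Return the classes that are used exactly once across the whole program.

    `uses` is an assumed program model: it maps each class name to the list
    of classes that it uses (one list entry per use site).
    """
    use_counts = Counter(used for targets in uses.values() for used in targets)
    return sorted(name for name in uses if use_counts[name] == 1)


# Hypothetical program: Logger is used twice, Car and Engine once each,
# and Main is never used by another class.
program = {
    "Main": ["Car", "Logger"],
    "Car": ["Engine", "Logger"],
    "Engine": [],
    "Logger": [],
}
candidates = find_intermediate_classes(program)
```

In this toy program only `Car` and `Engine` qualify: `Logger` is used from two places, and `Main` is not used by any other class.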

    Enhancing source-based clone detection using intermediate representation

    Detecting software clones in large-scale projects helps improve the maintainability of large code bases. The source code representation (e.g., Java or C files) of a software system has traditionally been used for clone detection. In this paper, we propose a technique that transforms the source code to an intermediate representation and then reuses established source-based clone detection techniques to detect clones in the intermediate representation. The clones are mapped back to the source code and are used to augment the results reported by source-based clone detection. We demonstrate the performance of our new technique using systems from the Bellon clone evaluation benchmark. The results show that our technique can detect Type 3 clones. On the Bellon corpus, our technique has higher recall with a minimal drop in precision. When complete clone groups are examined, our technique has higher precision than the standalone string-based and token-based clone detectors.