17 research outputs found

    Structured Review of the Evidence for Effects of Code Duplication on Software Quality

    This report presents the detailed steps and results of a structured review of code clone literature. The aim of the review is to investigate the evidence for the claim that code duplication has a negative effect on code changeability. This report contains only those details of the review for which there was not enough space in the companion paper published at a conference (Hordijk, Ponisio et al. 2009 - Harmfulness of Code Duplication - A Structured Review of the Evidence).

    Program Compression

    The talk focused on a grammar-based technique for identifying redundancy in program code and taking advantage of that redundancy to reduce the memory required to store and execute the program. The idea is to start with a simple context-free grammar that represents all valid basic blocks of any program. We represent a program by the parse trees (i.e. derivations) of its basic blocks using the grammar. We then modify the grammar, by considering sample programs, so that idioms of the language have shorter derivations in the modified grammar. Since each derivation represents a basic block, we can interpret the resulting set of derivations much as we would interpret the original program. We need only expand the grammar rules indicated by the derivation to produce a sequence of original program instructions to execute. The result is a program representation that is approximately 40% of the original program size and is interpretable by a very modest-sized interpreter

    Sistema web de detección de copias en prácticas de programación (A web system for detecting copying in programming assignments)

    When the time comes to grade the programming assignments for a course, we may find that some students have copied. Comparing complete text files is simple and sufficient for detecting exact, complete copies; detecting partial copies, however, is a more complex task. This work presents a web system that simplifies the use of the MOSS service, so that instructors can use it through an easy, intuitive interface, without needing to know Perl or type commands. The proposed system simplifies management of the files to be compared, sends the request to MOSS to carry out the comparison, and finally displays the comparison results in a browser window. The proposed system was first used in the assembly-language programming assignments of Estructura de los Computadores I during the 2005/2006 academic year and proved very effective at detecting copies.

    On the side-effects of code abstraction


    Structured Review of Code Clone Literature

    This report presents the results of a structured review of code clone literature. The aim of the review is to assemble a conceptual model of clone-related concepts which helps us to reason about clones. This conceptual model unifies clone concepts from a wide range of literature, so that findings about clones can be compared with each other

    Plagiarism detection in source programs using structural similarities

    The paper presents a plagiarism detection framework whose goal is to determine whether two programs are similar to each other and, if so, to what extent. Plagiarism detection has been considered earlier for written material, such as student essays, and text-based algorithms have been published for that purpose. We argue that for program code comparison, structure-based techniques may be much more suitable. The main idea is to transform the source code into mathematical objects, apply appropriate reduction and comparison methods to these, and interpret the results accordingly. We have designed a generic program structure comparison framework and implemented it for the Prolog and SML programming languages. We have been using the implementation at BUTE to successfully detect plagiarism in homework assignments for years.
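
    To illustrate the general idea of comparing program structure rather than text, here is a minimal sketch. It is not the framework described in the abstract (which targets Prolog and SML); it assumes Python source as input, uses the sequence of syntax-tree node types as the "structure", and the function names `structure_of` and `structural_similarity` are hypothetical.

```python
import ast
import difflib


def structure_of(source):
    """Return the sequence of AST node type names, ignoring identifiers and literals."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def structural_similarity(source_a, source_b):
    """Similarity in [0, 1] between the node-type sequences of two programs."""
    matcher = difflib.SequenceMatcher(None, structure_of(source_a), structure_of(source_b))
    return matcher.ratio()


# Two programs that differ only in names share the same structure.
original = "def f(x):\n    return x + 1\n"
renamed = "def g(y):\n    return y + 1\n"
different = "def f(x):\n    for i in range(x):\n        print(i)\n"

renamed_score = structural_similarity(original, renamed)
different_score = structural_similarity(original, different)
```

    Because identifier names never enter the comparison, renaming variables or functions leaves the score unchanged, which is the property a purely text-based comparison lacks.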

    Extracting Source Level Program Similarities from Dynamic Behavior

    The vast majority of work on comparing program similarities to detect software piracy either assumes the availability of the program source code (e.g., Moss) or performs a complicated source program transformation to embed carefully designed signatures, or software watermarks, into the binary code. In this paper, we propose a new approach to detecting program similarities that requires neither the availability of the program source nor complicated compile-time watermarking techniques. Furthermore, in contrast to the alternatives, our framework is resistant to standard attacks such as code obfuscation. Our approach exploits the observation that the sequence of system calls performed by a program execution provides a strong signature of the program semantics or functionality, thereby using the inherent properties of a program to identify it. By statistically analyzing sequences of system calls, the relative similarities and differences of program regions can be automatically determined. We have developed a framework that automatically extracts system call sequences, computes the similarities between two binaries via statistical analysis, and maps dynamically similar regions onto textually similar source files. We present several case studies showing the applicability of our framework in pinpointing pirated segments. Our experimental study also shows that directly comparing the binary files of the programs without considering their dynamic behavior is ineffective, and demonstrates strong consistency between the output of our new framework and that of Moss
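
    The abstract does not specify the statistical analysis used, so the following is only a sketch of one simple way to compare system call traces: a weighted Jaccard similarity over short windows (n-grams) of calls. The traces, the window length `n=3`, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(calls, n=3):
    """Return a multiset of all length-n windows over a system call trace."""
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


def trace_similarity(trace_a, trace_b, n=3):
    """Weighted Jaccard similarity between the n-gram multisets of two traces."""
    grams_a, grams_b = ngrams(trace_a, n), ngrams(trace_b, n)
    intersection = sum((grams_a & grams_b).values())
    union = sum((grams_a | grams_b).values())
    return intersection / union if union else 0.0


# Hypothetical traces: the suspect adds one call, the unrelated program shares no window.
original = ["open", "read", "read", "write", "close"]
suspect = ["open", "read", "read", "write", "close", "exit"]
unrelated = ["socket", "connect", "send", "recv", "close"]

suspect_score = trace_similarity(original, suspect)
unrelated_score = trace_similarity(original, unrelated)
```

    Working on windows rather than whole traces lets locally similar regions score highly even when the surrounding behavior differs.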

    Enhancing source-based clone detection using intermediate representation

    Detecting software clones in large-scale projects helps improve the maintainability of large code bases. The source code representation (e.g., Java or C files) of a software system has traditionally been used for clone detection. In this paper, we propose a technique that transforms the source code to an intermediate representation, and then reuses established source-based clone detection techniques to detect clones in the intermediate representation. The clones are mapped back to the source code and are used to augment the results reported by source-based clone detection. We demonstrate the performance of our new technique using systems from the Bellon clone evaluation benchmark. The results show that our technique can detect Type 3 clones. On the Bellon corpus, our technique achieves higher recall with a minimal drop in precision. When complete clone groups are examined, our technique has higher precision than the standalone string-based and token-based clone detectors.

    Softwareplagiatserkennung auf Java-Bytecodebasis (Software plagiarism detection based on Java bytecode)

    This thesis presents Plagiarism Finder, a purpose-built plagiarism detection tool for Java programs. Its detection process is based on Java bytecode. The fundamentals of plagiarism detection and of Java bytecode are explained in detail. The thesis also describes how Plagiarism Finder works, its design, its user interface, and its evaluation. It addresses the following aspects, which the research literature has not considered so far: how the bytecode is normalized before comparison; how to ensure that moving methods does not affect the results; and how templates can be filtered out during plagiarism detection. The thesis concludes that the results of Plagiarism Finder are stable under changes to wording, text layout, and the moving of methods. Changes to control structures (e.g., for loops instead of while loops), to access modifiers, and to the number of methods lead to unstable results. All in all, Plagiarism Finder can keep up with an established plagiarism detection tool such as JPlag [MP00]. Based on the data examined, Plagiarism Finder is slightly worse than JPlag at detecting plagiarism, but markedly better at recognizing software that is not plagiarized. For verbatim copies and for changes to text layout and identifiers, the results of the two programs are nearly identical.
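
    The abstract does not say how Plagiarism Finder normalizes bytecode or handles moved methods, so the sketch below only illustrates the two ideas in the simplest possible form. It assumes each method is given as a list of textual instructions (mnemonic followed by optional operands); the normalization drops operands, and sorting the normalized methods makes the comparison independent of method order. All names are hypothetical.

```python
def normalize_method(instructions):
    """Keep only the opcode mnemonic of each instruction, dropping operands."""
    return tuple(instruction.split()[0] for instruction in instructions)


def fingerprint(methods):
    """Order-independent fingerprint: the sorted list of normalized method bodies."""
    return sorted(normalize_method(body) for body in methods)


def same_after_normalization(program_a, program_b):
    """True if both programs contain the same normalized methods, in any order."""
    return fingerprint(program_a) == fingerprint(program_b)


# Hypothetical programs: the copy swaps the method order and a constant-pool index.
original = [
    ["iload_1", "iload_2", "iadd", "ireturn"],
    ["aload_0", "getfield #2", "areturn"],
]
moved_copy = [
    ["aload_0", "getfield #7", "areturn"],
    ["iload_1", "iload_2", "iadd", "ireturn"],
]
changed = [
    ["iload_1", "iload_2", "isub", "ireturn"],
    ["aload_0", "getfield #2", "areturn"],
]

moved_matches = same_after_normalization(original, moved_copy)
changed_matches = same_after_normalization(original, changed)
```

    An exact-match fingerprint like this would also react to the control-structure changes that the abstract reports as a source of unstable results, since a different loop construct compiles to a different opcode sequence.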