2 research outputs found

    Detecting Disguised Plagiarism

    Source code plagiarism detection has been addressed several times before, and several tools have been developed for that purpose. In this research project, we investigated a set of disguises that can be mechanically applied to plagiarized source code to defeat plagiarism detection tools. We propose a preprocessor, to be used with existing plagiarism detection tools, that "normalizes" source code before checking it, thus making such disguises ineffective.
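
The abstract does not list the disguises or the language the preprocessor targets, so the following is only a minimal sketch of the general idea. It assumes the disguises are comment edits, layout changes, and identifier renaming, and it assumes Python source so that the standard `tokenize` module can be used. The function name `normalize` and the `ID0`, `ID1`, ... placeholders are hypothetical, not taken from the paper.

```python
import io
import keyword
import tokenize

# Token types that carry only comments or layout (assumed irrelevant here).
_LAYOUT = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}

def normalize(source: str) -> list[str]:
    """Return a canonical token list for a piece of Python source.

    Hypothetical helper: drops comments and layout tokens, and replaces
    every non-keyword name with a placeholder numbered by first appearance.
    """
    placeholders: dict[str, str] = {}
    tokens: list[str] = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _LAYOUT:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # The same original name always maps to the same placeholder.
            tokens.append(placeholders.setdefault(tok.string, f"ID{len(placeholders)}"))
        else:
            tokens.append(tok.string)
    return tokens

original = "def add(x, y):\n    # sum two values\n    return x + y\n"
disguised = "def plus(first, second):\n    return first + second  # total\n"
print(normalize(original) == normalize(disguised))
```

Under these assumptions, two files that differ only in comments, layout, and naming produce the same token list, which could then be handed to an existing detection tool. The sketch also renames built-in names such as `print`, which a real preprocessor might prefer to leave untouched.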

    Evolving Similarity Functions for Code Plagiarism Detection

    Detecting whether computer program code is a student’s original work or has been copied from another student or some other source is a major problem for many universities. Detection methods based on the information retrieval concepts of indexing and similarity matching scale well to large collections of files, but require appropriate similarity functions for good performance. We have used particle swarm optimization and genetic programming to evolve similarity functions that are suited to computer program code. Using a training set of plagiarised and non-plagiarised programs, we have evolved better parameter values for the previously published Okapi BM25 similarity function. We have then used genetic programming to evolve completely new similarity functions that do not conform to any predetermined structure. We found that the evolved similarity functions outperformed the human-developed Okapi BM25 function. We also found that a detection system using the evolved functions was more accurate than the best code plagiarism detection system in use today, and scales much better to large collections of files. The evolutionary computing techniques have been extremely useful in finding similarity functions that advance the state of the art in code plagiarism detection.
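
For readers unfamiliar with Okapi BM25, the sketch below shows a common textbook form of the function applied to tokenized files. It is not the paper's implementation. The abstract does not say which parameters were evolved; `k1` and `b` are the conventional tunable ones, and the defaults here are commonly used values, not the evolved ones. The name `bm25_score`, the non-negative IDF variant, and the choice to count each query term once are all assumptions made for this illustration.

```python
import math
from collections import Counter

def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    """Score token list `doc` against token list `query` with Okapi BM25.

    `corpus` is a non-empty list of token lists, used for document
    frequencies and the average document length. Assumes k1 > 0.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    term_freq = Counter(doc)
    score = 0.0
    for term in set(query):  # simplification: each query term counted once
        f = term_freq[term]
        if f == 0:
            continue  # terms absent from the document contribute nothing
        df = sum(1 for d in corpus if term in d)
        # Non-negative IDF variant (an assumption; other variants exist).
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        # k1 controls term-frequency saturation, b controls length normalization.
        length_norm = 1.0 - b + b * len(doc) / avg_len
        score += idf * f * (k1 + 1.0) / (f + k1 * length_norm)
    return score

corpus = [["a", "b", "c"], ["a", "d"], ["e", "f", "g", "h"]]
query = ["a", "b"]
ranked = sorted(corpus, key=lambda d: bm25_score(query, d, corpus), reverse=True)
print(ranked[0])
```

In this framing, a parameter search such as particle swarm optimization would vary `k1` and `b` and keep the values that best separate plagiarised from non-plagiarised pairs in a labelled training set.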