31 research outputs found

    Structural analysis of source code plagiarism using graphs

    A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfillment of the requirements for the degree of Master of Science. May 2017. Plagiarism is a serious problem in academia. It is prevalent in the computing discipline, where students are expected to submit source code assignments as part of their assessment, so copying is likely. Ideally, students can collaborate with each other on a programming task, but each student is expected to submit their own solution. One might also conclude that this interaction would help them learn programming. Unfortunately, that is not always the case. In undergraduate courses, especially in the computer sciences, if a class is large, it is infeasible for an instructor to manually check every assignment for probable plagiarism. Even if the class is smaller, it is still impractical to inspect every assignment, because some potentially plagiarised content could still be missed by humans. Therefore, automatically checking source code for likely plagiarism is essential. Many methods have been proposed to detect source code plagiarism in undergraduate assignments, but an ideal system should be able to differentiate actual cases of plagiarism from the coincidental similarities that usually occur in source code. Some of the existing source code plagiarism detection systems are either not scalable, or performed better when programs are modified with a number of insertions and deletions to obfuscate plagiarism. To address this issue, a graph-based model that considers the structural similarities of programs is introduced.
This research study proposes an approach to measuring similarity in programming assignments using an existing plagiarism detection system to find similarities in programs, and a graph-based model to annotate the programs. We describe experiments on data sets of undergraduate Java programs to inspect the programs for plagiarism, and we evaluate the graph model with good precision. An evaluation of the graph-based model reveals a high rate of plagiarism in the programs and resilience to many obfuscation techniques, while false detection (coincidental similarity) rarely occurred. If this detection method is adopted, it will help an instructor carry out the detection process conscientiously. MT 201

    Plagiarism Testing of Student Assignments Using the Winnowing Algorithm

    Plagiarism is the act of taking the work or ideas of others, intentionally or unintentionally, and claiming them as one's own without citing the source or author. Students often plagiarize assignments set by the lecturer, sometimes simply copying other people's work to complete the assignment. This can lead to dependence on others and an inability to complete assignments independently. Many plagiarism detection systems have been created to help reduce plagiarism, and one approach uses the winnowing algorithm. In this study the authors used the winnowing algorithm to detect plagiarism in student assignments, namely programming source code. Applied to 10 student assignments, the winnowing algorithm produced a range of similarity values, each expressed as the percentage of similarity between the two student assignments compared. The average overall similarity across the 10 assignments was 75.12%.
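
As a rough illustration of how winnowing can produce a percentage similarity between two submissions, the sketch below hashes every k-gram of the normalized text, keeps the minimum hash of each sliding window as a fingerprint, and compares the two fingerprint sets. The abstract does not state the parameters or the similarity formula, so the function names, the values `k=5` and `window=4`, the CRC32 hash, and the Jaccard-style ratio are all assumptions made for this example.

```python
import zlib


def normalize(text):
    """Lowercase the text and drop all whitespace (an assumed preprocessing step)."""
    return "".join(ch.lower() for ch in text if not ch.isspace())


def kgram_hashes(text, k):
    """Hash every substring of length k; CRC32 is an arbitrary, deterministic choice."""
    return [zlib.crc32(text[i:i + k].encode("utf-8"))
            for i in range(len(text) - k + 1)]


def winnow(hashes, window):
    """Keep the minimum hash of each window, preferring the rightmost minimum."""
    if not hashes:
        return set()
    window = min(window, len(hashes))  # short inputs form a single window
    fingerprints = set()
    for start in range(len(hashes) - window + 1):
        win = hashes[start:start + window]
        smallest = min(win)
        # Position of the rightmost occurrence of the minimum in this window.
        pos = start + (window - 1 - win[::-1].index(smallest))
        fingerprints.add((smallest, pos))
    return fingerprints


def similarity(a, b, k=5, window=4):
    """Percentage overlap of fingerprint hashes (Jaccard ratio, an assumption)."""
    fa = {h for h, _ in winnow(kgram_hashes(normalize(a), k), window)}
    fb = {h for h, _ in winnow(kgram_hashes(normalize(b), k), window)}
    if not fa and not fb:
        return 0.0
    return 100.0 * len(fa & fb) / len(fa | fb)
```

Because whitespace is removed before hashing, two submissions that differ only in spacing score as identical under this sketch. Renamed variables would still lower the score unless identifiers are also normalized.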

    Comparison between the Stemmer Porter Effect and Nazief-Adriani on the Performance of Winnowing Algorithms for Measuring Plagiarism

    Technological developments are shifting documents from physical paper to digital form, with far-reaching effects. On the positive side, paper waste is reduced. On the other hand, the ease of copying digital data has increased plagiarism. Experts have made many efforts to address plagiarism, one of which is using the winnowing algorithm to detect plagiarized data. Many optimizations of the winnowing algorithm use stemming techniques. The most widely used stemming algorithms include the Porter stemmer and the Nazief-Adriani stemmer. However, the effect of the choice of stemmer on the performance of the winnowing algorithm in measuring plagiarism has not yet been compared. This study therefore investigates the effect of stemming algorithms on the winnowing algorithm so that plagiarism detection results can be improved. The results indicate that the Nazief-Adriani stemmer is superior in its effect on the winnowing algorithm, decreasing detection performance by only 0.28% in similarity value, while the Porter stemmer is superior in processing time, being 69% faster.

    Computer-based assessment system for e-learning applied to programming education

    Integrated Master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

    Enhancing computer-aided plagiarism detection

    Process Model Improvement for Source Code Plagiarism Detection in Student Programming Assignments

    In programming courses there are various ways in which students attempt to cheat. The most common method is copying source code from other students and making minimal changes to it, such as renaming variables. Several tools, such as Sherlock, JPlag and Moss, have been devised to detect source code plagiarism. However, these tools are less effective for larger student assignments and projects that involve many source code files. Issues may also occur when source code is given to students in class so that they can copy it. In such cases these tools do not provide satisfactory results and reports. In this study, we present an improved process model for plagiarism detection when multiple student files exist and allowed source code is present. In this paper we use the Sherlock detection tool, although the presented process model can be combined with any plagiarism detection engine. The proposed model is tested on assignments in three courses in two subsequent academic years.
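
The abstract does not describe the steps of the process model, so the following is only a guess at one preprocessing step such a model might include, not the authors' method: merge each student's files into a single text and drop lines that also appear in the code handed out in class, before passing the result to a detection engine. The function name `strip_allowed_lines` and the line-by-line matching rule are hypothetical.

```python
def strip_allowed_lines(submission_files, allowed_code):
    """Merge a student's files and remove lines found in the allowed code.

    submission_files: dict mapping file name -> file contents (hypothetical shape).
    allowed_code: text of the source code students were permitted to copy.
    """
    # Compare on stripped lines so indentation differences do not matter.
    allowed = {line.strip() for line in allowed_code.splitlines() if line.strip()}
    kept = []
    for name in sorted(submission_files):  # fixed order gives a stable result
        for line in submission_files[name].splitlines():
            stripped = line.strip()
            if stripped and stripped not in allowed:
                kept.append(stripped)
    return "\n".join(kept)
```

The merged text for each student can then be compared pairwise by whichever detection engine is in use, so that matches on the handed-out code no longer inflate the similarity scores.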

    Improving source code plagiarism detection systems

    Computing education involves practical training through programming assignments, which are frequent targets for plagiarism. In this thesis, different aspects of source code plagiarism in an academic environment are discussed. A comparative analysis of source code similarity detection systems was performed and several improvements were proposed. Selected systems were evaluated using simulated plagiarism based on three programming assignments, produced after 1, 2, 4, and 8 hours of work on a baseline version using more than 20 types of lexical and structural modifications. Real-life student code from three different courses was also used for evaluation. The courses were attended by 100 to 300 students, and the solutions varied from 50 to 1000 lines of code. The results show that 5-10% of students plagiarized their solutions, according to the criteria used in this thesis.

    Application of the Rabin-Karp Algorithm and Cosine Similarity for Checking the Similarity of Student Paper Assignment Documents (Case Study: Informatics Engineering, UIN SUSKA Riau)

    Document similarity is a foundation of intelligent systems in data processing, such as information retrieval and text classification. Interviews with several lecturers in Informatics Engineering at UIN SUSKA Riau revealed that cases of similar paper assignments between one student and another are still common, so a process for checking the similarity of student paper assignments is needed. In this study, the algorithms applied to check document similarity are the Rabin-Karp algorithm and cosine similarity. The Rabin-Karp algorithm is used for preprocessing and for extracting hash values, while cosine similarity is used to calculate the percentage similarity of the documents tested. Sentence-order testing showed that the similarity value stays the same even when the position of sentences is changed. Testing identical documents yields a similarity of 100%. Conversely, testing documents that are not the same yields a similarity of 0%. In addition, testing two documents with K values of 3, 5, 6, and 7 showed that the smaller the K-gram value, the higher the resulting similarity, with the highest similarity at K = 3, namely 18.54%. Finally, in a comparison between the document similarity checking system and Plagiarism Checker X on 15 document files, the highest similarity value was 10.29% for the document checking system and 14.06% for Plagiarism Checker X.
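
The combination described above, Rabin-Karp hashing of k-grams followed by cosine similarity over the resulting hash counts, might look roughly like the sketch below. The abstract gives neither the hash parameters nor the preprocessing rules, so `BASE`, `MOD`, the alphanumeric-only `preprocess` step, and the function names are assumptions chosen for illustration.

```python
import math
from collections import Counter

BASE = 256        # assumed radix for the polynomial hash
MOD = 1_000_003   # assumed prime modulus


def preprocess(text):
    """Lowercase and keep only letters and digits (an assumed cleaning step)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def rolling_hashes(text, k):
    """Rabin-Karp rolling hash of every k-gram, updated in O(1) per step."""
    if k <= 0 or len(text) < k:
        return []
    high = pow(BASE, k - 1, MOD)  # weight of the character leaving the window
    h = 0
    for ch in text[:k]:
        h = (h * BASE + ord(ch)) % MOD
    hashes = [h]
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % MOD   # remove the oldest character
        h = (h * BASE + ord(text[i])) % MOD       # append the newest character
        hashes.append(h)
    return hashes


def cosine_similarity(a, b, k=3):
    """Cosine of the angle between k-gram hash count vectors, as a percentage."""
    va = Counter(rolling_hashes(preprocess(a), k))
    vb = Counter(rolling_hashes(preprocess(b), k))
    dot = sum(va[h] * vb[h] for h in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return 100.0 * dot / norm if norm else 0.0
```

A smaller `k` produces shorter k-grams that are more likely to be shared between two documents, which is consistent with the reported trend of higher similarity at smaller K values.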