An Abstract Method Linearization for Detecting Source Code Plagiarism in Object-Oriented Environment

Author: Karnalim Oscar
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 29/11/2017

Although plagiarizing source code is a trivial task for most CS students, detecting such unethical behavior requires a considerable amount of effort. Several plagiarism detection systems have therefore been developed to handle this issue. This paper extends Karnalim's work, a low-level approach for detecting Java source code plagiarism, by incorporating abstract method linearization. The extension is intended to enhance the accuracy of the low-level approach in terms of detecting plagiarism in an object-oriented environment. According to our evaluation, which was conducted on 23 design-pattern source code pairs, our extended low-level approach is more effective than both the state-of-the-art approach and Karnalim's approach. Compared to the state-of-the-art approach, our approach generates fewer coincidental similarities and provides more accurate results. Compared to Karnalim's approach, our approach can, to some extent, generate higher similarity when simple abstract method invocation is incorporated.

Comment: The 8th International Conference on Software Engineering and Service Scienc

arXiv.org e-Print Archive

Crossref

The Effectiveness of Low-Level Structure-based Approach Toward Source Code Plagiarism Level Taxonomy

Author: Budi Setia; Karnalim Oscar
Publication date: 03/05/2018

The low-level approach is a novel way to detect source code plagiarism. It has been shown to be effective in a controlled environment when compared to the baseline approach (i.e., an approach which relies on source code token subsequence matching). We evaluate the effectiveness of the state of the art in the low-level approach based on Faidhi & Robinson's plagiarism level taxonomy; real plagiarism cases are employed as the dataset in this work. Our evaluation shows that the state of the art in the low-level approach handles most plagiarism attacks effectively. Further, it also outperforms its predecessor and the baseline approach at most plagiarism levels.

Comment: The 6th International Conference on Information and Communication Technolog
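The baseline that this abstract mentions, source code token subsequence matching, can be illustrated with a minimal sketch. This is not the implementation evaluated in the paper: the tiny keyword set, the regular-expression tokenizer, and the function names `tokenize` and `token_similarity` are all assumptions made for illustration, and Python's standard `difflib.SequenceMatcher` stands in for whatever matching algorithm a real detector would use.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny keyword set; a real tool would use a full lexer.
KEYWORDS = {"int", "for", "if", "else", "return", "while", "class", "void"}

# Identifiers/keywords, integer literals, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source):
    """Map source text to a token stream, abstracting identifiers and numbers."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)      # keywords are kept verbatim
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")        # every identifier becomes one token type
        elif lexeme[0].isdigit():
            tokens.append("NUM")       # every integer literal becomes one token type
        else:
            tokens.append(lexeme)      # operators and punctuation
    return tokens


def token_similarity(source_a, source_b):
    """Similarity in [0, 1] based on matching token subsequences."""
    matcher = SequenceMatcher(None, tokenize(source_a), tokenize(source_b), autojunk=False)
    return matcher.ratio()


original = "int sum = 0; for (int i = 0; i < n; i++) sum += i;"
renamed = "int total = 0; for (int k = 0; k < limit; k++) total += k;"
unrelated = "return x;"

print(token_similarity(original, renamed))
print(token_similarity(original, unrelated))
```

Because identifiers are collapsed to a single token type, renaming variables leaves the token stream unchanged, so the first pair is reported as identical, while the unrelated snippet scores lower. Disguises that reorder or restructure code alter the token stream itself, which is the kind of attack the low-level approach described above is meant to handle better.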

arXiv.org e-Print Archive

Crossref

Issues Related to the Detection of Source Code Plagiarism in Students Assignments

Author: AlHami I.; Alsmadi Izzat M.; Kazakzeh S.
Publication venue: Digital Commons @ Texas A&M University-San Antonio
Publication date: 01/01/2014

Detecting similarity or plagiarism in academic research publications, source code, etc. has long been a complex and time-consuming task. Several algorithms, tools, and websites exist that try to find plagiarism or possible plagiarism in such products of human creativity. In this paper, we used source code plagiarism detection tools to assess the level of plagiarism in source code. We also investigated issues related to accuracy and the challenges in detecting possible plagiarism in students' assignments. In a second study, we evaluated some tools for detecting possible plagiarism in research papers. Results showed that such a decision is not binary and that subjectivity is high. In addition, plagiarism detection tools need to be tuned so that their users can assign criticality or weights to categorize and classify different levels of seriousness of plagiarism.

Digital Commons @ Texas A&M University-San Antonio

The System Kato: Detecting Cases of Plagiarism for Answer-Set Programs

Author: Johannes Oetsch; Jörg Pührer; Martin Schwengerer; Hans Tompits
Publication venue: Cambridge University Press (CUP)
Publication date: 01/01/2010

Plagiarism detection is a growing need among educational institutions, and solutions for different purposes exist. An important field in this direction is detecting cases of source-code plagiarism. In this paper, we present the tool Kato for supporting the detection of this kind of plagiarism in the area of answer-set programming (ASP). Currently, the tool is implemented for DLV programs, but it is designed to handle other logic-programming dialects as well. We review the basic features of Kato, introduce its theoretical underpinnings, and discuss an application of Kato for plagiarism detection in the context of courses on logic programming at the Vienna University of Technology.

arXiv.org e-Print Archive

CiteSeerX

Crossref

An approach to source-code plagiarism detection and investigation using latent semantic analysis

Author: Cosma Georgina

This thesis looks at three aspects of source-code plagiarism. The first aspect is concerned with creating a definition of source-code plagiarism; the second is concerned with describing the findings gathered from investigating the Latent Semantic Analysis information retrieval algorithm for source-code similarity detection; and the final aspect is concerned with the proposal and evaluation of a new algorithm that combines Latent Semantic Analysis with plagiarism detection tools. A recent review of the literature revealed that there is no commonly agreed definition of what constitutes source-code plagiarism in the context of student assignments. This thesis first analyses the findings from a survey carried out to gain an insight into the perspectives of UK Higher Education academics who teach programming on computing courses. Based on the survey findings, a detailed definition of source-code plagiarism is proposed. Secondly, the thesis investigates the application of an information retrieval technique, Latent Semantic Analysis, to derive semantic information from source-code files. Various parameters drive the effectiveness of Latent Semantic Analysis. The performance of Latent Semantic Analysis using various parameter settings and its effectiveness in retrieving similar source-code files when optimising those parameters are evaluated. Finally, an algorithm for combining Latent Semantic Analysis with plagiarism detection tools is proposed and a tool is created and evaluated. The proposed tool, PlaGate, is a hybrid model that allows for the integration of Latent Semantic Analysis with plagiarism detection tools in order to enhance plagiarism detection. In addition, PlaGate has a facility for investigating the importance of source-code fragments with regard to their contribution towards proving plagiarism. PlaGate provides graphical output that indicates the clusters of suspicious files and source-code fragments.

Warwick Research Archives Portal Repository