6,138 research outputs found

    AntiPlag: Plagiarism Detection on Electronic Submissions of Text Based Assignments

    Plagiarism is a growing problem in academia and a constant concern for universities and other academic institutions. The situation is worsening with the availability of ample resources on the web. This paper focuses on creating an effective and fast plagiarism detection tool for text-based electronic assignments. Our tool, named AntiPlag, is developed using the tri-gram sequence matching technique. Three sets of text-based assignments were tested with AntiPlag and the results were compared against an existing commercial plagiarism detection tool. AntiPlag performed better than the commercial tool in terms of false positives, owing to the pre-processing steps it performs. In addition, to reduce detection latency, AntiPlag applies a data clustering technique that makes it four times faster than the commercial tool considered. AntiPlag can be used to easily separate plagiarised text-based assignments from non-plagiarised ones. We therefore present AntiPlag, a fast and effective tool for plagiarism detection on text-based electronic assignments.
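As an illustration of tri-gram sequence matching in general, the sketch below compares two documents by the word tri-grams they share. It is not AntiPlag's implementation: the function names, the lowercase-and-tokenise pre-processing, and the choice to normalise by the smaller document are all assumptions made for this example, and the clustering step mentioned in the abstract is not covered.

```python
import re


def word_trigrams(text):
    """Return the set of word tri-grams in a text after simple normalisation."""
    # Assumed pre-processing: lowercase and keep only alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}


def trigram_overlap(doc_a, doc_b):
    """Fraction of the smaller tri-gram set that also occurs in the other document."""
    a, b = word_trigrams(doc_a), word_trigrams(doc_b)
    if not a or not b:
        # A document with fewer than three tokens has no tri-grams to compare.
        return 0.0
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    first = "The quick brown fox jumps over the lazy dog"
    second = "A quick brown fox jumps over a sleeping cat"
    print(round(trigram_overlap(first, second), 3))
```

A pair of submissions whose overlap exceeds some threshold could then be flagged for manual review; the threshold is a tuning decision that the abstract does not specify.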

    Text Similarity from Image Contents using Statistical and Semantic Analysis Techniques

    Plagiarism detection is one of the most researched areas in the Natural Language Processing (NLP) community. A good plagiarism detection system covers the full range of NLP methods, including semantics, named entities, and paraphrases, and produces detailed plagiarism reports. Detecting cross-lingual plagiarism requires deep knowledge of various advanced methods and algorithms to perform effective text similarity checking. Plagiarists, too, are becoming more sophisticated at concealing their identity to avoid being caught. They evade detection with techniques such as paraphrasing, synonym replacement, mismatched citations, and translation from one language to another. Image Content Plagiarism Detection (ICPD) has gained importance; it uses advanced image content processing to identify instances of plagiarism and protect the integrity of image content. Plagiarism extends beyond textual content, as images such as figures, graphs, and tables can also be plagiarised. However, image content plagiarism detection remains an unaddressed challenge, so there is a critical need for methods and systems that detect plagiarism in image content. In this paper, a system is implemented to detect plagiarism from the contents of images such as figures, graphs, and tables. Alongside statistical algorithms such as Jaccard and cosine similarity, adding semantic algorithms such as LSA, BERT, and WordNet gave better results in detecting plagiarism efficiently and accurately. Comment: NLPTT2023 publication, 10 pages
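The two statistical measures named in the abstract, Jaccard and cosine similarity, can be sketched as follows. This is a minimal example and not the paper's system: it assumes the text has already been extracted from the images (the extraction step is not shown), and the tokeniser and function names are hypothetical. The semantic methods (LSA, BERT, WordNet) are not covered here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed pre-processing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(text_a, text_b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    a, b = set(tokenize(text_a)), set(tokenize(text_b))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two term-frequency vectors."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical captions standing in for text extracted from two figures.
    caption_a = "Sales by region 2021"
    caption_b = "Sales by region 2022"
    print(jaccard_similarity(caption_a, caption_b))
    print(cosine_similarity(caption_a, caption_b))
```

Both measures depend only on surface tokens, so a paraphrase with different wording scores low; that limitation is the usual motivation for adding semantic measures.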

    The Hegelian Inquiring System and Critical Triangulation Tools for the Internet Information Slave

    This paper discusses informing, i.e. increasing people’s understanding of reality by providing representations of this reality. The Hegelian inquiry system is used to explain the nature of informing. Understanding the Hegelian inquiry system is essential for making informed decisions where reality can be ambiguous and where sources of bias and manipulation have to be understood in order to increase the level of free, informed choice. This inquiry system metaphorically identifies information masters and slaves, and we propose critical dialectic information triangulation (CDIT) tools for information slaves (i.e. non-experts) in dialectic interactions with informative systems owned by supposed information masters. The paper concludes with suggestions for further research on informative triangulation tools for the internet and management information systems.

    Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs

    Binary code analysis makes it possible to analyze binary code without access to the corresponding source code. A binary, after disassembly, is expressed in an assembly language. This inspires us to approach binary analysis by leveraging ideas and techniques from Natural Language Processing (NLP), a rich area focused on processing text in various natural languages. We notice that binary code analysis and NLP share many analogous topics, such as semantics extraction, summarization, and classification. This work uses these ideas to address two important code similarity comparison problems: (I) given a pair of basic blocks for different instruction set architectures (ISAs), determining whether their semantics are similar; and (II) given a piece of code of interest, determining whether it is contained in another piece of assembly code for a different ISA. The solutions to these two problems have many applications, such as cross-architecture vulnerability discovery and code plagiarism detection. We implement a prototype system, INNEREYE, and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency, and scalability. The case studies using the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis. Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium 201

    Sensing Textual Plagiarism

    This Final Year Project (FYP) is about sensing textual plagiarism. To this end, an application capable of detecting plagiarism in a textual document is to be developed. The main focus of the project is a study of how to detect plagiarism in a textual document. Word-for-word plagiarism is the most obvious and serious form of plagiarism and can be categorized as direct stealing of another's work, without significant alteration and without consent. Fact finding is carried out to support the study of plagiarism. The project will incorporate the Smith-Waterman algorithm, a classical tool for identifying and quantifying local similarities in biological sequences. The significance of this project is therefore the availability of an application to sense the widespread plagiarism that often occurs in valuable documents, articles, and journals.
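To show how Smith-Waterman local alignment can carry over from biological sequences to text, the sketch below aligns two lists of words and returns the best local score together with the matching words. It is a generic version of the algorithm, not the project's application: the scoring values (match 2, mismatch -1, gap -1), the word-level tokenisation, and the function name are assumptions chosen for illustration.

```python
def smith_waterman(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Return (best local alignment score, list of matched tokens) for two token lists."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; a cell never drops below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            score[i][j] = max(0, diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    matched = []
    while i > 0 and j > 0 and score[i][j] > 0:
        same = seq_a[i - 1] == seq_b[j - 1]
        if same and score[i][j] == score[i - 1][j - 1] + match:
            matched.append(seq_a[i - 1])
            i, j = i - 1, j - 1
        elif not same and score[i][j] == score[i - 1][j - 1] + mismatch:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, matched[::-1]


if __name__ == "__main__":
    original = "the cat sat on the mat".split()
    suspect = "yesterday a cat sat on the sofa".split()
    print(smith_waterman(original, suspect))
```

Because the score resets to zero wherever the texts diverge, a copied passage embedded in otherwise unrelated text still produces a high local score, which suits the word-for-word case described above.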