6,230 research outputs found

    Sistem Deteksi Plagiat pada Dokumen Bahasa Indonesia dengan Algoritma SCAM

    ABSTRACT: Plagiarism is a common form of cheating, committed sometimes deliberately and sometimes by people who do not realize that what they are doing counts as plagiarism. Advances in technology have made plagiarism easier, because many documents are uploaded to the internet without any particular protection and can be copied very easily by others. As plagiarists grow more sophisticated, a system is needed that can help detect plagiarism in a document. The author therefore proposes building a plagiarism detection system that uses SCAM (Stanford Copy Analysis Mechanism). SCAM is a mechanism for calculating the degree of similarity between two or more documents. It performs well at detecting overlapping documents regardless of their length, and it recognizes documents that are a subset or superset of another document. It is good at finding exact or partial copies and can precisely detect copies of entire phrases or parts of them. This final project tests detection of five types of plagiarism: synonym substitution, active-passive conversion, carbon copy, reordering, and word insertion. The test documents are Indonesian-language abstracts of final projects by IT Telkom students. Keywords: SCAM, plagiarism, Stanford Copy Analysis Mechanism, Plagiarism Detection System.
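
The subset and superset behaviour described in this abstract can be illustrated with a small sketch. It assumes the relative frequency model usually associated with SCAM; the abstract itself gives no formula, so the closeness test, the uniform word weights, the whitespace tokenizer, and the tolerance `epsilon=2.5` are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter


def subset_measure(freq_a, freq_b, epsilon=2.5):
    """Asymmetric overlap of document A with B (assumed formula, uniform weights)."""
    # A shared word counts only if its frequencies in the two documents are "close".
    close_words = [
        w for w in freq_a.keys() & freq_b.keys()
        if epsilon - (freq_a[w] / freq_b[w] + freq_b[w] / freq_a[w]) > 0
    ]
    numerator = sum(freq_a[w] * freq_b[w] for w in close_words)
    denominator = sum(count * count for count in freq_a.values())
    return numerator / denominator if denominator else 0.0


def scam_similarity(text_a, text_b, epsilon=2.5):
    """Larger of the two subset measures, capped at 1.0."""
    freq_a = Counter(text_a.lower().split())
    freq_b = Counter(text_b.lower().split())
    score = max(
        subset_measure(freq_a, freq_b, epsilon),
        subset_measure(freq_b, freq_a, epsilon),
    )
    return min(1.0, score)
```

Because the score takes the larger of the two asymmetric measures, a short document that is fully contained in a longer one still scores 1.0 in this sketch, which is the length-independent behaviour the abstract attributes to SCAM.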

    Sistem Pendeteksi Plagiarisme Pada Dokumen Teks Bahasa Indonesia Dengan Menggunakan Metode Latent Semantic Analysis

    ABSTRACT: Information is very easy to find in today's technologically advanced era, which brings both positive and negative effects. One negative effect is plagiarism, whether committed knowingly or not. In an academic environment, plagiarism is considered highly disgraceful. Many measures are taken to prevent it; one is manually checking the titles of scientific papers that students submit to the thesis committee against earlier theses. Manual checking has several problems, such as taking too much time. To help detect documents that show signs of plagiarism, a system was built that calculates the similarity between documents using Latent Semantic Analysis. The method finds documents with similar text by passing through several stages such as tokenizing, stop-word removal (stoplist), and stemming, and the similarity computation uses the vector space model. This final project ran two test scenarios, intra-class and extra-class, to determine similarity values. In the intra-class scenario, 119 abstracts were flagged as plagiarized, with a missed detection on 1 document. In the extra-class scenario, both within the same faculty and across different faculties, false detections still occurred, producing similarity values above the plagiarism threshold. Keywords: plagiarism, Latent Semantic Analysis, Singular Value Decomposition, Vector Space Model.
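
The preprocessing and vector space stages named in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it covers only tokenizing, stop-word removal, and cosine similarity over term-frequency vectors. Stemming and the Singular Value Decomposition step of Latent Semantic Analysis are omitted, and the stop-word list and function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny Indonesian stop-word list, for illustration only.
STOPWORDS = {"yang", "dan", "di", "untuk", "dengan", "pada"}


def preprocess(text):
    """Lowercase, tokenize on runs of letters, and drop stop words (no stemming)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    vec_a = Counter(preprocess(text_a))
    vec_b = Counter(preprocess(text_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)
```

A pair of abstracts would then be flagged when the score exceeds a chosen threshold; the abstract does not state the threshold value used in the thesis.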

    Shape-Based Plagiarism Detection for Flowchart Figures in Texts

    Plagiarism is a well-known problem in the academic arena. Copying other people's work is considered a serious offence that needs to be checked. Many plagiarism detection systems, such as Turnitin, have been developed to provide these checks. Most, if not all, discard figures and charts before checking for plagiarism. Discarding them leaves loopholes that people can exploit: figures and charts can be plagiarized easily without current plagiarism detection systems noticing. Very few papers discuss flowchart plagiarism detection. Therefore, there is a need for a system that detects plagiarism in figures and charts. This paper presents a method for detecting flowchart figure plagiarism based on shape-based image processing and multimedia retrieval. The method managed to retrieve flowcharts with ranked similarity according to different matching sets. Comment: 12 pages

    Plagiarism Detection Avoidance Methods and Countermeasures

    Plagiarism is a major problem that educators face in the information age. Today's plagiarist has a near limitless supply of well-written articles via the internet. Due to the scale of the problem, detecting plagiarism has now become the domain of the computer scientist rather than the educator. With the use of computers, documents can be conveniently scanned into a plagiarism detection system that references public web pages, academic journals, and even previous students' papers, acting as an all-seeing eye. However, plagiarists can overcome these digital content detection systems with the use of clever masking and substitution techniques. These systems cost universities tens of thousands of dollars, and also infringe upon intellectual property ownership rights without the informed consent of individual students. In this work, we examine the efficacy of commercial plagiarism detection systems when used against some selected masking techniques, and then present a simple countermeasure to combat the aforementioned detection avoidance technique.

    Deep Investigation of Cross-Language Plagiarism Detection Methods

    This paper is a deep investigation of cross-language plagiarism detection methods on a recently introduced open dataset, which contains parallel and comparable collections of documents with multiple characteristics (different genres, languages and sizes of texts). We investigate cross-language plagiarism detection methods for 6 language pairs at 2 granularities of text units in order to draw robust conclusions on the best methods while deeply analyzing correlations across document styles and languages. Comment: Accepted to BUCC (10th Workshop on Building and Using Comparable Corpora) colocated with ACL 201