A Simple and Effective Method of Cross-Lingual Plagiarism Detection

Avetisyan, Arutyun; Avetisyan, Karen; Ghukasyan, Tsolak; Malajyan, Arthur

Authors: Arutyun Avetisyan, Karen Avetisyan, Tsolak Ghukasyan, Arthur Malajyan
Publication date: 5 April 2023

Abstract

We present a simple cross-lingual plagiarism detection method applicable to a large number of languages. The approach uses open multilingual thesauri for the candidate retrieval task and pre-trained multilingual BERT-based language models for detailed analysis. The method does not rely on machine translation or word sense disambiguation when applied, and is therefore suitable for a large number of languages, including under-resourced ones. The effectiveness of the proposed approach is demonstrated on several existing and new benchmarks, achieving state-of-the-art results for French, Russian, and Armenian.
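
The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea behind thesaurus-based candidate retrieval, not the authors' method. It assumes a hypothetical toy thesaurus (`TOY_THESAURUS`) that maps words from different languages to shared concept IDs, and it ranks source texts by Jaccard overlap of concept sets, which is a stand-in scoring choice. The function names and the threshold are assumptions made for illustration. The detailed analysis stage, which the abstract attributes to multilingual BERT-based models, is not implemented here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy thesaurus: word -> language-independent concept ID.
# A real system would load an open multilingual thesaurus instead.
TOY_THESAURUS: Dict[str, str] = {
    "cat": "C1", "chat": "C1",
    "eats": "C2", "mange": "C2",
    "fish": "C3", "poisson": "C3",
    "dog": "C4", "chien": "C4",
    "sleeps": "C5", "dort": "C5",
}


def to_concepts(text: str) -> Set[str]:
    """Map each known word to its concept ID; unknown words are dropped."""
    return {TOY_THESAURUS[w] for w in text.lower().split() if w in TOY_THESAURUS}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_candidates(
    suspicious: str, sources: List[str], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return sources whose concept overlap with the suspicious text
    meets the (assumed) threshold, best match first."""
    query = to_concepts(suspicious)
    scored = [(s, jaccard(query, to_concepts(s))) for s in sources]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sources = ["the cat eats the fish", "the dog sleeps"]
    for text, score in retrieve_candidates("le chat mange le poisson", sources):
        print(f"{score:.2f}  {text}")
```

Because both texts are reduced to concept IDs before comparison, no translation step is needed in this sketch. Candidates returned by such a retrieval step would then be passed to a finer-grained comparison.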

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2304.01352

Last updated on 10 April 2023