MQVC: Measuring Quranic Verses Similarity and Sura Classification Using N-Gram

Abstract

Extensive research in Information Retrieval has concentrated on developing retrieval systems for the Arabic language, drawing on a range of natural language processing and information retrieval methodologies. However, little effort has been devoted in those areas to knowledge extraction from the Muslim holy book, the Quran. In this paper, we present an approach (MQVC) for retrieving the verses most similar to a verse that the user supplies as a query. To demonstrate the accuracy of our approach, we performed a set of experiments and compared the results with an evaluation by a Quran specialist, who manually identified all chapters and verses relevant to the target verse in our study. The MQVC approach was applied to 70 of the 114 Quran chapters. We picked 40 verses at random and calculated precision to evaluate the accuracy of our approach. We then extended the work by using N-grams in an experiment with a machine learning algorithm (the LibSVM classifier in Weka) to classify Quran chapters according to the most common scholarly classification: Makki and Madani chapters.
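As an illustration of the kind of N-gram comparison described above, the following is a minimal sketch rather than the MQVC implementation. It assumes character trigrams and Jaccard overlap as the similarity measure, neither of which is specified in this abstract; the function names (`char_ngrams`, `jaccard`, `rank_verses`, `precision`) are hypothetical and chosen only for the example.

```python
from typing import List, Set, Tuple


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a whitespace-normalized string.

    Assumption: character-level n-grams; the abstract does not say whether
    the n-grams are taken over characters or words.
    """
    normalized = " ".join(text.split())
    if len(normalized) < n:
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two n-gram sets (0.0 when both are empty).

    Assumption: Jaccard is a stand-in for whatever similarity score is used.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_verses(query: str, verses: List[str], n: int = 3) -> List[Tuple[float, str]]:
    """Score every candidate verse against the query, most similar first."""
    q = char_ngrams(query, n)
    scored = [(jaccard(q, char_ngrams(v, n)), v) for v in verses]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def precision(retrieved: List[str], relevant: Set[str]) -> float:
    """Fraction of retrieved verses that the specialist marked as relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for v in retrieved if v in relevant) / len(retrieved)
```

Under these assumptions, the top-ranked results from `rank_verses` for a query verse would be passed to `precision` together with the set of verses the specialist judged relevant.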
