1,981 research outputs found

    MQVC: Measuring Quranic Verses Similarity and Sura Classification Using N-Gram

    Extensive research in Information Retrieval has concentrated on developing retrieval systems for the Arabic language, drawing on different natural language processing and information retrieval methodologies. However, little effort has gone into knowledge extraction from the Holy Muslim book, the Quran. In this paper, we present an approach (MQVC) for retrieving the verses most similar to a verse that the user supplies as a query. To demonstrate the accuracy of our approach, we performed a set of experiments and compared the results with an evaluation by a Quran specialist who manually identified all chapters and verses relevant to the targeted verse in our study. The MQVC approach was applied to 70 of the 114 Quran chapters. We picked 40 verses at random and calculated the precision to evaluate the accuracy of our approach. We also used N-grams to extend the work with a machine learning experiment (the LibSVM classifier in Weka) that classifies Quran chapters according to the most common scholarly classification: Makki and Madani chapters.
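
The abstract does not say which N-gram similarity measure MQVC uses, so the following is only a minimal sketch of the general idea: it assumes character trigrams and Jaccard overlap, and the function names (`char_ngrams`, `ngram_similarity`, `rank_verses`) are hypothetical.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a whitespace-normalized string."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap between the n-gram sets of two strings (0.0 to 1.0).

    Assumption: Jaccard is used here for illustration only; the paper's
    actual similarity measure is not stated in the abstract.
    """
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def rank_verses(query, verses, n=3, top_k=3):
    """Return the top_k (score, verse) pairs most similar to the query."""
    scored = [(ngram_similarity(query, verse, n), verse) for verse in verses]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

Precision over a sample of query verses could then be estimated by comparing the ranked verses against a specialist's list of relevant verses.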

    K-means variations analysis for translation of English Tafseer Al-Quran text

    Text mining is a powerful modern technique used to obtain interesting information from huge datasets. Text clustering is used to group documents that share the same themes or topics. The absence of ground truth for the datasets forces the use of clustering (unsupervised learning) rather than alternatives such as classification (supervised learning). The “no free lunch” (NFL) theorem holds that no algorithm outperforms the others across a variety of conditions (several datasets). This study aims to analyze three variations of the k-means clustering algorithm (k-means, mini-batch k-means, and k-medoids) at the clustering stage. Six datasets were used, drawn from the English translation of chapter Al-Baqarah (286 verses) at the preprocessing stage. Moreover, feature selection used term frequency–inverse document frequency (TF-IDF) to weight the terms. At the final stage, five internal cluster validation metrics were applied: silhouette coefficient (SC), Calinski-Harabasz index (CHI), C-index (CI), Dunn’s index (DI), and Davies-Bouldin index (DBI), along with execution time (ET). The experiments showed that k-medoids outperformed the other two algorithms in terms of ET only. In contrast, no algorithm was superior to the others in terms of the clustering itself across the six datasets, which agrees with the NFL theorem.
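
As an illustration of the TF-IDF weighting step, here is a minimal sketch. It assumes whitespace tokenization and the plain `tf * log(N / df)` variant; the study's exact preprocessing and TF-IDF formula are not given in the abstract, and the function name `tfidf` is hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df).

    Assumption: tf is the term count divided by the document length, and
    idf is the unsmoothed natural log of N / df.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document gets weight zero under this variant, which is why very common words contribute nothing to the distance between verses.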

    A Statistical Learning Approach to Evidence the Acoustic Miracles in the Holy Quran Using Audio Features

    This paper presents a novel approach for exploring the intrinsic acoustic properties of the Holy Quran, in an attempt to provide one more piece of evidence of the miraculous nature of the Quran. The study uses a dataset composed of recitations of three chapters of the Quran by seven prominent reciters. A novel statistical approach is used to detect the correlation between the reciters' recitations of the three chapters (Quranic Surahs). The study uses Mel-Frequency Cepstral Coefficient (MFCC) features to detect common patterns among the recitations. The main measures used in this study are the correlation and the Euclidean Distance (ED) between the means of the MFCCs and of the delta-delta MFCCs. The study reveals a strong correlation and short distance between all recitations for one verse at a time, and relatively high correlation and short distance for two or more verses. Furthermore, the study lays a foundation for detecting and formulating acoustic clusters for sequential verses in the Holy Quran.
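
The two measures named in the abstract can be sketched as follows. This assumes the MFCC frames have already been extracted by some audio toolkit and are available as lists of floats; the helper names are hypothetical and Pearson correlation is an assumption, since the abstract does not say which correlation coefficient is used.

```python
import math


def mean_vector(frames):
    """Average a list of equal-length per-frame feature vectors."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pearson_correlation(u, v):
    """Pearson correlation between two equal-length vectors."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_u * var_v)
```

Two recitations would be compared by passing the `mean_vector` of each one's MFCC frames to both functions; a high correlation together with a short distance is the pattern the abstract reports.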

    A Machine Learning Model for the Identification of the Holy Quran Reciter Utilizing K-Nearest Neighbor and Artificial Neural Networks

    Holy Quran Reciter Identification refers to identifying the reciter of the Holy Quran based on various features of the acoustic wave. The Holy Quran is the Muslim community's Holy Book. Listening to or reading the Holy Quran is one of the obligatory activities for Muslims. This research proposes a machine learning model for identifying the Holy Quran reciter. The presented system comprises the essential phases of a voice recognition system: data acquisition, preprocessing, feature extraction, and classification. The voices of ten known reciters form the dataset in this research. The reciters are leaders of prayers in the Holy masjids of Madinah and Makkah. The audio dataset is analyzed using mel frequency cepstral coefficients (MFCC). The artificial neural network (ANN) and k-nearest neighbor (KNN) classifiers are employed for classification. Pitch is used as the feature to train the KNN and ANN classifiers. The proposed system is validated using two chapters chosen from the Holy Quran. The results revealed an excellent level of accuracy. With the ANN classifier, the proposed system achieved 98.5% accuracy for chapter 7 and 97.2% accuracy for chapter 32. With KNN, the accuracy is 97.02% for chapter 7 and 96.07% for chapter 32. The system’s performance is then compared with the use of support vector machines (SVM) for Quranic reciter recognition. The comparison revealed that ANN is a better machine learning algorithm for voice recognition than SVM.
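
To make the KNN classification step concrete, here is a minimal sketch of a k-nearest-neighbor majority vote. The feature vectors, labels, and the function name `knn_predict` are hypothetical; the paper's actual features, distance metric, and value of k are not specified in the abstract, so Euclidean distance and k = 3 are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training vectors (Euclidean distance)."""
    distances = sorted(
        ((math.dist(features, query), label)
         for features, label in zip(train_features, train_labels)),
        key=lambda pair: pair[0],
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

For example, with one-dimensional pitch-like features, a query is assigned to whichever reciter owns most of its three closest training samples.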

    Mobile technology and Islamic education for non-native Arabic children

    Most religions practised at present are grounded in deep history and tradition. Sculptures and writings have been passed down through generations and are integral to the sanctity of the religion. In addition, the use of digital technology can provide us with a mechanism to not only maintain the consistency of the teaching but also establish a real-time learning and understanding process for a variety of users. Islam is one of the largest religions in the world, with almost one fifth of the world’s population being of the Muslim faith. The Qur’an is the holy book for millions of Muslims around the world and is read and learnt in the Arabic language. This thesis aims to provide a strong technology base to improve the teaching experience for school teachers and religious scholars who educate non-native Arabic children about the holy book, the Qur’an, and to improve the progress curve for those children by addressing their circumstances. It should be noted that there was no prior knowledge about those children and the difficulties that language, background, and cultural barriers cause for their learning abilities. In this thesis, we build a web simulator based on a reinforcement learning mechanism and use a speech recognition system to achieve our intended goal. Unlike other studies, this research is built on User-Centred Design, where data was collected from both types of instructors, school teachers and religious scholars, to help build the system. After the system was built, the same types of instructors examined the system before they used it with children.
Thus, this research was completed through three studies: a qualitative study with school teachers and religious scholars to collect data about what the design should address, a qualitative study with both types of teachers to examine the system, and finally, a quantitative study completed by teachers on the use of the system by children. The results were extremely positive in terms of providing teachers with a system which is capable of supporting and improving their teaching experience as well as ensuring an incredible improvement in children’s performance in their Islamic education.

    Parallel corpus multi stream question answering with applications to the Qu'ran

    Question-Answering (QA) is an important research area concerned with developing an automated process that answers questions posed by humans in a natural language. QA is a shared task for the Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP) communities. A technical review of different QA system models and methodologies reveals that a typical QA system consists of different components to accept a natural language question from a user and deliver its answer(s) back to the user. Existing systems have usually been aimed at structured/unstructured data collected from everyday English text, i.e. text collected from television programmes, news wires, conversations, novels and other similar genres. Despite all up-to-date research in the subject area, a notable fact is that none of the existing QA systems has been tested on a parallel corpus of religious text with the aim of question answering. Religious text has peculiar characteristics and features which make it more challenging for traditional QA methods than other kinds of text. This thesis proposes the PARMS (Parallel Corpus Multi Stream) Methodology, a novel method that applies existing advanced IR techniques and combines them with NLP methods and additional semantic knowledge to implement QA for a parallel corpus. A parallel corpus involves the use of multiple forms of the same corpus where each form differs from the others in a certain aspect, e.g. translations of a scripture from one language to another by different translators. Additional semantic knowledge can be referred to as a stream of information related to a corpus. PARMS uses multiple streams of semantic knowledge, including a general ontology (WordNet) and domain-specific ontologies (QurTerms, QurAna, QurSim). This additional knowledge has been used in embedded form for Query Expansion, Corpus Enrichment and Answer Ranking.
The PARMS Methodology has wider applications. This thesis applies it to the Quran, the core text of Islam, as a first case study. The PARMS Method uses a parallel corpus comprising ten different English translations of the Quran. An individual Quranic verse is treated as an answer to questions asked in a natural language, English. This thesis also implements the PARMS QA Application as a proof of concept for the PARMS methodology. The PARMS Methodology aims to evaluate the range of semantic knowledge streams separately and in combination, and also to evaluate alternative subsets of the data source: QA from one stream vs. the parallel corpus. Results show that the use of a parallel corpus and multiple streams of semantic knowledge has obvious advantages. To the best of my knowledge, this method is developed for the first time and it is expected to be a benchmark for further research in the area.

    A review and open issues of diverse text watermarking techniques in spatial domain

    Nowadays, information hiding is becoming a helpful technique and is attracting more attention due to the rapid growth of internet use; it is applied to send secret information using different techniques. Watermarking is one of the most important techniques in information hiding. Watermarking hides secret data in a carrier medium to protect the privacy and integrity of the information so that no one except the sender and receiver can recognize or detect it. In watermarking, various carrier formats can be used, such as image, video, audio, and text. Text is the most popular carrier because of its prevalence on the internet. There are many techniques available for text watermarking; each has its own strengths and weaknesses. In this study, we conducted a review of text watermarking in the spatial domain to explore the term text watermarking by reviewing, collecting, synthesizing, and analyzing the challenges of different studies related to this area published from 2013 to 2018. The aims of this paper are to provide an overview of text watermarking and a comparison of the approved studies according to Arabic text characters, payload capacity, imperceptibility, authentication, and embedding technique, and to open important research issues for future work toward a robust method.

    Can bank interaction during rating measurement of micro and very small enterprises ipso facto Determine the collapse of PD status?

    This paper begins with an analysis of trends - over the period 2012-2018 - for total bank loans, non-performing loans, and the number of active, working enterprises. A survey was conducted on national data from Italy, with a comparison developed on a local subset from the Sardinia Region. Empirical evidence appears to support the hypothesis of the paper: can the rating class assigned by banks - using current IRB and A-IRB systems - to micro and very small enterprises, whose ability to replace financial resources using endogenous means is structurally impaired, ipso facto orient the results of performance in the same terms of PD assigned by the algorithm, thereby upending the principle of cause and effect? The thesis is developed through mathematical modeling that demonstrates the interaction of the measurement tool (the rating algorithm applied by banks) on the collapse of the loan status (default, performing, or some intermediate point) of the assessed micro-entity. In conclusion, emphasis is given to the phenomenon using evidence of the intrinsically mutualistic link between the two populations of banks and (micro) enterprises, provided by a system of differential equations.

    Building a neural speech recognizer for quranic recitations

    This work is an effort towards building a Neural Speech Recognizer (NSR) system for Quranic recitations that can be used effectively by anyone regardless of gender and age. Although many recitations are available online, most of them are recorded by professional adult male reciters, which means that an ASR system trained on such datasets would not work for female or child reciters. We address this gap by adopting a benchmark dataset of audio recordings of Quranic recitations by both genders and different ages. Using this dataset, we build several speaker-independent NSR systems based on the DeepSpeech model and use word error rate (WER) to evaluate them. The goal is to show how an NSR system trained and tuned on a dataset of one gender performs on a test set from the other gender. Unfortunately, the number of female recitations in our dataset is rather small while the number of male recitations is much larger. In the first set of experiments, we avoid the imbalance between the two genders by down-sampling the male part to match the female part. For this small subset of our dataset, the results are interesting, with 0.968 WER when the system is trained on male recitations and tested on female recitations. The same system gives 0.406 WER when tested on male recitations. On the other hand, training the system on female recitations and testing it on male recitations gives 0.966 WER, while testing it on female recitations gives 0.608 WER.
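
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. A minimal sketch, assuming whitespace tokenization (the function name `word_error_rate` is hypothetical and this is not the authors' evaluation code):

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so
    # far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition a WER near 1.0, as in the cross-gender results above, means that almost every reference word required an edit.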