Relationship Analysis of Keyword and Chapter in Malay-Translated Tafseer of Al-Quran
A number of studies have examined the hidden knowledge categories and the relationships among subject matters discussed in the Al-Quran or its Tafseer. This research investigates the relationships between verses and chapters at the keyword level in a Malay-translated Tafseer. A technique combining text mining and network analysis is developed to discover non-trivial patterns and relationships between verses and chapters in the Tafseer. This is achieved through keyword extraction, keyword-chapter relationship discovery and keyword-chapter network analysis. A total of 130 keywords were extracted from six chapters of the Tafseer. The keywords and their relative importance to a chapter are computed using term weighting. A network analysis map was generated to visualize and analyze the relationships between keywords and chapters in the Tafseer. The relationships between verses and chapters at the keyword level are successfully portrayed through the combined text mining and network analysis technique. The novelty of this approach lies in the discovery of relationships between verses and chapters that are useful for grouping related chapters together.
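A minimal sketch of how such a keyword-chapter network can be built, assuming the chapter texts are available as plain strings and using scikit-learn's TF-IDF weighting with networkx for the graph (the study's own tooling is not specified):

```python
# Illustrative only: weight keywords per chapter with TF-IDF and build a
# keyword-chapter network with networkx. Chapter texts here are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
import networkx as nx

chapters = {
    "Al-Fatihah": "praise lord mercy guidance straight path",
    "Al-Baqarah": "believe guidance prayer charity unseen lord",
}

vectorizer = TfidfVectorizer(max_features=130)   # keyword budget as in the study
tfidf = vectorizer.fit_transform(chapters.values())
keywords = vectorizer.get_feature_names_out()

# Bipartite graph: an edge links a chapter to a keyword, weighted by TF-IDF,
# so chapters sharing high-weight keywords can be grouped together.
G = nx.Graph()
for chapter, row in zip(chapters, tfidf.toarray()):
    for keyword, weight in zip(keywords, row):
        if weight > 0:
            G.add_edge(chapter, keyword, weight=float(weight))

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```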
Mini-batch k-Means versus k-Means to Cluster English Tafseer Text: View of Al-Baqarah Chapter
Al-Quran is the primary text of Muslims' religion and practice. Millions of Muslims around the world use the al-Quran as their reference guide, so knowledge can be obtained from it by Muslims and Islamic scholars in general. The al-Quran has been translated into various languages of the world, for example English, by several translators. Each translator brings his own ideas, comments and statements to the translation of the verses (Tafseer). Therefore, this paper attempts to cluster the translation of the Tafseer using text clustering, a text mining method that groups related documents into the same cluster. The study adapted two clustering algorithms, mini-batch k-means and k-means, to explain and define the links between keywords, known as features or concepts, for the Al-Baqarah chapter of 286 verses. Data preprocessing and feature extraction using TF-IDF (Term Frequency-Inverse Document Frequency) and PCA (Principal Component Analysis) were applied to this dataset. The results show two- and three-dimensional clustering plots assigning seven cluster categories (k=7) for the Tafseer. The execution time of the mini-batch k-means algorithm (0.05485s) outperforms that of the k-means algorithm (0.23334s). Finally, the features 'god', 'people', and 'believe' were the most frequent features.
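The comparison can be illustrated with scikit-learn, assuming a small placeholder corpus in place of the 286 Al-Baqarah verses; timings and cluster contents will of course differ from the figures reported above:

```python
# Placeholder verses stand in for the 286 verses of Al-Baqarah.
import time
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, MiniBatchKMeans

verses = [
    "god people believe", "pray and give charity", "guidance for the righteous",
    "they believe in the unseen", "god is merciful", "establish the prayer",
    "spend of what god has provided", "those are upon guidance from their lord",
]

# TF-IDF features reduced with PCA to two components for plotting.
X = TfidfVectorizer().fit_transform(verses).toarray()
X2 = PCA(n_components=2).fit_transform(X)

for Algo in (KMeans, MiniBatchKMeans):
    start = time.perf_counter()
    labels = Algo(n_clusters=7, random_state=0, n_init=10).fit_predict(X2)
    print(Algo.__name__, f"{time.perf_counter() - start:.5f}s", labels)
```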
Predictive trend mining for social network analysis
This thesis describes research within the theme of trend mining as applied to social network data. Trend mining is a type of temporal data mining that provides insight into how information changes over time; in the context of this thesis the focus is on how information contained in social networks changes with time. The work proposes a number of data mining based techniques directed at mechanisms not only to detect change, but also to support the analysis of change, with respect to social network data. To this end a trend mining framework, the Predictive Trend Mining Framework (PTMF), is proposed to act as a vehicle for evaluating the ideas presented in the thesis. It is designed to support "end-to-end" social network trend mining and analysis. The work is divided into two elements: Frequent Pattern Trend Analysis (FPTA) and Prediction Modeling (PM). For evaluation purposes three social network datasets have been considered: Great Britain Cattle Movement, Deeside Insurance and Malaysian Armed Forces Logistic Cargo. The evaluation indicates that a sound mechanism for identifying and analysing trends, and for using this trend knowledge for prediction purposes, has been established.
K-means variations analysis for translation of English Tafseer Al-Quran text
Text mining is a powerful modern technique used to obtain interesting information from huge datasets. Text clustering is used to distinguish between documents that share the same themes or topics. The absence of ground truth for the datasets enforces the use of clustering (unsupervised learning) rather than alternatives such as classification (supervised learning). The "no free lunch" (NFL) theorem posits that no algorithm outperforms all others across a variety of conditions (several datasets). This study analyzes three variations of the k-means clustering algorithm (k-means, mini-batch k-means, and k-medoids) at the clustering stage. At the preprocessing stage, six datasets derived from the 286 verses of the English translation of the Al-Baqarah chapter were analyzed. Feature selection used term frequency-inverse document frequency (TF-IDF) to obtain term weights. At the final stage, five internal cluster validation metrics were computed: silhouette coefficient (SC), Calinski-Harabasz index (CHI), C-index (CI), Dunn's index (DI) and Davies-Bouldin index (DBI), together with execution time (ET). The experiments showed that k-medoids outperformed the other two algorithms in terms of ET only. In contrast, no algorithm is superior to the others in terms of the clustering process across the six datasets, which confirms the NFL theorem's assumption.
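A sketch of the validation stage under simplifying assumptions: only the metrics available in scikit-learn (SC, CHI, DBI) are shown, C-index and Dunn's index would need separate implementations, k-medoids would come from an add-on such as scikit-learn-extra's KMedoids, and synthetic blobs stand in for the TF-IDF vectors:

```python
import time
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.metrics import (silhouette_score, calinski_harabasz_score,
                             davies_bouldin_score)

# 286 synthetic points stand in for the verse vectors of one dataset.
X, _ = make_blobs(n_samples=286, n_features=20, random_state=0)

for Algo in (KMeans, MiniBatchKMeans):
    start = time.perf_counter()
    labels = Algo(n_clusters=7, random_state=0, n_init=10).fit_predict(X)
    et = time.perf_counter() - start
    print(Algo.__name__,
          f"SC={silhouette_score(X, labels):.3f}",
          f"CHI={calinski_harabasz_score(X, labels):.1f}",
          f"DBI={davies_bouldin_score(X, labels):.3f}",
          f"ET={et:.4f}s")
```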
GIS-Based Method for Finding Optimal Ocean Energy Location: A Case Study of Terengganu State
Ocean energy is one of the most important renewable energy sources and can contribute substantially to the supply of the world's electricity demand. This paper presents a method of locating the highest-potential sources of ocean energy by implementing a Geographic Information System (GIS). The aim of this study was to find the optimal wave energy location in the coastal area of Terengganu state in Malaysia. Wave data for the years 2015-2017 were collected. The GIS was adopted to prepare the data for analysis and to perform geostatistical analysis. The results show the exact locations along the Terengganu coast at which maximum energy can be harvested from the ocean. The proposed methodology can be applied to other coastal areas.
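As an illustration of the geostatistical step only, a simple inverse-distance-weighted interpolation of hypothetical wave-power observations is sketched below; the study itself performs this kind of surface analysis within a GIS package, and its exact interpolation method is not stated:

```python
# Hypothetical buoy observations: (longitude, latitude, wave power in kW/m).
import numpy as np

obs = np.array([[103.1, 5.3, 8.2], [103.4, 5.1, 6.9], [103.2, 4.8, 7.5]])

def idw(grid_points, obs, power=2):
    """Inverse-distance-weighted interpolation of observed values at grid points."""
    dists = np.linalg.norm(grid_points[:, None, :] - obs[None, :, :2], axis=2)
    weights = 1.0 / np.maximum(dists, 1e-9) ** power
    return (weights @ obs[:, 2]) / weights.sum(axis=1)

# Regular grid over a small stretch of the Terengganu coast (made-up extent).
lon, lat = np.meshgrid(np.linspace(103.0, 103.5, 6), np.linspace(4.7, 5.4, 6))
grid = np.column_stack([lon.ravel(), lat.ravel()])
surface = idw(grid, obs)
print("highest-potential cell:", grid[surface.argmax()], round(surface.max(), 2))
```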
A case study in knowledge acquisition for logistic cargo distribution data mining framework
Knowledge acquisition is one of the important aspects of Knowledge Discovery in Databases, ensuring that correct and interesting knowledge is extracted and represented to stakeholders and decision makers. The process can be undertaken using several techniques; in this study, data mining is used to extract the knowledge patterns, and the discovered knowledge is described using an ontology-based representation. In this paper, a Logistic Cargo Distribution dataset is selected for the experiment. The dataset describes the shipment of logistic items for the Malaysian Army.
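A minimal, hypothetical sketch of the representation step with rdflib; the class and property names, the pattern and its support value are invented for illustration and will differ from the paper's ontology:

```python
# Encode one mined shipment pattern as RDF triples (all names are made up).
from rdflib import Graph, Literal, Namespace, RDF

LOG = Namespace("http://example.org/logistics#")
g = Graph()
g.bind("log", LOG)

pattern = LOG["pattern1"]
g.add((pattern, RDF.type, LOG.FrequentShipmentPattern))
g.add((pattern, LOG.hasItemClass, Literal("rations")))
g.add((pattern, LOG.hasDestinationDepot, Literal("KD-07")))
g.add((pattern, LOG.hasSupport, Literal(0.18)))

print(g.serialize(format="turtle"))
```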
Finding Temporal Patterns in Noisy Longitudinal Data: A Study in Diabetic Retinopathy
This paper describes an approach to temporal pattern mining that uses user-defined temporal prototypes to define the nature of the trends of interest. The temporal patterns are defined in terms of sequences of support values associated with identified frequent patterns. The prototypes are defined mathematically so that they can be mapped onto the temporal patterns. The focus for the advocated temporal pattern mining process is a large longitudinal patient database collected as part of a diabetic retinopathy screening programme. The dataset is, in itself, also of interest as it is very noisy (in common with other similar medical datasets) and does not feature a clear association between specific time stamps and subsets of the data. The diabetic retinopathy application, the data warehousing and cleaning process, and the frequent pattern mining procedure (together with the application of the prototype concept) are all described in the paper. An evaluation of the frequent pattern mining process is also presented.
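The prototype idea can be sketched as follows, with an invented support sequence and three simple mathematically defined prototypes; the paper's prototypes and matching procedure are richer than this:

```python
# One frequent pattern's support tracked over five time stamps (values invented);
# prototypes are simple normalised shapes matched by mean squared error.
import numpy as np

support = np.array([0.12, 0.15, 0.19, 0.22, 0.27])
t = np.arange(len(support))

prototypes = {
    "increasing": t / t.max(),
    "decreasing": 1 - t / t.max(),
    "flat": np.full(len(t), 0.5),
}

def mismatch(series, proto):
    # Normalise the support sequence to [0, 1] before comparing shapes.
    s = (series - series.min()) / (series.max() - series.min() + 1e-12)
    return np.mean((s - proto) ** 2)

best = min(prototypes, key=lambda name: mismatch(support, prototypes[name]))
print("closest prototype:", best)   # expected: increasing
```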
Text analytics of unstructured textual data: A study on military peacekeeping document using R text mining package
This paper describes a text analytics technique applied to peacekeeping documents to discover the significant text patterns that exist in them. These documents are considered unstructured textual data. The paper proposes a framework that consists of three stages: (i) data collection, (ii) document preprocessing, and (iii) text analytics and visualization. The technique is developed using the R text mining package for the text analytics experiments.
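The paper implements this pipeline with the R text mining package; an analogous three-stage flow is sketched in Python below for illustration only, with placeholder strings standing in for the peacekeeping documents:

```python
import re
from collections import Counter

docs = [
    "Peacekeeping patrol conducted along the buffer zone.",
    "Logistics convoy supported the peacekeeping patrol.",
]  # stage (i): data collection (placeholder documents)

STOPWORDS = {"the", "a", "an", "of", "along", "and"}

def preprocess(text):
    # stage (ii): lower-case, strip punctuation and numbers, remove stop words
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

# stage (iii): simple text analytics - term frequencies across the corpus
freq = Counter(token for d in docs for token in preprocess(d))
print(freq.most_common(5))
```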
An Analysis of Energy Mix in Peninsular Malaysia in Line with the Malaysia's Existing Energy Policy
This paper considers dynamic changes in the energy mix available in Peninsular Malaysia with respect to Malaysia's energy policies and evaluates these on an experimental basis. The research applies a data mining approach, the Self-Organizing Map (SOM) algorithm, for trend cluster analysis of time series data. The approach can uncover relationships between data attributes, uncover relationships between observations, predict the outcome of future observations, and learn how best to react to situations through trial and error using reinforcement learning. The test results show that the application is able to accommodate large datasets and produce trend line graphs; at the same time, a clearer picture of the scenarios and the latest trends in the energy mix of Peninsular Malaysia was successfully obtained. The results indicate that the Malaysian government should strengthen the realization and implementation of its energy policy, and that Malaysia still has considerable potential to fully utilise its renewable energy resources.
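A hedged sketch of the SOM step, with invented yearly energy-mix shares as the time-series input and the MiniSom package standing in for whatever SOM implementation the study used:

```python
# Hypothetical rows: [coal, natural gas, hydro, other renewables] share per year.
import numpy as np
from minisom import MiniSom

mix = np.array([
    [0.40, 0.47, 0.10, 0.03],
    [0.42, 0.45, 0.10, 0.03],
    [0.44, 0.42, 0.10, 0.04],
    [0.45, 0.40, 0.10, 0.05],
])

som = MiniSom(x=3, y=3, input_len=mix.shape[1], sigma=1.0, learning_rate=0.5,
              random_seed=0)
som.train_random(mix, num_iteration=500)

# Years mapped to the same SOM node share a similar energy-mix profile,
# which is how trend clusters emerge over time.
for i, row in enumerate(mix):
    print(f"series row {i} -> node {som.winner(row)}")
```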