
    Taxonomy learning from Malay texts using artificial immune system based clustering

    In taxonomy learning from texts, the extracted features that describe the context of a term are usually erroneous and sparse. Various attempts have been made to overcome data sparseness and noise using clustering algorithms such as Hierarchical Agglomerative Clustering (HAC), Bisecting K-means and Guided Agglomerative Hierarchical Clustering (GAHC). However, these methods suffer from low recall. Therefore, the purpose of this study is to investigate the application of two hybridized artificial immune system (AIS) algorithms in taxonomy learning from Malay texts, and to develop a Google-based Text Miner (GTM) for feature selection to reduce data sparseness. Two novel taxonomy learning algorithms have been proposed and compared with the benchmark methods (i.e., HAC, GAHC and Bisecting K-means). The first algorithm, GCAINT (Guided Clustering and aiNet for Taxonomy Learning), is a hybridization of GAHC and the Artificial Immune Network (aiNet). GCAINT exploits a Hypernym Oracle (HO) to guide the hierarchical clustering process and produces better results than the benchmark methods. However, the Malay HO introduces erroneous hypernym-hyponym pairs, which affects the result. Therefore, a second novel algorithm, CLOSAT (Clonal Selection Algorithm for Taxonomy Learning), is proposed by hybridizing the Clonal Selection Algorithm (CLONALG) with Bisecting K-means. CLOSAT produces the best results compared with the benchmark methods and GCAINT. To reduce sparseness in the obtained dataset, the GTM is proposed. However, the experimental results reveal that GTM introduces too much noise into the dataset, which leads to many false positive hypernym-hyponym pairs. The effect of different combinations of affinity measures (i.e., Hamming, Jaccard and Rand) on the performance of the developed methods was also studied. Jaccard is found to be better than Hamming and Rand at measuring the similarity between terms.
In addition, the use of Particle Swarm Optimization (PSO) for automatic parameter tuning of GCAINT and CLOSAT was also proposed. Experimental results demonstrate that, in most cases, PSO-tuned CLOSAT and GCAINT produce better results than the benchmark methods and are able to reduce data sparseness and noise in the dataset.
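    The finding that Jaccard outperforms Hamming as an affinity measure can be illustrated on binary term-context vectors. A minimal sketch, assuming hypothetical feature vectors (not the paper's dataset): Jaccard ignores positions where both vectors are zero, so it is not inflated by the many shared absences in sparse data, while Hamming rewards agreement on zeros.

```python
def jaccard(a, b):
    # |intersection| / |union| over the positions set to 1
    inter = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    union = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return inter / union if union else 0.0

def hamming(a, b):
    # fraction of positions where the two vectors agree (0-0 counts too)
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

# Hypothetical binary context-feature vectors for two terms
t1 = [1, 0, 1, 1, 0, 0, 0, 0]
t2 = [1, 0, 0, 1, 0, 0, 0, 0]
print(jaccard(t1, t2))  # 2 shared features / 3 in union ≈ 0.667
print(hamming(t1, t2))  # 7 of 8 positions agree = 0.875
```

    Note how Hamming looks high mainly because of the shared zeros, which dominate sparse term-context vectors.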

    Self-adaptive Based Model for Ambiguity Resolution of The Linked Data Query for Big Data Analytics

    Integration of heterogeneous data sources is a crucial step in big data analytics, but it creates ambiguity during mapping between the sources due to variation in query terms, data structure and granularity conflicts. However, there is limited research on effective big data integration that addresses this ambiguity issue. This paper introduces a self-adaptive model for big data integration that exploits the data structure during querying in order to mitigate and resolve ambiguities. An assessment of preliminary work on the Geography and Quran datasets is reported to illustrate the feasibility of the proposed model and to motivate future work such as solving complex queries.

    Normalization of common noisy terms in Malaysian online media

    This paper proposes a normalization technique for noisy terms that occur in Malaysian micro-texts. Noisy terms are common in online messages and influence the results of activities such as text classification and information retrieval. Even though many researchers have studied methods to solve this problem, few have looked into it for languages other than English. In this study, about 5000 noisy texts were extracted from 15000 documents created by Malaysians. The normalization process was executed using specific translation rules as part of the preprocessing steps in opinion mining of movie reviews. The result shows up to a 5% improvement in the accuracy of opinion mining.
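    The translation-rule step described above amounts to a lookup table applied token by token. A minimal sketch, assuming a hypothetical rule table (the entries below are common Malay short forms used for illustration, not the paper's actual rules):

```python
# Hypothetical translation rules mapping common Malay noisy terms
# to their standard forms (illustrative entries only)
RULES = {
    "x": "tidak",     # "not"
    "sy": "saya",     # "I"
    "yg": "yang",     # "that/which"
    "dgn": "dengan",  # "with"
}

def normalize(text):
    # replace each token found in the rule table with its standard form;
    # tokens without a rule pass through unchanged
    return " ".join(RULES.get(tok, tok) for tok in text.split())

print(normalize("sy x suka dgn filem yg itu"))
# → "saya tidak suka dengan filem yang itu"
```

    In practice the rule table would be built from the corpus analysis the paper describes, rather than written by hand.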

    An evolutionary variable neighbourhood search for the unrelated parallel machine scheduling problem

    This article addresses a challenging industrial problem known as the unrelated parallel machine scheduling problem (UPMSP) with sequence-dependent setup times. In UPMSP, we have a set of machines and a group of jobs, and the goal is to find the optimal way to schedule jobs for execution on one of the several available machines. UPMSP has been classified as an NP-hard optimisation problem and thus cannot be solved by exact methods in reasonable time. Meta-heuristic algorithms are commonly used to find sub-optimal solutions. However, large-scale UPMSP instances pose a significant challenge to meta-heuristic algorithms. To effectively solve large-scale UPMSP instances, this article introduces a two-stage evolutionary variable neighbourhood search (EVNS) methodology. The proposed EVNS adaptively integrates a variable neighbourhood search algorithm and an evolutionary descent framework. The evolutionary framework is employed in the first stage and uses a mix of crossover and mutation operators to generate diverse solutions. In the second stage, an adaptive variable neighbourhood search exploits the area around the solutions generated in the first stage. A dynamic strategy determines the switching time between the two stages, and a diversity-based fitness function guides the search towards promising, unexplored areas of the search landscape. We demonstrate the competitiveness of the proposed EVNS with computational results and comparisons on the 1640 UPMSP benchmark instances commonly used in the literature. The experimental results show that our EVNS obtains better results than the compared algorithms on several UPMSP instances.
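    The neighbourhood-switching idea can be sketched on a drastically simplified UPMSP that ignores sequence-dependent setup times. This is a plain variable neighbourhood descent with shaking, not the paper's full EVNS (no evolutionary stage, crossover operators, or diversity-based fitness); all instance data below are hypothetical.

```python
import random

def makespan(assign, proc):
    # assign[j] = machine of job j; proc[j][m] = time of job j on machine m
    # (sequence-dependent setup times are omitted in this toy version)
    loads = [0] * len(proc[0])
    for j, m in enumerate(assign):
        loads[m] += proc[j][m]
    return max(loads)

def vns(proc, iters=100, seed=0):
    rng = random.Random(seed)
    n_jobs, n_mach = len(proc), len(proc[0])
    best = [rng.randrange(n_mach) for _ in range(n_jobs)]
    best_cost = makespan(best, proc)
    for _ in range(iters):
        # shake: perturb the incumbent by reassigning one random job
        cur = best[:]
        cur[rng.randrange(n_jobs)] = rng.randrange(n_mach)
        # variable neighbourhood descent over two neighbourhoods
        k = 0
        while k < 2:
            improved = False
            if k == 0:  # neighbourhood 1: move one job to another machine
                for j in range(n_jobs):
                    for m in range(n_mach):
                        cand = cur[:]
                        cand[j] = m
                        if makespan(cand, proc) < makespan(cur, proc):
                            cur, improved = cand, True
            else:       # neighbourhood 2: swap the machines of two jobs
                for i in range(n_jobs):
                    for j in range(i + 1, n_jobs):
                        cand = cur[:]
                        cand[i], cand[j] = cand[j], cand[i]
                        if makespan(cand, proc) < makespan(cur, proc):
                            cur, improved = cand, True
            k = 0 if improved else k + 1  # restart neighbourhoods on success
        if makespan(cur, proc) < best_cost:
            best, best_cost = cur[:], makespan(cur, proc)
    return best, best_cost

# 4 hypothetical jobs on 2 machines; proc[j][m] is the processing time
proc = [[4, 2], [3, 5], [1, 6], [2, 2]]
best, cost = vns(proc)
print(best, cost)  # optimal makespan for this instance is 4
```

    The swap neighbourhood matters: some assignments (e.g. only job 1 on machine 1 here) are local optima for single-job moves but escape via a swap, which is exactly why VNS cycles through several neighbourhood structures.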

    Normalization of noisy texts in Malaysian online reviews

    The process of gathering useful information from online messages has grown as more and more people use the Internet and online applications such as Facebook and Twitter to communicate with each other. One of the problems in processing online messages is the high number of noisy texts in these messages. A few studies have shown that noisy texts degrade the results of text mining activities. On the other hand, very few works have investigated the patterns of noisy texts created by Malaysians. In this study, a common noisy terms list and an artificial abbreviations list were created using specific rules and were utilized to select candidates of correct words for a noisy term. The correct term was then selected based on a bi-gram word index. The experiments used online messages created by Malaysians. The result shows that normalization of noisy texts using the artificial abbreviations list complements the use of the common noisy terms list.
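    The bi-gram selection step can be sketched as follows: given several candidate expansions for a noisy term, pick the one that forms the most frequent bi-gram with the preceding word. The bi-gram counts and candidates below are hypothetical, not taken from the paper's index.

```python
# Hypothetical bi-gram counts harvested from a clean Malay corpus
BIGRAMS = {
    ("filem", "bagus"): 12,  # "good film" — frequent
    ("filem", "baju"): 1,    # "film shirt" — rare
}

def pick_candidate(prev_word, candidates):
    # choose the expansion forming the most frequent bi-gram with the
    # preceding word; unknown bi-grams score 0
    return max(candidates, key=lambda c: BIGRAMS.get((prev_word, c), 0))

# the noisy term "bgs" could expand to "bagus" (good) or "baju" (shirt)
print(pick_candidate("filem", ["baju", "bagus"]))  # → "bagus"
```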

    Using Bayesian Network for Determining The Recipient of Zakat in BAZNAS Pekanbaru

    The National Amil-Zakat Agency (Baznas) in Pekanbaru collects and distributes zakat in Pekanbaru city. Baznas Pekanbaru should be able to determine Mustahik properly; a Mustahik is a person eligible to receive zakat. The Baznas committee interviews and observes every Mustahik candidate to decide who should receive zakat. The current Mustahik determination process can lead to subjective assessment, due to the large number of zakat recipient applicants and the complexity of the rules for determining a Mustahik. Therefore, this study utilizes artificial intelligence in determining Mustahik, applying the Bayesian Network method as an inference engine. Based on the experimental results, we found that the Bayesian network produces a good accuracy of 93.24% and is effective on a data set with an uneven class distribution. In addition, experiments show that setting the alpha estimator's value between 0.6 and 1.0 can increase the accuracy of the Bayesian Network to 95.95%.
    Keywords—bayesian network, baznas pekanbaru, mustahik, zakat
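    As a rough illustration of how an alpha estimator enters the probability estimates, here is a naive Bayes classifier with Lidstone (alpha) smoothing. This is a simplification of the paper's Bayesian network (naive Bayes assumes feature independence), and the applicant features and labels are hypothetical.

```python
import math
from collections import defaultdict

class NaiveBayes:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (the paper tunes 0.6-1.0)

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.n = {c: y.count(c) for c in self.classes}  # rows per class
        self.total = len(y)
        self.counts = defaultdict(int)       # (class, feature, value) count
        self.values = [set() for _ in X[0]]  # observed values per feature
        for row, c in zip(X, y):
            for i, v in enumerate(row):
                self.counts[(c, i, v)] += 1
                self.values[i].add(v)
        return self

    def predict(self, row):
        def logpost(c):
            lp = math.log(self.n[c] / self.total)  # class prior
            for i, v in enumerate(row):            # smoothed likelihoods
                num = self.counts[(c, i, v)] + self.alpha
                den = self.n[c] + self.alpha * len(self.values[i])
                lp += math.log(num / den)
            return lp
        return max(self.classes, key=logpost)

# Hypothetical applicant records: (income_level, dependants) -> eligible?
X = [("low", "many"), ("low", "few"), ("high", "few"), ("high", "many")]
y = ["yes", "yes", "no", "yes"]
model = NaiveBayes(alpha=0.6).fit(X, y)
print(model.predict(("high", "few")))  # → "no"
```

    With alpha greater than zero, unseen feature-value combinations receive a small non-zero probability instead of zeroing out the whole posterior, which is why tuning alpha affects accuracy on skewed data.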

    Time Series Prediction of Bitcoin Cryptocurrency Price Based on Machine Learning Approach

    Over the past few years, Bitcoin has attracted the attention of numerous parties, ranging from academic researchers to institutional investors. Bitcoin is the first and most widely used cryptocurrency to date. Due to the significant volatility of its price and the fact that its trading does not require a third party, Bitcoin has gained great popularity since its inception in 2009. Given the difficulty of predicting cryptocurrency prices, this project develops and implements a time-series prediction model using machine learning algorithms, including Support Vector Regression (SVR), K-Nearest Neighbor Regression (KNN), Extreme Gradient Boosting (XGBoost), and Long Short-Term Memory (LSTM), to determine the trend of Bitcoin price movement and to assess the effectiveness of the machine learning models. The data used are the close prices of Bitcoin from 2018 to 2023. The performance of the models is evaluated by comparing R-squared, mean absolute error (MAE) and root mean squared error (RMSE), and through a dashboard visualization of the original and predicted close prices. Among the models compared, LSTM emerged as the most accurate, followed by SVR, while XGBoost and KNN exhibited comparatively lower performance.
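    The three reported metrics can be computed directly from the actual and predicted close prices. A minimal sketch with toy values (not actual Bitcoin prices):

```python
import math

def mae(y, p):
    # mean absolute error: average magnitude of the prediction errors
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def rmse(y, p):
    # root mean squared error: penalizes large errors more than MAE
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def r2(y, p):
    # R-squared: fraction of variance in y explained by the predictions
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# Toy close prices vs. one model's predictions (illustrative values)
actual    = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 102.0, 104.0]
print(mae(actual, predicted))   # 1.0
print(rmse(actual, predicted))  # 1.0
print(r2(actual, predicted))    # ≈ 0.714
```

    Comparing the same three numbers across SVR, KNN, XGBoost and LSTM on a held-out period is what ranks the models in the abstract.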

    Automatic Rule Generator via FP-Growth for Eye Diseases Diagnosis

    The conventional approach to developing a rule-based expert system usually involves a tedious, lengthy and costly knowledge acquisition process, known as the bottleneck of expert system development. Furthermore, manual knowledge acquisition can lead to errors in decision-making and to an ineffective expert system. Another dilemma for knowledge engineers is handling conflicts of interest, or the high variance of inter- and intra-personal decisions among domain experts, during the knowledge elicitation stage. The aim of this research is to improve knowledge acquisition using a data mining technique. This paper investigates the effectiveness of an association rule mining technique in generating new rules for an expert system. FP-Growth is the machine learning technique used to acquire rules from eye disease diagnosis records collected from the Sumatera Eye Center (SMEC) Hospital in Pekanbaru, Riau, Indonesia. The developed system was tested on 17 cases, and ophthalmologists inspected the results of the automatic rule generator for eye disease diagnosis. We found that introducing FP-Growth association rules into the eye disease knowledge-based system produces acceptable and promising diagnosis results, with an average accuracy of approximately 88%. Based on the test results, we conclude that Conjunctivitis and Presbyopia are the most prevalent diseases among the cases studied. In conclusion, FP-Growth association rules show strong potential as an automatic rule generator, but there is still plenty of room for improvement in the context of eye disease diagnosis.
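    The rule-generation idea can be sketched with a brute-force frequent-itemset count over symptom-diagnosis transactions. A real FP-Growth builds an FP-tree instead of enumerating combinations (Apriori-style enumeration is used here only for brevity), and the diagnosis records below are hypothetical.

```python
from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_support):
    # count the support of every itemset of size 1 or 2
    # (FP-Growth would mine these from a compressed FP-tree instead)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[combo] += 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

# Hypothetical diagnosis records: symptoms plus the confirmed disease
records = [
    ["red_eye", "discharge", "conjunctivitis"],
    ["red_eye", "discharge", "conjunctivitis"],
    ["blurred_near_vision", "presbyopia"],
    ["red_eye", "itching", "conjunctivitis"],
]
freq = frequent_itemsets(records, min_support=0.5)
# confidence of the rule red_eye -> conjunctivitis:
# support(both) / support(red_eye)
conf = freq[("conjunctivitis", "red_eye")] / freq[("red_eye",)]
print(round(conf, 2))  # → 1.0
```

    Each frequent symptom-to-disease itemset whose confidence clears a threshold becomes a candidate IF-THEN rule for the knowledge base, which the ophthalmologists then validate.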