11 research outputs found
Ant Colony Optimization Approach To Communications Network Design
Ant Colony Optimization (ACO) is a metaheuristic approach for solving hard combinatorial optimization problems. The pheromone trails in ACO serve as distributed, numerical information, which the ants use to probabilistically construct solutions to the problem being solved, and which the ants adapt during the algorithm's execution to reflect their search experience.
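The two roles of the pheromone trails described above (guiding probabilistic construction, and being adapted from search experience) can be sketched as follows. This is a minimal illustration on a toy symmetric graph with a closed tour, not the network-design formulation of the paper. The function names, the parameters `alpha`, `beta` and `rho`, and the `1 / tour_length` deposit rule are assumptions taken from common ACO practice, not from the abstract.

```python
import random

def choose_next(current, unvisited, pheromone, distance,
                alpha=1.0, beta=2.0, rng=random):
    """Pick the next node with probability proportional to
    pheromone^alpha * (1 / distance)^beta (assumed weighting rule)."""
    weights = [
        (pheromone[current][j] ** alpha) * ((1.0 / distance[current][j]) ** beta)
        for j in unvisited
    ]
    return rng.choices(unvisited, weights=weights, k=1)[0]

def evaporate_and_deposit(pheromone, tour, tour_length, rho=0.1):
    """Evaporate every trail by a factor (1 - rho), then reinforce the
    edges of one closed tour with an amount inversely related to its length."""
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= (1.0 - rho)
    for a, b in zip(tour, tour[1:] + tour[:1]):
        pheromone[a][b] += 1.0 / tour_length
        pheromone[b][a] += 1.0 / tour_length
```

Shorter tours deposit more pheromone, so the edges they use become more likely to be chosen by later ants, while evaporation lets unused edges fade.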
Comparative Analysis of Machine Learning Algorithms for Health Insurance Pricing
Insurance is an effective way to guard against potential loss. Risk management is primarily employed to protect against the risk of a financial loss. Risk and uncertainty are inevitable parts of life, and the pace of life has led to a rise in these risks and uncertainties. Health insurance pricing has emerged as one of the essential fields of study following the coronavirus pandemic. The anticipated outcomes from this study will be applied to ensure that an insurance company's goals for its health insurance packages remain within the range of profitability, so that the company can also choose the most cost-effective course of action. The US Health Insurance dataset was used for this study. The study examines four regression-based machine learning algorithms for health insurance price prediction: multiple linear regression, ridge regression, XGBoost regression, and random forest regression. The performance of the implemented models is assessed using four evaluation metrics: MAE, MSE, RMSE, and R2 score. Random forest regression outperforms all other algorithms on all four evaluation metrics. The best machine learning algorithm, random forest, is further enhanced with hyperparameter tuning. Random forest with hyperparameter tuning performs better on three of the four evaluation metrics, the exception being MAE. To gain further insights, data visualizations are also implemented to showcase the importance of features and the differences between actual and predicted prices for all the data points.
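For reference, the four evaluation metrics named above can be computed as in the sketch below. It uses their standard textbook definitions. The function name `regression_metrics` is hypothetical, and nothing here reproduces the study's own code or results.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R2 for two equal-length sequences.
    R2 is undefined when every value in y_true is identical."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}
```

Because MAE weights all errors linearly while MSE and RMSE penalize large errors more heavily, a tuned model can improve MSE, RMSE and R2 while leaving MAE unchanged or slightly worse.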
Improving Machine Learning Algorithms for Breast Cancer Prediction
Early prediction of breast cancer can prevent death or late treatment. The purpose of
this research is to improve machine learning algorithms for predicting breast cancer,
to assist patients and healthcare systems. Four algorithms are applied, chosen for
their high performance: decision tree, random forest, naive Bayes, and gradient
boosting. This research uses the Breast Cancer Wisconsin (Diagnostic) dataset from
the general surgery department. The results show that the random forest classifier
with stratified k-fold cross-validation achieved 100% on all four performance scores:
accuracy, recall, precision, and F1. Stratified k-fold cross-validation also improved
two machine learning algorithms. In addition, data visualization was applied to the
random forest algorithm to aid interpretation of the results. The implication is that
the best method could increase the number of accurate breast cancer detections.
Selecting the best method from this research could assist doctors in early breast
cancer detection and increase breast cancer survival rates through early treatment
following accurate prediction.
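The idea behind stratified k-fold splitting is that each fold keeps roughly the same class proportions as the full dataset. A minimal sketch follows, assuming a simple round-robin assignment without shuffling. The function name `stratified_folds` is hypothetical, and the study's actual tooling is not stated in the abstract.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Split sample indices into k folds, dealing each class's samples
    out round-robin so that class proportions are roughly preserved."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds
```

Each fold then serves once as the test set while the remaining folds are used for training. In practice the indices within each class would usually be shuffled first.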
Ant Colony Optimization approaches to the degree-constrained minimum spanning tree problem
This paper presents the design of two Ant Colony Optimization (ACO) approaches and their improved variants on the degree-constrained minimum spanning tree (d-MST) problem. The first approach, which we call p-ACO, uses the vertices of the construction graph as solution components, and is motivated by the well-known Prim's algorithm for constructing MST. The second approach, known as k-ACO, uses the graph edges as solution components, and is motivated by Kruskal's algorithm for the MST problem. The proposed approaches are evaluated on two different data sets containing difficult d-MST instances. Empirical results show that k-ACO performs better than p-ACO. We then enhance the k-ACO approach by incorporating the tournament selection, global update and candidate lists strategies. Empirical evaluations of the enhanced k-ACO indicate that on average, it performs better than Prüfer-coded evolutionary algorithm (F-EA), problem search space (PSS), simulated annealing (SA), branch and bound (B&B), Knowles and Corne's evolutionary algorithm (K-EA) and ant-based algorithm (AB) on most problem instances from a well-known class of data set called structured hard graphs. Results also show that it is very competitive with two other evolutionary algorithm based methods, namely weight-coded evolutionary algorithm (W-EA), and edge-set representation evolutionary algorithm (S-EA) on the same class of data set.
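To make the d-MST constraint concrete, the sketch below shows a plain greedy, Kruskal-style construction that adds edges in order of weight while refusing any edge that would push a vertex above degree `d`. This is not k-ACO. In an edge-based ACO the next edge would be chosen probabilistically under pheromone guidance rather than strictly by weight. The function name `greedy_dmst` and the `(weight, u, v)` edge format are assumptions made for illustration.

```python
def greedy_dmst(n, edges, d):
    """Greedy heuristic for a degree-constrained spanning tree.
    edges: list of (weight, u, v) with vertices numbered 0..n-1.
    Returns (total_weight, chosen_edges), or None if the greedy pass
    fails to connect all vertices. The result is not guaranteed optimal."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    degree = [0] * n
    chosen = []
    for w, u, v in sorted(edges):
        if degree[u] >= d or degree[v] >= d:
            continue  # would violate the degree bound
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # would create a cycle
        parent[ru] = rv
        degree[u] += 1
        degree[v] += 1
        chosen.append((w, u, v))
    if len(chosen) != n - 1:
        return None
    return sum(w for w, _, _ in chosen), chosen
```

On a star-shaped instance the degree bound forces the heuristic to skip a cheap edge at the hub and take a more expensive one elsewhere, which is what makes d-MST harder than the unconstrained MST problem.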
Classifying diabetes using data mining algorithms
Across the globe, diabetes is recognized as one of the many causes of death, especially in Third World countries, where there is a lack of treatment for diabetes, particularly in the early stages. In this study, the presence of diabetes within the community is classified, thus contributing to the existing technology within the healthcare system. Our findings can help doctors to predict the existence of diabetes accurately and alert patients to seek early treatment. Four data mining algorithms were used in this study, comprising both single and ensemble classifiers. The two single classifiers are decision tree and logistic regression, while the ensemble classifiers are random forest and stacking. These classifiers were chosen because they are efficient and high in performance. This research uses the PIMA diabetes dataset, as it is available to the general public. Stratified cross-validation is used to ensure the efficiency of the models. Ensemble classifiers show better or similar testing results compared to single classifiers. From data visualisation, two important features are discovered
Sentiment Analysis of E-Wallet Companies: Exploring Customer Ratings and Perceptions
This sentiment analysis research reports a systematic study of unstructured customer
review data for seven popular e-wallet companies: Alipay, Google Wallet,
Grab Superapp, PayPal, Samsung Wallet, Shopee MY, and Touch 'n Go eWallet. Previously,
companies faced challenges in effectively utilizing customer reviews to understand and assess
customer sentiment toward their products or services. However, with advancements in
sentiment analysis techniques, companies can harness the power of customer reviews to gain
valuable insights and improve their offerings. The purpose of this study is to explore the use
of sentiment analysis in e-wallet companies, where understanding customer sentiment is
crucial for enhancing user experiences and driving business success. The research methods
employed in this study start by collecting customer reviews spanning four years, from 2019
to 2022. Next, four data pre-processing methods are applied to transform the raw review data
into a suitable format for sentiment analysis: data standardization, tokenization, stop word
removal, and lemmatization. Sentiment analysis methods were then used to classify reviews
as positive, neutral, or negative for each e-wallet company. This study introduced a novel
method using rating accuracy to evaluate the polarity sentiment classifications. The results of
this study revealed that all e-wallet companies had high rating accuracies for positive
sentiments, indicating positive customer sentiment towards their services. However, the rating
accuracies for negative sentiments were lower, suggesting challenges in accurately predicting
and classifying negative customer sentiment. For neutral sentiments, the rating accuracies
were generally low, except for Alipay in 2019, which demonstrated the highest accuracy in
capturing neutral customer sentiment. The findings of this study have
important implications for both theory and practice in the e-wallet industry. The practical
implications of this study offer concrete guidance for e-wallet companies to enhance their
services based on customer sentiments. In contrast, the theoretical implications highlight the
need for ongoing research and innovation in sentiment analysis methods that consider factors
other than customer ratings within the e-wallet industry. This study contributes to sentiment analysis
by introducing rating accuracy as a measure for evaluating customer reviews accurately. The
contribution will provide a more comprehensive understanding of customer sentiments. The
methods employed in this study can be applied to enhance sentiment analysis in various
customer relationship management domains.
Effects of Coronavirus Disease on Trade for New Zealand
The coronavirus disease 2019 (COVID-19) is a humanitarian crisis that is spreading throughout the world. COVID-19 will be worse for countries that have weak healthcare and economic systems. Countries that are highly affected by the disease will have problems with international trade, since the virus has a high infection rate. This affects the trading economy through export restrictions and trade barriers, which worsen the country's trade and can cause livelihood problems for the country. However, some countries have handled the pandemic excellently and managed to control the outbreak. This research therefore studies one such country, New Zealand, and how the coronavirus disease affects its trading economy. The research methodology consists of five phases conducted before the final findings are presented: dataset collection, data preprocessing, decision tree regressor, apriori algorithm under association rule mining, and finally data visualizations. Using the decision tree regressor, the apriori algorithm and data visualizations, the findings show that trade for New Zealand is not badly affected by the coronavirus pandemic, and two association rules that support its economy have been discovered.
Fast numerical threshold search algorithm for C4.5
This paper presents a new algorithm that uses genetic algorithms to speed up the threshold searching process in C4.5. In this process, the values of a numerical attribute are first sorted, and then the mid-point between every two consecutive values is calculated and designated as a candidate threshold. This process can be time consuming and is not practical for large datasets. Our algorithm generates a population of possible thresholds and converges to the best threshold value rapidly. Our experimental results show that a significant time reduction is achieved by using our algorithm in the threshold searching process.
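The exhaustive baseline described above (sort the values, take the mid-point of each consecutive pair, evaluate every candidate) can be sketched as follows. The genetic-algorithm search proposed in the paper is not shown. Scoring candidates by information gain and taking mid-points only between distinct values are assumptions of this sketch, and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def candidate_thresholds(values):
    """Mid-points between consecutive distinct sorted values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

def best_threshold(values, labels):
    """Evaluate every candidate and return (threshold, information_gain),
    or None when the attribute has fewer than two distinct values."""
    base = entropy(labels)
    best = None
    for t in candidate_thresholds(values):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if best is None or gain > best[1]:
            best = (t, gain)
    return best
```

Each candidate requires a pass over the data, so this version does work proportional to the number of distinct values times the number of records, which is the cost a faster search aims to avoid on large datasets.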