
    Feature Selection for Document Classification: Case Study of Meta-heuristic Intelligence and Traditional Approaches

    Doctor of Philosophy (Computer Engineering), 2020. Nowadays, the way news is accessed around the world has shifted from paper to electronic format, and the rate of publication of newspapers and magazines on websites has increased dramatically. Meanwhile, text feature selection for automatic document classification (ADC) is becoming a major challenge because of the unstructured nature of text features, known as the "multi-dimension feature problem". Although various powerful schemes for text feature selection continue to be developed, a research gap remains for the "optimization of feature selection problem (OFSP)", which seeks the globally optimal features. The capacity of meta-heuristic intelligence within the knowledge discovery process (KDP) plays a critical role in overcoming the NP-hard nature of the OFSP by providing effective performance with efficient computation time. Therefore, a meta-heuristic-based approach to feature selection optimization is proposed in this research to search for the globally optimal features for ADC. This thesis presents a case study of meta-heuristic intelligence and traditional approaches for the feature selection optimization process in document classification. It covers eleven meta-heuristic algorithms, namely Ant Colony search, Artificial Bee Colony search, Bat search, Cuckoo search, Evolutionary search, Elephant search, Firefly search, Flower search, Genetic search, Rhinoceros search, and Wolf search, for finding the optimal feature subset for document classification. The results of the proposed model are then compared with three traditional search algorithms: Best First search (BFS), Greedy Stepwise (GS), and Ranker search (RS). In addition, a data mining framework is applied, involving data preprocessing, feature engineering, building the learning model, and evaluating the performance of the proposed meta-heuristic intelligence-based feature selection using various performance and computational complexity evaluation schemes. In data preprocessing, tokenization, stop-word handling, stemming and lemmatizing, and normalization are applied. In the feature engineering process, n-gram TF-IDF feature extraction is used to implement the feature vector, and both filter and wrapper approaches are applied to observe different cases. Three different classifiers, J48, Naïve Bayes, and Support Vector Machine, are used to build the document classification model. According to the results, the proposed system can dramatically reduce the number of selected features that would otherwise deteriorate learning model performance, and the selected global feature subset yields better performance than the traditional searches under the single objective function of the proposed model.
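    As a rough illustration of the pipeline this abstract describes (n-gram TF-IDF feature vectors evaluated by a wrapper-style search), the sketch below uses a tiny toy corpus, scikit-learn's TfidfVectorizer and MultinomialNB, and a simple bit-flip search as a stand-in for the eleven meta-heuristics; the corpus, parameters, and search rule are illustrative assumptions, not the thesis's actual algorithms.

        # Minimal sketch of wrapper-style feature selection over n-gram TF-IDF
        # features. Toy data and the random bit-flip search are assumptions.
        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import MultinomialNB

        docs = [
            "stock markets rallied after the earnings report",
            "the team won the championship game last night",
            "central bank raises interest rates to curb inflation",
            "star striker scores twice in the cup final",
            "investors weigh bond yields and equity prices",
            "coach praises defense after narrow playoff victory",
        ]
        labels = np.array([0, 1, 0, 1, 0, 1])  # 0 = finance, 1 = sports

        # Feature engineering: unigram + bigram TF-IDF vectors, stop words removed.
        vec = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
        X = vec.fit_transform(docs).toarray()
        n_features = X.shape[1]

        rng = np.random.default_rng(42)

        def fitness(mask):
            """Single-objective wrapper fitness: cross-validated accuracy of a
            Naive Bayes classifier trained on the selected feature subset."""
            if mask.sum() == 0:
                return 0.0
            return cross_val_score(MultinomialNB(), X[:, mask], labels, cv=3).mean()

        # Start from a random subset and apply simple bit-flip mutations
        # (a stand-in for the meta-heuristic searches compared in the thesis).
        best_mask = rng.random(n_features) < 0.5
        best_fit = fitness(best_mask)
        for _ in range(200):
            cand = best_mask.copy()
            flip = rng.integers(n_features)
            cand[flip] = ~cand[flip]
            f = fitness(cand)
            if f >= best_fit:
                best_mask, best_fit = cand, f

        print(f"selected {best_mask.sum()} of {n_features} n-gram features, "
              f"cv accuracy {best_fit:.2f}")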

    Normal parameter reduction algorithm in soft set based on hybrid binary particle swarm and biogeography optimizer

    © 2019, Springer-Verlag London Ltd., part of Springer Nature. Classification techniques previously proposed for eliminating data inconsistency cannot achieve efficient parameter reduction in soft set theory, which affects the obtained decisions. Meanwhile, the computational cost incurred during the combination-generation process of soft sets can drive a machine into an effectively infinite state, a problem of nondeterministic polynomial (NP) time complexity. The contributions of this study are mainly focused on minimizing choice costs by adjusting the original classifications with a decision partition order, and on enhancing the probability of searching the domain space using a developed Markov chain model. Furthermore, this study introduces an efficient soft set reduction-based binary particle swarm optimized by biogeography-based optimizer (SSR-BPSO-BBO) algorithm that generates an accurate decision for optimal and sub-optimal choices. The results show that the decision partition order technique performs better in parameter reduction, achieving up to 50%, while other algorithms could not obtain high reduction rates in some scenarios. In terms of accuracy, the proposed SSR-BPSO-BBO algorithm outperforms the other optimization algorithms, achieving a high accuracy percentage on a given soft dataset. Moreover, the proposed Markov chain model demonstrates the robustness of the parameter reduction technique in obtaining the optimal decision and minimizing the search domain.
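    The sketch below illustrates the general idea of normal parameter reduction in a soft set with a binary-PSO-style search; the biogeography-based migration step, the decision partition order technique, and the Markov chain model from the paper are omitted, and the soft-set table, fitness rule, and swarm parameters are illustrative assumptions, not the SSR-BPSO-BBO algorithm itself.

        # Minimal sketch: binary PSO searching for a removable parameter subset
        # whose per-object contribution is constant, so the decision (choice-value)
        # order of the soft set is preserved. Data and parameters are assumptions.
        import numpy as np

        rng = np.random.default_rng(7)

        # Boolean soft-set table: rows = objects (choices), columns = parameters.
        F = np.array([
            [1, 1, 0, 1, 0, 1],
            [1, 0, 1, 1, 0, 1],
            [0, 1, 1, 0, 1, 0],
            [1, 1, 1, 0, 0, 1],
        ], dtype=int)
        n_obj, n_par = F.shape

        def fitness(mask):
            """Reward large removable subsets A whose sum is equal for every
            object, i.e. removing A does not change the choice-value ranking."""
            if mask.sum() == 0:
                return 0.0
            sums = F[:, mask.astype(bool)].sum(axis=1)
            return mask.sum() / n_par if np.all(sums == sums[0]) else 0.0

        # Binary PSO: velocities are squashed by a sigmoid to bit probabilities.
        n_particles, n_iter = 20, 100
        pos = rng.integers(0, 2, size=(n_particles, n_par))
        vel = rng.normal(0, 1, size=(n_particles, n_par))
        pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
        gbest = pbest[pbest_fit.argmax()].copy()

        for _ in range(n_iter):
            r1, r2 = rng.random((2, n_particles, n_par))
            vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
            pos = (rng.random((n_particles, n_par)) < 1 / (1 + np.exp(-vel))).astype(int)
            fit = np.array([fitness(p) for p in pos])
            improved = fit > pbest_fit
            pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
            gbest = pbest[pbest_fit.argmax()].copy()

        removable = np.flatnonzero(gbest) if fitness(gbest) > 0 else np.array([], dtype=int)
        print("removable parameter columns:", removable.tolist())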

    Computational Intelligence in Healthcare

    The amount of patient health data was estimated to reach 2,314 exabytes by 2020. Traditional data analysis techniques are unsuitable for extracting useful information from such a vast quantity of data. Thus, intelligent data analysis methods combining human expertise and computational models for accurate and in-depth data analysis are necessary. The technological revolution and medical advances made possible by combining vast quantities of available data, cloud computing services, and AI-based solutions can provide expert insight and analysis on a mass scale and at relatively low cost. Computational intelligence (CI) methods, such as fuzzy models, artificial neural networks, evolutionary algorithms, and probabilistic methods, have recently emerged as promising tools for the development and application of intelligent systems in healthcare practice. CI-based systems can learn from data and evolve with changes in their environment, taking into account the uncertainty characterizing health data, including omics, clinical, sensor, and imaging data. The use of CI in healthcare can improve the processing of such data to develop intelligent solutions for prevention, diagnosis, treatment, and follow-up, as well as for the analysis of administrative processes. The present Special Issue on computational intelligence for healthcare is intended to show the potential and the practical impact of CI techniques in challenging healthcare applications.

    Early Prediction of Diabetes Using Deep Learning Convolution Neural Network and Harris Hawks Optimization

    Owing to the gravity of diabetic disease, even minimal early-stage symptoms of diabetic failure must be forecast. An instantaneous and early prediction system must therefore be developed to avert serious medical complications. Information gathered from the Pima Indian Diabetes dataset is synthesized through a deep learning approach that provides features describing the diabetic level. Metadata is used to enhance the recognition process for the deep-learned features. The distinct details, including glucose level, health information, age, insulin level, etc., are retrieved by integrated machine and computer technology. Using the efficacious Harris Hawks Optimization Algorithm (HOA), the contribution of insignificant data to the diabetic diagnostic process is minimized in the analysis. The diabetic disease is then categorized with a Deep Learning Convolution Neural Network (DLCNN) using the chosen diabetic characteristics. The output of the developed process is measured on the basis of test results in terms of error rate, sensitivity, specificity, and accuracy.
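    A minimal sketch of the classification stage is given below: a small 1D convolutional network (standing in for the DLCNN) trained on synthetic data shaped like the 8-feature Pima Indians Diabetes dataset, with the Harris Hawks feature-selection step omitted; the architecture, toy labels, and training settings are assumptions, not the paper's configuration.

        # Minimal sketch: 1D CNN over Pima-style tabular features, evaluated with
        # accuracy, sensitivity, and specificity. Data and labels are synthetic.
        import torch
        import torch.nn as nn

        torch.manual_seed(0)

        # Synthetic stand-in for the Pima dataset: 768 patients x 8 features
        # (glucose, insulin, age, BMI, ...), binary diabetes label.
        X = torch.randn(768, 1, 8)                       # (batch, channels, features)
        y = (X[:, 0, 1] + 0.5 * X[:, 0, 4] > 0).long()   # toy rule, not clinical truth

        model = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1),  # convolve across the feature axis
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                     # global average pooling
            nn.Flatten(),
            nn.Linear(32, 2),                            # diabetic vs. non-diabetic
        )

        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()

        for epoch in range(50):
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            opt.step()

        with torch.no_grad():
            pred = model(X).argmax(dim=1)
            acc = (pred == y).float().mean().item()
            # Sensitivity and specificity, matching the evaluation metrics above.
            tp = ((pred == 1) & (y == 1)).sum().item()
            tn = ((pred == 0) & (y == 0)).sum().item()
            fp = ((pred == 1) & (y == 0)).sum().item()
            fn = ((pred == 0) & (y == 1)).sum().item()
            print(f"accuracy {acc:.2f}, sensitivity {tp / (tp + fn):.2f}, "
                  f"specificity {tn / (tn + fp):.2f}")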