14 research outputs found

    Gene selection via improved nuclear reaction optimization algorithm for cancer classification in high-dimensional data

    No full text
    Abstract: RNA sequencing (RNA-Seq) is considered a revolutionary technique for gene profiling and quantification. It offers a comprehensive view of the transcriptome, making it more expansive than microarray analysis. Genes that discriminate malignant from normal samples can be identified from quantitative gene expression. However, these data form a high-dimensional dense matrix: each sample has more than 20,000 genes, which makes the data challenging to handle. This paper proposes RBNRO-DE (Relief Binary NRO based on Differential Evolution) for gene selection on (rnaseqv2 illuminahiseq rnaseqv2 un edu Level 3 RSEM genes normalized) data with more than 20,000 genes, picking the most informative genes and assessing them on 22 cancer datasets. The k-nearest neighbor (k-NN) and support vector machine (SVM) classifiers are applied to assess the quality of the selected genes. Binary versions of the most common meta-heuristic algorithms are compared with the proposed RBNRO-DE algorithm. In most of the 22 cancer datasets, RBNRO-DE with the k-NN and SVM classifiers achieved optimal convergence and classification accuracy of up to 100%, together with a feature reduction of up to 98%, clearly outperforming its counterparts according to Wilcoxon’s rank-sum test (5% significance level).
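
As a rough illustration of how a wrapper-style gene selector can score a candidate subset with k-NN, the sketch below evaluates a binary gene mask by leave-one-out 1-NN accuracy and penalizes the number of genes kept. The function names, the weight `alpha`, the weighted-sum form of the fitness, and the toy data are all assumptions made for illustration; the abstract does not state the fitness function used by RBNRO-DE.

```python
import math

def knn_loo_accuracy(X, y, mask):
    """Leave-one-out 1-NN accuracy using only the columns where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = math.inf, None
        for k in range(len(X)):
            if k == i:
                continue  # leave sample i out
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def fitness(X, y, mask, alpha=0.99):
    """Lower is better. alpha is an assumed trade-off weight, not a value from the paper."""
    error = 1.0 - knn_loo_accuracy(X, y, mask)
    kept = sum(mask) / len(mask)
    return alpha * error + (1.0 - alpha) * kept

# Toy data (made up): column 0 separates the classes, column 1 is misleading.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [0.9, 1.1]]
y = [0, 0, 1, 1]

only_informative = fitness(X, y, [1, 0])
all_genes = fitness(X, y, [1, 1])
```

On this toy data the mask that keeps only the informative column gets a lower (better) fitness than the mask that keeps both columns, which is the behavior a binary optimizer would exploit when searching over masks.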

    An adaptive hybrid african vultures-aquila optimizer with Xgb-Tree algorithm for fake news detection

    No full text
    Abstract: Online platforms and social networking have grown in recent years. They are now a major news source worldwide, leading to the online proliferation of fake news (FN). FN is alarming because it fundamentally reshapes public opinion, which may cause customers to leave these online platforms and threaten the reputations of organizations and industries. The rapid dissemination of FN makes automated detection imperative and has encouraged many researchers to propose systems that classify news articles and detect FN automatically. In this paper, a Fake News Detection (FND) methodology is presented based on an effective IBAVO-AO algorithm, a hybridization of the African Vultures Optimization (AVO) and Aquila Optimization (AO) algorithms, with an extreme gradient boosting tree (Xgb-Tree) classifier. The methodology involves three main phases. First, the unstructured FN dataset is analyzed, and the essential features are extracted by tokenizing, encoding, and padding the input news words into a sequence of integers using the GloVe approach. Then, the extracted features are filtered with the Relief algorithm to select only the appropriate ones. Finally, the selected features are used to classify the news items using the IBAVO-AO algorithm based on the Xgb-Tree classifier. The methodology is thus distinguished from prior models in that it performs data pre-processing, optimization, and classification automatically. It is evaluated on the ISOT-FNs dataset, which contains more than 44 thousand news articles divided into truthful and fake. Its reliability is validated with numerous evaluation metrics, including accuracy, fitness values, the number of selected features, Kappa, Precision, Recall, F1-score, Specificity, Sensitivity, ROC_AUC, and MCC.
The proposed methodology is then compared against the most common meta-heuristic optimization algorithms on ISOT-FNs. The experimental results show that it achieved the best classification accuracy and F1-score among its peers and correctly categorized more than 92.5% of news articles. This study will help researchers expand their understanding of applications of meta-heuristic optimization algorithms to FND.
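
The first phase described above (tokenizing, encoding, and padding news text into integer sequences) can be sketched as follows. This is a minimal illustration only: whitespace tokenization, the reserved ids 0 for padding and 1 for unknown words, and the helper names `build_vocab` and `encode_and_pad` are assumptions, and the lookup of GloVe vectors for each id is omitted.

```python
def build_vocab(texts):
    """Map each distinct lowercase token to an integer id (0 = pad, 1 = unknown)."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode_and_pad(text, vocab, max_len):
    """Encode text as token ids, truncate to max_len, then right-pad with the pad id."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in text.lower().split()]
    ids = ids[:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

# Made-up example headlines, not taken from the ISOT dataset.
corpus = ["Breaking news today", "news is fake"]
vocab = build_vocab(corpus)

padded = encode_and_pad("fake news spreads", vocab, max_len=5)
truncated = encode_and_pad("Breaking news today", vocab, max_len=2)
```

Every article ends up as a fixed-length list of integers, which is the shape an embedding layer or a downstream feature filter expects.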

    Credit card fraud detection using the brown bear optimization algorithm

    No full text
    Fraud detection in banking systems is crucial for financial stability, customer protection, reputation management, and regulatory compliance. Machine Learning (ML) is vital for improving data analysis, detecting fraud in real time, and keeping pace with developing fraud techniques by learning from data and adjusting detection strategies accordingly. Feature Selection (FS) is essential for achieving optimal model accuracy in ML-based fraud detection because it eliminates the negative impact of redundant and irrelevant attributes. Researchers have used multiple methods to determine the most suitable features for a given dataset, but on datasets with larger feature sizes these methods may become trapped in local optima, and work on improving their effectiveness continues. This study presents a methodology based on the Brown-Bear Optimization Algorithm (BBOA) that improves the identification of credit card fraud (CCF) transactions by recognizing pertinent features. BBOA has balanced capabilities for reducing dimensionality while enhancing classification accuracy. It is improved by adjusting the positions randomly to enhance exploration and exploitation, and it is then converted into a binary variant named Binary BBOA (BBBOA). The Support Vector Machine (SVM), k-nearest neighbor (k-NN), and Xgb-tree are the ML classifiers used with the proposed methodology.
On the Australian credit dataset, the proposed methodology is compared with the basic BBOA and ten current optimizers, such as Binary African Vultures Optimization (BAVO), Binary Salp Swarm Algorithm (BSSA), Binary Atom Search Optimization (BASO), Binary Henry Gas Solubility Optimization (BHGSO), Binary Harris Hawks Optimization (BHHO), Binary Bat Algorithm (BBA), Binary Particle Swarm Optimization (BPSO), Binary Grasshopper Optimization Algorithm (BGOA), and Binary Sailfish Optimizer (BSFO). According to Wilcoxon’s rank-sum test (α=0.05), the superiority of the presented methodology on this dataset is clear: it reached a classification accuracy of up to 91% combined with an attribute reduction of up to 67%. The proposed methodology is further validated on ten benchmark datasets from the UCI repository, where it outperformed its competitors on most datasets across different performance measures.
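
To make the significance check concrete, here is a minimal sketch of a two-sided Wilcoxon rank-sum test using the normal approximation with average ranks for ties (no tie correction). The function name `rank_sum_test` and the accuracy values are hypothetical and are not results from the study; in practice a library routine such as SciPy's `scipy.stats.ranksums` would normally be used instead.

```python
from statistics import NormalDist

def rank_sum_test(a, b):
    """Return (z, p) for a two-sided rank-sum test via the normal approximation."""
    # Pool both samples, remembering which group (0 = a, 1 = b) each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (rank_sum_a - mean) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical per-run accuracies for two optimizers (illustrative values only).
proposed = [0.90, 0.91, 0.89, 0.90, 0.92, 0.91, 0.90, 0.89, 0.91, 0.90]
baseline = [0.85, 0.86, 0.84, 0.87, 0.85, 0.86, 0.88, 0.85, 0.84, 0.86]

z, p = rank_sum_test(proposed, baseline)
significant = p < 0.05  # alpha = 0.05, as in the abstract
```

A p-value below α = 0.05 is what lets one claim that the difference between two optimizers' run-to-run accuracies is unlikely to be due to chance.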

    CCFD: Efficient Credit Card Fraud Detection Using Meta-Heuristic Techniques and Machine Learning Algorithms

    No full text
    This study addresses the critical challenge of data imbalance in credit card fraud detection (CCFD), a significant impediment to accurate and reliable fraud prediction models. Fraud detection (FD) is a complex problem because of the constantly evolving tactics of fraudsters and the rarity of fraudulent transactions compared to legitimate ones. Detecting fraud efficiently is crucial to minimize financial losses and ensure secure transactions. By developing a framework that transitions from imbalanced to balanced data, the research enhances the performance and reliability of FD mechanisms. Meta-heuristic optimization (MHO) techniques were applied to a dataset from Kaggle’s CCF benchmark datasets, which includes data from European credit-cardholders. The techniques were evaluated on their capability to pinpoint the smallest, most relevant set of features, in terms of prediction accuracy, fitness values, number of selected features, and computational time. The study evaluates the effectiveness of 15 MHO techniques, using 9 transfer functions (TFs), in identifying the most relevant subset of features for fraud prediction. Two machine learning (ML) classifiers, random forest (RF) and support vector machine (SVM), are used to evaluate the impact of the chosen features on predictive accuracy. The results indicate a substantial improvement in model efficiency, with a classification accuracy of up to 97% and a reduction in feature size of up to 90%. They also underscore the critical role of feature selection in optimizing fraud detection systems (FDSs) and adapting to the challenges posed by data imbalance. Additionally, this research highlights how machine learning continues to evolve, revolutionizing FDSs with innovative solutions that deliver significantly enhanced capabilities.
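
A transfer function maps an optimizer's continuous position vector to a binary feature mask. The sketch below shows one S-shaped (sigmoid) and one V-shaped (absolute tanh) function, two families commonly used for this purpose. It is an assumption that these resemble any of the nine TFs in the study, which the abstract does not name; the function names are likewise illustrative.

```python
import math
import random

def s_shaped(x):
    """Sigmoid: interpreted as the probability that the bit is set to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def v_shaped(x):
    """|tanh(x)|: interpreted as the probability that the current bit is flipped."""
    return abs(math.tanh(x))

def binarize_s(position, rng):
    """Set each bit to 1 with probability s_shaped(x), independent of its old value."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]

def binarize_v(position, previous_bits, rng):
    """Flip each previous bit with probability v_shaped(x); otherwise keep it."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, previous_bits)]

rng = random.Random(0)  # seeded so the sketch is reproducible

# A strongly positive coordinate selects its feature, a strongly negative one drops it.
mask_s = binarize_s([50.0, -50.0], rng)

# A zero coordinate has flip probability 0, so the previous bits are kept.
mask_v = binarize_v([0.0, 0.0], [1, 0], rng)
```

The practical difference is that an S-shaped function decides each bit from scratch, while a V-shaped function only decides whether to flip the existing bit, so large moves in either direction encourage change.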