33 research outputs found

    A Knowledge Representation Model Based on Select and Test Algorithm for Diagnosing Breast Cancer

    Several terminal diseases have fatality rates that escalate with time, and breast cancer is prominent among them. Computer-aided systems have been well researched through the use of intelligent algorithms capable of detecting, diagnosing, and proffering treatment for breast cancer. While good research breakthroughs have been attained in algorithmic solutions for the diagnosis of breast cancer, not much has been done to sufficiently model knowledge frameworks for knowledge-based diagnostic algorithms. While the Select and Test (ST) algorithm has proven relevant for implementing diagnostic systems through its support for reasoning, the knowledge representation pattern that enables inference of missing or ambiguous data still limits its effectiveness. This paper therefore proposes a knowledge representation model to systematically model knowledge to aid the performance of the ST algorithm. Our proposal is specifically targeted at developing a systematic knowledge representation for breast cancer. The approach uses the Web Ontology Language (OWL) to implement the design of the proposed knowledge model. This study aims at carefully crafting a knowledge model whose implementation works seamlessly with the ST algorithm. Furthermore, this study adapted the proposed model into an implementation of the ST algorithm and obtained improved performance compared to the simple knowledge model proposed by the author of the ST algorithm. Our knowledge model resulted in an accuracy gain of 23.5% and obtained an AUC of (0.49, 1.0). The proposed model has therefore shown that combining an inference-oriented knowledge model with an inference-oriented reasoning algorithm improves the performance of computer-aided diagnostic (CADx) systems. In future, we intend to enhance the proposed model to support rules. Keywords: Semantic web, ontology, OWL, breast cancer, Select and Test (ST) algorithm, knowledge representation

    Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review

    In this paper, a critical bibliometric analysis is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The bibliometric analysis covers 2761 machine learning-related documents, of which 98% were articles with at least 482 citations, published in 903 journals during the past 30 years. The collated documents were retrieved from the Science Citation Index Expanded and comprise research publications from 54 African countries between 1993 and 2021. The bibliometric study visualizes the current landscape and future trends in machine learning research and its applications, to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent.

    Binary Ebola Optimization Search Algorithm for Feature Selection and Classification Problems

    In the past decade, the extraction of valuable information from online biomedical datasets has increased exponentially due to the evolution of data processing devices and the utilization of machine learning capabilities to find useful information in these datasets. However, these datasets present a variety of features, dimensionalities, shapes, noise, and heterogeneity. As a result, deriving relevant information remains a problem, since multiple features bottleneck the classification process. Despite their adaptability, current state-of-the-art classifiers have failed to address the problem, giving rise to the exploration of binary optimization algorithms. This study proposes a novel approach to binarizing the Ebola optimization search algorithm. The binary Ebola optimization search algorithm (BEOSA) uses two newly formulated S-shape and V-shape transfer functions to investigate mutations of the infected population in the exploitation and exploration phases, respectively. A model is designed to show a representation of the binary search space and the mapping of the algorithm from the continuous space to the discrete space. Mathematical models are formulated to demonstrate the fitness and cost functions used for evaluating the algorithm. Using 22 benchmark datasets consisting of low, medium, and high dimensional data, we exhaustively experimented with the proposed BEOSA method and six other recent similar feature selection methods. The experimental results show that BEOSA and its variant BIEOSA were highly competitive with different state-of-the-art binary optimization algorithms. A comparative analysis of the classification accuracy obtained for eight binary optimizers showed that BEOSA performed competitively compared to other methods on nine datasets. Evaluation reports on all methods revealed that BEOSA was the top performer, obtaining the best values on eight datasets and eight fitness and cost functions. Computation of the average number of features selected showed that BEOSA outperformed other methods on 11 datasets when population sizes of 75 and 100 were used. Findings from the study revealed that BEOSA is effective in handling the challenge of feature selection in high-dimensional datasets.
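
The mapping from a continuous search space to a binary one via transfer functions can be sketched as follows. The abstract does not give the two newly formulated functions, so this sketch substitutes the classical sigmoid (S-shaped) and |tanh| (V-shaped) forms as stand-ins; all function names here are hypothetical and are not taken from the BEOSA paper.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Classical sigmoid transfer function (stand-in, not BEOSA's own)."""
    # Numerically stable form so large negative inputs do not overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def v_shaped(x: float) -> float:
    """Classical V-shaped transfer function |tanh(x)| (stand-in)."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: set each bit to 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: flip each current bit with probability v_shaped(x)."""
    return [
        1 - bit if rng.random() < v_shaped(x) else bit
        for x, bit in zip(position, current_bits)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for repeatability
    continuous = [-2.0, -0.5, 0.0, 0.5, 2.0]
    mask = binarize_s(continuous, rng)
    print("S-shaped mask:", mask)
    print("V-shaped update:", binarize_v(continuous, mask, rng))
```

Under these stand-in rules, a selected bit of 1 would mean "keep this feature". The S-shaped rule ignores the previous bit, whereas the V-shaped rule only decides whether to flip it, which is the usual distinction between the two families.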

    A twin convolutional neural network with hybrid binary optimizer for multimodal breast cancer digital image classification

    Deep learning techniques have been widely applied to unimodal medical image analysis, with significant classification accuracy observed. However, real-world diagnosis of some chronic diseases such as breast cancer often requires multimodal data streams with different modalities of visual and textual content. Mammography, magnetic resonance imaging (MRI), and image-guided breast biopsy represent a few of the multimodal visual streams considered by physicians in isolating cases of breast cancer. Unfortunately, most studies applying deep learning techniques to classification problems in digital breast images have narrowed their scope to unimodal samples. This is understandable considering the challenging nature of multimodal image abnormality classification, where the high-dimensional heterogeneous features learned need to be fused and projected into a common representation space. This paper presents a novel deep learning approach based on a dual/twin convolutional neural network (TwinCNN) framework to address the challenge of breast cancer image classification from multiple modalities. First, modality-based feature learning was achieved by extracting both low- and high-level features using the networks embedded in TwinCNN. Secondly, to address the notorious problem of high dimensionality associated with the extracted features, a binary optimization method is adapted to effectively eliminate non-discriminant features in the search space. Furthermore, a novel method for feature fusion is applied to computationally leverage the ground-truth and predicted labels for each sample to enable multimodality classification. To evaluate the proposed method, digital mammography images and digital histopathology breast biopsy samples from the benchmark datasets MIAS and BreakHis, respectively, were used. Experimental results showed that the classification accuracy and area under the curve (AUC) for the single modalities were 0.755 and 0.861871 for histology, and 0.791 and 0.638 for mammography. Furthermore, the study investigated the classification accuracy resulting from the fused feature method, obtaining 0.977, 0.913, and 0.667 for histology, mammography, and multimodality, respectively. The findings from the study confirmed that multimodal image classification based on a combination of image features and predicted labels improves performance. In addition, the study shows that feature dimensionality reduction based on a binary optimizer supports the elimination of non-discriminant features capable of bottlenecking the classifier.

    A novel binary greater cane rat algorithm for feature selection

    There is a surge in the application of population-based metaheuristic algorithms to find the optimal feature subset from high-dimensional datasets. Many of these approaches cannot scale properly, especially as they are expected to balance two opposing goals: maximizing the accuracy of classification while at the same time minimizing the number of features selected. In this study, a novel binary greater cane rat algorithm (GCRA) is proposed, inspired by the intelligent nocturnal behavior of the greater cane rat (GCR), which significantly affects its foraging and mating activities. The rats leave trails to food sources, shelters, and water as they forage, and this information is kept by the dominant member. Also, they split into male and female groups during the mating season, which coincides with abundant food supply and proximity to a water source. This information is modeled into an effective method for selecting the optimal feature subset from high-dimensional datasets using two different approaches. Firstly, five variants of binary GCRA are developed using one each from the families of S-shaped, V-shaped, U-shaped, Z-shaped, and quadratic transfer functions to binarize the GCRA. Secondly, a threshold that maps a variable to 0 or 1 is used to develop a further variant of GCRA. The performance of the six variants was evaluated using 12 datasets with different dimensionalities. The experimental results show the stability of all the proposed methods, as they generally performed competitively. However, the threshold version, known as BGCRA, performed better, yielding the highest classification accuracy on 9 of the 12 datasets used in the study and ranking second in selecting the smallest number of important features. It also showed superiority over the other variants in yielding the lowest average fitness values on 11 of 12 (91.6%) of the datasets used. Hence, the BGCRA was used for further comparative analysis against 5 other popular feature selection (FS) algorithms, showing outstanding performance: it produced the highest mean classification accuracy on 91.6% (11 of 12) of the datasets, the lowest average fitness values on 100% of them, and the smallest average number of significant features on 91.6%. The results were also validated by statistical tests, which showed that the BGCRA is significantly superior to the other methods.
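
The threshold mapping and the two opposing goals described above can be sketched as follows. The abstract gives neither the threshold value nor the fitness formula, so the cut-off of 0.5, the weight `alpha = 0.99`, and the weighted-sum form are assumptions borrowed from common wrapper-based feature selection practice; the function names are hypothetical.

```python
def threshold_binarize(position, threshold=0.5):
    """Map each continuous variable to 1 if it exceeds the threshold, else 0.

    The 0.5 cut-off is an assumed value, not one stated in the abstract.
    """
    return [1 if x > threshold else 0 for x in position]


def feature_selection_fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classification error and fraction of features kept.

    Lower is better. alpha trades accuracy against subset size; 0.99 is an
    assumed, commonly used weight rather than the paper's setting.
    """
    if not mask:
        raise ValueError("mask must contain at least one feature flag")
    selected_ratio = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_ratio


if __name__ == "__main__":
    mask = threshold_binarize([0.2, 0.7, 0.5, 0.9])
    print("mask:", mask)
    # The error rate would normally come from a classifier trained on the
    # selected features; 0.1 is a placeholder value for illustration.
    print("fitness:", feature_selection_fitness(0.1, mask))
```

With equal error rates, a mask that keeps fewer features receives a lower (better) fitness value, which is how a single scalar can encode both goals.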

    Illustration of the transformed binary images of normal, benign, and malignant samples into grayscale.

    Convergent curves of EOSA on standard benchmark functions over 1, 50, 100, 200, 300, 400 and 500 epochs.

    CNN hyperparameter configuration.

    Recent research has shown an increased spread of non-communicable diseases such as cancer. Lung cancer diagnosis and detection has become one of the biggest challenges in recent years. Early diagnosis and detection of lung cancer would improve safety and the survival of many lives globally. The precise classification of lung cancer using medical images will help physicians select suitable therapy to reduce cancer mortality. Much work has been carried out in lung cancer detection using CNNs. However, lung cancer prediction remains difficult due to the multifaceted designs in the CT scan. Moreover, CNN models face challenges that affect their performance, including choosing the optimal architecture, selecting suitable model parameters, and picking the best values for weights and biases. To address the problem of selecting the optimal weight and bias combination required for classification of lung cancer in CT images, this study proposes a hybrid metaheuristic and CNN algorithm. We first designed a CNN architecture and then computed the solution vector of the model. The resulting solution vector was passed to the Ebola optimization search algorithm (EOSA) to select the best combination of weights and biases to train the CNN model to handle the classification problem. After thoroughly training the EOSA-CNN hybrid model, we obtained the optimal configuration, which yielded good performance. Experimentation with the publicly accessible Iraq-Oncology Teaching Hospital / National Center for Cancer Diseases (IQ-OTH/NCCD) lung cancer dataset showed that the EOSA metaheuristic algorithm yielded a classification accuracy of 0.9321. Similarly, performance comparisons of EOSA-CNN with other methods, namely GA-CNN, LCBO-CNN, MVO-CNN, SBO-CNN, WOA-CNN, and the classical CNN, were also computed and presented. The results showed that EOSA-CNN achieved specificities of 0.7941, 0.97951, and 0.9328, and sensitivities of 0.9038, 0.13333, and 0.9071 for normal, benign, and malignant cases, respectively. This confirms that the hybrid algorithm provides a good solution for the classification of lung cancer.
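
Per-class sensitivity and specificity of the kind reported above are commonly derived one-vs-rest from a multiclass confusion matrix. The sketch below shows that calculation; the matrix values are invented for illustration and are not the paper's results, and the function name is hypothetical.

```python
def per_class_sensitivity_specificity(confusion):
    """Return a (sensitivity, specificity) pair for each class, one-vs-rest.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other, predicted k
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results.append((sensitivity, specificity))
    return results


if __name__ == "__main__":
    # Illustrative counts only; rows and columns are normal, benign, malignant.
    example = [
        [8, 2, 0],
        [1, 1, 0],
        [0, 1, 9],
    ]
    labels = ["normal", "benign", "malignant"]
    for label, (sens, spec) in zip(labels, per_class_sensitivity_specificity(example)):
        print(f"{label}: sensitivity={sens:.4f}, specificity={spec:.4f}")
```

A class with few samples, such as the benign row in this made-up matrix, can show low sensitivity alongside high specificity, which is one plausible reading of an asymmetric pair of figures for a minority class.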