11 research outputs found

    A new model for large dataset dimensionality reduction based on teaching learning-based optimization and logistic regression

    Breast cancer (BC) is one of the human diseases with a high mortality rate each year and, among all forms of cancer, the most common cause of death among women globally. Data mining and classification methods are effective ways of classifying data. These methods are particularly useful in the medical field because medical datasets contain irrelevant and redundant attributes, which are not needed to obtain an accurate disease diagnosis. Teaching learning-based optimization (TLBO) is a new metaheuristic that has been successfully applied to several intractable optimization problems in recent years. This paper presents the use of a multi-objective TLBO algorithm for the selection of feature subsets in automatic BC diagnosis. The logistic regression (LR) method was deployed for the classification task. The results show that the proposed method produced better classification accuracy on the BC dataset (classified into malignant and benign), indicating that the proposed TLBO is an efficient feature optimization technique for sustaining data-based decision-making systems.
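
The abstract does not state which objectives the multi-objective search trades off. As a minimal sketch, assuming the two objectives are classification accuracy (maximized) and the number of selected features (minimized), candidate feature subsets could be compared by Pareto dominance. The names `dominates` and `pareto_front` and the scores are hypothetical, and this is not the paper's TLBO procedure.

```python
from typing import List, Tuple

# Each candidate is (accuracy, n_features). Both objectives are assumptions:
# maximize accuracy, minimize the number of selected features.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up scores for four hypothetical feature subsets.
subsets = [(0.95, 12), (0.96, 20), (0.93, 12), (0.96, 25)]
front = pareto_front(subsets)
```

Here the subsets with scores (0.93, 12) and (0.96, 25) are dropped because another subset matches or beats them on both objectives.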

    Current Studies and Applications of Krill Herd and Gravitational Search Algorithms in Healthcare

    Nature-Inspired Computing (NIC) is a relatively young field that seeks new methods of computing by studying how natural phenomena solve complicated problems in many contexts. It has led to ground-breaking research in a variety of domains, including artificial immune systems, neural networks, swarm intelligence, and evolutionary computing. NIC techniques are used in biology, physics, engineering, economics, and management. Metaheuristic algorithms are successful, efficient, and resilient in real-world classification, optimization, forecasting, and clustering, as well as in engineering and science problems. Two active NIC algorithms are the Krill Herd Algorithm (KH) and the Gravitational Search Algorithm (GSA). This publication gives a worldwide and historical review of studies that use KH and GSA in medicine and healthcare. Comprehensive surveys have been conducted on some other nature-inspired algorithms, including KH and GSA. Nonetheless, no survey research on KH and GSA in the healthcare field has been undertaken. The present article therefore thoroughly reviews the various versions of the KH and GSA algorithms and their applications in healthcare, to assist researchers in using them in diverse domains or hybridizing them with other popular algorithms. It also provides an in-depth examination of KH and GSA in terms of application, modification, and hybridization. The goal of the study is to offer a viewpoint on GSA with KH, particularly for academics interested in investigating the capabilities and performance of the algorithms in the healthcare and medical domains. Comment: 35 pages

    A Hybrid Metaheuristics based technique for Mutation Based Disease Classification

    Due to recent advancements in computational biology, DNA microarray technology has evolved into a useful tool for detecting mutations in complex diseases such as cancer. The availability of thousands of microarray datasets makes this field an active area of research. Early cancer detection can reduce the mortality rate and the treatment cost. Cancer classification is a process that provides a detailed overview of the disease microenvironment for better diagnosis. However, gene microarray datasets suffer from the curse of dimensionality, and classification models are prone to overfitting because of the small sample size and large feature space. To address these issues, the authors propose an Improved Binary Competitive Swarm Optimization Whale Optimization Algorithm (IBCSOWOA) for cancer classification, in which IBCSO is employed to reduce the informative gene subset obtained by using minimum redundancy maximum relevance (mRMR) as a filter method. The IBCSOWOA technique has been tested on an artificial neural network (ANN) model, and the whale optimization algorithm (WOA) is used for parameter tuning of the model. The performance of the proposed IBCSOWOA is tested on six different mutation-based microarray datasets and compared with existing disease prediction methods. The experimental results indicate the superiority of the proposed technique over existing nature-inspired methods in terms of optimal feature subset, classification accuracy, and convergence rate. The proposed technique achieved above 98% accuracy on all six datasets, with the highest accuracy of 99.45% on the Lung cancer dataset.

    Comparison of microarray breast cancer classification using support vector machine and logistic regression with LASSO and boruta feature selection

    Breast cancer is the most frequent cancer diagnosis amongst women worldwide. Despite advances in medical diagnostic and prognostic tools for the early detection and treatment of breast cancer patients, research on the development of better and more reliable tools is still actively conducted globally. Breast cancer classification is important in ensuring a reliable diagnostic system. This work presents preliminary research on the use of machine learning classifiers and feature selection methods for breast cancer classification. Two feature selection methods, Boruta and LASSO, and two classifiers, support vector machine (SVM) and logistic regression (LR), are studied. A breast cancer dataset from the GEO website is adopted in this study. The findings show that LASSO with LR gives the best accuracy on this dataset.

    Multiobjective Evolutionary Algorithms applied to Feature Selection in Microarrays Cancer Data

    Microarray analysis of gene expression is a current topic in the diagnosis and classification of human cancer. A gene expression microarray dataset consists of a matrix of thousands of features, most of which are irrelevant for classifying gene expression patterns. Choosing a minimal subset of features for classification is a difficult task. In this work, a comparison is made between two multi-objective evolutionary algorithms applied to gene expression datasets popular in the literature (lymphoma, leukemia, and colon). A pre-processing stage is performed to remove strongly correlated features. An extensive and detailed analysis of the results obtained for the selected multi-objective algorithms is shown.
    Affiliation: Dussaut, Julieta Sol. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional del Sur; Argentina
    Affiliation: Ponzoni, Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional del Sur; Argentina
    Affiliation: Olivera, Ana Carolina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina. Universidad Nacional de Cuyo; Argentina
    Affiliation: Vidal, Pablo Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina. Universidad Nacional de Cuyo; Argentina
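
The abstract mentions a pre-processing stage that removes strongly correlated features but gives no details. One common way to do this, shown here as a sketch only, is a greedy filter on absolute Pearson correlation. The function names, the 0.9 threshold, and the toy data are assumptions and are not taken from the paper.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns: List[Sequence[float]], threshold: float = 0.9) -> List[int]:
    """Greedily keep a column only if |r| with every kept column is below threshold."""
    kept: List[int] = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: column 1 is exactly twice column 0, column 2 is unrelated.
features = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.0, -1.0, 1.0, -1.0]]
kept_indices = drop_correlated(features)
```

The filter is order-dependent: the first column of a correlated pair is the one that survives, which is a design simplification rather than a recommendation.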

    Pertanika Journal of Science & Technology


    Cooperative co-evolution for feature selection in big data with random feature grouping

    © 2020, The Author(s). A massive amount of data is generated with the evolution of modern technologies. This high-throughput data generation results in Big Data, which consists of many features (attributes). However, irrelevant features may degrade the classification performance of machine learning (ML) algorithms. Feature selection (FS) is a technique used to select a subset of relevant features that represent the dataset. Evolutionary algorithms (EAs) are widely used search strategies in this domain. A variant of EAs called cooperative co-evolution (CC), which uses a divide-and-conquer approach, is a good choice for optimization problems. The existing solutions perform poorly because of limitations such as not considering feature interactions, dealing with only an even number of features, and decomposing the dataset statically. In this paper, a novel random feature grouping (RFG) is introduced with its three variants to dynamically decompose Big Data datasets and to ensure the probability of grouping interacting features into the same subcomponent. RFG can be used in CC-based FS processes, hence the name Cooperative Co-Evolutionary-Based Feature Selection with Random Feature Grouping (CCFSRFG). Experimental analysis was performed using six widely used ML classifiers on seven different datasets from the UCI ML repository and the Princeton University Genomics repository, with and without FS. The experimental results indicate that in most cases [i.e., with naïve Bayes (NB), support vector machine (SVM), k-Nearest Neighbor (k-NN), J48, and random forest (RF)] the proposed CCFSRFG-1 outperforms an existing solution (a CC-based FS called CCEAFS), CCFSRFG-2, and the use of all features in terms of accuracy, sensitivity, and specificity.
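
The general idea of random feature grouping can be illustrated with a short sketch: shuffle the feature indices and split them into subcomponents, repeating the shuffle every generation so that the decomposition is dynamic and works for any number of features. This does not reproduce any of the paper's three RFG variants. The function name, the group count, and the seed are assumptions.

```python
import random
from typing import List


def random_feature_groups(n_features: int, n_groups: int, rng: random.Random) -> List[List[int]]:
    """Shuffle feature indices and deal them into n_groups nearly equal groups."""
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Slicing with a stride deals the shuffled indices out round-robin, so
    # group sizes differ by at most one even when n_features is odd.
    return [indices[g::n_groups] for g in range(n_groups)]


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
# Drawing a fresh grouping per generation is what makes the decomposition dynamic.
for generation in range(3):
    groups = random_feature_groups(n_features=7, n_groups=3, rng=rng)
```

Each subcomponent would then be optimized by its own subpopulation in a CC-based search. That step is omitted here.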

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning, among others. This is partly because the classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, by contrast, guides low-level heuristics to search beyond local optima, which limit the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications that drive the advances of AMC.
