3,890 research outputs found

    Identification of disease-causing genes using microarray data mining and gene ontology

    Get PDF
    Background: One of the best and most accurate methods for identifying disease-causing genes is monitoring gene expression values in different samples using microarray technology. One of the shortcomings of microarray data is that they provide a small quantity of samples with respect to the number of genes. This problem reduces the classification accuracy of the methods, so gene selection is essential to improve the predictive accuracy and to identify potential marker genes for a disease. Among numerous existing methods for gene selection, support vector machine-based recursive feature elimination (SVMRFE) has become one of the leading methods, but its performance can be reduced because of the small sample size, noisy data and the fact that the method does not remove redundant genes. Methods: We propose a novel framework for gene selection which uses the advantageous features of conventional methods and addresses their weaknesses. In fact, we have combined the Fisher method and SVMRFE to utilize the advantages of a filtering method as well as an embedded method. Furthermore, we have added a redundancy reduction stage to address the weakness of the Fisher method and SVMRFE. In addition to gene expression values, the proposed method uses Gene Ontology which is a reliable source of information on genes. The use of Gene Ontology can compensate, in part, for the limitations of microarrays, such as having a small number of samples and erroneous measurement results. Results: The proposed method has been applied to colon, Diffuse Large B-Cell Lymphoma (DLBCL) and prostate cancer datasets. The empirical results show that our method has improved classification performance in terms of accuracy, sensitivity and specificity. In addition, the study of the molecular function of selected genes strengthened the hypothesis that these genes are involved in the process of cancer growth. Conclusions: The proposed method addresses the weakness of conventional methods by adding a redundancy reduction stage and utilizing Gene Ontology information. It predicts marker genes for colon, DLBCL and prostate cancer with a high accuracy. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help in the search for a cure for cancers

    ANTIDS: Self-Organized Ant-based Clustering Model for Intrusion Detection System

    Full text link
    Security of computers and the networks that connect them is increasingly becoming of great significance. Computer security is defined as the protection of computing systems against threats to confidentiality, integrity, and availability. There are two types of intruders: the external intruders who are unauthorized users of the machines they attack, and internal intruders, who have permission to access the system with some restrictions. Due to the fact that it is more and more improbable to a system administrator to recognize and manually intervene to stop an attack, there is an increasing recognition that ID systems should have a lot to earn on following its basic principles on the behavior of complex natural systems, namely in what refers to self-organization, allowing for a real distributed and collective perception of this phenomena. With that aim in mind, the present work presents a self-organized ant colony based intrusion detection system (ANTIDS) to detect intrusions in a network infrastructure. The performance is compared among conventional soft computing paradigms like Decision Trees, Support Vector Machines and Linear Genetic Programming to model fast, online and efficient intrusion detection systems.Comment: 13 pages, 3 figures, Swarm Intelligence and Patterns (SIP)- special track at WSTST 2005, Muroran, JAPA

    A Clustering System for Dynamic Data Streams Based on Metaheuristic Optimisation

    Get PDF
    open access articleThis article presents the Optimised Stream clustering algorithm (OpStream), a novel approach to cluster dynamic data streams. The proposed system displays desirable features, such as a low number of parameters and good scalability capabilities to both high-dimensional data and numbers of clusters in the dataset, and it is based on a hybrid structure using deterministic clustering methods and stochastic optimisation approaches to optimally centre the clusters. Similar to other state-of-the-art methods available in the literature, it uses “microclusters” and other established techniques, such as density based clustering. Unlike other methods, it makes use of metaheuristic optimisation to maximise performances during the initialisation phase, which precedes the classic online phase. Experimental results show that OpStream outperforms the state-of-the-art methods in several cases, and it is always competitive against other comparison algorithms regardless of the chosen optimisation method. Three variants of OpStream, each coming with a different optimisation algorithm, are presented in this study. A thorough sensitive analysis is performed by using the best variant to point out OpStream’s robustness to noise and resiliency to parameter changes

    An Optimisation-Driven Prediction Method for Automated Diagnosis and Prognosis

    Get PDF
    open access articleThis article presents a novel hybrid classification paradigm for medical diagnoses and prognoses prediction. The core mechanism of the proposed method relies on a centroid classification algorithm whose logic is exploited to formulate the classification task as a real-valued optimisation problem. A novel metaheuristic combining the algorithmic structure of Swarm Intelligence optimisers with the probabilistic search models of Estimation of Distribution Algorithms is designed to optimise such a problem, thus leading to high-accuracy predictions. This method is tested over 11 medical datasets and compared against 14 cherry-picked classification algorithms. Results show that the proposed approach is competitive and superior to the state-of-the-art on several occasions

    An approach based on tunicate swarm algorithm to solve partitional clustering problem

    Get PDF
    The tunicate swarm algorithm (TSA) is a newly proposed population-based swarm optimizer for solving global optimization problems. TSA uses best solution in the population in order improve the intensification and diversification of the tunicates. Thus, the possibility of finding a better position for search agents has increased. The aim of the clustering algorithms is to distributed the data instances into some groups according to similar and dissimilar features of instances. Therefore, with a proper clustering algorithm the dataset will be separated to some groups and it’s expected that the similarities of groups will be minimum. In this work, firstly, an approach based on TSA has proposed for solving partitional clustering problem. Then, the TSA is implemented on ten different clustering problems taken from UCI Machine Learning Repository, and the clustering performance of the TSA is compared with the performances of the three well known clustering algorithms such as fuzzy c-means, k-means and k-medoids. The experimental results and comparisons show that the TSA based approach is highly competitive and robust optimizer for solving the partitional clustering problems

    SUBIC: A Supervised Bi-Clustering Approach for Precision Medicine

    Full text link
    Traditional medicine typically applies one-size-fits-all treatment for the entire patient population whereas precision medicine develops tailored treatment schemes for different patient subgroups. The fact that some factors may be more significant for a specific patient subgroup motivates clinicians and medical researchers to develop new approaches to subgroup detection and analysis, which is an effective strategy to personalize treatment. In this study, we propose a novel patient subgroup detection method, called Supervised Biclustring (SUBIC) using convex optimization and apply our approach to detect patient subgroups and prioritize risk factors for hypertension (HTN) in a vulnerable demographic subgroup (African-American). Our approach not only finds patient subgroups with guidance of a clinically relevant target variable but also identifies and prioritizes risk factors by pursuing sparsity of the input variables and encouraging similarity among the input variables and between the input and target variable

    Big data analytics:Computational intelligence techniques and application areas

    Get PDF
    Big Data has significant impact in developing functional smart cities and supporting modern societies. In this paper, we investigate the importance of Big Data in modern life and economy, and discuss challenges arising from Big Data utilization. Different computational intelligence techniques have been considered as tools for Big Data analytics. We also explore the powerful combination of Big Data and Computational Intelligence (CI) and identify a number of areas, where novel applications in real world smart city problems can be developed by utilizing these powerful tools and techniques. We present a case study for intelligent transportation in the context of a smart city, and a novel data modelling methodology based on a biologically inspired universal generative modelling approach called Hierarchical Spatial-Temporal State Machine (HSTSM). We further discuss various implications of policy, protection, valuation and commercialization related to Big Data, its applications and deployment
    • …
    corecore