Schema theory based data engineering in gene expression programming for big data analytics
Gene expression programming (GEP) is a data-driven evolutionary technique that is well suited to correlation mining. Parallel GEPs have been proposed to speed up the evolution process using a cluster of computers or a computer with multiple CPU cores. However, the generation structure of chromosomes and the size of the input data are two issues that tend to be neglected when speeding up GEP evolution. To fill this research gap, this paper proposes three guiding principles that elaborate the computational nature of GEP evolution, based on an analysis of GEP schema theory. As a result, a novel data-engineered GEP is developed that closely follows the generation structure of chromosomes in parallelization and considers the input data size in segmentation. Experimental results on two data sets with complementary features show that the data-engineered GEP speeds up the evolution process significantly without loss of accuracy in data correlation mining. Based on the experimental tests, a computation model of the data-engineered GEP is further developed to demonstrate its high scalability in dealing with potential big data using a large number of CPU cores.
Artificial immune systems based committee machine for classification application
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A new adaptive learning Artificial Immune System (AIS) based committee machine is developed in this thesis. The proposed approach efficiently tackles the general problem of clustering high-dimensional data. In addition, it helps derive useful decisions and results in other application domains such as classification and prediction. Artificial Immune Systems (AIS) are a branch of the computational intelligence field inspired by the biological immune system, and have gained increasing interest among researchers developing immune-based models and techniques to solve diverse complex computational or engineering problems. This work presents some applications of AIS techniques to health problems, and a thorough survey of existing AIS models and algorithms. The main focus of this research is building an ensemble model that integrates different AIS techniques (i.e. Artificial Immune Networks, Clonal Selection, and Negative Selection) for classification applications to achieve better classification results. A new AIS-based ensemble architecture with adaptive learning features is proposed by integrating different learning and adaptation techniques to overcome individual limitations and to achieve synergetic effects through the combination of these techniques. Various techniques related to the design and enhancement of the new adaptive learning architecture are studied, including a neuro-fuzzy based detector and an optimizer using the particle swarm optimization method to achieve enhanced classification performance. An evaluation study was conducted to show the performance of the proposed adaptive learning ensemble and to compare it to alternative combining techniques. Several experiments on the classification problem are presented using different medical datasets, and the findings and outcomes are discussed.
The new adaptive learning architecture improves the accuracy of the ensemble. Moreover, there is an improvement over the existing aggregation techniques. The outcomes, assumptions and limitations of the proposed methods, together with their implications for further research in this area, draw this research to its conclusion.
An Evolutionary Algorithm to Generate Ellipsoid Detectors for Negative Selection
Negative selection is a process from the biological immune system that can be applied to two-class (self and nonself) classification problems. Negative selection uses only one class (self) for training, which results in detectors for the other class (nonself). This paradigm is especially useful for problems in which only one class is available for training, such as network intrusion detection. Previous work has investigated hyper-rectangles and hyper-spheres as geometric detectors. This work proposes ellipsoids as geometric detectors. First, the author establishes a mathematical model for ellipsoids. He develops an algorithm to generate ellipsoids by training on only one class of data. Ellipsoid mutation operators, an objective function, and a convergence technique are described for the evolutionary algorithm that generates ellipsoid detectors. Testing on several data sets validates this approach by showing that the algorithm generates good ellipsoid detectors. On artificial data sets, the detectors generated by the algorithm match more than 90% of nonself data with no false alarms. On a subset of data from the 1999 DARPA MIT intrusion detection data, the ellipsoids generated by the algorithm detected approximately 98% of nonself (intrusions) with an approximately 0% false alarm rate.
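The negative selection idea described in this abstract (train on self only, keep detectors that cover nonself) can be illustrated with a minimal sketch. This is not the author's evolutionary algorithm: it swaps in plain random generate-and-test, and it assumes axis-aligned ellipsoids in the unit hypercube for simplicity. All function names and parameter ranges are hypothetical.

```python
import random

def inside_ellipsoid(point, center, semi_axes):
    """True if point lies inside or on an axis-aligned ellipsoid (assumed shape)."""
    return sum(((p - c) / a) ** 2
               for p, c, a in zip(point, center, semi_axes)) <= 1.0

def generate_detectors(self_samples, n_detectors, dim, rng, max_tries=10000):
    """Random generate-and-test: keep candidates that match no self sample.

    This replaces the evolutionary search of the abstract with random sampling.
    """
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        center = [rng.random() for _ in range(dim)]
        semi_axes = [rng.uniform(0.05, 0.3) for _ in range(dim)]  # assumed range
        if not any(inside_ellipsoid(s, center, semi_axes) for s in self_samples):
            detectors.append((center, semi_axes))
    return detectors

def is_nonself(point, detectors):
    """A point is flagged as nonself if any detector covers it."""
    return any(inside_ellipsoid(point, c, a) for c, a in detectors)

# Synthetic self data clustered around (0.5, 0.5); purely illustrative.
rng = random.Random(0)
self_samples = [[0.5 + rng.uniform(-0.1, 0.1), 0.5 + rng.uniform(-0.1, 0.1)]
                for _ in range(200)]
detectors = generate_detectors(self_samples, n_detectors=50, dim=2, rng=rng)
```

By construction no training (self) sample is ever flagged, which mirrors the "no false alarms on training data" property; coverage of nonself space depends on how many detectors are kept and how they are shaped.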
Nature inspired computational intelligence for financial contagion modelling
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Financial contagion refers to a scenario in which small shocks, which initially affect only a few financial institutions or a particular region of the economy, spread to the rest of the financial sector and to other countries whose economies were previously healthy. This resembles the "transmission" of a medical disease. Financial contagion happens at both the domestic and the international level. At the domestic level, the failure of a domestic bank or financial intermediary usually triggers transmission by defaulting on inter-bank liabilities, selling assets in a fire sale, and undermining confidence in similar banks. An example of this phenomenon is the failure of Lehman Brothers and the subsequent turmoil in the US financial markets. International financial contagion happens in both advanced and developing economies, and is the transmission of financial crises across financial markets. Within the current globalised financial system, with large volumes of cash flow and cross-regional operations of large banks and hedge funds, financial contagion usually happens simultaneously among domestic institutions and across countries. There is no conclusive definition of financial contagion; most research papers study contagion by analyzing the change in the variance-covariance matrix during the period of market turmoil. King and Wadhwani (1990) first test the correlations between the US, UK and Japan during the US stock market crash of 1987. Boyer (1997) finds significant increases in correlation during financial crises, and reinforces a definition of financial contagion as a change in correlation during the crash period. Forbes and Rigobon (2002) give a definition of financial contagion. In their work, the term interdependence is used as the alternative to contagion. They claim that for the period they study, there is no contagion but only interdependence.
Interdependence leads to common price movements during periods of both stability and turmoil. In the past two decades, many studies (e.g. Kaminsky et al., 1998; Kaminsky, 1999) have developed early warning systems focused on the origins of financial crises rather than on financial contagion. Other authors (e.g. Forbes and Rigobon, 2002; Caporale et al., 2005), on the other hand, have focused on studying contagion or interdependence. In this thesis, an overall mechanism is proposed that simulates the characteristics of a crisis propagating through contagion. Within that scope, a new co-evolutionary market model is developed, in which some of the technical traders change their behaviour during a crisis and transform into herd traders, making their decisions based on market sentiment rather than underlying strategies or factors. The thesis focuses on the transformation of market interdependence into contagion and on the contagion effects. The author first builds a multi-national platform that allows different types of players to trade, implementing their own rules and considering information from the domestic and a foreign market. Traders' strategies and the performance of the simulated domestic market are trained using historical prices on both markets, and the artificial market's parameters are optimized through immune particle swarm optimization (I-PSO) techniques. The author also introduces a mechanism contributing to the transformation of technical traders into herd traders. A generalized auto-regressive conditional heteroscedasticity copula (GARCH-copula) is further applied to calculate the tail dependence between the affected market and the origin of the crisis. That parameter is used in the fitness function for selecting the best solutions within the evolving population of possible model parameters, and therefore in the optimization criteria for contagion simulation.
The overall model is also applied in predictive mode, where the author optimizes in the pre-crisis period using data from the domestic market and the crisis-origin foreign market, and predicts the affected domestic market in the crisis period using data from the foreign market.
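The tail dependence mentioned in this abstract measures how often two markets suffer extreme losses together. The thesis computes it with a GARCH-copula model; the sketch below swaps in a much simpler nonparametric estimate on rank-transformed data, only to show what the quantity means. The function names and the threshold `q` are assumptions, not taken from the thesis.

```python
def ranks_to_uniform(values):
    """Map values to pseudo-observations in (0, 1) via ranks (ties broken by order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

def lower_tail_dependence(x, y, q=0.1):
    """Empirical estimate of P(V <= q | U <= q) for rank-transformed x and y.

    Values near 1 mean the two series tend to hit their worst outcomes
    together; values near 0 mean joint extremes are rare.
    """
    u, v = ranks_to_uniform(x), ranks_to_uniform(y)
    joint = sum(1 for a, b in zip(u, v) if a <= q and b <= q)
    return joint / (q * len(x))

# Illustrative data: perfectly co-moving and perfectly opposed series.
x = list(range(100))
same = lower_tail_dependence(x, x)
opposed = lower_tail_dependence(x, x[::-1])
```

In a fitness function of the kind the abstract describes, a score like this could be compared between simulated and historical return series; how the thesis actually weights it is not stated in the abstract.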
Evolutionary Computation
This book presents several recent advances in Evolutionary Computation, especially evolution-based optimization methods and hybrid algorithms for several applications, from optimization and learning to pattern recognition and bioinformatics. This book also presents new algorithms based on several analogies and metaphors, one of which is based on philosophy, specifically the philosophy of praxis and dialectics. It also presents interesting applications in bioinformatics, especially the use of particle swarms to discover gene expression patterns in DNA microarrays. This book therefore features representative work in the field of evolutionary computation and applied sciences. The intended audience is graduate and undergraduate students, researchers, and anyone who wishes to become familiar with the latest research work in this field.
A brief history of learning classifier systems: from CS-1 to XCS and its variants
© 2015, Springer-Verlag Berlin Heidelberg. The direction set by Wilson's XCS is that modern Learning Classifier Systems can be characterized by their use of rule accuracy as the utility metric for the search algorithm(s) discovering useful rules. Such searching typically takes place within the restricted space of co-active rules for efficiency. This paper gives an overview of the evolution of Learning Classifier Systems up to XCS, and then of some of the subsequent developments of Wilson's algorithm to different types of learning.
Component Thermodynamical Selection Based Gene Expression Programming for Function Finding
Gene expression programming (GEP), an improvement on genetic programming (GP), has become a popular tool for data mining. However, like other evolutionary algorithms, it tends to suffer from premature convergence and a slow convergence rate when solving complex problems. In this paper, we propose an enhanced GEP algorithm, called CTSGEP, which is inspired by the principle of minimal free energy in thermodynamics. CTSGEP employs a component thermodynamical selection (CTS) operator to quantitatively keep a balance between the selective pressure and the population diversity during the evolution process. Experiments are conducted on several benchmark datasets from the UCI machine learning repository. The results show that CTSGEP outperforms conventional GEP and some GEP variants.
Inferring the clonal identity of single cells from RNA-seq data with Unique Molecular Identifiers
Cancer is an evolutionary disease, in which heterogeneous populations of tumor cells can emerge, proliferate, and disappear depending on selective and neutral processes. This principle has been observed in many studies of acute myeloid leukemia (AML), which is the most common blood cancer in adults. Clonal heterogeneity and evolution have been proposed to play a role in the high relapse rate of this type of cancer. In order to understand this feature, it is crucial to have adequate clinical and experimental models that can provide enough data to elucidate the evolutionary history of a tumor, such as patient-derived xenografts (PDX). These models can be combined with high-resolution sequencing technologies, such as single-cell RNA-seq, to provide a detailed view of the heterogeneity and molecular features of the tumor. However, adequate analytical tools have to be applied and developed in order to fully exploit such datasets.
Here I present the analysis of the clonal heterogeneity of an AML patient and the corresponding PDX model, which was treated with multiple rounds of chemotherapy. This model made it possible to study the response of the tumor populations to the pressure induced by the therapy, and the possible evolutionary forces behind it. Datasets for these AML samples were generated with multiple types of sequencing methods, one of which was single-cell RNA sequencing. To enable the analysis of somatic mutations and clonal populations in this kind of data, I developed a software package that extracts and proofreads variant sequences by making use of Unique Molecular Identifiers (UMIs), which are sequence barcodes that make it possible to identify reads that come from PCR amplification duplicates. The benefits of employing this proofreading approach for variant calling and for inferring the clonal identity of single cells were demonstrated. Finally, I applied the package to the analysis of the single-cell data of the AML PDX samples that were treated with chemotherapy, as well as to other datasets with UMI-based sequencing.
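The UMI-based proofreading idea can be sketched as follows: reads sharing the same cell barcode and UMI are PCR duplicates of one original molecule, so their base calls at a variant position should agree, and disagreements point to amplification or sequencing errors. This is a minimal illustration and not the software package described in the abstract; the input layout, function name and thresholds are all assumptions.

```python
from collections import Counter, defaultdict

def umi_consensus(reads, min_reads=2, min_agreement=0.8):
    """Collapse duplicate reads into one consensus base per (cell, UMI).

    reads: iterable of (cell_barcode, umi, base) tuples observed at a single
    genomic position (hypothetical input format). Groups with too few reads
    or too little agreement are dropped rather than called.
    """
    groups = defaultdict(list)
    for cell, umi, base in reads:
        groups[(cell, umi)].append(base)

    calls = {}
    for key, bases in groups.items():
        if len(bases) < min_reads:
            continue  # a single read cannot be proofread against duplicates
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            calls[key] = base
    return calls

# Illustrative reads: one well-supported molecule, one singleton, one conflict.
reads = (
    [("c1", "AAT", "G")] * 5 + [("c1", "AAT", "T")]
    + [("c1", "CCG", "T")]
    + [("c2", "GGA", "A"), ("c2", "GGA", "C")]
)
calls = umi_consensus(reads)
```

Here only the first molecule survives: five of six duplicates agree on `G`, while the singleton and the evenly split group are discarded. Stricter or looser thresholds trade sensitivity against the risk of calling errors as variants.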