499 research outputs found

    Enhancing SAEAs with Unevaluated Solutions: A Case Study of Relation Model for Expensive Optimization

    Surrogate-assisted evolutionary algorithms (SAEAs) are important for solving expensive optimization problems (EOPs). Extensive efforts have been devoted to improving the efficacy of SAEAs through the development of proficient model-assisted selection methods. However, generating high-quality solutions is a prerequisite for selection. The fundamental paradigm of evaluating a limited number of solutions in each generation within SAEAs reduces the variance of adjacent populations, thus impacting the quality of offspring solutions. This is a frequently encountered issue, yet it has not gained widespread attention. This paper presents a framework using unevaluated solutions to enhance the efficiency of SAEAs. The surrogate model is employed to identify high-quality solutions for direct generation of new solutions without evaluation. To ensure dependable selection, we have introduced two tailored relation models for the selection of the optimal solution and the unevaluated population. A comprehensive experimental analysis is performed on two test suites, which showcases the superiority of the relation model over regression and classification models in the selection phase. Furthermore, the surrogate-selected unevaluated solutions with high potential have been shown to significantly enhance the efficiency of the algorithm. Comment: 18 pages, 9 figures
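
    The abstract does not spell out how the relation models are trained. A common way to build a relation model is to learn from ordered pairs of evaluated solutions, and the sketch below illustrates only that data-preparation step. The function name, the pair encoding (concatenation), and the toy values are assumptions for illustration, not the paper's method.

```python
from itertools import permutations

def build_relation_pairs(solutions, fitness):
    """Turn evaluated solutions into labeled pairs (minimization assumed).

    Label 1 means the first solution of the pair is better, 0 otherwise.
    """
    pairs = []
    for i, j in permutations(range(len(solutions)), 2):
        # Hypothetical encoding: concatenate the two decision vectors.
        features = tuple(solutions[i]) + tuple(solutions[j])
        label = 1 if fitness[i] < fitness[j] else 0
        pairs.append((features, label))
    return pairs

# Toy data standing in for expensive evaluations (values are made up).
evaluated = [(0.0, 1.0), (2.0, 2.0), (0.5, 0.5)]
objective = [1.0, 8.0, 0.5]
pairs = build_relation_pairs(evaluated, objective)
```

    Any binary classifier could then be fitted to these pairs and queried to compare unevaluated candidates; n evaluated solutions give n(n-1) ordered training pairs.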

    Evolutionary Multiobjective Optimization Driven by Generative Adversarial Networks (GANs)

    Recently, a growing number of works have proposed to drive evolutionary algorithms using machine learning models. Usually, the performance of such model-based evolutionary algorithms is highly dependent on the training quality of the adopted models. Since model training usually requires a certain amount of data (i.e. the candidate solutions generated by the algorithms), the performance deteriorates rapidly as the problem scale increases, due to the curse of dimensionality. To address this issue, we propose a multi-objective evolutionary algorithm driven by generative adversarial networks (GANs). At each generation of the proposed algorithm, the parent solutions are first classified into real and fake samples to train the GANs; then the offspring solutions are sampled by the trained GANs. Thanks to the powerful generative ability of the GANs, our proposed algorithm is capable of generating promising offspring solutions in high-dimensional decision space with limited training data. The proposed algorithm is tested on 10 benchmark problems with up to 200 decision variables. Experimental results on these test problems demonstrate the effectiveness of the proposed algorithm.

    A feature selection strategy for the analysis of spectra from a photoacoustic sensing system

    Within the EU project CUSTOM, a new sensor system for the detection of drug precursors in gaseous samples is being developed, which also includes an External Cavity-Quantum Cascade Laser Photo Acoustic Sensor (EC-QCLPAS). To define the characteristics of the laser source, the optimal wavenumbers within the most effective 200 cm⁻¹ range in the mid-infrared region must be identified, so as to achieve optimal detection of the drug precursor molecules in the presence of interfering species and of variable composition of the surrounding atmosphere. To this aim, based on simulations made with FT-IR spectra taken from the literature, a complex multivariate analysis strategy has been developed to select the optimal wavenumbers. First, the synergistic use of Experimental Design and Signal Processing techniques led to a dataset of 5000 simulated spectra of mixtures of 33 different gases (including the 4 target molecules). After a preselection aimed at discarding noisy regions due to small interfering molecules, the simulated mixtures were used to select the optimal wavenumber range by maximizing the classification efficiency, as estimated by Partial Least Squares - Discriminant Analysis. A moving window 200 cm⁻¹ wide was used for this purpose. Finally, the optimal wavenumber values were identified within the selected range, using a feature selection approach based on Genetic Algorithms and on resampling. The approach can be transferred relatively easily to the spectra actually recorded with the newly developed EC-QCLPAS instrument. Furthermore, the proposed approach allows progressive adaptation of the spectral dataset to real situations, even accounting for specific, different environments.
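
    The moving-window step can be pictured with a minimal sketch. It assumes a per-wavenumber usefulness score as a stand-in for the classification efficiency that the study estimates with PLS-DA on the spectra inside each window; the function name, grid, and scores are hypothetical.

```python
def best_window(wavenumbers, scores, width=200.0):
    """Return (start, mean_score) of the best window of the given width.

    wavenumbers: ascending positions in cm^-1.
    scores: toy per-position scores (a stand-in for a classifier's efficiency).
    """
    best = None
    for i, start in enumerate(wavenumbers):
        if start + width > wavenumbers[-1]:
            break  # the window would run past the end of the axis
        inside = [s for w, s in zip(wavenumbers[i:], scores[i:])
                  if w <= start + width]
        mean = sum(inside) / len(inside)
        if best is None or mean > best[1]:
            best = (start, mean)
    return best

# Hypothetical grid from 1000 to 1600 cm^-1 in 50 cm^-1 steps, with made-up scores.
grid = [1000.0 + 50.0 * k for k in range(13)]
toy_scores = [0.1, 0.2, 0.2, 0.3, 0.8, 0.9, 0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.1]
start, mean = best_window(grid, toy_scores)
```

    With these toy scores the selected window starts at 1200 cm⁻¹. In a real pipeline the averaged score would be replaced by a classifier trained on the spectral variables inside each window.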

    New Methods for the Prediction and Classification of Protein Domains


    Deducing corticotropin-releasing hormone receptor type 1 signaling networks from gene expression data by usage of genetic algorithms and graphical Gaussian models

    Background: Dysregulation of the hypothalamic-pituitary-adrenal (HPA) axis is a hallmark of complex and multifactorial psychiatric diseases such as anxiety and mood disorders. About 50-60% of patients with major depression show HPA axis dysfunction, i.e. hyperactivity and impaired negative feedback regulation. The neuropeptide corticotropin-releasing hormone (CRH) and its receptor type 1 (CRHR1) are key regulators of this neuroendocrine stress axis. Therefore, we analyzed CRH/CRHR1-dependent gene expression data obtained from the pituitary corticotrope cell line AtT-20, a well-established in vitro model for CRHR1-mediated signal transduction. To extract significantly regulated genes from a genome-wide microarray data set and to deduce underlying CRHR1-dependent signaling networks, we combined supervised and unsupervised algorithms. Results: We present an efficient variable selection strategy by consecutively applying univariate as well as multivariate methods followed by graphical models. First, feature preselection was used to exclude genes not differentially regulated over time from the dataset. For multivariate variable selection a maximum likelihood (MLHD) discriminant function within GALGO, an R package based on a genetic algorithm (GA), was chosen. The topmost genes representing major nodes in the expression network were ranked to find highly separating candidate genes. By using groups of five genes (chromosome size) in the discriminant function and repeating the genetic algorithm separately four times, we found eleven genes occurring in at least three of the top-ranked result lists of the four repetitions. In addition, we compared the results of GA/MLHD with the alternative optimization algorithms greedy selection and simulated annealing as well as with the state-of-the-art method random forest. In every case we obtained a clear overlap of the selected genes, independently confirming the results of MLHD in combination with a genetic algorithm. With two unsupervised algorithms, principal component analysis and graphical Gaussian models, putative interactions of the candidate genes were determined and reconstructed by literature mining. Differential regulation of six candidate genes was validated by qRT-PCR. Conclusions: The combination of supervised and unsupervised algorithms in this study allowed extracting a small subset of meaningful candidate genes from the genome-wide expression data set. Thereby, variable selection using different optimization algorithms based on linear classifiers as well as the nonlinear random forest method resulted in congruent candidate genes. The calculated interacting network connecting these new target genes was bioinformatically mapped to known CRHR1-dependent signaling pathways. Additionally, the differential expression of the identified target genes was confirmed experimentally.
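
    The rule of keeping genes that recur in at least three of four top-ranked lists can be sketched as follows. The gene names and list contents are placeholders, and the helper is a hypothetical illustration written in Python rather than part of GALGO.

```python
from collections import Counter

def recurrent_genes(top_lists, min_runs=3):
    """Return genes that appear in at least min_runs of the top-ranked lists."""
    # set(run) ensures a gene is counted at most once per repetition.
    counts = Counter(gene for run in top_lists for gene in set(run))
    return sorted(gene for gene, n in counts.items() if n >= min_runs)

# Placeholder top-ranked lists from four independent GA repetitions.
runs = [
    ["geneA", "geneB", "geneC"],
    ["geneA", "geneC", "geneD"],
    ["geneA", "geneB", "geneE"],
    ["geneA", "geneC", "geneB"],
]
stable = recurrent_genes(runs)
```

    Requiring recurrence across independent repetitions filters out genes that a single stochastic GA run selected by chance.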

    Learning an L1-regularized Gaussian Bayesian Network in the Equivalence Class Space

    Learning the structure of a graphical model from data is a common task in a wide range of practical applications. In this paper, we focus on Gaussian Bayesian networks, i.e., on continuous data and directed acyclic graphs with a joint probability density of all variables given by a Gaussian. We propose to work in an equivalence class search space, specifically using the k-greedy equivalence search algorithm. This, combined with regularization techniques to guide the structure search, can learn sparse networks close to the one that generated the data. We provide results on some synthetic networks and on modeling the gene network of the two biological pathways regulating the biosynthesis of isoprenoids for the Arabidopsis thaliana plant.

    evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R

    Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search. Although this approach is known to be an efficient heuristic, the results of recursive tree methods are only locally optimal, as splits are chosen to maximize homogeneity at the next step only. An alternative way to search over the parameter space of trees is to use global optimization methods like evolutionary algorithms. This paper describes the evtree package, which implements an evolutionary algorithm for learning globally optimal classification and regression trees in R. Computationally intensive tasks are fully computed in C++ while the partykit package is leveraged for representing the resulting trees in R, providing unified infrastructure for summaries, visualizations, and predictions. evtree is compared to the open-source CART implementation rpart, conditional inference trees (ctree), and the open-source C4.5 implementation J48. A benchmark study of predictive accuracy and complexity is carried out in which evtree achieved results at least similar to, and most of the time better than, those of rpart, ctree, and J48. Furthermore, the usefulness of evtree in practice is illustrated in a textbook customer classification task.