42 research outputs found

    A review of estimation of distribution algorithms in bioinformatics

    Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.

    On the application of estimation of distribution algorithms to multi-marker tagging SNP selection

    This paper presents an algorithm for the automatic selection of a minimal subset of tagging single nucleotide polymorphisms (SNPs) using an estimation of distribution algorithm (EDA). The EDA stochastically searches the constrained space of possible feasible solutions and takes advantage of the underlying topological structure defined by the SNP correlations to model the problem interactions. The algorithm is evaluated across the HapMap reference panel data sets. The introduced algorithm is effective for the identification of minimal multi-marker SNP sets, which considerably reduce the dimension of the tagging SNP set in comparison with single-marker sets. New reduced tagging sets are obtained for all the HapMap SNP regions considered. We also show that the information extracted from the interaction graph representing the correlations between the SNPs can help to improve the efficiency of the optimization algorithm.
    Keywords: SNPs, tagging SNP selection, multi-marker selection, estimation of distribution algorithms, HapMap

    Substructural local search in discrete estimation of distribution algorithms

    Doctoral thesis, Electronic Engineering and Computing, Universidade do Algarve, 2009. SFRH/BD/16980/2004.
    The last decade has seen the rise and consolidation of a new trend of stochastic optimizers known as estimation of distribution algorithms (EDAs). In essence, EDAs build probabilistic models of promising solutions and sample from the corresponding probability distributions to obtain new solutions. This approach has brought a new view to evolutionary computation because, while solving a given problem with an EDA, the user has access to a set of models that reveal probabilistic dependencies between variables, an important source of information about the problem. This dissertation proposes the integration of substructural local search (SLS) in EDAs to speed up the convergence to optimal solutions. Substructural neighborhoods are defined by the structure of the probabilistic models used in EDAs, generating adaptive neighborhoods capable of automatic discovery and exploitation of problem regularities. Specifically, the thesis focuses on the extended compact genetic algorithm and the Bayesian optimization algorithm. The utility of SLS in EDAs is investigated for a number of boundedly difficult problems with modularity, overlapping, and hierarchy, while considering important aspects such as scaling and noise. The results show that SLS can substantially reduce the number of function evaluations required to solve some of these problems. More importantly, the speedups obtained can scale up to the square root of the problem size, $O(\sqrt{\ell})$.
    Fundação para a Ciência e Tecnologia (FCT)
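The model-building and sampling loop described in this abstract can be illustrated with a minimal sketch. It assumes the simplest univariate EDA (a UMDA-style model with independent bit probabilities) on the OneMax toy problem; it is not the extended compact genetic algorithm or the Bayesian optimization algorithm studied in the thesis, and it contains no substructural local search. The names `onemax` and `umda` and all parameter values are hypothetical choices made for illustration.

```python
import random


def onemax(bits):
    """Toy fitness (an assumption for illustration): number of ones."""
    return sum(bits)


def umda(n_bits=20, pop_size=100, n_select=50, generations=50, seed=0):
    """Univariate EDA sketch: estimate per-bit marginals, then sample."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits          # initial model: every bit is a fair coin
    lo, hi = 1 / n_bits, 1 - 1 / n_bits
    best = None
    for _ in range(generations):
        # Sample a new population from the current probabilistic model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Keep the promising solutions (truncation selection).
        population.sort(key=onemax, reverse=True)
        selected = population[:n_select]
        if best is None or onemax(selected[0]) > onemax(best):
            best = selected[0]
        # Re-estimate the model from the selected solutions,
        # clamping each probability so no bit value becomes impossible.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(max(freq, lo), hi)
    return best, probs


if __name__ == "__main__":
    best, probs = umda()
    print("best fitness:", onemax(best))
```

The learned `probs` vector plays the role of the model that the abstract says the user can inspect; multivariate EDAs replace it with a structure that also captures dependencies between variables.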

    Regularized model learning in EDAs for continuous and multi-objective optimization

    Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. $\ell_1$-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired from multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on $\ell_1$-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

    Explicit Building Block Multiobjective Evolutionary Computation: Methods and Applications

    This dissertation presents principles, techniques, and performance of evolutionary computation optimization methods. Concentration is on concepts, design formulation, and prescription for multiobjective problem solving and explicit building block (BB) multiobjective evolutionary algorithms (MOEAs). Current state-of-the-art explicit BB MOEAs are addressed in the innovative design, execution, and testing of a new multiobjective explicit BB MOEA. Evolutionary computation concepts examined are algorithm convergence, population diversity and sizing, genotype and phenotype partitioning, archiving, BB concepts, parallel evolutionary algorithm (EA) models, robustness, visualization of evolutionary process, and performance in terms of effectiveness and efficiency. The main result of this research is the development of a more robust algorithm where MOEA concepts are implicitly employed. Testing shows that the new MOEA can be more effective and efficient than previous state-of-the-art explicit BB MOEAs for selected test suite multiobjective optimization problems (MOPs) and U.S. Air Force applications. Other contributions include the extension of explicit BB definitions to clarify the meanings for good single and multiobjective BBs. A new visualization technique is developed for viewing genotype, phenotype, and the evolutionary process in finding Pareto front vectors while tracking the size of the BBs. The visualization technique is the result of a BB tracing mechanism integrated into the new MOEA that enables one to determine the required BB sizes and assign an approximation epistasis level for solving a particular problem. The culmination of this research is explicit BB state-of-the-art MOEA technology based on the MOEA design, BB classifier type assessment, solution evolution visualization, and insight into MOEA test metric validation and usage as applied to test suite, deception, bioinformatics, unmanned vehicle flight pattern, and digital symbol set design MOPs

    Competent Program Evolution, Doctoral Dissertation, December 2006

    Heuristic optimization methods are adaptive when they sample problem solutions based on knowledge of the search space gathered from past sampling. Recently, competent evolutionary optimization methods have been developed that adapt via probabilistic modeling of the search space. However, their effectiveness requires the existence of a compact problem decomposition in terms of prespecified solution parameters. How can we use these techniques to effectively and reliably solve program learning problems, given that program spaces will rarely have compact decompositions? One method is to manually build a problem-specific representation that is more tractable than the general space. But can this process be automated? My thesis is that the properties of programs and program spaces can be leveraged as inductive bias to reduce the burden of manual representation-building, leading to competent program evolution. The central contributions of this dissertation are a synthesis of the requirements for competent program evolution, and the design of a procedure, meta-optimizing semantic evolutionary search (MOSES), that meets these requirements. In support of my thesis, experimental results are provided to analyze and verify the effectiveness of MOSES, demonstrating scalability and real-world applicability

    BNAIC 2008: Proceedings of BNAIC 2008, the twentieth Belgian-Dutch Artificial Intelligence Conference


    k-Means
