    A review of estimation of distribution algorithms in bioinformatics

    Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the best-known and most representative evolutionary search technique, have been the subject of most of these applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They use a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that use EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
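
    The loop described in this abstract (sample candidates from a probabilistic model, keep the promising ones, re-estimate the model from them) can be sketched as follows. This is only an illustration under stated assumptions: it uses the simplest possible model (independent per-bit probabilities, in the style of a univariate EDA) and a toy objective that counts 1-bits. Neither choice comes from the reviewed paper, and the name `umda_onemax` and all of its parameters are hypothetical.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA sketch: maximize the number of 1-bits in a bit string."""
    rng = random.Random(seed)
    # Probabilistic model (assumption): one independent Bernoulli per bit.
    probs = [0.5] * n_bits
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # 2. Keep the promising solutions (truncation selection on fitness).
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # 3. Re-estimate the model from the selected solutions, clamping the
        #    probabilities so that neither bit value becomes unreachable.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(max(freq, 1.0 / n_bits), 1.0 - 1.0 / n_bits)
    return best, probs


best, probs = umda_onemax()
print(sum(best), "ones out of", len(best))
```

    EDA variants with richer models (for example Bayesian networks or Gaussian distributions) keep the same three-step loop and change only how the model is represented and estimated.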

    Study on the predictions of gene function and protein structure using multi-SVM and hybrid EDA

    Degree system: new; Report number: Kō-3199; Degree type: Doctor of Engineering; Date conferred: 2011/3/15; Waseda University degree certificate number: Shin-549

    Automatic characterization of drug/amino acid interactions by energy decomposition analysis

    The computational study of drug/protein interactions is fundamental to understanding the mode of action of drugs and to designing new ones. In this study, we have developed a Python code aimed at characterizing the nature of drug/amino acid interactions in an accurate and automatic way. Specifically, the code is interfaced with different software packages to compute the interaction energy quantum mechanically and to obtain its different contributions, namely the Pauli repulsion, electrostatic and polarisation terms, by an energy decomposition analysis based on one-electron and two-electron deformation densities. The code was tested by investigating the nature of the interaction between the glycine amino acid and 250 drugs. An energy-structure relationship analysis reveals that the strength of the electrostatic and polarisation contributions is related to the presence of small and large heteroatoms, respectively, in the structure of the drug. LR and JJN acknowledge the Comunidad de Madrid for funding through the Attraction of Talent Program (Grant ref 2018-T1/BMD-10261) and the Spanish Ministry of Science and Innovation (Project PID2020-117806GA-I00). MM thanks Xunta de Galicia for financial support through the project GRC2019/2

    Machine Learning for Fluid Mechanics

    The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from field measurements, experiments and large-scale simulations at multiple spatiotemporal scales. Machine learning offers a wealth of techniques to extract information from data that could be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of the history, current developments, and emerging opportunities of machine learning for fluid mechanics. It outlines fundamental machine learning methodologies and discusses their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that considers data as an inherent part of modeling, experimentation, and simulation. Machine learning provides a powerful information processing framework that can enrich, and possibly even transform, current lines of fluid mechanics research and industrial applications. Comment: To appear in the Annual Reviews of Fluid Mechanics, 202

    Regularized model learning in EDAs for continuous and multi-objective optimization

    Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired from multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

    k-Means

    The k-means clustering algorithm (k-means for short) provides a method of finding structure in input examples. It is also called the Lloyd–Forgy algorithm as it was independently introduced by both Stuart Lloyd and Edward Forgy. k-means, like other algorithms you will study in this part of the book, is an unsupervised learning algorithm and, as such, does not require labels associated with input examples. Recall that unsupervised learning algorithms provide a way of discovering some inherent structure in the input examples. This is in contrast with supervised learning algorithms, which require input examples and associated labels so as to fit a hypothesis function that maps input examples to one or more output variables.
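
    The abstract does not spell out the procedure, so the following is a minimal sketch of the standard Lloyd-style iteration, not the book's own code. It alternates between assigning each example to its nearest centroid and moving each centroid to the mean of its members. The function name `kmeans`, its parameters, the random choice of initial centroids and the toy data are all assumptions made for illustration. Note that no labels are supplied, which is what makes the method unsupervised.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd-style k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialization (assumption): pick k distinct input examples as centroids.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean).
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # No example changed cluster, so the iteration has converged.
        assignment = new_assignment
        # Update step: move each centroid to the mean of its assigned examples.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # Leave a centroid in place if its cluster is empty.
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, assignment


# Two well-separated groups of unlabeled examples (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignment = kmeans(points, k=2)
print(centroids, assignment)
```

    The result depends on the initial centroids in general, so practical implementations usually run several restarts and keep the clustering with the lowest within-cluster squared distance.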

    Metaheuristics for NP-hard combinatorial optimization problems

    Ph.D. (Doctor of Philosophy)