405 research outputs found

    First-principles molecular structure search with a genetic algorithm

    The identification of low-energy conformers for a given molecule is a fundamental problem in computational chemistry and cheminformatics. We assess here a conformer search that employs a genetic algorithm for sampling the low-energy segment of the conformation space of molecules. The algorithm is designed to work with first-principles methods, facilitated by the incorporation of local optimization and blacklisting conformers to prevent repeated evaluations of very similar solutions. The aim of the search is not only to find the global minimum, but to predict all conformers within an energy window above the global minimum. The performance of the search strategy is: (i) evaluated for a reference data set extracted from a database with amino acid dipeptide conformers obtained by an extensive combined force field and first-principles search and (ii) compared to the performance of a systematic search and a random conformer generator for the example of a drug-like ligand with 43 atoms, 8 rotatable bonds and 1 cis/trans bond
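
    The search loop described in this abstract (genetic operators, local optimization, blacklisting, and an energy window) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy `energy` function stands in for a first-principles calculation, and every name, tolerance, and parameter value here is an assumption.

```python
import math
import random

def energy(torsions):
    # Hypothetical toy energy over torsion angles (degrees); it stands in for
    # a first-principles energy call and has several unequal minima.
    return sum((1 + math.cos(math.radians(3 * t)))
               + 0.5 * (1 + math.cos(math.radians(t))) for t in torsions)

def local_opt(torsions, step=5.0, max_sweeps=200):
    # Crude coordinate descent standing in for a geometry relaxation.
    best = list(torsions)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(best)):
            for delta in (-step, step):
                trial = best[:]
                trial[i] = (trial[i] + delta) % 360
                if energy(trial) < energy(best) - 1e-12:
                    best, improved = trial, True
        if not improved:
            break
    return best

def is_similar(a, b, tol=20.0):
    # Conformers count as "the same" if every torsion differs by < tol degrees.
    return all(min(abs(x - y), 360 - abs(x - y)) < tol for x, y in zip(a, b))

def ga_search(n_torsions=3, pop_size=6, generations=40, window=1.0, seed=0):
    rng = random.Random(seed)
    blacklist = []    # relaxed conformers that have already been evaluated
    population = []   # (energy, conformer) pairs

    def try_add(candidate):
        # Blacklisting: skip the costly relaxation for near-duplicates.
        if any(is_similar(candidate, seen) for seen in blacklist):
            return
        relaxed = local_opt(candidate)
        if any(is_similar(relaxed, seen) for seen in blacklist):
            return  # relaxed into an already known minimum
        blacklist.append(relaxed)
        population.append((energy(relaxed), relaxed))

    def random_conformer():
        return [rng.randrange(0, 360, 5) for _ in range(n_torsions)]

    # Initial population (bounded number of attempts).
    for _ in range(10 * pop_size):
        if len(population) >= pop_size:
            break
        try_add(random_conformer())

    for _ in range(generations):
        population.sort()
        parents = population[:pop_size]          # fittest conformers so far
        mother, father = rng.choice(parents)[1], rng.choice(parents)[1]
        cut = rng.randint(1, n_torsions - 1)     # one-point crossover
        child = mother[:cut] + father[cut:]
        if rng.random() < 0.3:                   # occasional mutation
            child[rng.randrange(n_torsions)] = rng.randrange(0, 360, 5)
        try_add(child)

    # Report every conformer within the energy window above the best one found.
    population.sort()
    e_min = population[0][0]
    return [(e, c) for e, c in population if e - e_min <= window]
```

    Calling `ga_search()` returns a list of `(energy, torsions)` pairs sorted by energy. Checking the blacklist before relaxation is what avoids repeated expensive evaluations, while the second check after relaxation catches candidates that fall into an already known minimum.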

    Molecular Design of Crosslinked Copolymers

    A complete methodology for the computational molecular design (CMD) of crosslinked polymers is developed and implemented. The methodology is applied to the design of novel polymers for restorative dental materials. The computational molecular design of crosslinked polymers using optimization techniques is a new area of research. The first part of this project seeks to develop a novel data structure capable of adequately storing a complete description of the crosslinked polymer structure. Numerical descriptors of polymer structure are then calculated from the data structure. Statistical methods are used to relate the structural descriptors to experimentally measured properties. An important part of this project is to show that useful property prediction models can be developed for crosslinked polymers. Desirable property target values are then set for a specific application. Finally, the structure-property relations are combined with a Tabu search optimization algorithm to design improved polymers. Tabu search allows much flexibility in the problem formulations, so a major goal of this project is to show that Tabu search is an effective method for crosslinked polymer design. To implement the molecular design procedure, a software package is developed. The software allows for easy graphical entry of polymer structures and property data, and contains a Tabu search optimization routine. Since computational molecular design of crosslinked polymers is a relatively new area of research, the software is designed to be easily modified to allow for extensive numerical experimentation. Finally, the computational design methodology is demonstrated for the design of polymers for restorative dental applications. Using the computational molecular design methodology developed in this project, several monomers are found that may offer a significant improvement over a standard HEMA/bisGMA formulation.
The results of the case study show that the new data structure for crosslinked polymers is effective for calculation of topological descriptors and that property models can be developed for crosslinked polymers. Tabu search is also shown to be an effective optimization method.
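
    The pairing of a structure-property model with Tabu search mentioned in this abstract can be illustrated with a small sketch. It is not the thesis software: the linear property model, the target value, the integer "group count" encoding, and all names below are hypothetical.

```python
from collections import deque

WEIGHTS = [1.5, -0.7, 2.2]   # hypothetical linear property model coefficients
TARGET = 10.0                # hypothetical property target

def cost(counts):
    # Distance between the predicted property and the target value.
    predicted = sum(w * n for w, n in zip(WEIGHTS, counts))
    return abs(predicted - TARGET)

def neighbors(counts, lo=0, hi=5):
    # All designs reachable by adding or removing one structural group.
    for i in range(len(counts)):
        for d in (-1, 1):
            if lo <= counts[i] + d <= hi:
                yield counts[:i] + (counts[i] + d,) + counts[i + 1:]

def tabu_search(start, iterations=50, tabu_size=8):
    current = best = tuple(start)
    tabu = deque([current], maxlen=tabu_size)   # recently visited designs
    for _ in range(iterations):
        allowed = [n for n in neighbors(current) if n not in tabu]
        if not allowed:
            break
        # Move to the best non-tabu neighbor, even if it is worse than the
        # current design; this is what lets the search leave local optima.
        current = min(allowed, key=cost)
        tabu.append(current)
        if cost(current) < cost(best):
            best = current
    return best, cost(best)
```

    For example, `tabu_search((0, 0, 0))` returns the best design visited together with its cost. The fixed-length `deque` plays the role of the tabu list: old entries drop off automatically, so a design becomes eligible again after `tabu_size` moves.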

    On the role of metaheuristic optimization in bioinformatics

    Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail molecular docking, protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From this analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics.

    Generalized Lorenz-Mie theory : application to scattering and resonances of photonic complexes

    Complex photonic media mold the flow of light at the wavelength scale using multiple scattering and interference effects. This functionality at the nano-scale level paves the way for various applications, ranging from optical communications to biosensing. This thesis is mainly concerned with the numerical modeling of photonic complexes based on two-dimensional arrays of cylindrical scatterers. Two applications are considered, namely the use of photonic-crystal-like devices for the design of integrated beam shaping elements, as well as active photonic molecules for the realization of compact laser sources. These photonic structures can be readily analyzed using the 2D Generalized Lorenz-Mie theory (2D-GLMT), a numerical scheme which exploits the symmetry of the underlying cylindrical structures. We begin this thesis by presenting the electromagnetic theory behind 2D-GLMT. Other useful frameworks are also presented, including a recently formulated stationary version of the Maxwell-Bloch equations called steady-state ab initio laser theory (SALT). Metaheuristics, optimization algorithms based on empirical rules for exploring large solution spaces, are also discussed. After laying down the theoretical content, we proceed to the design and optimization of beam shaping devices based on engineered photonic-crystal-like structures.
The combinatorial optimization problem associated with beam shaping is tackled using the genetic algorithm (GA) as well as tabu search (TS). Our results show that it is possible to design integrated beam shapers tailored for the control of the amplitude, phase and polarization profile of the output beam. A theoretical and numerical study of the lasing characteristics of photonic molecules – composed of a few coupled optically active cylinders – is also presented. Using a combination of 2D-GLMT and SALT, it is shown that the physical properties of photonic molecule lasers, specifically their threshold, spectrum and emission profile, can be significantly affected by the underlying gain medium parameters. These findings are out of reach of the established approach of computing the meta-stable states of the Helmholtz equation and their quality factor. This dissertation is concluded with a research outlook concerning the modeling of disordered photonic media.

    Optimizing parameters in fuzzy k-means for clustering microarray data.

    Rapid advances in microarray technologies are making it possible to analyze and manipulate large amounts of gene expression data. Clustering algorithms, such as hierarchical clustering, self-organizing maps, k-means clustering and fuzzy k-means clustering, have become important tools for expression analysis of microarray data. However, the need for prior knowledge of the number of clusters, k, and the fuzziness parameter, b, limits the usage of fuzzy clustering. Few approaches have been proposed for assigning the best possible values to these parameters. In this thesis, we use simulated annealing and fuzzy k-means clustering to determine the optimal parameters, namely the number of clusters, k, and the fuzziness parameter, b. To assess the performance of our method, we have used synthetic and real gene experiment data sets. To improve our approach, two methods, searching with a tabu list and shrinking the scope of randomization, are applied. Our results show that a nearly optimal pair of k and b can be obtained without exploring the entire search space. Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .Y37. Source: Masters Abstracts International, Volume: 44-03, page: 1419. Thesis (M.Sc.)--University of Windsor (Canada), 2005.
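
    The idea of letting simulated annealing pick the pair (k, b) can be sketched as below. This is only an illustration under stated assumptions: `stand_in_score` is a hypothetical placeholder for a cluster-validity measure that would be computed by actually running fuzzy k-means, and the ranges, cooling schedule, and proposal step are not taken from the thesis.

```python
import math
import random

def stand_in_score(k, b):
    # Hypothetical stand-in for a cluster-validity index obtained after running
    # fuzzy k-means with k clusters and fuzziness b (lower is better).
    return (k - 4) ** 2 + 10.0 * (b - 2.0) ** 2

def anneal(score, k_range=(2, 10), b_range=(1.1, 3.0), steps=500,
           t_start=5.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    k = rng.randint(*k_range)
    b = rng.uniform(*b_range)
    current = score(k, b)
    best = (current, k, b)
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Propose a nearby (k, b) pair, clamped to the allowed ranges.
        k_new = min(max(k + rng.choice((-1, 0, 1)), k_range[0]), k_range[1])
        b_new = min(max(b + rng.gauss(0.0, 0.1), b_range[0]), b_range[1])
        candidate = score(k_new, b_new)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if (candidate <= current
                or rng.random() < math.exp((current - candidate) / temp)):
            k, b, current = k_new, b_new, candidate
            if current < best[0]:
                best = (current, k, b)
    return best
```

    `anneal(stand_in_score)` returns a `(score, k, b)` tuple for the best pair visited. In a real setting each call to `score` would involve a full clustering run, which is why sampling the space with annealing rather than enumerating every pair is attractive.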

    Exploration of Reaction Pathways and Chemical Transformation Networks

    For the investigation of chemical reaction networks, the identification of all relevant intermediates and elementary reactions is mandatory. Many algorithmic approaches exist that perform explorations efficiently and in an automated fashion. These approaches differ in their application range, the level of completeness of the exploration, as well as the amount of heuristics and human intervention required. Here, we describe and compare the different approaches based on these criteria. Future directions leveraging the strengths of chemical heuristics, human interaction, and physical rigor are discussed. Comment: 48 pages, 4 figures

    Development of Computer-Aided Molecular Design Methods for Bioengineering Applications

    Computer-aided molecular design (CAMD) offers a methodology for rational product design. The CAMD procedure consists of pre-design, design and post-design phases. CAMD was used to address two bioengineering problems: design of excipients for lyophilized protein formulations and design of ionic liquids for use in bioseparations. Protein stability remains a major concern during protein drug development. Lyophilization, or freeze-drying, is often sought to improve chemical stability. However, lyophilization can result in protein aggregation. Excipients, or additives, are included to stabilize proteins in lyophilized formulations. CAMD was used to rationally select or design excipients for lyophilized protein formulations. The use of solvents to aid separation is common in chemical processes. Ionic liquids offer a class of molecules with tunable properties that can be altered to find optimal solvents for a given application. CAMD was used to design ionic liquids for extractive distillation and in situ extractive fermentation processes. The pre-design phase involves experimental data gathering and problem formulation. When available, data was obtained from literature sources. For excipient design, data of percent protein monomer remaining post-lyophilization was measured for a variety of protein-excipient combinations. In problem formulation, the objective was to minimize the difference between the properties of the designed molecule and the target property values. Problem formulations resulted in either mixed-integer linear programs (MILPs) or mixed-integer non-linear programs (MINLPs). The design phase consists of the forward problem and the reverse problem. In the forward problem, linear quantitative structure-property relationships (QSPRs) were developed using connectivity indices. Chiral connectivity indices were used for excipient property models to improve fit and incorporate three-dimensional structural information. 
Descriptor selection methods were employed to find models that minimized Mallows' Cp statistic, obtaining models with good fit while avoiding overfitting. Cross-validation was performed to assess predictive capabilities. Model development was also performed to develop group contribution models and non-linear QSPRs. A UNIFAC model was developed to predict the thermodynamic properties of ionic liquids. In the reverse problem of the design phase, molecules were proposed with optimal property values. Deterministic methods were used to design ionic liquid entrainers for azeotropic distillation. Tabu search, a stochastic optimization method, was applied to both ionic liquid and excipient design to provide novel molecular candidates. Tabu search was also compared to a genetic algorithm for CAMD applications. Tuning was performed using a test case to determine parameter values for both methods. After tuning, both stochastic methods were used with design cases to provide optimal excipient stabilizers for lyophilized protein formulations. Results suggested that the genetic algorithm provided a faster time to solution while tabu search provided quality solutions more consistently. The post-design phase provides solution analysis and verification. Process simulation was used to evaluate the energy requirements of azeotropic separations using designed ionic liquids. Results demonstrated that less energy was required than in processes using conventional entrainers or ionic liquids that were not optimally designed. Molecular simulation was used to guide protein formulation design and may prove to be a useful tool in post-design verification. Finally, prediction intervals were used for properties predicted from linear QSPRs to quantify the prediction error in the CAMD solutions. Overlapping prediction intervals indicate solutions with statistically similar property values.
Prediction interval analysis showed that tabu search returns many results with statistically similar property values in the design of carbohydrate glass formers for lyophilized protein formulations. The best solutions from tabu search and the genetic algorithm were shown to be statistically similar for all design cases considered. Overall, the CAMD method developed here provides a comprehensive framework for the design of novel molecules for bioengineering approaches.
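
    For reference, the two statistical tools named in this abstract have standard textbook forms for a linear model with n observations and p fitted parameters. The notation below is assumed here and is not taken from the thesis: SSE_p is the residual sum of squares of the candidate model, \hat{\sigma}^2 is the error variance estimated from the full model, X is the design matrix, and x_0 is the descriptor vector of a newly designed molecule.

```latex
% Mallows' C_p for a candidate model with p parameters
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p

% 100(1-\alpha)\% prediction interval for a new observation at x_0,
% with s^2 = \mathrm{SSE}_p / (n - p)
\hat{y}_0 \pm t_{1-\alpha/2,\; n-p}\; s \sqrt{1 + x_0^{\top} (X^{\top} X)^{-1} x_0}
```

    Two designed molecules whose intervals overlap cannot be distinguished at the chosen confidence level, which is the sense in which the abstract calls solutions statistically similar.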

    Hardware Accelerated Molecular Docking: A Survey
