8 research outputs found

    A Study of Ordered Gene Problems Featuring DNA Error Correction and DNA Fragment Assembly with a Variety of Heuristics, Genetic Algorithm Variations, and Dynamic Representations

    Ordered gene problems are a very common class of optimization problems. Because of their popularity, countless algorithms have been developed in an attempt to find high-quality solutions to them. Many other types of problems are also commonly reduced to ordered gene problems, since many popular heuristics and metaheuristics exist for them. Multiple ordered gene problems are studied, namely the travelling salesman problem, the bin packing problem, and the graph colouring problem. In addition, two bioinformatics problems not traditionally seen as ordered gene problems are studied: DNA error correction and DNA fragment assembly. These problems are studied with multiple variations and combinations of heuristics and metaheuristics with two distinct types of representations. The majority of the algorithms are built around the Recentering-Restarting Genetic Algorithm. The algorithm variations were successful on all problems studied, particularly the two bioinformatics problems. For DNA error correction, multiple cases were found with 100% of the codes being corrected. The algorithm variations also beat all other state-of-the-art DNA fragment assemblers on 13 out of 16 benchmark problem instances.

    On the role of metaheuristic optimization in bioinformatics

    Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail molecular docking, protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From this analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics.

    Finding Nonlinear Relationships in Functional Magnetic Resonance Imaging Data with Genetic Programming

    The human brain is a complex, nonlinear, dynamic, chaotic system that is poorly understood. When faced with such difficult-to-understand systems, it is common to observe the system and develop models so that the underlying system might be deciphered. When observing neurological activity within the brain with functional magnetic resonance imaging (fMRI), it is common to develop linear models of functional connectivity; however, these models are incapable of describing the nonlinearities we know to exist within the system. A genetic programming (GP) system was developed to perform symbolic regression on recorded fMRI data. Symbolic regression makes fewer assumptions than traditional linear tools and can describe nonlinearities within the system. Although GP, a powerful form of machine learning, has many drawbacks (computational cost, overfitting, stochasticity), it may provide new insights into the underlying system being studied. The contents of this thesis are presented in an integrated article format. For all articles, data from the Human Connectome Project were used. In the first article, nonlinear models for 507 subjects performing a motor task were created. These nonlinear models generated by GP contained fewer regions of interest (ROI) than would be found with traditional linear tools. The generated nonlinear models did not fit the data as well as the linear models; however, when compared to linear models containing a similar number of ROI, the nonlinear models performed better. Ten subjects performing 7 tasks were studied in article two. After improvements to the GP system, the generated nonlinear models outperformed the linear models in many cases and were never significantly worse than the linear models. Forty subjects performing 7 tasks were studied in article three. Newly generated nonlinear models were applied to unseen data from the same subject performing the same task (intrasubject generalization), and many nonlinear models generalized to unseen data better than the linear models. The nonlinear models were also applied to unseen data from other subjects performing the same task (intersubject generalization) and were not capable of generalizing as well as the linear models.

    Contributions à la résolution de problèmes d'optimisation combinatoires NP-difficiles

    This thesis concerns efficient algorithms for solving NP-hard combinatorial optimization problems, and makes two contributions. The first is a new hybrid multiobjective algorithm that combines a genetic algorithm with a search operator based on particle swarm optimization. The aim of this hybridization is to overcome the slow convergence of multiobjective genetic algorithms on difficult problems with more than two objectives. In the proposed hybrid scheme, a Pareto multiobjective genetic algorithm periodically applies a particle swarm optimization algorithm to optimize a scalar fitness function over an archive population. Two variants of this hybrid algorithm are proposed and adapted to the multiobjective knapsack problem. Experimental results show that the hybrid algorithms outperform the standard algorithms. The second contribution improves a local search heuristic called PALS (Problem Aware Local Search), which is specific to DNA fragment assembly, an NP-hard combinatorial optimization problem in sequence bioinformatics. Two modifications to PALS are proposed. The first avoids premature convergence to local optima. The second significantly reduces computation time while preserving the accuracy of the results. In experiments on datasets available in the literature, the new PALS variants prove highly competitive with existing variants and with other assembly algorithms.

    Recentering and Restarting Genetic Algorithm variations for DNA Fragment Assembly

    No full text

    Restarting and recentering genetic algorithm variations for DNA fragment assembly: The necessity of a multi-strategy approach

    DNA fragment assembly, an NP-hard problem, is one of the major steps in DNA sequencing. Multiple strategies have been used for this problem, including greedy graph-based algorithms, de Bruijn graphs, and the overlap-layout-consensus approach. This study focuses on the overlap-layout-consensus approach. Heuristics and computational intelligence methods are combined to exploit their respective benefits. These algorithm combinations produced high-quality results, surpassing the best results obtained by a number of competitive algorithms specially designed and tuned for this problem on thirteen of sixteen popular benchmarks. This work also reinforces the necessity of using multiple search strategies, as algorithm performance is clearly observed to depend on the problem instance; without a deeper look into many searches, top solutions could be missed entirely. Natural Sciences and Engineering Research Council of Canada
