
    High performance genetic programming on GPU

    The availability of low-cost, powerful parallel graphics cards has stimulated the porting of Genetic Programming (GP) to Graphics Processing Units (GPUs). Our work focuses on the possibilities offered by Nvidia G80 GPUs when programmed in the CUDA language. We compare two parallelization schemes that evaluate several GP programs in parallel. We show that the fine-grained distribution of computations over the elementary processors greatly impacts performance. We also present memory and representation optimizations that further enhance computation speed, up to 2.8 billion GP operations per second. The code has been developed with the well-known ECJ library.
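    To make the fine-grained evaluation scheme mentioned above concrete, the following minimal CUDA sketch interprets many GP programs in parallel, with one block per program and one thread per fitness case. The linear postfix representation, the opcode set, and every name below are illustrative assumptions for the sketch, not the representation or code of the article or of the ECJ library.

    // Hypothetical opcode set for a linearized (postfix) GP program.
    enum Op { OP_X = 0, OP_CONST, OP_ADD, OP_SUB, OP_MUL };

    // One block per GP program, one thread per fitness case: each thread
    // interprets the same program on its own input value with a small
    // private evaluation stack.
    __global__ void evalPrograms(const int*   ops,     // progLen opcodes per program
                                 const float* consts,  // progLen constants per program
                                 const float* inputs,  // one input value per fitness case
                                 float*       outputs, // one result per (program, case) pair
                                 int progLen, int numCases)
    {
        int prog = blockIdx.x;
        int c    = threadIdx.x;
        if (c >= numCases) return;

        float stack[32];
        int   sp = 0;
        float x  = inputs[c];

        for (int i = 0; i < progLen; ++i) {
            int   op = ops[prog * progLen + i];
            float k  = consts[prog * progLen + i];
            switch (op) {
                case OP_X:     stack[sp++] = x; break;
                case OP_CONST: stack[sp++] = k; break;
                case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
                case OP_SUB:   sp--; stack[sp - 1] -= stack[sp]; break;
                case OP_MUL:   sp--; stack[sp - 1] *= stack[sp]; break;
            }
        }
        outputs[prog * numCases + c] = stack[sp - 1];
    }

    A host would launch it as evalPrograms<<<numPrograms, numCases>>>(...), evaluating the whole population with a single kernel call.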

    Graphics Processing Unit–Enhanced Genetic Algorithms for Solving the Temporal Dynamics of Gene Regulatory Networks

    Understanding the regulation of gene expression is one of the key problems in current biology. A promising method for that purpose is the determination of the temporal dynamics between known initial and ending network states, by using simple acting rules. The huge number of rule combinations and the inherently nonlinear nature of the problem make genetic algorithms an excellent candidate for finding optimal solutions. As this is a computationally intensive problem that needs long runtimes on conventional architectures for realistic network sizes, it is essential to accelerate this task. In this article, we study how to develop efficient parallel implementations of this method for the fine-grained parallel architecture of graphics processing units (GPUs) using the compute unified device architecture (CUDA) platform. An exhaustive and methodical study of various parallel genetic algorithm schemes (master-slave, island, cellular, and hybrid models) and various individual selection methods (roulette, elitist) is carried out for this problem. Several procedures that optimize the use of the GPU's resources are presented. We conclude that the implementation that produces the best results (from both the performance and the genetic algorithm fitness perspectives) simulates a few thousand individuals grouped into a few islands using elitist selection. This model combines two powerful factors for discovering the best solutions: finding good individuals within a small number of generations, and introducing genetic diversity through relatively frequent and sizeable migrations. As a result, we have even found the optimal solution for the analyzed gene regulatory network (GRN). In addition, a comparative study of the performance obtained by the different parallel implementations on GPU versus a sequential application on CPU is carried out. In our tests, a multifold speedup was obtained for our optimized parallel implementation of the method on a medium-class GPU over an equivalent sequential single-core implementation running on a recent Intel i7 CPU. This work can provide useful guidance to researchers in biology, medicine, or bioinformatics on how to take advantage of parallelization on massively parallel devices such as GPUs to apply novel nature-inspired metaheuristic algorithms to real-world applications (like the method to solve the temporal dynamics of GRNs).
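    As an illustration of the island layout described above, the following CUDA sketch runs one generation with one block per island and one thread per individual, keeping each island's best individual unchanged (elitism). The genome encoding, the placeholder OneMax fitness, the copy-and-mutate reproduction step, and all names are assumptions made for the sketch; they are not the GRN rule encoding or the exact selection and migration operators of the article.

    #include <curand_kernel.h>

    #define GENOME_WORDS 4   // 128-bit genomes, an assumption for the sketch

    // Placeholder fitness: count of set bits (OneMax), used only for illustration.
    __device__ float fitness(const unsigned int* genome)
    {
        int ones = 0;
        for (int w = 0; w < GENOME_WORDS; ++w) ones += __popc(genome[w]);
        return (float)ones;
    }

    // One block per island, one thread per individual; rng states are assumed
    // to have been initialized with curand_init beforehand.
    __global__ void islandStep(unsigned int* population,   // islands x individuals x GENOME_WORDS
                               curandState*  rng,
                               int individualsPerIsland)
    {
        extern __shared__ float fit[];                      // one slot per individual
        int island = blockIdx.x;
        int ind    = threadIdx.x;
        unsigned int* me = population +
            ((island * individualsPerIsland + ind) * GENOME_WORDS);

        fit[ind] = fitness(me);
        __syncthreads();

        // Elitist selection: every thread scans shared memory for the island's
        // best individual (cheap for a few hundred individuals per island).
        int best = 0;
        for (int i = 1; i < individualsPerIsland; ++i)
            if (fit[i] > fit[best]) best = i;

        // Non-elite individuals copy the elite genome and mutate one bit;
        // the elite individual itself is carried over unchanged.
        if (ind != best) {
            const unsigned int* elite = population +
                ((island * individualsPerIsland + best) * GENOME_WORDS);
            curandState local = rng[island * individualsPerIsland + ind];
            for (int w = 0; w < GENOME_WORDS; ++w) me[w] = elite[w];
            int bit = curand(&local) % (GENOME_WORDS * 32);
            me[bit / 32] ^= 1u << (bit % 32);
            rng[island * individualsPerIsland + ind] = local;
        }
    }

    Migration between islands, which the article applies relatively frequently, would be a separate and less frequent kernel that exchanges a few individuals between blocks through global memory.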

    Contribution to solving combinatorial optimization problems: sequential and parallel methods

    Combinatorial optimization problems are often very hard, and solving them with exact methods can be prohibitively long or unrealistic. Heuristic methods make it possible to obtain good-quality solutions within a reasonable running time; heuristics are also very useful for developing exact methods based on branch-and-bound techniques. The first part of this thesis concerns the Multiple Knapsack Problem (MKP). We propose a heuristic, called RCH, that yields good solutions for the MKP and compare it with the MTHM heuristic and the CPLEX solver. The second part concerns the parallel implementation, on GPU architectures, of an exact method for solving knapsack-type combinatorial optimization problems. A CPU-GPU implementation of the Branch and Bound method via CUDA for knapsack problems is proposed; experimental results show a speedup of 51 for difficult instances on an Nvidia Tesla C2050 graphics card (448 cores). A CPU-GPU implementation of the simplex method for solving linear programming problems is also proposed, offering a speedup of about 12.7 on a Tesla C2050 board. Finally, we propose a multi-GPU implementation of the simplex algorithm via CUDA that exploits several graphics cards in the same machine (two Nvidia Tesla C2050 cards in our case). Beyond the speedup over the sequential simplex implementation, an efficiency of 96.5% is obtained when going from one GPU to two GPUs.
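    To illustrate the kind of data-parallel step a CPU-GPU Branch and Bound for knapsack problems typically relies on, the following minimal CUDA sketch computes the fractional (Dantzig) upper bound of many open nodes in parallel, with the CPU assumed to keep managing the node list and the branching itself. The Node layout and every name are assumptions made for this sketch, not the thesis's implementation; items are assumed pre-sorted by decreasing profit/weight ratio.

    struct Node {
        int   nextItem;   // first item not yet fixed by branching
        float capacity;   // remaining knapsack capacity
        float profit;     // profit of the items already fixed to 1
    };

    // One thread per open node: each thread greedily fills the remaining
    // capacity with whole items, then a fraction of the first item that no
    // longer fits (linear relaxation of the 0/1 constraint).
    __global__ void upperBounds(const Node*  nodes,
                                const float* profit,   // per item
                                const float* weight,   // per item
                                float*       bound,    // output, one per node
                                int numNodes, int numItems)
    {
        int n = blockIdx.x * blockDim.x + threadIdx.x;
        if (n >= numNodes) return;

        float cap = nodes[n].capacity;
        float ub  = nodes[n].profit;

        for (int i = nodes[n].nextItem; i < numItems; ++i) {
            if (weight[i] <= cap) {
                cap -= weight[i];
                ub  += profit[i];
            } else {
                ub += profit[i] * (cap / weight[i]);
                break;
            }
        }
        bound[n] = ub;
    }

    Launched as upperBounds<<<(numNodes + 255) / 256, 256>>>(...), the bounds computed on the GPU let the CPU prune every node whose bound does not exceed the best solution found so far.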