699 research outputs found

    GPU Computing for Parallel Local Search Metaheuristics

    Local search metaheuristics (LSMs) are efficient methods for solving complex problems in science and industry. They significantly reduce the size of the search space to be explored and the search time. Nevertheless, the resolution time remains prohibitive when dealing with large problem instances. Therefore, GPU-based massively parallel computing is a major complementary way to speed up the search. However, GPU computing for LSMs is rarely investigated in the literature. In this paper, we introduce a new guideline for the design and implementation of effective LSMs on GPU. Efficient approaches are proposed for CPU-GPU data transfer optimization, thread control, mapping of neighboring solutions to GPU threads, and memory management. These approaches have been tested on four well-known combinatorial and continuous optimization problems and four GPU configurations. Compared to a CPU-based execution, speedups of up to 80x are reported for the large combinatorial problems and up to 240x for a continuous problem. Finally, extensive experiments demonstrate the strong potential of GPU-based LSMs compared to cluster or grid-based parallel architectures.
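
The mapping of neighboring solutions to GPU threads mentioned in this abstract can be illustrated for a pairwise-exchange (swap) neighborhood. The sketch below is written in Python for readability and is not the authors' implementation. It assumes a permutation of size n whose n(n-1)/2 swap neighbors are numbered in row-major order, and it converts a flat index (playing the role of a GPU thread id) into the pair (i, j) to exchange. The function names are hypothetical.

```python
from math import isqrt


def neighborhood_size(n):
    """Number of distinct swaps (i, j) with i < j in a permutation of size n."""
    return n * (n - 1) // 2


def thread_to_swap(tid, n):
    """Map a flat index tid in [0, n(n-1)/2) to the pair (i, j), i < j.

    Pairs are numbered row by row: (0,1), (0,2), ..., (0,n-1), (1,2), ...
    On a GPU, tid would be the global thread id (an assumption of this sketch).
    """
    m = neighborhood_size(n)
    if not 0 <= tid < m:
        raise ValueError("tid out of range")
    # Count from the end so that rows have sizes 1, 2, 3, ... (triangular blocks).
    r = m - 1 - tid
    # Largest k with k(k+1)/2 <= r, computed with exact integer arithmetic.
    k = (isqrt(8 * r + 1) - 1) // 2
    i = n - 2 - k
    j = n - 1 - (r - k * (k + 1) // 2)
    return i, j
```

Because the mapping is a closed-form expression, each thread can find its own neighbor without any lookup table, which is one way to keep memory traffic low.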

    METADOCK: A parallel metaheuristic schema for virtual screening methods

    Virtual screening through molecular docking can be translated into an optimization problem, which can be tackled with metaheuristic methods. The interaction between two chemical compounds (typically a protein, enzyme or receptor, and a small molecule, or ligand) is calculated using highly computationally demanding scoring functions that are computed at several binding spots located throughout the protein surface. This paper introduces METADOCK, a novel molecular docking methodology based on parameterized and parallel metaheuristics and designed to leverage computers based on heterogeneous architectures. The application selects the optimization technique at run time by setting a configuration schema. Our proposed solution finds a good workload balance via dynamic assignment of jobs to heterogeneous resources, which perform independent metaheuristic executions when computing the different molecular interactions required by the scoring functions in use. A cooperative scheduling of jobs optimizes the quality of the solution and the overall performance of the simulation, thus opening a new path for further developments of virtual screening methods on high-performance contemporary heterogeneous platforms.
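
The dynamic assignment of jobs to heterogeneous resources described in this abstract can be sketched with a shared work queue. The example below is a minimal Python illustration under stated assumptions, not METADOCK's scheduler: each job stands for one independent evaluation (for instance one binding spot), each worker stands for one device, and the names `run_dynamic` and `workers` are hypothetical.

```python
import queue
import threading


def run_dynamic(jobs, workers):
    """Distribute jobs dynamically over workers.

    jobs: iterable of hashable job identifiers.
    workers: dict mapping a worker name to a function job -> result.
    Returns a dict job -> (worker name, result).
    """
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results = {}
    lock = threading.Lock()

    def loop(name, evaluate):
        # Each worker pulls the next job as soon as it is free, so faster
        # workers naturally process more jobs than slower ones.
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return
            value = evaluate(job)
            with lock:
                results[job] = (name, value)

    threads = [
        threading.Thread(target=loop, args=(name, evaluate))
        for name, evaluate in workers.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Compared with splitting the jobs evenly in advance, pulling from a queue needs no prior knowledge of how fast each device is.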

    Parallel Local Search on GPU

    www.lifl.fr/~luong
    Local search algorithms are a class of algorithms for solving complex optimization problems in science and industry. Even though these metaheuristics significantly reduce the computational time needed to explore the solution space, the iterative process remains costly when very large problem instances are dealt with. As a solution, graphics processing units (GPUs) represent an efficient alternative to traditional CPUs for calculations. This paper presents a new methodology to design and implement local search algorithms on GPU. Methods such as tabu search, hill climbing or iterated local search share similar concepts that can be parallelized on GPU, from which a general cooperative model can be derived. In addition to single-solution based metaheuristics on GPU, this model can be extended with a hybrid multi-core and multi-GPU approach for multiple local search methods such as multistart. The conclusions from both GPU and multi-GPU experiments indicate significant speed-ups compared to CPU approaches.
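
The common structure shared by hill climbing and related methods can be sketched as follows. This Python example is an illustration under assumptions, not the paper's model: it uses bit-string solutions with a 1-flip neighborhood, the cost function is supplied by the caller, and all function names are hypothetical. The full-neighborhood evaluation is the step that a GPU implementation would typically run with one thread per neighbor. Here it is a plain loop.

```python
import random


def evaluate_neighborhood(solution, cost):
    """Return the cost of every 1-flip neighbor of a bit-string solution."""
    costs = []
    for k in range(len(solution)):
        neighbor = solution[:]
        neighbor[k] ^= 1  # flip bit k
        costs.append(cost(neighbor))
    return costs


def hill_climb(solution, cost):
    """Best-improvement hill climbing; stops at the first local optimum."""
    current = solution[:]
    current_cost = cost(current)
    while True:
        costs = evaluate_neighborhood(current, cost)
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] >= current_cost:
            return current, current_cost
        current[best] ^= 1
        current_cost = costs[best]


def multistart(n_bits, cost, starts, seed=0):
    """Run independent hill climbs from random starts and keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(starts):
        start = [rng.randint(0, 1) for _ in range(n_bits)]
        candidate = hill_climb(start, cost)
        if best is None or candidate[1] < best[1]:
            best = candidate
    return best
```

The independent runs in `multistart` do not communicate, which is why multistart lends itself to a multi-core or multi-GPU split with one run per device.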

    GPU accelerated Nature Inspired Methods for Modelling Large Scale Bi-Directional Pedestrian Movement

    Pedestrian movement, although ubiquitous and well-studied, is still not well understood due to the complicated nature of the embedded social dynamics. Interest among researchers in simulating pedestrian movement and interactions has grown significantly, in part due to the increased computational and visualization capabilities afforded by high-power computing. Different approaches have been adopted to simulate pedestrian movement under various circumstances and interactions. In the present work, bi-directional crowd movement is simulated, where equal numbers of individuals try to reach the opposite sides of an environment. Two movement methods are considered. First, a Least Effort Model (LEM) is investigated, where agents try to take an optimal path with as few changes from their intended path as possible. Following this, a modified form of Ant Colony Optimization (ACO) is proposed, where individuals are guided by a goal of reaching the other side in a least-effort mode as well as by a pheromone trail left by predecessors. The basic idea is to increase agent interaction, thereby more closely reflecting a real-world scenario. The methodology utilizes Graphics Processing Units (GPUs) for general-purpose computing using the CUDA platform. Because of the inherent parallel properties associated with pedestrian movement, such as proximate interactions of individuals on a 2D grid, GPUs are well suited. The main feature of the implementation undertaken here is that the parallelism is data driven. The data-driven implementation leads to a speedup of up to 18x compared to its sequential counterpart running on a single-threaded CPU. The numbers of pedestrians considered in the model ranged from 2K to 100K, representing numbers typical of mass gathering events. A detailed discussion addresses implementation challenges faced and averted.

    METADOCK 2: a high-throughput parallel metaheuristic scheme for molecular docking

    Motivation: Molecular docking methods are extensively used to predict the interaction between protein-ligand systems in terms of structure and binding affinity, through the optimization of a physics-based scoring function. However, the computational requirements of these simulations grow exponentially with: (i) the global optimization procedure, (ii) the number and degrees of freedom of molecular conformations generated and (iii) the mathematical complexity of the scoring function. Results: In this work, we introduce a novel molecular docking method named METADOCK 2, which incorporates several novel features, such as (i) a ligand-dependent blind docking approach that exhaustively scans the whole protein surface to detect novel allosteric sites, (ii) an optimization method to enable the use of a wide range of metaheuristics and (iii) a heterogeneous implementation based on multicore CPUs and multiple graphics processing units. Two representative scoring functions implemented in METADOCK 2 are extensively evaluated in terms of computational performance and accuracy using several benchmarks (such as the well-known DUD) against AutoDock 4.2 and AutoDock Vina. Results place METADOCK 2 as an efficient and accurate docking methodology able to deal with complex systems where computational demands are staggering and which outperforms both AutoDock Vina and AutoDock 4.
    This work was partially supported by the Fundación Séneca del Centro de Coordinación de la Investigación de la Región de Murcia [Projects 20813/PI/18, 20988/PI/18, 20524/PDC/18] and by the Spanish Ministry of Science, Innovation and Universities [TIN2016-78799-P (AEI/FEDER, UE), CTQ2017-87974-R]. The authors thankfully acknowledge the computer resources at CTE-POWER and the technical support provided by Barcelona Supercomputing Center - Centro Nacional de Supercomputación [RES-BCV2018-3-0008].
    Imbernón, B.; Serrano, A.; Bueno-Crespo, A.; Abellán, JL.; Pérez-Sánchez, H.; Cecilia-Canales, JM. (2020). METADOCK 2: a high-throughput parallel metaheuristic scheme for molecular docking. Bioinformatics. 1-6. https://doi.org/10.1093/bioinformatics/btz958

    Exploiting Heterogeneous Parallelism on Hybrid Metaheuristics for Vector Autoregression Models

    In recent years, the huge amount of data available in many disciplines has made mathematical modeling, and more specifically econometric modeling, a very important technique for explaining those data. Among the most widely used econometric techniques are Vector Autoregression (VAR) models, which are multi-equation models that linearly describe the interactions and behavior of a group of variables by using their past. Traditionally, Ordinary Least Squares and Maximum Likelihood estimators have been used in the estimation of VAR models. These techniques are consistent and asymptotically efficient under ideal conditions of the data and the identification problem. Otherwise, these techniques would yield inconsistent parameter estimations. This paper considers the estimation of a VAR model by minimizing the difference between the dependent variables at a given time and the expression of their own past and the exogenous variables of the model (in this case denoted as a VARX model). The solution of this optimization problem is approached through hybrid metaheuristics. The high computational cost due to the huge amount of data makes it necessary to exploit High-Performance Computing to accelerate the methods used to obtain the models. The parameterized, parallel implementation of the metaheuristics and the matrix formulation ease the simultaneous exploitation of parallelism for groups of hybrid metaheuristics. Multilevel and heterogeneous parallelism are exploited in multicore CPU plus multi-GPU nodes, with the optimum combination of the different parallelism parameters depending on the particular metaheuristic and the problem it is applied to.
    This work was supported by the Spanish MICINN and AEI, as well as European Commission FEDER funds, under grant RTI2018-098156-B-C53 and grant TIN2016-80565-R.
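
The estimation problem described in this abstract can be written out as follows. This is a sketch under assumed notation, not the paper's formulation: y_t is the vector of dependent variables at time t, x_t the vector of exogenous variables, p and q the assumed lag orders, A_i and B_j the coefficient matrices to estimate, and the squared Euclidean norm is one possible choice of "difference".

```latex
y_t = \sum_{i=1}^{p} A_i\, y_{t-i} + \sum_{j=0}^{q} B_j\, x_{t-j} + \varepsilon_t,
\qquad
\min_{A_1,\dots,A_p,\; B_0,\dots,B_q} \;
\sum_{t} \Bigl\| \, y_t - \sum_{i=1}^{p} A_i\, y_{t-i} - \sum_{j=0}^{q} B_j\, x_{t-j} \Bigr\|^{2}
```

A metaheuristic treats the entries of the A_i and B_j matrices as the decision variables and the sum on the right as the fitness to minimize.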

    Enhancing large-scale docking simulation on heterogeneous systems: An MPI vs rCUDA study

    Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with pharmacological targets, thus accelerating the slow and critical process of finding new drugs. VS methods screen large databases of chemical compounds to find a candidate that interacts with a given target. The computational requirements of VS models, along with the size of the databases, which contain up to millions of biological macromolecular structures, mean that computer clusters are a must. However, programming current clusters of computers is no easy task, as they have become heterogeneous and distributed systems where various programming models need to be used together to fully leverage their resources. This paper evaluates several strategies to provide peak performance to a GPU-based molecular docking application called METADOCK in heterogeneous clusters of computers based on CPUs and NVIDIA Graphics Processing Units (GPUs). Our developments start with an OpenMP, MPI and CUDA version of METADOCK as a baseline case of cluster utilization. Next, we explore the virtualized GPUs provided by the rCUDA framework in order to facilitate the programming process. rCUDA allows us to use remote GPUs, i.e. GPUs installed in other nodes of the cluster, as if they were installed in the local node, thus enabling access to them using only OpenMP and CUDA. Finally, several load balancing strategies are analyzed in a search to enhance performance. Our results reveal that the use of middleware like rCUDA is a convincing alternative for leveraging heterogeneous clusters, as it offers even better performance than traditional approaches and also makes it easier to program these emerging clusters.
    This work is jointly supported by the Fundacion Seneca (Agencia Regional de Ciencia y Tecnologia, Region de Murcia) under grant 18946/JLI/13, and by the Spanish MEC and European Commission FEDER under grants TIN2015-66972-C5-3-R and TIN2016-78799-P (AEI/FEDER, UE). We also thank NVIDIA for hardware donation under GPU Educational Center 2014-2016 and Research Center 2015-2016. Furthermore, researchers from Universitat Politecnica de Valencia are supported by the Generalitat Valenciana under Grant PROMETEO/2017/077. Authors are also grateful for the generous support provided by Mellanox Technologies Inc.
    Imbernón, B.; Prades Gasulla, J.; Gimenez Canovas, D.; Cecilia, JM.; Silla Jiménez, F. (2018). Enhancing large-scale docking simulation on heterogeneous systems: An MPI vs rCUDA study. Future Generation Computer Systems. 79:26-37. https://doi.org/10.1016/j.future.2017.08.050

    A GPU-based Iterated Tabu Search for Solving the Quadratic 3-dimensional Assignment Problem

    The quadratic 3-dimensional assignment problem (Q3AP) is an extension of the well-known NP-hard quadratic assignment problem. It has been proved to be one of the most difficult combinatorial optimization problems. Local search (LS) algorithms are a class of heuristics which have been successfully applied to solve such hard optimization problems. These methods handle a single solution that is iteratively improved by exploring its neighborhood in the solution space. In this paper, we propose an iterated tabu search for solving the Q3AP. The design of this algorithm is essentially based on a new large neighborhood structure. Indeed, in LS heuristics, designing operators to explore large promising regions of the search space may improve the quality of the obtained solutions. However, designing such a neighborhood comes at the expense of a highly computationally intensive process. Therefore, the use of graphics processing units (GPUs) provides an efficient complementary way to speed up the search. The proposed GPU-based iterated tabu search has been tested on 5 different Q3AP instances. The obtained results are convincing in terms of efficiency, quality and robustness of the provided solutions at run time.
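
The overall shape of an iterated tabu search can be sketched as follows. To stay short, this Python example swaps in the plain quadratic assignment problem (QAP) with a swap neighborhood rather than the Q3AP and the large neighborhood of the paper, so it illustrates the general technique only. The function names, the tabu tenure rule and the three-swap perturbation are assumptions of this sketch.

```python
import random


def qap_cost(perm, flow, dist):
    """QAP cost: facility a is placed at location perm[a]."""
    n = len(perm)
    return sum(flow[a][b] * dist[perm[a]][perm[b]]
               for a in range(n) for b in range(n))


def tabu_search(perm, flow, dist, iterations, tenure):
    """Tabu search over the swap neighborhood, with a simple aspiration rule."""
    n = len(perm)
    current = perm[:]
    best, best_cost = current[:], qap_cost(current, flow, dist)
    tabu_until = {}  # swap (i, j) -> first iteration at which it is allowed again
    for it in range(iterations):
        move, move_cost = None, None
        # Full neighborhood evaluation: the part a GPU version would parallelize.
        for i in range(n - 1):
            for j in range(i + 1, n):
                current[i], current[j] = current[j], current[i]
                c = qap_cost(current, flow, dist)
                current[i], current[j] = current[j], current[i]
                is_tabu = tabu_until.get((i, j), 0) > it
                # Aspiration: a tabu move is allowed only if it beats the best cost.
                if is_tabu and c >= best_cost:
                    continue
                if move_cost is None or c < move_cost:
                    move, move_cost = (i, j), c
        if move is None:
            break  # every move is tabu and none improves on the best
        i, j = move
        current[i], current[j] = current[j], current[i]
        tabu_until[move] = it + 1 + tenure
        if move_cost < best_cost:
            best, best_cost = current[:], move_cost
    return best, best_cost


def iterated_tabu_search(flow, dist, restarts, iterations, tenure, seed=0):
    """Alternate tabu search with a random perturbation of the best solution."""
    rng = random.Random(seed)
    n = len(flow)
    start = list(range(n))
    rng.shuffle(start)
    best, best_cost = tabu_search(start, flow, dist, iterations, tenure)
    for _restart in range(restarts):
        candidate = best[:]
        for _step in range(3):  # perturbation: three random swaps
            i, j = rng.sample(range(n), 2)
            candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate, cost = tabu_search(candidate, flow, dist, iterations, tenure)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost
```

The nested loop over all swaps dominates the run time, and it grows quickly with larger neighborhoods, which is the cost that motivates moving the evaluation to a GPU.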

    Parallel Hybrid Evolutionary Algorithms on GPU

    In recent years, interest in hybrid metaheuristics has risen considerably in the field of optimization. Combinations of methods such as evolutionary algorithms and local searches have provided very powerful search algorithms. However, due to their complexity, the computational time of the search remains exorbitant when large problem instances are to be solved. Therefore, the use of GPU-based parallel computing is required as a complementary way to speed up the search. This paper presents a new methodology to design and implement hybrid evolutionary algorithms efficiently and effectively on GPU accelerators. The methodology enables efficient mappings of the explored search space onto the GPU memory hierarchy. The experimental results show that the approach is very efficient, especially for large problem instances.