114 research outputs found

    Parallelization of Ant System for GPU under the PRAM Model

    Get PDF
    We study the parallelized ant system algorithm solving the traveling salesman problem on n cities. First, following the series of recent results for the graphics processing unit, we show that they translate to the PRAM (parallel random access machine) model. In addition, we develop a novel pheromone matrix update method under the PRAM CREW (concurrent-read exclusive-write) model and translate it to the graphics processing unit without atomic instructions. As a consequence, we give new asymptotic bounds for the parallel ant system, resulting in step complexities O(n Ĺ‚g Ĺ‚g n) on CRCW (concurrent-read concurrent-write) and O(n Ĺ‚g n) on CREW variants of PRAM using n2 processors in both cases. Finally, we present an experimental comparison with the currently known pheromone matrix update methods on the graphics processing unit and obtain encouraging results

    An Efficient Ant Colony Optimization Framework for HPC Environments

    Get PDF
    Financiado para publicación en acceso aberto: Universidade da Coruña/CISUG[Abstract] Combinatorial optimization problems arise in many disciplines, both in the basic sciences and in applied fields such as engineering and economics. One of the most popular combinatorial optimization methods is the Ant Colony Optimization (ACO) metaheuristic. Its parallel nature makes it especially attractive for implementation and execution in High Performance Computing (HPC) environments. Here we present a novel parallel ACO strategy making use of efficient asynchronous decentralized cooperative mechanisms. This strategy seeks to fulfill two objectives: (i) acceleration of the computations by performing the ants’ solution construction in parallel; (ii) convergence improvement through the stimulation of the diversification in the search and the cooperation between different colonies. The two main features of the proposal, decentralization and desynchronization, enable a more effective and efficient response in environments where resources are highly coupled. Examples of such infrastructures include both traditional HPC clusters, and also new distributed environments, such as cloud infrastructures, or even local computer networks. The proposal has been evaluated using the popular Traveling Salesman Problem (TSP), as a well-known NP-hard problem widely used in the literature to test combinatorial optimization methods. An exhaustive evaluation has been carried out using three medium and large size instances from the TSPLIB library, and the experiments show encouraging results with superlinear speedups compared to the sequential algorithm (e.g. speedups of 18 with 16 cores), and a very good scalability (experiments were performed with up to 384 cores improving execution time even at that scale).This work was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00 / AEI / 10.13039/501100011033), and by Xunta de Galicia and FEDER funds of the EU (Centro de Investigación de Galicia accreditation 2019–2022, ref. ED431G 2019/01; Consolidation Program of Competitive Reference Groups, ref. ED431C 2021/30). JRB acknowledges funding from the Ministry of Science and Innovation of Spain MCIN / AEI / 10.13039/501100011033 through grant PID2020-117271RB-C22 (BIODYNAMICS), and from MCIN / AEI / 10.13039/501100011033 and “ERDF A way of making Europe” through grant DPI2017-82896-C2-2-R (SYNBIOCONTROL). Authors also acknowledge the Galician Supercomputing Center (CESGA) for the access to its facilities. Funding for open access charge: Universidade da Coruña/CISUGXunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2021/3

    Re-engineering the ant colony optimization for CMP architectures

    Full text link
    [EN] The ant colony optimization (ACO) is inspired by the behavior of real ants, and as a bioinspired method, its underlying computation is massively parallel by definition. This paper shows re-engineering strategies to migrate the ACO algorithm applied to the Traveling Salesman Problem to modern Intel-based multi- and many-core architectures in a step-by-step methodology. The paper provides detailed guidelines on how to optimize the algorithm for the intra-node (thread and vector) parallelization, showing the performance scalability along with the number of cores on different Intel architectures, reporting up to 5.5x speedup factor between the Intel Xeon Phi Knights Landing and Intel Xeon v2. Moreover, parallel efficiency is provided for all targeted architectures, finding that core load imbalance, memory bandwidth limitations, and NUMA effects on data placement are some of the key factors limiting performance. Finally, a distributed implementation is also presented, reaching up to 2.96x speedup factor when running the code on 3 nodes over the single-node counterpart version. In the latter case, the parallel efficiency is affected by the synchronization frequency, which also affects the quality of the solution found by the distributed implementation.This work was partially supported by the Fundación Séneca, Agencia de Ciencia y Tecnología de la Región de Murcia under Project 20813/PI/18, and by Spanish Ministry of Science, Innovation and Universities as well as European Commission FEDER funds under Grants TIN2015-66972-C5-3-R, RTI2018-098156-B-C53, TIN2016-78799-P (AEI/FEDER, UE), and RTC-2017-6389-5. We acknowledge the excellent work done by Victor Montesinos while he was doing a research internship supported by the University of Murcia.Cecilia-Canales, JM.; García Carrasco, JM. (2020). Re-engineering the ant colony optimization for CMP architectures. The Journal of Supercomputing (Online). 76(6):4581-4602. https://doi.org/10.1007/s11227-019-02869-8S45814602766Yang XS (2010) Nature-inspired metaheuristic algorithms. Luniver Press, LebanonAkila M, Anusha P, Sindhu M, Selvan Krishnasamy T (2017) Examination of PSO, GA-PSO and ACO algorithms for the design optimization of printed antennas. In: IEEE Applied Electromagnetics Conference (AEMC)Dorigo M, Stützle T (2004) Ant colony optimization. A bradford book. The MIT Press, CambridgeCecilia JM, García JM, Nisbet A, Amos M, Ujaldón M (2013) Enhancing data parallelism for ant colony optimization on GPUs. J Parallel Distrib Comput 73(1):42–51Dawson L, Stewart I (2013) Improving ant colony optimization performance on the GPU using CUDA. In: IEEE Conference on Evolutionary Computation, pp 1901–1908Llanes A, Cecilia JM, Sánchez A, García JM, Amos M, Ujaldón M (2016) Dynamic load balancing on heterogeneous clusters for parallel ant colony optimization. Cluster Comput 19(1):1–11Cecilia JM, Llanes A, Abellán JL, Gómez-Luna J, Chang L, Hwu WW (2018) High-throughput ant colony optimization on graphics processing units. J Parallel Distrib Comput 113:261–274Lloyd H, Amos M (2016) A Highly Parallelized and Vectorized Implementation of Max–Min Ant System on Intel Xeon Phi. In: IEEE computational intelligenceTirado F, Barrientos RJ, González P, Mora M (2017) Efficient exploitation of the Xeon Phi architecture for the ant colony optimization (ACO) metaheuristic. J Supercomput 73(11):5053–5070Montesinos V, García JM (2018) Vectorization strategies for ant colony optimization on intel architectures. Parallel Computing is Everywhere. IOS Press, Amsterdam, pp 400–409Lawler E, Lenstra J, Kan A, Shmoys D (1987) The Traveling salesman problem. Wiley, New YorkMontesinos V (June 2018) Performance analysis of ant colony optimization on intel architectures. Master’s Thesis, University of Murcia (Spain)Lloyd H, Amos M (2017) Analysis of independent roulette selection in parallel ant colony optimization. In: Genetic and Evolutionary Computation Conference, ACM, pp 19–26Dorigo M (1992) Optimization, learning and natural algorithms. Ph.D. Thesis, Politecnico di Milano, ItalyDuran A, Klemm M (2012) The intel many integrated core architecture. In: Internal Conference on High Performance Computing and Simulation (HPCS), pp 365–366The OpenMP API specification for parallel programming. URL: https://www.openmp.org . [Last accessed 14 June 2018]The Message Passing Interface (MPI) standard. URL: http://www.mcs.anl.gov/research/projects/mpi/ . [Last accessed 15 June 2018]Vladimirov A, Asai R (2016) Clustering modes in Knights landing processors: developer’s guide. Colfax international. URL: https://colfaxresearch.com/knl-numa/ . [Last accessed: 16 June 2018]Intel Developer Zone. URL: https://software.intel.com/en-us/modern-code . [Last accessed 02 Oct 2018]Pearce M (2018) What is code modernization? Intel developer zone. URL: http://software.intel.com/en-us/articles/what-is-code-modernization . [Last accessed 15 Feb 2018]Stützle T ACOTSP v1.03. Last accessed 15 Feb 2018. URL: http://iridia.ulb.ac.be/~mdorigo/ACO/downloads/ACOTSP-1.03.tgzReinelt G (1991) TSPLIB—a traveling salesman problem library. ORSA J Comput 3:376–384Crainic TG, Toulouse M (2003) Parallel strategies for meta-heuristics. State-of-the-art handbook in metaheuristics. Kluwer Academic Publishers, Dordrecht, pp 475–513Delévacq A, Delisle P, Gravel M, Krajecki M (2013) Parallel ant colony optimization on graphics processing units. J Parallel Distrib Comput 73(1):52–61Skinderowicz R (2016) The GPU-based parallel ant colony system. J Parallel Distrib Comput 98:48–60Zhou Y, He F, Hou N, Qiu Y (2018) Parallel ant colony optimization on multi-core SIMD CPUs. Future Gener Comput Syst 79:473–487Peake J, Amos M, Yiapanis P, Lloyd H (2018) Vectorized candidate set selection for parallel ant colony optimization. In: Genetic and Evolutionary Computation Conference, ACM, pp 1300–1306Stützle T (1998) Parallelization strategies for ant colony optimization. In: Eiben AE, Bäck T, Schoenauer M, Schwefel HP (eds) Parallel problem solving from nature—PPSN V. PPSN. Lecture Notes in Computer Science, vol 1498. Springer, Berlin, HeidelbergAbdelkafi O, Lepagnot J, Idoumghar L (2014) Multi-level parallelization for hybrid ACO. In: Siarry P, Idoumghar L, Lepagnot J (eds) Swarm Intelligence Based Optimization. ICSIBO 2014. Lecture Notes in Computer Science, vol 8472. Springer, ChamMichel R, Middendorf M (1998) An island model based ant system with lookahead for the shortest super sequence problem. In: Eiben AE, Bäck T, Schoenauer M, Schwefel HP (eds) Parallel problem solving from nature— PPSN V. PPSN. Lecture Notes in Computer Science, vol 1498. Springer, Berlin, HeidelbergChen L, Sun H, Wang S (2008) Parallel implementation of ant colony optimization on MPP. In: International Conference on Machine Learning and CyberneticsLin Y, Cai H, Xiao J, Zhang J (2007) Pseudo parallel ant colony optimization for continuous functions. In: International Conference on Natural Computatio

    Parallelization Strategies for Ant Colony Optimisation on GPUs

    Full text link
    Ant Colony Optimisation (ACO) is an effective population-based meta-heuristic for the solution of a wide variety of problems. As a population-based algorithm, its computation is intrinsically massively parallel, and it is there- fore theoretically well-suited for implementation on Graphics Processing Units (GPUs). The ACO algorithm comprises two main stages: Tour construction and Pheromone update. The former has been previously implemented on the GPU, using a task-based parallelism approach. However, up until now, the latter has always been implemented on the CPU. In this paper, we discuss several parallelisation strategies for both stages of the ACO algorithm on the GPU. We propose an alternative data-based parallelism scheme for Tour construction, which fits better on the GPU architecture. We also describe novel GPU programming strategies for the Pheromone update stage. Our results show a total speed-up exceeding 28x for the Tour construction stage, and 20x for Pheromone update, and suggest that ACO is a potentially fruitful area for future research in the GPU domain.Comment: Accepted by 14th International Workshop on Nature Inspired Distributed Computing (NIDISC 2011), held in conjunction with the 25th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS 2011

    Ant Colony Optimization

    Get PDF
    Ant Colony Optimization (ACO) is the best example of how studies aimed at understanding and modeling the behavior of ants and other social insects can provide inspiration for the development of computational algorithms for the solution of difficult mathematical problems. Introduced by Marco Dorigo in his PhD thesis (1992) and initially applied to the travelling salesman problem, the ACO field has experienced a tremendous growth, standing today as an important nature-inspired stochastic metaheuristic for hard optimization problems. This book presents state-of-the-art ACO methods and is divided into two parts: (I) Techniques, which includes parallel implementations, and (II) Applications, where recent contributions of ACO to diverse fields, such as traffic congestion and control, structural optimization, manufacturing, and genomics are presented

    Applying ACO To Large Scale TSP Instances

    Full text link
    Ant Colony Optimisation (ACO) is a well known metaheuristic that has proven successful at solving Travelling Salesman Problems (TSP). However, ACO suffers from two issues; the first is that the technique has significant memory requirements for storing pheromone levels on edges between cities and second, the iterative probabilistic nature of choosing which city to visit next at every step is computationally expensive. This restricts ACO from solving larger TSP instances. This paper will present a methodology for deploying ACO on larger TSP instances by removing the high memory requirements, exploiting parallel CPU hardware and introducing a significant efficiency saving measure. The approach results in greater accuracy and speed. This enables the proposed ACO approach to tackle TSP instances of up to 200K cities within reasonable timescales using a single CPU. Speedups of as much as 1200 fold are achieved by the technique

    A Parallel Meta-Heuristic Approach to Reduce Vehicle Travel Time in Smart Cities

    Get PDF
    The development of the smart city concept and inhabitants’ need to reduce travel time, in addition to society’s awareness of the importance of reducing fuel consumption and respecting the environment, have led to a new approach to the classic travelling salesman problem (TSP) applied to urban environments. This problem can be formulated as “Given a list of geographic points and the distances between each pair of points, what is the shortest possible route that visits each point and returns to the departure point?”. At present, with the development of Internet of Things (IoT) devices and increased capabilities of sensors, a large amount of data and measurements are available, allowing researchers to model accurately the routes to choose. In this work, the aim is to provide a solution to the TSP in smart city environments using a modified version of the metaheuristic optimization algorithm Teacher Learner Based Optimization (TLBO). In addition, to improve performance, the solution is implemented by means of a parallel graphics processing unit (GPU) architecture, specifically a Compute Unified Device Architecture (CUDA) implementation.This research was supported by the Spanish Ministry of Science, Innovation and Universities and the Research State Agency under Grant RTI2018-098156-B-C54 co-financed by FEDER funds, and by the Spanish Ministry of Economy and Competitiveness under Grant TIN2017-89266-R, co-financed by FEDER funds
    • …
    corecore